# Network Intrusion Detection System (NIDS) using Tree-based Ensemble Learning
## 1. Overview
Have you ever wondered whether somebody is trying to "hack" your network, or how "secure" the networks we connect our devices to really are?
Most of the time a network is secure enough, unless a smart bad guy, or several of them, wants to break into it.
When that happens, not only can your passwords and privacy be compromised, but a whole organization can also fall to pieces under the impact of a cyber-attack.
That is when an antivirus or a firewall is simply not enough to protect us, and where Network Intrusion Detection comes in to help us against the bad guys.
## 2. Project Structure
A Network Intrusion Detection System (NIDS) is a security software application designed to
- monitor network traffic behaviour,
- detect **malicious activity**,
- identify cyber-attacks,
- alert administrators.

This project applies different tree-based **machine learning** models to a pre-processed dataset and compares their accuracy and precision.
```graphviz
digraph hierarchy {
nodesep=0.4 // increase the separation between nodes
node [color=Darkgreen,fontname=Arial,shape=box] // all nodes use this shape and colour
edge [color=black, style=solid] // all edges use this style
"Networking"->{SDN, NFV, "Cyber-Security"}
"Cyber-Security" -> {NIDS, HIDS}
NIDS->{"CSE-CIC-IDS-Dataset"}
"CSE-CIC-IDS-Dataset"->"Machine Learning"
}
```
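Since the models are compared by accuracy and precision, it may help to see how the two metrics differ. The following is a minimal sketch in plain Python; the function names and the label vectors are hypothetical and are not taken from the project notebooks (1 stands for "attack", 0 for "benign").

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Of the samples predicted as `positive`, the fraction that truly are."""
    truths_of_predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths_of_predicted_pos:
        return 0.0  # convention chosen here when nothing is predicted positive
    hits = sum(t == positive for t in truths_of_predicted_pos)
    return hits / len(truths_of_predicted_pos)


# Hypothetical labels: 1 = attack, 0 = benign
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("accuracy :", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
```

Accuracy counts every correct prediction, while precision only looks at the samples flagged as attacks, so a model that raises many false alarms can have good accuracy on mostly benign traffic and still have poor precision.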
## 3. Details of tree-based algorithms
```graphviz
digraph hierarchy {
nodesep=0.4 // increase the separation between nodes
node [color=Darkgreen,fontname=Arial,shape=box] // all nodes use this shape and colour
edge [color=black, style=solid] // all edges use this style
"Machine Learning"->{ "Decision Tree" "Random Forest" "Bagging" "XGBoost" "CatBoost" "LightGBM" }
}
```
### Binary Decision Tree Classifier
```graphviz
digraph BinaryTreeClassifier {
node [shape=box];
Root [label="Weather?\nSunny / Rainy"];
Left [label="Play"];
Right [label="Temperature?\nHot / Cold"];
RightLeft [label="Not Play"];
RightRight [label="Play"];
Root -> Left [label=" Sunny"];
Root -> Right [label=" Rainy"];
Right -> RightLeft [label=" Hot"];
Right -> RightRight [label=" Cold"];
}
```
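The tree in the diagram can be read as a chain of nested conditions. As a rough sketch, the same decisions written by hand in Python look like this; the function name `play_decision` is made up for illustration, and a trained classifier would learn these splits from data instead of having them hard-coded.

```python
def play_decision(weather, temperature=None):
    """Walk the example tree: check the weather first, then the temperature
    on the rainy branch."""
    if weather == "Sunny":
        return "Play"
    if weather == "Rainy":
        if temperature == "Hot":
            return "Not Play"
        if temperature == "Cold":
            return "Play"
        raise ValueError(f"unknown temperature: {temperature!r}")
    raise ValueError(f"unknown weather: {weather!r}")


print(play_decision("Sunny"))          # leaf reached directly from the root
print(play_decision("Rainy", "Hot"))   # root -> temperature node -> leaf
print(play_decision("Rainy", "Cold"))
```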
### Random Forest
```graphviz
digraph RandomForestClassifier {
subgraph cluster_tree1 {
label="Tree 1";
node [shape=box, style=rounded];
Tree1Root [label="Age? \n < 30 / >= 30"];
Tree1Left [label="Eligible"];
Tree1Right [label="Not Eligible"];
Tree1Root -> Tree1Left [label=" < 30"];
Tree1Root -> Tree1Right [label=" >= 30"];
}
subgraph cluster_tree2 {
label="Tree 2";
node [shape=box, style=rounded];
Tree2Root [label="Income? \n < 50k / >= 50k"];
Tree2Left [label="Not Eligible"];
Tree2Right [label="Eligible"];
Tree2Root -> Tree2Left [label=" < 50k"];
Tree2Root -> Tree2Right [label=" >= 50k"];
}
}
```
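The two trees above split on different features. In a random forest this variety comes from two sources: each tree is trained on a bootstrap sample of the data, and at each split only a random subset of the features is considered. The sketch below illustrates the second idea only; the feature names and the helper `candidate_features` are hypothetical, and the square-root subset size is a common heuristic for classification rather than something stated by this project.

```python
import math
import random


def candidate_features(features, rng):
    """Draw the random subset of features a tree may consider at one split.

    The subset size follows the common sqrt(n_features) heuristic.
    """
    k = max(1, round(math.sqrt(len(features))))
    return rng.sample(features, k)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
features = ["age", "income", "experience", "education"]  # hypothetical names

for split in range(1, 4):
    print(f"Split {split} considers: {candidate_features(features, rng)}")
```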
### Bagging
```graphviz
digraph Bagging {
subgraph root_tree {
label="Root Tree";
node [shape=box];
"Root Tree" -> {Tree1Root Tree2Root Tree3Root};
subgraph cluster_tree1 {
label="Tree 1";
node [shape=box];
Tree1Root [label="Age? \n < 30 / >= 30"];
Tree1Left [label="Eligible"];
Tree1Right [label="Not Eligible"];
Tree1Root -> Tree1Left [label=" < 30"];
Tree1Root -> Tree1Right [label=" >= 30"];
}
subgraph cluster_tree2 {
label="Tree 2";
node [shape=box];
Tree2Root [label="Income? \n < 50k / >= 50k"];
Tree2Left [label="Not Eligible"];
Tree2Right [label="Eligible"];
Tree2Root -> Tree2Left [label=" < 50k"];
Tree2Root -> Tree2Right [label=" >= 50k"];
}
subgraph cluster_tree3 {
label="Tree 3";
node [shape=box];
Tree3Root [label="Experience? \n < 2 years / >= 2 years"];
Tree3Left [label="Not Eligible"];
Tree3Right [label="Eligible"];
Tree3Root -> Tree3Left [label=" < 2 years "];
Tree3Root -> Tree3Right [label=" >= 2 years "];
}
}
subgraph results {
label="results";
node [shape=box];
{Tree3Left Tree3Right Tree2Right Tree2Left Tree1Right Tree1Left} -> {"Decision"};
}
}
```
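Bagging (bootstrap aggregating) combines two steps: each tree is trained on a bootstrap sample of the data, and the final decision is the majority vote of all trees. The sketch below hard-codes the three trees from the diagram instead of training them, and shows the bootstrap step separately. The applicant fields, and reading "50k" as 50,000, are assumptions made for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Sample len(rows) items with replacement, as done once per tree."""
    return [rng.choice(rows) for _ in rows]


# The three single-split trees from the diagram, written by hand.
def tree_age(applicant):
    return "Eligible" if applicant["age"] < 30 else "Not Eligible"


def tree_income(applicant):
    return "Eligible" if applicant["income"] >= 50_000 else "Not Eligible"


def tree_experience(applicant):
    return "Eligible" if applicant["experience_years"] >= 2 else "Not Eligible"


def bagged_decision(applicant, trees):
    """Aggregate the individual tree outputs by majority vote."""
    votes = Counter(tree(applicant) for tree in trees)
    return votes.most_common(1)[0][0]


trees = [tree_age, tree_income, tree_experience]
applicant = {"age": 25, "income": 40_000, "experience_years": 3}  # hypothetical
print(bagged_decision(applicant, trees))

rng = random.Random(0)
print(bootstrap_sample([1, 2, 3, 4, 5], rng))  # some rows repeat, some are left out
```

Using an odd number of trees avoids ties in a two-class vote.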
<!--
- Networking
- software-defined networks
- mininet -> virtual network
- rfc-org -> protocols
- cybersecurity
- attacks
- wireshark
- machine learning
- decision tree
- Random Forest
- (bagging)
-->
## 4. Dataset Description
The dataset is the result of a collaborative project between the Communications Security Establishment (CSE) and the Canadian Institute for Cybersecurity (CIC).
Because of privacy and confidentiality concerns, organizations rarely share their traffic data, so publicly available datasets for intrusion detection are scarce.

Figure 1: Network Topology.
> The `CSE-CIC-IDS2018` dataset is hosted on Amazon Web Services (AWS).
## 5. Types of Attacks
:::warning
| No. | Attack |
|---|--------|
| 1 | Brute-force attack |
| 2 | Web attack |
| 3 | Infiltration attack |
| 4 | Botnet attack |
| 5 | DDoS and port scan |
:::
## 6. Results
:::success







:::
## 7. Prerequisites and Installation
You can run and test this project directly in a Google Colab environment at the following URL: [Open Google Colab](https://githubtocolab.com/mjacker/MJCapstone/blob/master/0_merged_ipynb_files_for_google_colab.ipynb), or scan the QR code below.
Alternatively, clone this repository and set up a `venv` environment using the `requirement.yml` file: [Open GitHub Repository](https://github.com/mjacker/MJCapstone/tree/master), or scan the QR code below.
 
<!--
---
Right now I am only writing down what I am thinking about how to split up my presentation.
I would like to think my presentation is divided into three parts:
Tools
Google Colab deployment
GitHub
-->