paramveerkaur1/anomaly-detection-in-network-traffic

Anomaly Detection In Network Traffic

This project uses unsupervised learning techniques, specifically Isolation Forest and Autoencoders, to detect anomalies in network traffic data. Such anomalies can indicate cybersecurity threats, unauthorized access, or system malfunctions.

KDD Cup 1999 Dataset

A benchmark dataset for network intrusion detection or anomaly detection in network traffic data.

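  • The full dataset contains about 4.9 million connection records, each labeled as normal or as a specific attack type; this project uses the 10% subset (about 494,000 records).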
  • Features include protocol, duration, service, source bytes, and content-based features.

Available at:

Project Objective

  • Detect unusual network activity patterns using unsupervised anomaly detection.
  • Implement and compare:
    • Isolation Forest
    • Autoencoder Neural Network
  • Evaluate model performance using reconstruction error and anomaly scores.
  • Visualize and interpret results.

Tools & Technologies

  • Python 3.10+
  • Scikit-learn – for Isolation Forest and preprocessing
  • TensorFlow / Keras – for Autoencoder implementation
  • Pandas / NumPy – for data handling
  • Matplotlib / Seaborn – for visualization

Workflow

Data Preparation

  • Load kddcup.data_10_percent_corrected and assign column names.
  • Encode categorical features using Label Encoding.
  • Normalize numeric features using StandardScaler.
  • Map labels to binary: 0 = normal, 1 = anomaly.
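
The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the column names (`protocol_type`, `service`, `flag`, `src_bytes`, `duration`, `label`) follow the conventional KDD Cup 1999 naming and are assumed here, the helper `prepare` is hypothetical, and the tiny DataFrame is made-up sample data standing in for kddcup.data_10_percent_corrected.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Assumed names of the categorical columns (conventional KDD naming).
CATEGORICAL = ["protocol_type", "service", "flag"]


def prepare(df: pd.DataFrame, label_col: str = "label"):
    """Return (X, y): feature matrix and binary labels (0 = normal, 1 = anomaly)."""
    df = df.copy()
    # KDD labels carry a trailing dot ("normal.", "smurf."); strip it before comparing.
    labels = df.pop(label_col).astype(str).str.rstrip(".")
    y = (labels != "normal").astype(int).to_numpy()

    # Standardize the numeric columns to zero mean and unit variance.
    numeric = [c for c in df.columns if c not in CATEGORICAL]
    num = StandardScaler().fit_transform(df[numeric].to_numpy(dtype=float))

    # Label-encode each categorical column to integer codes.
    cat = np.column_stack([LabelEncoder().fit_transform(df[c]) for c in CATEGORICAL])

    # Output column order: numeric features first, then encoded categoricals.
    return np.hstack([num, cat]), y


# Made-up sample rows, for illustration only.
sample = pd.DataFrame({
    "duration": [0, 0, 2, 0],
    "protocol_type": ["tcp", "icmp", "tcp", "udp"],
    "service": ["http", "ecr_i", "smtp", "domain_u"],
    "flag": ["SF", "SF", "SF", "SF"],
    "src_bytes": [181, 1032, 1684, 44],
    "label": ["normal.", "smurf.", "normal.", "normal."],
})
X, y = prepare(sample)
```

For the real file, the same function would be applied to a DataFrame loaded with `pd.read_csv(..., header=None, names=...)`, using the full list of column names.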

Isolation Forest Model

The Isolation Forest model for anomaly detection in network traffic is implemented in this notebook: https://github.com/paramveerkaur1/anomaly-detection-in-network-traffic/blob/main/anomaly-detection-using-isolation-forest.ipynb

Workflow:

  • Train an Isolation Forest model on the dataset.
  • Predict anomalies; the model returns -1 for anomalies and 1 for normal records.
  • Convert these predictions to the binary label encoding (1 = anomaly, 0 = normal) for evaluation.
  • Evaluate using F1-score, precision, recall, and ROC AUC.
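
A minimal sketch of this workflow is shown below, assuming scikit-learn's `IsolationForest`. The data is synthetic (a Gaussian cluster plus widely scattered points) rather than the KDD features, and the hyperparameters, including `contamination=0.05`, are illustrative guesses, not the values used in the notebook.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score

rng = np.random.default_rng(0)
# Synthetic stand-in for preprocessed traffic features:
# 950 "normal" records near the origin, 50 scattered "anomalous" records.
X_normal = rng.normal(0.0, 1.0, size=(950, 5))
X_odd = rng.uniform(-10.0, 10.0, size=(50, 5))
X = np.vstack([X_normal, X_odd])
y_true = np.concatenate([np.zeros(950, dtype=int), np.ones(50, dtype=int)])

# contamination is the assumed share of anomalies; it sets the decision threshold.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=0)
raw = model.fit_predict(X)  # -1 = anomaly, 1 = normal

# Map to the binary encoding used for evaluation: 1 = anomaly, 0 = normal.
y_pred = np.where(raw == -1, 1, 0)

# score_samples is higher for normal points, so negate it to get an anomaly score.
anomaly_score = -model.score_samples(X)

metrics = {
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    "roc_auc": roc_auc_score(y_true, anomaly_score),
}
print(metrics)
```

Precision, recall, and F1 are computed from the hard predictions, while ROC AUC uses the continuous anomaly score so that it does not depend on the chosen contamination level.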

Autoencoder Model

The Autoencoder model for anomaly detection in network traffic is implemented in this notebook: https://github.com/paramveerkaur1/anomaly-detection-in-network-traffic/blob/main/anomaly_detection_using_autoencoder.ipynb

Workflow:

  • Build a deep autoencoder neural network.
  • Train only on normal records to learn expected behavior.
  • Use reconstruction error to identify outliers.
  • Set the anomaly threshold (e.g., the 95th percentile of reconstruction error); records above it are flagged as anomalies.
  • Evaluate predictions using:
    • Classification Report (Precision, Recall, F1-Score)
    • Confusion Matrix
    • ROC AUC Score
    • Reconstruction Error Distribution plots
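
The autoencoder workflow above might look roughly like the sketch below. The layer sizes in `build_autoencoder` are illustrative assumptions and not the notebook's architecture, and all function names are hypothetical. The short demo at the end uses simulated reconstructions in place of a trained model, so it only illustrates the thresholding step.

```python
import numpy as np


def build_autoencoder(n_features: int):
    """Small dense autoencoder; layer sizes are illustrative assumptions."""
    # Imported lazily so the NumPy helpers below work without TensorFlow installed.
    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Input(shape=(n_features,)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(8, activation="relu"),  # bottleneck
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(n_features, activation="linear"),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model


def reconstruction_error(x, x_hat):
    """Per-record mean squared error between input and reconstruction."""
    return np.mean((np.asarray(x) - np.asarray(x_hat)) ** 2, axis=1)


def fit_threshold(errors_on_normal, percentile=95.0):
    """Threshold taken as a percentile of the errors on normal records."""
    return float(np.percentile(errors_on_normal, percentile))


def flag_anomalies(errors, threshold):
    """1 = anomaly (error above threshold), 0 = normal."""
    return (np.asarray(errors) > threshold).astype(int)


# Demo with simulated reconstructions (no model is trained here):
# normal records are reconstructed closely, unusual records poorly.
rng = np.random.default_rng(0)
x_normal = rng.normal(size=(1000, 5))
x_hat_normal = x_normal + rng.normal(scale=0.1, size=x_normal.shape)
x_odd = rng.normal(size=(20, 5))
x_hat_odd = x_odd + rng.normal(scale=2.0, size=x_odd.shape)

err_normal = reconstruction_error(x_normal, x_hat_normal)
err_odd = reconstruction_error(x_odd, x_hat_odd)
threshold = fit_threshold(err_normal, percentile=95.0)
flags_normal = flag_anomalies(err_normal, threshold)
flags_odd = flag_anomalies(err_odd, threshold)
```

With a real model, the reconstructions would come from something like `model.fit(X_normal, X_normal, epochs=..., batch_size=...)` on normal records only, followed by `model.predict(X_test)`. A 95th-percentile threshold flags about 5% of normal records by construction, so the percentile trades false positives against missed anomalies.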

Output

Isolation Forest Model:

  1. Confusion Matrix (image: isolation output)
  2. ROC Curve (image: isolation output 2)

Autoencoder Model:

  1. Confusion Matrix (image: autoencoder output)
  2. Autoencoder Model Summary (image: autoencoder model)

For questions or suggestions, contact: 14paramveer@gmail.com
