This project implements a Distributed Deep Neural Network (DDNN) system for image classification. The system pairs a local (edge) component with a cloud component and uses a deep learning-based offloading mechanism to balance computational resource usage against classification accuracy.
- Local Feature Extractor: Performs initial feature extraction on the edge device
- Local Classifier: Makes preliminary classifications on the edge
- Cloud CNN: More complex model running on cloud infrastructure
- Offload Mechanism: Deep learning-based decision system for intelligent offloading
- Multi-dataset Support: CIFAR-10, CIFAR-100, CINIC-10, SVHN, GTSRB-32
- Multiple Input Modes: 'feat', 'logits', 'logits_plus', 'img'
- Comprehensive Experiments: Offload mechanism testing, timing analysis, overfitting detection, border/noisy sample analysis
- Modular Architecture: Clean separation of models, data loaders, training, and evaluation
- Unified CLI: Single script for all experiments with intuitive arguments
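At inference time, the components above cooperate roughly as follows: the edge extracts features and makes a preliminary prediction, the offload mechanism decides whether that prediction is trustworthy, and uncertain samples are forwarded to the cloud CNN. A minimal sketch with illustrative stand-in names (the actual interfaces live in src/models/):

```python
# Sketch of the DDNN inference path. All component names here are
# hypothetical stand-ins, not the project's API.

def ddnn_infer(x, feature_extractor, local_classifier, offload_mechanism, cloud_cnn):
    features = feature_extractor(x)            # edge-side feature extraction
    local_logits = local_classifier(features)  # preliminary edge classification
    if offload_mechanism(local_logits):        # learned decision: offload or not
        return cloud_cnn(x), "cloud"           # heavier model on the cloud
    return local_logits, "local"               # answer locally, skip the network

# Toy usage with stub components (the real ones are PyTorch modules):
pred, where = ddnn_infer(
    x=[0.1, 0.2],
    feature_extractor=lambda x: x,
    local_classifier=lambda f: [0.9, 0.1],
    offload_mechanism=lambda logits: max(logits) < 0.5,  # offload if unconfident
    cloud_cnn=lambda x: [0.99, 0.01],
)
```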
├── src/
│ ├── models/
│ │ ├── ddnn_models.py # LocalFeatureExtractor, LocalClassifier, CloudCNN
│ │ └── offloading_model.py # OffloadMechanism
│ ├── data/
│ │ └── data_loader.py # Multi-dataset loaders
│ ├── utils/
│ │ └── utils.py # Helper functions (BKS computation, model initialization)
│ ├── training.py # DDNN and offload mechanism training
│ └── evaluation.py # 11 comprehensive test functions
├── scripts/
│ └── main.py # Unified experiment runner
├── models/ # Saved model weights (.pth files)
├── plots/ # Generated visualization outputs
├── data/ # Dataset directory (auto-downloaded)
├── archive/ # Legacy code (reference only)
└── README.md
# Core dependencies
Python >= 3.8
torch >= 1.9.0
torchvision >= 0.10.0
numpy >= 1.19.0
matplotlib >= 3.3.0
scipy >= 1.5.0
# Optional (for GPU acceleration)
CUDA >= 11.0

Install dependencies:
pip install torch torchvision numpy matplotlib scipy

# Train DDNN and test offload mechanism on CIFAR-10
python scripts/main.py --mode train --dataset cifar10 --epochs_ddnn 50 --epochs_offload 30
# Run experiments with pretrained models (only if models already exist from same dataset/parameters)
python scripts/main.py --mode load --dataset cifar10 --testing_mode timing
- --mode: Execution mode (default: train)
  - train: Train DDNN from scratch, then run experiments
  - load: Use only if you have pretrained DDNN models (from a previous run with the same dataset and parameters)
- --dataset: Dataset selection
  - cifar10: CIFAR-10 (10 classes, baseline)
  - cifar100: CIFAR-100 (100 classes, harder variant)
  - cinic10: CINIC-10 (CIFAR-10 + ImageNet mix)
  - svhn: Street View House Numbers
  - gtsrb32: German Traffic Sign Recognition (43 classes)
- --testing_mode: Type of experiment to run
  - offload_mechanism (default): Test the DDNN's performance using the optimized offloading mechanism
  - timing: Inference time benchmarking
  - border_noisy: Misclassification analysis (border/noisy samples)
  - overfitting: Train/validation accuracy tracking
- --epochs_ddnn: DDNN training epochs (default: 50)
- --epochs_offload: Offload mechanism training epochs (default: 30)
- --batch_size: Batch size (default: 256)
- --local_weight: Local loss weight in DDNN training (default: 0.7)
- --L0: Target local percentage for single-L0 experiments (default: 0.54)
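Taken together, the arguments above could be wired up with argparse along these lines. This is a reconstruction of the documented interface, not necessarily how scripts/main.py actually defines it:

```python
import argparse

# Hypothetical reconstruction of the CLI described above; defaults mirror
# the documentation, not necessarily the real scripts/main.py.
def build_parser():
    p = argparse.ArgumentParser(description="DDNN experiment runner")
    p.add_argument("--mode", choices=["train", "load"], default="train")
    p.add_argument("--dataset",
                   choices=["cifar10", "cifar100", "cinic10", "svhn", "gtsrb32"],
                   default="cifar10")
    p.add_argument("--testing_mode",
                   choices=["offload_mechanism", "timing", "border_noisy", "overfitting"],
                   default="offload_mechanism")
    p.add_argument("--epochs_ddnn", type=int, default=50)
    p.add_argument("--epochs_offload", type=int, default=30)
    p.add_argument("--batch_size", type=int, default=256)
    p.add_argument("--local_weight", type=float, default=0.7)
    p.add_argument("--L0", type=float, default=0.54)
    return p

args = build_parser().parse_args(["--dataset", "svhn", "--epochs_ddnn", "20"])
```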
python scripts/main.py \
--mode train \
--dataset cifar10 \
--testing_mode offload_mechanism \
--epochs_ddnn 50 \
--epochs_offload 30 \
--batch_size 128

Output: Plot showing DDNN accuracy vs local percentage for different methods
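One way to hit a target local percentage L0, assuming each sample carries an offload score (higher meaning more in need of the cloud), is to keep the L0 fraction with the lowest scores on the edge. A sketch of that selection, not the project's exact logic:

```python
def split_by_local_fraction(offload_scores, L0):
    """Keep the L0 fraction of samples with the lowest offload scores local;
    send the rest to the cloud. Returns (local_indices, cloud_indices)."""
    order = sorted(range(len(offload_scores)), key=lambda i: offload_scores[i])
    n_local = round(L0 * len(offload_scores))
    return order[:n_local], order[n_local:]

# With L0 = 0.5, the two lowest-score (most "confident") samples stay local.
local_idx, cloud_idx = split_by_local_fraction([0.9, 0.1, 0.4, 0.8], L0=0.5)
```

Sweeping L0 from 0 to 1 with this rule traces out the accuracy-vs-local-percentage curve the experiment plots.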
python scripts/main.py \
--mode train \
--dataset cifar10 \
--testing_mode timing \
--epochs_ddnn 50 \
--epochs_offload 20 \
--batch_size 128 \
--L0 0.54

Output: Timing comparison plot for each method
python scripts/main.py \
--mode train \
--dataset cifar10 \
--testing_mode border_noisy \
--epochs_ddnn 50 \
--epochs_offload 50 \
--batch_size 128 \
--L0 0.54

Output: Test misclassification rates and training labeling quality analysis
python scripts/main.py \
--mode train \
--dataset cifar10 \
--testing_mode overfitting \
--epochs_ddnn 50 \
--epochs_offload 50 \
--batch_size 128 \
--L0 0.54

Output: Train vs validation accuracy curves
# Train on CIFAR-100 (100 classes)
python scripts/main.py --mode train --dataset cifar100 --epochs_ddnn 50 --epochs_offload 30
# Train on GTSRB-32 (43 traffic sign classes)
python scripts/main.py --mode train --dataset gtsrb32 --epochs_ddnn 50 --epochs_offload 30

The system provides four comprehensive experiment modes:
- Tests multiple L0 values (0%, 10%, ..., 100% local processing)
- Compares different input representations (features, logits, logits+margin+entropy)
- Benchmarks against baselines (entropy, oracle, random)
- Generates accuracy vs local percentage plots
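The entropy baseline mentioned above offloads the samples whose local softmax output is most uncertain. A minimal sketch of that scoring (hypothetical helper, not the repo's implementation):

```python
import math

def entropy(probs):
    """Shannon entropy of a softmax output; higher means less confident."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# A peaked distribution (confident local prediction) scores low entropy;
# a near-uniform one (a candidate for offloading) scores high.
confident = entropy([0.97, 0.01, 0.01, 0.01])
uncertain = entropy([0.25, 0.25, 0.25, 0.25])
```

Ranking samples by this score and offloading the top fraction gives the entropy baseline against which the learned offload mechanism is compared.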
- Measures inference time for each method
- Runs multiple iterations for statistical reliability
- Outputs per-sample timing (milliseconds)
- Creates timing comparison visualizations
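Per-sample timing of the kind described above can be collected with perf_counter over repeated runs. A sketch under the assumption of CPU inference (the repo's timing code may differ; GPU timing additionally needs device synchronization before reading the clock):

```python
import statistics
import time

def time_per_sample_ms(infer_fn, batch, iters=10):
    """Run infer_fn over the batch several times and report the mean and
    standard deviation of the per-sample latency in milliseconds."""
    runs = []
    for _ in range(iters):
        start = time.perf_counter()
        for sample in batch:
            infer_fn(sample)
        runs.append((time.perf_counter() - start) * 1000.0 / len(batch))
    return statistics.mean(runs), statistics.stdev(runs)

# Toy usage with a trivial stand-in for a model's forward pass:
mean_ms, std_ms = time_per_sample_ms(lambda x: x * 2, batch=list(range(100)))
```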
- Identifies misclassified samples near decision boundaries
- Analyzes oracle labeling quality on training data
- Computes misclassification rates for different sample types
- Generates detailed analysis plots
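Border samples in the sense above are typically those whose top two class scores are nearly tied. One common proxy is the top-1/top-2 margin; a sketch, not necessarily the repo's exact criterion:

```python
def top2_margin(logits):
    """Difference between the largest and second-largest score;
    a small margin marks a sample near the decision boundary."""
    a, b = sorted(logits, reverse=True)[:2]
    return a - b

def is_border(logits, threshold=0.1):
    # threshold is an illustrative value, not a project constant
    return top2_margin(logits) < threshold

clear_cut = is_border([4.0, 0.5, 0.1])    # large margin -> not a border sample
borderline = is_border([2.0, 1.95, 0.1])  # tiny margin -> border sample
```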
- Tracks train/validation accuracy across training epochs
- Compares different input modes (feat, logits)
- Helps identify optimal stopping points
- Visualizes learning curves
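Identifying an optimal stopping point from the tracked curves can be as simple as remembering the epoch with the best validation accuracy. A sketch of that bookkeeping (the repo tracks and plots the full curves):

```python
def best_epoch(val_accuracies):
    """Return (epoch_index, accuracy) of the best validation epoch.
    Training past this point while train accuracy keeps rising is the
    classic overfitting signature."""
    best = max(range(len(val_accuracies)), key=lambda e: val_accuracies[e])
    return best, val_accuracies[best]

# Validation accuracy peaks at epoch 3 and then degrades:
epoch, acc = best_epoch([0.61, 0.68, 0.72, 0.74, 0.73, 0.71])
```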
Pre-trained models are saved in the models/ directory:
- local_feature_extractor.pth - Edge feature extractor
- local_classifier.pth - Edge classification head
- cloud_cnn.pth - Cloud classification network
- offload_mechanism.pth - Learned offload decisions
- best_offload_mechanism.pth - Best offload model (auto-saved during training)
All plots are saved to the plots/ directory:
- ddnn_overall_<dataset>_<modes>.png - Overall accuracy plots
- timing_benchmark_<dataset>_L0<value>.png - Timing comparison
- overfitting_<dataset>_L0<value>.png - Train/val curves
- test_misclassification_<dataset>_L0<value>.png - Misclassification analysis
- train_labeling_quality_<dataset>_L0<value>.png - Labeling quality
This project is part of a diploma thesis on Distributed Deep Neural Networks.
For questions or collaborations, please open an issue on GitHub.