This repository contains the solution for Assignment 1 of the Deep Learning course at the University of Tehran, focusing on image classification, adversarial attacks, and defensive techniques.
The project explores the robustness of ResNet models against noise and contrasts it with the performance of Vision Transformers (ViT). A significant part of the work involves implementing adversarial attacks (like FGSM) and evaluating defensive methods, specifically adversarial training.
This assignment was designed to provide hands-on experience with:
- Implementing and training standard models like ResNet on image datasets.
- Evaluating model robustness against simple perturbations like Gaussian Noise.
- Understanding and implementing Adversarial Attacks to exploit model vulnerabilities.
- Applying Defensive Techniques (e.g., Adversarial Training) to build more robust models.
- Fine-tuning and training Vision Transformers (ViT) and comparing their behavior to CNNs.
- `Q1.ipynb`: The main Jupyter Notebook containing all the code, training loops, attack implementations, and visualizations.
- `Q1.pdf`: The detailed Persian report explaining the theory, methodology, and results.
- `README.md`: This file.
We analyzed the models' performance not just on accuracy, but on why they make certain decisions, especially under attack.
We first established a baseline by training a ResNet model. We found that adding simple Gaussian noise significantly degraded performance, highlighting the sensitivity of standard models.
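The noise-robustness check described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code; the helper names and the `std` value are hypothetical.

```python
import torch


def add_gaussian_noise(images, std=0.1):
    """Perturb a batch of images with zero-mean Gaussian noise,
    then clamp back to the valid [0, 1] pixel range."""
    noisy = images + torch.randn_like(images) * std
    return noisy.clamp(0.0, 1.0)


@torch.no_grad()
def accuracy_under_noise(model, loader, std, device="cpu"):
    """Top-1 accuracy of `model` on `loader` with noise injected."""
    model.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        preds = model(add_gaussian_noise(x, std)).argmax(dim=1)
        correct += (preds == y).sum().item()
        total += y.numel()
    return correct / total
```

Sweeping `std` over a few values and plotting accuracy against it makes the degradation curve of the standard ResNet visible at a glance.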
We then trained two ViT models: one fine-tuned from pre-trained weights and one trained from scratch. The fine-tuned model achieved superior results, demonstrating the power of transfer learning.
[Image Placeholder]
This was the core of the project. We observed that standard models are extremely vulnerable to adversarial attacks, even when the perturbations are imperceptible to the human eye.
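A minimal sketch of the FGSM attack used in this kind of experiment: take one gradient of the loss with respect to the input and step each pixel by `epsilon` in the sign direction. The `epsilon` value here is illustrative, not necessarily the one used in the notebook.

```python
import torch
import torch.nn.functional as F


def fgsm_attack(model, images, labels, epsilon=0.03):
    """Fast Gradient Sign Method: x_adv = clamp(x + eps * sign(grad_x loss))."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0.0, 1.0).detach()
```

Because every pixel moves by at most `epsilon`, the perturbation stays visually negligible while often flipping the model's prediction.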
Our key result, shown through Grad-CAM, is that Adversarial Training fundamentally changes how the model "sees" an image.
- Standard Model (ViT-Finetuned): Focuses on small, high-frequency textures (e.g., a few specific petals). This is a "brittle" strategy.
- Defended Model (ViT-Finetuned-Adv): Learns to look at the overall, holistic shape of the object (e.g., the entire cluster of flowers). This is a much more robust and human-like strategy.
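The defense behind the robust model above, adversarial training, can be sketched as a single training step: craft FGSM examples on the fly, then update the model on the perturbed batch. This is a simplified single-step sketch; the notebook's attack strength, optimizer, and schedule may differ.

```python
import torch
import torch.nn.functional as F


def adversarial_train_step(model, optimizer, images, labels, epsilon=0.03):
    """One adversarial-training step: attack the current model, then
    minimize the loss on the resulting adversarial examples."""
    # Craft FGSM examples against the current weights.
    model.eval()  # keep batch-norm statistics frozen during the attack
    x = images.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), labels).backward()
    adv = (x + epsilon * x.grad.sign()).clamp(0.0, 1.0).detach()

    # Standard supervised update, but on the perturbed batch.
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(adv), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Training on worst-case perturbations pushes the model away from brittle high-frequency cues toward the holistic shape features seen in the Grad-CAM maps.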
[Image Placeholder]
To run this project locally, ensure you have the necessary libraries.
- Python 3.9+
- PyTorch
- Torchvision
- NumPy
- Matplotlib
- Tqdm
1. Clone the repository:

   ```bash
   git clone https://github.com/[YourUsername]/[Your-Repo-Name].git
   cd [Your-Repo-Name]
   ```

2. Create a virtual environment (recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install torch torchvision numpy matplotlib tqdm jupyter
   ```
All the code is contained within the Jupyter Notebook:

```bash
jupyter notebook Q1.ipynb
```

You can run the cells sequentially to reproduce the training, attacks, and visualizations.
- Course: Deep Learning (Neural Networks) - University of Tehran
- Authors:
- Ali Ghorbani Bargani (810103209)
- Mobin Tirafkan (810103091)
This project is licensed under the MIT License.