
Commit f086908

Updated Main Pipeline and Reflected all updates in README
1 parent d5d0648 commit f086908

2 files changed: 115 additions & 11 deletions

README.md (57 additions & 9 deletions)

@@ -82,8 +82,39 @@ This is a [summary](./report/CSI5186_AI_Testing_Project_Report___Fernando__Kelvi
## Execution Guide

### Complete Pipeline (Recommended)
Run the complete pipeline from data preparation through experiments to final training:

```bash
# Run all models with all optimizers
python main.py

# Run specific model with all optimizers
python main.py --model dt

# Run specific model with specific optimizer
python main.py --model cnn --optimizer rs

# Force re-download and re-process datasets
python main.py --force
```

**Available arguments:**
- `--model`: Model to run - `["dt", "knn", "cnn"]`. If omitted, runs all models.
- `--optimizer`: Optimizer to use - `["rs", "ga-standard", "ga-memetic", "pso"]`. If omitted, runs all optimizers for each model.
- `--force`: Force re-download and re-processing of datasets (default: False).

**Pipeline stages:**
1. **Data Download**: Automatically downloads the CIFAR-10 dataset (if not present, or with `--force`)
2. **Data Processing**: Processes and prepares the datasets (if not present, or with `--force`)
3. **Hyperparameter Search**: Runs experiments for the specified model(s) and optimizer(s) combinations
4. **Final Training**: Trains models with the best-found hyperparameters on the full dataset and evaluates on the test set (the resulting `.cache/` layout is sketched below)
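
The paths hard-coded in `main.py` (shown later in this commit) and the folders referenced in this README imply a cache layout roughly like this; treat it as a sketch, since exact subfolder names may differ:

```
.cache/
├── base_datasets/cifar10/        # raw CIFAR-10 download
├── processed_datasets/cifar10/   # processed images
├── experiment/                   # hyperparameter-search results (e.g., dt-rs/)
└── final_training/               # final models and test-set results
```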

### Data Preparation (Manual)
If you prefer to run data preparation separately:

1. Run `data_download.py` to download the datasets needed.
    * Note: Data are stored in the `.cache/` folder, which is gitignored.
    * Note: If you rerun the script and the folder already exists with contents, pass the `--force` argument to overwrite it cleanly.
2. Run `data_process.py` to process the images in the datasets (both steps are sketched below).
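
A minimal sketch of the manual sequence, assuming both scripts live in `scripts/` (as the imports in `main.py` later in this commit suggest):

```bash
python scripts/data_download.py --force   # step 1; --force overwrites an existing download
python scripts/data_process.py            # step 2
```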
@@ -137,24 +168,41 @@ python hparam_search.py
* It passes parameters directly to model creation/training, so it is less flexible for advanced training configs (e.g., custom epochs, patience).
* It is intended for quick experiments, visualizations, and debugging with a single optimizer.

### Run a Full Experiment (Advanced)
If you want more control over the experiment parameters, you can run experiments directly using this script:
```bash
python scripts/run_experiment.py --model dt
```
**Available arguments:**
- `--model`: Model choices - `["dt", "knn", "cnn"]`. **Mandatory** argument.
- `--optimizer`: Optimizer to use - `["rs", "ga-standard", "ga-memetic", "pso"]`. If omitted, runs all optimizers.
- `--runs`: Number of independent runs - default 1.
- `--evaluations`: Number of fitness evaluations per run - default 50.
- `--seed`: Base seed for randomization - default 42.
- `--n-jobs`: Number of parallel workers - default 1 for a sequential run. Use `-1` for all CPUs (see the combined example below).
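
For instance, a heavier benchmarking run can combine these flags; this invocation is illustrative but uses only the arguments documented above:

```bash
# 3 independent PSO runs on the CNN, 100 fitness evaluations per run, all CPU cores
python scripts/run_experiment.py --model cnn --optimizer pso --runs 3 --evaluations 100 --seed 42 --n-jobs -1
```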

* It is designed for systematic, reproducible experiments across all kinds of optimizers.
* All optimizers are supported and selectable via the CLI.
* It saves results, convergence traces, and summaries to disk for later analysis (but not to TensorBoard).
* For CNN, it uses a TrainingConfig object for fine-grained control (learning rate, weight decay, optimizer, batch size, patience), and disables early stopping for CNN by default for a fair comparison; a rough sketch of such a config follows this list.
* Given its flexibility and robustness, it is intended for benchmarking, comparison, and research, especially when comparing optimizer performance.
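
The fields named in the CNN bullet suggest a config shaped roughly like the following. This is a hypothetical sketch; the project's actual `TrainingConfig` definition, field names, and defaults may differ:

```python
from dataclasses import dataclass

@dataclass
class TrainingConfig:
    # Hypothetical fields inferred from the bullet above
    learning_rate: float = 1e-3
    weight_decay: float = 0.0
    optimizer: str = "adam"        # inner training optimizer, not the search strategy
    batch_size: int = 64
    patience: int = 10             # early-stopping patience
    early_stopping: bool = False   # disabled by default for CNN for a fair comparison
```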

### Run Final Training (Advanced)
After running experiments, you can run final training separately:
```bash
python scripts/run_final_training.py
```
**Available arguments:**
- `--seed`: Seed for final training runs - default 42.
- `--experiments`: Optional list of experiment names (e.g., `dt-rs`, `cnn-ga-standard`) to include. If omitted, trains all experiments found in `.cache/experiment/` (see the example below).
- `--max-parallel-cnn`: Maximum parallel CNN trainings - default 1.
- `--max-parallel-classic`: Maximum parallel DT/KNN trainings - default 1.

This script:
- Loads best hyperparameters from each experiment's best run
- Trains models on the full training set
- Evaluates on the held-out test set
- Saves trained models and results to `.cache/final_training/`
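
A hypothetical invocation limited to two experiments; it assumes `--experiments` accepts space-separated names:

```bash
python scripts/run_final_training.py --seed 42 --experiments dt-rs cnn-ga-standard
```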

### Analyze Results
Upon completion of a run of the `run_experiment.py` script, you will find result folders under the `.cache/experiment` folder. You can visualize plots for analysis based on this script:

main.py (58 additions & 2 deletions)

@@ -1,6 +1,62 @@
import argparse
from pathlib import Path
import sys

ROOT = Path(__file__).resolve().parent
SCRIPTS_DIR = ROOT / "scripts"
BASE_DATASETS_DIR = ROOT / ".cache" / "base_datasets"
PROCESSED_DATASETS_DIR = ROOT / ".cache" / "processed_datasets"
BASE_CIFAR10_DIR = BASE_DATASETS_DIR / "cifar10"
PROCESSED_CIFAR10_DIR = PROCESSED_DATASETS_DIR / "cifar10"
sys.path.append(str(ROOT))
sys.path.append(str(SCRIPTS_DIR))

# Import Scripts
from scripts.data_download import download_dataset
from scripts.data_process import main as preprocess
from scripts.run_experiment import main as run_experiment
from scripts.run_final_training import main as run_final_training


def main():
    parser = argparse.ArgumentParser(description="Main Pipeline: Run Experiments and Final Training")
    # store_true makes `--force` a value-less flag, as the README documents.
    # DEFAULT: do not re-download if the data already exists.
    parser.add_argument("--force", action="store_true", help="Force re-download and re-processing of datasets")
    parser.add_argument("--model", type=str, default=None, help="Optional model type to run experiments on. Otherwise runs all models.")
    parser.add_argument("--optimizer", type=str, default=None, help="Optional optimizer type to run experiments on. Otherwise runs all optimizers.")
    args = parser.parse_args()

    # Check the CIFAR-10 directories themselves rather than only their parents,
    # so a missing dataset is downloaded even if the cache folder already exists.
    if not BASE_CIFAR10_DIR.exists() or args.force:
        download_dataset(
            repo_id="uoft-cs/cifar10",
            destination=BASE_CIFAR10_DIR,
            force=args.force,
        )
    if not PROCESSED_CIFAR10_DIR.exists() or args.force:
        preprocess()

    models = [args.model] if args.model else ['dt', 'knn', 'cnn']

    # run_experiment() and run_final_training() parse sys.argv themselves, so
    # rebuild a clean argv for each call instead of leaking main.py's own flags.
    base_argv = [sys.argv[0]]

    for modelName in models:
        sys.argv = base_argv + ['--model', modelName]
        if args.optimizer:
            sys.argv += ['--optimizer', args.optimizer]

        run_experiment()

    # Restore a flag-free argv before final training.
    sys.argv = base_argv
    return run_final_training()


if __name__ == "__main__":
    sys.exit(main())
