Repository tracking progress on the AI in Medical Imaging Diagnostics project for spine segmentation.
- Python 3.9.21 (recommended)
- Create a virtual environment:

  ```bash
  python3.9 -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Download datasets from the sources listed below and place them in the directories specified in `config.json`, or modify the config itself.
- Run the training pipeline:

  ```bash
  ./cli.py
  ```

If configured properly, `cli.py` will:
- Preprocess the datasets
- Create dataloaders with train/val/test splits
- Start training the model
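The three steps above can be sketched as a minimal pipeline. All function names and the 70/15/15 split below are illustrative assumptions, not the project's actual API (the real wiring lives in the `src/` modules):

```python
def preprocess(config):
    # Stand-in for the real dataset preprocessing step.
    return list(range(10))

def make_dataloaders(data):
    # Example 70/15/15 train/val/test split (int() truncates, so small
    # datasets will round unevenly); real ratios come from the config.
    n = len(data)
    return {
        "train": data[: int(0.7 * n)],
        "val": data[int(0.7 * n): int(0.85 * n)],
        "test": data[int(0.85 * n):],
    }

def train(loaders, config):
    print(f"training on {len(loaders['train'])} samples")

def run_pipeline(config):
    data = preprocess(config)          # 1. preprocess the datasets
    loaders = make_dataloaders(data)   # 2. create train/val/test splits
    train(loaders, config)             # 3. start training the model

run_pipeline({})
```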
You can specify a custom config file using `--config`:

```bash
./cli.py --config my_config.json
```

It is recommended to place new configs for separate experiments in the `experiments/` directory, in an appropriately named subdirectory.
Preprocessing is executed automatically if the `preprocessed_data_dir` in `Config` does not exist and the data sources in `PreprocessingConfig` are available. So if you want to create another version of the preprocessed data, modify the preprocessing code and run `cli.py` with a different `preprocessed_data_dir`.
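The existence check that gates automatic preprocessing can be sketched as follows. `DummyConfig` and `needs_preprocessing` are hypothetical names for illustration; the real logic lives in the project's preprocessing code:

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical stand-in for the project's Config; only the one field
# relevant to the check is modeled here.
@dataclass
class DummyConfig:
    preprocessed_data_dir: str

def needs_preprocessing(config: DummyConfig) -> bool:
    """Preprocessing is triggered only when the output directory is absent."""
    return not Path(config.preprocessed_data_dir).exists()
```

Pointing `preprocessed_data_dir` at a fresh path therefore forces a new preprocessing run, while an existing directory is reused as-is.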
The project follows a modular structure with components in the src/ directory:
- `config.py`: Configuration management using dataclasses
- `preprocessing.py`: Data preprocessing pipeline
- `dataloader.py`: PyTorch dataloaders with data augmentation
- `model.py`: Model factory and architecture definitions
- `train.py`: Training loop and validation
- `loss.py`, `metrics.py`, `optimizer.py`, `scheduler.py`: Training components
The project uses a single Config object (from src.config) that is dependency-injected throughout the codebase. This Config object contains all hyperparameters, paths, and settings organized into nested dataclasses:
- `TrainingConfig`: Training hyperparameters (learning rate, epochs, scheduler, etc.)
- `PreprocessingConfig`: Data preprocessing settings
- `DataLoaderConfig`: Dataloader configuration and augmentation parameters
- `ModelConfig`: Model architecture parameters
- `WandBConfig`: Weights & Biases logging configuration
All components receive the same Config instance, ensuring consistency and making it easy to manage experiments through JSON configuration files.
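As a rough sketch of this pattern, a nested-dataclass config with JSON loading might look like the following. The field names and `from_json` helper are assumptions for illustration; the actual schema is defined in `src/config.py`:

```python
import json
from dataclasses import asdict, dataclass, field

# Trimmed-down, hypothetical versions of two of the nested dataclasses.
@dataclass
class TrainingConfig:
    learning_rate: float = 1e-3
    epochs: int = 100

@dataclass
class ModelConfig:
    architecture: str = "unet"
    in_channels: int = 1

@dataclass
class Config:
    training: TrainingConfig = field(default_factory=TrainingConfig)
    model: ModelConfig = field(default_factory=ModelConfig)

    @classmethod
    def from_json(cls, path: str) -> "Config":
        """Load an experiment JSON file, falling back to dataclass defaults."""
        with open(path) as f:
            raw = json.load(f)
        return cls(
            training=TrainingConfig(**raw.get("training", {})),
            model=ModelConfig(**raw.get("model", {})),
        )
```

An experiment JSON file then only needs to override the fields it changes; everything else keeps its default, and `asdict(Config())` yields the full default settings as a plain dict.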
After changing the `Config` structure, run `python -m src.config` to regenerate `default_config.json`.
After making changes to `dataloader.py`, you can run `python -m src.dataloader` to test it; the same applies to `preprocessing.py`.
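The `python -m src.<module>` convention works because each module can end with a small `__main__` self-test block. A minimal sketch of the idea (with `build_dataloader` as a hypothetical stand-in for the real PyTorch `DataLoader` setup):

```python
def build_dataloader(batch_size: int = 4):
    # Stand-in for the real dataloader: batches a toy dataset of 12 samples.
    data = list(range(12))
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]

if __name__ == "__main__":
    # Executed only via `python -m ...`, never on import from the pipeline.
    batches = build_dataloader()
    assert len(batches) == 3 and len(batches[0]) == 4
    print("dataloader smoke test passed")
```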
For hyperparameter optimization using Weights & Biases, see WANDB_SWEEP_README.md.
Exploratory data analysis was done in `EDA/datasets_eda.ipynb`.
Dataset sources: