Modern ML Cookiecutter

🚀 A modality-aware, end-to-end template for modern machine learning projects covering NLP, Speech, and Vision with best-in-class models and researcher-friendly configuration.

📖 Read the story behind this template: Why I Built a Modern ML Cookiecutter

✨ Features

🎯 Multi-Modal Support: Choose from NLP (DistilBERT), Speech (Whisper for ASR, CSM for TTS), or Vision (ViT) with optimized configurations
⚡ Fast Dependency Management: Uses uv for lightning-fast package management
🧠 ML-Centric Configuration: Researcher-friendly parameter names (epochs not num_train_epochs)
🖥️ Mac MPS Support: Optimized for Apple Silicon with Metal Performance Shaders
☁️ Local & Cloud Training: Seamless training with Hugging Face Accelerate locally or SkyPilot in the cloud
🚀 Production-Ready Serving: High-performance model serving with LitServe
📊 Experiment Tracking: Optional integration with tracelet
🔧 Type-Safe Configuration: Pydantic-based settings with modality-aware validation

🎯 Supported Modalities

Modality	Task	Model	Dataset	Key Libraries
NLP	Text Classification	DistilBERT	IMDB	transformers, datasets
Speech	ASR (Speech-to-Text)	Whisper	Common Voice	transformers, librosa, torchaudio
Speech	TTS (Text-to-Speech)	CSM (Sesame)	Conversational	transformers, CSM
Vision	Image Classification	Vision Transformer	CIFAR-10	torchvision, PIL, opencv

🚀 Quick Start

Install cruft (recommended) or cookiecutter:

uv tool install cruft
# or: uv tool install cookiecutter

Generate a new project:

# Using cruft (recommended for template updates)
uvx cruft create https://github.com/prassanna-ravishankar/cookiecutter-modern-ml

# Using cookiecutter directly
cookiecutter https://github.com/prassanna-ravishankar/cookiecutter-modern-ml

Choose your modality and configuration:

[1/12] project_name (My ML Project): Voice Assistant
[2/12] Select modality:
  1 - nlp
  2 - speech  
  3 - vision
  Choose from [1/2/3] (1): 2
[3/12] Select speech_task:
  1 - asr
  2 - tts
  Choose from [1/2] (1): 2
[4/12] Select use_tracelet:
  1 - yes
  2 - no

Start developing:

cd voice_assistant
uv sync
uv run task train

📋 What's Included

Project Structure

your_project/
├── .github/workflows/    # Modern CI with astral-sh/setup-uv
├── configs/             # ML-centric YAML configuration
├── models/              # Trained model artifacts  
├── notebooks/           # Jupyter notebooks (optional)
├── your_package/        # Source code
│   ├── config.py        # Modality-aware configuration
│   ├── data_utils.py    # Polars-based data processing
│   ├── deployment/      # LitServe model serving
│   └── models/          # Modality-specific training
├── tests/               # Pytest test suite
├── pyproject.toml       # Modern Python packaging
└── sky_task.yaml        # Cloud training config

Pre-configured Tools

uv: Ultra-fast Python package management
Transformers: State-of-the-art models for all modalities
Accelerate: Multi-device training (CUDA/MPS/CPU)
LitServe: High-performance model serving
Polars: Fast data processing (not pandas)
Pydantic: Type-safe configuration management
Ruff: Fast Python linter and formatter
Pytest: Testing framework

🎯 Example Workflows

NLP: Sentiment Analysis

# Train DistilBERT on IMDB
uv run task train

# Serve the model
uv run task serve

# Test inference
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "This movie was amazing!"}'
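The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the server from `uv run task serve` is listening on localhost:8000 and accepts the JSON body shown in the curl call. The helper names `build_predict_request` and `predict` are illustrative and not part of the template, and the response schema is not documented here, so the result is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request

# Assumed endpoint, copied from the curl example above.
DEFAULT_URL = "http://localhost:8000/predict"


def build_predict_request(text: str, url: str = DEFAULT_URL) -> urllib.request.Request:
    """Build the POST request that mirrors the curl command."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text: str, url: str = DEFAULT_URL, timeout: float = 30.0):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires a running server):
# print(predict("This movie was amazing!"))
```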

Vision: Image Classification

# Train ViT on CIFAR-10
uv run task train

# The training script automatically selects Mac MPS, CUDA, or CPU
# Batch sizes adjust automatically to fit device memory
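Device selection of this kind can be sketched as follows. The priority order (CUDA, then MPS, then CPU) and the function names are assumptions for illustration; the template's own logic may differ. `torch.cuda.is_available()` and `torch.backends.mps.is_available()` are the standard PyTorch checks, and the sketch falls back to CPU if PyTorch is not installed.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device string given what the machine offers.

    Assumed priority for this sketch: CUDA first, then Apple MPS, then CPU.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends and pick one."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to query, so report CPU.
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps module.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_available = mps_backend is not None and mps_backend.is_available()
    return pick_device(torch.cuda.is_available(), mps_available)
```

Keeping the decision in a pure function makes it easy to unit test without the hardware present.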

Speech: ASR (Whisper) or TTS (CSM)

# ASR: Train Whisper for speech-to-text (project generated with speech_task = asr)
uv run task train

# TTS: Train CSM for conversational speech generation (project generated with speech_task = tts)
uv run task train

# Audio preprocessing and evaluation metrics are handled automatically

🔧 Configuration

ML-Researcher Friendly Settings

# configs/settings.yaml
modality: "nlp"

experiment:
  name: "bert_baseline"
  seed: 42

training:
  epochs: 5           # Not num_train_epochs!
  batch_size: 32      # Not per_device_train_batch_size!
  learning_rate: 3e-4
  warmup_ratio: 0.1

model:
  checkpoint: "distilbert-base-uncased"
  max_length: 512
  dropout: 0.1

compute:
  device: "auto"      # Automatically detects MPS/CUDA/CPU
  fp16: true
  gradient_checkpointing: true
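To make the naming convention concrete, here is one way the researcher-friendly keys could be translated into the argument names used by Hugging Face `TrainingArguments`. This is a sketch under assumptions: `HF_NAME_MAP` and `to_hf_training_kwargs` are hypothetical names, and the README does not show how the template performs this mapping. The target names (`num_train_epochs`, `per_device_train_batch_size`, `learning_rate`, `warmup_ratio`, `fp16`, `gradient_checkpointing`, `seed`) are real `TrainingArguments` parameters.

```python
# Settings as they would look after loading configs/settings.yaml.
SETTINGS = {
    "modality": "nlp",
    "experiment": {"name": "bert_baseline", "seed": 42},
    "training": {
        "epochs": 5,
        "batch_size": 32,
        "learning_rate": 3e-4,
        "warmup_ratio": 0.1,
    },
    "compute": {"device": "auto", "fp16": True, "gradient_checkpointing": True},
}

# Hypothetical mapping: short names on the left, Hugging Face names on the right.
HF_NAME_MAP = {
    "epochs": "num_train_epochs",
    "batch_size": "per_device_train_batch_size",
}


def to_hf_training_kwargs(settings: dict) -> dict:
    """Translate the short config names into TrainingArguments keyword names."""
    training = settings.get("training", {})
    compute = settings.get("compute", {})
    # Keys without an entry in the map (learning_rate, warmup_ratio) pass through.
    kwargs = {HF_NAME_MAP.get(key, key): value for key, value in training.items()}
    kwargs["seed"] = settings.get("experiment", {}).get("seed", 42)
    for key in ("fp16", "gradient_checkpointing"):
        if key in compute:
            kwargs[key] = compute[key]
    return kwargs
```

The resulting dictionary could then be unpacked into `TrainingArguments(output_dir=..., **kwargs)`.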

Quick Experiment Setup

from your_package.config import create_experiment_config

# Researcher-friendly experiment creation
config = create_experiment_config(
    name="distilbert_large_lr",
    learning_rate=5e-4,
    epochs=10,
    batch_size=64
)
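For readers curious how such a helper might work, the following is an illustrative stand-in, not the template's actual `create_experiment_config`. It assumes Pydantic v2, and the class names, field layout, and the `speech_task` rule are assumptions chosen to mirror the settings and prompts shown in this README. It shows two ideas: keyword overrides layered on top of defaults, and a modality-aware check that rejects inconsistent combinations.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError, model_validator


class TrainingConfig(BaseModel):
    # Reject unknown keys so a typo such as `epoch=3` fails loudly.
    model_config = ConfigDict(extra="forbid")

    epochs: int = Field(default=5, gt=0)
    batch_size: int = Field(default=32, gt=0)
    learning_rate: float = Field(default=3e-4, gt=0)
    warmup_ratio: float = Field(default=0.1, ge=0, le=1)


class ExperimentConfig(BaseModel):
    name: str
    seed: int = 42
    modality: Literal["nlp", "speech", "vision"] = "nlp"
    speech_task: Optional[Literal["asr", "tts"]] = None
    training: TrainingConfig = Field(default_factory=TrainingConfig)

    @model_validator(mode="after")
    def check_speech_task(self):
        # Assumed rule: speech projects must name a task, other modalities must not.
        if self.modality == "speech" and self.speech_task is None:
            raise ValueError("speech_task is required when modality is 'speech'")
        if self.modality != "speech" and self.speech_task is not None:
            raise ValueError("speech_task only applies when modality is 'speech'")
        return self


def create_experiment_config(name, modality="nlp", speech_task=None, **training_overrides):
    """Hypothetical helper: defaults plus keyword overrides for training fields."""
    return ExperimentConfig(
        name=name,
        modality=modality,
        speech_task=speech_task,
        training=TrainingConfig(**training_overrides),
    )
```

Fields that are not overridden keep their defaults, so the call above would still use `warmup_ratio=0.1`.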

Modality-Specific Features

NLP: Automatic tokenizer padding, sequence classification metrics
Speech ASR: 16kHz audio processing, WER metrics, Whisper optimizations
Speech TTS: 24kHz generation, naturalness metrics, CSM conversational features
Vision: Image preprocessing, patch-based transformers, classification metrics
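As an illustration of the WER metric mentioned for ASR, word error rate is the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The sketch below is a plain-Python version for clarity; the README does not say which implementation the template uses, and the function name is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute WER = (substitutions + deletions + insertions) / reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion of a reference word
                    current[j - 1] + 1,  # insertion of a hypothesis word
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1] / len(ref_words)
```

For example, comparing "the cat sat on the mat" with "the cat sit on mat" gives one substitution and one deletion over six reference words, so the WER is 2/6.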

📊 Device Optimization

The template automatically optimizes for your hardware:

Mac MPS: Optimized batch sizes, no fp16, proper memory pinning
CUDA: Full fp16, TensorFloat-32, optimal batch sizes
CPU: Conservative batch sizes, fp32, minimal workers
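The per-device behaviour listed above could be expressed as a small lookup. This is only a sketch: the `DeviceSettings` class, the `settings_for` function, and every numeric value (batch-size caps, worker counts) are hypothetical placeholders, not the template's actual defaults. Only the qualitative rules come from the list: fp16 and TF32 on CUDA, no fp16 on MPS, fp32 with minimal workers on CPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceSettings:
    fp16: bool
    tf32: bool
    batch_size: int
    num_workers: int


def settings_for(device: str, requested_batch_size: int = 32) -> DeviceSettings:
    """Return illustrative training settings for a device string."""
    if device == "cuda":
        # Mixed precision and TF32 are enabled only on CUDA in this sketch.
        return DeviceSettings(fp16=True, tf32=True, batch_size=requested_batch_size, num_workers=4)
    if device == "mps":
        # No fp16 on MPS; the batch-size cap of 16 is a placeholder value.
        return DeviceSettings(fp16=False, tf32=False, batch_size=min(requested_batch_size, 16), num_workers=2)
    if device == "cpu":
        # Conservative batch size and no extra data-loading workers.
        return DeviceSettings(fp16=False, tf32=False, batch_size=min(requested_batch_size, 8), num_workers=0)
    raise ValueError(f"unknown device: {device!r}")
```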

☁️ Cloud Training

Deploy to any cloud with SkyPilot:

uv run task train-cloud

Supports AWS, GCP, and Azure, including spot instances.

🧪 Testing

Validate the template, then test and lint a generated project:

python3 self_test.py  # Validate template completeness (run from the template repository)
uv run task test      # Run the generated project's tests
uv run task lint      # Code quality checks

📚 Documentation

🎨 Design Philosophy

Simplicity over Features: Avoid over-engineering, focus on researcher needs
ML-Centric: Parameter names and structure match ML research conventions
Modality-Aware: Each domain (NLP/Speech/Vision) has optimized defaults
Modern Tooling: Latest best practices (uv, Polars, LitServe, Pydantic)
Mac-First: Optimized for Apple Silicon development

🤝 Contributing

Contributions welcome! This template prioritizes simplicity and researcher experience.

📄 License

MIT License - build amazing ML projects!
