YOLO Ultralytics + ClearML

End-to-end YOLO training pipeline for CVAT-annotated video data. Annotations and videos are exported from CVAT as project ZIPs, stored in S3, sampled into a versioned dataset, and used to train YOLO. All experiments, datasets, and model artifacts are tracked in ClearML, with the files themselves kept in S3; nothing is stored on ClearML's file server.

Pipeline Overview

CVAT export ZIP → S3
       │
       ▼
  ingest.py          Download ZIPs from S3, extract, re-upload organised structure
       │
       ▼
  sample.py          Sample every Nth frame, convert CVAT JSON → YOLO format,
       │              upload images + labels to S3, create versioned ClearML Dataset
       ▼
  train.py           Pull dataset from ClearML (S3), train YOLO, register model artifact

Project Structure

yolo-ultralytics-clearml/
├── ingest.py                        # Job 1: ingest CVAT export ZIPs from S3
├── sample.py                        # Job 2: sample frames, build ClearML Dataset
├── train.py                         # Job 3: train YOLO, log to ClearML
├── conf/config.yaml                 # Hydra config for all three jobs
├── src/yolo_training/
│   ├── cvat_parser.py               # Parse CVAT native JSON → per-frame annotations
│   └── s3_ops.py                    # S3 upload/download/list helpers
├── ALLOWED_CLASS.txt                # Classes to keep (others are dropped)
├── pyproject.toml                   # Project dependencies
└── LICENSE

S3 Layout

s3://gt-cvat-annotations/
  ├── <project>.zip                  # CVAT project export ZIPs (uploaded manually)
  │
  ├── raw/
  │   ├── .markers/<zip>.done        # Processed ZIP markers (contains S3 ETag)
  │   └── <project-name>/
  │       ├── project.json
  │       └── task_N/
  │           ├── task.json
  │           ├── annotations.json
  │           └── video.mp4
  │
  ├── datasets/
  │   └── yolo-cvat/
  │       └── 1.0.0/
  │           ├── train/
  │           │   ├── images/
  │           │   └── labels/
  │           └── val/
  │               ├── images/
  │               └── labels/
  │
  └── clearml/                       # ClearML metadata + model artifacts

Installation

git clone https://github.com/yourusername/yolo-ultralytics-clearml.git
cd yolo-ultralytics-clearml

# PyTorch CUDA 12.6 build (served from the PyTorch wheel index, so install it separately)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126

# Everything else
pip install -e .

Credentials

AWS — boto3 uses the standard credential chain:

IAM instance role (recommended on EC2 — no credentials needed)
~/.aws/credentials via aws configure
Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION

ClearML — run once on each machine:

clearml-init

Configuration

All settings are in conf/config.yaml. Key sections:

| Section | Purpose |
| --- | --- |
| `ingest` | S3 bucket, ZIP prefix, markers prefix, optional temp dir |
| `sample` | Sample rate, val ratio, allowed classes, dataset version |
| `dataset` | ClearML dataset to pull for training |
| `training` | Model weights, S3 output URI for artifacts |
| `yolo_args` | YOLO hyperparameters (epochs, batch, imgsz, device, …) |

Any key can be overridden on the command line via Hydra.
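
To make the override syntax concrete, here is a minimal pure-Python sketch of how a dotted `key=value` override maps onto a nested config. It is not Hydra's implementation (Hydra and OmegaConf also handle typing, interpolation, and validation). The helper name `apply_override` and the default values in `config` are assumptions made for illustration; the real defaults live in conf/config.yaml.

```python
import ast


def apply_override(config: dict, override: str) -> None:
    """Apply one 'a.b.c=value' override to a nested dict, in place."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Walk down the nested sections, creating them if missing.
        node = node.setdefault(name, {})
    try:
        # Interpret numbers and other literals; fall back to a plain string.
        node[leaf] = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        node[leaf] = raw_value


# Hypothetical defaults, for illustration only.
config = {
    "sample": {"sample_every_n": 10, "dataset_version": "1.0.0"},
    "yolo_args": {"epochs": 50},
}

for item in ["sample.sample_every_n=5", "sample.dataset_version=1.1.0"]:
    apply_override(config, item)
```

After the loop, only the two overridden keys under `sample` differ from the defaults; every other setting keeps its configured value.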

Usage

Job 1 — Ingest

Upload one or more CVAT project export ZIPs to s3://gt-cvat-annotations/, then run:

python ingest.py

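Already-ingested ZIPs are skipped automatically. Each processed ZIP gets a marker under raw/.markers/ that records its S3 ETag, and a ZIP is re-ingested only when its current ETag differs from the recorded one, for example after it is re-uploaded with new content.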
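
The skip decision can be sketched as a pure comparison of ETags. This is an illustrative sketch, not the code in ingest.py: the function names `marker_key` and `needs_ingest` are hypothetical, and it assumes the marker is named after the full ZIP file name and contains only the ETag string.

```python
from typing import Optional


def marker_key(zip_key: str, markers_prefix: str = "raw/.markers/") -> str:
    """S3 key of the marker for a ZIP, following the raw/.markers/<zip>.done layout."""
    zip_name = zip_key.rsplit("/", 1)[-1]
    return f"{markers_prefix}{zip_name}.done"


def needs_ingest(current_etag: str, marker_etag: Optional[str]) -> bool:
    """True when no marker exists yet or the ZIP's ETag changed since the marker was written."""
    if marker_etag is None:
        return True
    # S3 returns ETags wrapped in double quotes; normalise both sides before comparing.
    return current_etag.strip('"') != marker_etag.strip().strip('"')
```

In a real job, `current_etag` would come from the object's S3 metadata and `marker_etag` from the body of the marker object, or `None` if the marker does not exist.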

# Use /dev/shm as temp dir if root partition is tight
python ingest.py ingest.tmp_dir=/dev/shm

Job 2 — Sample

python sample.py

# Tune sampling rate or cut a new dataset version
python sample.py sample.sample_every_n=5 sample.dataset_version=1.1.0

Creates a versioned ClearML Dataset whose files are S3 references — no data is copied into ClearML.
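
The two core steps of sampling, picking every Nth frame and writing a YOLO label line, can be sketched as follows. This is a simplified illustration rather than the code in sample.py: both function names are hypothetical, sampling is assumed to start at frame 0, and boxes are assumed to arrive as pixel corner coordinates (top-left and bottom-right). The output format, a class id followed by the normalised box centre, width, and height, is the standard YOLO label format.

```python
def sampled_frames(total_frames: int, every_n: int) -> list[int]:
    """Indices of the frames kept when taking every Nth frame, starting at frame 0."""
    return list(range(0, total_frames, every_n))


def to_yolo_line(class_id: int, xtl: float, ytl: float, xbr: float, ybr: float,
                 img_w: int, img_h: int) -> str:
    """Convert a pixel-space corner box to a YOLO label line.

    YOLO expects 'class cx cy w h' with all four box values normalised to [0, 1].
    """
    cx = (xtl + xbr) / 2 / img_w
    cy = (ytl + ybr) / 2 / img_h
    w = (xbr - xtl) / img_w
    h = (ybr - ytl) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


frames = sampled_frames(total_frames=12, every_n=5)
label = to_yolo_line(0, 100, 50, 300, 250, img_w=1000, img_h=500)
```

Each sampled frame produces one image file and one label file containing one such line per kept annotation.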

Job 3 — Train

python train.py

# Use a specific dataset version
python train.py dataset.version=1.1.0 yolo_args.epochs=100

YOLO metrics are auto-logged to ClearML. The best model weights (best.pt) are registered as a ClearML OutputModel artifact stored in S3.

Allowed Classes

Edit ALLOWED_CLASS.txt (one class per line) to control which CVAT labels are kept. All other labels are silently dropped during sampling.

Current defaults: person, bicycle, car.
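
As a rough sketch of the filtering step, the snippet below parses the class list and drops annotations with other labels. It is not the repository's implementation: the function names are hypothetical, the annotation dictionaries with a `label` key are an assumed shape, and assigning YOLO class ids by line order is an assumption about how ids are derived.

```python
def load_allowed_classes(text: str) -> list[str]:
    """Parse ALLOWED_CLASS.txt content: one class per line, blank lines ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def filter_annotations(annotations: list[dict], allowed: list[str]) -> list[dict]:
    """Keep annotations whose label is allowed and attach a class id (assumed: line order)."""
    class_ids = {name: idx for idx, name in enumerate(allowed)}
    return [
        {**ann, "class_id": class_ids[ann["label"]]}
        for ann in annotations
        if ann["label"] in class_ids
    ]


allowed = load_allowed_classes("person\nbicycle\ncar\n")
kept = filter_annotations(
    [{"label": "car"}, {"label": "dog"}, {"label": "person"}],
    allowed,
)
```

Here the `dog` annotation is dropped without any warning, which matches the silent-drop behaviour described above.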

License

MIT License
