NWDAF ML Service

FastAPI service for ML model lifecycle management and performance monitoring on 5G network data.

Technologies

Python 3.12
FastAPI - REST API framework
PyTorch - model training
MLflow - model registry and experiment tracking
PostgreSQL / MinIO - persistence and artifact storage

How It Works

Models are registered with a configuration (architecture, input/output fields, window size, lookback/forecast steps)
Training jobs fetch historical windows from the Data Storage API, build sliding sequences and train a PyTorch model, logging results to MLflow
Inference fetches the latest cell data (ClickHouse windows served by the Data Storage API) and runs the elected best model for the requested output field
A background monitoring loop re-scores the best model per output field on a configurable interval; if performance degrades past a threshold, all models for that field are retrained and re-evaluated automatically
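
The "sliding sequences" step can be illustrated with a minimal sketch. It assumes each window is a plain list of feature values and that inputs and targets are aligned on the same time buckets; `build_sequences` is a hypothetical helper name, not the service's actual function, and the real pipeline presumably produces PyTorch tensors rather than lists.

```python
from typing import List, Sequence, Tuple

Window = Sequence[float]  # one time bucket, as a feature vector


def build_sequences(
    inputs: Sequence[Window],
    targets: Sequence[Window],
    lookback_steps: int,
    forecast_steps: int,
) -> List[Tuple[List[Window], List[Window]]]:
    """Pair each run of `lookback_steps` input windows with the
    `forecast_steps` target windows that immediately follow it."""
    if lookback_steps < 1 or forecast_steps < 1:
        raise ValueError("lookback_steps and forecast_steps must be >= 1")
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must cover the same windows")

    samples = []
    # Last start index that still leaves room for lookback + forecast.
    last_start = len(inputs) - lookback_steps - forecast_steps
    for start in range(last_start + 1):
        split = start + lookback_steps
        x = list(inputs[start:split])
        y = list(targets[split:split + forecast_steps])
        samples.append((x, y))
    return samples
```

With 5 windows, `lookback_steps=3` and `forecast_steps=1`, this yields 2 samples; a history shorter than `lookback_steps + forecast_steps` yields none.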

Databases & Integrations

Service	Role
MLflow	Model registry, experiment tracking, per-field performance tags
PostgreSQL	Model configs, training job log, score history
MinIO	S3-compatible artifact store for trained model weights
Data Storage API	Source of processed ClickHouse windows for training, evaluation, and inference

API

Base path: /v1

Method	Endpoint	Description
`GET`	`/fields`	List available output fields. `?include_model_status=true` adds `has_models` flag per field
`GET`	`/models`	List all models. `?include_details=true` adds scores, best-for fields, training status
`POST`	`/models`	Create a new model config
`GET`	`/models/{model_id}`	Get model detail
`DELETE`	`/models/{model_id}`	Delete model from registry and config store
`POST`	`/training/train`	Queue a training job (202 Accepted)
`GET`	`/training/jobs`	List training jobs. `?model_id=` / `?status=` filters
`GET`	`/training/jobs/{job_id}`	Get job detail (status, timestamps, error)
`DELETE`	`/training/jobs/{job_id}`	Cancel a job
`POST`	`/inference`	Run inference for a cell. Omit `model_id` to use the best model
`POST`	`/performance/{field}/evaluate`	Score all models for a field and elect the best. `?metric=rmse|mae|mape|r2`
`GET`	`/performance/{field}`	Cached evaluation result (no live scoring)
`GET`	`/performance/{field}/best`	Current best model with baseline score and degradation threshold
`POST`	`/performance/{field}/set-best/{model_id}`	Override best model without re-evaluating
`POST`	`/performance/{field}/monitor`	Re-score best model only; does not change best designation
`GET`	`/performance/{field}/status`	State machine status: state, active jobs, next check time, thresholds
`GET`	`/performance/{field}/history`	Full score measurement history. `?model_id=` filter
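
As a rough illustration of calling these endpoints, the sketch below uses only the Python standard library. The host (`localhost`) is an assumption, the port is the documented default, and `build_url` / `call` are hypothetical helper names. Request and response body shapes are not specified in this README, so the helper only assumes JSON in and JSON out.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Optional

BASE_URL = "http://localhost:8060/v1"  # assumed host; 8060 is the documented default port


def build_url(path: str, **params: Any) -> str:
    """Join the base path with an endpoint and optional query parameters."""
    url = BASE_URL.rstrip("/") + "/" + path.lstrip("/")
    query = {key: value for key, value in params.items() if value is not None}
    if query:
        url += "?" + urllib.parse.urlencode(query)
    return url


def call(method: str, path: str, body: Optional[dict] = None, **params: Any) -> Any:
    """Send one request and decode a JSON response (needs a running service)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        build_url(path, **params),
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        raw = response.read()
        return json.loads(raw) if raw else None


# Example calls (commented out because they require a live service;
# "some_field" is a placeholder, not a real field name):
# call("GET", "/fields", include_model_status="true")
# call("POST", "/performance/some_field/evaluate", metric="rmse")
```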

Model Config Parameters

Parameter	Type	Description
`architecture`	`ann` | `lstm`	Model type
`input_fields`	`list[str]`	Fields used as input features
`output_fields`	`list[str]`	Fields to predict
`window_duration_seconds`	int	Time bucket size in seconds (e.g. 60, 300)
`lookback_steps`	int	Number of past windows fed as input
`forecast_steps`	int	Number of future windows predicted
`hidden_size`	int	Hidden layer width (default: 32)
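
A model config could be assembled and checked client-side before sending it to `POST /models`. The sketch below is an assumption-heavy illustration: it assumes the request body is a flat JSON object keyed by the parameter names above, and the positivity checks are guesses about sensible constraints, not documented server rules. `ModelConfig` is a hypothetical class name.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

ARCHITECTURES = ("ann", "lstm")


@dataclass
class ModelConfig:
    architecture: str
    input_fields: List[str]
    output_fields: List[str]
    window_duration_seconds: int
    lookback_steps: int
    forecast_steps: int
    hidden_size: int = 32  # documented default

    def validate(self) -> None:
        """Client-side sanity checks (assumed, not the server's rules)."""
        if self.architecture not in ARCHITECTURES:
            raise ValueError(f"architecture must be one of {ARCHITECTURES}")
        if not self.input_fields or not self.output_fields:
            raise ValueError("input_fields and output_fields must be non-empty")
        for name in ("window_duration_seconds", "lookback_steps",
                     "forecast_steps", "hidden_size"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be a positive integer")

    def to_json(self) -> str:
        """Validate, then serialize to a flat JSON object."""
        self.validate()
        return json.dumps(asdict(self))
```

For example, `ModelConfig("lstm", ["field_a"], ["field_b"], 60, 12, 3).to_json()` produces a body with `hidden_size` filled in as 32; the field names here are placeholders.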

Training Job Statuses

queued -> running -> completed | failed | cancelled
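
The lifecycle above can be modeled as a small enum with a transition table. This is a sketch, not the service's code: the names are hypothetical, and allowing a queued job to be cancelled directly is an assumption based on the cancel endpoint, since the diagram only shows terminal states after `running`.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# States a job never leaves.
TERMINAL = {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED}

# Allowed moves. QUEUED -> CANCELLED is an assumption (see lead-in).
ALLOWED = {
    JobStatus.QUEUED: {JobStatus.RUNNING, JobStatus.CANCELLED},
    JobStatus.RUNNING: set(TERMINAL),
}


def can_transition(current: JobStatus, new: JobStatus) -> bool:
    """Return True if a job may move from `current` to `new`."""
    return new in ALLOWED.get(current, set())
```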

Performance MLflow Tags

Tag	Description
`best_for:{field}`	`"true"` on the elected best model
`baseline_score:{field}`	Score at election time - degradation reference
`score_for:{field}`	Latest score
`eval_metric:{field}`	Metric used (`rmse` / `mae` / `mape` / `r2`)
`eval_at:{field}`	ISO timestamp of last evaluation
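
To make the tag scheme concrete, the sketch below elects a best model from a set of scores and builds the corresponding tag dictionary. It is illustrative only: `elect_best` and `build_tags` are hypothetical names, storing scores as plain strings is an assumption, and treating `r2` as higher-is-better while the other metrics are lower-is-better follows the usual meaning of those metrics rather than anything stated here. Writing the tags to MLflow is left out.

```python
from datetime import datetime, timezone
from typing import Dict, Optional

HIGHER_IS_BETTER = {"r2"}  # rmse, mae and mape are errors: lower is better


def elect_best(scores: Dict[str, float], metric: str) -> str:
    """Return the model id with the best score for the given metric."""
    if not scores:
        raise ValueError("no scores to rank")
    pick = max if metric in HIGHER_IS_BETTER else min
    return pick(scores, key=scores.get)


def build_tags(
    field: str,
    metric: str,
    score: float,
    is_best: bool,
    evaluated_at: Optional[datetime] = None,
) -> Dict[str, str]:
    """Build the per-field tags for one model after an evaluation."""
    evaluated_at = evaluated_at or datetime.now(timezone.utc)
    tags = {
        f"score_for:{field}": str(score),
        f"eval_metric:{field}": metric,
        f"eval_at:{field}": evaluated_at.isoformat(),
    }
    if is_best:
        # The baseline is the score at election time.
        tags[f"best_for:{field}"] = "true"
        tags[f"baseline_score:{field}"] = str(score)
    return tags
```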

Auto-Monitoring Loop

Runs as a background task on startup when MONITORING_ENABLED=true. Per-field state machine:

MONITORING --(degraded?)--> RETRAINING --(all jobs done)--> EVALUATING --> MONITORING

MONITORING - re-scores the best model every MONITORING_INTERVAL_SECONDS. Triggers retraining if score > baseline x MONITORING_DEGRADATION_FACTOR
RETRAINING - waits for all training jobs to reach a terminal state
EVALUATING - re-ranks all models and elects a new best

On startup, stale is_training locks from crashed runs are automatically cleared.
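
The per-field state machine can be sketched as a pure transition function. The names below (`FieldState`, `is_degraded`, `next_state`) are hypothetical, and the degradation check implements only the documented rule, which fits lower-is-better metrics such as `rmse`; how a higher-is-better metric like `r2` is handled is not stated here and is not modeled.

```python
from enum import Enum


class FieldState(str, Enum):
    MONITORING = "MONITORING"
    RETRAINING = "RETRAINING"
    EVALUATING = "EVALUATING"


def is_degraded(score: float, baseline: float, factor: float = 1.5) -> bool:
    """Documented trigger: score > baseline x MONITORING_DEGRADATION_FACTOR."""
    return score > baseline * factor


def next_state(
    state: FieldState, *, degraded: bool = False, jobs_done: bool = False
) -> FieldState:
    """Advance one step of the per-field state machine."""
    if state is FieldState.MONITORING:
        return FieldState.RETRAINING if degraded else FieldState.MONITORING
    if state is FieldState.RETRAINING:
        # Stay here until every training job reaches a terminal state.
        return FieldState.EVALUATING if jobs_done else FieldState.RETRAINING
    # EVALUATING elects a new best and always returns to MONITORING.
    return FieldState.MONITORING
```

With the default factor of 1.5 and a baseline of 1.0, a score of 1.6 triggers retraining while a score of exactly 1.5 does not, because the comparison is strict.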

Configuration

Variable	Description
`MLFLOW_TRACKING_URI`	MLflow server URL
`DATABASE_URL`	PostgreSQL connection string
`DATA_STORAGE_API_URL`	Data Storage service base URL
`DATA_STORAGE_DATA_ENDPOINT`	Processed data endpoint (e.g. `/api/v1/processed`)
`DATA_STORAGE_EXAMPLE_ENDPOINT`	Example schema endpoint
`DATA_STORAGE_CELL_ENDPOINT`	Cell list endpoint
`DATA_STORAGE_EXCLUDED_FIELDS`	Comma-separated metadata fields to exclude from field discovery
`AWS_ACCESS_KEY_ID`	MinIO access key
`AWS_SECRET_ACCESS_KEY`	MinIO secret key
`AWS_S3_ENDPOINT_URL`	MinIO endpoint URL
`MONITORING_ENABLED`	Enable background monitoring loop (default: `true`)
`MONITORING_INTERVAL_SECONDS`	Seconds between monitor checks (default: `300`)
`MONITORING_DEGRADATION_FACTOR`	Multiplier applied to the baseline score; retraining triggers when the current score exceeds baseline x factor (default: `1.5`)
`API_HOST`	Bind address (default: `0.0.0.0`)
`API_PORT`	Port (default: `8060`)
`ML_PORT`	Port (default: `8060`)
`LOG_LEVEL`	Log verbosity (default: `INFO`)
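
For illustration, the monitoring variables and their documented defaults could be read from the environment as follows. This is a sketch under assumptions: `MonitoringSettings` and `load_monitoring_settings` are hypothetical names, and the set of accepted boolean spellings is a guess; the service may parse its settings differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    # Accepted spellings are an assumption, not documented behavior.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class MonitoringSettings:
    enabled: bool
    interval_seconds: int
    degradation_factor: float


def load_monitoring_settings(env: Mapping[str, str] = os.environ) -> MonitoringSettings:
    """Read the monitoring variables, falling back to the documented defaults."""
    return MonitoringSettings(
        enabled=_as_bool(env.get("MONITORING_ENABLED", "true")),
        interval_seconds=int(env.get("MONITORING_INTERVAL_SECONDS", "300")),
        degradation_factor=float(env.get("MONITORING_DEGRADATION_FACTOR", "1.5")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.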

Running

cp .env.example .env
docker compose up
