Machine Learning Operations

Intro to MLOps

source: YT/Sokratis Kartakis

MLOps sits at the intersection of ML, Data Engineering & DevOps.

MLOps Maturity Model

Each stage names the KPI to optimize and the practices to introduce:

  • (Initial) Time to PoC: Standardize environments & processes; govern access; track experiments

  • (Repeatable) Model deployment time: Standardize SCM, automate model build & deployment, automate governed access, centralize management of models

  • (Reliable) ML Defect Rate: Introduce auto testing, monitoring & lineage tracking. Standardize CI/CD & multi-account deployment.

  • (Scalable) ML Lifecycle Time: ML Lifecycle reproducible via templates across teams. Standardize infra & team onboarding.

MLOps on AWS

  • Self-service secure Infra deployment for ML use cases: AWS Sagemaker Projects, AWS Service Catalog, IaC

  • Auditability: Sagemaker Experiments, Model Registry & Model Monitor, Model Dashboard & Cards

  • Increased Collaboration among Data Scientists: Sagemaker Studio

  • Enable sustainability: Sagemaker Processor & Pipelines

  • Embedded QA & Automated Testing: Sagemaker Model Registry, CodePipeline & CodeBuild

  • Data Preparation: Sagemaker (Spark) Processors

  • Explainability & Bias Reporting: Sagemaker Clarify & Model Monitor

  • Production-ready ML Workflows: Sagemaker Pipelines (see the sketch below)
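
To make the last bullet concrete, here is a minimal sketch of a Sagemaker Pipeline that chains a pre-processing step and a training step. The role ARN, bucket paths, preprocess.py script, and container versions are placeholders, and exact arguments can vary across sagemaker SDK v2 releases.

```python
# Minimal Sagemaker Pipelines sketch: preprocess -> train.
# Role ARN, bucket paths and preprocess.py are placeholders for your account.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep

session = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder
raw_data = ParameterString(name="InputData", default_value="s3://my-bucket/raw/data.csv")

# Pre-processing step: clean/split the raw data inside an SKLearn container.
processor = SKLearnProcessor(framework_version="1.2-1", role=role,
                             instance_type="ml.m5.xlarge", instance_count=1)
preprocess = ProcessingStep(
    name="Preprocess",
    processor=processor,
    code="preprocess.py",  # hypothetical local script
    inputs=[ProcessingInput(source=raw_data, destination="/opt/ml/processing/input")],
    outputs=[ProcessingOutput(output_name="train", source="/opt/ml/processing/train")],
)

# Training step: built-in XGBoost container reading the processed output.
xgb = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.7-1"),
    role=role, instance_type="ml.m5.xlarge", instance_count=1,
    output_path="s3://my-bucket/models/",
)
train = TrainingStep(
    name="Train",
    estimator=xgb,
    inputs={"train": TrainingInput(
        s3_data=preprocess.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri,
        content_type="text/csv")},
)

pipeline = Pipeline(name="demo-pipeline", parameters=[raw_data], steps=[preprocess, train])
pipeline.upsert(role_arn=role)  # create or update the pipeline definition
pipeline.start()
```

Defining the workflow as code like this is what lets CI/CD (e.g. the CodePipeline & CodeBuild bullet above) call upsert()/start() on every commit instead of re-running notebooks by hand.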

MLOps KPI Metrics

  • Time to Value (Inception to Production)

  • Time to productionize existing ML use cases

  • Percent of Template Driven Development

  • Time to init new MLOps infra & ML Projects

  • Ability to execute ML solutions without internet access in a private cloud

  • Reduce Infra Costs

ML Lifecycle: research to production

 [Data     ]---->[Data Curation, Quality]--->[Data prep, pipeline]-->
 [Ingestion]     [ & Cataloging         ]    [ & sharing         ]  |
 ,________________________________________________________________<-|
 |
 |           _______________________________________________________
 |         ,/                                                       \
 | ETL  [Data sampling]    [New feature]    [Model Build]    [Model Eval]
 |----->[& exploration]--->[engineering]--->[/Fine Tune ]--->[ & PoV    ]
 |
 |           __________________________________________________________
 | ML      ,/                                                          \
 '----->[Auto re-train]--->[Model version]--->[Model deployment]-->[Model  ]
        [at Scale     ]    [& auditing   ]    [& serve at scale]   [Monitor]

  • Separation of Concerns: Platform Administration, Data, {Experimentation, Model Build, Model Test, Model Deployment}, ML Governance
  • Experimentation: Notebooks
  • Model Pipeline: pre-migration notebooks are moved into a standardized project structure; CI/CD runs, tests & debugs each execution. Standardize the repo structure for the ML build & deploy phases.
  • Standardize data storage & versioning based on ML Pipelines

AWS Sagemaker Guides, Samples & Tools

  • Amazon Sagemaker Experiments: track parameters & metrics across ML experiments (see the tracking sketch after this list)

  • Sagemaker Pipelines: pre-process, train, evaluate & register models (see the pipeline sketch above)

  • Sagemaker Model Registry: store, version & trigger model promotion (see the registration sketch after this list)

  • Sagemaker Projects: manage repos & CI/CD per project; organize the ML lifecycle under one namespace

  • Sagemaker Model Monitor: automatic detection of data & model quality drift (see the monitoring sketch after this list)

  • Sagemaker Lineage Tracking: track workflow steps; model & data lineage; establish model governance & audit

  • ML Governance - Model Cards: create a single source of truth for model information; gain visibility with the Sagemaker Model Dashboard

  • Sagemaker Real-Time Endpoints, Batch Transform & Shadow Testing for model testing & deployment (see the deployment sketch after this list)

  • Sagemaker Custom Project Templates & AWS Service Catalog for streamlined environment initialization

  • Sagemaker Data Wrangler, Sagemaker Feature Store, AWS Lake Formation, AWS Glue & Amazon EMR for automated data flows
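
A minimal sketch of the Sagemaker Experiments tracking bullet above, assuming sagemaker SDK v2.123+; the experiment name, run name, parameter and metric are illustrative placeholders. Anything logged inside the run context shows up in Sagemaker Studio for comparison across runs.

```python
# Sketch: log parameters & metrics for one experiment run (names are placeholders).
from sagemaker.experiments.run import Run

with Run(experiment_name="churn-prediction", run_name="xgb-baseline") as run:
    run.log_parameter("max_depth", 6)                  # hyper-parameter of this run
    run.log_metric(name="validation:auc", value=0.91)  # evaluation metric of this run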
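```

A hedged sketch of the Model Registry bullet: register a trained model into a Model Package Group so promotion can be gated on approval status. The image URI, model artifact path, role ARN and group name are placeholders.

```python
# Sketch: register a trained model version into the Model Registry (placeholder ARNs/paths).
from sagemaker.model import Model

model = Model(
    image_uri="<inference-image-uri>",
    model_data="s3://my-bucket/models/model.tar.gz",
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
)
model.register(
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name="churn-models",
    approval_status="PendingManualApproval",  # promote by flipping this to "Approved"
)
```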
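
A sketch of the Model Monitor bullet: first baseline the training data, then schedule hourly data-quality checks against captured endpoint traffic. The endpoint name, S3 paths and instance sizes are assumptions, and the endpoint must already have data capture enabled.

```python
# Sketch: baseline the training data, then schedule hourly data-quality monitoring (placeholder names).
from sagemaker.model_monitor import CronExpressionGenerator, DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

monitor = DefaultModelMonitor(
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    instance_count=1, instance_type="ml.m5.xlarge",
    volume_size_in_gb=20, max_runtime_in_seconds=3600,
)
monitor.suggest_baseline(               # compute statistics & constraints from the training data
    baseline_dataset="s3://my-bucket/baseline/train.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-bucket/baseline/results",
)
monitor.create_monitoring_schedule(     # compare captured endpoint traffic against the baseline
    monitor_schedule_name="churn-data-quality",
    endpoint_input="churn-endpoint",    # endpoint must have data capture enabled
    output_s3_uri="s3://my-bucket/monitoring",
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
```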
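
Finally, a sketch of the real-time endpoint bullet: deploy a model container to an endpoint and invoke it. The image URI, model artifact, endpoint name and payload format are placeholders; Batch Transform or shadow variants follow the same pattern for offline or shadow testing.

```python
# Sketch: deploy a model container to a real-time endpoint and invoke it (placeholder names/paths).
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer

model = Model(
    image_uri="<inference-image-uri>",
    model_data="s3://my-bucket/models/model.tar.gz",
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
)
model.deploy(initial_instance_count=1, instance_type="ml.m5.large",
             endpoint_name="churn-endpoint")

predictor = Predictor(endpoint_name="churn-endpoint", serializer=CSVSerializer())
print(predictor.predict("42,0,1,3.5"))  # payload & response format depend on the serving container
```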