LocalStack Demo: Training and deploying an ML classifier with MWAA

This app creates a DAG inside MWAA that takes a dataset and builds a classifier model from its feature columns and target column. Three algorithms are trained (SVM, Logistic Regression, and Decision Tree), and the one with the best accuracy is picked. Finally, the model is deployed as a Lambda function.
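The "pick the most accurate of three candidates" step can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the demo's actual DAG code: the `accuracy` and `pick_best` helpers, the toy data, and the three lambda predictors are hypothetical stand-ins for the trained SVM, Logistic Regression, and Decision Tree models.

```python
from typing import Callable, Dict, Sequence, Tuple

# A "predictor" here is any callable mapping one feature row to a label.
# In the real DAG this would be a trained model's predict step (assumption).
Predictor = Callable[[Sequence[float]], str]


def accuracy(predict: Predictor, rows, labels) -> float:
    """Fraction of rows for which the predictor returns the expected label."""
    if not labels:
        raise ValueError("cannot score a predictor on an empty evaluation set")
    correct = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return correct / len(labels)


def pick_best(candidates: Dict[str, Predictor], rows, labels) -> Tuple[str, float]:
    """Score every candidate on the same held-out data and return the winner."""
    scores = {name: accuracy(predict, rows, labels) for name, predict in candidates.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores[best_name]


# Toy held-out data: a single made-up feature per row.
rows = [[1.4], [1.3], [4.7], [4.5]]
labels = ["Setosa", "Setosa", "Versicolor", "Versicolor"]

# Hypothetical stand-ins for the three trained classifiers.
candidates = {
    "svm": lambda row: "Setosa" if row[0] < 2.5 else "Versicolor",
    "logistic_regression": lambda row: "Setosa",
    "decision_tree": lambda row: "Versicolor" if row[0] < 4.6 else "Setosa",
}

best_name, best_score = pick_best(candidates, rows, labels)
print(best_name, best_score)
```

Scoring all candidates on the same held-out rows keeps the comparison fair; which split and metric the demo itself uses is not stated here.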

To keep it simple, no external dependencies (custom Docker images) were added, and the training happens locally in Airflow. The trained model is then deployed as a Lambda function. This is not ideal, since workloads are usually off-loaded (e.g. to SageMaker, or EC2 / AWS Batch jobs), but easily trained models can still technically be run with the local executor.
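To make the deployment step more concrete, here is a rough sketch of what a prediction handler for such a Lambda function might look like. Everything in it is an assumption for illustration: the event shape (one key per feature column), the response format, and the `ThresholdModel` stand-in, which only mimics a trained classifier exposing a scikit-learn-style `predict()` method. The demo's real handler may differ.

```python
import json

# Hypothetical feature order, matching the iris example spec in this README.
FEATURE_COLUMNS = ["sepal.length", "sepal.width", "petal.length", "petal.width"]


class ThresholdModel:
    """Stand-in for a trained classifier with a scikit-learn-style predict()."""

    def predict(self, rows):
        # Toy rule on petal.length (index 2); a real model would be loaded
        # from the deployment package instead (assumption).
        return ["Setosa" if row[2] < 2.5 else "Versicolor" for row in rows]


# Created once at import time so warm invocations reuse the same model object.
MODEL = ThresholdModel()


def handler(event, context):
    """Lambda-style entry point: read features from the event, return a label."""
    try:
        row = [float(event[column]) for column in FEATURE_COLUMNS]
    except (KeyError, TypeError, ValueError) as exc:
        return {"statusCode": 400, "body": json.dumps({"error": f"bad input: {exc}"})}
    prediction = MODEL.predict([row])[0]
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}


response = handler(
    {"sepal.length": 5.1, "sepal.width": 3.5, "petal.length": 1.4, "petal.width": 0.2},
    None,
)
print(response)
```

Keeping the model at module level rather than inside the handler is a common Lambda pattern, since module-level state survives between warm invocations.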

The only input to the DAG is an airflow/variables/dataset_spec secret in the Secrets Manager service, like the following one:

{
    "url": "https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv",
    "name": "iris.data",
    "feature_columns": ["sepal.length", "sepal.width", "petal.length", "petal.width"],
    "target_column": "variety"
}
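As a quick sanity check before storing the secret, a spec like the one above could be parsed and validated along these lines. This is only a sketch: the `parse_dataset_spec` helper and its validation rules (all four keys present, a non-empty feature list, a target column that is not also a feature) are assumptions, not checks the demo is known to perform.

```python
import json

# The four keys shown in the example spec above.
REQUIRED_KEYS = ("url", "name", "feature_columns", "target_column")


def parse_dataset_spec(secret_string: str) -> dict:
    """Parse the JSON secret value and apply a few basic (assumed) checks."""
    spec = json.loads(secret_string)
    missing = [key for key in REQUIRED_KEYS if key not in spec]
    if missing:
        raise ValueError(f"dataset spec is missing keys: {missing}")
    features = spec["feature_columns"]
    if not isinstance(features, list) or not features:
        raise ValueError("feature_columns must be a non-empty list")
    if spec["target_column"] in features:
        raise ValueError("target_column must not also be a feature column")
    return spec


example_secret = """
{
    "url": "https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv",
    "name": "iris.data",
    "feature_columns": ["sepal.length", "sepal.width", "petal.length", "petal.width"],
    "target_column": "variety"
}
"""

spec = parse_dataset_spec(example_secret)
print(spec["name"], spec["feature_columns"], spec["target_column"])
```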

Prerequisites

  • LocalStack
  • Docker
  • Python 3.8+ / Python Pip
  • make
  • jq
  • curl
  • awslocal

Installing

To install the dependencies:

make install

Starting LocalStack

Make sure that LocalStack is started:

LOCALSTACK_AUTH_TOKEN=... make start

Running

Run the sample demo script:

make run

Proxying Secrets

To proxy Airflow variables to upstream AWS, you can use the proxy.conf config file so that only upstream AWS secrets are used as the Airflow variables. This works because the Airflow variables are sourced from the AWS Secrets Manager backend. It assumes that the localstack-extension-aws-replicator extension (https://pypi.org/project/localstack-extension-aws-replicator/) is installed on the LocalStack instance.

localstack aws proxy -c proxy.conf --container

License

This code is available under the Apache 2.0 license.