AWS (Amazon Web Services) setup for MATSim simulations.
Provides generic infrastructure-as-code templates to set up MATSim simulations as AWS Batch jobs.
Includes:
- S3 buckets for input (including executable jars) and output
- IAM roles with the appropriate access rights
- an ECR repository to hold Docker/Podman images for execution
- VPC setup
- AWS Batch setup to start individual jobs, including starting and shutting down appropriately sized machines
- (optionally) a job-notification Lambda function that notifies the user about the end of a job via a predefined Slack workspace and channel
You will need:
- Java + Maven
- your own AWS account
- the AWS CDK installed for deploying the AWS resources
- Podman/Docker for building the job image
- the AWS CLI installed for pushing the Docker image to ECR
- a packaged executable jar (i.e., shaded via Maven)
Fill in the environment variables in the environment.env
file. You will need at least the AWS account number and your
desired region. This environment will act as a single source
of truth throughout the setup.
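For example, a minimal environment.env might look like this (the variable names here are illustrative; use the names defined in the provided template file):

```shell
# environment.env -- illustrative values only
AWS_ACCOUNT_ID=123456789012
AWS_REGION=eu-central-1
```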
Make sure that your credentials are available as environment variables.
As an alternative, if your organization supports it, use AWS SSO as described here. Be careful: the region of the SSO may differ from the region of your target account! You can see the parameters you need after logging in by selecting the "Access Keys" link next to your account.
Use aws sso login --profile my-profile before running the deploy script.
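With static credentials, making them available as environment variables amounts to the following (placeholder values, not real keys):

```shell
# Placeholder credentials -- replace with your own, or use AWS SSO instead
export AWS_ACCESS_KEY_ID="AKIAEXAMPLEKEY"
export AWS_SECRET_ACCESS_KEY="exampleSecretKey"
export AWS_SESSION_TOKEN="exampleSessionToken"  # only needed for temporary credentials
```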
Run the 1_deployAWSInfrastructure.sh script. This will:
- go to the matsim-aws-setup Maven module,
- execute the Maven compile goal,
- run cdk bootstrap and cdk deploy --all.
Afterwards, check if there is a cdk.out folder under matsim-aws-setup
Use 2_deployMatsimImage.sh to build and push the job image once
the ECR repository has been set up.
The scenarios folder contains the equil example scenario from the original MATSim repository.
Within the matsim-aws-setup module, there is the io.moia.aws.run package that shows
how to get your first simulation running.
First, (with correct AWS credentials and the environment.env variables in your environment)
run the PrepareInput class, which simply uploads the required scenario files to your newly
created S3 input bucket.
Next, you need to package the Maven module into an executable jar, such that it contains the RunEquil
main class. You can use the 3_updateJar.sh script provided here to run the package command and
upload the resulting jar to the correct input bucket path.
Once the input and jar are uploaded, you need to define an AWS Batch job definition.
The job definition acts as a template for how a job should be run.
It defines various parameters, such as the batch job queue, the input/output buckets,
the main class, etc. Run the EquilExampleJobDefinition (again with correct environment variables)
to register the definition in your account.
Now you can run the EquilExampleJobSubmission class to actually submit your job. A link to the
AWS Batch job will be printed to the console. The output will be synced to your output bucket.
Useful commands:

| Command | Description |
|---|---|
| mvn package | compile and run tests |
| cdk ls | list all stacks in the app |
| cdk synth | emits the synthesized CloudFormation template |
| cdk deploy | deploy this stack to your default AWS account/region |
| cdk diff | compare deployed stack with current state |
| cdk docs | open CDK documentation |
At the end of each job, run.sh writes a _run_metadata.json file to the job's S3 output prefix:
s3://{output-bucket}/{OUTPUT_SCENARIO}/{JOB_NAME}/_run_metadata.json
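As a sketch, that prefix can be composed like this (the bucket and job names are illustrative, and the environment variable names are assumptions for this example, not necessarily those used by run.sh):

```shell
OUTPUT_BUCKET=my-output-bucket   # illustrative bucket name
OUTPUT_SCENARIO=equil
JOB_NAME=my-first-run
# Compose the full S3 URI of the metadata file for this job
METADATA_URI="s3://${OUTPUT_BUCKET}/${OUTPUT_SCENARIO}/${JOB_NAME}/_run_metadata.json"
echo "$METADATA_URI"
```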
The file always contains:
| Field | Value |
|---|---|
| jobName | The AWS Batch job name |
| outputPath | The full S3 key prefix for this job's outputs |
| completedAt | ISO 8601 UTC timestamp of job completion |
| status | "success" or "failed" |
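A resulting file might look like this (all values, including the exact shape of the output path, are illustrative):

```json
{
  "jobName": "my-first-run",
  "outputPath": "equil/my-first-run/",
  "completedAt": "2024-05-01T12:34:56Z",
  "status": "success"
}
```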
Additional fields can be injected at submission time by setting the RUN_METADATA_EXTRA environment variable (via container overrides) to a JSON fragment — comma-separated "key": "value" pairs without the enclosing braces:
RUN_METADATA_EXTRA='"triggeredBy": "alice", "githubRunId": "12345"'

By default, the deployment creates two S3 buckets (input and output) and manages their configuration, including lifecycle rules, via CloudFormation. Re-running the deploy script on an existing setup is safe: CloudFormation updates bucket properties in place without recreating them or affecting stored data.
If you want to bring your own pre-existing buckets and have CDK reference them without managing their configuration, set the useExistingBuckets context flag:
cdk deploy --all --context useExistingBuckets=true

When this flag is set, CDK imports the buckets by their expected names and does not create or modify them. Lifecycle rules and other bucket properties will not be applied.
The output S3 bucket includes a lifecycle rule (DeleteFailedSimulationOutputs) that automatically deletes the outputs of failed simulation runs. When a job exits with a non-zero code, run.sh tags every object in the job's output prefix with SimulationStatus=failed. The lifecycle rule deletes all tagged objects after a configurable retention period.
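In S3 lifecycle-configuration terms, the rule is roughly equivalent to the following (a sketch of the generated configuration under the default retention period, not the exact CloudFormation output):

```json
{
  "Rules": [
    {
      "ID": "DeleteFailedSimulationOutputs",
      "Status": "Enabled",
      "Filter": {
        "Tag": { "Key": "SimulationStatus", "Value": "failed" }
      },
      "Expiration": { "Days": 7 }
    }
  ]
}
```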
Configure the retention period at deploy time:
cdk deploy --context failedRunRetentionDays=14   # default: 7

The code is provided as is. There is no warranty about the correct usage of AWS resources. Any costs incurred in the user's account should be monitored closely and are entirely the responsibility of the user.
! Any deployment of this app may alter your AWS setup unintentionally !