
feat: Add feature to enable dynamic ec2 config via workflow labels #5003

Merged
Brend-Smits merged 34 commits into github-aws-runners:main from edersonbrilhante:feat-ec2-dynamic-config
Jun 10, 2026

Conversation

@edersonbrilhante

@edersonbrilhante edersonbrilhante commented Jan 19, 2026

Contributor

Summary

This PR resumes and completes the work started in #4529.

It also allows any other dynamic label with the prefix ghr-, which gives support for unique labels per job or per group of jobs.

It ensures that EC2-specific config can be defined via runs-on.

How to test:

Use your regular labels, and add ghr-ec2-instance-type and ghr-ec2-image-id

runs-on:
  - <regular-labels>
  - ghr-ec2-instance-type:<different-instance-type>
  - ghr-ec2-image-id:<different-ami-id>

In this case:

  • The runner is resolved from <regular-labels>
  • The EC2 instance type and AMI ID are taken exactly from the provided labels
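
The split described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `splitLabels`, `SplitLabels`, and the two prefix constants are hypothetical names, and the handling of malformed `ghr-ec2-` labels is an assumption.

```typescript
// Hypothetical helper for illustration; not the implementation in this PR.
const DYNAMIC_PREFIX = "ghr-";
const EC2_PREFIX = "ghr-ec2-";

interface SplitLabels {
  regular: string[];                    // labels used to resolve the runner
  ec2Overrides: Record<string, string>; // e.g. { "instance-type": "c5.2xlarge" }
  otherDynamic: string[];               // any other ghr-* label
}

function splitLabels(labels: string[]): SplitLabels {
  const result: SplitLabels = { regular: [], ec2Overrides: {}, otherDynamic: [] };
  for (const label of labels) {
    if (label.startsWith(EC2_PREFIX)) {
      // Expect "ghr-ec2-<key>:<value>"; split on the first colon only.
      const rest = label.slice(EC2_PREFIX.length);
      const sep = rest.indexOf(":");
      if (sep > 0 && sep < rest.length - 1) {
        result.ec2Overrides[rest.slice(0, sep)] = rest.slice(sep + 1);
      } else {
        // Assumption: a label without "<key>:<value>" carries no override.
        result.otherDynamic.push(label);
      }
    } else if (label.startsWith(DYNAMIC_PREFIX)) {
      result.otherDynamic.push(label);
    } else {
      result.regular.push(label);
    }
  }
  return result;
}
```

With this sketch, a label such as `ghr-ec2-instance-type:c5.2xlarge` ends up in `ec2Overrides` under the key `instance-type`, while `self-hosted` or `linux` stay in `regular`.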

@edersonbrilhante edersonbrilhante marked this pull request as ready for review January 20, 2026 16:16
@edersonbrilhante edersonbrilhante requested review from a team as code owners January 20, 2026 16:16
@npalm

npalm commented Jan 30, 2026

Member

@edersonbrilhante great to see this PR.

@edersonbrilhante edersonbrilhante force-pushed the feat-ec2-dynamic-config branch 5 times, most recently from 5dab5e4 to afdfb36 Compare February 5, 2026 13:24
@andrecastro

Contributor

This is a really interesting feature!

Just one suggestion from my side: would it be possible to support a whitelist of allowed instance types?

Also, it could be really powerful to have some kind of feature-flag / policy control over which parts of the configuration are allowed to be dynamic. For example, in my org we don’t want developers to be able to select arbitrary AMIs (only a pre-approved set), but it would be awesome to still allow them to choose the instance type for workflow jobs, as long as it’s constrained to an allowed list.

Maybe the "feature flag" is not even necessary as long as we can define the "allowed values" for each configuration; with this we could list only the pre-approved AMIs.

@edersonbrilhante

Contributor Author

@andrecastro I like it, and it makes a lot of sense to me. I just need more time to think about the implementation, and this PR is already big enough XD. I could create a follow-up for adding this restricted values feature.

@stuartp44 stuartp44 left a comment

Contributor


I am happy to approve, but I do have a statement about incorrect labels and the effect on the process.

Comment thread lambdas/functions/webhook/src/runners/dispatch.test.ts
Comment thread lambdas/functions/control-plane/src/aws/runners.test.ts
@stuartp44

stuartp44 commented Feb 20, 2026

Contributor

I also agree with what was previously mentioned; we probably need a safelist, as we don't want lateral movement when a compromised pipeline is used, especially with the VPC setting. Maybe worth some "allowed_instance_type" setting or something to that effect that can be checked against, and if not in the list, ignored.
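
A check of this kind could look roughly like the sketch below. It is not part of this PR, which merged without allow/deny lists: `applyAllowList`, `Overrides`, and `AllowList` are hypothetical names, and the rule "anything not listed is silently ignored" is taken from the suggestion above.

```typescript
// Hypothetical safelist check for illustration; not implemented in this PR.
type Overrides = Record<string, string>;
type AllowList = Record<string, readonly string[]>;

function applyAllowList(requested: Overrides, allowed: AllowList): Overrides {
  const accepted: Overrides = {};
  for (const key of Object.keys(requested)) {
    // Only keys explicitly present in the allow list can be overridden.
    if (!Object.prototype.hasOwnProperty.call(allowed, key)) {
      continue;
    }
    const value = requested[key];
    // Values outside the allowed set are ignored rather than rejected.
    if (allowed[key].includes(value)) {
      accepted[key] = value;
    }
  }
  return accepted;
}
```

With an allow list that only contains `instance-type`, a requested `image-id` override would be dropped, which covers the case of permitting instance type selection while keeping AMIs fixed.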

@edersonbrilhante edersonbrilhante force-pushed the feat-ec2-dynamic-config branch from afdfb36 to 0e4f870 Compare March 6, 2026 23:13
@edersonbrilhante

Contributor Author

@stuartp44 We can add the safelist, but I think we need a deeper discussion about the implementation

@edersonbrilhante edersonbrilhante force-pushed the feat-ec2-dynamic-config branch from aae08eb to 750a9ff Compare March 9, 2026 20:53
@edersonbrilhante edersonbrilhante force-pushed the feat-ec2-dynamic-config branch 4 times, most recently from ad2f612 to e67a37e Compare March 11, 2026 19:41
@npalm

npalm commented Mar 17, 2026

Member

@edersonbrilhante I tried to test the feature, but so far I have not got it working.

Labels used: runs-on: [self-hosted, x64, linux, ubuntu-latest, ghr-ec2-instance-type:c5.2xlarge, ghr-ec2-ebs-volume-size:200]. I still need to check what is wrong. I enabled the feature and rebuilt the lambda stack.

@edersonbrilhante

Contributor Author

Can you print the logs from dispatch-to-runner? Check if the labels were accepted or not.

npalm
npalm previously requested changes Apr 9, 2026

@npalm npalm left a comment

Member


Great feature. Maybe we can add a small safety net in the dispatcher, plus a clear warning in the docs. The variable that enables the feature should also refer to the risk described in the docs.

Comment thread lambdas/functions/webhook/src/runners/dispatch.ts Outdated
Comment thread docs/configuration.md
Comment thread lambdas/functions/webhook/src/runners/dispatch.ts
@edersonbrilhante

Contributor Author

@npalm can you review?

@edersonbrilhante edersonbrilhante requested a review from npalm June 10, 2026 11:37
@edersonbrilhante edersonbrilhante force-pushed the feat-ec2-dynamic-config branch from eca2581 to e6e19ce Compare June 10, 2026 11:50
@edersonbrilhante edersonbrilhante force-pushed the feat-ec2-dynamic-config branch from e6e19ce to f987e00 Compare June 10, 2026 11:57
@Brend-Smits Brend-Smits dismissed npalm’s stale review June 10, 2026 12:17

feedback processed - good to merge.

@Brend-Smits Brend-Smits left a comment

Contributor


LGTM, tested it and it seems to work. Documentation looking great.
Thanks a lot!

Let's improve it from here by adding allow/deny lists 👍🏼

@Brend-Smits Brend-Smits merged commit c68445d into github-aws-runners:main Jun 10, 2026
42 checks passed
Brend-Smits pushed a commit that referenced this pull request Jun 11, 2026
🤖 I have created a release *beep* *boop*
---


## [7.7.0](v7.6.1...v7.7.0) (2026-06-11)


### Features

* Add feature to enable dynamic ec2 config via workflow labels ([#5003](#5003)) ([c68445d](c68445d))
* add support for macos runners ([#4930](#4930)) ([3e179a3](3e179a3))
* Introduce Amazon Linux 2023 ARM image ([#4780](#4780)) ([e572ae5](e572ae5))
* relax cpu_options schema and add amd_sev_snp + nested_virtualization support ([#5039](#5039)) ([5a3746d](5a3746d))
* **runner-role:** Enable using separate IAM role for runners ([#4875](#4875)) ([6642e57](6642e57))


### Bug Fixes

* **ci:** sign auto-generated docs commits ([#5154](#5154)) ([a6af4d2](a6af4d2))
* **runners:** wire job_retry.lambda_memory_size and lambda_timeout ([#5120](#5120)) ([404785e](404785e))
* **scale-up:** Add ec2:TerminateInstances permission to scale-up Lambda IAM policy ([#5152](#5152)) ([94c4e12](94c4e12))
* **scale-up:** prevent negative TotalTargetCapacity when runners exceed maximum ([#5062](#5062)) ([9ab7410](9ab7410))
* **webhook:** Fix publish events to EventBridge ([#5143](#5143)) ([a72b737](a72b737))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: runners-releaser[bot] <194412594+runners-releaser[bot]@users.noreply.github.com>
@piotrj

piotrj commented Jun 11, 2026


@edersonbrilhante @Brend-Smits I'm working on getting rid of conflicts in #5077 and there is a Priority value that kind of clashes with this feature.

I wonder what the use case is for setting Priority via a dynamic label? Priority makes sense mostly (and possibly only) when you define an EC2 fleet that has multiple instance_types and a prioritised allocation type. Then during a scale-up it tries to launch the instance type with the highest priority (and then goes down the list if there is no available capacity).

For the case of dynamic labels where you create a separate fleet and that fleet has only 1 instance type it doesn't really make sense to define the priority.

@edersonbrilhante

Contributor Author

@piotrj the logic is the same as today: take "the first match", then use the dynamic labels to override the launch template.
Your feature is compatible. The non-dynamic labels find the first match using the priority config, and then the EC2 config is overridden by the values passed via ec2-*.
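
As a rough sketch of that order of operations (first match, then override): the names `RunnerConfig` and `resolveRunner` are hypothetical, the configs are assumed to be already ordered by priority, and the matching rule used here (every non-dynamic job label must appear in the runner config's labels) is a simplification, not necessarily what the dispatcher does.

```typescript
// Hypothetical types for illustration; not the module's real data model.
interface RunnerConfig {
  name: string;
  labels: string[];                       // labels this runner config matches on
  launchDefaults: Record<string, string>; // e.g. instance-type, image-id
}

interface ResolvedRunner {
  name: string;
  launch: Record<string, string>;
}

function resolveRunner(
  configs: RunnerConfig[], // assumed to be sorted by priority already
  regularLabels: string[],
  ec2Overrides: Record<string, string>,
): ResolvedRunner | undefined {
  // Step 1: the non-dynamic labels pick the first matching runner config.
  const match = configs.find((config) =>
    regularLabels.every((label) => config.labels.includes(label)),
  );
  if (match === undefined) {
    return undefined;
  }
  // Step 2: dynamic EC2 values replace the matched config's defaults.
  return { name: match.name, launch: { ...match.launchDefaults, ...ec2Overrides } };
}
```

In this sketch the priority only influences which config is matched in step 1; the overrides in step 2 are applied on top of whichever config won.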



6 participants