Skip to content

Fix IMDS autoConfigureHopLimit to scope to VPC instead of region#42

Open
johnlam90 wants to merge 2 commits intomainfrom
claude/fix-aws-multi-eni-bug-Ybvg6
Open

Fix IMDS autoConfigureHopLimit to scope to VPC instead of region#42
johnlam90 wants to merge 2 commits intomainfrom
claude/fix-aws-multi-eni-bug-Ybvg6

Conversation

@johnlam90
Copy link
Copy Markdown
Owner

@johnlam90 johnlam90 commented Mar 25, 2026

Summary

  • Fix tryVPCWideConfiguration() to filter by VPC ID instead of querying all running instances in the entire AWS region. Adds a vpc-id filter to the DescribeInstances call and a new resolveCurrentVPCID() method that determines the VPC via private IP lookup before the last-resort strategy runs.
  • Add guard clause rejecting empty VPC IDs to prevent region-wide modification if VPC resolution fails.
  • Update SECURITY.md to include missing ec2:DescribeInstances and ec2:ModifyInstanceMetadataOptions IAM permissions.

Test plan

  • New unit tests for empty VPC ID guard clause, aggressive config disabled path, and VPC resolution error handling
  • go build ./... passes
  • go vet ./... passes
  • go test ./pkg/aws/... passes

Fixes #33

The tryVPCWideConfiguration() function was querying all running instances
in the AWS region instead of filtering by VPC. This adds a vpc-id filter
to the DescribeInstances call and resolves the current VPC ID lazily via
private IP lookup before invoking the last-resort strategy.

Also adds missing ec2:DescribeInstances and ec2:ModifyInstanceMetadataOptions
permissions to SECURITY.md.

Fixes #33

https://claude.ai/code/session_019cXyAQaLbKyaHWnZupcNUC
Copy link
Copy Markdown
Owner Author

The CI failure in "Run Tests" is pre-existing and unrelated to this PR. The failing tests (TestRegularENINoSRIOVConfiguration, TestPCIAddressMapping, TestSRIOVResourceNameParsing) are integration tests in test/integration/ that require a Kubernetes cluster, which isn't available in the CI environment. These same tests fail identically on main.

All tests in pkg/aws/ (including the new VPC filter tests) pass cleanly.

@augmentcode
Copy link
Copy Markdown

augmentcode bot commented Mar 25, 2026

🤖 Augment PR Summary

Summary: Fix IMDS hop-limit auto-configuration to only target instances in the controller’s VPC rather than the whole region.

Key change: Introduces resolveCurrentVPCID() (private-IP DescribeInstances lookup) and threads the VPC ID into the VPC-wide fallback.

Safety: Adds a guard that aborts VPC-wide configuration when the VPC ID is empty/unresolvable.

Scoping: Adds a vpc-id filter to the last-resort DescribeInstances call so only running instances in that VPC are considered.

Docs: Updates SECURITY.md with required ec2:DescribeInstances and ec2:ModifyInstanceMetadataOptions IAM actions.

Tests: Adds unit tests for the empty-VPC guard, aggressive-config disabled path, and VPC resolution error handling.

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 4 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread pkg/aws/ec2.go
},
}

result, err := c.EC2.DescribeInstances(ctx, input)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolveCurrentVPCID will panic if c.EC2 is nil (e.g., when EC2Client is constructed without NewEC2Client), since it unconditionally calls c.EC2.DescribeInstances here; consider returning a clear error when the client isn't initialized.

Severity: high

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Comment thread pkg/aws/ec2.go

for _, reservation := range result.Reservations {
for _, instance := range reservation.Instances {
if instance.VpcId != nil {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This returns the first VpcId from a private-IP lookup; if multiple instances match the same private IP across VPCs (overlapping CIDRs), this could select the wrong VPC and mis-scope the subsequent VPC-wide modifications—consider detecting ambiguity and failing.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Comment thread pkg/aws/imds_vpc_test.go
// This will fail because there's no EC2 client and the network interface
// lookup will either return a local IP or fail - either way, without a
// real EC2 client the DescribeInstances call would fail.
_, err := client.resolveCurrentVPCID(context.Background())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can be flaky (or panic): if getPrivateIPFromNetworkInterface() finds any private IP on the test host, resolveCurrentVPCID() will reach client.EC2.DescribeInstances(...) while EC2 is nil.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Comment thread pkg/aws/imds_vpc_test.go Outdated
// returns an error when aggressive configuration is disabled.
func TestTryVPCWideConfiguration_AggressiveDisabled(t *testing.T) {
// Ensure aggressive configuration is disabled
os.Unsetenv("IMDS_AGGRESSIVE_CONFIGURATION")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests mutate process-wide environment variables via os.Unsetenv without restoring them, which can make other tests in the package order-dependent; consider using t.Setenv (or deferring restore) to avoid leakage.

Severity: low

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

- Add nil check for c.EC2 in resolveCurrentVPCID to prevent panic
- Detect ambiguous VPC resolution when private IP matches multiple VPCs
- Replace os.Unsetenv with t.Setenv for test isolation
- Replace flaky resolveCurrentVPCID test with nil EC2 client test

https://claude.ai/code/session_019cXyAQaLbKyaHWnZupcNUC
@augmentcode
Copy link
Copy Markdown

augmentcode bot commented Mar 25, 2026

augment review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] imds.autoConfigureHopLimit not limited to VPC

2 participants