Skip to content

fix(patch): strip hasProviderDefault fields recursively through arrays#423

Merged
JeroenSoeters merged 2 commits into
mainfrom
fix/recursive-provider-default-stripping
Apr 21, 2026
Merged

fix(patch): strip hasProviderDefault fields recursively through arrays#423
JeroenSoeters merged 2 commits into
mainfrom
fix/recursive-provider-default-stripping

Conversation

@JeroenSoeters
Copy link
Copy Markdown
Collaborator

Summary

Re-applying an unchanged forma could silently destroy working stateful resources. The most visible case: adding any unrelated resource to a forma that contains an AWS::ECS::Service triggered a full replace of the Service (5–10 min outage), even though the Service's own inputs hadn't changed.

Root cause was in the patch-generation path's handling of provider-defaulted sub-fields inside nested list elements:

  • AWS::ECS::ContainerDefinition.Cpu is annotated hasProviderDefault = true. AWS Read returns Cpu: 0 even when the user didn't set it. Desired state (from PKL eval) omits the field.
  • jsonpatch compares list-valued fields as sets by default — serializing each element to JSON bytes. {"Cpu":0,…}{…} as bytes, so container elements fail to match their counterparts.
  • compareArray then emits add/remove patch ops against /ContainerDefinitions/N. That path is createOnlyneedsReplacement = true → resource replace.
  • The Service replaces in turn because its TaskDefinition resolvable diffs.

The existing hasProviderDefault stripping worked only at the top level — not on sub-fields inside list elements.

Change

Replace the prior document-only stripper with a two-sided walker (stripProviderDefaultPath) that descends a dotted schema path through both the document and the patch simultaneously:

  • Pure object paths (e.g. BucketEncryption): preserve the original conditional behaviour — strip from the document only when the field is absent from the patch, so user overrides still diff.
  • Any path that traverses an array: strip the leaf key from both sides on every reachable element. This is the only correct semantic under set-based list comparison — there's no reliable pairing between doc and patch array elements.
  • Applies recursively, so arbitrarily deep paths like ContainerDefinitions.PortMappings.HostPort work.

A new helper removeProviderDefaultFieldsBoth returns both stripped buffers; createPatchDocument uses it in place of the prior document-only call. The pre-existing public removeProviderDefaultFields is kept for callers that only need the document side.

Tests

New cases in patch_document_test.go:

  • TestGeneratePatch_ProviderDefaultInsideArray_SingleElement_NoReplace — Grafana-style repro, single ContainerDefinition, Cpu: 0 vs absent.
  • TestGeneratePatch_ProviderDefaultInsideArray_MixedElements_NoReplace — two containers, one sets Cpu, the other doesn't.
  • TestGeneratePatch_ProviderDefaultInsideNestedArray_PortMappingHostPort and ..._Mixed — two levels of array nesting (ContainerDefinitions.PortMappings.HostPort).
  • TestGeneratePatch_UserChangedFieldInsideArray_StillReplaces — negative test: a real user change to a createOnly list element still trips needsReplacement.
  • TestGeneratePatch_TopLevelProviderDefaultOverride_StillDiffs — guard against regressing the existing top-level override semantic.

Updated: TestRemoveProviderDefaultFields_NestedFieldInsideArray — the old assertion required that an array-nested provider-default field present only in the patch be preserved in the document. That semantic was the bug under set-comparison. The test now asserts symmetric stripping, with a comment explaining why.

Full go test -tags=unit ./... suite green locally, including internal/workflow_tests.

Related

  • Companion schema fix on formae-plugin-aws@main (commit 073270f) annotates PortMapping.hostPort with hasProviderDefault = true so the PKL emitter surfaces the new path.
  • Engineering note with full root-cause trace: formae-plugin-aws/2026-04-20-ecs-spurious-replace-on-reapply.md.

Provider-default sub-fields inside nested list elements (e.g.
ContainerDefinition.Cpu, ContainerDefinition.PortMappings.HostPort)
caused spurious resource replacements on re-apply. The stored state,
populated by cloud-provider Read, carried these fields; the evaluated
desired state did not. jsonpatch's default set-based list comparison
serializes each element to JSON bytes, so provider-populated elements
never matched their desired counterparts, emitting add/remove ops on
paths marked createOnly — and tripping needsReplacement for unchanged
resources (most painfully, destroying running ECS services on reapply).

Replace the prior document-only stripper with a two-sided walker that
descends a dotted schema path through both document and patch. For
pure object paths the conditional top-level behavior is preserved so
user overrides keep diffing. For any path that traverses an array,
the leaf key is stripped symmetrically from both sides on every
reachable element — the only correct semantic under set-based list
comparison. Applies recursively, so arbitrarily deep paths like
ContainerDefinitions.PortMappings.HostPort work.

Includes tests for the exact Grafana-style reproduction, mixed
elements (one sets the field, one doesn't), two-level array nesting,
a negative test where a real user change still forces a replace,
and a guard against regressing the existing top-level provider-default
override semantic.
@JeroenSoeters JeroenSoeters force-pushed the fix/recursive-provider-default-stripping branch from c93b3db to 3e0aadd Compare April 21, 2026 17:14
@JeroenSoeters JeroenSoeters merged commit ed4af12 into main Apr 21, 2026
44 of 61 checks passed
@JeroenSoeters JeroenSoeters deleted the fix/recursive-provider-default-stripping branch April 21, 2026 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant