Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
-# ADR-121: Rename IAM Policies for Clarity
+# ADR-124: Rename IAM Policies for Clarity

**Status:** Accepted
**Date:** 2026-02-02
58 changes: 58 additions & 0 deletions docs/adr/126-multi-region-independent-tables.md
@@ -0,0 +1,58 @@
# ADR-126: Multi-Region via Independent Regional Tables

**Status:** Proposed
**Date:** 2026-02-14

## Context

Users need to enforce rate limits across multiple AWS regions (e.g., an organization's
global RPM limit must be shared by clients in us-east-1 and eu-west-1). DynamoDB Global
Tables is the obvious candidate, but it has fundamental conflicts with zae-limiter's
write patterns:

- **ADD counter loss:** Global Tables uses last-writer-wins at the item level. Concurrent
`ADD tk -consumed` from two regions results in one write overwriting the other, silently
losing consumption data and causing over-admission.
- **Transaction non-atomicity:** `TransactWriteItems` is ACID only in the originating
region. Cascade child+parent writes appear as partial updates in other regions.
- **Double refill:** Each region's aggregator processes its own stream. Replicated writes
appear in all streams, requiring filtering to avoid double-counting and double-refilling.
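
The counter-loss failure can be illustrated with a toy model. This is only a sketch: `Replica` and `replicate_lww` are hypothetical names that imitate item-level last-writer-wins and are not part of zae-limiter or any AWS API.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """One region's copy of a bucket item (hypothetical model)."""
    tokens: int
    last_write_ts: float = 0.0

    def add(self, delta: int, ts: float) -> None:
        # Within a single region the ADD is applied atomically.
        self.tokens += delta
        self.last_write_ts = ts


def replicate_lww(a: Replica, b: Replica) -> None:
    """Item-level last-writer-wins: the newer item replaces the older one wholesale."""
    winner = a if a.last_write_ts >= b.last_write_ts else b
    a.tokens = b.tokens = winner.tokens
    a.last_write_ts = b.last_write_ts = winner.last_write_ts


us = Replica(tokens=100)
eu = Replica(tokens=100)
us.add(-30, ts=1.000)  # us-east-1 consumes 30
eu.add(-20, ts=1.001)  # eu-west-1 consumes 20 before replication arrives
replicate_lww(us, eu)

converged = us.tokens        # the eu-west-1 item won, so both replicas hold its value
expected = 100 - 30 - 20     # value if both decrements had been preserved
lost = converged - expected  # consumption that was silently dropped
```

In this model both replicas converge on 80 tokens instead of 50, so 30 tokens of consumption from us-east-1 disappear and the entity is over-admitted by that amount.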

The namespace feature (issue #376) already provides write isolation: each namespace has
its own partition key prefix, so records in different namespaces never collide.

## Decision

Multi-region deployments must use **independent DynamoDB tables per region**, one per deployed stack.
Each region must use a dedicated namespace for its rate-limiting data. Cross-region
coordination must be handled by a periodic sync mechanism (see ADR-127, ADR-130), not by
DynamoDB replication. Global Tables must not be used for the rate-limiting table.

## Consequences

**Positive:**
- Write cost stays at 1x (no replicated WCU tax)
- All existing write patterns (speculative, optimistic lock, transactions) work unchanged
- Aggregator Lambda processes only local events, no stream filtering needed
- Each region is fully independent; one region's failure does not affect others

**Negative:**
- No automatic data replication; regional data is lost if a region fails permanently
- Cross-region coordination requires a new sync component (ADR-127, ADR-130)
- Rate limiter state is ephemeral; region loss causes temporary over-admission until
sync catches up, bounded by one sync window

## Alternatives Considered

### DynamoDB Global Tables with namespace-per-region isolation
Rejected because: replicated WCUs double write cost, ADD operations lose data under
concurrent cross-region writes, and transactions are not atomic across regions.

### DynamoDB Global Tables with counter sharding (per-region SET attributes)
Rejected because: requires reworking the composite bucket schema (ADR-114), breaks
speculative writes, and the 2x write cost is not justified when a sync mechanism is
needed regardless.

### Centralized single-region table with cross-region API calls
Rejected because: adds 50-150ms latency to every `acquire()` call for remote-region
clients and creates a single point of failure with no local fallback.
59 changes: 59 additions & 0 deletions docs/adr/127-s3-sync-exchange.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
# ADR-127: S3-Based Cross-Region Sync Exchange

**Status:** Proposed
**Date:** 2026-02-14

## Context

With independent DynamoDB tables per region (ADR-126), a sync mechanism must exchange
consumption data between regions. The exchange payload is a snapshot of all active
entities' bucket states: `total_consumed_milli`, `tokens_milli`, and `capacity_milli`
per entity, resource, and limit.

Using DynamoDB as the exchange medium means writing one item per active (entity, resource)
pair per sync cycle. At 2,000 active entities with 10 resources and a 10-second sync
window, this costs ~$3,900/month in WCU alone — roughly 3x the entire acquire budget.

The sync payload is a batch snapshot: all active entities' state at a point in time.
This is a bulk data transfer problem, not an item-level access problem.

## Decision

Each region's sync Lambda must write its consumption snapshot as a single S3 object
(JSON) to a shared sync bucket, keyed by `{region}/snapshot.json`. Remote regions must
read these objects via cross-region S3 GET. DynamoDB must not be used for publishing
sync reports.

The snapshot must include, per active (entity, resource) pair: the per-limit
`total_consumed_milli` counter, the current `tokens_milli`, and the configured
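`capacity_milli`. Readers must treat a snapshot older than 5 minutes as expired. S3
lifecycle rules expire objects at day granularity, so a lifecycle rule can provide only
coarse cleanup, not the 5-minute bound.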
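
A minimal sketch of how such a snapshot could be assembled, assuming a JSON layout the ADR does not specify. The `version`, `region`, `written_at`, and `entries` fields, and the names `build_snapshot`, `snapshot_key`, and `BucketStates`, are all illustrative assumptions; only the object key and the three per-limit fields come from the text above.

```python
import json
import time
from typing import Dict, Optional, Tuple

SNAPSHOT_VERSION = 1  # assumed field; the ADR only requires that the format be versioned

# (entity_id, resource) -> limit name -> the three per-limit fields named in the ADR
BucketStates = Dict[Tuple[str, str], Dict[str, Dict[str, int]]]


def snapshot_key(region: str) -> str:
    """Object key from the decision: `{region}/snapshot.json`."""
    return f"{region}/snapshot.json"


def build_snapshot(region: str, buckets: BucketStates, now: Optional[float] = None) -> str:
    """Serialize all active bucket states into one JSON document (hypothetical layout)."""
    entries = []
    # Sorting keeps the output stable for identical input.
    for (entity_id, resource), limits in sorted(buckets.items()):
        entries.append({
            "entity_id": entity_id,
            "resource": resource,
            "limits": {
                name: {
                    "total_consumed_milli": state["total_consumed_milli"],
                    "tokens_milli": state["tokens_milli"],
                    "capacity_milli": state["capacity_milli"],
                }
                for name, state in sorted(limits.items())
            },
        })
    return json.dumps({
        "version": SNAPSHOT_VERSION,
        "region": region,
        "written_at": int(time.time() if now is None else now),  # assumed freshness marker
        "entries": entries,
    })
```

The resulting string would be the body of the single PUT per cycle, for example via boto3's `put_object(Bucket=..., Key=snapshot_key(region), Body=body)`; the bucket name is deployment-specific and not given in the ADR.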

## Consequences

**Positive:**
- Publishing cost drops to ~$1.30/month regardless of entity count (1 S3 PUT per cycle)
- Reading cost is ~$0.10/month per remote region (1 S3 GET per cycle)
- Snapshot size is bounded: 2,000 entities x 10 resources x 60 bytes = ~1.2 MB per PUT
- S3 is highly available and durable; no capacity planning needed
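
The figures above can be reproduced with simple arithmetic. The request prices below are assumptions (typical S3 Standard list prices, which vary by region and over time), as is the 30-day month; data transfer charges are not included.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumed 30-day month
SYNC_WINDOW_S = 10                  # sync window used in the ADR's example

PUT_PRICE_PER_1000 = 0.005   # assumed USD price per 1,000 PUT requests
GET_PRICE_PER_1000 = 0.0004  # assumed USD price per 1,000 GET requests

cycles_per_month = SECONDS_PER_MONTH // SYNC_WINDOW_S

# One PUT per cycle for the publishing region.
put_cost = cycles_per_month / 1000 * PUT_PRICE_PER_1000

# One GET per cycle per remote region being read.
get_cost_per_remote_region = cycles_per_month / 1000 * GET_PRICE_PER_1000

# Snapshot size: entities x resources x bytes per (entity, resource) record.
snapshot_bytes = 2_000 * 10 * 60

print(f"{cycles_per_month} cycles/month")
print(f"PUT cost: ${put_cost:.2f}/month")
print(f"GET cost per remote region: ${get_cost_per_remote_region:.2f}/month")
print(f"snapshot size: {snapshot_bytes / 1_000_000:.1f} MB")
```

Under these assumptions the request cost depends only on the sync window, not on the number of entities; entity count affects only the object size.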

**Negative:**
- Introduces S3 dependency for cross-region coordination (new failure mode)
- Snapshots are point-in-time: S3 reads are strongly consistent, but a GET still returns data as old as the writer's last PUT (up to one sync window)
- Requires a shared S3 bucket accessible from all regions (cross-region GET latency
~100ms, acceptable for background sync)
- Snapshot format must be versioned to handle schema evolution

## Alternatives Considered

### DynamoDB items for sync reports (one per entity per resource)
Rejected because: WCU cost scales linearly with entity count, reaching $3,900/month
at 2,000 active entities with 10-second sync — 3x the acquire budget.

### DynamoDB items for sync reports (one batch item per resource)
Rejected because: 400KB item size limit caps at ~4,000 entities per item, and large
item writes consume proportionally more WCUs, offering no cost advantage over
individual items.

### SQS/SNS for event-driven sync
Rejected because: requires per-event cross-region message delivery, adding complexity
and cost proportional to acquire volume rather than sync frequency.
63 changes: 63 additions & 0 deletions docs/adr/128-quota-enforcement-via-config.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
# ADR-128: Quota Enforcement via Entity Config Overrides

**Status:** Proposed
**Date:** 2026-02-14

## Context

With independent tables per region (ADR-126) and S3-based sync (ADR-127), each region's
sync Lambda computes a regional quota for each entity. This quota must be enforced by
the rate limiter's hot path without modifying the acquire flow.

Three enforcement mechanisms were evaluated:

1. **Entity config overrides:** Write adjusted limits via `set_limits()`, picked up by
the existing config cache on the next resolve.
2. **Shadow counter on bucket item:** Write a `remote_tc` attribute on the bucket and
add a condition to speculative writes.
3. **Direct token deduction:** `ADD b_rpm_tk -remote_delta` on the bucket item.

The shadow counter approach has a semantic mismatch: `total_consumed_milli` is a lifetime
monotonic counter incompatible with the per-window token bucket model. Direct token
deduction does not adjust the refill ceiling — each region's bucket refills at the full
global rate, causing N regions to provide Nx the intended refill.
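
The refill problem is easiest to see with numbers. The sketch below uses a hypothetical global limit of 1,000 RPM across three regions; the per-region split in `config_override` is an arbitrary illustration, not an allocation rule from the ADR.

```python
from typing import List

GLOBAL_RPM = 1_000
REGIONS = 3


def total_refill_per_minute(per_region_capacity: List[int]) -> int:
    """Each regional bucket refills up to its own configured capacity per window."""
    return sum(per_region_capacity)


# Direct token deduction: tokens are subtracted, but every regional bucket is still
# configured with the full global limit, so each one refills at the global rate.
direct_deduction = [GLOBAL_RPM] * REGIONS

# Config override: each region's configured capacity is lowered to its share,
# which also lowers that bucket's refill ceiling.
config_override = [500, 300, 200]

refill_with_deduction = total_refill_per_minute(direct_deduction)
refill_with_override = total_refill_per_minute(config_override)
```

With direct deduction the fleet refills 3,000 tokens per minute against a 1,000 RPM limit; with capacity overrides the total refill equals the global limit.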

## Decision

Regional quotas must be enforced by writing entity-level config overrides via the
existing `set_limits(entity_id, resource, limits)` API. The sync Lambda must compute
the allocated capacity per entity and write it as an entity config record. The rate
limiter's existing config resolution hierarchy (Entity > Resource > System) must be
the sole mechanism for quota enforcement.

## Consequences

**Positive:**
- Zero changes to the acquire hot path (speculative writes, optimistic lock, bucket math)
- Uses the existing config hierarchy; no new DynamoDB schema or access patterns
- Capacity adjustment naturally controls refill ceiling via token bucket math
- Config cache TTL provides built-in staleness tolerance (already accepted in ADR-105)

**Negative:**
- Token drain lag: if current tokens exceed the new reduced capacity, the entity can
consume excess tokens until they drain naturally (bounded by consumption rate)
- Config writes are the dominant sync cost (~$40/month at 2,000 active entities with
trigger-based filtering per ADR-129)
- Config cache TTL (default 60s) delays quota enforcement after a config write: the
sync Lambda and the application use separate Repository instances, so a sync write
does not invalidate the application's cache

## Alternatives Considered

### Shadow counter attribute on bucket item (remote_tc)
Rejected because: `total_consumed_milli` is a monotonic lifetime counter that cannot
be compared against a per-window capacity limit, and modifying the speculative write
condition changes the hot path for all users.

### Direct token deduction (ADD tk -remote_delta)
Rejected because: each region's bucket still refills at the full global rate, so N
regions produce Nx total refill — the deduction fights the bucket math without
correcting the underlying refill ceiling.

### In-memory client-side consumption map (no DynamoDB writes)
Rejected because: requires background polling threads and in-memory state, which works
for long-running services but not for Lambda-based rate limiting.
61 changes: 61 additions & 0 deletions docs/adr/129-trigger-based-sync-writes.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
# ADR-129: Trigger-Based Sync Config Writes

**Status:** Proposed
**Date:** 2026-02-14

## Context

With quota enforcement via entity config overrides (ADR-128), the sync Lambda issues
a `set_limits()` call for every active (entity, resource) pair each cycle. At 2,000
active entities with 10 resources and a 10-second sync window, this produces 20,000
writes per cycle (~$486/month). Most of these writes are wasted: 80% of entities are
well under their limits with stable allocations.

The sync window already defines the worst-case over-admission bound (entities can
double-consume for one sync window regardless of write frequency). Config writes do
not improve the worst case; they only tighten steady-state accuracy.

## Decision

The sync Lambda must only write entity config overrides when one of two triggers fires:

1. **Exhaustion trigger:** The entity's projected time-to-exhaustion (remaining tokens
divided by recent consumption rate) is less than twice the sync window. This prevents
the entity from running out of tokens before the next sync cycle can react.

2. **Drift trigger:** The computed allocation differs from the currently configured
capacity by more than 15%. This corrects stale quotas for entities whose traffic
pattern has shifted significantly.

All other entities must be skipped (no config write). Trigger evaluation must be
computed from S3 snapshot data (ADR-127) without additional DynamoDB reads.
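
The two triggers can be sketched as a pure function over snapshot-derived values. `EntityState` and its field names are hypothetical, and how the recent consumption rate is estimated is left open by the ADR; the thresholds are the ones stated above.

```python
from dataclasses import dataclass

EXHAUSTION_HORIZON_WINDOWS = 2.0  # "less than twice the sync window"
DRIFT_THRESHOLD = 0.15            # "more than 15%"


@dataclass(frozen=True)
class EntityState:
    """Per-(entity, resource) inputs derived from snapshots (hypothetical shape)."""
    tokens_milli: int               # remaining tokens
    rate_milli_per_s: float         # recent consumption rate; estimation method is assumed
    configured_capacity_milli: int  # capacity currently written in the entity config
    allocated_capacity_milli: int   # capacity computed this cycle


def exhaustion_trigger(s: EntityState, sync_window_s: float) -> bool:
    if s.rate_milli_per_s <= 0:
        return False  # an idle entity is not projected to exhaust its tokens
    time_to_exhaustion_s = s.tokens_milli / s.rate_milli_per_s
    return time_to_exhaustion_s < EXHAUSTION_HORIZON_WINDOWS * sync_window_s


def drift_trigger(s: EntityState) -> bool:
    if s.configured_capacity_milli <= 0:
        # No usable baseline: treat any positive allocation as drift (assumption).
        return s.allocated_capacity_milli > 0
    drift = abs(s.allocated_capacity_milli - s.configured_capacity_milli)
    return drift / s.configured_capacity_milli > DRIFT_THRESHOLD


def should_write_config(s: EntityState, sync_window_s: float) -> bool:
    """Write an override only if at least one trigger fires; otherwise skip the entity."""
    return exhaustion_trigger(s, sync_window_s) or drift_trigger(s)
```

For example, with a 10-second window an entity holding 1,000 milli-tokens and consuming 100 per second has 10 seconds to exhaustion, which is under the 20-second horizon, so a config write is issued.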

## Consequences

**Positive:**
- Config writes drop from ~20,000 to ~600 per cycle at steady state (~97% reduction)
- Monthly sync cost drops from ~$486 to ~$40 at 2,000 active entities
- DynamoDB write throughput spikes are smoothed (fewer concurrent writes)
- Worst-case over-admission is unchanged (bounded by sync window, not write frequency)

**Negative:**
- Entities with slowly drifting traffic (<15% per cycle) may have stale quotas for
multiple sync cycles before the drift threshold triggers
- Exhaustion prediction depends on consumption rate estimation, which may be noisy for
bursty workloads
- Two tunable parameters (exhaustion horizon = 2x sync window, drift threshold = 15%)
require validation under production traffic patterns

## Alternatives Considered

### Write every entity every cycle (no filtering)
Rejected because: 97% of writes are redundant, costing ~$450/month in unnecessary WCU
without improving the over-admission bound set by the sync window.

### Write only on exhaustion (drop drift trigger)
Rejected because: entities with shifting traffic patterns would keep stale allocations
indefinitely, wasting regional quota until they approach exhaustion.

### Event-driven writes via DynamoDB Streams (write on every bucket change)
Rejected because: couples sync frequency to acquire volume rather than a fixed window,
producing more writes than periodic polling for high-throughput entities.
56 changes: 56 additions & 0 deletions docs/adr/130-per-region-sync-lambda.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
# ADR-130: Per-Region Sync Lambda

**Status:** Proposed
**Date:** 2026-02-14

## Context

Cross-region sync (ADR-126, ADR-127) requires a Lambda function that reads consumption
snapshots, computes quota allocations, and writes config overrides. Two topologies were
evaluated:

- **Single coordinator:** One Lambda in a designated region reads all snapshots, computes
all quotas, and writes configs to all regions. Total cross-region calls: `2(N-1)` per
cycle (reads + writes). Single point of failure.
- **Per-region:** Each region runs its own Lambda that reads remote snapshots, computes
its own quota locally, and writes only to its local table. Total cross-region calls
per Lambda: `N-1` reads, 0 writes.

Both topologies produce the same quota allocation when given the same inputs. The
per-region Lambda runs a deterministic function: given the same S3 snapshots, every
region independently computes the same allocation. No distributed consensus is needed.

## Decision

Each region must run its own sync Lambda, triggered by EventBridge on a fixed schedule
(configurable sync window). Each Lambda must read its local bucket states, write its
snapshot to S3 (ADR-127), read all remote snapshots from S3, compute quotas using a
deterministic allocation function, and write triggered config overrides (ADR-129) to
its local DynamoDB table only. No Lambda may write to a remote region's DynamoDB table.
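
The ADR does not define the allocation function, only that it must be deterministic. As one possible sketch, the function below splits a global capacity in proportion to each region's recent consumption; the proportional rule, the equal-split fallback, and the name `allocate` are all assumptions. Integer arithmetic and sorted region names keep the result identical in every region regardless of the order in which snapshots were read.

```python
from typing import Dict


def allocate(global_capacity_milli: int, consumed_by_region: Dict[str, int]) -> Dict[str, int]:
    """Split a global capacity across regions, proportionally to recent consumption."""
    regions = sorted(consumed_by_region)  # fixed order, independent of input order
    total = sum(consumed_by_region.values())
    if total == 0:
        # Nobody consumed anything: fall back to an equal split (assumption).
        weights = {r: 1 for r in regions}
        total = len(regions)
    else:
        weights = dict(consumed_by_region)

    # Floor division avoids floating-point differences between Lambdas.
    shares = {r: global_capacity_milli * weights[r] // total for r in regions}

    # Hand out the rounding remainder one unit at a time in sorted order,
    # so the shares always sum to the global capacity.
    remainder = global_capacity_milli - sum(shares.values())
    for r in regions[:remainder]:
        shares[r] += 1
    return shares
```

Each regional Lambda would run the same function over the same snapshots and then apply only its own entry, for example `allocate(...)["eu-west-1"]`, to its local table.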

## Consequences

**Positive:**
- Symmetric architecture: every region deploys the same CloudFormation stack
- No single point of failure: one region's Lambda failure does not affect other regions
- Zero cross-region DynamoDB writes (only cross-region S3 reads, ~100ms latency)
- Scales naturally: adding a region means deploying the same stack, no coordinator changes
- Each region can independently tune its sync window

**Negative:**
- N Lambdas compute the same allocation independently (redundant CPU, negligible cost)
- Slight snapshot staleness between Lambdas reading at different moments within a cycle
(sub-second divergence, converges on next cycle)
- More infrastructure per region (EventBridge rule + Lambda + IAM), though identical
across regions and part of the standard stack deployment

## Alternatives Considered

### Single coordinator Lambda in a designated region
Rejected because: introduces an asymmetric "special" region, creates a single point of
failure for all global quota allocation, and requires cross-region DynamoDB writes for
config overrides in remote regions.

### Peer-to-peer gossip between regional Lambdas
Rejected because: adds network coordination complexity (discovery, message ordering)
without improving on the deterministic-computation-from-shared-S3 approach.
66 changes: 66 additions & 0 deletions docs/adr/131-sync-config-ttl.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
# ADR-131: TTL on Sync-Written Entity Config Records

**Status:** Proposed
**Date:** 2026-02-14

## Context

The sync Lambda enforces regional quotas by writing entity-level config overrides via
`set_limits()` (ADR-128). Per ADR-119, buckets using entity custom limits persist
indefinitely (no TTL), while buckets using resource/system defaults have TTL and
auto-expire.

When the sync Lambda writes an entity config, the bucket transitions from
"default-limit" (has TTL) to "custom-limit" (no TTL). If the entity later goes idle
and the sync Lambda stops writing configs (no trigger fires per ADR-129), both the
config record and its bucket persist indefinitely. For high-churn entity populations
(anonymous users, ephemeral API keys), this causes unbounded storage growth.

The fix must not affect operator-written entity configs, which intentionally persist
indefinitely per ADR-119.

## Decision

Sync-written entity config records must include a DynamoDB TTL attribute set to
`now + 3 × sync_window`. The sync Lambda must refresh the TTL on each config write.
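
DynamoDB TTL attributes are Unix epoch timestamps in seconds, stored as a Number. A minimal sketch of the computation, where `sync_config_ttl` and `TTL_WINDOWS` are hypothetical names:

```python
import time
from typing import Optional

TTL_WINDOWS = 3  # from the decision: now + 3 x sync_window


def sync_config_ttl(sync_window_s: int, now: Optional[float] = None) -> int:
    """Return the `ttl` value (epoch seconds) for a sync-written config record."""
    current = time.time() if now is None else now
    return int(current) + TTL_WINDOWS * sync_window_s
```

With a 10-second sync window, a record written at epoch second 1,700,000,000 would carry `ttl = 1700000030`, and each later config write for the same entity pushes that value forward.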

This extends ADR-119's bucket TTL rule. The updated bucket TTL logic is:

- Bucket has **no TTL** if: entity config exists **without** a `ttl` attribute
(operator-written, persists indefinitely — unchanged from ADR-119)
- Bucket **has TTL** if: entity config exists **with** a `ttl` attribute
(sync-written, treated as default-like for TTL calculation)
- Bucket **has TTL** if: no entity config exists
(using resource/system defaults — unchanged from ADR-119)
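
The three cases above reduce to a single check. This sketch assumes the entity config is available as a mapping, or `None` when no config record exists; `bucket_has_ttl` is a hypothetical helper name.

```python
from typing import Any, Mapping, Optional


def bucket_has_ttl(entity_config: Optional[Mapping[str, Any]]) -> bool:
    """Decide whether a bucket should carry a TTL under the updated rule."""
    if entity_config is None:
        # No entity config: resource/system defaults apply, bucket has TTL.
        return True
    # Sync-written configs carry `ttl` and are treated like defaults;
    # operator-written configs lack it and keep the bucket indefinitely.
    return "ttl" in entity_config
```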

When the entity goes idle (no trigger fires for 3 sync windows), the config record
auto-expires via DynamoDB TTL. The entity reverts to resource/system defaults, and
bucket TTL behavior per ADR-119 resumes.

## Consequences

**Positive:**
- Idle entities auto-cleanup: sync config expires, bucket regains TTL, storage bounded
- No new attributes needed: the DynamoDB `ttl` attribute already exists in the table schema
- Self-healing: if a sync Lambda fails permanently, all its configs expire within 3 windows

**Negative:**
- Bucket TTL logic (ADR-119) must check whether the entity config has a `ttl` attribute
to distinguish sync-written from operator-written configs
- Config records gain a new write pattern: conditional refresh of TTL alongside limits
- DynamoDB TTL deletion is asynchronous (up to 48 hours), so expired configs remain
readable until they are deleted; config resolution needs to ignore records whose `ttl`
is in the past
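
Because DynamoDB can return items whose TTL has passed but which have not yet been deleted, a reader can filter them explicitly. The helper below is a sketch under that assumption; `live_config` is a hypothetical name and the record is modeled as a plain mapping.

```python
import time
from typing import Any, Mapping, Optional


def live_config(
    item: Optional[Mapping[str, Any]], now: Optional[float] = None
) -> Optional[Mapping[str, Any]]:
    """Return the config record unless it is a sync-written record past its TTL."""
    if item is None:
        return None
    ttl = item.get("ttl")
    if ttl is None:
        return item  # operator-written config: never expires
    current = time.time() if now is None else now
    # Treat an expired record as absent so resolution falls back to defaults.
    return item if int(ttl) > current else None
```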

## Alternatives Considered

### Explicit cleanup pass in the sync Lambda (delete stale configs)
Rejected because: requires the sync Lambda to maintain a "previously synced" entity set
across invocations, adding state management complexity to a stateless Lambda function.

### Separate DynamoDB sort key for sync configs (#SYNC_CONFIG#{resource})
Rejected because: adds a new config level to the resolution hierarchy (ADR-118), breaking
the existing four-level precedence model and requiring changes to `resolve_limits()`.

### No TTL on sync configs (rely on operator cleanup)
Rejected because: operators should not need to manually clean up configs created by an
automated sync process, especially for ephemeral entities at scale.