✨ Add language-agnostic DynamoDB schema definition (schema.yaml) #261

@sodre

Description

Problem or Use Case

The current DynamoDB schema is defined implicitly in Python code (schema.py and repository.py). This makes it difficult to:

  1. Implement alternative aggregators (e.g., Rust) that need to read/write the same data
  2. Generate type-safe clients in other languages
  3. Validate schema consistency across implementations
  4. Document the schema in a language-neutral format

As the project moves toward alternative backends (v1.4.0) and a multi-language ecosystem, a single source of truth for the DynamoDB schema becomes essential.

Proposed Solution

Create a schema.yaml (or schema.json) file that defines:

  1. Table structure: PK/SK patterns, GSI definitions
  2. Record types: All item types with their attributes and types
  3. Key builders: Documented patterns for constructing keys
  4. Attribute constraints: Required fields, enums, value ranges

# Example structure
table:
  name: rate_limits
  billing_mode: PAY_PER_REQUEST
  
keys:
  primary:
    partition_key: PK
    sort_key: SK
  gsi1:
    partition_key: GSI1PK
    sort_key: GSI1SK
  gsi2:
    partition_key: GSI2PK
    sort_key: GSI2SK

prefixes:
  entity: "ENTITY#"
  parent: "PARENT#"
  resource: "RESOURCE#"
  system: "SYSTEM#"
  audit: "AUDIT#"

record_types:
  entity_meta:
    pk_pattern: "ENTITY#{entity_id}"
    sk: "#META"
    attributes:
      entity_id: { type: string, required: true }
      parent_id: { type: string, required: false }
      cascade: { type: boolean, default: false }
      created_at: { type: integer, description: "Unix timestamp ms" }
  
  composite_bucket:
    pk_pattern: "ENTITY#{entity_id}"
    sk_pattern: "#BUCKET#{resource}"
    attributes:
      # Composite bucket fields use b_{limit}_{field} naming
      "b_{limit}_tk": { type: integer, description: "Tokens (millitokens)" }
      "b_{limit}_cp": { type: integer, description: "Capacity (millitokens)" }
      # ...
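
To illustrate how another implementation might consume these patterns, here is a minimal Python sketch that builds keys from `pk_pattern`/`sk_pattern` and checks the `required` and `type` constraints. It is an assumption-laden sketch, not the project's code: `RECORD_TYPES` is a hand-copied subset of the example above (a real consumer would parse schema.yaml), and `PY_TYPES`, `pattern_fields`, `build_key`, and `attribute_errors` are hypothetical names introduced here.

```python
from string import Formatter

# Hand-copied subset of the example schema; a real consumer would parse
# schema.yaml instead (parsing is omitted to keep this stdlib-only).
RECORD_TYPES = {
    "entity_meta": {
        "pk_pattern": "ENTITY#{entity_id}",
        "sk": "#META",
        "attributes": {
            "entity_id": {"type": "string", "required": True},
            "parent_id": {"type": "string", "required": False},
            "cascade": {"type": "boolean", "default": False},
            "created_at": {"type": "integer"},
        },
    },
    "composite_bucket": {
        "pk_pattern": "ENTITY#{entity_id}",
        "sk_pattern": "#BUCKET#{resource}",
    },
}

# Assumed mapping from schema type names to Python types.
PY_TYPES = {"string": str, "boolean": bool, "integer": int}


def pattern_fields(pattern: str) -> list[str]:
    """Return the placeholder names in a key pattern, in order."""
    return [name for _, name, _, _ in Formatter().parse(pattern) if name]


def build_key(record_type: str, **values: str) -> dict[str, str]:
    """Render PK and SK for a record type; raises KeyError if a value is missing."""
    spec = RECORD_TYPES[record_type]
    pk = spec["pk_pattern"].format(**values)
    # A literal `sk` takes precedence; otherwise render `sk_pattern`.
    sk = spec["sk"] if "sk" in spec else spec["sk_pattern"].format(**values)
    return {"PK": pk, "SK": sk}


def attribute_errors(record_type: str, item: dict) -> list[str]:
    """Return human-readable constraint violations for `item`."""
    errors = []
    for name, spec in RECORD_TYPES[record_type].get("attributes", {}).items():
        if name not in item:
            if spec.get("required"):
                errors.append(f"missing required attribute: {name}")
            continue
        expected = PY_TYPES[spec["type"]]
        value = item[name]
        # bool is a subclass of int in Python, so reject it for integer fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or wrong_bool:
            errors.append(f"{name}: expected {spec['type']}")
    return errors
```

For example, `build_key("composite_bucket", entity_id="user-1", resource="api")` renders the `ENTITY#...` and `#BUCKET#...` keys, and the same `str.format` approach expands a templated attribute name such as `"b_{limit}_tk".format(limit="rpm")`. The entity ID, resource, and limit names used here are made up for illustration.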

Alternatives Considered

  1. OpenAPI/JSON Schema: More verbose, less suited for DynamoDB-specific patterns
  2. Protobuf: Adds compilation step, not as readable
  3. Keep Python-only: Blocks multi-language growth

Acceptance Criteria

  • schema.yaml file exists at project root (or docs/schema.yaml)
  • Schema defines all record types currently in repository.py
  • Schema includes all key prefixes and builders from schema.py
  • Schema documents attribute types matching Python type hints
  • Unit test validates Python implementation matches schema.yaml
  • CLAUDE.md updated to reference schema.yaml as schema source of truth
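
The consistency check in the criteria above could look roughly like the following sketch. Everything here is a stand-in: `SCHEMA_PREFIXES` mimics the parsed `prefixes` section of schema.yaml, `PYTHON_PREFIXES` mimics whatever constants schema.py actually exposes (their real names are not shown in this issue), and `prefix_mismatches` is a hypothetical helper.

```python
# Stand-in for the parsed `prefixes` section of schema.yaml.
SCHEMA_PREFIXES = {
    "entity": "ENTITY#",
    "parent": "PARENT#",
    "resource": "RESOURCE#",
    "system": "SYSTEM#",
    "audit": "AUDIT#",
}

# Stand-in for the prefix constants defined in schema.py (names assumed).
PYTHON_PREFIXES = {
    "entity": "ENTITY#",
    "parent": "PARENT#",
    "resource": "RESOURCE#",
    "system": "SYSTEM#",
    "audit": "AUDIT#",
}


def prefix_mismatches(schema: dict, impl: dict) -> dict:
    """Map each differing or missing prefix name to (schema value, impl value)."""
    names = sorted(schema.keys() | impl.keys())
    return {
        name: (schema.get(name), impl.get(name))
        for name in names
        if schema.get(name) != impl.get(name)
    }


def test_prefixes_match_schema():
    # An empty dict means both sides define the same names and values.
    assert prefix_mismatches(SCHEMA_PREFIXES, PYTHON_PREFIXES) == {}
```

Returning the mismatches rather than a bare boolean keeps the failure message useful: a prefix missing on one side shows up as `None` in the corresponding tuple slot.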

Metadata

Labels

area/infra (CloudFormation, IAM, infrastructure)