feat: wire difficulty profiles into simulation engine#52

Merged
shaypal5 merged 2 commits into main from feat/difficulty-modulation
May 3, 2026
@shaypal5 shaypal5 commented May 3, 2026

Summary

  • The simulation engine now modulates behavior based on the difficulty profile YAML parameters that were previously loaded but ignored
  • Mean conversion rates now fall within the declared ranges: intro 30-45%, intermediate 18-28%, advanced 8-15% (calibrated across 20 seeds × 5 motif families); individual seeds can land slightly outside, as the calibration table shows
  • Snapshot features receive post-simulation distortions (noise, missingness, outliers) scaled by difficulty tier
  • The check_difficulty_ordering() validator is no longer a no-op — it verifies actual rates

Changes

| File | Change |
| --- | --- |
| leadforge/core/models.py | DifficultyParams frozen dataclass + field on GenerationConfig |
| leadforge/mechanisms/policies.py | Per-motif calibration computes target daily hazard from conversion_rate_range; signal_strength scales LatentScore weights |
| leadforge/simulation/engine.py | Threads difficulty_params; churn rate modulated by committee_friction |
| leadforge/render/snapshots.py | _apply_difficulty_distortions() injects noise/missingness/outliers |
| leadforge/api/generator.py | Constructs DifficultyParams from profile YAML |
| leadforge/api/bundle.py | Passes params to build_snapshot() |
| leadforge/validation/difficulty.py | Real rate validation with ±5% tolerance |
| tests/test_difficulty_modulation.py | 13 new tests |
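
As an illustration of the first and fifth rows, here is a minimal sketch of what a frozen `DifficultyParams` could look like. The field names come from the parameter list in the commit message; the `from_profile` constructor, the stand-in `InvalidRecipeError` class, and every numeric value other than the intro conversion range are assumptions made for this example, not the project's actual code.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


class InvalidRecipeError(ValueError):
    """Stand-in for the project's error type (its real base class is not shown in this PR)."""


@dataclass(frozen=True)
class DifficultyParams:
    """Numeric knobs of one difficulty tier; frozen so a config cannot be mutated mid-run."""

    signal_strength: float
    noise_scale: float
    missing_rate: float
    outlier_rate: float
    conversion_rate_range: tuple[float, float]
    committee_friction: float

    @classmethod
    def from_profile(cls, profile: Mapping[str, Any]) -> "DifficultyParams":
        # Hypothetical helper: fail loudly on missing keys instead of
        # falling back to silent defaults that could mask a typo.
        required = [f.name for f in fields(cls)]
        missing = [name for name in required if name not in profile]
        if missing:
            raise InvalidRecipeError(f"difficulty profile is missing keys: {missing}")
        low, high = profile["conversion_rate_range"]
        values = {name: float(profile[name]) for name in required if name != "conversion_rate_range"}
        return cls(conversion_rate_range=(float(low), float(high)), **values)


# Illustrative profile: only the conversion range is taken from the PR.
intro = DifficultyParams.from_profile(
    {
        "signal_strength": 1.0,
        "noise_scale": 0.05,
        "missing_rate": 0.01,
        "outlier_rate": 0.0,
        "conversion_rate_range": [0.30, 0.45],
        "committee_friction": 0.1,
    }
)
print(intro)
```

Freezing the dataclass means an attempted assignment such as `intro.noise_scale = 0.5` raises `dataclasses.FrozenInstanceError`, which keeps the parameters stable while they are threaded through the engine.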

Calibration results (20 seeds)

| Tier | Target | Measured mean | Measured range |
| --- | --- | --- | --- |
| intro | 30-45% | 42.6% | 36.6-49.4% |
| intermediate | 18-28% | 22.2% | 18.5-26.9% |
| advanced | 8-15% | 8.7% | 6.5-10.2% |
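
The numbers above can be checked against the targets mechanically. The following is a minimal sketch, assuming the ±5% tolerance used by the validator means five absolute percentage points (the PR does not say whether it is absolute or relative); the function names are hypothetical and this is not the code in leadforge/validation/difficulty.py.

```python
# Targets and measurements copied from the calibration table, as fractions.
TARGETS = {
    "intro": (0.30, 0.45),
    "intermediate": (0.18, 0.28),
    "advanced": (0.08, 0.15),
}
MEASURED = {  # tier -> (mean, lowest seed, highest seed)
    "intro": (0.426, 0.366, 0.494),
    "intermediate": (0.222, 0.185, 0.269),
    "advanced": (0.087, 0.065, 0.102),
}
TOLERANCE = 0.05  # assumption: absolute percentage points


def within(rate, target, tol=0.0):
    """True if rate lies inside the target range widened by tol on both sides."""
    low, high = target
    return low - tol <= rate <= high + tol


def check_tier(tier):
    mean, seed_min, seed_max = MEASURED[tier]
    target = TARGETS[tier]
    return {
        "mean_in_target": within(mean, target),
        "all_seeds_in_target": within(seed_min, target) and within(seed_max, target),
        "all_seeds_within_tolerance": (
            within(seed_min, target, TOLERANCE) and within(seed_max, target, TOLERANCE)
        ),
    }


def ordering_holds():
    """Mean conversion must strictly decrease from intro to advanced."""
    means = [MEASURED[tier][0] for tier in ("intro", "intermediate", "advanced")]
    return means[0] > means[1] > means[2]


report = {tier: check_tier(tier) for tier in TARGETS}
for tier, result in report.items():
    print(tier, result)
print("ordering holds:", ordering_holds())
```

Under this reading, every tier's mean sits inside its target, the intro and advanced seed ranges spill past the strict bounds, and all seeds stay within the tolerance band.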

Test plan

  • All 865 tests pass (852 original + 13 new)
  • Ruff lint clean
  • Determinism verified: same (seed, difficulty) → identical output
  • Conversion rate ordering verified: intro > intermediate > advanced across all tested seeds
  • Run python scripts/build_public_release.py to verify three tiers produce visibly different datasets
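
The determinism item can be illustrated with a hash comparison. This is a minimal sketch, not the project's test: `generate_rows` is a toy stand-in for the real generator, its conversion rates are illustrative, and SHA-256 is used because the follow-up commit mentions SHA-256 determinism verification.

```python
import hashlib
import json
import random


def generate_rows(seed, difficulty, n=100):
    """Toy stand-in for the real generator: output depends only on (seed, difficulty)."""
    rates = {"intro": 0.40, "intermediate": 0.22, "advanced": 0.10}  # illustrative values
    rng = random.Random(f"{seed}:{difficulty}")  # private RNG, no global state
    return [{"lead_id": i, "converted": rng.random() < rates[difficulty]} for i in range(n)]


def fingerprint(rows):
    """SHA-256 over a canonical JSON encoding of the rows."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = fingerprint(generate_rows(7, "intro"))
second = fingerprint(generate_rows(7, "intro"))
other_seed = fingerprint(generate_rows(8, "intro"))

print("same inputs match:", first == second)
print("different seed differs:", first != other_seed)
```

Comparing digests rather than full datasets keeps such a check cheap enough to run for every (seed, difficulty) pair.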

🤖 Generated with Claude Code

The simulation engine now modulates behavior based on difficulty profile
parameters (signal_strength, noise_scale, missing_rate, outlier_rate,
conversion_rate_range, committee_friction). Previously all three tiers
produced identical ~70% conversion; now intro targets 30-45%,
intermediate 18-28%, and advanced 8-15%.

Key changes:
- DifficultyParams dataclass carries numeric profile parameters
- assign_mechanisms() uses per-motif calibration to compute target
  daily hazard rates from conversion_rate_range
- signal_strength scales LatentScore weights
- committee_friction modulates churn rate
- build_snapshot() injects Gaussian noise, MCAR missingness, and
  outliers based on noise_scale/missing_rate/outlier_rate
- check_difficulty_ordering() now validates actual conversion rates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
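
To make the calibration idea concrete, here is a minimal sketch of turning a target conversion rate into a daily hazard, plus the weight scaling by signal_strength. It assumes a constant per-day hazard over a fixed 90-day horizon and ignores churn and per-lead score variation, which is presumably why the PR calibrates per motif. The function names, the horizon, and the weight names are hypothetical; the primary × s, secondary × s^1.5 rule is the one described in the second commit of this PR.

```python
def target_daily_hazard(target_conversion, horizon_days):
    """Constant daily hazard h with 1 - (1 - h) ** horizon_days == target_conversion."""
    if not 0.0 <= target_conversion < 1.0:
        raise ValueError("target_conversion must be in [0, 1)")
    if horizon_days <= 0:
        raise ValueError("horizon_days must be positive")
    return 1.0 - (1.0 - target_conversion) ** (1.0 / horizon_days)


def midpoint(rate_range):
    low, high = rate_range
    return (low + high) / 2.0


def scale_weights(weights, primary, signal_strength):
    """Scale the primary weight by s and all others by s ** 1.5.

    For 0 < s < 1 the secondary weights shrink faster than the primary one,
    so the score becomes less discriminating instead of merely shifting.
    """
    return {
        name: weight * (signal_strength if name == primary else signal_strength ** 1.5)
        for name, weight in weights.items()
    }


HORIZON_DAYS = 90  # assumption: the real simulation horizon is not stated in this PR
for tier, rate_range in {
    "intro": (0.30, 0.45),
    "intermediate": (0.18, 0.28),
    "advanced": (0.08, 0.15),
}.items():
    hazard = target_daily_hazard(midpoint(rate_range), HORIZON_DAYS)
    print(f"{tier}: daily hazard ~ {hazard:.5f}")

# Hypothetical weight names, for illustration only.
print(scale_weights({"fit": 2.0, "intent": 1.0}, primary="fit", signal_strength=0.25))
```

Inverting the survival formula this way guarantees that a lead exposed to the hazard on every day of the horizon converts with exactly the target probability, which gives the calibration a well-defined starting point.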
Copilot AI review requested due to automatic review settings May 3, 2026 21:54
@shaypal5 shaypal5 added the labels type: feature (New capability), layer: mechanisms (mechanisms/ generators and transitions), layer: simulation (simulation/ discrete-time engine), and layer: validation (validation/ invariants and checks) May 3, 2026
- Extract motif calibration to module-level constant with clear docs
- Make _apply_difficulty_distortions() pure (copies input, no mutation)
- Use LEAD_SNAPSHOT_FEATURES spec for column eligibility (not runtime
  dtype sniffing) — prevents accidental distortion of categoricals
- Raise InvalidRecipeError on missing profile keys instead of silent
  defaults that mask typos
- Non-uniform signal_strength scaling: primary weight × s, secondary
  weights × s^1.5 — reduces discriminability rather than just shifting
  the sigmoid
- Use 5σ for outlier injection (vs 3σ) to be distinguishable from
  natural variation
- Tests use tmp_path fixture instead of hardcoded /tmp paths
- Add direct test of _apply_difficulty_distortions() verifying noise
  changes values and function is pure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
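
The distortion step described in this commit can be sketched as follows. This is a simplified stand-in using plain lists of dicts and the standard library, not the project's `_apply_difficulty_distortions()`: `FEATURE_SPEC` is a hypothetical substitute for LEAD_SNAPSHOT_FEATURES, the column names are invented, and applying noise, then outliers, then missingness in that order is an assumption.

```python
import copy
import random
import statistics

# Hypothetical stand-in for LEAD_SNAPSHOT_FEATURES: column name -> declared kind.
FEATURE_SPEC = {
    "employee_count": "numeric",
    "web_visits": "numeric",
    "industry": "categorical",
}


def apply_difficulty_distortions(rows, spec, noise_scale, missing_rate, outlier_rate, seed):
    """Return a distorted copy of rows; the input is never mutated."""
    rng = random.Random(seed)
    out = copy.deepcopy(rows)
    # Eligibility comes from the declared spec, not from inspecting runtime types,
    # so categorical columns are never touched by accident.
    numeric_cols = [col for col, kind in spec.items() if kind == "numeric"]
    for col in numeric_cols:
        observed = [row[col] for row in rows if row[col] is not None]
        sigma = statistics.pstdev(observed) if observed else 0.0
        for row in out:
            if row[col] is None:
                continue
            # Gaussian noise proportional to the column's spread.
            row[col] += rng.gauss(0.0, noise_scale * sigma)
            # Outliers are shifted by 5 sigma so they stand out from natural variation.
            if rng.random() < outlier_rate:
                row[col] += rng.choice((-1, 1)) * 5 * sigma
            # MCAR missingness: the drop decision ignores the value itself.
            if rng.random() < missing_rate:
                row[col] = None
    return out


rows = [
    {"employee_count": 10.0 * i, "web_visits": float(i % 7), "industry": "saas"}
    for i in range(1, 51)
]
distorted = apply_difficulty_distortions(
    rows, FEATURE_SPEC, noise_scale=0.2, missing_rate=0.05, outlier_rate=0.02, seed=0
)
print(rows[0])
print(distorted[0])
```

Copying the input first is what makes the function pure: callers can compare the original and distorted snapshots side by side, which is also what a direct test of the function needs.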
github-actions Bot commented May 3, 2026

pr-agent-context report:

No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #52 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 25292145634 attempt 1
Comment timestamp: 2026-05-03T22:06:12.426314+00:00
PR head commit: c23a541521265e0ac3a5b515be0675e097ff7513

@shaypal5 shaypal5 merged commit 8c85726 into main May 3, 2026
8 checks passed
@shaypal5 shaypal5 deleted the feat/difficulty-modulation branch May 3, 2026 22:08
@shaypal5 shaypal5 removed the request for review from Copilot May 3, 2026 22:15
shaypal5 added a commit that referenced this pull request May 3, 2026
…tion (#53)

* fix: update release docs for difficulty modulation and regenerate bundles

Remove stale "Known limitations" section claiming difficulty tiers share
the same conversion rate. Replace with a "Difficulty tiers" section
documenting actual conversion ranges. Regenerated all 4 release bundles
with the difficulty-aware engine (PR #52): intro 41.5%, intermediate
20.1%, advanced 7.9%.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: consolidate difficulty info, update HF card, restore determinism task

- Remove duplicate "Difficulty tiers" section from README; merge
  conversion rate (target + observed) into the existing Dataset summary
  table. Single source of truth, no duplication.
- Remove hardcoded ranges from intro paragraph to reduce staleness risk.
- Add conversion rate row to HF_DATASET_CARD.md summary table and
  update difficulty bullet — card was inconsistent with README.
- Restore SHA-256 determinism verification as an unchecked Phase 5 task
  in .agent-plan.md (was silently dropped in previous commit).
- Point to difficulty_profiles.yaml as source of truth for target ranges.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>