leadforge-dev · shaypal5 · May 3, 2026
diff --git a/.agent-plan.md b/.agent-plan.md
@@ -6,11 +6,51 @@
 
 ## Current System State
 
-**v1.0.0 released (2026-05-02).** All milestones (M0–M13) complete. Package version bumped to 1.0.0 in pyproject.toml and leadforge/version.py. README updated with `pip install leadforge`. CHANGELOG consolidated under v1.0.0 heading.
+**v1.0.0 released (2026-05-02).** All milestones (M0–M13) complete. Teaching dataset series (v1–v7) approved by consumer. Package version bumped to 1.0.0 in pyproject.toml and leadforge/version.py.
 
 ---
 
-## Next Up — v4 Lead Scoring Dataset
+## Next Up — Public Kaggle/HuggingFace Release
+
+First public dataset release: `leadforge-b2b-lead-scoring`. Three difficulty tiers (intro/intermediate/advanced) as full relational bundles + flat CSV convenience exports, plus a research_instructor companion for intermediate.
+
+### Public release — Phase 1: Dataset card improvement ✓ (in PR)
+
+- [x] `render_dataset_card()` accepts `table_counts` dict → renders table inventory
+- [x] Feature categories section rendered from `LEAD_SNAPSHOT_FEATURES` (category counts, examples, leakage flags)
+- [x] `write_bundle()` passes `table_row_counts` to card renderer
+- [x] 4 new tests (table inventory with/without counts, feature categories, leakage flags)
+
+### Public release — Phase 2: Build script + flat CSV ✓ (in PR)
+
+- [x] `scripts/build_public_release.py` — generates 4 bundles, validates, creates flat CSV exports
+- [x] Flat CSV drops `current_stage` (contains terminal stages that encode the label at 90-day horizon)
+- [x] All 4 bundles pass `validate_bundle()`
+
+### Public release — Phase 3: Platform README + HF card ✓ (in PR)
+
+- [x] `release/README.md` — landing page with directory structure, quick-start snippets, dataset summary, provenance
+- [x] `release/HF_DATASET_CARD.md` — YAML frontmatter with configs for each difficulty tier
+
+### Public release — Phase 4: Baseline notebook ✓ (in PR)
+
+- [x] `release/notebooks/01_baseline_lead_scoring.ipynb` — LR + GBM baselines, P@K, value-aware ranking, feature importance
+- [x] Excludes `current_stage` and leakage-flagged columns
+- [x] Works from pre-generated Parquet files (no leadforge install needed)
+
+### Public release — Phase 5: Generate final release + upload (pending)
+
+- [ ] Run build script, verify SHA-256 hash determinism
+- [ ] Upload to Kaggle and HuggingFace
+- [ ] Announce
+
+### Known issue: `current_stage` leakage at 90-day horizon
+
+The full bundle snapshot includes `current_stage`, which at day 90 contains terminal stages (`closed_won`/`closed_lost`) and so perfectly encodes the label. The flat CSV export drops the column; the Parquet task splits retain it, flagged and documented as a leakage risk. A proper fix (windowed snapshot or column redaction in the exposure layer) is deferred.
+
+---
+
+## Previous Focus — v4–v7 Lead Scoring Datasets
 
 The primary focus is producing a v4 lead scoring dataset that fixes the issues found in v1–v3 datasets. This requires targeted engine changes + a build pipeline, followed by dataset release.
 

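The Phase 5 determinism check could be scripted roughly as follows. This is a minimal sketch, not the project's build script: `hash_tree` and `diff_builds` are hypothetical helper names, and it assumes "deterministic" means that two runs of the build produce byte-identical files.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_builds(first: Path, second: Path) -> list[str]:
    """Return relative paths that differ or exist in only one of the two builds."""
    a, b = hash_tree(first), hash_tree(second)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `diff_builds(Path("out/run1"), Path("out/run2"))` would indicate that the two builds match byte for byte; the directory names here are placeholders.
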
diff --git a/.gitignore b/.gitignore
@@ -208,3 +208,11 @@ __marimo__/
 
 # MacOS DS_Store files
 .DS_Store
+
+# Generated output bundles
+out/
+release/intro/
+release/intermediate/
+release/advanced/
+release/intermediate_instructor/
+release/LICENSE
diff --git a/leadforge/api/bundle.py b/leadforge/api/bundle.py
@@ -80,7 +80,9 @@ def write_bundle(
     # ------------------------------------------------------------------
     # 3. Dataset card and feature dictionary
     # ------------------------------------------------------------------
-    (root / "dataset_card.md").write_text(render_dataset_card(bundle.spec, task_manifest=task))
+    (root / "dataset_card.md").write_text(
+        render_dataset_card(bundle.spec, task_manifest=task, table_counts=table_row_counts)
+    )
     write_feature_dictionary(root / "feature_dictionary.csv")
 
     # ------------------------------------------------------------------

diff --git a/leadforge/narrative/dataset_card.py b/leadforge/narrative/dataset_card.py
@@ -6,8 +6,11 @@
 
 from __future__ import annotations
 
+from collections import Counter
 from typing import TYPE_CHECKING
 
+from leadforge.schema.features import LEAD_SNAPSHOT_FEATURES
+
 if TYPE_CHECKING:
     from leadforge.core.models import WorldSpec
     from leadforge.schema.tasks import TaskManifest
@@ -16,6 +19,7 @@
 def render_dataset_card(
     world_spec: WorldSpec,
     task_manifest: TaskManifest | None = None,
+    table_counts: dict[str, int] | None = None,
 ) -> str:
     """Return a Markdown dataset card string for *world_spec*.
 
@@ -24,17 +28,18 @@ def render_dataset_card(
         task_manifest: Optional task manifest whose ``description`` is used
             as the label definition prose.  When ``None`` or when
             ``description`` is empty, a generic fallback is rendered.
+        table_counts: Optional mapping of table name → row count.  When
+            provided, the table inventory section renders actual counts
+            instead of a placeholder.
 
-    Sections present at all milestones:
+    Sections:
     - Header (recipe id, version, seed, exposure mode)
     - Narrative summary (company, product, market, GTM)
     - Primary task and label definition
-    - Suggested use cases
-    - Caveats
-
-    Sections populated in later milestones (rendered as stubs here):
     - Table inventory
     - Feature categories
+    - Suggested use cases
+    - Caveats
     """
     cfg = world_spec.config
     narrative = world_spec.narrative
@@ -122,24 +127,47 @@ def render_dataset_card(
     ]
 
     # ------------------------------------------------------------------
-    # Table inventory (stub — populated in later milestones)
+    # Table inventory
     # ------------------------------------------------------------------
-    lines += [
-        "## Table inventory",
-        "",
-        "*Table counts will appear here once the simulation layer is implemented (v0.3.0+).*",
-        "",
-    ]
+    lines += ["## Table inventory", ""]
+    if table_counts is not None:
+        lines += [
+            "| Table | Rows |",
+            "|---|---:|",
+        ]
+        for tbl, count in table_counts.items():
+            lines.append(f"| {tbl} | {count:,} |")
+        lines.append("")
+    else:
+        lines += [
+            "*Table counts not available (pass ``table_counts`` to populate).*",
+            "",
+        ]
 
     # ------------------------------------------------------------------
-    # Feature categories (stub)
+    # Feature categories
     # ------------------------------------------------------------------
+    lines += ["## Feature categories", ""]
+    category_counts: Counter[str] = Counter()
+    for feat in LEAD_SNAPSHOT_FEATURES:
+        category_counts[feat.category] += 1
     lines += [
-        "## Feature categories",
-        "",
-        "*Feature dictionary will appear here once the schema layer is implemented (v0.3.0+).*",
-        "",
+        "| Category | Count | Examples |",
+        "|---|---:|---|",
     ]
+    for cat, count in category_counts.items():
+        examples = [
+            f.name for f in LEAD_SNAPSHOT_FEATURES if f.category == cat and not f.is_target
+        ][:3]
+        lines.append(f"| {cat} | {count} | {', '.join(examples)} |")
+    leakage_cols = [f.name for f in LEAD_SNAPSHOT_FEATURES if f.leakage_risk]
+    if leakage_cols:
+        lines += [
+            "",
+            f"**Leakage-flagged columns:** {', '.join(f'`{c}`' for c in leakage_cols)}. "
+            "See `feature_dictionary.csv` for details.",
+        ]
+    lines.append("")
 
     # ------------------------------------------------------------------
     # Suggested use cases
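
To illustrate what the two new sections render, here is a self-contained sketch that mirrors the logic in this hunk. `Feature` is a stand-in for the project's `FeatureSpec` (only the attributes used above are modeled), `render_sections` is a hypothetical name, and the sample features, the `engagement` category, and the row counts are made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Stand-in for FeatureSpec; only the fields the card renderer reads."""

    name: str
    category: str
    is_target: bool = False
    leakage_risk: bool = False


def render_sections(features, table_counts=None):
    """Render the table inventory and feature category sections as Markdown."""
    lines = ["## Table inventory", ""]
    if table_counts is not None:
        lines += ["| Table | Rows |", "|---|---:|"]
        lines += [f"| {tbl} | {count:,} |" for tbl, count in table_counts.items()]
    else:
        lines.append("*Table counts not available.*")
    lines += ["", "## Feature categories", ""]
    lines += ["| Category | Count | Examples |", "|---|---:|---|"]
    counts = Counter(f.category for f in features)
    for cat, count in counts.items():
        # Up to three non-target feature names per category.
        examples = [f.name for f in features if f.category == cat and not f.is_target][:3]
        lines.append(f"| {cat} | {count} | {', '.join(examples)} |")
    flagged = [f.name for f in features if f.leakage_risk]
    if flagged:
        names = ", ".join(f"`{c}`" for c in flagged)
        lines += ["", f"**Leakage-flagged columns:** {names}."]
    return "\n".join(lines)


# Illustrative data only; not the real feature list or row counts.
SAMPLE = [
    Feature("current_stage", "lead_meta", leakage_risk=True),
    Feature("is_mql", "lead_meta"),
    Feature("total_touches_all", "engagement", leakage_risk=True),
]
card = render_sections(SAMPLE, {"leads": 5000, "touches": 12345})
print(card)
```

Calling `render_sections(SAMPLE)` without counts falls back to the placeholder line, which matches the `table_counts is None` branch in the diff.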

diff --git a/leadforge/schema/features.py b/leadforge/schema/features.py
@@ -116,8 +116,12 @@ class FeatureSpec:
     FeatureSpec(
         "current_stage",
         "string",
-        "Funnel stage at snapshot anchor date.",
+        "Funnel stage at snapshot anchor date. WARNING: at full-horizon "
+        "(90-day) snapshots this contains terminal stages (closed_won / "
+        "closed_lost) that encode the label. Exclude from modeling or use "
+        "a windowed snapshot.",
         "lead_meta",
+        leakage_risk=True,
     ),
     FeatureSpec(
         "is_mql",

diff --git a/release/HF_DATASET_CARD.md b/release/HF_DATASET_CARD.md
@@ -0,0 +1,104 @@
+---
+language:
+  - en
+license: mit
+task_categories:
+  - tabular-classification
+tags:
+  - lead-scoring
+  - b2b
+  - crm
+  - synthetic
+  - relational
+  - sales
+  - funnel
+  - binary-classification
+  - reproducible
+size_categories:
+  - 1K<n<10K
+configs:
+  - config_name: intro
+    data_files:
+      - split: train
+        path: intro/tasks/converted_within_90_days/train.parquet
+      - split: validation
+        path: intro/tasks/converted_within_90_days/valid.parquet
+      - split: test
+        path: intro/tasks/converted_within_90_days/test.parquet
+  - config_name: intermediate
+    data_files:
+      - split: train
+        path: intermediate/tasks/converted_within_90_days/train.parquet
+      - split: validation
+        path: intermediate/tasks/converted_within_90_days/valid.parquet
+      - split: test
+        path: intermediate/tasks/converted_within_90_days/test.parquet
+  - config_name: advanced
+    data_files:
+      - split: train
+        path: advanced/tasks/converted_within_90_days/train.parquet
+      - split: validation
+        path: advanced/tasks/converted_within_90_days/valid.parquet
+      - split: test
+        path: advanced/tasks/converted_within_90_days/test.parquet
+---
+
+# LeadForge: Synthetic B2B Lead Scoring Dataset
+
+A relational, reproducible, multi-difficulty lead scoring dataset generated by [leadforge](https://github.com/leadforge-dev/leadforge) -- an open-source Python framework for synthetic CRM/funnel data.
+
+## Why this dataset?
+
+1. **Relational structure.** 9 normalized tables plus ML-ready task splits. Practice feature engineering from raw tables, or grab the flat file and start modeling.
+2. **Three difficulty tiers.** Same world, different signal-to-noise ratios.
+3. **Reproducible and leakage-aware.** Deterministic generation (seed 42), SHA-256 hashes in each bundle's `manifest.json`, and leakage-prone columns flagged explicitly in `feature_dictionary.csv`.
+
+## Quick start
+
+```python
+from datasets import load_dataset
+
+# Load intermediate difficulty
+ds = load_dataset("leadforge/leadforge-b2b-lead-scoring", name="intermediate")
+train = ds["train"].to_pandas()
+valid = ds["validation"].to_pandas()  # Note: file is valid.parquet, split name is "validation"
+test = ds["test"].to_pandas()
+```
+
+Or use the flat CSV:
+
+```python
+import pandas as pd
+df = pd.read_csv("hf://datasets/leadforge/leadforge-b2b-lead-scoring/intermediate/lead_scoring.csv")
+```
+
+## Dataset summary
+
+| | Intro | Intermediate | Advanced |
+|---|---|---|---|
+| Leads | 5,000 | 5,000 | 5,000 |
+| Features | 35 | 35 | 35 |
+| Target | `converted_within_90_days` | `converted_within_90_days` | `converted_within_90_days` |
+| Signal strength | 0.90 | 0.70 | 0.50 |
+| Noise scale | 0.10 | 0.30 | 0.55 |
+| Missing rate | 2% | 8% | 18% |
+
+## The scenario
+
+**Veridian Technologies** sells cloud procurement automation to mid-market firms (200-2,000 employees). Sales channels: inbound (45%), SDR outbound (35%), partner referrals (20%). Four buyer personas. **Task:** predict conversion within 90 days.
+
+## Relational tables
+
+Each difficulty tier includes 9 Parquet tables under `tables/`: accounts, contacts, leads, touches, sessions, sales_activities, opportunities, customers, subscriptions. These form a normalized CRM schema linked by foreign keys.
+
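+## Leakage traps
+
+`total_touches_all` counts touches over the full 90-day window, including post-snapshot events. `current_stage` holds terminal stages (`closed_won`/`closed_lost`) at the 90-day snapshot, which encode the label; the Parquet splits retain it, while the flat CSV drops it. Both are flagged as `leakage_risk=True` in `feature_dictionary.csv` and should be excluded before modeling.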
+
+## Research companion
+
+`intermediate_instructor/` includes the full causal structure: world graph (DAG), latent trait registry, and mechanism assignments.
+
+## Provenance
+
+Generated by [leadforge](https://github.com/leadforge-dev/leadforge) v1.0.0, recipe `b2b_saas_procurement_v1`, seed 42. MIT license. See `manifest.json` in each bundle for SHA-256 hashes.
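
One way a reader might act on the leakage flags is to read the feature dictionary and drop every flagged column before training. The sketch below uses only the standard library and in-memory stand-ins. It assumes `feature_dictionary.csv` has `name` and `leakage_risk` columns with `True`/`False` cells, which the card does not confirm; `leakage_columns` and `drop_columns` are hypothetical helpers.

```python
import csv
import io


def leakage_columns(dictionary_file) -> set[str]:
    """Return names of features whose leakage_risk cell reads as true."""
    reader = csv.DictReader(dictionary_file)
    return {row["name"] for row in reader if row["leakage_risk"].strip().lower() == "true"}


def drop_columns(rows, columns):
    """Copy each record without the given columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]


# Illustrative stand-ins for feature_dictionary.csv and a few lead records.
# The column layout of the dictionary is an assumption.
dictionary = io.StringIO(
    "name,leakage_risk\n"
    "current_stage,True\n"
    "total_touches_all,True\n"
    "is_mql,False\n"
)
leads = [{"is_mql": 1, "current_stage": "closed_won", "total_touches_all": 14}]

flagged = leakage_columns(dictionary)
clean = drop_columns(leads, flagged)
```

With pandas, the same set could be applied to a loaded split via `df.drop(columns=sorted(flagged), errors="ignore")`.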