fix(narrative): rewrite dataset_card for zero-prior-knowledge readers by shaypal5 · Pull Request #88 · leadforge-dev/leadforge

shaypal5 · 2026-05-27T20:34:46Z

Problem

The three tier dataset cards opened with a raw metadata table (Recipe, Exposure mode, Seed, Difficulty, Horizon...) with no preamble, followed by a 'Narrative summary' section that read like a real company prospectus without ever saying 'this is synthetic data'. Anyone browsing the ShmuggingFace or Kaggle preview with no prior leadforge knowledge would have no idea what they were looking at.

What changed

Generator (`leadforge/narrative/dataset_card.py`)

Complete redesign of render_dataset_card():

| Before | After |
|---|---|
| Opens with raw metadata table (Recipe, Exposure mode...) | Opens with 'This is a synthetic dataset for practicing B2B lead scoring, generated by leadforge...' |
| 'Narrative summary' section (reads like a real company) | 'The simulated world' section, explicitly labelled fictional |
| No explanation of the prediction task | 'What you are predicting' paragraph + blockquote with label definition |
| No tier context | Per-tier callout with signal/noise/AUC/AP numbers + human-readable tier description |
| No code snippet | 'How to load' section with flat CSV + Parquet splits + relational tables |
| Metadata (Recipe, Seed, Package version) at the top | 'Reproducibility' section at the bottom with `leadforge generate` command |
| Suggested use cases (plain bullets) | 'Intended uses' section |
| Table inventory (counts only) | Table inventory with per-row descriptions |
| Persona role keys only (`vp_finance`) | Human title + role key ('VP Finance / vp_finance') |

Static release cards

release/{intro,intermediate,advanced}/dataset_card.md and their HF/Kaggle copies updated on disk (gitignored generated artifacts — not tracked). ShmuggingFace site rebuilt and redeployed to Cloudflare Pages.

Tier differences are now explicit:

intro: ~43% conversion, signal 0.90, LR AUC 0.671 — 'easiest; prototype your pipeline here'
intermediate: ~22% conversion, signal 0.70, LR AUC 0.662 — 'default benchmark; calibration matters'
advanced: ~8% conversion, signal 0.50, LR AUC 0.624 — 'rare-event / calibration exercise'

Tests (`tests/narrative/test_dataset_card.py`)

Updated 4 assertions that were checking for exact legacy Markdown strings:

test_card_contains_use_cases: accept 'intended' OR 'use cases'
test_card_feature_categories_rendered: case-insensitive category name check
test_card_leakage_flagged_columns: accept 'leakage' anywhere in card
test_card_with_narrative_contains_personas: added clarifying comment (assertion unchanged — role keys still appear)
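
The relaxed checks listed above can be sketched as small predicates over the rendered card text. This is an illustration only: the helper names and the sample card string are hypothetical, and the actual assertions in `tests/narrative/test_dataset_card.py` may be written differently.

```python
def mentions_use_cases(card: str) -> bool:
    """True if the card has either the new or the legacy use-case heading."""
    lowered = card.lower()
    return "intended" in lowered or "use cases" in lowered


def mentions_leakage(card: str) -> bool:
    """True if 'leakage' appears anywhere in the card, in any casing."""
    return "leakage" in card.lower()


def mentions_category(card: str, category: str) -> bool:
    """Case-insensitive check that a feature category name is rendered."""
    return category.lower() in card.lower()


# Hypothetical card fragment, used only to exercise the helpers.
sample_card = "## Intended uses\n\n**Leakage-flagged columns:** none\n\n| Firmographic | 4 |"
```

Matching on lowercase substrings keeps the tests tied to what the card says rather than to its exact Markdown formatting.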

All 1482 tests pass.

Preview

Live at leadforge-lead-scoring-v1-preview.pages.dev

🤖 Generated with Claude Code

…ge readers

Redesign render_dataset_card() so the generated card is immediately useful to a data scientist with no prior leadforge knowledge:

- Open with plain-English 'what is this / what you are predicting' paragraph before any metadata tables
- Per-tier callout block (conversion rate, signal/noise knobs, AUC, AP, P@100) with a tier-specific description explaining when to use each tier
- 'The simulated world' section (clearly labelled fictional) replaces the jargon-heavy 'Narrative summary'
- 'How to load' Python snippet (flat CSV + Parquet splits + relational tables) added as a dedicated section
- 'Reproducibility' section with generate command moves metadata (recipe, seed, package version) to the bottom instead of the top
- 'Intended uses' section (was 'Suggested use cases') restored
- Table inventory gains one-line descriptions per table
- Feature category table keeps 'Count' header; leakage-flagged text keeps 'Leakage-flagged columns:' anchor for test compatibility
- Persona rendering includes human title alongside role key

Static release cards (release/{intro,intermediate,advanced}/dataset_card.md and their HF/Kaggle copies) updated on disk (gitignored generated artifacts). ShmuggingFace site rebuilt and redeployed to Cloudflare Pages (leadforge-lead-scoring-v1-preview.pages.dev).

Tests: update test assertions that checked for exact Markdown formatting strings that changed:

- test_card_contains_use_cases: accept 'intended' as well as 'cases'
- test_card_feature_categories_rendered: case-insensitive category check
- test_card_leakage_flagged_columns: accept 'leakage' in any case
- test_card_with_narrative_contains_personas: doc comment clarification

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR rewrites the generated dataset card to be understandable to readers with no prior leadforge context, making the synthetic nature, prediction task, and tier meaning explicit before diving into technical details.

Changes:

Redesign render_dataset_card() structure and copy to lead with “synthetic dataset” framing, task definition, tier callout, loading instructions, reproducibility details, and clearer world narrative.
Expand table inventory and feature sections (descriptions, leakage explanation, and clearer category labels).
Relax/update a few dataset-card tests to be robust to the new Markdown phrasing and casing.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| `leadforge/narrative/dataset_card.py` | Major rewrite of dataset card rendering (new sections, tier callout, table inventory descriptions, loading + reproducibility guidance). |
| `tests/narrative/test_dataset_card.py` | Updates assertions to tolerate the new card structure/phrasing while preserving key invariants. |


+        f"| Signal strength | {cfg.signal_strength} / 1.0 |"
+        if hasattr(cfg, "signal_strength")
+        else "| Signal strength | see difficulty_profiles.yaml |",


+        "# Flat CSV — all leads, all splits combined (convenient for exploration)",
+        'df = pd.read_csv("lead_scoring.csv")',
+        f'X = df.drop(columns=["{cfg.primary_task}"])',
+        f'y = df["{cfg.primary_task}"]',
+        "",


+        "**Note on account overlap:** ~93% of test-set accounts also appear in the "
+        "training set (splits are keyed on `lead_id`). Headline AUC overstates "
+        "generalisation to *unseen* accounts. For a faithful out-of-sample estimate, "
+        'use `GroupKFold(groups=df["account_id"])`.',


+        f"leadforge generate --recipe {cfg.recipe_id} --seed {cfg.seed} \\",
+        f"                   --mode student_public --difficulty {difficulty} --out my_bundle",


…l artifacts

- HuggingFace public README: add authors: [shaypal5] to YAML frontmatter
- HuggingFace instructor README: same
- Kaggle dataset-metadata.json: add derelictpanda as collaborator (role: writer)
- dataset_card.py generator: append '**Author:** Shay Palachy Affek' line with HF, Kaggle, and GitHub links to the Reproducibility section
- All 9 static dataset_card.md files updated on disk; site rebuilt and redeployed to leadforge-lead-scoring-v1-preview.pages.dev

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-05-28T08:32:42Z

pr-agent-context report:

This run includes unresolved review comments on PR #88 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: leadforge/narrative/dataset_card.py:152
URL: https://github.com/leadforge-dev/leadforge/pull/88#discussion_r3313706799
Root author: copilot-pull-request-reviewer

Comment:
    The tier callout tries to read `cfg.signal_strength`, but `GenerationConfig` does not have that attribute (difficulty parameters live under `cfg.difficulty_params`). As written, this will always fall back to the YAML placeholder even when difficulty params are available, so the card won’t show the actual signal strength for the generated bundle.
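
One possible fix, as a minimal sketch: look the value up under `cfg.difficulty_params` and fall back to the placeholder only when it is really absent. The `GenerationConfig` stand-in below, and the assumption that `difficulty_params` is a plain mapping with a `signal_strength` key, are illustrative guesses; the real classes in leadforge may be shaped differently.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class GenerationConfig:
    """Hypothetical stand-in holding only the field this sketch needs."""

    difficulty_params: Optional[Mapping[str, Any]] = None


def signal_strength_row(cfg: GenerationConfig) -> str:
    """Render the 'Signal strength' row of the tier callout table."""
    params = cfg.difficulty_params or {}
    value = params.get("signal_strength")
    if value is None:
        # No difficulty params available: keep the original placeholder text.
        return "| Signal strength | see difficulty_profiles.yaml |"
    return f"| Signal strength | {value} / 1.0 |"
```

Testing for `None` rather than truthiness means a legitimate value of `0.0` would still be rendered.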

## COPILOT-2
Location: leadforge/narrative/dataset_card.py:309
URL: https://github.com/leadforge-dev/leadforge/pull/88#discussion_r3313706855
Root author: copilot-pull-request-reviewer

Comment:
    The “How to load” snippet treats `cfg.primary_task` as the label column name (`df[primary_task]` / `drop(columns=[primary_task])`), but the task ID and label column name can differ (the task directory can change while the label column remains `converted_within_90_days` / `task_manifest.label_column`). This will break the example for non-default tasks.
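
A sketch of how the snippet lines could be built from the label column instead of the task ID. The `TaskManifest` class and the `flat_csv_snippet` helper are hypothetical names for illustration; only `label_column` and `converted_within_90_days` come from the comment above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskManifest:
    """Hypothetical stand-in for the task manifest."""

    task_id: str
    label_column: str


def flat_csv_snippet(manifest: TaskManifest) -> List[str]:
    """Build the flat-CSV part of the 'How to load' snippet."""
    label = manifest.label_column  # the label column, not the task ID
    return [
        "# Flat CSV: all leads, all splits combined (convenient for exploration)",
        'df = pd.read_csv("lead_scoring.csv")',
        f'X = df.drop(columns=["{label}"])',
        f'y = df["{label}"]',
    ]
```

With this shape, renaming the task directory leaves the generated example valid as long as the manifest's label column is correct.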

## COPILOT-3
Location: leadforge/narrative/dataset_card.py:326
URL: https://github.com/leadforge-dev/leadforge/pull/88#discussion_r3313706891
Root author: copilot-pull-request-reviewer

Comment:
    The `GroupKFold` example is not valid scikit-learn API (`GroupKFold` doesn’t accept a `groups=` argument in the constructor). This will mislead readers; consider showing `GroupKFold(n_splits=...)` and passing `groups=` to `split(...)` / `cross_val_score(...)`, or make this a prose note without code-like syntax.
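
For reference, a small self-contained sketch of the valid scikit-learn usage: `GroupKFold` takes `n_splits` in its constructor, and the groups are passed to `split(...)` or `cross_val_score(...)`. The data here is random and purely illustrative; `groups` stands in for `df["account_id"]`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold, cross_val_score

# Synthetic stand-in data: 200 rows, 3 features, 40 account-like groups.
rng = np.random.default_rng(0)
n_rows = 200
X = rng.normal(size=(n_rows, 3))
y = (X[:, 0] + rng.normal(scale=0.5, size=n_rows) > 0).astype(int)
groups = rng.integers(0, 40, size=n_rows)  # plays the role of df["account_id"]

cv = GroupKFold(n_splits=5)

# groups= goes to split(), not to the constructor.
for train_idx, test_idx in cv.split(X, y, groups=groups):
    # No group appears on both sides of a fold.
    assert not set(groups[train_idx]) & set(groups[test_idx])

# The same splitter can be handed to cross_val_score together with groups=.
scores = cross_val_score(
    LogisticRegression(), X, y, groups=groups, cv=cv, scoring="roc_auc"
)
```

Because every fold holds out whole groups, the resulting scores estimate performance on accounts the model has not seen during training.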

## COPILOT-4
Location: leadforge/narrative/dataset_card.py:345
URL: https://github.com/leadforge-dev/leadforge/pull/88#discussion_r3313706930
Root author: copilot-pull-request-reviewer

Comment:
    The reproducibility command hard-codes `--mode student_public` instead of using the bundle’s actual `cfg.exposure_mode`. If someone renders a card for `research_instructor`, the command will be incorrect.
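
A minimal sketch of the suggested change, interpolating `cfg.exposure_mode` into the rendered command. The `GenerationConfig` stand-in, the `reproducibility_command` helper, and the example recipe name are hypothetical; the command layout mirrors the lines quoted from the diff.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GenerationConfig:
    """Hypothetical stand-in with only the fields the command needs."""

    recipe_id: str
    seed: int
    exposure_mode: str


def reproducibility_command(cfg: GenerationConfig, difficulty: str) -> List[str]:
    """Render the two-line `leadforge generate` command for the card."""
    return [
        f"leadforge generate --recipe {cfg.recipe_id} --seed {cfg.seed} \\",
        f"                   --mode {cfg.exposure_mode} --difficulty {difficulty} --out my_bundle",
    ]


# Example with a made-up recipe name and the instructor exposure mode.
example = reproducibility_command(
    GenerationConfig(recipe_id="example_recipe", seed=42, exposure_mode="research_instructor"),
    difficulty="intermediate",
)
```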

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 26563872419 attempt 1
Comment timestamp: 2026-05-28T08:31:53.908932+00:00
PR head commit: d24d7e1e6e76e7da0d16758aceb175985e4bcd50

Copilot AI review requested due to automatic review settings May 27, 2026 20:34

shaypal5 added this to the v1.0.0 — Polished OSS release milestone May 27, 2026

shaypal5 added type: bugfix Fixes a bug layer: narrative narrative/ vertical story layer labels May 27, 2026

Copilot started reviewing on behalf of shaypal5 May 27, 2026 20:34 View session

Copilot AI reviewed May 27, 2026

shaypal5 merged commit a34c9f2 into main May 28, 2026
10 checks passed

shaypal5 deleted the fix/dataset-card-rewrite branch May 28, 2026 20:12

shaypal5 merged 2 commits into main from fix/dataset-card-rewrite
