fix(narrative): rewrite dataset_card for zero-prior-knowledge readers #88
Merged
…ge readers
Redesign render_dataset_card() so the generated card is immediately
useful to a data scientist with no prior leadforge knowledge:
- Open with plain-English 'what is this / what you are predicting'
paragraph before any metadata tables
- Per-tier callout block (conversion rate, signal/noise knobs, AUC,
AP, P@100) with a tier-specific description explaining when to use
each tier
- 'The simulated world' section (clearly labelled fictional) replaces
the jargon-heavy 'Narrative summary'
- 'How to load' Python snippet (flat CSV + Parquet splits + relational
tables) added as a dedicated section
- 'Reproducibility' section with generate command moves metadata
(recipe, seed, package version) to the bottom instead of the top
- 'Intended uses' section (was 'Suggested use cases') restored
- Table inventory gains one-line descriptions per table
- Feature category table keeps 'Count' header; leakage-flagged text
keeps 'Leakage-flagged columns:' anchor for test compatibility
- Persona rendering includes human title alongside role key
Static release cards (release/{intro,intermediate,advanced}/dataset_card.md
and their HF/Kaggle copies) updated on disk (gitignored generated
artifacts). ShmuggingFace site rebuilt and redeployed to Cloudflare
Pages (leadforge-lead-scoring-v1-preview.pages.dev).
Tests: update test assertions that checked for exact Markdown
formatting strings that changed:
- test_card_contains_use_cases: accept 'intended' as well as 'cases'
- test_card_feature_categories_rendered: case-insensitive category check
- test_card_leakage_flagged_columns: accept 'leakage' in any case
- test_card_with_narrative_contains_personas: doc comment clarification
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
This PR rewrites the generated dataset card to be understandable to readers with no prior leadforge context, making the synthetic nature, prediction task, and tier meaning explicit before diving into technical details.
Changes:
- Redesign `render_dataset_card()` structure and copy to lead with “synthetic dataset” framing, task definition, tier callout, loading instructions, reproducibility details, and clearer world narrative.
- Expand table inventory and feature sections (descriptions, leakage explanation, and clearer category labels).
- Relax/update a few dataset-card tests to be robust to the new Markdown phrasing and casing.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `leadforge/narrative/dataset_card.py` | Major rewrite of dataset card rendering (new sections, tier callout, table inventory descriptions, loading + reproducibility guidance). |
| `tests/narrative/test_dataset_card.py` | Updates assertions to tolerate the new card structure/phrasing while preserving key invariants. |
Comment on lines +150 to +152

    f"| Signal strength | {cfg.signal_strength} / 1.0 |"
    if hasattr(cfg, "signal_strength")
    else "| Signal strength | see difficulty_profiles.yaml |",
Comment on lines +305 to +309

    "# Flat CSV — all leads, all splits combined (convenient for exploration)",
    'df = pd.read_csv("lead_scoring.csv")',
    f'X = df.drop(columns=["{cfg.primary_task}"])',
    f'y = df["{cfg.primary_task}"]',
    "",
Comment on lines +323 to +326

    "**Note on account overlap:** ~93% of test-set accounts also appear in the "
    "training set (splits are keyed on `lead_id`). Headline AUC overstates "
    "generalisation to *unseen* accounts. For a faithful out-of-sample estimate, "
    'use `GroupKFold(groups=df["account_id"])`.',
Comment on lines +344 to +345

    f"leadforge generate --recipe {cfg.recipe_id} --seed {cfg.seed} \\",
    f"    --mode student_public --difficulty {difficulty} --out my_bundle",
…l artifacts
- HuggingFace public README: add authors: [shaypal5] to YAML frontmatter
- HuggingFace instructor README: same
- Kaggle dataset-metadata.json: add derelictpanda as collaborator (role: writer)
- dataset_card.py generator: append '**Author:** Shay Palachy Affek' line with HF, Kaggle, and GitHub links to the Reproducibility section
- All 9 static dataset_card.md files updated on disk; site rebuilt and redeployed to leadforge-lead-scoring-v1-preview.pages.dev
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pr-agent-context report: This run includes unresolved review comments on PR #88 in repository https://github.com/leadforge-dev/leadforge
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: leadforge/narrative/dataset_card.py:152
URL: https://github.com/leadforge-dev/leadforge/pull/88#discussion_r3313706799
Root author: copilot-pull-request-reviewer
Comment:
The tier callout tries to read `cfg.signal_strength`, but `GenerationConfig` does not have that attribute (difficulty parameters live under `cfg.difficulty_params`). As written, this will always fall back to the YAML placeholder even when difficulty params are available, so the card won’t show the actual signal strength for the generated bundle.
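One possible shape for the fix, as a minimal sketch only. The comment says difficulty parameters live under `cfg.difficulty_params` but does not show that object's type, so the sketch assumes it is a mapping (or `None`) with a `signal_strength` key. `DifficultyCfgStub` and `signal_strength_row` are hypothetical names, not leadforge APIs.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class DifficultyCfgStub:
    """Hypothetical stand-in for GenerationConfig; only the field used here."""

    difficulty_params: Optional[Mapping[str, Any]] = None


def signal_strength_row(cfg: DifficultyCfgStub) -> str:
    """Render the 'Signal strength' row of the tier callout table."""
    params = cfg.difficulty_params or {}
    value = params.get("signal_strength")  # assumed key name
    if value is None:
        # Same fallback text the current code emits.
        return "| Signal strength | see difficulty_profiles.yaml |"
    return f"| Signal strength | {value} / 1.0 |"


with_params = signal_strength_row(DifficultyCfgStub({"signal_strength": 0.7}))
without_params = signal_strength_row(DifficultyCfgStub())
```

With this shape the YAML placeholder appears only when the parameters are genuinely missing, rather than on every render.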
## COPILOT-2
Location: leadforge/narrative/dataset_card.py:309
URL: https://github.com/leadforge-dev/leadforge/pull/88#discussion_r3313706855
Root author: copilot-pull-request-reviewer
Comment:
The “How to load” snippet treats `cfg.primary_task` as the label column name (`df[primary_task]` / `drop(columns=[primary_task])`), but the task ID and label column name can differ (the task directory can change while the label column remains `converted_within_90_days` / `task_manifest.label_column`). This will break the example for non-default tasks.
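A hedged sketch of how the snippet could be rendered from the label column instead of the task ID. `render_flat_csv_snippet` is a hypothetical helper; the caller is assumed to pass whatever `task_manifest.label_column` holds for the bundle.

```python
def render_flat_csv_snippet(label_column: str) -> list[str]:
    """Build the flat-CSV lines of the 'How to load' snippet.

    label_column is expected to come from the task manifest
    (for example task_manifest.label_column), not from the task ID.
    """
    return [
        "# Flat CSV: all leads, all splits combined (convenient for exploration)",
        'df = pd.read_csv("lead_scoring.csv")',
        f'X = df.drop(columns=["{label_column}"])',
        f'y = df["{label_column}"]',
        "",
    ]


snippet = render_flat_csv_snippet("converted_within_90_days")
```

The rendered example then stays valid even when the task directory is renamed, because it no longer depends on `cfg.primary_task` matching a column name.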
## COPILOT-3
Location: leadforge/narrative/dataset_card.py:326
URL: https://github.com/leadforge-dev/leadforge/pull/88#discussion_r3313706891
Root author: copilot-pull-request-reviewer
Comment:
The `GroupKFold` example is not valid scikit-learn API (`GroupKFold` doesn’t accept a `groups=` argument in the constructor). This will mislead readers; consider showing `GroupKFold(n_splits=...)` and passing `groups=` to `split(...)` / `cross_val_score(...)`, or make this a prose note without code-like syntax.
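For reference, a small runnable sketch of the usage the comment describes. The data is random toy data standing in for the lead table (not leadforge output), and `groups` plays the role of `df["account_id"]`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold, cross_val_score

# Toy stand-in data: 200 leads spread over up to 40 accounts.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)
groups = rng.integers(0, 40, size=200)  # plays the role of df["account_id"]

cv = GroupKFold(n_splits=5)  # the constructor takes n_splits, not groups

# groups= goes to split(...): no account lands in both train and test.
folds_are_disjoint = all(
    set(groups[train_idx]).isdisjoint(groups[test_idx])
    for train_idx, test_idx in cv.split(X, y, groups=groups)
)

# ...or to cross_val_score(...), which forwards it to the splitter.
scores = cross_val_score(
    LogisticRegression(), X, y, groups=groups, cv=cv, scoring="roc_auc"
)
```

A card note along the lines of "use `GroupKFold(n_splits=5)` and pass `groups=df["account_id"]` to `split(...)`" would match this usage.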
## COPILOT-4
Location: leadforge/narrative/dataset_card.py:345
URL: https://github.com/leadforge-dev/leadforge/pull/88#discussion_r3313706930
Root author: copilot-pull-request-reviewer
Comment:
The reproducibility command hard-codes `--mode student_public` instead of using the bundle’s actual `cfg.exposure_mode`. If someone renders a card for `research_instructor`, the command will be incorrect.
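A minimal sketch of the change COPILOT-4 asks for, assuming `cfg.exposure_mode` is (or formats as) the plain mode string that `--mode` expects. `ReproCfgStub` and `render_generate_command` are hypothetical names, and `demo_recipe` is a made-up recipe ID used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReproCfgStub:
    """Hypothetical stand-in for GenerationConfig; only the fields used here."""

    recipe_id: str
    seed: int
    exposure_mode: str  # assumed to be a plain string such as "student_public"


def render_generate_command(cfg: ReproCfgStub, difficulty: str) -> list[str]:
    """Build the two-line generate command shown in the Reproducibility section."""
    return [
        f"leadforge generate --recipe {cfg.recipe_id} --seed {cfg.seed} \\",
        f"    --mode {cfg.exposure_mode} --difficulty {difficulty} --out my_bundle",
    ]


cfg = ReproCfgStub(recipe_id="demo_recipe", seed=42, exposure_mode="research_instructor")
command_lines = render_generate_command(cfg, "intro")
```

If `exposure_mode` turns out to be an enum in the real config, the f-string would need its `.value` (or equivalent) rather than the member itself.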
Problem
The three tier dataset cards opened with a raw metadata table (Recipe, Exposure mode, Seed, Difficulty, Horizon...) with no preamble, followed by a 'Narrative summary' section that read like a real company prospectus without ever saying 'this is synthetic data'. Anyone browsing the ShmuggingFace or Kaggle preview with no prior leadforge knowledge would have no idea what they were looking at.
What changed
Generator (`leadforge/narrative/dataset_card.py`)
Complete redesign of `render_dataset_card()`:
- … `leadforge generate` command
- … `vp_finance`)
Static release cards
`release/{intro,intermediate,advanced}/dataset_card.md` and their HF/Kaggle copies updated on disk (gitignored generated artifacts, not tracked). ShmuggingFace site rebuilt and redeployed to Cloudflare Pages.
Tier differences are now explicit:
Tests (`tests/narrative/test_dataset_card.py`)
Updated 4 assertions that were checking for exact legacy Markdown strings:
- `test_card_contains_use_cases`: accept 'intended' OR 'use cases'
- `test_card_feature_categories_rendered`: case-insensitive category name check
- `test_card_leakage_flagged_columns`: accept 'leakage' anywhere in card
- `test_card_with_narrative_contains_personas`: added clarifying comment (assertion unchanged; role keys still appear)
All 1482 tests pass.
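The relaxed checks described in this section could look roughly like the following. This is an illustrative sketch, not the actual test code: the helper names, the sample card text, and the category names are all made up.

```python
def mentions_use_cases(card: str) -> bool:
    """Accept either the new 'Intended uses' or the old 'use cases' wording."""
    lowered = card.lower()
    return "intended" in lowered or "use cases" in lowered


def mentions_leakage(card: str) -> bool:
    """Accept 'leakage' anywhere in the card, in any casing."""
    return "leakage" in card.lower()


def mentions_categories(card: str, categories: list[str]) -> bool:
    """Case-insensitive check that every category name appears in the card."""
    lowered = card.lower()
    return all(category.lower() in lowered for category in categories)


# Made-up card fragment for illustration only.
sample_card = (
    "## Intended uses\n"
    "| Category | Count |\n"
    "| Firmographic | 12 |\n"
    "Leakage-flagged columns: none\n"
)
```

Checks written this way keep asserting that the section exists while no longer pinning the exact heading text or casing.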
Preview
Live at leadforge-lead-scoring-v1-preview.pages.dev
🤖 Generated with Claude Code