
fix(release): remove Kaggle-invalid keywords from default keyword set #100

Merged
shaypal5 merged 1 commit into main from fix/kaggle-keywords-and-metadata on May 30, 2026

Conversation

@shaypal5

Contributor

Summary

  • Removes b2b, crm, lead-scoring, saas, synthetic-data from DEFAULT_KEYWORDS in package_kaggle_release.py — Kaggle rejected all five as unknown tags during dataset_metadata_update
  • Aligns REQUIRED_COMMON_TAGS in lint_platform_metadata.py to only require tags valid on both platforms (only tabular survives); HF-specific tags are still enforced by the separate EXPECTED_HF_TAGS exact-match check
  • Updates test fixture in test_lint_platform_metadata.py and regenerates release/_preview_committed/kaggle.html golden file
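
The first two changes above can be sketched roughly as follows. This is an illustration only, not the code from package_kaggle_release.py or lint_platform_metadata.py: the helper names drop_rejected and common_required_tags are hypothetical, and the example tag lists passed to them are made up, since the real contents of DEFAULT_KEYWORDS and EXPECTED_HF_TAGS are not shown in this PR. Only the five rejected keywords and the surviving tabular tag come from the description.

```python
# The five keywords this PR reports as rejected by Kaggle.
KAGGLE_REJECTED = frozenset(
    {"b2b", "crm", "lead-scoring", "saas", "synthetic-data"}
)


def drop_rejected(keywords, rejected=KAGGLE_REJECTED):
    """Return keywords in their original order, without rejected or repeated entries.

    Hypothetical helper: the PR edits DEFAULT_KEYWORDS directly rather than
    filtering at runtime.
    """
    seen = set()
    kept = []
    for kw in keywords:
        if kw in rejected or kw in seen:
            continue
        seen.add(kw)
        kept.append(kw)
    return kept


def common_required_tags(hf_tags, kaggle_valid_tags):
    """Return the tags that are valid on both platforms, sorted for stable output.

    Hypothetical helper: shows the idea behind narrowing REQUIRED_COMMON_TAGS
    to the intersection of the two platforms' tag sets.
    """
    return sorted(set(hf_tags) & set(kaggle_valid_tags))
```

With made-up inputs, drop_rejected(["tabular", "crm", "b2b"]) keeps only tabular, and common_required_tags(["tabular", "crm"], ["tabular"]) likewise yields a single required tag, which matches the outcome described for REQUIRED_COMMON_TAGS.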

Context

The Kaggle live dataset (derelictpanda/leadforge-lead-scoring-v1) has already had its metadata updated with the correct description, subtitle, MIT license, and cover image. The invalid keywords were silently dropped by the API during that update; this PR makes the source artifacts consistent with what Kaggle actually accepted.
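
Because the invalid keywords were dropped without failing the update, this kind of drift is easy to miss. A minimal sketch of a check that would surface it is below. It assumes the accepted keyword list has already been read back from the live dataset by some means; it does not call the Kaggle API, and the function name dropped_keywords is hypothetical rather than part of this repository.

```python
def dropped_keywords(requested, accepted):
    """Return requested keywords missing from the accepted list, in request order.

    Hypothetical helper: `accepted` is assumed to be the keyword list the
    platform actually stored, obtained separately from this function.
    """
    accepted_set = set(accepted)
    return [kw for kw in requested if kw not in accepted_set]


def assert_no_drift(requested, accepted):
    """Raise ValueError naming any keyword the platform did not keep."""
    missing = dropped_keywords(requested, accepted)
    if missing:
        raise ValueError(f"keywords not accepted: {', '.join(missing)}")
```

Running such a check after a metadata update would turn a silent drop into an explicit failure, so the source artifacts cannot quietly diverge from what the platform stored.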

Test plan

  • pytest tests/scripts/test_lint_platform_metadata.py — 12 passed
  • pytest tests/scripts/test_preview_kaggle_page.py — 26 passed
  • Full suite: 1482 passed, 1 skipped

🤖 Generated with Claude Code

Kaggle rejected "b2b", "crm", "lead-scoring", "saas", "synthetic-data"
as unknown tags. Drop them from DEFAULT_KEYWORDS in the Kaggle packager,
align REQUIRED_COMMON_TAGS in the lint script to only include tags valid
on both platforms (keeping "tabular"), and update the test fixture and
golden preview HTML to match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 30, 2026 21:04
shaypal5 added the labels "type: bugfix" (Fixes a bug) and "layer: exposure" (exposure/ truth-mode filtering) on May 30, 2026
shaypal5 merged commit 9c0af8d into main on May 30, 2026
10 of 11 checks passed
shaypal5 deleted the fix/kaggle-keywords-and-metadata branch on May 30, 2026 21:04

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@github-actions


pr-agent-context report:

No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #100 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: pull request opened
Workflow run: 26694889886 attempt 1
Comment timestamp: 2026-05-30T21:04:35.558435+00:00
PR head commit: c6dbd07ee66849dc1acdeb0b8ded419904591e1b


Labels

layer: exposure (exposure/ truth-mode filtering), type: bugfix (Fixes a bug)


2 participants