
feat(scripts): publish_kaggle + publish_hf + v1 release notes (PR 7.3) #87

Merged
shaypal5 merged 2 commits into main from feat/pr7.3-publish-scripts
May 28, 2026

Conversation

@shaypal5

Contributor

Summary

Adds the final two publish scripts and the v1 release runbook. All dry-runs pass locally.

Scripts

scripts/publish_kaggle.py

Three-stage runbook:

  1. --dry-run — re-packages release/kaggle/ + lints metadata, no upload
  2. Default — uploads as private via kaggle datasets create
  3. --public — single-step public upload; or flip via Kaggle web UI after review

Also supports --update MESSAGE for future version bumps (kaggle datasets version).
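The three stages can be sketched as a mapping from CLI flags to Kaggle CLI invocations. This is a minimal illustration, not the script's actual code: `build_parser`, `build_kaggle_command`, and the `release/kaggle` default are hypothetical names, and the sketch assumes the standard `kaggle datasets create -p FOLDER [--public]` and `kaggle datasets version -p FOLDER -m MESSAGE` forms.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in this PR."""
    parser = argparse.ArgumentParser(description="Publish the Kaggle bundle (sketch).")
    parser.add_argument("--dry-run", action="store_true",
                        help="package + lint only, no upload")
    parser.add_argument("--public", action="store_true",
                        help="create the dataset as public in a single step")
    parser.add_argument("--update", metavar="MESSAGE",
                        help="push a new version with this message")
    return parser


def build_kaggle_command(args: argparse.Namespace,
                         folder: str = "release/kaggle") -> list[str] | None:
    """Return the Kaggle CLI command for the chosen stage, or None for a dry run."""
    if args.dry_run:
        return None  # pre-flight checks only; nothing is uploaded
    if args.update:
        # Version bump on an existing dataset; visibility is left unchanged.
        return ["kaggle", "datasets", "version", "-p", folder, "-m", args.update]
    cmd = ["kaggle", "datasets", "create", "-p", folder]
    if args.public:
        cmd.append("--public")  # without this flag the dataset is created private
    return cmd
```

The returned list could then be passed to `subprocess.run(cmd, check=True)` once packaging and linting have passed.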

scripts/publish_hf.py

Same three-stage pattern for HuggingFace:

  1. --dry-run — re-packages + lints + runs load_dataset() smoke test (G12.3 / G12.4)
  2. Default — uploads as private via huggingface_hub.HfApi.upload_folder
  3. --go-public — flips visibility via HfApi.update_repo_visibility

--variant=public|instructor handles both the public (3-tier) and instructor companion repos.
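The private-first upload and the later visibility flip might look roughly like the sketch below. The helper names `publish_folder` and `go_public` are hypothetical, and the `api` argument is assumed to be a `huggingface_hub.HfApi` instance (or any object with the same three methods); the real script may structure this differently.

```python
from __future__ import annotations


def publish_folder(api, repo_id: str, folder: str, private: bool = True) -> None:
    """Create the dataset repo if it does not exist, then upload the packaged folder.

    `api` is assumed to behave like huggingface_hub.HfApi.
    """
    api.create_repo(repo_id=repo_id, repo_type="dataset",
                    private=private, exist_ok=True)
    api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="dataset")


def go_public(api, repo_id: str) -> None:
    """Flip an already-uploaded dataset repo from private to public."""
    api.update_repo_visibility(repo_id=repo_id, private=False, repo_type="dataset")
```

With credentials configured, `api` would be `huggingface_hub.HfApi()`. Passing the client in as a parameter keeps the helpers testable with a stub and keeps the dry-run path free of network calls.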

Dry-run results

python scripts/publish_kaggle.py --dry-run
# [ 1/3 ] Packaging release/kaggle/ …  OK
# [ 2/3 ] Linting platform metadata … OK
# Dry-run complete — all pre-flight checks passed.

python scripts/publish_hf.py --dry-run
# [ 1/4 ] Packaging release/huggingface … OK
# [ 2/4 ] Linting platform metadata … OK
# [ 3/4 ] Running load_dataset() smoke tests …
#   OK   config='intro': 3 splits, 5,000 rows total      ← G12.3 ✅
#   OK   config='intermediate': 3 splits, 5,000 rows total
#   OK   config='advanced': 3 splits, 5,000 rows total
# Dry-run complete — all pre-flight checks passed.

python scripts/publish_hf.py --dry-run --variant=instructor
# [ 3/4 ] Running load_dataset() smoke tests …
#   OK   config='intermediate': 3 splits, 5,000 rows total  ← G12.4 ✅
# Dry-run complete — all pre-flight checks passed.
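The per-config lines in the smoke-test output could be produced by a helper along these lines. This is a sketch under assumptions: `summarize_config` is a hypothetical name, and `loader` is assumed to be `datasets.load_dataset` called on the local packaged folder, returning a mapping of split name to dataset when no split is requested.

```python
def summarize_config(loader, path: str, config: str) -> str:
    """Load one config and report how many splits and rows it contains.

    `loader` is assumed to behave like datasets.load_dataset(path, name=config),
    returning a dict-like object that maps split names to sized datasets.
    """
    splits = loader(path, name=config)
    total_rows = sum(len(split) for split in splits.values())
    return f"OK   config={config!r}: {len(splits)} splits, {total_rows:,} rows total"
```

Any exception raised by the loader would propagate to the caller, which is what a pre-flight check wants: a config that fails to load should abort the run before any upload is attempted.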

docs/release/v1_release_notes.md

Full release runbook covering:

  • 7-step pre-publish checklist (rebuild → validate → dry-runs → ShmuggingFace preview → platform previews)
  • Publish steps for all three repos (Kaggle public, HF public, HF instructor)
  • Tag + announce instructions
  • Change log vs alpha bundles (relational leakage, snapshot leaks, noise clamp, platform hardening)

ShmuggingFace preview

Rebuilt with build_shmuggingface_site.py after PR 8.4 hardening — 48 files generated.

What's left (manual, requires credentials)

  • Upload to Kaggle and Hugging Face (see runbook in docs/release/v1_release_notes.md)
  • Tag leadforge-lead-scoring-v1
  • Announce

🤖 Generated with Claude Code

Adds the two publish scripts and the v1 release runbook.

scripts/publish_kaggle.py
- Three-stage runbook: --dry-run (package+lint) → private upload → --public
- Wraps package_kaggle_release.run_packager + lint_platform_metadata.run_lint
  as pre-flight; both must pass before any upload attempt
- Calls kaggle datasets create (new dataset) or kaggle datasets version
  (--update MESSAGE) for future version bumps
- Dry-run confirmed passing: package OK, lint OK

scripts/publish_hf.py
- Same three-stage pattern for HuggingFace + instructor companion
- Adds load_dataset() smoke test (G12.3 / G12.4) as step 3/4:
  all 3 public configs load (3 splits, 5,000 rows total each);
  instructor config loads (intermediate)
- Uploads via huggingface_hub.HfApi.upload_folder (private by default)
- --go-public flips visibility via HfApi.update_repo_visibility
- --variant=public|instructor selects the target repo
- Dry-run confirmed passing for both variants

docs/release/v1_release_notes.md
- Full pre-publish runbook (7 steps)
- Publish steps (private -> review -> public) for all three repos
- Tag + announce instructions
- Change log vs alpha bundles

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 27, 2026 11:31
@shaypal5 added the labels "type: feature" (New capability) and "type: docs" (Documentation or narrative changes) on May 27, 2026

Copilot AI left a comment


Pull request overview

Adds the final two release publish scripts (Kaggle and Hugging Face) plus the v1 release runbook. Each script wraps existing packaging/lint helpers, supports a credential-free --dry-run mode, and (for HF) runs a local load_dataset() smoke test before delegating to the platform CLI/SDK for the actual upload.

Changes:

  • New scripts/publish_kaggle.py with package + lint + kaggle datasets create/version flow.
  • New scripts/publish_hf.py with package + lint + load_dataset smoke + HfApi.upload_folder, plus a --go-public visibility flip.
  • New docs/release/v1_release_notes.md runbook and .agent-plan.md checkbox update.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| scripts/publish_kaggle.py | Three-stage Kaggle publish wrapper around package + lint + Kaggle CLI. |
| scripts/publish_hf.py | Three-stage HF publish wrapper with load_dataset smoke test and HfApi upload/visibility flip. |
| docs/release/v1_release_notes.md | v1 release runbook: pre-publish checklist, publish steps, tag/announce, change log. |
| .agent-plan.md | Marks PR 7.3 done and breaks out remaining manual upload/tag/announce items. |


Comment thread scripts/publish_kaggle.py Outdated
Comment thread scripts/publish_hf.py
Comment thread scripts/publish_kaggle.py Outdated

COPILOT-1 (publish_kaggle.py docstring):
Remove the bogus '--go-public' CLI flag reference from step 3 of the
module docstring. There is no such flag; the public-flip is a manual
step (Kaggle web UI or 'kaggle datasets metadata' + 'update'). Rewrite
step 3 to document the actual flow that main() already prints after a
successful private upload.

COPILOT-2 (publish_hf.py --private flag):
Declaring the flag with action="store_true" and default=True made --private
permanently True and un-overridable. Switch to argparse.BooleanOptionalAction
(Python 3.9+), which provides '--private' (explicit True) and '--no-private'
(False) while keeping default=True. Users can now pass '--no-private' to
upload publicly in a single step without the separate --go-public call.
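The difference between the two declarations can be reproduced with the standard library alone. The parser names `buggy` and `fixed` below are illustrative and do not come from the script.

```python
import argparse

# Original declaration: store_true can only ever set the value to True,
# and the default is already True, so the flag has no effect.
buggy = argparse.ArgumentParser()
buggy.add_argument("--private", action="store_true", default=True)

# Fixed declaration (Python 3.9+): argparse generates a matching
# --no-private option that sets the value to False.
fixed = argparse.ArgumentParser()
fixed.add_argument("--private", action=argparse.BooleanOptionalAction, default=True)

print(buggy.parse_args([]).private)                # True
print(buggy.parse_args(["--private"]).private)     # True (cannot be turned off)
print(fixed.parse_args([]).private)                # True (default preserved)
print(fixed.parse_args(["--no-private"]).private)  # False
```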

COPILOT-3 (publish_kaggle.py visibility reporting):
'visibility = "public" if (args.public or args.update)' falsely reported
"public" for '--update' runs. 'kaggle datasets version' pushes to whatever
the repo's current visibility is; a version bump on a private dataset is
still private. Split into three distinct messages: 'new version pushed'
(--update), 'public' (--public), 'private' (default create).
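The three-way split might be expressed as a small helper like the one below. `upload_message` is a hypothetical name and the exact message wording is assumed; only the three cases themselves come from the description above.

```python
from typing import Optional


def upload_message(public: bool, update: Optional[str]) -> str:
    """Pick the success message without claiming a visibility we cannot know."""
    if update:
        # A version bump keeps whatever visibility the dataset already has.
        return "Upload succeeded (new version pushed)."
    if public:
        return "Upload succeeded (public)."
    return "Upload succeeded (private)."
```

Checking `update` first matters: a run with `--update` never reports a visibility, even if `--public` is also passed.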

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions


pr-agent-context report:

This run includes unresolved review comments on PR #87 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: scripts/publish_kaggle.py
URL: https://github.com/leadforge-dev/leadforge/pull/87#discussion_r3310547823
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    The module docstring describes step 3 as `python scripts/publish_kaggle.py --go-public`, but the argparse definition below only exposes `--public` (single-step public create) and there is no `--go-public` flag. The prose right after ("Calls `kaggle datasets metadata --unshare` … actually this is done via the Kaggle web UI or API") also contradicts itself. Please reconcile the docstring with the actual CLI: either remove the `--go-public` section or implement it, and rewrite the step 3 description to match the manual web-UI / `kaggle datasets metadata` + `update` flow that the `main()` epilogue already documents.

## COPILOT-2
Location: scripts/publish_hf.py:284
URL: https://github.com/leadforge-dev/leadforge/pull/87#discussion_r3310547896
Root author: copilot-pull-request-reviewer

Comment:
    `--private` is declared as `action="store_true"` with `default=True`, which means the flag can never be turned off — `args.private` is always `True` regardless of whether the user passes `--private` or not. The help text "(default; review before going public)" and the docstring `--private  Force private upload even when the repo already exists` suggest the intent is a togglable private/public switch, but as written there is no way for a user to upload directly as public on first create (the `--go-public` flow only runs after a separate upload). Consider using `argparse.BooleanOptionalAction` (giving `--private` / `--no-private`) or adding a separate `--public` flag, so the default-private behavior can actually be overridden.

## COPILOT-3
Location: scripts/publish_kaggle.py
URL: https://github.com/leadforge-dev/leadforge/pull/87#discussion_r3310547914
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    The reported visibility treats `args.update` (a new dataset version) as equivalent to "public", but `kaggle datasets version` simply pushes a new version to whatever the repo's current visibility is — it could still be private if the user is iterating before going public. Printing "Upload succeeded (public)." after a version bump can be misleading. Consider only reporting "public" when `args.public` is set, and using a neutral message (e.g. "new version pushed") for the `--update` path.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 26564455345 attempt 1
Comment timestamp: 2026-05-28T08:44:45.007824+00:00
PR head commit: 50263fe00a44d8a28dc2f808c7f1d650d9e19b7b

@shaypal5 shaypal5 merged commit ea8b6b0 into main May 28, 2026
10 checks passed
@shaypal5 shaypal5 deleted the feat/pr7.3-publish-scripts branch May 28, 2026 08:50