feat(scripts): publish_kaggle + publish_hf + v1 release notes (PR 7.3)#87
Merged
Adds the two publish scripts and the v1 release runbook.

scripts/publish_kaggle.py
- Three-stage runbook: --dry-run (package+lint) → private upload → --public
- Wraps package_kaggle_release.run_packager + lint_platform_metadata.run_lint as pre-flight; both must pass before any upload attempt
- Calls kaggle datasets create (new dataset) or kaggle datasets version (--update MESSAGE) for future version bumps
- Dry-run confirmed passing: package OK, lint OK

scripts/publish_hf.py
- Same three-stage pattern for HuggingFace + instructor companion
- Adds load_dataset() smoke test (G12.3 / G12.4) as step 3/4: all 3 public configs load (5 000 rows x 3 splits each); instructor config loads (intermediate)
- Uploads via huggingface_hub.HfApi.upload_folder (private by default)
- --go-public flips visibility via HfApi.update_repo_visibility
- --variant=public|instructor selects the target repo
- Dry-run confirmed passing for both variants

docs/release/v1_release_notes.md
- Full pre-publish runbook (7 steps)
- Publish steps (private -> review -> public) for all three repos
- Tag + announce instructions
- Change log vs alpha bundles

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
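The "both must pass before any upload attempt" gate described in this commit message could be sketched roughly as below. This is not the code from the PR: `run_preflight` and `publish` are hypothetical names, and the checks are plain callables standing in for `run_packager` and `run_lint`, whose real signatures are not shown here.

```python
from __future__ import annotations

from typing import Callable, Sequence

# Each check is a (name, callable) pair; the callable returns True on success.
# In the real scripts these would wrap the packager and the linter (assumption).
Check = tuple[str, Callable[[], bool]]


def run_preflight(checks: Sequence[Check]) -> list[str]:
    """Run every check in order and return the names of those that failed."""
    failed = []
    for name, check in checks:
        ok = check()
        print(f"[preflight] {name}: {'OK' if ok else 'FAILED'}")
        if not ok:
            failed.append(name)
    return failed


def publish(checks: Sequence[Check], upload: Callable[[], None], dry_run: bool) -> int:
    """Return a process exit code; upload only when every check passed."""
    if run_preflight(checks):
        return 1  # at least one check failed, so the upload is never attempted
    if dry_run:
        return 0  # a dry run stops after the pre-flight checks
    upload()
    return 0
```

Running all checks before deciding (rather than stopping at the first failure) is a design choice in this sketch; it reports every problem in one pass.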
Pull request overview
Adds the final two release publish scripts (Kaggle and Hugging Face) plus the v1 release runbook. Each script wraps existing packaging/lint helpers, supports a credential-free --dry-run mode, and (for HF) runs a local load_dataset() smoke test before delegating to the platform CLI/SDK for the actual upload.
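As an illustration of the smoke test mentioned above, the sketch below loads each config and checks the row count of every split. It is a guess at the shape of such a check, not the PR's implementation: `smoke_test` is a hypothetical name, the loader is injected so the example runs without the `datasets` library, and the 5 000-row expectation is taken from the commit message.

```python
from __future__ import annotations

from typing import Callable, Iterable, Mapping, Sized

EXPECTED_ROWS = 5_000  # rows per split, as stated in the commit message


def smoke_test(
    loader: Callable[[str], Mapping[str, Sized]],
    configs: Iterable[str],
    expected_rows: int = EXPECTED_ROWS,
) -> list[str]:
    """Load every config and return a list of human-readable problems."""
    problems = []
    for config in configs:
        try:
            splits = loader(config)  # mapping of split name -> rows
        except Exception as exc:  # any load failure counts as a problem
            problems.append(f"{config}: failed to load ({exc})")
            continue
        for split, rows in splits.items():
            if len(rows) != expected_rows:
                problems.append(
                    f"{config}/{split}: {len(rows)} rows, expected {expected_rows}"
                )
    return problems
```

With the `datasets` library, a loader along the lines of `lambda name: load_dataset(path, name)` should fit, since `load_dataset` returns a mapping of split name to dataset when no split is requested. The path and the config names are placeholders that the source does not spell out.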
Changes:
- New `scripts/publish_kaggle.py` with package + lint + `kaggle datasets create/version` flow.
- New `scripts/publish_hf.py` with package + lint + `load_dataset` smoke + `HfApi.upload_folder`, plus a `--go-public` visibility flip.
- New `docs/release/v1_release_notes.md` runbook and `.agent-plan.md` checkbox update.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| scripts/publish_kaggle.py | Three-stage Kaggle publish wrapper around package + lint + Kaggle CLI. |
| scripts/publish_hf.py | Three-stage HF publish wrapper with load_dataset smoke test and HfApi upload/visibility flip. |
| docs/release/v1_release_notes.md | v1 release runbook: pre-publish checklist, publish steps, tag/announce, change log. |
| .agent-plan.md | Marks PR 7.3 done and breaks out remaining manual upload/tag/announce items. |
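For orientation, the Kaggle CLI calls the wrapper delegates to can be assembled as follows. This is a minimal sketch, not the script's actual code: `kaggle_command` is a hypothetical helper, and it assumes the standard CLI flags `-p` (folder), `-m` (version notes) and `--public` (create publicly; private is the CLI default).

```python
from __future__ import annotations

from pathlib import Path


def kaggle_command(
    folder: Path,
    *,
    public: bool = False,
    update_message: str | None = None,
) -> list[str]:
    """Build the argv for a first upload or a version bump."""
    if update_message is not None:
        # A version bump keeps whatever visibility the dataset already has.
        return ["kaggle", "datasets", "version", "-p", str(folder), "-m", update_message]
    cmd = ["kaggle", "datasets", "create", "-p", str(folder)]
    if public:
        cmd.append("--public")  # without this flag the dataset is created private
    return cmd


print(kaggle_command(Path("release/kaggle")))
print(kaggle_command(Path("release/kaggle"), update_message="v1.0.1 metadata fix"))
```

Returning a list keeps the command easy to log in a dry run and to pass unchanged to `subprocess.run` for a real upload.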
COPILOT-1 (publish_kaggle.py docstring): Remove the bogus '--go-public' CLI flag reference from step 3 of the module docstring. There is no such flag; the public-flip is a manual step (Kaggle web UI or 'kaggle datasets metadata' + 'update'). Rewrite step 3 to document the actual flow that main() already prints after a successful private upload.

COPILOT-2 (publish_hf.py --private flag): 'action=store_true' with 'default=True' made --private permanently True and un-overridable. Switch to BooleanOptionalAction (Python 3.9+), giving '--private' (explicit True) and '--no-private' (False), both with default=True. Users can now pass '--no-private' to upload publicly in a single step without the separate --go-public call.

COPILOT-3 (publish_kaggle.py visibility reporting): 'visibility = "public" if (args.public or args.update)' falsely reported "public" for '--update' runs. 'kaggle datasets version' pushes to whatever the repo's current visibility is; a version bump on a private dataset is still private. Split into three distinct messages: 'new version pushed' (--update), 'public' (--public), 'private' (default create).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pr-agent-context report: This run includes unresolved review comments on PR #87 in repository https://github.com/leadforge-dev/leadforge
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: scripts/publish_kaggle.py
URL: https://github.com/leadforge-dev/leadforge/pull/87#discussion_r3310547823
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The module docstring describes step 3 as `python scripts/publish_kaggle.py --go-public`, but the argparse definition below only exposes `--public` (single-step public create) and there is no `--go-public` flag. The prose right after ("Calls `kaggle datasets metadata --unshare` … actually this is done via the Kaggle web UI or API") also contradicts itself. Please reconcile the docstring with the actual CLI: either remove the `--go-public` section or implement it, and rewrite the step 3 description to match the manual web-UI / `kaggle datasets metadata` + `update` flow that the `main()` epilogue already documents.
## COPILOT-2
Location: scripts/publish_hf.py:284
URL: https://github.com/leadforge-dev/leadforge/pull/87#discussion_r3310547896
Root author: copilot-pull-request-reviewer
Comment:
`--private` is declared as `action="store_true"` with `default=True`, which means the flag can never be turned off — `args.private` is always `True` regardless of whether the user passes `--private` or not. The help text "(default; review before going public)" and the docstring `--private Force private upload even when the repo already exists` suggest the intent is a togglable private/public switch, but as written there is no way for a user to upload directly as public on first create (the `--go-public` flow only runs after a separate upload). Consider using `argparse.BooleanOptionalAction` (giving `--private` / `--no-private`) or adding a separate `--public` flag, so the default-private behavior can actually be overridden.
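The behaviour described in this comment is easy to reproduce in isolation. The sketch below compares the two declarations side by side; `build_parser` is a hypothetical helper and the parser carries none of the script's other options. `argparse.BooleanOptionalAction` requires Python 3.9 or later.

```python
import argparse


def build_parser(fixed: bool) -> argparse.ArgumentParser:
    """Return a parser with either the original or the suggested --private flag."""
    parser = argparse.ArgumentParser(prog="publish_hf")
    if fixed:
        # Generates both --private and --no-private; the default stays True.
        parser.add_argument(
            "--private", action=argparse.BooleanOptionalAction, default=True
        )
    else:
        # store_true can only ever set True, so default=True makes it a no-op.
        parser.add_argument("--private", action="store_true", default=True)
    return parser


broken = build_parser(fixed=False)
print(broken.parse_args([]).private)             # True
print(broken.parse_args(["--private"]).private)  # True, and no way to get False

fixed = build_parser(fixed=True)
print(fixed.parse_args([]).private)                # True (default preserved)
print(fixed.parse_args(["--no-private"]).private)  # False
```

The default stays private in both variants; only the second lets a caller opt out explicitly.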
## COPILOT-3
Location: scripts/publish_kaggle.py
URL: https://github.com/leadforge-dev/leadforge/pull/87#discussion_r3310547914
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The reported visibility treats `args.update` (a new dataset version) as equivalent to "public", but `kaggle datasets version` simply pushes a new version to whatever the repo's current visibility is; it could still be private if the user is iterating before going public. Printing "Upload succeeded (public)." after a version bump can be misleading. Consider only reporting "public" when `args.public` is set, and using a neutral message (e.g. "new version pushed") for the `--update` path.
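One way to express the three-way split suggested here is sketched below. `upload_summary` is a hypothetical name, and the message wording is illustrative rather than the exact strings used in the script.

```python
from __future__ import annotations


def upload_summary(public: bool, update: str | None) -> str:
    """Pick the success message without guessing visibility on version bumps."""
    if update:
        # `kaggle datasets version` does not change visibility, so stay neutral.
        return "New version pushed (visibility unchanged)."
    if public:
        return "Upload succeeded (public)."
    return "Upload succeeded (private)."


print(upload_summary(public=False, update="v1.0.1"))
print(upload_summary(public=True, update=None))
print(upload_summary(public=False, update=None))
```

Checking `update` first means a version bump never claims a visibility, even if `--public` was also passed.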
Summary
Adds the final two publish scripts and the v1 release runbook. All dry-runs pass locally.
Scripts
`scripts/publish_kaggle.py`
Three-stage runbook:
- `--dry-run`: re-packages `release/kaggle/` + lints metadata, no upload
- private upload via `kaggle datasets create`
- `--public`: single-step public upload; or flip via Kaggle web UI after review

Also supports `--update MESSAGE` for future version bumps (`kaggle datasets version`).

`scripts/publish_hf.py`
Same three-stage pattern for HuggingFace:
- `--dry-run`: re-packages + lints + runs `load_dataset()` smoke test (G12.3 / G12.4)
- private upload via `huggingface_hub.HfApi.upload_folder`
- `--go-public`: flips visibility via `HfApi.update_repo_visibility`

`--variant=public|instructor` handles both the public (3-tier) and instructor companion repos.

Dry-run results

`docs/release/v1_release_notes.md`
Full release runbook covering:
ShmuggingFace preview
Rebuilt with `build_shmuggingface_site.py` after PR 8.4 hardening: 48 files generated.

What's left (manual, requires credentials)
- `docs/release/v1_release_notes.md`
- `leadforge-lead-scoring-v1`

🤖 Generated with Claude Code