
feat(remote): add Cloudflare archive mode #76

Merged
vincentkoc merged 10 commits into main from feature/cloudflare-remote-archives
May 27, 2026


Member

@vincentkoc vincentkoc commented May 27, 2026

Summary

  • add Cloudflare remote archive config/read mode and discrawl cloud publish
  • add GitHub-backed remote login with OAuth and token-env bootstrap
  • keep existing Git share publish/subscribe/update flows unchanged
  • clarify that the Worker is deployed separately in openclaw/crawl-remote; discrawl only stores endpoint/archive config and calls it

Validation

  • GOWORK=off go test -count=1 ./internal/cli ./internal/config -run 'TestRemoteLoginStoresKeyringToken|TestRemoteLoginWithGitHubTokenEnvStoresKeyringToken|TestCloudStatusJSONUsesRemoteWithoutLocalDB|TestCloudSearchAndMessagesUseRemoteWithoutLocalDB|TestConfig'
  • GOWORK=off go test -count=1 ./...
  • live deployed Worker proof with temp config and no local DB

Release

  • consumes github.com/openclaw/crawlkit v0.8.0


socket-security Bot commented May 27, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
|---|---|---|---|---|---|---|
| Updated | github.com/openclaw/crawlkit@v0.7.0 ⏵ v0.8.0 | 95 | 100 | 100 | 100 | 100 |



clawsweeper Bot commented May 27, 2026

Codex review: needs changes before merge. Reviewed May 27, 2026, 4:07 PM ET / 20:07 UTC.

Summary
Adds Cloudflare remote archive config/read/publish commands, GitHub-backed remote login, docs/tests, a crawlkit v0.8.0 bump, and a repo-local autoreview skill bundle.

Reproducibility: yes, for the review findings via source inspection: remote login stores keyring auth in cfg.Remote.Auth, while cloud publish rebuilds a config without that auth, and the login opener still launches start.URL without allowlisting. I did not run tests because this was a read-only review.

Review metrics: 3 noteworthy metrics.

  • Diff size: 31 files, +3,558/-35. The PR spans CLI dispatch, config, docs, tests, dependencies, release notes, and automation, so review has several independent surfaces.
  • Automation bundle: 3 added files, +1,542 lines. The autoreview skill dominates the diff and changes repo-local automation outside the Cloudflare archive feature.
  • Current checks: 12 successful, 0 failing listed check-runs. The earlier lint/Gosec failure appears cleared, leaving the remaining blockers as code review and merge-risk issues rather than red CI.

Merge readiness
Overall: 🦐 gold shrimp
Proof: 🌊 off-meta tidepool
Patch quality: 🦐 gold shrimp
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Reuse the shared remote token resolution path for cloud publish and add keyring-authenticated publish coverage.
  • Validate browser login URLs before OS opener launch.
  • Remove or split .agents/skills/autoreview/** and restore CHANGELOG.md to the base version.

Risk before merge

  • Users who authenticate with discrawl remote login can still have discrawl cloud publish fail unless they also export a token, because publish discards the keyring auth config.
  • The PR adds a repo-local autoreview skill and executable subprocess helper unrelated to Cloudflare archive mode, so maintainers would be adopting automation surface in a feature PR.
  • The browser login path opens a service-returned URL through OS handlers; CI now passes, but the code still lacks an explicit http/https/login URL allowlist.

Maintainer options:

  1. Fix auth and scope before merge (recommended)
    Reuse the shared remote token resolution path for cloud publish, validate browser login URLs, remove the autoreview bundle from this feature PR, and restore release-owned changelog edits before maintainer re-review.
  2. Explicitly adopt the automation bundle
    Maintainers can keep .agents/skills/autoreview/**, but it should be reviewed as automation/security-owned surface and the publish-auth and login-URL issues still need fixes.
  3. Split the branch
    If remote archive API ownership or repo-local automation ownership is not settled, split the work into a focused remote archive PR and a separate automation PR.
@clawsweeper automerge

Special instructions:
Update `discrawl cloud publish` to preserve normalized remote config/auth or call `config.ResolveRemoteToken`, add a regression test that a token stored by `remote login` authorizes `cloud publish` without `--token-env`, validate `remote login` browser URLs before `exec.CommandContext`, remove `.agents/skills/autoreview/**` from this feature PR, and restore `CHANGELOG.md` to the base version while keeping release context in the PR body.
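
The regression test requested above could take roughly the following shape. This is a minimal sketch, not discrawl's test code: `fakeWorker`, `publishStatus`, the `/publish` path, and the bearer-token header are all hypothetical stand-ins, since the real Worker API is not shown in this PR conversation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// fakeWorker is a hypothetical stand-in for the remote Worker. It accepts a
// publish request only when the expected bearer token is presented.
func fakeWorker(wantToken string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+wantToken {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	})
}

// publishStatus sends one in-memory publish request to h and returns the
// status code. An empty token models publish running with no token-env set
// and with the stored login token dropped.
func publishStatus(h http.Handler, token string) int {
	req := httptest.NewRequest(http.MethodPost, "/publish", strings.NewReader("{}"))
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	worker := fakeWorker("stored-session-token")

	// Stored login token dropped: the Worker rejects the publish.
	fmt.Println("without stored token:", publishStatus(worker, "")) // 401

	// Stored login token forwarded: the publish is authorized.
	fmt.Println("with stored token:", publishStatus(worker, "stored-session-token")) // 204
}
```

A real regression test would drive `discrawl cloud publish` itself against such a fake endpoint, with the token written by `remote login` and no `--token-env` value, and assert the authorized path.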

Next step before merge
A focused repair can address the concrete publish-auth, opener-validation, automation-scope, and changelog blockers before maintainer review resumes.

Security
Needs attention: the diff still opens service-returned login URLs without allowlisting and adds unrelated command-executing repo automation.

Review findings

  • [P1] Preserve login auth for cloud publish — internal/cli/cloud_commands.go:59-65
  • [P2] Validate browser login URLs before opening — internal/cli/remote_commands.go:175-177
  • [P2] Remove the unrelated autoreview bundle — .agents/skills/autoreview/SKILL.md:1-3
Review details

Best possible solution:

Land a narrow Cloudflare remote archive PR that reuses shared remote token resolution, validates the login opener, and removes or splits repo-local automation and release-owned changelog changes.

Do we have a high-confidence way to reproduce the issue?

Yes for the review findings via source inspection: remote login stores keyring auth in cfg.Remote.Auth, while cloud publish rebuilds a config without that auth, and the login opener still launches start.URL without allowlisting. I did not run tests because this was a read-only review.

Is this the best way to solve the issue?

No. The remote archive direction may be valid, but this branch should first share the remote token path, validate the opener URL, and split automation/release-owned files before merge.

Full review comments:

  • [P1] Preserve login auth for cloud publish — internal/cli/cloud_commands.go:59-65
    remote login stores the session token in cfg.Remote.Auth/keyring, but cloud publish constructs a fresh remote config from only endpoint, archive, and token-env. Since crawlkit defaults that client to env-token lookup, users who authenticated with remote login can hit unauthenticated publish unless they also export a token.
    Confidence: 0.92
  • [P2] Validate browser login URLs before opening — internal/cli/remote_commands.go:175-177
    The OAuth start response is passed directly to OS URL handlers. The #nosec comment covers shell injection, but the opener should still allowlist expected http/https login URLs before launching a service-provided string.
    Confidence: 0.78
  • [P2] Remove the unrelated autoreview bundle — .agents/skills/autoreview/SKILL.md:1-3
    This Cloudflare archive PR also adds a repo-local autoreview skill and executable helper that launches review engines through subprocesses. That changes repository automation surface and should be split into its own reviewed automation PR or removed here.
    Confidence: 0.86
  • [P3] Restore release-owned changelog edits — CHANGELOG.md:7-16
    This PR edits CHANGELOG.md, but OpenClaw release notes are release-owned in this workflow. Keep the release context in the PR body or commits and let the release process update the changelog.
    Confidence: 0.8
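
To illustrate the P1 finding, the sketch below contrasts an env-only token lookup with a shared resolution path that also consults the token stored at login. Every name here (`tokenProvider`, `envProvider`, `storedProvider`, `chain`, `EXAMPLE_REMOTE_TOKEN`) is hypothetical, the map merely stands in for the OS keyring, and the env-before-keyring order is an assumption; the actual behavior of `config.ResolveRemoteToken` is not shown in this review.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoToken signals that a provider has nothing to offer, so the next
// provider in the chain may be tried.
var errNoToken = errors.New("no remote token available")

// tokenProvider is a hypothetical interface, not the crawlkit API.
type tokenProvider interface {
	Token() (string, error)
}

// envProvider reads the token from a named environment variable.
type envProvider struct{ name string }

func (p envProvider) Token() (string, error) {
	if p.name != "" {
		if v := os.Getenv(p.name); v != "" {
			return v, nil
		}
	}
	return "", errNoToken
}

// storedProvider models a token saved by a login command. The map is a
// stand-in for the OS keyring.
type storedProvider struct {
	store map[string]string
	key   string
}

func (p storedProvider) Token() (string, error) {
	if v := p.store[p.key]; v != "" {
		return v, nil
	}
	return "", errNoToken
}

// chain tries each provider in order and returns the first token found.
// Errors other than errNoToken abort the lookup.
type chain []tokenProvider

func (c chain) Token() (string, error) {
	for _, p := range c {
		tok, err := p.Token()
		if err == nil {
			return tok, nil
		}
		if !errors.Is(err, errNoToken) {
			return "", err
		}
	}
	return "", errNoToken
}

func main() {
	keyring := map[string]string{"remote-session": "token-from-login"}
	envOnly := envProvider{name: "EXAMPLE_REMOTE_TOKEN"}
	shared := chain{envOnly, storedProvider{store: keyring, key: "remote-session"}}

	// With the variable unset, the env-only path finds nothing.
	_, err := envOnly.Token()
	fmt.Println("env-only:", err)

	// The shared path falls back to the token stored at login.
	tok, err := shared.Token()
	fmt.Println("shared:", tok, err)
}
```

The point of the sketch is only that every command sharing one resolution path sees the same token sources, so a login token cannot be silently skipped by one command.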

Overall correctness: patch is incorrect
Overall confidence: 0.88

AGENTS.md: not found in the target repository.

Codex review notes: model gpt-5.5, reasoning high; reviewed against f59809c18e73.

Label changes

  • add merge-risk: 🚨 security-boundary: Merging as-is keeps a service-returned login URL flowing to OS URL handlers without an explicit allowlist.

Label justifications:

  • P2: This is a normal-priority feature PR with limited blast radius but blocking auth, security-boundary, and automation-scope review issues.
  • merge-risk: 🚨 auth-provider: Merging as-is can make the documented remote login/keyring auth path fail for discrawl cloud publish.
  • merge-risk: 🚨 automation: Merging as-is adds a new repo-local agent review helper and executable scripts unrelated to the remote archive feature.
  • merge-risk: 🚨 security-boundary: Merging as-is keeps a service-returned login URL flowing to OS URL handlers without an explicit allowlist.
  • rating: 🦐 gold shrimp: Overall readiness is 🦐 gold shrimp; proof is 🌊 off-meta tidepool and patch quality is 🦐 gold shrimp.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Not applicable: The external-contributor proof gate does not apply to this MEMBER-authored PR; the body states live deployed Worker proof with a temp config and no local DB.
Evidence reviewed

Security concerns:

  • [medium] Unvalidated browser launch URL — internal/cli/remote_commands.go:176
    The remote login flow passes a Worker-returned URL directly to open, rundll32, or xdg-open; allowlisting expected web login URLs would reduce local handler abuse risk beyond the current Gosec suppression.
    Confidence: 0.78
  • [medium] Unrelated command-executing agent helper — .agents/skills/autoreview/scripts/autoreview:83
    The added .agents/skills/autoreview helper invokes subprocess-based review engines and can launch shell-based parallel test commands, which changes future agent/automation behavior outside the PR's stated Cloudflare archive scope.
    Confidence: 0.84
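
One possible shape for the opener allowlist is sketched below, assuming the expected login hosts are known up front. `validateLoginURL`, `openerArgv`, and the `github.com` allowlist entry are hypothetical and are not taken from the discrawl source; the sketch only builds the opener argument vector and never launches it.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// validateLoginURL accepts only absolute http/https URLs whose host is on the
// allowlist and that carry no embedded credentials. It returns the normalized
// URL string that is safe to hand to an OS opener.
func validateLoginURL(raw string, allowedHosts []string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse login URL: %w", err)
	}
	if u.Scheme != "https" && u.Scheme != "http" {
		return "", fmt.Errorf("unsupported login URL scheme %q", u.Scheme)
	}
	if u.User != nil {
		return "", errors.New("login URL must not embed credentials")
	}
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return "", errors.New("login URL has no host")
	}
	for _, allowed := range allowedHosts {
		if host == strings.ToLower(allowed) {
			return u.String(), nil
		}
	}
	return "", fmt.Errorf("login URL host %q is not allowlisted", host)
}

// openerArgv returns the argument vector for the platform URL handler named
// by goos. The URL is always a separate argument, so no shell is involved.
func openerArgv(goos, target string) []string {
	switch goos {
	case "darwin":
		return []string{"open", target}
	case "windows":
		return []string{"rundll32", "url.dll,FileProtocolHandler", target}
	default:
		return []string{"xdg-open", target}
	}
}

func main() {
	hosts := []string{"github.com"} // assumed allowlist for illustration
	candidates := []string{
		"https://github.com/login/oauth/authorize?client_id=example",
		"file:///etc/passwd",
		"https://evil.example/login",
	}
	for _, raw := range candidates {
		target, err := validateLoginURL(raw, hosts)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("would run:", openerArgv("linux", target))
	}
}
```

In practice the allowlist would likely include the configured Worker endpoint host as well, and validation would run immediately before the `exec.CommandContext` call so that only a checked URL reaches the OS handler.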

Acceptance criteria:

  • GOWORK=off go test -count=1 ./internal/cli ./internal/config -run 'TestRemoteLoginStoresKeyringToken|TestRemoteLoginWithGitHubTokenEnvStoresKeyringToken|TestCloudPublishSendsNonDMRows|TestCloudStatusJSONUsesRemoteWithoutLocalDB|TestCloudSearchAndMessagesUseRemoteWithoutLocalDB|TestConfig'
  • GOWORK=off go test -count=1 ./...
  • GOWORK=off gosec -exclude=G101,G115,G202,G301,G304 ./...

What I checked:

  • Target AGENTS policy check: No AGENTS.md exists inside the discrawl checkout; the only AGENTS.md found was adjacent ClawSweeper policy outside the target repo and was not applied as discrawl policy.
  • Current main lacks the requested remote mode: Current main has Git share remote handling but no cloud, remote, whoami, subscribe-cloud, [remote] config, or crawlremote client code, so this PR is not obsolete on main. (internal/cli/cli.go:217, f59809c18e73)
  • PR surface: GitHub reports 31 changed files with +3,558/-35, including 3 added .agents/skills/autoreview/** files totaling +1,542 lines. (daa9316ee1af)
  • Cloud publish drops stored auth: runCloudPublish builds a fresh crawlremote config from endpoint/archive/token-env only, so it does not preserve cfg.Remote.Auth or the keyring token written by remote login. (internal/cli/cloud_commands.go:59, daa9316ee1af)
  • crawlkit v0.8 default token behavior: crawlremote.NewClientFromConfig defaults to EnvTokenProvider{Name: cfg.TokenEnv} when no explicit TokenProvider is supplied, confirming that dropped auth falls back to env-only publish auth. (github.com/openclaw/crawlkit/remote/remote.go:132)
  • Remote login stores keyring auth: finishRemoteLogin stores the returned token through config.StoreRemoteToken and writes cfg.Remote.Auth, while remoteClient later resolves tokens through config.ResolveRemoteToken. (internal/cli/remote_commands.go:196, daa9316ee1af)

Likely related people:

  • Peter Steinberger: Current-main blame and history for CLI dispatch, config normalization, keyring fallback, Git share publish/subscribe, and release integration point to Peter as the central owner for the affected baseline paths. (role: core CLI/config contributor; confidence: high; commits: 118dea0a308d, 1808bef68f35, 624b7718947c; files: internal/cli/cli.go, internal/config/config.go, internal/config/discord_token.go)
  • Vincent Koc: Current-main history shows Vincent on crawlkit control/TUI integration and repo-local skills, which makes him relevant to the remote/crawlkit and .agents/skills surfaces beyond merely opening this PR. (role: recent adjacent contributor; confidence: high; commits: 638fa1c4565c, c4be70e52191, 5e5c40153111; files: .agents/skills/crabbox/SKILL.md, internal/cli/control_commands.go, internal/cli/tui_commands.go)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@vincentkoc vincentkoc marked this pull request as ready for review May 27, 2026 19:12
@vincentkoc vincentkoc requested a review from a team as a code owner May 27, 2026 19:12
@clawsweeper clawsweeper Bot added labels May 27, 2026:

  • rating: 🦐 gold shrimp: Decent PR readiness signal, but merge confidence is limited.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action.
  • P2: Normal priority bug or improvement with limited blast radius.
  • merge-risk: 🚨 auth-provider: Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials.
  • merge-risk: 🚨 automation: Merging this PR could break CI, automerge, proof capture, label sync, or automation.

clawsweeper Bot commented May 27, 2026

ClawSweeper PR egg

🔥 Warming up: real-behavior proof passed; findings, security review, or rank-up moves are still in progress.

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.
What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@clawsweeper clawsweeper Bot added the merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data. label May 27, 2026
@vincentkoc vincentkoc merged commit dcf0c9c into main May 27, 2026
12 checks passed
@vincentkoc vincentkoc deleted the feature/cloudflare-remote-archives branch May 27, 2026 20:10

Labels

  • merge-risk: 🚨 auth-provider: Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials.
  • merge-risk: 🚨 automation: Merging this PR could break CI, automerge, proof capture, label sync, or automation.
  • merge-risk: 🚨 security-boundary: Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data.
  • P2: Normal priority bug or improvement with limited blast radius.
  • rating: 🦐 gold shrimp: Decent PR readiness signal, but merge confidence is limited.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action.
