Skip to content

feat: one-click prune nodes outside geofilter (#669 M4)#738

Merged
KpaBap merged 8 commits into
Kpa-clawbot:masterfrom
efiten:feat/geofilter-m4-clean
May 21, 2026
Merged

feat: one-click prune nodes outside geofilter (#669 M4)#738
KpaBap merged 8 commits into
Kpa-clawbot:masterfrom
efiten:feat/geofilter-m4-clean

Conversation

@efiten
Copy link
Copy Markdown
Contributor

@efiten efiten commented Apr 14, 2026

Summary

  • Adds POST /api/admin/prune-geo-filter endpoint — dry-run by default, ?confirm=true to permanently delete nodes outside the current geofilter polygon + buffer. Requires X-API-Key header.
  • Adds Prune nodes section inside the GeoFilter customizer tab (write-access only, same writeEnabled gate as PUT). Preview lists affected nodes; Confirm delete removes them.
  • Adds GetNodesForGeoPrune and DeleteNodesByPubkeys DB helpers.
  • Updates docs/user-guide/geofilter.md — documents the UI button as primary workflow, CLI script as alternative.

Depends on M3 (feat/geofilter-m3-customizer, PR #736). Merge M3 first.

Test plan

  • cd cmd/server && go test ./... — all pass
  • Customizer GeoFilter tab without apiKey — Prune section not visible
  • With apiKey + polygon active — Prune section visible
  • Preview returns list of nodes outside polygon (no deletions)
  • Confirm delete removes nodes, list clears
  • POST /api/admin/prune-geo-filter without X-API-Key → 401
  • POST /api/admin/prune-geo-filter with no polygon configured → 400

🤖 Generated with Claude Code

Copy link
Copy Markdown
Owner

@Kpa-clawbot Kpa-clawbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: one-click prune nodes outside geofilter (M4)

Good overall design: dry-run by default, API key required, no-GPS nodes always kept. Tests are thorough. A few items to address:

Must-fix

1. TOCTOU race between preview and confirm delete
The dry-run and confirm are two separate API calls. Between them, nodes could be ingested (or coordinates updated). The confirm call re-evaluates the filter independently — it doesn't use the preview list. This means:

  • Nodes not shown in preview could be deleted on confirm (ingested between calls, with out-of-bounds coords)
  • Nodes shown in preview might not be deleted (if their coords were updated to be inside)

The user sees a preview list, clicks "Delete N nodes", but a different set gets deleted. This violates the principle of least surprise for a destructive operation.

Fix: Either (a) pass the pubkey list from the preview to the confirm call (server-side or client-side), so only previewed nodes are deleted, or (b) return the actual deleted list in the confirm response so the UI shows what really happened, and document this behavior.

2. No cascade/cleanup of related data
DeleteNodesByPubkeys does DELETE FROM nodes WHERE public_key IN (...). If other tables reference these pubkeys (packets with source/dest fields, any future neighbor/edge table), orphaned references will remain. This isn't breaking today (no FK constraints), but it's technical debt that will bite when the schema evolves.

Fix: At minimum, add a code comment documenting that this only deletes from the nodes table and does not cascade. Ideally, also delete from any tables that reference the node pubkey (check transmissions/observations).

Should-fix

3. Irreversibility — no safety net
This is a permanent DELETE with no backup, no soft-delete, no undo. For a one-click operation in a UI panel, that's aggressive. The confirm() dialog is the only guard.

Suggestion: Log the deleted pubkeys + names at INFO level (not just the count) so an operator can at least see what was removed. The nodes will re-appear if the mesh re-advertises them, but historical metadata (first_seen, advert_count) is lost permanently.

4. Frontend prune confirm doesn't show the re-evaluated list
_gfPruneConfirm sends confirm=true but the response only contains { deleted: N }, not the list of what was deleted. If the set changed between preview and confirm (per issue #1), the user has no visibility into what actually happened.

Nits

  • handlePruneGeoFilter calls NodePassesGeoFilter(lat, lon, ...) where lat/lon are concrete float64 (defaulting to 0 when nil). This works correctly because PassesFilter has the lat == 0 && lon == 0 escape hatch, but it's semantically confusing — a nil coordinate is being passed as 0.0 wrapped in interface{}. Consider checking n.Lat == nil directly and skipping the filter call entirely for no-GPS nodes.
  • geofilter-builder.html loads Leaflet from unpkg CDN — consider using the same CDN/version as the rest of the app for consistency and offline scenarios.

What's good

  • Dry-run default is the right call — safe by default
  • requireAPIKey middleware + IsWeakAPIKey gate on the UI controls
  • Tests cover all the important paths: dry-run, confirm, no-filter-400, no-key-401
  • No-GPS nodes correctly preserved (verified through the full call chain)
  • Atomic config write (temp file + rename)
  • Documentation is comprehensive

Verdict: Requesting changes for the TOCTOU issue (#1). The rest can be addressed in follow-up, but #1 is a data safety concern for a destructive endpoint.

efiten added a commit to efiten/meshcore-analyzer that referenced this pull request Apr 19, 2026
- Fix TOCTOU race: confirm now requires pubkeys from preview in request
  body; server intersects with still-outside nodes so exactly the
  previewed set is deleted (no more, no less)
- Add cascade comment to DeleteNodesByPubkeys documenting that only the
  nodes table is affected (no FK constraints today)
- Log each deleted node by name + pubkey for operator visibility
- Return deleted node list in confirm response so UI shows what happened
- Check lat/lon nil directly instead of passing 0.0 to NodePassesGeoFilter
- Update confirm test to send pubkeys body; add test for missing body → 400

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@efiten efiten force-pushed the feat/geofilter-m4-clean branch from ebcfc18 to b3d29be Compare April 20, 2026 07:39
efiten added a commit to efiten/meshcore-analyzer that referenced this pull request Apr 20, 2026
- Fix TOCTOU race: confirm now requires pubkeys from preview in request
  body; server intersects with still-outside nodes so exactly the
  previewed set is deleted (no more, no less)
- Add cascade comment to DeleteNodesByPubkeys documenting that only the
  nodes table is affected (no FK constraints today)
- Log each deleted node by name + pubkey for operator visibility
- Return deleted node list in confirm response so UI shows what happened
- Check lat/lon nil directly instead of passing 0.0 to NodePassesGeoFilter
- Update confirm test to send pubkeys body; add test for missing body → 400

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Kpa-clawbot
Copy link
Copy Markdown
Owner

Re-review: PR #738 — geo-prune M4 (post b3d29be1)

Verdict: COMMENT — ready after rebase

Original review items (2026-04-15)

# Item Status
1 TOCTOU race between preview and confirm — different node set could be deleted ✅ Fixed. Confirm now requires pubkeys in request body; server intersects with current outside-filter set (routes.go:2565-2580). Frontend sends preview pubkeys in _gfPruneConfirm. Solid fix.
2 No cascade/cleanup comment on DeleteNodesByPubkeys ✅ Fixed. Comment added in db.go:2355-2357 documenting no FK cascade.
3 Irreversibility — no audit trail ✅ Fixed. Each deleted node logged individually at INFO: log.Printf("[geo-prune] deleted node %q (%s)") (routes.go:2596).
4 Confirm response didn't show what was deleted ✅ Fixed. Response now includes "nodes": toDelete with full list (routes.go:2600).
5 Nil coords passed as 0.0 to filter ✅ Fixed. NodeForGeoPrune uses *float64 pointers; handler checks n.Lat == nil before calling NodePassesGeoFilter (routes.go:2554).
6 Leaflet CDN in geofilter-builder.html Minor — still uses unpkg, rest of app uses CDN link in index.html. Non-blocking.

New findings (djb + munger)

No must-fix items found. The TOCTOU fix is well-designed — the intersection approach means confirm can never delete more than what was previewed, even if the polygon or GPS data changes between calls.

Observations (informational, not blocking):

  • confirm=true is a query param on a POST — acceptable since the body carries the pubkey payload and the endpoint requires X-API-Key. A future improvement could move it to the body.
  • Single DELETE FROM nodes WHERE public_key IN (...) is atomic — no partial-delete risk.
  • Same X-API-Key gates all admin endpoints. Acceptable for a single-operator tool; if multi-operator access is added later, consider per-endpoint permissions.
  • No soft-delete/undo — nodes re-appear when the mesh re-advertises them, but historical metadata (first_seen, advert_count) is lost. Documented behavior; operator is warned via confirm() dialog.

Merge verdict

Ready after rebase. PR is currently CONFLICTING against master. Depends on PR #736 (M3) — merge that first, then rebase this. Zero must-fix items remain.

@Kpa-clawbot
Copy link
Copy Markdown
Owner

@efiten same comment as elsewhere - please clean up this PR and get it ready for merging or abandon it.

efiten and others added 5 commits May 1, 2026 13:36
…bot#669 M3)

Backend:
- Add PUT /api/config/geo-filter (requires X-API-Key) — saves geo_filter
  back to config.json atomically and updates in-memory config immediately,
  no restart needed
- Add SaveGeoFilter() to config.go: reads config as raw map (preserving
  _comment fields), updates geo_filter key, writes back via temp+rename
- Add writeEnabled field to GET /api/config/geo-filter response so the
  frontend can gate editing controls on server write capability
- Add Server.configDir field; wired from -config-dir flag in main.go
- Tests: TestPutConfigGeoFilter (4 cases) + TestSaveGeoFilter (3 cases)

Frontend:
- Add GeoFilter tab (🗺️) to the customizer between Display and Export
- Tab shows current polygon on a Leaflet map (read-only for all users)
- Editing controls (undo, clear, buffer km, API key input, save/remove)
  are only revealed when the server reports writeEnabled=true — i.e. the
  deployment has a write-capable apiKey configured. Public instances see
  a read-only polygon view.
- Save calls PUT /api/config/geo-filter; Remove clears the filter
- Map is destroyed on tab switch and panel close to avoid Leaflet leaks

Docs:
- Add docs/user-guide/geofilter.md (full guide: config, customizer,
  builder, prune script, API)
- Update configuration.md and customization.md with geo_filter section
- Update config.example.json _comment to mention the Customizer tab

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Clicking the small inline map in the customizer GeoFilter tab now opens
a full-screen modal (92vw × 86vh) with Undo/Clear/Done/Cancel controls.
The inline map becomes a read-only preview. Both maps and the standalone
geofilter-builder.html now use CartoDB Positron (light) instead of dark.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix data race on s.cfg.GeoFilter: add cfgMu RWMutex with getGeoFilter/
  setGeoFilter accessors used in all handler goroutines
- Add 1 MB MaxBytesReader cap on PUT /api/config/geo-filter request body
- Validate polygon coordinate ranges (lat ∈ [-90,90], lon ∈ [-180,180])
  and reject NaN/Inf values
- Cap polygon at 1000 points to bound storage and computation
- Document writeEnabled information-leak trade-off with a comment
- Add tests for coordinate range rejection and oversized polygon rejection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds POST /api/admin/prune-geo-filter endpoint (dry-run by default,
?confirm=true to delete). Wires a Prune section into the GeoFilter
customizer tab — preview lists affected nodes, confirm deletes them.
Requires write-capable apiKey (writeEnabled gate, same as PUT).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix TOCTOU race: confirm now requires pubkeys from preview in request
  body; server intersects with still-outside nodes so exactly the
  previewed set is deleted (no more, no less)
- Add cascade comment to DeleteNodesByPubkeys documenting that only the
  nodes table is affected (no FK constraints today)
- Log each deleted node by name + pubkey for operator visibility
- Return deleted node list in confirm response so UI shows what happened
- Check lat/lon nil directly instead of passing 0.0 to NodePassesGeoFilter
- Update confirm test to send pubkeys body; add test for missing body → 400

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@efiten efiten force-pushed the feat/geofilter-m4-clean branch from b3d29be to 0f565e4 Compare May 1, 2026 12:14
@KpaBap KpaBap enabled auto-merge (squash) May 2, 2026 18:18
@efiten
Copy link
Copy Markdown
Contributor Author

efiten commented May 20, 2026

Must-fix #1 (TOCTOU) is addressed:

  • Confirm call requires pubkeys from preview in request body; missing body → 400
  • Backend intersects preview pubkeys with current outside-filter set before deleting — only nodes shown in preview AND still outside are deleted
  • Response returns the actual deleted node list (nodes array)
  • Each deletion is logged at INFO level with name + pubkey

Tests cover: dry-run preview, confirm delete with pubkeys, missing body → 400, no filter → 400, no API key → 401.

Ready for re-review.

Kpa-clawbot added 3 commits May 21, 2026 02:54
…xample/builder

Brings Kpa-clawbot#1289 (server read-only, prune moved to ingestor) and other
master changes into the geo-prune branch. Resolutions:

- cmd/server/config.go: keep both SaveGeoFilter (this branch) and
  obsBlacklistSet/AnalyticsConfig (master).
- cmd/server/routes.go: drop /api/admin/prune handler+route per Kpa-clawbot#1283;
  retain /api/admin/prune-geo-filter (this branch).
- cmd/server/db.go: drop server-side PruneOldPackets / PruneOldMetrics
  / RemoveStaleObservers (now ingestor-owned). DeleteNodesByPubkeys
  still present — moved to ingestor in the next commit.
- config.example.json: prefer master's GeoFilter Builder wording.
- public/geofilter-builder.html: take master version.
- s.store.graph is now atomic.Pointer; add .Load() at the call site.
…pa-clawbot#738)

After Kpa-clawbot#1283/Kpa-clawbot#1289 the server opens SQLite mode=ro. The geo-prune feature
introduced in PR Kpa-clawbot#738 invokes DB.DeleteNodesByPubkeys from an HTTP handler;
in production this would fail with 'attempt to write a readonly database'
— the feature is dead on arrival. Tests pass only because setupTestDB
uses :memory: without mode=ro.

This commit reinstates the master-side cmd/server/readonly_invariant_test.go
and extends TestServerDBHasNoWriteMethods to assert DeleteNodesByPubkeys is
NOT a method on the server *DB. With the method still present (from the
M4 feature commit), the test FAILS:

    readonly_invariant_test.go:85: server *DB exposes forbidden write
    method "DeleteNodesByPubkeys" — must be relocated to ingestor (Kpa-clawbot#1283)

The next commit (GREEN) removes DeleteNodesByPubkeys from cmd/server,
moves the DELETE to the ingestor via the new internal/prunequeue
marker-file protocol, and rewrites handlePruneGeoFilter to enqueue
requests rather than write directly.
…a-clawbot#738 / Kpa-clawbot#1289)

GREEN for the invariant test added in the prior commit. Resolves the dead-on-
arrival blocker introduced by PR Kpa-clawbot#738: the server opens SQLite mode=ro after
Kpa-clawbot#1283/Kpa-clawbot#1289, so the `confirm=true` path of `/api/admin/prune-geo-filter` would
fail in production with "attempt to write a readonly database". The author's
tests only passed because setupTestDB uses `:memory:` without mode=ro.

## Architecture: file-marker queue (chosen over IPC/HTTP proxy)

Server → marker file → ingestor → result file → server. New package
`internal/prunequeue` defines the on-disk protocol:

  <dir(dbPath)>/prune-requests/request-<id>.json   (written by server)
  <dir(dbPath)>/prune-requests/result-<id>.json    (written by ingestor)

Atomic via os.Rename. The ingestor removes the request file as part of writing
the result, so the directory naturally drains.

Why file markers over an internal localhost HTTP socket:
  * No new listener / port / auth surface to manage.
  * Inherits the same backup/permissions story as the SQLite file — they share
    a directory and a deployment.
  * Survives ingestor restarts: pending requests are processed on next boot.
  * Trivial to inspect / clear with `ls` / `trash` during ops.
  * Matches the existing maintenance-ticker pattern in cmd/ingestor/main.go.

## HTTP surface (unchanged for clients in spirit)

POST /api/admin/prune-geo-filter
  * dry-run (default): same response shape as before — preview list.
  * confirm=true: now returns 202 Accepted + {requestId, statusUrl, count, nodes}.
    Body must still carry the pubkeys snapshot from the preview (TOCTOU guard
    preserved — the ingestor honors the list verbatim, does NOT re-evaluate
    geo_filter membership).

GET /api/admin/prune-geo-filter/status?id=<id>  (NEW, requires API key)
  * pending: 200 {requestId, status:"pending"}
  * done:    200 {requestId, status:"done", deleted, requestedAt, completedAt}
  * error:   200 {requestId, status:"error", error, ...}
  * 404 when neither marker nor result exist.
  * 400 for invalid ids (path-traversal guard: hex/[0-9a-f] only, ≤64 chars).

## TOCTOU + cascade-cleanup from prior CR

  * TOCTOU: server snapshots the preview list at confirm time and writes the
    PRUNED list (intersection of preview + still-outside) into the marker.
    The ingestor does NOT re-check geo_filter — it executes exactly what the
    operator confirmed.
  * Cascade cleanup: ingestor's DeleteNodesByPubkeys chunks under SQLite's
    variable limit (500 per stmt). It deliberately limits scope to the `nodes`
    table; the schema has no FKs, so observation/transmission rows are left
    for the regular packet-retention ticker. Documented in-line.

## Read-only invariant test

Extends cmd/server/readonly_invariant_test.go (TestServerDBHasNoWriteMethods)
to forbid `DeleteNodesByPubkeys` on the server `*DB`. With this commit the
method is gone, so the test passes:

  $ go test -run TestServerDBHasNoWriteMethods ./...
  ok      github.com/corescope/server     0.190s

## Tests

* cmd/server/routes_test.go: rewrote the confirm-test to assert 202 +
  marker file existence; added pending→done status-roundtrip test;
  added 404 + path-traversal-id rejection tests. Old TestDeleteNodesByPubkeys
  removed (the method now lives in cmd/ingestor).
* cmd/ingestor/prune_geofilter_test.go: end-to-end queue test (request file
  on disk → RunPendingPruneRequests → result file + DB rows actually deleted),
  plus empty-queue no-op.
* internal/prunequeue/prunequeue_test.go: round-trip, path-traversal rejection,
  missing-result returns (nil, nil).

## Files

* internal/prunequeue/{prunequeue.go,prunequeue_test.go,go.mod} — new package
* cmd/ingestor/prune_geofilter.go — Store.DeleteNodesByPubkeys + RunPendingPruneRequests
* cmd/ingestor/prune_geofilter_test.go — queue end-to-end test
* cmd/ingestor/main.go — startup pass + 15s ticker for the queue, Stop on shutdown
* cmd/server/db.go — DeleteNodesByPubkeys removed; only GetNodesForGeoPrune (read) remains
* cmd/server/routes.go — handlePruneGeoFilter rewritten to enqueue;
  handlePruneGeoFilterStatus added; route registered
* cmd/server/routes_test.go — new tests; old direct-DELETE coverage removed
@KpaBap KpaBap merged commit 51f823b into Kpa-clawbot:master May 21, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants