search: logsim search daemon + IR API + TUI + --to local forwarding #48

Draft
hyfather wants to merge 8 commits into master from claude/logsim-search-daemon-tui-pZ5eQ

hyfather commented May 6, 2026

Summary

logsim search starts an in-process HTTP daemon on :3700 that ingests Splunk/Cribl HEC and exposes a small set of information-retrieval functions over HTTP, plus a bubbletea TUI for spot-checking what's been ingested.

  • Daemon: chi router on 127.0.0.1:3700, CORS open for the future frontend, ctrl-c shuts down + deletes every db.
  • DBs: 6-char [a-z0-9] codes (auto-generated or caller-supplied), backed by in-memory DuckDB. Each db is one DuckDB connection with a fixed schema modelled on Splunk HEC: (time, host, source, sourcetype, index, raw, fields JSON).
  • Ingest: POST /dbs/{code}/services/collector/event (NDJSON HEC) and /raw (line-by-line). HEC ingest auto-creates the target db.
  • IR API (room to grow): get_raw, get_summary (count/sum/avg/min/max/distinct_count, optional group_by), get_distribution (time-bucketed counts), get_top_values. The function set is intentionally narrow — every operation is expressible in Splunk SPL / Cribl Search / Datadog / Quickwit, so future backends can implement them natively.
  • TUI: live db list (code · events · oldest · newest · age), Enter to spot-check the first 100 raw lines of any db.
  • Forwarding: logsim run <scenario> --to local derives a db code from the scenario slug (e.g. web-service → webser) and forwards via the existing CriblSink. --to local:<code> pins to a specific code; --to local:my-cache-test slugifies.
$ logsim search --no-tui &
$ logsim run web-service --ticks 5 --to local --quiet --force
logsim: sent 183 events to local:webser in 2 batches
$ curl -s http://127.0.0.1:3700/dbs | jq '.dbs[0].event_count'
183
$ curl -s -X POST http://127.0.0.1:3700/dbs/webser/get_summary \
       -H 'content-type: application/json' \
       -d '{"agg_fn":"count","group_by":"sourcetype"}'
{"rows":[{"group":"nodejs","value":56},{"group":"mysql:query","value":56},{"group":"nginx:access","value":56},{"group":"aws:vpcflow","value":15}]}
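
The slug-to-code derivation used by `--to local` could look roughly like the sketch below. The helper name `codeFromSlug` and its exact rules are assumptions: the PR only states that codes are 6-char `[a-z0-9]` and that `web-service` becomes `webser`. How slugs shorter than six characters are handled is not described, so the sketch simply returns them as-is.

```go
package main

import (
	"fmt"
	"strings"
)

// codeFromSlug is a hypothetical helper: it lowercases the slug, drops every
// character outside [a-z0-9], and keeps at most the first six that remain.
func codeFromSlug(slug string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(slug) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		}
		if b.Len() == 6 {
			break
		}
	}
	return b.String()
}

func main() {
	fmt.Println(codeFromSlug("web-service"))   // webser
	fmt.Println(codeFromSlug("my-cache-test")) // mycach (under the rules assumed here)
}
```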

Architecture

  • pkg/search/backend.go — Backend interface (Ingest, GetRaw, GetSummary, GetDistribution, GetTopValues, Stats, Close) + canonical Event type. Swappable seam for Splunk/Cribl/Datadog/Quickwit later.
  • pkg/search/duckdb.go — DuckDBBackend (CGO; //go:build cgo).
  • pkg/search/duckdb_nocgo.go — stub returning ErrNoCGO so the rest of the module still builds with CGO_ENABLED=0.
  • pkg/search/registry.go — process-wide map of code → backend, lifecycle, GetOrCreate for HEC auto-create.
  • pkg/search/hec.go + raw_lines.go — HEC envelope and raw-line parsers.
  • pkg/search/server.go — chi handlers.
  • pkg/search/tui.go — bubbletea program.
  • cmd/logsim/search.go — wires it all together.

CGO / Vercel

The duckdb-go binding requires CGO. To keep the Vercel functions in api/ unaffected:

  • pkg/search/duckdb.go is gated //go:build cgo; a !cgo stub satisfies the Backend interface with ErrNoCGO.
  • api/ does not import pkg/search, so CGO_ENABLED=0 go build ./api/... builds and runs as before.
  • CGO_ENABLED=0 go test ./... is also clean (the duckdb-specific test file is gated too).

Follow-up: release.yml currently builds with CGO_ENABLED=0, which means binaries from GitHub Releases will return ErrNoCGO if the user runs logsim search. Enabling CGO in the release matrix needs cross-compile toolchains (e.g. gcc-aarch64-linux-gnu for linux/arm64, native macos runners for darwin) — punted to a separate PR so this one stays focused. go install from source works today.

Test plan

  • CGO_ENABLED=1 go test ./... — passes
  • CGO_ENABLED=0 go test ./... — passes (duckdb tests gated)
  • CGO_ENABLED=0 go build ./api/... — passes (Vercel-safe)
  • End-to-end smoke: logsim search --no-tui + logsim run web-service --to local + curl /dbs/webser/get_summary round-trips
  • HEC parser handles NDJSON, object events, missing time, mixed empty lines, malformed lines
  • Registry race test confirms GetOrCreate is single-flight

Out of scope (called out for the next iteration)

  • Auth/CSRF on :3700 (deferred per request)
  • Frontend wiring against the IR API
  • More IR functions (substring search, get_correlated, get_patterns)
  • Persistent dbs / on-disk DuckDB
  • Splunk HEC ack endpoints (?channel=..., /ack)
  • release.yml CGO build (so release binaries support logsim search)

Generated by Claude Code

…warding

`logsim search` starts an in-process HTTP daemon on :3700 that ingests
Splunk/Cribl HEC and exposes a small set of information-retrieval
functions (get_raw, get_summary, get_distribution, get_top_values) over
HTTP, plus a bubbletea TUI for spot-checking what's been ingested. Each
db is identified by a 6-char alphanumeric code; ctrl-c tears down every
db on shutdown.

The IR function set is deliberately narrow so it stays expressible in
Splunk/Cribl/Datadog/Quickwit query languages — the Backend interface
in pkg/search is the swappable seam for those future implementations.
DuckDB is the v1 backend; it is gated behind `//go:build cgo` with a
non-CGO stub so api/* (Vercel) builds and `CGO_ENABLED=0 go test ./...`
both stay clean.

`logsim run <scenario> --to local` now forwards via HEC to the daemon,
auto-creating a db whose code is derived from the scenario slug
(`web-service` → `webser`); `--to local:<code>` overrides.

vercel Bot commented May 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| logsim2 | Ready | Preview, Comment | May 6, 2026 10:46pm |

JSON-encoding a nil slice yields `null`; clients have to special-case
that. Initialise the result slices up front so an empty match returns
`{"events":[],"total":0}` etc. — easier for the upcoming frontend.
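
The difference is easy to reproduce with `encoding/json`. The `rawResult` struct below is a simplified stand-in (the real response carries event objects rather than strings):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawResult is a simplified stand-in for a get_raw response.
type rawResult struct {
	Events []string `json:"events"`
	Total  int      `json:"total"`
}

// encode marshals a result for the given events and returns the JSON text.
func encode(events []string) string {
	out, err := json.Marshal(rawResult{Events: events, Total: len(events)})
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	var none []string               // nil slice
	fmt.Println(encode(none))       // {"events":null,"total":0}
	fmt.Println(encode([]string{})) // {"events":[],"total":0}
}
```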

Adds a pure-Go `MemoryBackend` (no CGO) implementing the same `Backend`
interface as `DuckDBBackend`, plus a single Vercel function at
`api/search/[...path].go` that mounts the `pkg/search` chi router with
that backend. The /dbs routes, HEC ingest, and IR queries are identical
to what `logsim search` exposes locally; on Vercel the storage is in the
warm function instance's memory (cold starts reset, which matches the
per-session UX).

Frontend changes:

- `src/lib/searchClient.ts` — typed wrappers for createDb, ingestHEC,
  getRaw, getSummary, getDistribution, getTopValues against
  /api/search/...
- `useSimulationStore.dbCode` — code of the current play's db, set on
  Run, cleared on Reset.
- `Topbar` Play / Step paths create a db, ingest each tick batch via
  HEC (fire-and-forget), and best-effort delete on Reset / next play.
  Failures during ingest fall back silently — `logBuffer` still drives
  the live view.
- `logsAt` accepts an optional `dbCode`; when set, scrubbing reads
  events out of the daemon instead of re-running the engine via
  /api/logs_at. `ScrubbedLogs` passes the current `dbCode` through.

Forward mode (server-side flat-out HEC) doesn't stream logs back to
the browser, so write-through there needs a separate server-side tee
to /api/search — left as a follow-up.

`CGO_ENABLED=0 go build ./...` and `go test ./...` are clean (Vercel
build path), and `npx tsc` + `npx next build` succeed.

Moves the editor's per-tick HEC ingest out of the browser and into the
backend. /api/run and /api/logs_at now accept a `search_db_code` field;
when set, every batch of events the engine emits is also POSTed to
/api/search/dbs/<code>/services/collector/event from the same Vercel
function. Forward mode benefits the most — events were never visible
to the browser there, so until now they couldn't land in a search db
at all. The new tee runs alongside the Cribl HEC sink so destinations
still receive events the same way.

The frontend mints a 6-char code locally (the daemon auto-creates the
db on first ingest), passes it via `searchDBCode` to runStream /
runForward / logsAt, and stores it as `dbCode` for downstream queries.
Realtime, Forward, and Step all use the same path. The redundant
ingestLogs() call in Topbar's onTick is gone.

Verified end-to-end via cmd/devserver: a 10-tick scenario forwarded
with search_db_code=abc123 produced 366 events; get_summary returned
the right per-sourcetype counts and get_top_values returned realistic
status_code distributions (200, 304, 301, 201, 204, 202, 400).

Vercel's Go runtime treats every .go file under api/** as a separate
function entry point and fails the deploy when one doesn't export
Handler. The new search_tee.go helpers in api/run/ and api/logs_at/
hit exactly that error. Moving the helper into pkg/apihelp (which is
already shared across the lambdas) is the established pattern for
api/* utilities that don't ship as their own function.

ScrubbedLogs called logsAt with `from=0, to=tick` — tick indices, not
timestamps. The daemon path translated those via `dbStartTimeMs ?? 0`,
which fell back to epoch 1970 because nothing was wiring the run's
wall-clock anchor through. Result: every fetch on pause queried a
window decades before the actual events and returned nothing — logs
visibly vanished from the panel the moment playback stopped.

Adds `dbStartTimeMs` to the simulation store, populates it alongside
`dbCode` at every play / forward / step start, threads it through
ScrubbedLogs → logsAt → /api/search/dbs/<code>/get_raw. logsAt now
also adds one tick of slack to the upper bound so the trailing tick's
events (which can land slightly after their tick boundary due to
engine jitter) are included.

Verified end-to-end: 10-tick web-service run produces 366 events;
the fixed window returns 363 of them, the pre-fix epoch-1970 window
returns 0.
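
The tick-to-timestamp translation can be sketched as follows. The actual change lives in the TypeScript frontend; this Go rendition only illustrates the arithmetic. The function name `tickWindowMs`, the `tickMs` parameter, and the choice to express the slack as `toTick+1` are assumptions.

```go
package main

import "fmt"

// tickWindowMs converts a tick range into epoch-millisecond bounds anchored at
// the run's wall-clock start. The upper bound gets one extra tick of slack so
// events that land slightly after the last tick boundary are still included.
func tickWindowMs(startMs, fromTick, toTick, tickMs int64) (fromMs, toMs int64) {
	fromMs = startMs + fromTick*tickMs
	toMs = startMs + (toTick+1)*tickMs
	return fromMs, toMs
}

func main() {
	// With the anchor missing (0), the window sits at the 1970 epoch.
	fmt.Println(tickWindowMs(0, 0, 10, 1000)) // 0 11000

	// With the run's real start time, the window covers the ingested events.
	fmt.Println(tickWindowMs(1778000000000, 0, 10, 1000)) // 1778000000000 1778000011000
}
```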
…views)

The server-side tee in /api/run + /api/logs_at relied on cross-function
HTTP from those handlers to /api/search/dbs/<code>/services/collector/event.
That POST has no auth cookie — and Vercel preview deployments often
have password protection enabled, which 401s any anonymous request
including function-to-function calls within the same deployment. The
tee silently failed, the daemon stayed empty, and ScrubbedLogs found
nothing on pause: "logs disappear".

Browser-side ingest works because the user's session cookie comes
along on each fetch. Reverting streaming and step to ingest from
Topbar's onTick / handleStep paths fixes the bug under preview
protection without changing the architecture for the deployed-public
case (still works there, just via the browser).

Forward mode still uses server-side tee since the browser never sees
events in that mode — it's the only path that needs cross-function
HTTP, and it's a known limitation that forward + auth-protected preview
won't populate the search db. Documented in the code comments.

When the user hits pause, isRunning flips to false and ScrubbedLogs
swaps from liveLogs (the in-memory buffer) to scrubLogs (the daemon's
view). The swap was unconditional, so two cases left the panel blank:

- The 120ms debounce window before the daemon fetch returns.
- The daemon legitimately has no events (auth-walled cross-function
  POST 401'd, network failure, cold function instance, etc.).

Both manifested as "logs disappear when I hit pause" because liveLogs
still has the events the user just watched stream by — we just stopped
showing them.

Now scrubLogs only takes over when it actually has rows; otherwise the
panel stays on liveLogs. This is also robust against future daemon
failures: the local view never goes blank just because the persistent
store is unhappy.
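
The selection rule reduces to a small pure function. The real logic sits in the `ScrubbedLogs` React component; the Go sketch below, with the invented name `visibleLogs`, only restates the rule described above.

```go
package main

import "fmt"

// visibleLogs picks which rows the panel shows. While running, the live buffer
// always wins. When paused, the daemon's rows take over only if there are any;
// otherwise the panel stays on the live buffer.
func visibleLogs(isRunning bool, live, scrub []string) []string {
	if isRunning || len(scrub) == 0 {
		return live
	}
	return scrub
}

func main() {
	live := []string{"GET / 200", "GET /health 200"}

	// Paused, but the daemon fetch has not returned yet (or returned nothing).
	fmt.Println(len(visibleLogs(false, live, nil))) // 2

	// Paused, and the daemon has rows for the scrubbed window.
	fmt.Println(len(visibleLogs(false, live, []string{"GET / 200"}))) // 1
}
```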