This project is intentionally light on moving parts, but the operator path is still easier if there is one place to look for the recurring commands.
If you need the 60-second system map before touching anything, read AT_A_GLANCE.md first. It points to the main subsystems, their code owners, and the first knobs to check when the node drifts.
Current reference host image:
Ubuntu Server 24.04.4 LTS
As of March 30, 2026, Ubuntu 26.04 LTS is still beta with final release
expected on April 23, 2026, so 24.04.4 LTS remains the stable hosting base
for this stack.
Bootstrap a new server:
./scripts/first_boot.sh --public-host 203.0.113.10 --deploy

Re-deploy after code or config changes:

./scripts/update.sh --public-host 203.0.113.10

Run the fast repo checks before deploy:

./scripts/check.sh

Clear local cache and test/browser noise when this clone has been used for setup or diagnostics:

./scripts/clean_local.sh

Run the operator doctor for env, browser, and storage posture:

./scripts/doctor.sh

Walk through the install-day hardware and kiosk checklist:

open docs/installation-checklist.md

Open the reference Ubuntu host recipe:

open docs/UBUNTU_APPLIANCE.md

Review the current hands-free trigger path for /kiosk/:

open docs/HANDS_FREE_CONTROLS.md

Open the short recovery card for a non-author steward:

open docs/OPERATOR_DRILL_CARD.md

See service state and backend readiness:

./scripts/status.sh

Tail recent logs for one service:

./scripts/status.sh --logs api
./scripts/status.sh --logs worker --tail 80

Create a backup:

./scripts/backup.sh

Create a consistency-first backup for research snapshots:

./scripts/backup.sh --consistent

Create a portable export bundle from the latest backup:

./scripts/export_bundle.sh --latest

Run the close-of-day archive flow (consistent backup + export, optional USB copy):

./scripts/session_close_archive.sh
./scripts/session_close_archive.sh --to-usb /absolute/mount/path

Restore a backup:

./scripts/restore.sh --from backups/20260317-120000

Create a remote-friendly support bundle with logs and health snapshots:
./scripts/support_bundle.sh

- scripts/first_boot.sh creates .env if needed, replaces development defaults, and optionally chains into deployment.
- scripts/deploy.sh writes host and TLS settings into .env, refuses obvious development secrets, and runs compose.
- scripts/ubuntu_appliance.sh configures the current Ubuntu host recipe: narrow firewall defaults plus a restart-on-boot systemd unit for this repo checkout.
- scripts/update.sh is the normal existing-server path: fast-forward pull, checks, doctor, backup, deploy, and final status.
- scripts/check.sh is the quick sanity pass for browser JavaScript syntax, frontend unit tests with Node coverage thresholds, the default Playwright browser subset, Python, the Django behavior suite with Python coverage thresholds and reports, shell syntax, and git diff --check.
- scripts/release_smoke.sh is the disposable compose-backed appliance proof: it boots an isolated smoke stack on 127.0.0.1:18080, waits for /healthz and /readyz, then runs the live Playwright ritual for kiosk submit, room playback, and ops visibility.
- scripts/research_smoke.sh is the evaluation-focused disposable proof: it runs a deeper submit/revoke/remove flow, verifies audit trail visibility, and creates backup/export artifacts (plus an optional disposable restore rehearsal). It is intentionally software-scoped and does not prove physical mic/speaker routing or steward comprehension.
- scripts/clean_local.sh removes regenerable local caches such as api/.test-cache, __pycache__, and Playwright output. Pass --include-screenshots if you also want to clear generated screenshots.
- .github/workflows/check.yml runs that same scripts/check.sh gate in GitHub Actions using a repo-local .venv, so CI stays aligned with the local check path.
- scripts/doctor.sh checks .env, compose state, narrow API health through /healthz, broader cluster readiness through /readyz, and browser/TLS constraints that affect recording.
- scripts/browser_kiosk.sh launches Chromium into /kiosk/, /room/, or /ops/ with a repeatable kiosk-safe flag set. The /room/ role adds autoplay-hardening flags automatically. /ops/ also now includes an operator-only monitor panel for output-tone checks and local live mic play-through. Use that surface, not /kiosk/, when you need to verify the current steward machine's local routing. Do not overread it as proof of the separate kiosk recorder or room playback machine.
- scripts/status.sh prints docker compose ps and then fetches /healthz and /readyz from inside the API container.
- scripts/backup.sh writes timestamped Postgres and MinIO snapshots into backups/, includes checksums/provenance metadata, and supports --consistent mode for short write-path pauses during capture.
- scripts/restore.sh restores one of those snapshots into the current stack and now asks for explicit confirmation plus a fresh pre-restore consistent snapshot by default.
- scripts/export_bundle.sh packages one backup snapshot into a portable .tgz with a manifest, checksums, explicit import instructions, and an artifact summary when the API container is running.
- scripts/session_close_archive.sh is the close-of-day wrapper: consistent backup first, export second, and optional USB copy/checksum in one bounded host command.
- scripts/support_bundle.sh gathers a redacted .env, /healthz, /readyz, compose status, doctor output, recent logs, and an artifact summary into a single handoff archive.
- /api/v1/operator/artifact-summary gives stewards the same artifact posture snapshot as a direct JSON download from /ops/.
- docs/installation-checklist.md is the install-day checklist for kiosk hardware, browser mode, audio routing, and auto-start verification.
- docs/UBUNTU_APPLIANCE.md is the explicit Ubuntu Server 24.04.4 LTS host recipe for firewall and restart-on-boot posture.
- docs/HANDS_FREE_CONTROLS.md documents the current Leonardo-based kiosk button path that reuses the browser shortcut contract instead of adding a new host control layer.
- docs/OPERATOR_DRILL_CARD.md is the shortest recovery ritual for kiosk, room, operator, and emergency archive removal when time is tight.
- Django also validates runtime config relationships at startup now, so bad threshold ordering or insecure origin posture fails fast before the stack enters service.
- INSTALLATION_PROFILE can provide a named starting posture for room behavior and kiosk defaults. Explicit env vars still override profile defaults.
- ENGINE_DEPLOYMENT declares the active deployment kind (memory default; also question, prompt, repair, witness, oracle) so /ops/, participant framing, artifact metadata, and playback weighting can branch safely without changing routes.
- docker-compose.yml now pins MinIO and mc to fixed official release tags instead of latest. If you want to bump them, change MINIO_SERVER_IMAGE and MINIO_MC_IMAGE intentionally, then run the normal check + smoke path before deploy.
- Public write paths are also guarded by server-side WAV validation and two-layer DRF throttling: a kiosk-friendly client limit plus a broader IP abuse ceiling. If you tune those limits, update INGEST_MAX_UPLOAD_BYTES, INGEST_MAX_DURATION_SECONDS, PUBLIC_INGEST_RATE, PUBLIC_INGEST_IP_RATE, PUBLIC_REVOKE_RATE, and PUBLIC_REVOKE_IP_RATE together. /ops/ now shows those configured budgets plus recent throttle hits, and /kiosk/ shows a soft warning when the current station is nearing its remaining ingest budget.
- Leave DJANGO_TRUST_X_FORWARDED_FOR=0 unless your reverse proxy strips and rewrites forwarded headers correctly. If you turn it on, throttling and steward network allowlists will trust that header.
- Django now defaults its shared cache to CACHE_URL and otherwise falls back to REDIS_URL when present, so cache-backed lockouts, throttle snapshots, heartbeat timestamps, and playback-ack dedupe live in shared Redis instead of per-process local memory. Outside debug mode, startup now fails immediately if neither is present unless you explicitly set DJANGO_ALLOW_LOCAL_MEMORY_CACHE=1 for an isolated local harness.
- /readyz and /ops/ now expect fresh Celery worker and beat heartbeats. /healthz stays narrow so the API container health check does not depend on broader worker/beat state.
- Operator sessions now default to OPS_SESSION_BINDING_MODE=user_agent, which is less brittle than pinning to the steward IP. Use strict if you explicitly want IP+browser binding, or none for a very trusted single-site install.
- Failed operator sign-ins now default to OPS_LOGIN_LOCKOUT_SCOPE=ip_user_agent, so a bad secret attempt is less likely to lock out unrelated stewards behind the same NAT. Use ip only if you explicitly want network-wide lockout behavior.
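As a reminder of the "tune them together" rule, here is a hedged .env fragment showing the six ingest/revoke knobs side by side. The values below are illustrative placeholders, not recommendations, and the exact rate syntax should be checked against .env.example before use:

```shell
# Illustrative values only; tune all six together, not one at a time.
INGEST_MAX_UPLOAD_BYTES=26214400    # upload size ceiling (example: 25 MiB)
INGEST_MAX_DURATION_SECONDS=180     # longest accepted recording, in seconds
PUBLIC_INGEST_RATE=10/hour          # kiosk-friendly per-client limit
PUBLIC_INGEST_IP_RATE=60/hour       # broader IP abuse ceiling
PUBLIC_REVOKE_RATE=5/hour
PUBLIC_REVOKE_IP_RATE=30/hour
```

After changing any of these, redeploy and confirm the budgets shown on /ops/ match what you intended.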
The official supported runtime is the Docker Compose stack, with the API image
from api/Dockerfile pinned to Python 3.12.
What that means in practice:
- deployment and operator guidance assume the containerized stack
- docker compose up --build is the source-of-truth runtime
- ./scripts/check.sh is the source-of-truth repo gate
- local .venv usage is still useful, but it is a convenience path rather than the primary support contract
If ./scripts/check.sh reports a host Python other than 3.12, treat that as
best-effort local maintenance. It may still work, but the repo does not promise
that every dependency will install or behave identically outside the container.
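If you want that judgment in script form, a minimal sketch follows. The function name check_host_python is hypothetical (not part of this repo's scripts), and the version is passed in as an argument so the logic is easy to exercise:

```shell
# Hypothetical helper: classify the host Python against the container baseline.
# Returns 0 only on an exact 3.12 match; anything else is best-effort territory.
check_host_python() {
  case "$1" in
    3.12) echo "host python $1 matches the container baseline"; return 0 ;;
    *)    echo "host python $1 is best-effort territory; the image pins 3.12"; return 1 ;;
  esac
}

# Live invocation would feed in the real version:
#   check_host_python "$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
```

Either way, the containerized stack remains the supported runtime; this only tells you how much to trust a local .venv.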
After deploy, run these once before calling the node install-ready:
curl -sSI http://127.0.0.1/ | grep -E 'X-Content-Type-Options|X-Frame-Options|Referrer-Policy|Permissions-Policy'
docker compose exec -T api sh -lc 'id && touch /var/log/memory_engine/.write-test && rm -f /var/log/memory_engine/.write-test'
docker compose exec -T api sh -lc 'python - <<\"PY\"\nimport tempfile\nf=tempfile.NamedTemporaryFile(delete=True)\nf.write(b\"ok\")\nf.flush()\nprint(\"tmp-ok\")\nPY'Interpretation:
- missing expected proxy headers means Caddy hardening config is not active
- failed write test under /var/log/memory_engine means volume ownership/permissions need attention
- failed tempfile write suggests container runtime permissions are too restrictive for normal app behavior
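The header grep above is a spot-check; a stricter variant that fails loudly on the first missing header can be sketched like this. The function name check_headers is hypothetical, and it takes captured response headers as input so it can be tested without a live server:

```shell
# Hypothetical helper: verify every expected hardening header is present in a
# captured curl -sSI response; report the first missing one and fail.
check_headers() {
  resp="$1"
  for h in X-Content-Type-Options X-Frame-Options Referrer-Policy Permissions-Policy; do
    printf '%s\n' "$resp" | grep -qi "^$h:" || { echo "missing header: $h"; return 1; }
  done
  echo "all expected hardening headers present"
}

# Live use:
#   check_headers "$(curl -sSI http://127.0.0.1/)"
```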
Current bundled installation profiles:
- custom: no bundled behavior defaults beyond the normal repo baseline
- quiet_gallery: slower pacing, gentler tone, and quiet-hours enabled
- shared_lab: balanced defaults for a recording kiosk plus a separate playback surface
- active_exhibit: quicker pacing, shorter slice windows, and more overlap
For the current reference host, the shortest repeatable path is:
sudo ./scripts/ubuntu_appliance.sh
./scripts/first_boot.sh --public-host memory.example.com --deploy
./scripts/status.sh
./scripts/doctor.sh

That gives the host three things the repo previously only implied:

- ufw enabled with SSH, HTTP, and HTTPS open
- a memory-engine-compose.service unit under /etc/systemd/system/
- Docker and the compose unit enabled for restart-on-boot
Common variants:
sudo ./scripts/ubuntu_appliance.sh --ssh-port 2222
sudo ./scripts/ubuntu_appliance.sh --start-now

If the host recipe changes materially, update both this runbook and UBUNTU_APPLIANCE.md together.
MinIO is part of the core storage path for raw audio, derivatives, backup, restore, and export, so this stack now treats image drift as an operational risk instead of a convenience.
Current default pinned images:
MINIO_SERVER_IMAGE=minio/minio:RELEASE.2025-04-22T22-12-26Z
MINIO_MC_IMAGE=minio/mc:RELEASE.2025-04-16T18-13-26Z
Those defaults live in .env.example, and docker-compose.yml uses them with
shell-fallback defaults so a missing local .env does not silently revert to
latest.
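A pre-deploy guard against accidental drift back to a floating tag can be sketched as follows. The function name check_minio_pins is hypothetical, and it assumes only the single-line KEY=value layout this repo uses in .env:

```shell
# Hypothetical guard: refuse to continue if either MinIO image variable in the
# given env file points at a :latest tag instead of a pinned release.
check_minio_pins() {
  envfile="$1"
  if grep -Eq '^(MINIO_SERVER_IMAGE|MINIO_MC_IMAGE)=.*:latest' "$envfile"; then
    echo "refusing: a MinIO image is set to latest in $envfile"
    return 1
  fi
  echo "MinIO image pins look intentional"
}

# Example: check_minio_pins .env || exit 1
```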
Upgrade posture:
- bump MinIO tags intentionally
- run ./scripts/check.sh
- run the release smoke or a real local compose bring-up
- only then deploy to a stewarded node
For a normal update on an existing server:
./scripts/update.sh --public-host memory.example.com

That is the default conservative path for an existing server. It will:
- Fast-forward pull the current branch from origin.
- Run ./scripts/check.sh.
- Run ./scripts/doctor.sh.
- Run ./scripts/backup.sh.
- Run ./scripts/deploy.sh --public-host ....
- Run ./scripts/status.sh.
Then open /ops/ and confirm the node is ready with no critical storage or pool warnings.
Sign in there with OPS_SHARED_SECRET; the dashboard now protects live operator controls behind that shared secret, optional trusted-network rules, login lockout, and browser-bound steward sessions.
That sequence is deliberately conservative. The extra backup step matters more here than squeezing a few seconds out of deploy time.
If you need to skip one phase intentionally:
./scripts/update.sh --public-host memory.example.com --skip-pull
./scripts/update.sh --public-host memory.example.com --skip-backup
./scripts/update.sh --public-host 203.0.113.10 --tls internal

There are four practical health surfaces:

- docker compose ps tells you whether the containers are running and whether Docker thinks health checks are passing.
- /healthz is the narrow API/dependency view and is the source used by the API container health check.
- /readyz is the broader cluster readiness view, including worker/beat heartbeat state.
- /ops/ is the human-facing dashboard for steward use during install or troubleshooting.

More on /ops/:

- /ops/ is now the authenticated steward surface. It exposes maintenance mode, pause-intake, pause-playback, and quieter-mode controls once the steward secret is accepted.
- /ops/ can also be narrowed to trusted IPs or CIDR ranges with OPS_ALLOWED_NETWORKS.
- repeated bad sign-in attempts now lock out temporarily based on OPS_LOGIN_MAX_ATTEMPTS and OPS_LOGIN_LOCKOUT_SECONDS.
- /ops/ also reports retention posture: raw audio still held, raw audio expiring soon, fossils retained, and fossils that now exist only as residue.
- /ops/ is also the place to run the deeper monitor check: output tone plus live mic pass-through, both local to the steward browser and never archived.
- For unattended listening machines, launch Chromium through ./scripts/browser_kiosk.sh --role room --base-url ... so the browser picks up the autoplay-safe flags instead of relying on a one-tap recovery after every reboot.
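Right after a deploy, it can help to wait on readiness in a bounded loop rather than polling by hand. A sketch follows; wait_ready is a hypothetical helper, and the probe command is a parameter so the loop itself stays testable. The commented call mirrors the in-container curl that scripts/status.sh uses:

```shell
# Hypothetical helper: run a probe command until it succeeds or attempts run out.
wait_ready() {
  probe="$1"; tries="${2:-30}"; delay="${3:-2}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if sh -c "$probe" >/dev/null 2>&1; then
      echo "ready after $i retries"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $tries attempts"
  return 1
}

# Live use:
#   wait_ready "docker compose exec -T api curl -fsS http://localhost:8000/readyz"
```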
Use this sequence before the public arrives:
- Run ./scripts/status.sh and ./scripts/doctor.sh.
- Open /ops/ and confirm the state reads ready or a known non-critical degraded state.
- Run the /ops/ output tone.
- Run live monitor only if local steward-browser routing needs proof.
- Open /kiosk/, /room/, and /revoke/ on their intended machines.
- Confirm intake and playback are not paused by accident.
Practical reminder:
- the /ops/ monitor proves the steward browser's local routing only
- it does not certify the dedicated kiosk recorder path
- it does not certify the separate room playback machine
Use this sequence when the session ends:
- Confirm no one is still recording and the room can fall quiet naturally.
- Check /ops/ for critical storage or queue warnings that should be handed off immediately.
- Use Clear session framing in /ops/ or /ops/bench/.
- Run ./scripts/session_close_archive.sh (or --to-usb /absolute/mount/path for USB handoff).
- Leave a short steward note with the printed backup/export paths.
- Use maintenance mode only if the node should stay explicitly out of service until the next steward returns.
Expected healthy services:
- proxy
- api
- db
- redis
- minio
- worker
- beat
minio_init is expected to complete and exit.
If the kiosk machine boots and the Leonardo path suddenly appears dead, check browser focus before checking firmware or wiring, whether the trigger is a panel button or footswitch.
The usual failure pattern is:
- the board still sends HID key events
- Chromium reopened with a restore prompt, permission chip, or browser chrome in front
- the kiosk surface is no longer the focused target for those key events
Recovery order:
- Confirm Chromium is frontmost on /kiosk/.
- Dismiss any restore or permission UI that may have appeared after boot.
- Test a real keyboard Space or Escape.
- If the keyboard works, the Leonardo path is almost certainly fine too.
- Relaunch via ./scripts/browser_kiosk.sh --role kiosk --base-url ... if the browser came back in a bad posture.
Do not debug the microcontroller first unless a normal keyboard also fails to move the kiosk.
Do this before first public deployment, and repeat it after major storage, retention, or infrastructure changes.
- Run ./scripts/backup.sh.
- Run ./scripts/export_bundle.sh --latest.
- Copy the newest backup directory or export bundle to a throwaway host or throwaway clone. Do not rehearse by overwriting the live node first.
- On that rehearsal target, bring up the stack and run ./scripts/restore.sh --from /path/to/backup-directory.
- Open /ops/, /kiosk/, /room/, and /revoke/ on the rehearsal target and confirm they still behave like a coherent appliance.
- Record the elapsed time, any missing secret or permission surprises, and any restore-only errors in the steward notes for that installation.
Rehearsal is only complete when:
- /ops/ signs in and reports an understood state
- the kiosk can still submit one test recording
- the room can still play restored audio
- the steward can point to the latest backup and export bundle without guessing
Quick compose commands if the helper script is not enough:
docker compose ps
docker compose logs --tail 100 api
docker compose logs --tail 100 worker
docker compose logs --tail 100 proxy

If the operator dashboard says degraded or broken, look at api first. If the API is healthy but playback is missing, inspect worker, beat, and minio.
Backups currently capture:
- Postgres metadata as postgres.sql.gz
- MinIO object data as minio-data.tgz
Each backup lands under backups/YYYYMMDD-HHMMSS/ with a small manifest file.
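When scripting against that layout, the newest snapshot can be located by name alone, since the timestamped directory names sort chronologically. The helper name latest_backup is hypothetical; it assumes only the documented backups/YYYYMMDD-HHMMSS layout:

```shell
# Hypothetical helper: print the newest timestamped backup directory under $1.
latest_backup() {
  ls -1d "$1"/[0-9]*-[0-9]* 2>/dev/null | sort | tail -n 1
}

# Example pairing with restore:
#   ./scripts/restore.sh --from "$(latest_backup backups)"
```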
Restore cautions:
- scripts/restore.sh replaces the current database contents.
- scripts/restore.sh replaces the current MinIO object store.
- scripts/restore.sh now takes that fresh pre-restore consistent backup automatically unless you pass --skip-snapshot.
- scripts/restore.sh also asks you to type RESTORE unless you pass --yes.
- Expect active playback and ingest to be interrupted during restore.
Export bundle notes:
- scripts/export_bundle.sh --latest packages the newest backup into exports/.
- scripts/export_bundle.sh --latest --to-usb /mount/point also copies that archive onto a mounted USB path, verifies SHA-256 parity, and writes a sidecar .sha256 file next to the copied archive.
- Each export includes the Postgres dump, MinIO archive, source manifest when available, a bundle manifest, CHECKSUMS.txt, IMPORT-INSTRUCTIONS.txt, and anonymized summary stats (anonymized-stats.json, plus an artifact-summary.json compatibility alias) when the API container is available.
- The unpacked export bundle is itself a valid scripts/restore.sh --from ... source directory, so the handoff format stays aligned with the existing restore flow.
- Use export bundles for migration, archival handoff, or off-machine storage where a single file is easier to manage than a backup folder.
USB handoff ritual (fossils + anonymized stats):
- Insert and mount the USB drive on the steward host.
- Run: ./scripts/export_bundle.sh --latest --to-usb /absolute/mount/path
- Confirm the script prints both:
  - USB copy created: ...
  - USB checksum file: ...
- Optional double-check on that same mount:
  - Linux: sha256sum -c /absolute/mount/path/memory-engine-export-*.tgz.sha256
  - macOS: shasum -a 256 /absolute/mount/path/memory-engine-export-*.tgz
- Eject the USB drive only after the checksum step succeeds.
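The Linux double-check can also be wrapped so a steward gets a single clear verdict. The function name verify_export is hypothetical; it assumes the sidecar file sits next to the archive and holds the digest in the usual sha256sum column layout:

```shell
# Hypothetical helper: recompute the archive digest and compare it against the
# sidecar .sha256 file written by the export script.
verify_export() {
  archive="$1"
  want="$(awk '{print $1; exit}' "${archive}.sha256")"
  have="$(sha256sum "$archive" | awk '{print $1}')"
  if [ "$want" = "$have" ]; then
    echo "checksum ok: $archive"
  else
    echo "checksum MISMATCH: $archive"
    return 1
  fi
}

# Example: verify_export /absolute/mount/path/memory-engine-export-20260317.tgz
```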
Audience presence sensing (optional):
- Presence sensing is off by default.
- To enable it, set PRESENCE_SENSING_ENABLED=1 in .env.
- Start the sensor service with the compose profile: docker compose --profile presence up -d presence_sensor
- Keep PRESENCE_CAMERA_DEVICE as a host device path (for compose mapping), such as /dev/video0.
- Use PRESENCE_CAMERA_SOURCE for the OpenCV capture source (/dev/video0 or 0).
- When enabled, /readyz and /ops/ include a presence component. If the webcam feed or sensor loop goes stale, readiness drops to degraded.
- This phase is motion-only (opencv frame differencing). It stores no video frames and only publishes aggregate presence state plus heartbeat timing.
- For ethics posture, signage language, Redis key details, and pilot boundary rules, use PRESENCE_SENSING.md.
Support bundle notes:
- scripts/support_bundle.sh writes into support-bundles/.
- It includes redacted environment values, compose status, doctor output, /healthz, /readyz, recent logs for the main services, and artifact-summary.json when the API container is available.
- It is meant for remote troubleshooting without handing over shell access or the raw .env.
This stack uses MinIO only as private object storage for raw audio and derivatives. It is not intended to be exposed publicly by default.
Where each setting lives:
- MINIO_ROOT_USER and MINIO_ROOT_PASSWORD are read by the minio container itself. These are the bootstrap admin credentials for the MinIO server.
- MINIO_ENDPOINT, MINIO_BUCKET, MINIO_ACCESS_KEY, and MINIO_SECRET_KEY are read by api, worker, beat, and minio_init.
- docker-compose.yml binds MinIO to 127.0.0.1:9000 and the MinIO console to 127.0.0.1:9001, so server-root access or an SSH tunnel is the normal way to inspect it directly.
What to set before the first deploy:
- Set strong values for MINIO_ROOT_USER and MINIO_ROOT_PASSWORD.
- Set MINIO_BUCKET to the bucket name you want the app to use. The default memory is fine unless you need a different naming scheme.
- Leave MINIO_ENDPOINT=http://minio:9000 if MinIO stays inside this compose stack. That internal service name is what the app expects.
Current repo behavior:
- The simplest supported path is to keep MINIO_ACCESS_KEY equal to MINIO_ROOT_USER.
- The simplest supported path is to keep MINIO_SECRET_KEY equal to MINIO_ROOT_PASSWORD.
- In that mode, minio_init uses those credentials to create the bucket on first boot, and the Django/Celery services use the same credentials to read and write objects afterward.
Current recommendation:
- For the simplest single-node installation, reusing the root-backed credentials is still acceptable.
- For a production or longer-lived installation, prefer a separate MinIO service identity for MINIO_ACCESS_KEY and MINIO_SECRET_KEY.
- That keeps the app off the MinIO admin identity and makes later credential rotation cleaner.
If you want to provision MinIO manually:
- You can create a separate MinIO user or service account yourself because you have root on the server.
- If you do that, set MINIO_ACCESS_KEY and MINIO_SECRET_KEY in .env to that non-root identity.
- That identity needs permission to read, write, list, and delete objects in MINIO_BUCKET.
- minio_init still tries to ensure the bucket exists using MINIO_ACCESS_KEY and MINIO_SECRET_KEY, so that identity also needs permission to create the bucket, or you need to create the bucket yourself before deploy.
When to change what:
- Before first deploy: set all MinIO env vars in .env.
- When rotating the MinIO root/admin credentials: update MINIO_ROOT_USER and MINIO_ROOT_PASSWORD, and also update MINIO_ACCESS_KEY / MINIO_SECRET_KEY if the app is still using the root identity.
- When rotating only the app/service identity: update MINIO_ACCESS_KEY and MINIO_SECRET_KEY, then re-run deployment so api, worker, beat, and minio_init pick up the new values.
- When changing MINIO_BUCKET: create the new bucket first or let minio_init create it, then redeploy the stack so all services point at the same place.
- When moving MinIO outside this compose stack: change MINIO_ENDPOINT to the external S3-compatible endpoint and verify network reachability from the api container.
Rotation notes:
- Django secret rotation: update DJANGO_SECRET_KEY in .env, redeploy, and expect session invalidation.
- Steward secret rotation: update OPS_SHARED_SECRET in .env, redeploy, and expect current /ops/ sessions to sign in again.
- Postgres password rotation: rotate POSTGRES_PASSWORD in both the db service and the application .env, then redeploy together.
- MinIO app/service credential rotation: update MINIO_ACCESS_KEY and MINIO_SECRET_KEY, ensure the MinIO identity already exists with bucket read/write/list/delete access, then redeploy.
- MinIO root/admin credential rotation: update MINIO_ROOT_USER and MINIO_ROOT_PASSWORD, and also update app credentials if the app still shares that same identity.
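Every rotation above boils down to rewriting one KEY=value line in .env and redeploying. A minimal in-place rewrite can be sketched like this; rotate_env_var is a hypothetical helper, and it assumes single-line KEY=value entries with no duplicate keys:

```shell
# Hypothetical helper: rewrite one KEY=value line in an env file, leaving
# every other line untouched.
rotate_env_var() {
  key="$1"; newval="$2"; envfile="$3"
  tmp="${envfile}.tmp"
  awk -v k="$key" -v v="$newval" -F '=' '
    $1 == k { print k "=" v; next }
    { print }
  ' "$envfile" > "$tmp" && mv "$tmp" "$envfile"
}

# Example: rotate_env_var OPS_SHARED_SECRET "$(openssl rand -hex 24)" .env
```

After any rotation, follow the table above for which services need a redeploy to pick up the new value.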
External S3-compatible migration notes:
- Pre-create the destination bucket and grant the app identity read, write, list, and delete permissions there.
- Copy object data from the existing MinIO bucket before changing .env.
- Update MINIO_ENDPOINT, MINIO_BUCKET, MINIO_ACCESS_KEY, and MINIO_SECRET_KEY.
- Run ./scripts/check.sh, then redeploy and confirm /healthz, /readyz, plus a real playback request from /room/.
- Keep the old MinIO data untouched until /ops/ reports healthy storage and the room has successfully played migrated audio.
Versioning and object-locking notes:
- Leave MinIO bucket versioning and object locking disabled by default in this stack.
- The current retention and revocation model expects real deletes to succeed for raw audio and derivatives.
- If policy ever requires object locking, treat that as a deeper storage-policy project rather than a flip-the-switch operator task.
Practical verification after deploy:
docker compose logs --tail 100 minio
docker compose logs --tail 100 minio_init
docker compose exec -T api curl -fsS http://localhost:8000/healthz
docker compose exec -T api curl -fsS http://localhost:8000/readyz

If you want to inspect the MinIO console directly on the server, use http://127.0.0.1:9001 locally on that machine or tunnel it over SSH.
Use this as the quick triage table before drilling into longer logs:
| Symptom | First place to look | First likely action |
|---|---|---|
| /ops/ says broken | failing warning card or dependency card | ./scripts/status.sh |
| Kiosk trigger appears dead after reboot | Chromium focus on /kiosk/ | relaunch with ./scripts/browser_kiosk.sh --role kiosk --base-url ... |
| Room is silent | /ops/ playback pause and /room/ autoplay posture | clear pause state, then relaunch room browser |
| Monitor path seems wrong | /ops/ output tone then live monitor | verify steward-browser mic permission and OS input device first; this does not prove the kiosk or room machines |
| Storage is critical | /ops/ storage card and host disk usage | run a backup, then clear non-essential local clutter intentionally |
| Restore is needed | latest backup directory and export bundle | rehearse first if time permits, then run ./scripts/restore.sh --from ... |
Check OPS_SHARED_SECRET in .env, then redeploy. scripts/first_boot.sh now generates that value automatically if it is still a placeholder.
If OPS_ALLOWED_NETWORKS is set, also confirm the current steward machine IP falls inside one of those ranges.
If repeated attempts were made with the wrong secret, wait for the OPS_LOGIN_LOCKOUT_SECONDS window to expire before retrying.
The browser microphone API usually requires https:// or localhost. A plain remote http://IP/... URL often renders the page but blocks recording.
Check service order and dependency state:
- db health
- redis health
- minio reachability
- MinIO bucket and credentials in .env
- api logs for migration or environment errors
The API is up, but broader cluster work is degraded. Check:
- worker and beat service state
- shared Redis cache / broker reachability from all processes
- stale worker/beat warnings in /ops/
- worker logs for failed derivative or expiry tasks
Check /ops/ first. If artifact counts are low, the system may be behaving correctly and just has little material to work with. If counts are healthy, inspect the browser kiosk and worker logs.
Refresh the browser kiosk and re-run ./scripts/status.sh. If the containers restarted cleanly, stale browser state is usually the issue before backend state is.
Treat this as a stewardship problem first, not a pool-tuning problem.
- Run ./scripts/backup.sh before changing too much.
- Confirm whether the pressure is on the host volume, MinIO data, or accumulated support/export artifacts.
- Move old support bundles and old copied exports off-machine if they are only lingering on the host for convenience.
- Do not delete active MinIO or Postgres data by hand unless you are already in a restore or migration procedure.
- .env is operator-owned state. Treat it as part of deployment, not source control.
- backups/ should be copied off-machine if the installation matters.
- docs/roadmap.md tracks future improvements; this file is for recurring operations, not product planning.