Server log analytics for AI bot detection, AI agent monitoring, and AI visibility — including the traffic your JavaScript analytics never shows.
Most analytics tools count pageviews. Logwick classifies who made the request: real users, known crawlers, AI training scrapers, AI user-fetch agents, archive bots, and suspicious automated traffic — straight from your edge logs, on your own machine, with no data leaving your infrastructure. No GA, Plausible, or client-side snippet required — if you can get JSONL from your CDN or origin, you get the full picture.
Most AI crawlers and fetch agents don't run JavaScript, so they never appear in GA or Plausible. Raw HTTP logs are the only place to catch them, and Logwick does exactly that.
Run npm run dashboard-api, then open http://127.0.0.1:8787/ (static UI + read-only /api/*, 127.0.0.1 only — no data leaves your machine). See docs/dashboard.md.
Summary, timeseries, and traffic taxonomy.
Drill-down — flow Sankey and the session list.
You're flying blind on ~40% of your traffic. GA and similar tools see only browser sessions. Bots, AI crawlers, and automated clients skip JavaScript entirely — they're invisible in your dashboard but very real in your server logs.
Your logs already have the answer. Logwick processes the JSONL your CDN or edge already produces, classifies every request against a multi-phase ruleset (UA patterns, path heuristics, behavioral signals), and rolls it up into sessions you can explore — all from server-side logs, not a tracking tag on the page.
Stays on your machine. SQLite on disk, dashboard on localhost. No SaaS account, no data upload, no vendor lock-in. You control retention. Logwick never phones home.
Logwick is server-side analytics built for AI bot traffic monitoring, AI crawler detection, and AI agent monitoring from your own server logs — cookieless, self-hosted, no tracking tag.
AI visibility (from logs, not rank trackers). Many “AI visibility” products measure whether your brand appears in ChatGPT or Perplexity answers. Logwick answers the upstream question: which AI assistants and crawlers actually fetch your pages, and which URLs they hit — GPTBot, ClaudeBot, ChatGPT-User, Perplexity, Gemini Deep Research, and dozens more. That is the server-side signal behind visibility: who is reading your site so an AI can answer someone else.
Logwick splits AI traffic into three intents (not one generic “bot” bucket):
| AI traffic type | Examples | What you learn |
|---|---|---|
| Training corpus | GPTBot, ClaudeBot, CCBot | Who is scraping pages for model training |
| User-fetch | ChatGPT-User, Perplexity, Gemini Deep Research | Real-time fetches when a user asks an AI — the strongest AI visibility signal in raw HTTP logs |
| Search index | OAI-SearchBot, PerplexityBot (on robots/sitemap) | AI search indexers probing your site |
You also get GPTBot / ChatGPT bot tracking at vendor level, bot detection from server logs (“who is crawling my website?”), and Sankey flows that map each AI vendor to the exact paths it fetched.
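As a rough illustration of the three-intent split, the sketch below maps a User-Agent string to an intent by substring match. The rule table, the `classifyAiIntent` name, and the `Perplexity-User` token are assumptions made for this example; Logwick's actual ruleset is multi-phase and lives in the classification docs.

```javascript
// Hypothetical rule table. Most tokens come from the table above, but the
// structure and matching strategy are illustrative, not Logwick's ruleset.
const AI_INTENT_RULES = [
  // Rules are checked in order; the first matching rule wins.
  { intent: 'search-index', tokens: ['OAI-SearchBot', 'PerplexityBot'] },
  { intent: 'user-fetch', tokens: ['ChatGPT-User', 'Perplexity-User'] },
  { intent: 'training', tokens: ['GPTBot', 'ClaudeBot', 'CCBot'] },
];

/**
 * Return the AI intent for a User-Agent string, or null when no rule matches.
 * Matching is a case-insensitive substring test.
 */
function classifyAiIntent(userAgent) {
  const ua = String(userAgent ?? '').toLowerCase();
  for (const rule of AI_INTENT_RULES) {
    if (rule.tokens.some((token) => ua.includes(token.toLowerCase()))) {
      return rule.intent;
    }
  }
  return null;
}
```

A User-Agent match alone is easy to spoof, which is presumably why the real classifier also weighs path heuristics and behavioral signals.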
| Capability | Details |
|---|---|
| AI & bot classification | AI crawler detection and AI scraper identification (GPTBot, ClaudeBot, Common Crawl…), plus search crawlers, link-preview bots, archive scrapers, and security probes — split by training / user-fetch / search index, with explainable rules (classification, traffic taxonomy) |
| Session rollups | Group requests into sessions with idle-timeout logic; engagement-style signals from HTTP behavior (methods, paths, timing, response sizes) |
| Social sharing analysis | See when your pages are shared on social and messaging platforms — link-preview bots (Facebook, LinkedIn, Slack, Telegram, Discord, X/Twitter…) mapped to the exact URLs they fetch, in a dedicated sharing flow (traffic taxonomy, dashboard) |
| Local-first storage | Single SQLite file, idempotent ingestion — all processing stays on your machine |
| Explore & drill down | Read-only JSON API over SQLite (summary, timeseries, breakdowns, sessions) on 127.0.0.1 — no auth token needed (dashboard API) |
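The idle-timeout idea behind session rollups can be sketched as follows. This is a minimal sketch, not Logwick's implementation: the 30-minute timeout, the `sessionize` name, and the `{ clientKey, ts, path }` request shape are all assumptions.

```javascript
// Assumed idle timeout; the value Logwick uses is not stated here.
const IDLE_TIMEOUT_MS = 30 * 60 * 1000;

/**
 * Group requests into sessions per client key, starting a new session
 * whenever the gap since that client's previous request exceeds idleMs.
 * Each request is assumed to look like { clientKey, ts, path }, with ts in
 * epoch milliseconds (hypothetical field names).
 */
function sessionize(requests, idleMs = IDLE_TIMEOUT_MS) {
  // Sort a copy by time so the caller's array is left untouched.
  const sorted = [...requests].sort((a, b) => a.ts - b.ts);
  const openSessions = new Map(); // clientKey -> most recent session
  const sessions = [];

  for (const req of sorted) {
    const current = openSessions.get(req.clientKey);
    if (current && req.ts - current.lastTs <= idleMs) {
      // Still within the idle window: extend the open session.
      current.lastTs = req.ts;
      current.paths.push(req.path);
    } else {
      // First request from this client, or the idle window has expired.
      const session = {
        clientKey: req.clientKey,
        firstTs: req.ts,
        lastTs: req.ts,
        paths: [req.path],
      };
      openSessions.set(req.clientKey, session);
      sessions.push(session);
    }
  }
  return sessions;
}
```

The client key would typically combine fields such as IP address and User-Agent, since server logs carry no cookie to identify a visitor.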
| Who | What you can answer |
|---|---|
| SEO / content teams | AI visibility and crawl mix over time — which AI assistants fetch your content, plus search bots, link previews, and archives |
| Infra & SRE | Suspicious paths, probe-like traffic, and automated clients alongside “normal” requests, without shipping logs to a vendor |
| Indie hackers & small sites | Know exactly who's on your site without GA, Plausible, or any client-side snippet |
| SaaS & API products | Patterns that look like scraping or bulk automated use, grounded in HTTP facts rather than pageviews |
Logwick is server-side analytics that sits between privacy-first web analytics and AI / bot traffic dashboards: it reads your raw server / edge logs instead of a client-side tag — so it catches the automated traffic those tools never see, and keeps everything on your own machine.
| Tool | Data source | Catches AI agents & bots that skip JS | Self-hosted / local | Open source |
|---|---|---|---|---|
| Logwick | Server / edge JSONL logs | ✅ UA + path + behavior taxonomy | ✅ SQLite + localhost | ✅ AGPL-3.0 |
| Cloudflare AI Crawl Control | Cloudflare edge | ✅ AI crawlers | ❌ SaaS, Cloudflare-only | ❌ |
| Dark Visitors | Tracking + agent list | ✅ AI agents | ❌ SaaS | ❌ |
| GoAccess | Server logs | ✅ | ✅ | ✅ MIT |
| GA4 / Plausible / Umami / Matomo (JS) | Client-side JS tag | ❌ bots & AI don't run JS | varies | varies |
If you already run Cloudflare AI Crawl Control or a JavaScript analytics tool, Logwick complements them: a local, vendor-neutral view of the same traffic, rebuilt from your own logs and never leaving your infrastructure.
- Is Logwick an open-source / self-hosted alternative to Cloudflare AI Crawl Control? Yes — AI bot traffic monitoring and crawler analytics from any edge or CDN that can emit JSONL (not only Cloudflare), runs locally, and is AGPL-3.0.
- Does Logwick measure “AI visibility” like GEO rank trackers? Not LLM answer rankings. Logwick shows which AI agents and crawlers fetch your URLs from server logs — the upstream traffic signal (training crawl vs user-fetch vs search index), including which pages ChatGPT-User or Perplexity hit in real time.
- How do I track GPTBot or ChatGPT on my site? Ingest your edge JSONL, run process, open the dashboard: GPTBot, ChatGPT-User, and other vendors are classified by family with per-path breakdowns and AI user-fetch Sankey flows.
- How is it different from GoAccess or other log analyzers? Logwick adds a multi-phase traffic taxonomy (humans, search crawlers, AI training/user-fetch, link previews, security probes) and rolls requests into sessions, instead of only counting hits.
- Why not just use Google Analytics, Plausible, or Umami? Those rely on a JavaScript tag, and AI agents and most bots don't execute JavaScript — they're invisible there. Logwick reads server logs, so it sees them.
- Does any data leave my machine? No. Processing is local, storage is a single SQLite file, and the dashboard binds to 127.0.0.1.
```bash
npm install # from the repository root after clone
npm run process -- --config config/process.example.json --target-id demo --db data/analytics/http-analytics.db --input path/to/logs.jsonl
npm run dashboard-api -- --db data/analytics/http-analytics.db --port 8787
# browser: http://127.0.0.1:8787/
curl -s "http://127.0.0.1:8787/api/summary"
```

You can set LOGWICK_ANALYTICS_DB to the .db path and omit --db on the API (see Environment variables). Full setup: docs/getting-started.md.
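The same summary endpoint can be read from a Node.js script. The sketch below assumes the dashboard API is running on the default port; the helper names `apiUrl` and `fetchSummary` are hypothetical, and the shape of the returned JSON is not assumed (see the dashboard API docs for that).

```javascript
// Base URL from the quick start; "/api/summary" is the endpoint used with curl.
const DEFAULT_BASE_URL = 'http://127.0.0.1:8787';

/** Build an absolute URL for a dashboard API endpoint such as "/api/summary". */
function apiUrl(endpoint, baseUrl = DEFAULT_BASE_URL) {
  return new URL(endpoint, baseUrl).toString();
}

/**
 * Fetch and parse the summary JSON. fetchImpl is injectable so the function
 * can be tested without a server; it defaults to Node's global fetch.
 */
async function fetchSummary(baseUrl = DEFAULT_BASE_URL, fetchImpl = fetch) {
  const response = await fetchImpl(apiUrl('/api/summary', baseUrl));
  if (!response.ok) {
    throw new Error(`Dashboard API returned HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (with the dashboard API running):
// fetchSummary().then((summary) => console.log(summary));
```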
Documentation lives in docs/ (architecture, CLI, dashboard API, classification, persistence).
Drop your logs in — get a dashboard of humans vs bots in seconds.
Your edge (Nginx, Caddy, Cloudflare, Fastly — anything that writes JSONL) is already recording every request. Logwick takes that file, classifies each entry, sessionizes the traffic, and stores the result in a local SQLite database. A read-only HTTP API on 127.0.0.1 serves a static browser dashboard and JSON metrics for exploration.
```text
Your edge / CDN                    Logwick pipeline
┌─────────────┐                   ┌──────────────────────────────────┐
│ Access log  │   JSONL on disk   │ classify → sessionize →          │
│ (JSONL)     │  ──────────────►  │ SQLite → dashboard API           │
└─────────────┘                   │ (127.0.0.1:8787)                 │
                                  └──────────────────────────────────┘
```
Log fetch and shipping from your edge are not in this repo — you add ingest when you need it. That keeps the project small and deployment-agnostic.
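Because getting logs onto disk is left to you, it helps to know what the input side tolerates. The sketch below shows one way to read JSONL defensively and to make re-ingestion a no-op. It is illustrative only: `parseJsonl` and `ingest` are hypothetical names, and an in-memory Map stands in for the SQLite store, where a unique key would more likely do this job.

```javascript
/**
 * Parse JSONL text into records, skipping blank and malformed lines.
 * Returns the parsed records plus a count of lines that failed to parse.
 */
function parseJsonl(text) {
  const records = [];
  let skipped = 0;
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === '') continue; // blank lines are not errors
    try {
      records.push({ raw: trimmed, entry: JSON.parse(trimmed) });
    } catch {
      skipped += 1; // truncated or corrupt line
    }
  }
  return { records, skipped };
}

/**
 * Idempotent ingest into an in-memory store keyed by the raw line.
 * The Map only illustrates that re-ingesting the same file adds nothing.
 */
function ingest(store, text) {
  const { records, skipped } = parseJsonl(text);
  let inserted = 0;
  for (const { raw, entry } of records) {
    if (!store.has(raw)) {
      store.set(raw, entry);
      inserted += 1;
    }
  }
  return { inserted, skipped };
}
```

Keying on the raw line would also collapse two genuinely identical log lines, so a real pipeline would want a key that includes a request ID or a high-resolution timestamp.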
JSONL in → classify & sessionize → SQLite → explore locally.
| Area | In short |
|---|---|
| Runtime & repo | Node.js 20+, npm workspaces (apps/*, packages/*), ES modules |
| Pipeline & data | JSONL → CLI process → SQLite (better-sqlite3) |
| Dashboard | Read-only JSON API on Node http plus static UI at http://127.0.0.1:8787/ |
| Rules & quality | JSON Schema / Ajv, YAML registry → generated rules, ESLint 9, node --test |
Per-package dependencies and tooling: docs/tech-stack.md.
CHANGELOG.md — release history (current: 1.1.0).
HTTP logs often include IP addresses and User-Agents (personal data in many jurisdictions). Logwick never phones home — all processing happens locally, and retention and access control are yours to define. Keep secrets in env and local config — not in git.
See MAINTAINERS.md for contact, commercial inquiries, and the @http-logs/* package naming note.
Workspace packages are published under the npm scope @http-logs/*; the product name is Logwick.
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). The full text is in LICENSE. Root package.json declares license: AGPL-3.0-only (SPDX).
For commercial or closed-source use, a commercial license is required. If you use Logwick inside a company or in a product without releasing your modifications under the AGPL, contact hi@r-sun.ai (Raising Sun s.r.o.) for a commercial license.
| Offering | What you get |
|---|---|
| Commercial license | Use without AGPL obligations inside your company or product |
| Extended detection signatures | Broader bot/AI patterns and heuristics, updated on a cadence |
| Integration & consulting | Pipeline design, customization, training |
| Custom support | Priority support, bug fixes, feature requests |
Third-party npm dependencies remain under their own licenses; they are not automatically AGPL. Audit node_modules or use a license tool before you ship a product build.
- CONTRIBUTING.md — workspaces, lint boundaries, how to contribute.
- SECURITY.md — responsible disclosure (do not file public issues for sensitive reports).

