Neuroglancer Tourguide

A 3D microscopy viewer with built-in structured-data browsing, plain-English queries powered by Claude / Gemini / OpenAI / local Ollama, and Python analysis (mesh-based or voxel-based, in-browser or on a cloud backend).

This repo contains two flavors of tourguide. Pick the one that fits your situation.

🌐 Web tourguide `web-app/`

Static web app — Neuroglancer embedded in the page, AI agent for natural-language queries and analysis, share links, optional cloud compute backend. Anyone can use it; nothing to install.

Live: https://tourguide-8j4.pages.dev
Docs: web-app/README.md
Cloud analysis backend (optional, for big data): hf-space/README.md

Best for: trying things out, exploratory analysis, day-to-day use, and sharing views or data with anyone via a URL.

What it does:

Loads zarr / n5 / Neuroglancer precomputed datasets directly from S3 / GCS / local folders
Natural-language queries against organelle CSVs ("show the largest mito", "plot volume distributions")
Agent-generated Python analysis (regionprops, cc3d, custom code) — in-browser via Pyodide or on the HF Space for bigger volumes
Share-link with NG state + computed tables embedded; persists across browser refreshes
One-click "Copy NG link" for sharing just the viewer state with non-tourguide users
Bring your own AI key (Gemini free tier works great), or run an in-browser model via WebLLM
Can be run fully on-prem (vite preview + local uvicorn for analysis + local Ollama for LLM) — no cloud required

🖥️ Sidecar tourguide `server/`

Python service that runs alongside a local Neuroglancer process, streams screenshots, narrates them with local TTS, and records narrated tour videos. Originally the only flavor; preserved for the workflows the web app doesn't (yet) cover.

Best for: making narrated tour movies, voice cloning with Chatterbox, fully on-prem GPU workflows, batch tour generation.

What it does (in addition to the web app's features):

Voice narration with Chatterbox cloning (GPU TTS)
Movie recording with synchronized narration + multiple transition modes
Local Ollama integration on a GPU box (the web app supports this too via the OpenAI-compatible backend; the sidecar adds Janelia-cluster-friendly conventions on top)

Setup + usage instructions are below ⬇

Features

Live Screenshot Streaming: Debounced 0.1-5 fps JPEG streaming
State Tracking: Position, zoom, orientation, layer visibility, and segment selection
WebSocket Updates: Real-time updates to browser panel
AI Narration: Context-aware descriptions using cloud (Gemini/Claude) or local (Ollama) AI
Natural Language Query: Ask questions about organelles in plain English
Agent-Driven Visualization: AI interprets queries to show/hide segments intelligently
AI-Powered Analysis Mode: Generate and execute Python code for data analysis via natural language
Voice Synthesis: Browser-based TTS or edge-tts with multiple voices
Movie Recording: Record navigation sessions with synchronized narration
Multiple Transition Modes: Direct cuts, crossfade, or smooth state interpolation
Responsive UI: Clean dark theme with status indicators and narration history
Explore Mode with Verbose Logging: Real-time progress tracking shows screenshot capture, AI narration generation, and audio synthesis status

Quick Start

Installation with pixi (recommended)

# Install dependencies with pixi
pixi install

# Start the server
pixi run start

# Or with custom settings
pixi run python server/main.py --ng-port 9999 --web-port 8090 --fps 2

Alternative: Installation with pip

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r server/requirements.txt

# Start the server
python server/main.py

Usage

Just open one URL: http://localhost:8090/

The web panel now includes:

- Embedded Neuroglancer viewer (left) with sample EM data pre-loaded
- Explore Mode (default, right panel):
  - Screenshots tab: Live screenshots with AI narrations as you navigate
  - Verbose Log tab: Real-time progress tracking (📸 Screenshot captured → 📤 Sent to AI → ⏳ Waiting → ✅ Narration received → 🔊 Audio generated)
- Query Mode: Natural language questions about organelles with AI-driven visualization
- State tracking: Position, zoom, layers, selections
- Recording controls: Capture and compile narrated tours with multiple transition modes

Navigate in the embedded viewer and watch the live stream update automatically!

Natural Language Queries

Ask questions about organelles in plain English:

Examples:

"show the largest mitochondrion"
"how many nuclei are there?"
"take me to the smallest peroxisome"
"show mitochondria larger than 1e11 nm³"
"also show nucleus 5" (adds to current selection)
"hide all mitochondria" (removes from view)

The AI agent:

Converts your question to SQL
Queries the organelle database
Interprets the results based on query semantics
Updates the visualization intelligently
Provides a natural language answer with voice narration
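
As a rough illustration of the "query the organelle database" step, the sketch below runs the kind of SQL the agent might produce for "show the largest mito". The table name `organelles`, its columns, and the sample rows are hypothetical and exist only for this example; the real schema is defined in server/organelle_db.py and may differ.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE organelles (object_id INTEGER, organelle_type TEXT, volume_nm3 REAL)"
)
conn.executemany(
    "INSERT INTO organelles VALUES (?, ?, ?)",
    [(1, "mito", 2.5e11), (2, "mito", 8.0e10), (3, "nucleus", 4.0e12)],
)


def largest(conn, organelle_type):
    """Return (object_id, volume_nm3) of the largest object of one type, or None."""
    return conn.execute(
        "SELECT object_id, volume_nm3 FROM organelles "
        "WHERE organelle_type = ? ORDER BY volume_nm3 DESC LIMIT 1",
        (organelle_type,),
    ).fetchone()


result = largest(conn, "mito")
```

The organelle type is passed as a bound parameter instead of being formatted into the SQL string, which is a sensible default whenever part of the query comes from user input.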

See AGENT_DRIVEN_VISUALIZATION.md for technical details.

Analysis Mode

Switch to Analysis Mode to generate and execute Python code for data analysis using natural language:

Examples:

"Plot the volume distribution of mitochondria"
"Show me a histogram of nucleus sizes"
"Create a scatter plot comparing mitochondria volume vs surface area"

The AI analysis agent:

Converts your question to Python code
Executes the code in a sandboxed container (Docker or Apptainer)
Displays generated plots and statistics
Tracks session metadata and timing information

Container Support:

Docker: Default for most systems
Apptainer: Automatic fallback for HPC/cluster environments
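
The Docker-first, Apptainer-fallback behavior could be approximated as below. This is a minimal sketch, assuming detection is a simple PATH lookup; `pick_runtime` is a hypothetical helper, and the actual logic in server/docker_sandbox.py and server/apptainer_sandbox.py may check more than this (for example, whether the Docker daemon is reachable).

```python
import shutil


def pick_runtime(which=shutil.which):
    """Prefer Docker; fall back to Apptainer; raise if neither is on PATH.

    `which` is injectable so the lookup can be tested without either tool installed.
    """
    for name in ("docker", "apptainer"):
        if which(name):
            return name
    raise RuntimeError("no container runtime found (looked for docker, apptainer)")
```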

See ANALYSIS_MODE.md for technical details and API documentation.

Recording Tours

- Start Recording: Click "Start Recording" to begin capturing frames
- Navigate: Explore the dataset - narration triggers automatically on significant view changes
- Stop Recording: Click "Stop Recording" when done
- Create Movie: Choose transition style and click "Create Movie"
  - Direct Cuts: Instant transitions with 2-second silent pauses
  - Crossfade: Smooth dissolve transitions between views
  - State Interpolation: Neuroglancer renders smooth camera movements

Movies are saved to recordings/<session_id>/output/movie.mp4 with:

960x540 resolution
Frame duration matches audio narration length
2-second silent transitions between narrations
Synchronized audio track
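
The timing rules above (each frame held for the length of its narration, with 2-second silent gaps) amount to simple arithmetic. The sketch below assumes the silent pause is inserted only between consecutive narrations, not before the first or after the last; `build_timeline` is a hypothetical helper, not a function from server/recording.py.

```python
def build_timeline(narration_seconds, pause_seconds=2.0):
    """Return (segments, total), where each segment is (start, end) in seconds."""
    segments = []
    t = 0.0
    for i, duration in enumerate(narration_seconds):
        if i > 0:
            t += pause_seconds  # silent transition between narrations
        segments.append((t, t + duration))
        t += duration
    return segments, t


segments, total = build_timeline([4.0, 6.5, 3.0])
# segments == [(0.0, 4.0), (6.0, 12.5), (14.5, 17.5)], total == 17.5
```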

See QUICKSTART.md for a detailed usage guide.

Architecture

Stage 1: State Capture ✅

Neuroglancer viewer with state change callbacks
Summarizes position, zoom, orientation, layers, and selections
Filters meaningful changes to avoid spam
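
One way to filter out insignificant changes is to compare consecutive state summaries against thresholds. This is a sketch under assumptions: the dictionary keys (`position`, `zoom`) and both threshold values are made up for illustration and are not taken from server/ng.py.

```python
import math


def is_meaningful_change(prev, curr, min_move=100.0, min_zoom_ratio=1.2):
    """True if the view moved at least min_move units or zoom changed by min_zoom_ratio.

    Assumes zoom values are positive. The first state (prev is None) always counts.
    """
    if prev is None:
        return True
    moved = math.dist(prev["position"], curr["position"])
    ratio = max(prev["zoom"], curr["zoom"]) / min(prev["zoom"], curr["zoom"])
    return moved >= min_move or ratio >= min_zoom_ratio
```

A real filter would probably also treat layer visibility and segment selection changes as meaningful, since those are part of the tracked state.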

Stage 2: Screenshot Loop ✅

Background thread captures screenshots when viewer state is "dirty"
Converts PNG to JPEG for bandwidth efficiency
Debounced to max 2 fps (configurable)
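
The dirty-flag plus debounce behavior can be sketched as a small gate object. `ScreenshotDebouncer` is a hypothetical name and this is not the code in server/; it only shows the idea that a capture happens when the view has changed and at least 1/fps seconds have passed since the last capture.

```python
import time


class ScreenshotDebouncer:
    """Allow a capture only when the view is dirty and the minimum interval has passed."""

    def __init__(self, max_fps=2.0, clock=time.monotonic):
        self.min_interval = 1.0 / max_fps
        self.clock = clock  # injectable so the gate can be tested without sleeping
        self.dirty = False
        self.last_capture = None

    def mark_dirty(self):
        """Call from the viewer's state-change callback."""
        self.dirty = True

    def should_capture(self):
        """Return True at most max_fps times per second, and only if dirty."""
        if not self.dirty:
            return False
        now = self.clock()
        if self.last_capture is not None and now - self.last_capture < self.min_interval:
            return False  # too soon; stay dirty and try again on the next poll
        self.dirty = False
        self.last_capture = now
        return True
```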

Stage 3: WebSocket Streaming ✅

FastAPI server with WebSocket endpoint
Sends {type: "frame", jpeg_b64: "...", state: {...}} messages
Browser displays live frames and state summary
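
The frame message shape above can be produced and parsed with the standard library alone. The helper names below are hypothetical, and the contents of `state` are whatever summary the server chooses to send; only the three top-level keys come from the message format described here.

```python
import base64
import json


def make_frame_message(jpeg_bytes, state):
    """Serialize one frame as the JSON text sent over the WebSocket."""
    return json.dumps({
        "type": "frame",
        "jpeg_b64": base64.b64encode(jpeg_bytes).decode("ascii"),
        "state": state,
    })


def read_frame_message(text):
    """Inverse of make_frame_message: return (jpeg_bytes, state)."""
    msg = json.loads(text)
    if msg.get("type") != "frame":
        raise ValueError("not a frame message")
    return base64.b64decode(msg["jpeg_b64"]), msg["state"]
```

Base64 inflates the payload by roughly a third, which is one reason converting PNG to JPEG before sending helps keep bandwidth down.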

Stage 4: AI Narrator ✅

Triggers narration on meaningful state changes
Uses Gemini, Claude, or local Ollama to describe current view
Context-aware prompts for EM/neuroanatomy
Real-time WebSocket broadcasting to all clients
Configurable thresholds and intervals

Stage 5: Voice & TTS ✅

Browser-based TTS or edge-tts with multiple voices
Automatic audio playback in browser
Audio synchronized with narration display
Saved to recordings for movie compilation

Stage 6: Movie Recording ✅

Record navigation sessions with frame capture
Three transition modes: cuts, crossfade, interpolation
Frame duration matches narration audio length
2-second silent transitions between narrations
FFmpeg-based video compilation with audio sync
Neuroglancer video_tool integration for smooth camera movements
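
To give a feel for what state interpolation means, the sketch below linearly interpolates only the camera position between two keyframes. It is a simplification for illustration: in Tourguide the smooth movement is rendered by Neuroglancer's video_tool, which also has to handle orientation and zoom, and `lerp_position` is a hypothetical helper.

```python
def lerp_position(start, end, n_frames):
    """Return n_frames positions from start to end inclusive, linearly spaced."""
    if n_frames < 2:
        raise ValueError("need at least 2 frames")
    return [
        [a + (b - a) * i / (n_frames - 1) for a, b in zip(start, end)]
        for i in range(n_frames)
    ]


path = lerp_position([0, 0, 0], [100, 200, 0], 5)
# path[2] is the midpoint: [50.0, 100.0, 0.0]
```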

Stage 7: Natural Language Query System ✅

SQLite database for organelle metadata (volume, position, etc.)
AI-powered natural language to SQL conversion
Multi-query support with automatic splitting
Intent classification: navigation, visualization, or informational
Agent-driven visualization state updates
Semantic understanding: "show X" vs "also show X" vs "hide X"
Context-aware command generation using current viewer state
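
The difference between "show X", "also show X", and "hide X" maps naturally onto set operations over the visible segment IDs. The sketch below assumes that a plain "show" replaces the current selection (the examples earlier only state that "also show" adds and "hide" removes); the intent labels and `apply_intent` are hypothetical names.

```python
def apply_intent(visible, intent, segment_ids):
    """Return the new set of visible segment IDs for one query intent."""
    ids = set(segment_ids)
    if intent == "show":        # assumed: replace the current selection
        return ids
    if intent == "also_show":   # add to the current selection
        return set(visible) | ids
    if intent == "hide":        # remove from the current selection
        return set(visible) - ids
    raise ValueError(f"unknown intent: {intent!r}")
```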

Stage 8: Analysis Mode ✅

Natural language to Python code generation
Sandboxed code execution (Docker/Apptainer)
Interactive plot generation and visualization
Session metadata tracking with timing breakdown
Comprehensive results management with REST API
Automatic container detection for HPC environments

Project Structure

tourguide/
├── server/
│   ├── main.py             # Entry point
│   ├── ng.py               # Neuroglancer viewer + state tracking
│   ├── stream.py           # FastAPI WebSocket server + query/analysis endpoints
│   ├── narrator.py         # AI narration engine
│   ├── query_agent.py      # Natural language query agent
│   ├── analysis_agent.py   # Natural language to Python code agent
│   ├── docker_sandbox.py   # Docker container sandbox
│   ├── apptainer_sandbox.py # Apptainer container sandbox
│   ├── analysis_results.py # Analysis session metadata manager
│   ├── organelle_db.py     # SQLite database for organelle metadata
│   ├── recording.py        # Movie recording and compilation
│   └── requirements.txt    # Legacy pip requirements
├── web/
│   ├── index.html      # Web UI with recording and analysis controls
│   ├── app.js          # WebSocket client + recording + analysis logic
│   ├── style.css       # Styling with spinner animations
│   └── ng-screenshot-handler.js  # Neuroglancer screenshot capture
├── organelle_data/     # Organelle CSV files and database (gitignored)
├── analysis_results/   # Analysis session outputs (gitignored)
├── containers/         # Container images (gitignored)
├── recordings/         # Recorded sessions (auto-created)
├── pixi.toml           # Pixi environment config
├── AGENT_DRIVEN_VISUALIZATION.md  # Agent visualization docs
├── ANALYSIS_MODE.md    # Analysis mode documentation
└── README.md

Configuration

Command-line Arguments

--ng-host HOST        Neuroglancer bind address (default: 127.0.0.1)
--ng-port PORT        Neuroglancer port (default: 9999)
--web-host HOST       Web server bind address (default: 0.0.0.0)
--web-port PORT       Web server port (default: 8090)
--fps FPS             Maximum screenshot frame rate (default: 2)
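
For reference, the flags and defaults listed above could be declared with argparse roughly as follows. The flag names and default values come from the list; the argument types are assumptions (in particular, `--fps` is parsed as a float here because the Features section mentions rates as low as 0.1 fps), and this is not a copy of server/main.py.

```python
import argparse


def build_parser():
    """Build a parser with the documented flags and defaults (types are assumed)."""
    parser = argparse.ArgumentParser(description="Neuroglancer Tourguide sidecar")
    parser.add_argument("--ng-host", default="127.0.0.1", help="Neuroglancer bind address")
    parser.add_argument("--ng-port", type=int, default=9999, help="Neuroglancer port")
    parser.add_argument("--web-host", default="0.0.0.0", help="Web server bind address")
    parser.add_argument("--web-port", type=int, default=8090, help="Web server port")
    parser.add_argument("--fps", type=float, default=2, help="Maximum screenshot frame rate")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--web-port", "8091", "--fps", "0.5"])
```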

Development Stages

Using AI Narration

Option 1: Cloud AI (Gemini - Recommended)

Get a free API key from https://aistudio.google.com/app/apikey
Create a .env file:
```
cp .env.example .env
```
Add your API key to .env:
```
GOOGLE_API_KEY=your_api_key_here
```
Start the server:
```
pixi run start
```

Option 2: Local AI (Ollama + Kokoro TTS - No API Key!)

For completely local, private, and free AI narration with voice:

Install Ollama from ollama.com
Download the vision model:
```
ollama pull llama3.2-vision
```
Install TTS (optional):
```
pixi run pip install kokoro soundfile sounddevice
```
Enable local mode in .env:
```
USE_LOCAL=true
```
Start the server:
```
pixi run start
```

See LOCAL_SETUP.md for detailed local setup instructions.

Option 3: Cloud AI (Claude/Anthropic)

Use ANTHROPIC_API_KEY in .env instead of GOOGLE_API_KEY.
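
The three options above are all driven by variables in .env. A minimal sketch of how a backend could be chosen from them is shown below; the variable names come from this README, but the precedence (local first, then Gemini, then Claude) and the `pick_provider` helper are assumptions, not the server's documented behavior.

```python
import os


def pick_provider(env=os.environ):
    """Return 'ollama', 'gemini', or 'claude' based on the documented .env variables."""
    if env.get("USE_LOCAL", "").strip().lower() == "true":
        return "ollama"
    if env.get("GOOGLE_API_KEY"):
        return "gemini"
    if env.get("ANTHROPIC_API_KEY"):
        return "claude"
    raise RuntimeError("set USE_LOCAL=true, GOOGLE_API_KEY, or ANTHROPIC_API_KEY")
```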

Navigate in Neuroglancer and watch the AI narrate your exploration in real time!

Running on GPU Cluster (LSF/H100)

To run on a GPU cluster node, use mode=shared when requesting GPUs:

bsub -P cellmap -n 12 -gpu "num=1:mode=shared" -q gpu_h100 -Is /bin/bash

Important: The mode=shared parameter is required! Without it, the GPU is allocated in exclusive mode, so PyTorch (Chatterbox) and Ollama cannot both use the GPU at the same time.

Once on the node, run the application normally:

pixi run start

See CLUSTER_TROUBLESHOOTING.md for detailed cluster setup and troubleshooting.

Requirements

Python 3.10+
FastAPI & Uvicorn
Pillow
Neuroglancer
FFmpeg (for movie compilation)
edge-tts (for voice synthesis, optional)

License

GNU General Public License v3.0 — see LICENSE for details.

Tourguide depends on zmesh, cc3d, fastmorph, edt, and kimimaro from the Seung Lab, which are GPL-3.0; the combined work is therefore GPL-3.0.
