Commit 0e356d0

chore: release v3.2.0 governance launch controls
1 parent cd4ee83 commit 0e356d0

20 files changed

Lines changed: 2234 additions & 909 deletions

.dockerignore

Lines changed: 4 additions & 1 deletion
@@ -1,4 +1,7 @@
 node_modules
 .git
 README.md
-.env
+.env
+.next
+logs
+data

.eslintrc.json

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+{
+  "extends": "next/core-web-vitals"
+}

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -88,4 +88,5 @@ Thumbs.db
 # Local Claude workspace settings
 .claude/
 
-data/
+data/
+logs/

Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@ FROM node:18-alpine
 
 WORKDIR /app
 
-RUN mkdir -p /app/data
+RUN mkdir -p /logs/security /logs/analytics /logs/feedback
 
 COPY package*.json ./
 
@@ -14,4 +14,4 @@ RUN npm run build
 
 EXPOSE 3000
 
-CMD ["npm", "start"]
+CMD ["npm", "start"]

README.md

Lines changed: 90 additions & 119 deletions
@@ -1,152 +1,123 @@
 # VFB Chat Client
 
-A web-based chat client for exploring Virtual Fly Brain (VFB) data and Drosophila neuroscience using a guardrailed LLM with tool calling, connected to the VFB MCP server via the OpenAI API.
+VFB Chat is a Next.js chat interface for exploring Virtual Fly Brain (VFB) data with grounded tool use. The production build is aligned to the governance and privacy controls for launch: structured logging, no free-text analytics logging, reviewed-domain search only, outbound link allow-listing, and production fail-closed checks for the approved ELM endpoint and model.
 
-## Features
+## What Changed
 
-- URL parameter support for initial queries and existing scene context (`?query=...&i=...&id=...`)
-- Chat interface to explore Drosophila neuroanatomy, neural circuits, and research
-- Access to VFB datasets, connectome data, and morphological analysis
-- Display image thumbnails and construct 3D visualization scenes
-- Generate URLs for VFB 3D browser with proper scene management
-- Guardrailed responses covering VFB-related topics including papers, techniques, and methodologies
-- **Security**: Advanced jailbreak detection to prevent attempts to bypass safety restrictions
+- Native `web_search` has been removed from the model toolset.
+- Search is limited to a reviewed local index of approved `virtualflybrain.org` and reviewed `flybase.org` pages.
+- Outbound links are sanitized server-side to approved domains only.
+- Raw IP-based security logs are retained for up to 30 days under `/logs/security`.
+- Aggregated analytics and structured feedback are retained under `/logs/analytics` and `/logs/feedback`.
+- Users can explicitly attach a visible chat transcript to negative feedback; those transcripts are stored separately for up to 30 days.
+- Google Analytics is optional and receives structured metrics only. No free-text user queries or model responses are sent.
+- Production fails closed unless `OPENAI_BASE_URL` matches the approved ELM gateway and `OPENAI_MODEL` matches the approved model.
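
Editor's sketch of the fail-closed rule in the last added bullet. It is a minimal illustration, not the code from this commit: the helper names `checkApprovedEndpoint` and `assertApprovedEndpoint`, the use of `NODE_ENV` to detect production, and the trailing-slash normalisation are all assumptions. Only the four environment variable names come from the README.

```typescript
// Hypothetical sketch of a fail-closed startup check; not the repository's code.
type Env = Record<string, string | undefined>;

// Trim whitespace and trailing slashes so ".../v1" and ".../v1/" compare equal (assumed rule).
function normalizeBaseUrl(url: string): string {
  return url.trim().replace(/\/+$/, "");
}

// Returns a list of problems; an empty list means the configuration is acceptable.
function checkApprovedEndpoint(env: Env): string[] {
  // Assumption: production is signalled by NODE_ENV.
  if (env.NODE_ENV !== "production") return [];

  const problems: string[] = [];
  const baseUrl = env.OPENAI_BASE_URL || "";
  const model = env.OPENAI_MODEL || "";
  const approvedUrl = env.APPROVED_ELM_BASE_URL || "";
  const approvedModel = env.APPROVED_ELM_MODEL || "";

  // A missing approved value is itself a failure: never fall back to "allow".
  if (!approvedUrl) problems.push("APPROVED_ELM_BASE_URL is not set");
  if (!approvedModel) problems.push("APPROVED_ELM_MODEL is not set");
  if (approvedUrl && normalizeBaseUrl(baseUrl) !== normalizeBaseUrl(approvedUrl)) {
    problems.push("OPENAI_BASE_URL does not match the approved ELM gateway");
  }
  if (approvedModel && model !== approvedModel) {
    problems.push("OPENAI_MODEL does not match the approved model");
  }
  return problems;
}

// Throwing at startup keeps a misconfigured production instance from serving requests.
function assertApprovedEndpoint(env: Env): void {
  const problems = checkApprovedEndpoint(env);
  if (problems.length > 0) {
    throw new Error("Refusing to start: " + problems.join("; "));
  }
}
```

Returning a list of problems rather than a boolean lets the startup error name every mismatch at once, which tends to shorten deployment debugging.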
 
-## Security
+## Logging Model
 
-The VFB Chat client includes comprehensive protection against common jailbreak attempts used to bypass LLM safety restrictions. The system automatically detects and blocks messages containing:
+The app now uses a 3-layer logging model rooted at `LOG_ROOT_DIR`:
 
-- Attempts to override or ignore system instructions
-- Requests to enter "developer mode," "uncensored mode," or similar unrestricted states
-- Role-playing as alternative AI personas (e.g., DAN, uncensored AI)
-- Commands to modify system prompts or disregard rules
-- Encoded or hidden prompts designed to circumvent filters
+- Layer A: `/logs/security`
+  - JSONL security events
+  - blocked-site audit events
+  - rate-limit state with raw IP retention capped at 30 days
+- Layer B: `/logs/analytics`
+  - daily aggregated service metrics only
+  - no raw prompts or raw responses
+- Layer C: `/logs/feedback`
+  - structured thumbs up/down feedback plus fixed reason codes
+  - no free-text feedback comments
+- Feedback transcript attachments: `/logs/feedback-transcripts`
+  - stored only when a user explicitly attaches a conversation to negative feedback
+  - short retention, capped at 30 days
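
Editor's sketch of how the layered layout above could map onto paths and JSONL records. The directory names and the 30-day cap come from the README; the one-file-per-UTC-day naming and the helper names (`layerDir`, `dailyLogFile`, `toJsonlLine`, `isPastRetention`) are assumptions made for illustration. File I/O is deliberately left out.

```typescript
// Hypothetical sketch of the log layout; file naming is an assumption.
type LogLayer = "security" | "analytics" | "feedback" | "feedback-transcripts";

// Directory for one layer under the configured log root.
function layerDir(logRoot: string, layer: LogLayer): string {
  return logRoot.replace(/\/+$/, "") + "/" + layer;
}

// One JSONL file per UTC day (assumed naming), e.g. <root>/security/2024-03-05.jsonl.
function dailyLogFile(logRoot: string, layer: LogLayer, when: Date): string {
  const day = when.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  return layerDir(logRoot, layer) + "/" + day + ".jsonl";
}

// JSONL: exactly one JSON object per line.
function toJsonlLine(record: Record<string, unknown>): string {
  return JSON.stringify(record) + "\n";
}

// True when a daily file is older than the retention cap and may be deleted.
function isPastRetention(fileDay: string, now: Date, maxDays: number = 30): boolean {
  const ageMs = now.getTime() - new Date(fileDay + "T00:00:00Z").getTime();
  return ageMs > maxDays * 24 * 60 * 60 * 1000;
}
```

Keeping each layer in its own directory means a retention sweep can apply the 30-day cap to `security` and `feedback-transcripts` without touching the aggregated layers.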
 
-When a jailbreak attempt is detected, users receive a clear warning message and the request is blocked. This ensures the chat remains focused on Drosophila neuroscience and VFB-related topics.
+## Reviewed Search Index
 
-## Usage Monitoring and AI Guidelines
+The reviewed documentation search tool reads from `config/reviewed-docs-index.json` by default. This is a curated, static index of approved pages and should be changed through review, not by runtime crawling.
 
-### Analytics and Quality Control
-The VFB Chat client includes Google Analytics integration to monitor usage patterns and ensure quality control. All user queries and AI responses are tracked anonymously for:
-- Usage monitoring and system performance analysis
-- Quality control and improvement of responses
-- Research into user interaction patterns with neuroscience data
+Environment variable:
 
-**Data Collected:**
-- Query text (truncated to 200 characters for privacy)
-- Query and response lengths
-- Processing duration
-- Session identifiers (anonymous)
-- Timestamps
-
-**Privacy Protection:**
-- Query text is truncated to prevent storage of long or sensitive content
-- No personally identifiable information is collected
-- Analytics data is used solely for quality control and system improvement
-- A clear disclaimer is displayed at the bottom of the chat interface
-
-### Important AI Usage Guidelines
-**Please verify all information provided by the AI assistant:**
-- AI-generated responses may contain inaccuracies or outdated information
-- Always cross-reference critical information with primary sources
-- Use VFB links provided in responses to access authoritative data
-- Report any concerns about response quality to the development team
+- `REVIEWED_DOCS_INDEX_FILE`
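
Editor's sketch of searching a static, reviewed index. The commit view does not show the schema of `config/reviewed-docs-index.json`, so the `ReviewedDoc` shape (`url`, `title`, `summary`), the term-count ranking, and both function names are hypothetical. Only the default path and the `REVIEWED_DOCS_INDEX_FILE` override come from the README.

```typescript
// Hypothetical entry shape; the real schema of the index file is not shown in this commit view.
interface ReviewedDoc {
  url: string;
  title: string;
  summary: string;
}

// Path resolution: the environment variable overrides the documented default.
function reviewedIndexPath(env: Record<string, string | undefined>): string {
  return env.REVIEWED_DOCS_INDEX_FILE || "config/reviewed-docs-index.json";
}

// Rank entries by how many query terms appear in the title or summary.
function searchReviewedDocs(
  index: ReadonlyArray<ReviewedDoc>,
  query: string,
  limit: number = 5
): ReviewedDoc[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (terms.length === 0) return [];
  return index
    .map((doc) => {
      const haystack = (doc.title + " " + doc.summary).toLowerCase();
      const score = terms.filter((t) => haystack.indexOf(t) >= 0).length;
      return { doc, score };
    })
    .filter((hit) => hit.score > 0) // never return entries that match nothing
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((hit) => hit.doc);
}
```

Because the function only reads the array it is given, results can never include a page that was not reviewed into the index, which is the property the README describes.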
 
-**Privacy and Security:**
-- Conversations may be monitored for quality control purposes
-- No personally identifiable information should be shared in queries
-- Confidential or sensitive research data should not be included in prompts
-- The system is designed for educational and research purposes within Drosophila neuroscience
+## Runtime Configuration
 
-**Responsible Use:**
-- Use this tool to enhance, not replace, your understanding of neuroscience concepts
-- Cite appropriate sources when using information in research or publications
-- Respect intellectual property and data usage rights of VFB and related resources
+Required for production:
 
-## Setup
+- `OPENAI_API_KEY`
+- `OPENAI_BASE_URL`
+- `OPENAI_MODEL`
+- `APPROVED_ELM_BASE_URL`
+- `APPROVED_ELM_MODEL`
+- `LOG_ROOT_DIR=/logs`
 
-1. Ensure Docker and Docker Compose are installed.
+Optional:
 
-2. Clone this repository.
+- `RATE_LIMIT_PER_IP`
+- `SEARCH_ALLOWLIST`
+- `OUTBOUND_ALLOWLIST`
+- `REVIEWED_DOCS_INDEX_FILE`
+- `GA_MEASUREMENT_ID`
+- `GA_API_SECRET`
 
-3. Set your OpenAI API key: `export OPENAI_API_KEY=your-key-here`
+Default allow-lists:
 
-4. Run `docker-compose up --build` to start the app.
+- Search allow-list: `virtualflybrain.org`, `*.virtualflybrain.org`, `flybase.org`
+- Outbound allow-list: `virtualflybrain.org`, `*.virtualflybrain.org`, `flybase.org`, `doi.org`, `pubmed.ncbi.nlm.nih.gov`, `biorxiv.org`, `medrxiv.org`
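
Editor's sketch of how the default allow-lists above could be matched against a hostname. The domain lists are copied from the README. The matching rules are assumptions: a `*.` entry is taken to match subdomains only, a plain entry is taken to match exactly, and the `SEARCH_ALLOWLIST` / `OUTBOUND_ALLOWLIST` overrides are taken to be comma-separated.

```typescript
// Defaults copied from the README; the matching rules below are assumptions.
const DEFAULT_SEARCH_ALLOWLIST: string[] = [
  "virtualflybrain.org",
  "*.virtualflybrain.org",
  "flybase.org",
];
const DEFAULT_OUTBOUND_ALLOWLIST: string[] = DEFAULT_SEARCH_ALLOWLIST.concat([
  "doi.org",
  "pubmed.ncbi.nlm.nih.gov",
  "biorxiv.org",
  "medrxiv.org",
]);

// Case-insensitive hostname check against exact and "*."-prefixed entries.
function isHostAllowed(hostname: string, allowlist: ReadonlyArray<string>): boolean {
  const host = hostname.trim().toLowerCase().replace(/\.$/, "");
  return allowlist.some((pattern) => {
    const p = pattern.trim().toLowerCase();
    if (p.slice(0, 2) === "*.") {
      // "*.example.org" keeps the leading dot, so "evilexample.org" cannot match.
      const suffix = p.slice(1);
      return host.length > suffix.length && host.slice(-suffix.length) === suffix;
    }
    return host === p;
  });
}

// Parse a comma-separated override (format assumed); fall back to the defaults when empty.
function parseAllowlist(raw: string | undefined, fallback: ReadonlyArray<string>): string[] {
  const entries = (raw || "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return entries.length > 0 ? entries : fallback.slice();
}
```

Under these rules `www.flybase.org` would be rejected, because the defaults list `flybase.org` without a wildcard entry; whether the real implementation behaves that way is not stated in the source.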
 
-5. To use a different model, set the `OPENAI_MODEL` environment variable: `OPENAI_MODEL=gpt-4o docker-compose up --build`
+## Local Development
 
-## Deployment
+Create `.env.local` with explicit values:
 
-### Local Development
+```bash
+OPENAI_API_KEY=your-key-here
+OPENAI_BASE_URL=https://your-elm-gateway.example/v1
+OPENAI_MODEL=your-approved-model
+APPROVED_ELM_BASE_URL=https://your-elm-gateway.example/v1
+APPROVED_ELM_MODEL=your-approved-model
+LOG_ROOT_DIR=./logs
+```
 
-For development without Docker:
-1. Create a `.env.local` file with your API configuration:
-   ```
-   OPENAI_API_KEY=your-key-here
-   OPENAI_BASE_URL=https://api.openai.com/v1
-   OPENAI_MODEL=gpt-4o-mini
-   ```
-2. Run `npm install`
-3. Run `npm run dev`
+Then run:
 
-### Docker Hub Deployment via GitHub Actions
-The project includes a GitHub Actions workflow (`.github/workflows/docker.yml`) that automatically builds and pushes Docker images to Docker Hub on pushes and pull requests.
+```bash
+npm install
+npm run dev
+```
 
-1. Set up Docker Hub repository: Create a repository named `vfbchat` under your Docker Hub account (e.g., `robbie1977/vfbchat`).
+The local default for `LOG_ROOT_DIR` falls back to `./logs` when not running in production.
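
Editor's sketch of the `LOG_ROOT_DIR` fallback described above. The README confirms only the `./logs` fallback outside production. Using `NODE_ENV` to detect production and defaulting to `/logs` in production (chosen to match the directories the Dockerfile creates) are assumptions, as is the helper name `resolveLogRoot`.

```typescript
// Hypothetical helper; assumes NODE_ENV marks production and that the production default is "/logs".
function resolveLogRoot(env: Record<string, string | undefined>): string {
  const configured = (env.LOG_ROOT_DIR || "").trim();
  if (configured) return configured; // an explicit value always wins
  return env.NODE_ENV === "production" ? "/logs" : "./logs";
}
```

For example, `resolveLogRoot({ NODE_ENV: "development" })` yields `./logs`, so a fresh checkout writes logs inside the working tree, where the updated `.gitignore` excludes them.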
 
-2. Configure GitHub Secrets:
-   - Go to your repository settings > Secrets and variables > Actions
-   - Add `DOCKER_HUB_USER`: Your Docker Hub username
-   - Add `DOCKER_HUB_PASSWORD`: Your Docker Hub password or access token
+## Docker
 
-3. The workflow will trigger on:
-   - Pushes to any branch or tags starting with `v*`
-   - Pull requests to `main`
-
-4. Images are built for `linux/amd64` and `linux/arm64` platforms and tagged appropriately.
-
-## Usage
-
-- Access the app at `http://localhost:3000`
-- Without URL parameters, the chat starts with a welcome message and example queries
-- Append URL parameters for initial setup, e.g., `http://localhost:3000?query=medulla&i=VFB_00101567&id=VFB_00102107`
-- Chat with the assistant to explore VFB data
-- Click "Open in VFB 3D Browser" to view the scene
+The provided `docker-compose.yml` mounts a named volume at `/logs`:
 
-## LLM Configuration
-
-- **Model**: Default is `gpt-4o-mini`, configurable via `OPENAI_MODEL` env var. Any OpenAI-compatible model with tool calling support will work.
-- **API Endpoint**: Default is `https://api.openai.com/v1`, configurable via `OPENAI_BASE_URL` for use with OpenAI-compatible proxies (e.g., ELM at Edinburgh).
-- **Guardrailing**: Implemented via system prompt allowing responses about Drosophila neuroscience, VFB data/tools, research papers, and methodologies while using MCP tools for accurate information.
-- **MCP Integration**: The LLM calls VFB MCP tools (`get_term_info`, `search_terms`, `run_query`) via the OpenAI tool calling API.
+```bash
+docker-compose up --build
+```
 
-## VFB MCP Details
+This keeps security, analytics, and feedback logs outside the application filesystem.
 
-- **Server URL**: https://vfb3-mcp.virtualflybrain.org/
-- **Tools**:
-  - `get_term_info(id)`: Retrieves term details, including images keyed by template.
-  - `search_terms(query)`: Searches for terms matching the query.
-  - `run_query(id, query_type)`: Runs specific queries (e.g., PaintedDomains) on terms.
-- **Data Structure**:
-  - Terms have IDs like `VFB_00102107` or `FBbt_00003748`.
-  - Images are associated with templates (e.g., `VFB_00101567` for JRC2018Unisex).
-  - Thumbnails: `https://www.virtualflybrain.org/data/VFB/i/.../thumbnail.png`
-- **URL Construction for Scenes**:
-  - `https://v2.virtualflybrain.org/org.geppetto.frontend/geppetto?id=<focus_term_id>&i=<template_id>,<image_id1>,<image_id2>`
-  - `id`: Focus term (only one, site shows its info).
-  - `i`: Comma-separated list starting with template ID, followed by image IDs.
-- **Limitations**:
-  - Images must be aligned to the same template to view together.
-  - Only one term can be the focus per scene, but all term info is accessible in the chat.
-  - Templates define the coordinate space.
+## API Surface
 
-## Development
+- `POST /api/chat`
+  - Streams assistant responses over SSE
+  - emits `result` events with `requestId` and `responseId`
+- `GET /api/rate-info`
+  - returns the current per-IP daily usage counters
+- `POST /api/feedback`
+  - accepts `{ request_id, response_id, rating, reason_code }`
+  - negative feedback may also include `{ attach_conversation: true, conversation }`
+- `GET /privacy`
+  - serves the VFB Chat privacy addendum page
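
Editor's sketch of validating a `POST /api/feedback` body of the shape listed above. The field names come from the README, and `helpful` is the only reason code it names. The rating values `"up"` / `"down"`, the negative reason codes, the array type for `conversation`, and the function name `validateFeedback` are placeholders chosen for illustration.

```typescript
// Hypothetical validation sketch. Only `helpful` is named in the source;
// the rating values and the negative reason codes below are placeholders.
type Rating = "up" | "down";
const NEGATIVE_REASON_CODES: string[] = ["inaccurate", "incomplete", "off_topic", "other"];

interface FeedbackPayload {
  request_id: string;
  response_id: string;
  rating: Rating;
  reason_code: string;
  attach_conversation: boolean;
  conversation?: unknown[];
}

type FeedbackValidation =
  | { ok: true; value: FeedbackPayload }
  | { ok: false; error: string };

function isNonEmptyString(v: unknown): v is string {
  return typeof v === "string" && v.trim().length > 0;
}

function isRating(v: unknown): v is Rating {
  return v === "up" || v === "down";
}

function validateFeedback(body: unknown): FeedbackValidation {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const { request_id, response_id, rating, reason_code, attach_conversation, conversation } =
    body as Record<string, unknown>;

  if (!isNonEmptyString(request_id)) return { ok: false, error: "request_id is required" };
  if (!isNonEmptyString(response_id)) return { ok: false, error: "response_id is required" };
  if (!isRating(rating)) return { ok: false, error: "rating must be 'up' or 'down'" };
  if (!isNonEmptyString(reason_code)) return { ok: false, error: "reason_code is required" };

  // Fixed codes only: free-text reasons are rejected rather than stored.
  if (rating === "up" && reason_code !== "helpful") {
    return { ok: false, error: "positive feedback must use 'helpful'" };
  }
  if (rating === "down" && NEGATIVE_REASON_CODES.indexOf(reason_code) < 0) {
    return { ok: false, error: "unknown reason_code" };
  }

  // Transcripts are opt-in and only accepted alongside negative feedback.
  let transcript: unknown[] | undefined;
  const attach = attach_conversation === true;
  if (attach) {
    if (rating !== "down") return { ok: false, error: "transcripts require negative feedback" };
    if (!Array.isArray(conversation)) return { ok: false, error: "conversation must be an array" };
    transcript = conversation;
  }

  return {
    ok: true,
    value: { request_id, response_id, rating, reason_code, attach_conversation: attach, conversation: transcript },
  };
}
```

Dropping `conversation` unless `attach_conversation` is exactly `true` means a client that sends a transcript by accident does not get it stored.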
 
-- **Docker**: `docker-compose up --build`
-- **Local**: `npm run dev` (requires `.env.local` with API credentials)
-- Build: `npm run build`
-- API: POST to `/api/chat` with `{ message, scene }`
+## UI Notes
 
-## License
+- The welcome text and footer now reflect the launch privacy position.
+- Assistant messages with IDs support structured feedback:
+  - thumbs up submits `helpful`
+  - thumbs down requires a fixed reason code
+  - thumbs down can optionally attach the visible conversation transcript for short-term investigation
 
-See LICENSE file.
+## Verification Notes
+
+- `npm run lint` should be used for local verification after changes.
+- `npm run build` may fail in restricted sandboxes because Next.js attempts to bind a local IPC port during build; rerun it in the target environment before release.

RELEASE_NOTES.md

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@ This file summarizes the release notes inferred from git tags (tag message/annot
 
 ---
 
+## v3.2.0
+- Release v3.2.0: Add governance-ready logging with `/logs` volume support, reviewed-domain search controls, outbound link allow-listing, structured feedback, privacy updates, and opt-in transcript attachments for problem reports
+
 ## v3.1.0
 - Release v3.1.0: Add per-IP daily rate limiting and rate-info endpoint; backend data storage in data/rate-limits.json; client-side usage counter in UI
 