A stateful, personality-driven conversational AI system where each NPC (Non-Player Character) has unique traits, memory, and behavior patterns. Built with Spring Boot, MongoDB, Redis, and LLM APIs.
This is a production-grade implementation of the NPC Simulator high-level design (HLD) from the ChatGPT conversation. The system implements all seven major components:
- API Gateway → Authentication, rate limiting, routing
- Conversation Service → Main backend orchestration (Java Spring Boot)
- Character Engine → Personality and behavioral definition
- Memory Service → Short-term (Redis) + long-term (MongoDB) memory
- Prompt Builder → Combines personality + memory + user input
- AI Service → Calls the LLM (OpenAI/Anthropic, directly or via FastAPI)
- Post Processor → Quality assurance and safety filtering
Client (Web / Mobile)
        ↓
API Gateway (Authentication, Rate Limiting)
        ↓
Conversation Service (Spring Boot)
  ├── Character Engine (Personality Layer)
  ├── Memory Service (Redis + MongoDB)
  ├── Prompt Builder
  └── AI Service (LLM Integration)
        ↓
Post Processor (Quality Check)
        ↓
Response → Client
- ✅ Stateful Conversations → Each user-NPC pair has independent session management
- ✅ Personality-Driven → NPCs have customizable traits, tone, and behavior
- ✅ Multi-Layer Memory → Short-term (Redis for speed) + long-term (MongoDB for persistence)
- ✅ LLM Flexibility → Support for both OpenAI and Anthropic APIs
- ✅ Scalable Architecture → Horizontal scaling with stateless services
- ✅ Optional FastAPI Wrapper → Advanced orchestration and caching layer
- ✅ Docker Support → Full containerization with docker-compose
- ✅ Quality Assurance → Response filtering and confidence scoring
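The Quality Assurance feature (confidence scoring + filtering) can be sketched as a small gate. The class name, threshold, and length cap below are illustrative assumptions, not the repo's actual values:

```java
// Illustrative sketch of the Post Processor's quality gate.
// MIN_CONFIDENCE and MAX_LENGTH are assumed values, not the repo's config.
public class PostProcessorSketch {
    private static final double MIN_CONFIDENCE = 0.5;
    private static final int MAX_LENGTH = 1000;

    // Rejects low-confidence responses with a safe fallback and trims
    // overly long ones before they reach the client.
    public static String process(String response, double confidenceScore) {
        if (confidenceScore < MIN_CONFIDENCE) {
            return "I'm not sure how to respond to that.";
        }
        return response.length() > MAX_LENGTH
                ? response.substring(0, MAX_LENGTH)
                : response;
    }
}
```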
- Java 17+
- Maven 3.8+
- Docker & Docker Compose (optional, for containerized setup)
- MongoDB 7.0+
- Redis 7+
- LLM API Key (OpenAI or Anthropic)
# Set environment variables
export LLM_API_KEY=your-openai-api-key
export LLM_API_TYPE=openai # or 'anthropic'
# Start all services
docker-compose up -d
# Verify services
docker-compose ps
# Check backend health
curl http://localhost:8080/api/npcs

1. Start MongoDB:

mongod --dbpath ./data/mongodb

2. Start Redis:

redis-server

3. Build the backend:

mvn clean package

4. Run the backend:

java -jar target/npc-simulator-backend-0.1.0-MVP.jar

The application will start on http://localhost:8080/api
Get all NPCs:
GET /api/npcs

Get specific NPC:

GET /api/npcs/{npcId}

Response example:
{
"id": "npc_123",
"name": "Alex Manager",
"description": "Sarcastic and impatient project manager",
"personality": {
"primaryTrait": "sarcastic",
"traits": ["impatient", "direct", "technical"],
"tone": "professional",
"background": "10 years of project management experience",
"speakingStyle": "uses technical jargon, speaks concisely"
},
"isActive": true
}

Create a new session:
POST /api/sessions
Header: X-User-Id: user_123
Body:
{
"npcId": "npc_123",
"sessionContext": "Discussing project delays"
}

Get user sessions:
GET /api/sessions
Header: X-User-Id: user_123

Get specific session:
GET /api/sessions/{sessionId}
Header: X-User-Id: user_123

Close a session:
DELETE /api/sessions/{sessionId}

Send message:
POST /api/chat/message
Header: X-User-Id: user_123
Body:
{
"sessionId": "session_xyz",
"content": "Why is the project delayed?"
}

Response example:
{
"id": "msg_456",
"sessionId": "session_xyz",
"role": "npc",
"content": "Well, the team was waiting for requirements clarification, but sure, let's pretend it's just bad planning.",
"timestamp": "2024-04-16T10:30:00",
"modelUsed": "gpt-4",
"confidenceScore": 0.95
}

# MongoDB
SPRING_DATA_MONGODB_URI=mongodb://localhost:27017/npc_simulator
# Redis
SPRING_REDIS_HOST=localhost
SPRING_REDIS_PORT=6379
# LLM API
LLM_API_KEY=sk-...
LLM_API_TYPE=openai # or 'anthropic'
LLM_MODEL=gpt-4
LLM_TEMPERATURE=0.7
LLM_MAX_TOKENS=500
# JWT
JWT_SECRET=your-secret-key
# FastAPI Wrapper (optional)
USE_FASTAPI_WRAPPER=false
FASTAPI_URL=http://localhost:8000

See src/main/resources/application.yml for all configurable options.
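The USE_FASTAPI_WRAPPER flag switches where the AI Service sends prompts. A rough sketch of that routing decision (class and method names are hypothetical, not the repo's actual API):

```java
// Hypothetical sketch of AI Service routing based on the env vars above.
public class AiServiceRouting {
    // Pure routing decision: FastAPI wrapper endpoint vs. direct OpenAI call.
    public static String targetUrl(boolean useWrapper, String fastapiUrl) {
        return useWrapper
                ? fastapiUrl + "/v1/prompt"
                : "https://api.openai.com/v1/chat/completions";
    }

    // Reads USE_FASTAPI_WRAPPER / FASTAPI_URL with the same defaults as the config.
    public static String targetUrlFromEnv() {
        boolean useWrapper = Boolean.parseBoolean(
                System.getenv().getOrDefault("USE_FASTAPI_WRAPPER", "false"));
        String fastapiUrl =
                System.getenv().getOrDefault("FASTAPI_URL", "http://localhost:8000");
        return targetUrl(useWrapper, fastapiUrl);
    }
}
```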
# Connect to MongoDB
mongosh mongodb://localhost:27017/npc_simulator
# Insert custom NPC
db.npcs.insertOne({
"_id": ObjectId(),
"name": "Customer Support Agent",
"description": "Friendly and patient support specialist",
"personality": {
"primaryTrait": "empathetic",
"traits": ["patient", "helpful", "knowledgeable"],
"tone": "friendly",
"background": "5 years in customer support",
"speakingStyle": "uses a warm, conversational tone",
"behavioralRules": {
"rule1": "Always acknowledge customer frustration",
"rule2": "Provide solutions rather than explanations"
}
},
"config": {
"maxTokens": 300,
"temperature": 0.6
},
"createdAt": new Date(),
"updatedAt": new Date(),
"isActive": true
})

This is how a user message becomes an NPC response:
- User sends message → Chat endpoint receives the request
- API Gateway → Validates authentication (via X-User-Id header)
- Session & NPC retrieval → Fetches the active session and NPC config
- Character Engine → Builds personality context from NPC traits
- Memory Service → Retrieves the last 5 messages + relevant memories
- Prompt Builder → Combines personality + memory + user message
- AI Service → Calls the OpenAI/Anthropic API with the final prompt
- Post Processor → Validates confidence, filters unsafe content, trims length
- Response saved → Stores both user and NPC messages in MongoDB
- Cache updated → Recent messages cached in Redis for speed
- Response returned → User receives the NPC reply with metadata
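The Prompt Builder step (personality + memory + user message) is essentially pure string assembly. A minimal sketch, assuming an illustrative class name and prompt template rather than the repo's actual ones:

```java
import java.util.List;

// Illustrative sketch of the Prompt Builder step; the real class and
// prompt template in this repo may differ.
public class PromptBuilderSketch {
    public static String buildPrompt(String personality,
                                     List<String> recentMessages,
                                     String userMessage) {
        StringBuilder sb = new StringBuilder();
        sb.append("You are the following character:\n")
          .append(personality).append("\n\n");
        if (!recentMessages.isEmpty()) {
            sb.append("Recent conversation:\n");
            for (String m : recentMessages) {
                sb.append("- ").append(m).append("\n");
            }
            sb.append("\n");
        }
        // The trailing "NPC:" cues the model to answer in character.
        sb.append("User: ").append(userMessage).append("\nNPC:");
        return sb.toString();
    }
}
```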
{
id: String, // MongoDB ObjectId
name: String,
email: String,
passwordHash: String,
createdAt: LocalDateTime,
updatedAt: LocalDateTime,
isActive: Boolean
}

{
id: String,
name: String,
description: String,
personality: {
primaryTrait: String,
traits: List<String>,
tone: String,
background: String,
speakingStyle: String,
behavioralRules: Map<String, String>
},
config: Map<String, Object>,
createdAt: LocalDateTime,
updatedAt: LocalDateTime,
isActive: Boolean
}

{
id: String,
userId: String,
npcId: String,
startedAt: LocalDateTime,
lastActivityAt: LocalDateTime,
messageCount: Integer,
sessionContext: String,
isActive: Boolean
}

{
id: String,
sessionId: String,
role: String, // "user" or "npc"
content: String,
timestamp: LocalDateTime,
internalPrompt: String, // (NPC only) Actual prompt sent to LLM
modelUsed: String, // (NPC only) Model name (gpt-4, claude-3, etc.)
confidenceScore: Double // (NPC only) 0.0-1.0
}

- No server-side state except databases
- Load balance across multiple instances
- Each service can scale independently
- Short-term: Redis for recent messages (TTL: 1 hour)
- Long-term: MongoDB for persistence
- Optional: Vector DB for semantic search
- Connection pooling (HTTP, MongoDB, Redis)
- Response caching by prompt hash
- Batch operations for bulk inserts
- Index optimization on frequently queried fields
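"Response caching by prompt hash" means keying cached completions on a digest of the full prompt, so identical prompts can reuse an earlier LLM response (the cache-aside pattern mentioned later). A small sketch; the key prefix is an assumption:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Sketch of a prompt-hash cache key: the same prompt always maps to the
// same Redis key, so a cached response can be reused.
public class PromptCacheKey {
    public static String keyFor(String prompt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(prompt.getBytes(StandardCharsets.UTF_8));
            return "llm:cache:" + HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 must be available", e);
        }
    }
}
```

Before calling the LLM, look this key up in Redis; on a miss, call the API and store the response under the key with a TTL.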
- JWT authentication via X-User-Id header
- Basic rate limiting configuration
- Content moderation in Post Processor
- Implement proper OAuth2/JWT token validation
- Add rate limiting with Redis
- Enable HTTPS/TLS
- Use API keys for LLM API calls
- Implement audit logging
- Add request/response encryption for sensitive data
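The "rate limiting with Redis" recommendation can start from a fixed-window counter. This in-memory sketch shows only the counting logic; a shared deployment would use Redis INCR plus EXPIRE so counts are shared across instances and windows actually reset:

```java
import java.util.HashMap;
import java.util.Map;

// In-memory fixed-window rate limiter sketch. A production version would
// replace the map with Redis INCR on a per-user, per-window key plus
// EXPIRE, so counts are shared across instances and reset each window.
public class FixedWindowLimiter {
    private final int limit;
    private final Map<String, Integer> counts = new HashMap<>();

    public FixedWindowLimiter(int limitPerWindow) {
        this.limit = limitPerWindow;
    }

    // Returns true while the user is under the per-window request limit.
    public boolean allow(String userId) {
        int c = counts.merge(userId, 1, Integer::sum);
        return c <= limit;
    }
}
```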
mvn test

mvn verify

# Create a session
curl -X POST http://localhost:8080/api/sessions \
-H "Content-Type: application/json" \
-H "X-User-Id: test_user" \
-d '{"npcId": "npc_123", "sessionContext": "test"}'
# Send a chat message
curl -X POST http://localhost:8080/api/chat/message \
-H "Content-Type: application/json" \
-H "X-User-Id: test_user" \
-d '{"sessionId": "session_xyz", "content": "Hello!"}'

For advanced LLM orchestration:
docker-compose --profile fastapi up

Endpoints:
- POST /v1/prompt → Process a prompt with caching
- POST /v1/semantic-search → Search memories by semantic similarity
- Support for multiple NPCs in one session (future enhancement)
- NPCs adapt based on conversation history (future enhancement)
- Track engagement metrics and popular NPCs (future enhancement)
Logs are written to console and can be sent to centralized logging services:
# src/main/resources/application.yml
logging:
level:
root: INFO
com.npcsimulator: DEBUG

Check if MongoDB is running:
mongosh mongodb://localhost:27017
If using Docker:
docker logs npc-simulator-mongodb
Check if Redis is running:
redis-cli ping
If using Docker:
docker logs npc-simulator-redis
Verify API key:
echo $LLM_API_KEY
Test API connection:
curl https://api.openai.com/v1/chat/completions \
-H "Authorization: Bearer $LLM_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4", "messages": [{"role": "user", "content": "hello"}]}'
Clear Redis cache:
redis-cli FLUSHALL
Or via Docker:
docker exec npc-simulator-redis redis-cli FLUSHALL

npc-simulator/
├── src/main/java/com/npcsimulator/
│   ├── api/
│   │   ├── controller/          # REST endpoints
│   │   └── dto/                 # Data transfer objects
│   ├── domain/
│   │   ├── model/               # Entity classes
│   │   └── repository/          # Data access layer
│   ├── service/
│   │   ├── ai/                  # AI/LLM service
│   │   ├── CharacterEngine.java
│   │   ├── MemoryService.java
│   │   ├── PromptBuilder.java
│   │   ├── PostProcessor.java
│   │   ├── ConversationService.java  # Orchestrator
│   │   └── ...
│   └── config/                  # Spring configurations
├── src/main/resources/
│   └── application.yml          # Configuration
├── fastapi_wrapper/
│   ├── main.py                  # FastAPI app
│   ├── requirements.txt
│   └── Dockerfile
├── pom.xml                      # Maven dependencies
├── Dockerfile                   # Backend container
├── docker-compose.yml           # Full stack setup
└── README.md                    # This file
"I designed a scalable, stateful AI conversation system where each NPC has unique personality traits and maintains conversation history. The backend (Spring Boot) orchestrates seven components: an API Gateway for security, Conversation Service for coordination, Character Engine for personality definition, Memory Service with Redis caching for performance, Prompt Builder for intelligent input construction, AI Service for LLM integration with both direct APIs and optional FastAPI wrapper, and Post Processor for response quality assurance. The system persists conversations in MongoDB while using Redis for sub-second response times, supports both OpenAI and Anthropic APIs, and scales horizontally with a stateless architecture."
- Multi-NPC group conversations
- NPC personality evolution over time
- Semantic memory with vector DB
- Prompt caching to reduce costs
- Advanced analytics dashboard
- Voice integration (TTS/STT)
- Real-time streaming responses
- Fine-tuned models for specific NPCs
For issues or questions:
- Check the troubleshooting section
- Review Docker logs: docker-compose logs -f
- Enable debug logging in application.yml
MIT License - Feel free to use for any purpose
Built from the NPC Simulator HLD discussion on ChatGPT. References production best practices in:
- Microservices architecture
- Cache-aside pattern
- API design
- LLM integration
Happy building!