Architecture

This document describes the agent system's structure, decision flows, and component interactions.

System Overview

The Facebook Messenger AI Bot is a production-ready FastAPI application for creating AI-powered Facebook Messenger bots. It uses a single-agent architecture built on PydanticAI and powered by the GitHub Copilot SDK (with an OpenAI fallback). The agent answers questions from a reference document synthesized from scraped website content.

High-Level Flow:

Facebook Messenger → Webhook → FastAPI → Agent Service → Copilot SDK → Response → Facebook Messenger

Key Components:

FastAPI Application: Webhook endpoints for Facebook Messenger
PydanticAI Agent: Message processing and response generation
Copilot SDK Service: LLM operations with fallback
Scraper Service: Website content extraction
Reference Document Service: Content synthesis
Supabase Database: Configuration and message history storage
Facebook Service: Message sending via Graph API
Logfire Observability: Structured logging, request tracing, and performance monitoring

Agent Roles & Responsibilities

| Agent Name | Purpose | Tools | Output |
| --- | --- | --- | --- |
| MessengerAgentService | Process user messages and generate responses | CopilotService (chat), Reference Document (read), Message History (read) | AgentResponse (message, confidence, escalation flags) |

Single Agent Architecture:

One primary agent handles all message processing
Agent uses reference document as knowledge base
Agent maintains conversation context via recent messages
Agent escalates to a human when confidence is low (< 0.7) or the query is out of scope

Decision Flow

User Message (Facebook Messenger)
    ↓
Webhook Endpoint (FastAPI)
    ↓
Parse Message & Extract sender_id, page_id
    ↓
Lookup Bot Configuration (Supabase)
    ↓
Build AgentContext (reference_doc + tone + recent_messages)
    ↓
MessengerAgentService.respond()
    ↓
    ├─→ Low Confidence (< 0.7) → Escalate to Human
    ├─→ Out of Scope → Escalate to Human
    └─→ Valid Response → Send via Facebook Graph API
    ↓
Save to Message History (Supabase)
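
The branch after MessengerAgentService.respond() can be sketched as a small routing function. This is a minimal sketch under stated assumptions, not the project's actual code: `route_response` and `CONFIDENCE_THRESHOLD` are hypothetical names, and a plain dataclass stands in for the Pydantic AgentResponse model so the example runs without dependencies.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.7  # value taken from the decision flow above


@dataclass
class AgentResponse:
    """Plain stand-in for the Pydantic AgentResponse model (assumption)."""

    message: str
    confidence: float
    requires_escalation: bool = False
    escalation_reason: Optional[str] = None


def route_response(response: AgentResponse) -> str:
    """Return "escalate" or "send" for a generated response (hypothetical helper)."""
    # An explicit flag (for example an out-of-scope query) always escalates.
    if response.requires_escalation:
        return "escalate"
    # Low confidence escalates even when the agent did not set the flag.
    if response.confidence < CONFIDENCE_THRESHOLD:
        return "escalate"
    return "send"
```

In this sketch a confidence of exactly 0.7 is treated as a valid response, since the flow only escalates values below the threshold.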

Setup Flow:

CLI Setup Command
    ↓
Scrape Website → Text Chunks
    ↓
Copilot SDK: Synthesize Reference Document
    ↓
Store Reference Document (Supabase)
    ↓
Create Bot Configuration (Supabase)
    ↓
Ready for Messages
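
The "Scrape Website → Text Chunks" step could look roughly like the following. The document does not say how the Scraper Service chunks text, so `chunk_text`, the whitespace-based splitting, and the 1000-character default are all assumptions for illustration.

```python
def chunk_text(text: str, max_chars: int = 1000) -> list:
    """Split text into chunks of at most max_chars, breaking on whitespace.

    Hypothetical helper: a single word longer than max_chars is kept whole
    and becomes its own oversized chunk.
    """
    chunks = []
    current = []  # words collected for the chunk being built
    length = 0    # character length of " ".join(current)
    for word in text.split():
        # One extra character for the joining space when the chunk is non-empty.
        extra = len(word) + (1 if current else 0)
        if current and length + extra > max_chars:
            chunks.append(" ".join(current))
            current, length = [word], len(word)
        else:
            current.append(word)
            length += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be passed to the synthesis step; breaking on whitespace keeps words intact at the cost of slightly uneven chunk sizes.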

Data Flow

Input Schemas

Webhook Payload:

from pydantic import BaseModel

class MessengerWebhookPayload(BaseModel):
    object: str  # "page" for Messenger page subscriptions
    entry: list[dict]  # Facebook webhook entry structure

Message Input:

class MessengerMessageIn(BaseModel):
    sender_id: str
    recipient_id: str
    text: str | None = None  # None for events without text, such as attachments
    timestamp: int
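
To show how the two schemas relate, here is a minimal sketch that flattens a webhook payload into message inputs. It relies on the Messenger webhook layout (`entry[].messaging[]` with `sender.id`, `recipient.id`, `timestamp`, and `message.text`). The function name `extract_messages` is hypothetical, and a dataclass stands in for the Pydantic model so the example has no dependencies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MessengerMessageIn:
    """Plain stand-in for the Pydantic MessengerMessageIn model (assumption)."""

    sender_id: str
    recipient_id: str
    text: Optional[str]
    timestamp: int


def extract_messages(payload: dict) -> list:
    """Collect message events from a webhook payload (hypothetical helper)."""
    messages = []
    if payload.get("object") != "page":
        return messages  # not a page subscription event
    for entry in payload.get("entry", []):
        for event in entry.get("messaging", []):
            if "message" not in event:
                continue  # skip delivery receipts, read receipts, postbacks
            messages.append(
                MessengerMessageIn(
                    sender_id=event["sender"]["id"],
                    recipient_id=event["recipient"]["id"],
                    text=event["message"].get("text"),
                    timestamp=event["timestamp"],
                )
            )
    return messages
```

For a page subscription, `recipient_id` is the page ID, which is the value the decision flow uses to look up the bot configuration.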

State Management

AgentContext:

class AgentContext(BaseModel):
    bot_config_id: str
    reference_doc: str  # Full markdown reference document
    tone: str  # Communication tone (professional, friendly, etc.)
    recent_messages: list[str]  # Last 3 messages for context
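
As a sketch of how the context might be assembled, the helper below keeps only the last three messages of a conversation. `build_agent_context` and its `window` parameter are hypothetical, and the dataclass is a dependency-free stand-in for the Pydantic model.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AgentContext:
    """Plain stand-in for the Pydantic AgentContext model (assumption)."""

    bot_config_id: str
    reference_doc: str
    tone: str
    recent_messages: list


def build_agent_context(
    bot_config_id: str,
    reference_doc: str,
    tone: str,
    history: Sequence[str],
    window: int = 3,
) -> AgentContext:
    """Build a context holding at most `window` of the newest messages."""
    # history[-0:] would return the whole list, so guard against window <= 0.
    recent = list(history[-window:]) if window > 0 else []
    return AgentContext(bot_config_id, reference_doc, tone, recent)
```

Conversations shorter than the window are passed through unchanged, so the first message of a new user needs no special case.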

AgentResponse:

class AgentResponse(BaseModel):
    message: str  # Response text (max 300 chars)
    confidence: float  # 0.0 to 1.0
    requires_escalation: bool
    escalation_reason: str | None = None  # Set only when requires_escalation is True

Output Formats

Success: AgentResponse with message and confidence >= 0.7 (responses below 0.7 are escalated)
Escalation: AgentResponse with requires_escalation = True
Error: HTTPException with appropriate status code

Orchestration Pattern

Pattern Used: Single agent with tools

Reasoning:

Simple use case: Answer questions based on reference document
No need for complex multi-agent coordination
Single agent can handle all message types
Easier to maintain and debug
Lower latency (no agent handoffs)

Agent Tools:

CopilotService.chat(): LLM chat interface
Reference Document Access: Read-only access to synthesized content
Message History: Read recent conversation context
Facebook Service: Send messages (called after agent response)

Tools & External Systems

Tool Registry

| Tool | Risk | Description |
| --- | --- | --- |
| `scrape_website` | 🟢 LOW | Read-only website scraping, timeout limits |
| `build_reference_doc` | 🟢 LOW | Content synthesis via Copilot SDK |
| `get_bot_configuration` | 🟢 LOW | Read-only database query |
| `get_reference_document` | 🟢 LOW | Read-only database query |
| `agent_service.respond` | 🟡 MEDIUM | AI response generation, confidence-based |
| `send_message` (Facebook) | 🟡 MEDIUM | Send message via Facebook Graph API |
| `save_message_history` | 🟡 MEDIUM | Write message to database |
| `create_bot_configuration` | 🟠 HIGH | Create new bot (CLI only, requires validation) |

External Systems

GitHub Copilot SDK:

Primary LLM provider
Endpoint: COPILOT_CLI_HOST (default: http://localhost:5909)
Fallback: OpenAI API if Copilot unavailable
Operations: Chat completion, content synthesis

Facebook Graph API:

Send messages to users
Endpoint: https://graph.facebook.com/v18.0/me/messages
Authentication: Page Access Token
Rate limits: Enforced by Facebook; rate-limit errors are retried with backoff (see Facebook API Failures)
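
A text reply sent through this endpoint is a JSON POST with the Page Access Token passed as the `access_token` query parameter. The sketch below only builds the request so it can be inspected or tested without network access. `build_send_request` is a hypothetical helper, and the HTTP client the project actually uses is not stated in this document.

```python
GRAPH_API_URL = "https://graph.facebook.com/v18.0/me/messages"  # endpoint listed above


def build_send_request(recipient_id: str, text: str, page_access_token: str) -> dict:
    """Return URL, query params, and JSON body for a text reply (hypothetical helper)."""
    return {
        "url": GRAPH_API_URL,
        "params": {"access_token": page_access_token},
        "json": {
            "recipient": {"id": recipient_id},
            # "RESPONSE" marks the message as a reply to a user message.
            "messaging_type": "RESPONSE",
            "message": {"text": text},
        },
    }
```

The returned dictionary maps directly onto the keyword arguments of common HTTP clients, for example `client.post(req["url"], params=req["params"], json=req["json"])` with httpx.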

Supabase (PostgreSQL):

Database for bot configurations
Reference documents storage
Message history logging
Connection: Via Supabase Python client

Pydantic Logfire:

Structured logging and observability
FastAPI request/response tracing
Pydantic model validation logging
PydanticAI agent execution tracing
Correlation ID tracking across services
Environment-aware configuration (local console vs production JSON)
Optional cloud logging with Logfire token

Error Recovery & Fallback Logic

Copilot SDK Failures

Detection:

Health check failures
HTTP timeout errors
Invalid response format

Recovery:

Check copilot.is_available() before use
If unavailable, automatically fall back to OpenAI
Log fallback event for monitoring
Continue processing with OpenAI

Fallback Implementation:

async def chat(self, system_prompt: str, messages: list[dict]) -> str:
    if not await self.is_available():
        logger.warning("Copilot SDK unavailable, using OpenAI fallback")
        return await self._fallback_to_openai(system_prompt, messages)
    # ... use Copilot SDK
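
The snippet above elides the Copilot call itself. A self-contained sketch of the same fallback pattern is shown here for illustration. The constructor-injected callables (`copilot_chat`, `openai_chat`, `health_check`) are hypothetical stand-ins for the real clients, and falling back when the Copilot call itself raises is an assumption drawn from the detection list, not confirmed behavior.

```python
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)

ChatFn = Callable[[str, list], Awaitable[str]]


class CopilotService:
    """Sketch only: the three injected callables are hypothetical stand-ins."""

    def __init__(
        self,
        copilot_chat: ChatFn,
        openai_chat: ChatFn,
        health_check: Callable[[], Awaitable[bool]],
    ) -> None:
        self._copilot_chat = copilot_chat
        self._openai_chat = openai_chat
        self._health_check = health_check

    async def is_available(self) -> bool:
        # A failing health check counts as "unavailable" rather than an error.
        try:
            return await self._health_check()
        except Exception:
            return False

    async def chat(self, system_prompt: str, messages: list) -> str:
        if await self.is_available():
            try:
                return await self._copilot_chat(system_prompt, messages)
            except Exception:
                # Assumption: timeouts and bad responses also trigger the fallback.
                logger.warning("Copilot SDK call failed, using OpenAI fallback")
        else:
            logger.warning("Copilot SDK unavailable, using OpenAI fallback")
        return await self._openai_chat(system_prompt, messages)
```

Injecting the backends keeps the fallback logic testable without a running Copilot endpoint or an OpenAI key.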

Facebook API Failures

Detection:

HTTP error codes (4xx, 5xx)
Invalid token responses
Rate limit errors

Recovery:

Retry with exponential backoff (max 3 retries)
Log error for monitoring
If persistent, alert admin
Continue processing (don't block other messages)
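
The retry policy above can be sketched as a small async helper. `retry_with_backoff`, the 0.5-second base delay, and the injectable `sleep` parameter are assumptions for illustration; only the limit of three retries comes from this document.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run `operation`, retrying up to `max_retries` times with doubling delays."""
    attempt = 0
    while True:
        try:
            return await operation()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: let the caller log and alert
            # Delays grow as base_delay * 1, 2, 4, ...
            await sleep(base_delay * (2 ** attempt))
            attempt += 1
```

A fuller version would probably retry only transient failures (5xx responses and rate-limit errors) and fail fast on invalid-token responses, since retrying those cannot succeed.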

Database Failures

Detection:

Connection timeouts
Query errors
Transaction failures

Recovery:

Retry with backoff (max 3 retries)
Use cached bot configurations if available
Log error for monitoring
Alert admin if persistent
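
One way to "use cached bot configurations if available" is to remember the last successful lookup per page and serve it when the database call fails. The class below is a minimal sketch: `BotConfigCache` and the injected `fetch` callable are hypothetical, and cache expiry is deliberately left out.

```python
from typing import Callable, Dict


class BotConfigCache:
    """Serve the last known configuration when the database lookup fails."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch  # hypothetical database lookup keyed by page_id
        self._cache: Dict[str, dict] = {}

    def get(self, page_id: str) -> dict:
        try:
            config = self._fetch(page_id)
        except Exception:
            if page_id in self._cache:
                return self._cache[page_id]  # stale but usable
            raise  # nothing cached: surface the failure
        self._cache[page_id] = config
        return config
```

The trade-off is that a cached configuration can be stale, so a production version would likely add a time-to-live and log every cache hit caused by a failure.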

Agent Response Failures

Detection:

Low confidence scores (< 0.7)
Out-of-scope queries
Invalid response format

Recovery:

Set requires_escalation = True
Return default escalation message
Log for human review
Continue processing other messages
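
The "invalid response format" case can be handled by validating the raw model output and substituting an escalation response when validation fails. This sketch assumes the model returns a JSON object, which the document does not confirm. `parse_agent_output`, the wording of the default message, and the `invalid_response_format` reason string are hypothetical.

```python
import json

DEFAULT_ESCALATION_MESSAGE = "Let me connect you with a team member who can help."  # hypothetical wording
MAX_MESSAGE_CHARS = 300  # limit noted on AgentResponse.message


def parse_agent_output(raw: str) -> dict:
    """Validate raw model output; return an escalation response if it is invalid."""
    try:
        data = json.loads(raw)
        message = data["message"]
        confidence = float(data["confidence"])
        if not isinstance(message, str) or len(message) > MAX_MESSAGE_CHARS:
            raise ValueError("invalid message")
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence out of range")
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing keys, wrong types, and out-of-range values
        # all end up here and are handed to a human.
        return {
            "message": DEFAULT_ESCALATION_MESSAGE,
            "confidence": 0.0,
            "requires_escalation": True,
            "escalation_reason": "invalid_response_format",
        }
    return {
        "message": message,
        "confidence": confidence,
        "requires_escalation": bool(data.get("requires_escalation", False)),
        "escalation_reason": data.get("escalation_reason"),
    }
```

Because the function always returns a well-formed response, one bad model output does not interrupt processing of other messages.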

Component Interactions

Request Flow

┌─────────────────┐
│ Facebook        │
│ Messenger       │
└────────┬────────┘
         │ POST /webhook
         ↓
┌─────────────────┐
│ FastAPI         │
│ Webhook Handler │
│ + Logfire       │ ← Request tracing starts
└────────┬────────┘
         │
         ├─→ Parse payload (Pydantic validation logged)
         ├─→ Extract page_id
         ├─→ Correlation ID generated
         │
         ↓
┌─────────────────┐
│ Repository      │
│ (Supabase)      │
│ + Logfire       │ ← DB query timing logged
└────────┬────────┘
         │
         ├─→ Get bot_config (timed)
         ├─→ Get reference_doc (timed)
         ├─→ Get recent messages (timed)
         │
         ↓
┌─────────────────┐
│ Agent Service   │
│ (PydanticAI)    │
│ + Logfire       │ ← Agent execution traced
└────────┬────────┘
         │
         ├─→ Build context (logged)
         ├─→ Call Copilot SDK (timed, fallback logged)
         ├─→ Generate response (confidence logged)
         │
         ↓
┌─────────────────┐
│ Facebook        │
│ Service         │
│ + Logfire       │ ← API call timing logged
└────────┬────────┘
         │
         ├─→ Send message (success/failure logged)
         │
         ↓
┌─────────────────┐
│ Repository      │
│ (Save history)  │
│ + Logfire       │ ← Write operation timed
└─────────────────┘
         │
         ↓
┌─────────────────┐
│ Logfire         │
│ (Observability) │ ← Complete trace with correlation ID
└─────────────────┘

Setup Flow

┌─────────────────┐
│ CLI Setup       │
└────────┬────────┘
         │
         ├─→ Get website URL
         │
         ↓
┌─────────────────┐
│ Scraper Service │
└────────┬────────┘
         │
         ├─→ Scrape website
         ├─→ Chunk text
         │
         ↓
┌─────────────────┐
│ Copilot Service │
└────────┬────────┘
         │
         ├─→ Synthesize reference doc
         │
         ↓
┌─────────────────┐
│ Repository      │
│ (Save config)   │
└─────────────────┘

Scalability Considerations

Current Architecture

Single FastAPI instance
Single agent per message
Direct database connections
Synchronous message processing

Future Scaling Options

Horizontal Scaling: Multiple FastAPI instances behind load balancer
Message Queue: Use Redis/RabbitMQ for async message processing
Caching: Redis cache for bot configurations
Database Connection Pooling: Supabase connection pooling
Agent Pooling: Multiple agent instances for concurrent processing

Security Architecture

Authentication & Authorization

Facebook webhook verification via verify_token
Page Access Token validation
Supabase service key for database access
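
Webhook verification works as a handshake: Facebook sends a GET request with `hub.mode`, `hub.verify_token`, and `hub.challenge`, and the endpoint echoes the challenge only when the token matches. The check can be sketched as a pure function. `verify_webhook` is a hypothetical name, and how the project wires this into its FastAPI route is not shown in this document.

```python
import hmac
from typing import Optional


def verify_webhook(
    mode: Optional[str],
    token: Optional[str],
    challenge: Optional[str],
    expected_token: str,
) -> Optional[str]:
    """Return the challenge to echo back, or None if verification fails."""
    if mode != "subscribe" or token is None or challenge is None:
        return None
    # Constant-time comparison avoids leaking token contents through timing.
    if not hmac.compare_digest(token.encode(), expected_token.encode()):
        return None
    return challenge
```

A route handler would typically respond with the challenge and status 200 on success, and 403 when the function returns None.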

Data Protection

Environment variables for secrets
Encrypted database connections (Supabase)
HTTPS for all external communications
PII masking in logs
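
As an illustration of PII masking in logs, the sketch below redacts email addresses and long digit runs (phone numbers, numeric sender IDs) before a message is logged. The patterns and the `mask_pii` name are assumptions; the project's actual masking utilities are not shown here and may cover more cases.

```python
import re

# Deliberately simple patterns (assumptions): they favor over-masking.
_EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
_LONG_NUMBER_RE = re.compile(r"\d{7,}")


def mask_pii(text: str) -> str:
    """Replace emails and digit runs of 7 or more characters with placeholders."""
    text = _EMAIL_RE.sub("[email]", text)
    return _LONG_NUMBER_RE.sub("[number]", text)
```

Short numbers such as prices or opening hours are left intact, which keeps masked logs useful for debugging.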

Input Validation

Pydantic models for all inputs
URL validation for website scraping
Message length limits
Rate limiting per user
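
Per-user rate limiting could be implemented as an in-memory sliding window keyed by sender ID. This is a sketch only: `UserRateLimiter` and the limits of 5 messages per 60 seconds are hypothetical, and state held in one process would not be shared across multiple FastAPI instances.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class UserRateLimiter:
    """Allow at most `max_messages` per sender within `window_seconds`."""

    def __init__(
        self,
        max_messages: int = 5,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_messages
        self._window = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, sender_id: str) -> bool:
        now = self._clock()
        events = self._events[sender_id]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self._window:
            events.popleft()
        if len(events) >= self._max:
            return False
        events.append(now)
        return True
```

Rejected messages are not recorded, so a user who keeps sending while limited is not penalized beyond the original window.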

Observability Architecture

Logging & Tracing

Pydantic Logfire Integration:

Centralized Configuration: src/logging_config.py handles environment-aware setup
FastAPI Instrumentation: Automatic request/response tracing with timing
Pydantic Instrumentation: Model validation logging for all Pydantic models
PydanticAI Instrumentation: Agent execution tracing and decision logging
Correlation IDs: Request tracing across services via CorrelationIDMiddleware
Structured Logging: JSON logs in production, formatted console logs in local development
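
Correlation IDs are commonly propagated with a context variable that a logging filter copies onto every record. The sketch below uses only the standard library. It is not the project's CorrelationIDMiddleware, whose internals are not shown in this document, and the names `correlation_id`, `new_correlation_id`, and `CorrelationIdFilter` are hypothetical.

```python
import contextvars
import logging
import uuid

# Holds the ID for the current request; "-" when no request is active.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


def new_correlation_id() -> str:
    """Generate an ID for the current request and store it in the context."""
    value = uuid.uuid4().hex
    correlation_id.set(value)
    return value


class CorrelationIdFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True
```

A middleware would call `new_correlation_id()` at the start of each request, and a formatter can then include `%(correlation_id)s` so log lines from different services can be joined on the same ID.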

Service-Level Logging:

CopilotService: Health check timing, API call success/failure, fallback events
MessengerAgentService: Message processing timing, confidence scores, escalation decisions
ScraperService: Website scraping attempts, chunking statistics, HTTP errors
FacebookService: Message send attempts, API responses, rate limiting
Repository: Database operation timing, query success/failure rates

Log Configuration:

Environment-based log levels (DEBUG/INFO/WARNING/ERROR)
PII masking utilities for sensitive data
Token redaction for authentication tokens
Optional cloud logging with Logfire token

Error Tracking:

Sentry integration for error aggregation and alerting
Logfire structured logs for debugging
Correlation between Sentry errors and Logfire traces

Performance Monitoring

Metrics Tracked:

Request/response latency (p50, p95, p99)
Agent response generation time
Database query duration
External API call timing (Copilot SDK, Facebook Graph API)
Service availability (Copilot SDK health checks)

Alert Thresholds:

Response latency > 2 seconds (p95)
Error rate > 2% for 5 minutes
Escalation rate > 20%
Copilot SDK availability < 99%
See RUNBOOK.md for detailed alert thresholds
