Detecting automated misinformation campaigns on Bluesky using multi-method analysis
A hackathon project for Apart Research focused on developing better methods to detect automated misinformation campaigns using AI-powered analysis techniques.
Bot Detector combines multiple analysis methods to identify potential bot accounts on Bluesky:
- Follow/Follower Analysis - Detects suspicious ratios and patterns
- Posting Pattern Analysis - Identifies inhuman timing and frequency
- Text Content Analysis - Detects repetitive content and AI phrases
- LLM Analysis - AI-powered content assessment for AI vs. human detection
- Network Analysis - Future: Coordinated behavior detection
- Real-time Monitoring - Future: Live bot detection dashboard
Human Team Members: Andreas Matt
AI Team Members:
- Backend: Claude Code
- Frontend: Lovable
The team works across different time zones and backgrounds, so all code includes extensive documentation for accessibility.
Bot-Detector/
├── backend/                    # FastAPI backend service
│   ├── main.py                 # API server and endpoints
│   ├── bot_detector.py         # Main analysis orchestrator
│   ├── bluesky_client.py       # Bluesky AT Protocol client
│   ├── analyzers.py            # Core detection algorithms
│   ├── llm_analyzer.py         # Multi-provider LLM integration
│   ├── config.py               # Configuration management
│   ├── models.py               # Data models and validation
│   ├── .env.example            # Environment template
│   └── requirements.txt        # Python dependencies
├── tests/                      # Test suite
│   ├── test_config.py          # Configuration tests
│   ├── test_api.py             # API endpoint tests
│   ├── test_analyzers.py       # Bot detection logic tests
│   ├── test_bluesky_client.py  # Bluesky integration tests
│   └── README_TESTING.md       # Testing documentation
├── frontend/                   # React web interface with shadcn/ui
│   ├── src/                    # React source code
│   ├── dist/                   # Built frontend (served by backend)
│   ├── package.json            # Node.js dependencies
│   └── vite.config.ts          # Vite configuration
├── pytest.ini                  # Test configuration
├── .gitignore                  # Prevents committing secrets
└── README.md                   # This file
- Python 3.8+
- Node.js 18+ (for frontend)
- Git
- API keys for at least one service (Bluesky or LLM providers)
1. Clone and Navigate
git clone <repository-url>
cd Bot-Detector
2. Install Backend Dependencies
Linux/macOS:
python3 -m venv venv
source venv/bin/activate
pip install -r backend/requirements.txt
Windows:
python -m venv venv
venv\Scripts\activate
pip install -r backend\requirements.txt
3. Install Frontend Dependencies
cd frontend
npm install
cd ..
4. Configure Credentials
Copy the configuration template:
# Linux/macOS
cp config.example.json config.json
# Windows
copy config.example.json config.json
Edit config.json with your credentials:
{
"bluesky": {
"username": "your-bluesky-username",
"password": "your-bluesky-password"
},
"llm": {
"openai_api_key": "sk-your-openai-key"
}
}
5. Choose Your Running Mode
Development mode runs the frontend and backend as separate services with hot reload:
./run_dev.sh
This starts:
- Backend API server: http://localhost:8000
- Frontend dev server: http://localhost:8080 (with API proxy)
Production mode builds the frontend and serves it through the backend:
./serve_prod.sh
This serves everything from http://localhost:8000:
- Frontend routes: /, /about, etc.
- API routes: /analyze, /health, /config
6. Test the Application
Visit the appropriate URL based on your running mode:
- Development: http://localhost:8080 (frontend with API proxy to backend on :8000)
- Production: http://localhost:8000 (integrated backend serving frontend)
If the frontend can't connect to the backend API:
- Check both services are running:
# In one terminal - this starts BOTH backend and frontend
./run_dev.sh
- Verify backend is accessible:
# In another terminal
curl http://localhost:8000/health
- Check frontend proxy configuration - the Vite config should proxy /analyze, /health, and /config to http://localhost:8000
7. Run Tests (Optional but recommended)
# From project root
pytest
Analyze a user:
curl -X POST "http://localhost:8000/analyze" \
-H "Content-Type: application/json" \
-d '{"bluesky_handle": "example.bsky.social"}'
Bluesky Access (recommended):
- Username and password for your Bluesky account
- Enables full data fetching capabilities
LLM Providers (at least one recommended):
- OpenAI: Get key from platform.openai.com
- Anthropic: Get key from console.anthropic.com
- Google: Get key from aistudio.google.com
- Ollama: Local models (install separately from ollama.ai)
Configuration sources, in priority order:
- Environment variables (highest priority)
- config.json file
- .env file (lowest priority)
The system needs either:
- Bluesky credentials (for data fetching), OR
- At least one LLM API key (for analysis)
Ideally both for full functionality.
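The priority order and the minimum-credential rule described above can be sketched roughly as follows. This is an illustration only, not the project's actual `config.py`: the flat setting names (`BLUESKY_USERNAME`, `OPENAI_API_KEY`, and so on) and the helper functions are assumptions made for the example.

```python
from typing import Dict, Mapping

# Hypothetical flat setting names; the project's real keys may differ.
BLUESKY_KEYS = ("BLUESKY_USERNAME", "BLUESKY_PASSWORD")
LLM_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def flatten_config_json(data: Mapping[str, Mapping[str, str]]) -> Dict[str, str]:
    """Map the nested config.json layout onto flat setting names."""
    flat: Dict[str, str] = {}
    for section, entries in data.items():
        for key, value in entries.items():
            # Assumption: "llm" keys are used as-is, other sections are prefixed.
            name = key.upper() if section == "llm" else f"{section}_{key}".upper()
            flat[name] = value
    return flat


def resolve_settings(
    env: Mapping[str, str],
    config_json: Mapping[str, str],
    dotenv: Mapping[str, str],
) -> Dict[str, str]:
    """Merge the three sources; later sources in the loop override earlier ones."""
    merged: Dict[str, str] = {}
    # Lowest priority first: .env, then config.json, then environment variables.
    for source in (dotenv, config_json, env):
        for key in BLUESKY_KEYS + LLM_KEYS:
            if source.get(key):
                merged[key] = source[key]
    return merged


def has_minimum_credentials(settings: Mapping[str, str]) -> bool:
    """True if Bluesky credentials OR at least one LLM API key is present."""
    has_bluesky = all(settings.get(key) for key in BLUESKY_KEYS)
    has_llm = any(settings.get(key) for key in LLM_KEYS)
    return has_bluesky or has_llm
```

In practice `env` would typically be `os.environ` and `config_json` the flattened contents of config.json; applying sources from lowest to highest priority keeps the override rule in one place.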
Follow/Follower Analysis:
- High following-to-follower ratios
- Suspiciously round numbers
- Zero followers on established accounts
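As a minimal sketch of how these follow-based signals might be combined into a single score: the thresholds, weights, and the `follow_score` function below are hypothetical choices for illustration and are not taken from the project's `analyzers.py`.

```python
def follow_score(following: int, followers: int, account_age_days: int) -> float:
    """Return a 0..1 suspicion score from follow counts (illustrative weights)."""
    score = 0.0

    # Signal 1: high following-to-follower ratio (avoid division by zero).
    ratio = following / max(followers, 1)
    if ratio > 10:
        score += 0.5
    elif ratio > 3:
        score += 0.25

    # Signal 2: suspiciously round following count, e.g. exactly 500 or 2000.
    if following >= 100 and following % 100 == 0:
        score += 0.2

    # Signal 3: zero followers on an account older than a month (assumed cutoff).
    if followers == 0 and account_age_days > 30:
        score += 0.3

    return min(score, 1.0)
```

An additive score like this is easy to explain to reviewers, at the cost of ignoring interactions between signals.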
Posting Pattern Analysis:
- Inhuman posting frequencies (>100 posts/day)
- No sleep patterns (24/7 posting)
- Perfectly regular intervals
- High repost-to-original ratios
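The timing signals above could be derived from post timestamps along these lines. This is a sketch under stated assumptions: only the 100 posts/day figure comes from the list above, while the other cutoffs and the `posting_signals` function are hypothetical.

```python
from datetime import datetime
from statistics import mean, pstdev
from typing import Dict, List


def posting_signals(timestamps: List[datetime], repost_flags: List[bool]) -> Dict[str, bool]:
    """Flag inhuman timing patterns in a list of post timestamps."""
    flags = {
        "inhuman_frequency": False,
        "no_sleep_pattern": False,
        "regular_intervals": False,
        "mostly_reposts": False,
    }
    times = sorted(timestamps)
    if len(times) < 3:
        return flags  # too little data to say anything

    # Posts per day; spans shorter than one day are treated as one day.
    span_days = max((times[-1] - times[0]).total_seconds() / 86400, 1.0)
    flags["inhuman_frequency"] = len(times) / span_days > 100

    # "No sleep": posts observed in every hour of the day (assumed criterion).
    flags["no_sleep_pattern"] = len({t.hour for t in times}) == 24

    # Regularity: coefficient of variation of the gaps between posts.
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    avg_gap = mean(gaps)
    variation = pstdev(gaps) / avg_gap if avg_gap > 0 else 0.0
    flags["regular_intervals"] = variation < 0.1

    # Repost share above 80% (assumed cutoff).
    if repost_flags:
        flags["mostly_reposts"] = sum(repost_flags) / len(repost_flags) > 0.8
    return flags
```

The coefficient of variation is used because it is scale-free: a bot posting every 10 minutes and one posting every 6 hours both score near zero.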
Text Content Analysis:
- Vocabulary diversity measurement
- Repetitive or template-like content
- AI-typical phrases detection
- Cross-post similarity analysis
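One simple way to approximate these text measures uses only the standard library. The sketch below takes a type-token ratio for vocabulary diversity and `difflib.SequenceMatcher` for cross-post similarity; both are stand-in techniques chosen for the example, and the phrase list is a hypothetical placeholder rather than the project's actual list.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations
from typing import List

# Hypothetical examples of AI-typical phrases; a real list would be longer.
AI_PHRASES = (
    "as an ai language model",
    "i hope this helps",
    "it's important to note",
)


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def vocabulary_diversity(posts: List[str]) -> float:
    """Type-token ratio: unique words divided by total words (0 if no words)."""
    tokens = [token for post in posts for token in tokenize(post)]
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_pairwise_similarity(posts: List[str]) -> float:
    """Average SequenceMatcher ratio over all pairs of posts (0 if fewer than 2)."""
    pairs = list(combinations(posts, 2))
    if not pairs:
        return 0.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)


def ai_phrase_hits(posts: List[str]) -> int:
    """Count occurrences of listed phrases, at most once per phrase per post."""
    return sum(1 for post in posts for phrase in AI_PHRASES if phrase in post.lower())
```

Pairwise comparison grows quadratically with the number of posts, so a real analyzer would likely cap the sample size.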
LLM Analysis:
- AI vs. human content assessment
- Multi-model consensus for reliability
- Confidence scoring and reasoning
- Model-agnostic architecture
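Multi-model consensus could, for instance, be computed as a confidence-weighted average of per-model verdicts. The `ModelVerdict` type, the `consensus` function, and the agreement measure below are assumptions for illustration; the source does not say how `llm_analyzer.py` combines model outputs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class ModelVerdict:
    """One model's assessment of an account's content (hypothetical shape)."""
    model: str          # provider/model label
    ai_score: float     # 0.0 = clearly human, 1.0 = clearly AI-generated
    confidence: float   # 0.0..1.0, the model's self-reported confidence
    reasoning: str = ""


def consensus(verdicts: List[ModelVerdict]) -> Optional[Dict[str, Union[float, List[str]]]]:
    """Combine verdicts into one score, weighting each by its confidence."""
    usable = [v for v in verdicts if v.confidence > 0]
    if not usable:
        return None  # nothing to combine

    total_weight = sum(v.confidence for v in usable)
    score = sum(v.ai_score * v.confidence for v in usable) / total_weight

    # Agreement: 1.0 when all models give the same score, lower as they diverge.
    scores = [v.ai_score for v in usable]
    agreement = 1.0 - (max(scores) - min(scores))

    return {"score": score, "agreement": agreement, "models": [v.model for v in usable]}
```

Because the function depends only on the verdict fields and not on any provider SDK, it stays model-agnostic: adding a provider means producing one more `ModelVerdict`.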
- All credentials stored locally (never sent to our servers)
- API keys protected by .gitignore
- No sensitive data logging
- Read-only Bluesky access
- Configurable analysis depth
- Multi-method bot detection
- Bluesky integration
- LLM analysis support
- REST API interface
- Web frontend interface (React + shadcn/ui)
- Integrated backend/frontend serving
- Comprehensive documentation
- Database for result storage
- Batch analysis capabilities
- Performance optimizations
- User authentication and accounts
- Real-time monitoring dashboard
- Network analysis for coordinated behavior
- ML model training on collected data
- Additional social media platforms
This is a hackathon project with team members across different backgrounds and time zones.
Development Guidelines:
- All code must include extensive comments
- Test changes thoroughly before committing: pytest
- Never commit API keys or credentials
- Follow the existing code structure
- Update documentation for new features
- Tests are located in the tests/ directory
Getting Involved:
- Check with team leads (Andreas/Mitali) for task assignment
- Create a feature branch for new work
- Test locally before submitting
- Include documentation updates
{
"handle": "example.bsky.social",
"overall_score": 0.75,
"confidence": 0.85,
"summary": "Analysis indicates this account is possibly a bot (risk level: HIGH, score: 0.75/1.00)",
"follow_analysis": {
"following_count": 2847,
"follower_count": 12,
"ratio": 237.25,
"score": 0.8
},
"posting_pattern": {
"posts_per_day_avg": 127.3,
"unusual_frequency": true,
"score": 0.9
},
"text_analysis": {
"repetitive_content": true,
"score": 0.7
},
"llm_analysis": {
"model_used": "openai/gpt-4o-mini",
"confidence": 0.85,
"score": 0.6
},
"recommendations": [
"⚠️ Suspicious follower/following patterns detected",
"⚠️ Unusual posting patterns detected",
"🚫 Consider blocking or reporting this account"
]
}
Developed for Apart Research hackathon. See individual team member agreements for specific licensing terms.
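As a rough illustration of how a client might read the sample response shown above: in that sample, `overall_score` (0.75) happens to equal the plain average of the four per-method scores (0.8, 0.9, 0.7, 0.6), and `ratio` equals `following_count / follower_count`. Whether the backend actually computes the overall score this way is not stated, so the sketch treats it as an assumption; the helper names are hypothetical.

```python
from statistics import mean
from typing import Dict

METHOD_KEYS = ("follow_analysis", "posting_pattern", "text_analysis", "llm_analysis")

# Trimmed copy of the sample response, keeping only the fields used here.
sample_response = {
    "handle": "example.bsky.social",
    "overall_score": 0.75,
    "follow_analysis": {"following_count": 2847, "follower_count": 12, "ratio": 237.25, "score": 0.8},
    "posting_pattern": {"posts_per_day_avg": 127.3, "score": 0.9},
    "text_analysis": {"repetitive_content": True, "score": 0.7},
    "llm_analysis": {"model_used": "openai/gpt-4o-mini", "score": 0.6},
}


def method_scores(response: dict) -> Dict[str, float]:
    """Collect the per-method scores that are present in a response."""
    return {key: response[key]["score"] for key in METHOD_KEYS if key in response}


def unweighted_mean_score(response: dict) -> float:
    """Average the per-method scores (assumed, not confirmed, aggregation)."""
    return mean(list(method_scores(response).values()))


def strongest_signal(response: dict) -> str:
    """Name of the analysis method with the highest score."""
    scores = method_scores(response)
    return max(scores, key=scores.get)
```

Looking at the per-method scores rather than only `overall_score` shows which signal drove the result, which is useful when deciding whether a flagged account deserves manual review.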
- Technical Issues: Contact Andreas or Mitali
- Policy Questions: Contact Matt or Clare
- API Documentation: See /backend/README.md
- General Questions: Check project documentation first
Built with ❤️ for the fight against misinformation