A production-quality financial sentiment intelligence platform that analyzes Reddit discussions using FinBERT, a BERT model fine-tuned for financial sentiment analysis, to provide real-time market sentiment insights across stocks, industries, and sectors.
- FinBERT Model - Specialized BERT model fine-tuned for financial sentiment analysis (ProsusAI/finbert)
- Real-time classification into Positive, Neutral, and Negative sentiment with confidence scores
- Context-aware analysis understanding financial terminology and market language
- Market Pulse Dashboard - Real-time overview of market sentiment trends
- Most Discussed Stocks - Track which tickers are generating the most conversation
- Industry & Sector Analysis - Aggregated sentiment across market segments
- Sentiment Heatmaps - Visual representation of industry-level market mood
- Fetches posts from multiple finance subreddits (r/stocks, r/investing, r/wallstreetbets, etc.)
- No API keys required - Uses RSS feeds for seamless data collection
- Configurable subreddit sources and search queries
- Automatic deduplication and post tracking
- Smart Ticker Extraction - Identifies stock symbols in posts with high accuracy
- False Positive Filtering - Eliminates common words like "CEO", "PR", "DD"
- Company Metadata - Maps tickers to company names, industries, and sectors
- Support for 100+ major stocks with extensible ticker mappings
- Filter by ticker symbol, industry, sector, sentiment, or date range
- Multi-dimensional data exploration
- Pagination support for large datasets
- Custom time-range analysis (daily/weekly granularity)
- Sentiment Trends - Track sentiment changes over time
- Stock Comparison Charts - Compare sentiment across multiple tickers
- Volume-Sentiment Correlation - Analyze post volume vs. market mood
- Industry Heatmaps - Sector-level sentiment distribution
- Interactive Charts - Built with Chart.js for dynamic data exploration
- Responsive React-based frontend
- Clean, intuitive three-panel layout (Overview, Analytics, Posts)
- Clickable posts with direct Reddit links
- Loading states and error handling
- Mobile-friendly responsive design
- Framework: Flask 3.0 with CORS support
- AI/ML: HuggingFace Transformers + PyTorch (CPU-optimized)
- Database: SQLite with optimized schema
- Data Source: Reddit RSS feeds (no authentication required)
- API: RESTful v1 API with standardized responses
- Framework: React 19 with modern hooks
- Build Tool: Vite 7.3 for fast development
- Charts: Chart.js 4.5 + react-chartjs-2
- HTTP Client: Axios for API communication
- Styling: Modern CSS with responsive design
Enhanced relational schema with:
- Posts - Analyzed Reddit posts with full metadata
- Tickers - Stock symbols with company info, sector, industry
- Industries & Sectors - Hierarchical classification
- Post-Ticker Relationships - Many-to-many associations
- Optimized Indexes - Fast querying on filters and aggregations
- Python 3.9+ with pip (numpy 1.26+ does not support Python 3.8)
- Node.js 20.19+ with npm (minimum supported by Vite 7)
- ~2GB disk space for FinBERT model and dependencies
- Navigate to the backend directory:
    cd backend
- Install Python dependencies:
    pip install -r requirements.txt
- Initialize the database:
    python migrations.py
- Start the Flask server:
    python app.py
The backend will start on http://localhost:5000
First Run Note: The FinBERT model (~500MB) will be downloaded automatically from HuggingFace. This may take 5-10 minutes depending on your connection.
- Navigate to the frontend directory:
    cd frontend
- Install Node dependencies:
    npm install
- Start the development server:
    npm run dev
The frontend will start on http://localhost:5173 (Vite default)
- Open http://localhost:5173 in your browser
- Click "Fetch New Posts" button
- Wait for Reddit posts to be fetched and analyzed
- Explore the dashboard, filters, and visualizations
- Click the "Fetch New Posts" button in the header
- Posts are fetched from configured subreddits via RSS
- Each post is analyzed for sentiment and ticker mentions
- Results appear in real-time as analysis completes
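The fetch and deduplication steps can be sketched with the standard library alone. This is a minimal illustration, not the code in reddit_rss_client.py: it assumes the feed is Atom-formatted, the sample feed text is made up, and `parse_entries` and `new_posts` are hypothetical helper names. Only the `reddit_id` field name comes from the schema described in this README.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample shaped like an Atom feed; real feed content will differ.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>t3_abc123</id>
    <title>AAPL earnings beat</title>
    <link href="https://www.reddit.com/r/stocks/comments/abc123/"/>
    <updated>2024-01-15T12:00:00+00:00</updated>
  </entry>
  <entry>
    <id>t3_def456</id>
    <title>TSLA recalls</title>
    <link href="https://www.reddit.com/r/stocks/comments/def456/"/>
    <updated>2024-01-15T13:00:00+00:00</updated>
  </entry>
  <entry>
    <id>t3_abc123</id>
    <title>AAPL earnings beat</title>
    <link href="https://www.reddit.com/r/stocks/comments/abc123/"/>
    <updated>2024-01-15T12:00:00+00:00</updated>
  </entry>
</feed>"""


def parse_entries(feed_xml):
    """Turn Atom <entry> elements into plain dictionaries."""
    root = ET.fromstring(feed_xml)
    posts = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        posts.append({
            "reddit_id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "url": link.get("href", "") if link is not None else "",
            "created_at": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
        })
    return posts


def new_posts(posts, seen_ids):
    """Keep only posts whose reddit_id has not been seen; update seen_ids in place."""
    fresh = []
    for post in posts:
        post_id = post["reddit_id"]
        if post_id and post_id not in seen_ids:
            seen_ids.add(post_id)
            fresh.append(post)
    return fresh
```

Passing the same `seen_ids` set across fetches is what makes repeated fetches idempotent in this sketch; a persistent store would play that role in a real deployment.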
- Market Pulse - Quick market sentiment snapshot
- Key Metrics - Total posts, sentiment distribution, top stocks
- Most Discussed/Positive/Negative Stocks - Ranked lists
- Sentiment Trends - Line charts showing sentiment over time
- Stock Comparison - Side-by-side ticker sentiment analysis
- Industry Heatmap - Sector-level aggregated sentiment
- Volume-Sentiment Correlation - Post activity vs. market mood
- Paginated list of all analyzed posts
- Individual sentiment scores and extracted tickers
- Direct links to original Reddit discussions
- Sorting and filtering options
Ticker Filter: Select one or multiple stock symbols
- Example: Filter for "AAPL" to see all Apple-related posts
Industry Filter: Focus on specific industries
- Example: "Software", "Semiconductors", "Banking"
Sector Filter: Broader market segment filtering
- Example: "Technology", "Financial", "Healthcare"
Sentiment Filter: Show only positive, negative, or neutral posts
Date Range: Analyze specific time periods
- Uses YYYY-MM-DD format
- Supports both start and end dates
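As a rough sketch of how these filters might translate into an API request, the helper below builds a query string for the posts endpoint. Only `ticker` and `limit` appear in this README's curl examples; the parameter names `industry`, `sector`, `sentiment`, `start_date`, and `end_date` are assumptions, as is the helper itself. Check API.md for the actual names.

```python
from datetime import datetime
from urllib.parse import urlencode

POSTS_URL = "http://localhost:5000/api/v1/posts"
SENTIMENTS = {"positive", "neutral", "negative"}


def normalize_date(value):
    """Parse a YYYY-MM-DD string and return it in canonical form (raises ValueError otherwise)."""
    return datetime.strptime(value, "%Y-%m-%d").date().isoformat()


def build_posts_url(ticker=None, industry=None, sector=None, sentiment=None,
                    start_date=None, end_date=None, limit=50):
    """Build a filtered posts URL; parameter names other than ticker/limit are assumed."""
    if sentiment is not None and sentiment not in SENTIMENTS:
        raise ValueError(f"unknown sentiment: {sentiment!r}")
    start = normalize_date(start_date) if start_date else None
    end = normalize_date(end_date) if end_date else None
    if start and end and start > end:
        raise ValueError("start_date must not be after end_date")
    params = {
        "ticker": ticker,
        "industry": industry,
        "sector": sector,
        "sentiment": sentiment,
        "start_date": start,
        "end_date": end,
        "limit": limit,
    }
    # Drop unset filters so the URL only carries what the caller asked for.
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{POSTS_URL}?{query}"
```

Validating dates before sending the request keeps malformed ranges from reaching the backend at all.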
- Hover over data points for detailed tooltips
- Click legend items to toggle datasets
- Charts auto-update when filters change
Located in backend/config.json:
    {
      "reddit": {
        "subreddits": ["stocks", "StockMarket", "investing", "wallstreetbets", "finance", "options"],
        "user_agent": "finance-sentiment-dashboard/2.0",
        "default_query": "stocks OR finance OR investing OR trading"
      },
      "sentiment": {
        "model": "ProsusAI/finbert",
        "confidence_threshold": 0.6
      },
      "api": {
        "version": "v1",
        "default_page_size": 50,
        "max_page_size": 1000
      },
      "server": {
        "port": 5000,
        "debug": false
      }
    }
Key Settings:
- subreddits - List of subreddits to fetch from
- confidence_threshold - Minimum confidence for sentiment classification
- default_page_size - API pagination size
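To illustrate how these settings could be consumed, here is a minimal sketch that reads the sentiment and pagination values and applies them. It makes two assumptions that the README does not confirm: predictions below `confidence_threshold` fall back to "neutral", and requested page sizes are clamped to `max_page_size`. The function names are hypothetical. The `transformers` import is deferred so the helpers can be tried with a stand-in classifier without downloading the model.

```python
import json

# Trimmed copy of the settings above; a real app would read backend/config.json.
CONFIG_TEXT = """
{
  "sentiment": {"model": "ProsusAI/finbert", "confidence_threshold": 0.6},
  "api": {"default_page_size": 50, "max_page_size": 1000}
}
"""


def apply_threshold(label, score, threshold):
    """Assumed policy: low-confidence predictions are reported as neutral."""
    return label if score >= threshold else "neutral"


def clamp_page_size(requested, default, maximum):
    """Use the default when no size is requested; otherwise keep it within 1..maximum."""
    if requested is None:
        return default
    return max(1, min(int(requested), maximum))


def classify(text, config, classifier=None):
    """Classify text with FinBERT (or an injected stand-in) and apply the threshold."""
    if classifier is None:
        # Imported lazily: loading the pipeline downloads the model on first use.
        from transformers import pipeline
        classifier = pipeline("text-classification", model=config["sentiment"]["model"])
    top = classifier(text)[0]  # e.g. {"label": "positive", "score": 0.93}
    label = apply_threshold(top["label"].lower(), top["score"],
                            config["sentiment"]["confidence_threshold"])
    return {"label": label, "confidence": top["score"]}


config = json.loads(CONFIG_TEXT)
```

Calling `classify("NVDA earnings beat expectations!", config)` without a `classifier` argument would load ProsusAI/finbert through the HuggingFace pipeline.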
backend/ticker_mappings.json maps ticker symbols to company metadata:
    {
      "AAPL": {
        "company": "Apple Inc.",
        "sector": "Technology",
        "industry": "Consumer Electronics"
      }
    }
Add New Tickers: Extend this file with new entries following the same structure.
Full API documentation available in API.md
Analyze Custom Text:
    curl -X POST http://localhost:5000/api/v1/analyze \
      -H "Content-Type: application/json" \
      -d '{"text": "NVDA earnings beat expectations! Stock surging."}'
Get Posts for Specific Ticker:
    curl "http://localhost:5000/api/v1/posts?ticker=TSLA&limit=20"
Compare Multiple Stocks:
    curl "http://localhost:5000/api/v1/sentiment-comparison?tickers=AAPL,MSFT,GOOGL"
Market Pulse Overview:
    curl "http://localhost:5000/api/v1/market-pulse"
sentiment-analysis-finance-posts/
├── backend/
│   ├── app.py                          # Main Flask application
│   ├── api_utils.py                    # API response utilities
│   ├── database.py                     # Database layer (repositories pattern)
│   ├── migrations.py                   # Database schema migrations
│   ├── sentiment_analyzer.py           # FinBERT sentiment analysis engine
│   ├── ticker_extractor.py             # Stock symbol extraction logic
│   ├── industry_classifier.py          # Industry/sector classification
│   ├── reddit_rss_client.py            # Reddit RSS feed parser
│   ├── config.json                     # Application configuration
│   ├── ticker_mappings.json            # Ticker metadata (100+ stocks)
│   ├── known_tickers.json              # Ticker validation list
│   └── requirements.txt                # Python dependencies
│
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── Dashboard.jsx              # Main dashboard component
│   │   │   ├── MarketPulse.jsx            # Market overview widget
│   │   │   ├── SentimentChart.jsx         # Trend line charts
│   │   │   ├── StockComparisonChart.jsx   # Multi-ticker comparison
│   │   │   ├── IndustryHeatmap.jsx        # Sector heatmap visualization
│   │   │   ├── VolumeSentimentChart.jsx   # Volume correlation chart
│   │   │   ├── SentimentByStockChart.jsx  # Per-ticker breakdown
│   │   │   ├── PostsList.jsx              # Paginated posts view
│   │   │   ├── TickerFilter.jsx           # Ticker selection dropdown
│   │   │   ├── IndustryFilter.jsx         # Industry filter
│   │   │   ├── SectorFilter.jsx           # Sector filter
│   │   │   ├── DateRangeFilter.jsx        # Date range picker
│   │   │   ├── LoadingSpinner.jsx         # Loading state component
│   │   │   ├── ErrorMessage.jsx           # Error display
│   │   │   ├── EmptyState.jsx             # Empty data state
│   │   │   └── Tooltip.jsx                # Custom tooltip
│   │   ├── App.jsx                     # Root React component
│   │   ├── main.jsx                    # React entry point
│   │   └── index.css                   # Global styles
│   ├── index.html                      # HTML template
│   ├── vite.config.js                  # Vite build configuration
│   └── package.json                    # Node dependencies
│
├── API.md                              # Complete API documentation
├── README.md                           # This file
└── .gitignore                          # Git ignore rules
- Python 3.9+ - Backend runtime (minimum required by numpy 1.26+)
- React 19 - Frontend framework
- Flask 3.0 - Web framework
- SQLite - Embedded database
- Vite 7.3 - Frontend build tool
- transformers 4.48+ - HuggingFace library for FinBERT
- torch 2.2+ - PyTorch (CPU-optimized)
- numpy 1.26+ - Numerical computing
- axios 1.13 - HTTP client
- chart.js 4.5 - Charting library
- react-chartjs-2 5.3 - React Chart.js wrapper
- react-dom 19.2 - React DOM renderer
- flask-cors 4.0 - CORS middleware
- requests 2.31 - HTTP requests
- python-dateutil 2.8 - Date parsing
The ticker extraction system uses a multi-stage approach:
- Pattern Matching - Identifies potential ticker symbols using regex ($SYMBOL or uppercase words)
- Known Ticker Validation - Checks against a database of 100+ major stocks
- False Positive Filtering - Removes common words (CEO, PR, DD, ETA, etc.)
- Context Analysis - Validates tickers appear in financial context
Supported Formats:
- $AAPL - Dollar sign prefix
- AAPL - Plain uppercase symbols
- AAPL, MSFT - Comma-separated lists
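The first three stages can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the logic in ticker_extractor.py: the regex, the tiny `KNOWN_TICKERS` set, and the function name are all hypothetical, and the context-analysis stage is left out entirely.

```python
import re

# Hypothetical stand-ins for known_tickers.json and the false-positive list.
KNOWN_TICKERS = {"AAPL", "MSFT", "GOOGL", "TSLA", "NVDA"}
FALSE_POSITIVES = {"CEO", "PR", "DD", "ETA"}

# Stage 1: "$" followed by 1-5 letters, or a standalone run of 2-5 uppercase letters.
CANDIDATE = re.compile(r"\$([A-Za-z]{1,5})\b|\b([A-Z]{2,5})\b")


def extract_tickers(text):
    """Return known tickers mentioned in text, in order of first appearance."""
    found = []
    for dollar_form, plain_form in CANDIDATE.findall(text):
        symbol = (dollar_form or plain_form).upper()
        if symbol in FALSE_POSITIVES:      # Stage 3: drop common non-ticker words
            continue
        if symbol not in KNOWN_TICKERS:    # Stage 2: keep only validated symbols
            continue
        if symbol not in found:
            found.append(symbol)
    return found
```

For example, `extract_tickers("Comparing MSFT vs GOOGL performance")` yields both symbols, while all-caps words such as "CEO" are discarded by the false-positive filter.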
Automatically categorizes stocks into industries and sectors:
- Sectors: Technology, Financial, Healthcare, Energy, Consumer Cyclical, etc.
- Industries: Software, Semiconductors, Banking, Pharmaceuticals, etc.
- Mapping: Configured via ticker_mappings.json
- Extensibility: Easy to add new classifications
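A lookup against the mapping file might look like the sketch below. The AAPL entry mirrors the ticker_mappings.json example in this README; the MSFT and JPM entries, the "Unknown" fallback bucket, and both function names are illustrative assumptions.

```python
from collections import defaultdict

# AAPL matches the documented example; the other entries are hypothetical.
TICKER_MAPPINGS = {
    "AAPL": {"company": "Apple Inc.", "sector": "Technology", "industry": "Consumer Electronics"},
    "MSFT": {"company": "Microsoft Corporation", "sector": "Technology", "industry": "Software"},
    "JPM": {"company": "JPMorgan Chase & Co.", "sector": "Financial", "industry": "Banking"},
}


def classify_ticker(symbol, mappings):
    """Return the sector and industry for a symbol, or None if it is unmapped."""
    entry = mappings.get(symbol.upper())
    if entry is None:
        return None
    return {"sector": entry["sector"], "industry": entry["industry"]}


def group_by_sector(symbols, mappings):
    """Group symbols by sector; unmapped symbols land in an assumed 'Unknown' bucket."""
    groups = defaultdict(list)
    for symbol in symbols:
        classification = classify_ticker(symbol, mappings)
        sector = classification["sector"] if classification else "Unknown"
        groups[sector].append(symbol.upper())
    return dict(groups)
```

Because classification is a plain dictionary lookup, adding a new ticker to the mapping is enough to make it classifiable in this sketch.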
Real-time market overview providing:
- Most Discussed Stocks - Ranked by post volume
- Most Positive Stocks - Highest average sentiment scores
- Most Negative Stocks - Lowest sentiment scores
- Sentiment by Sector - Aggregated sector-level metrics
- Overall Market Sentiment - Platform-wide average
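One plausible way to compute these rankings is shown below. It assumes each analyzed post carries a list of tickers and a signed numeric score (negative for bearish, positive for bearish's opposite); the README does not specify the score scale, and `market_pulse` is a hypothetical function, not the backend's implementation.

```python
from collections import defaultdict


def market_pulse(posts, top_n=3):
    """Rank tickers by post volume and by average score, and average all posts overall."""
    scores_by_ticker = defaultdict(list)
    for post in posts:
        for ticker in post["tickers"]:
            scores_by_ticker[ticker].append(post["score"])

    averages = {t: sum(s) / len(s) for t, s in scores_by_ticker.items()}
    # Ties are broken alphabetically so the output is deterministic.
    most_discussed = sorted(scores_by_ticker, key=lambda t: (-len(scores_by_ticker[t]), t))
    most_positive = sorted(averages, key=lambda t: (-averages[t], t))
    most_negative = sorted(averages, key=lambda t: (averages[t], t))

    all_scores = [post["score"] for post in posts]
    overall = sum(all_scores) / len(all_scores) if all_scores else 0.0
    return {
        "most_discussed": most_discussed[:top_n],
        "most_positive": most_positive[:top_n],
        "most_negative": most_negative[:top_n],
        "overall_sentiment": overall,
    }
```

A post that mentions several tickers contributes its score to each of them here, which matches the many-to-many post-ticker relationship but is still a design choice.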
Sentiment Trends Chart
- Daily/weekly sentiment distribution over time
- Stacked area chart showing positive/neutral/negative
- Filterable by ticker, industry, sector, date range
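The daily and weekly distributions behind such a chart could be produced as follows. This sketch assumes `created_at` is an ISO 8601 timestamp string and that weekly buckets follow ISO week numbering; the bucket key format and function names are hypothetical.

```python
from collections import Counter
from datetime import datetime

LABELS = ("positive", "neutral", "negative")


def bucket_key(created_at, granularity):
    """Map an ISO timestamp to a daily (YYYY-MM-DD) or weekly (YYYY-Www) bucket."""
    day = datetime.fromisoformat(created_at).date()
    if granularity == "daily":
        return day.isoformat()
    if granularity == "weekly":
        iso_year, iso_week, _ = day.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    raise ValueError(f"unsupported granularity: {granularity!r}")


def sentiment_trend(posts, granularity="daily"):
    """Count positive/neutral/negative posts per bucket, sorted chronologically."""
    counts = {}
    for post in posts:
        key = bucket_key(post["created_at"], granularity)
        counts.setdefault(key, Counter())[post["label"]] += 1
    return {
        key: {label: counts[key][label] for label in LABELS}
        for key in sorted(counts)
    }
```

Each bucket always reports all three labels, including zeros, so a stacked chart gets aligned series without extra gap handling.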
Stock Comparison Chart
- Side-by-side sentiment comparison for multiple tickers
- Grouped bar chart visualization
- Shows sentiment score distributions
Industry Heatmap
- Color-coded sentiment intensity per industry
- Aggregated metrics with percentage breakdowns
- Quick visual identification of hot/cold sectors
Volume-Sentiment Correlation
- Dual-axis chart showing post volume and sentiment
- Identifies trending periods and sentiment shifts
- Useful for spotting market momentum changes
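If a single number is wanted alongside the dual-axis chart, a Pearson correlation between per-period post volume and average sentiment is one option. The README does not say which statistic, if any, the dashboard computes, so treat this as an illustrative sketch with a hypothetical function name.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either series is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    if variance_x == 0 or variance_y == 0:
        return 0.0  # correlation is undefined for a flat series; report 0 by convention
    return covariance / sqrt(variance_x * variance_y)
```

A value near +1 would mean busier days tend to be more positive, a value near -1 the opposite; with only a handful of periods the figure is too noisy to read much into.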
posts
- Primary post data (title, text, author, subreddit)
- Sentiment analysis results (label, score, confidence)
- Reddit metadata (URL, created_at, reddit_id)
tickers
- Stock symbols with company names
- Sector and industry classifications
- Unique constraint on symbol
industries
- Industry names and descriptions
- Foreign key relationships to tickers
sectors
- Sector names and descriptions
- Foreign key relationships to tickers
post_tickers (Junction Table)
- Many-to-many relationship between posts and tickers
- Enables multi-ticker posts
Optimized for common query patterns:
- idx_posts_sentiment - Fast sentiment filtering
- idx_posts_created_at - Date range queries
- idx_post_tickers_lookup - Ticker-post associations
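A schema along these lines can be sketched with Python's built-in sqlite3 module. The table and index names come from this README; the column names, types, the direction of the foreign keys (tickers referencing sectors and industries), and the indexed columns are assumptions, and the real definitions live in migrations.py.

```python
import sqlite3

# Column names and index columns are assumed for illustration.
SCHEMA = """
CREATE TABLE sectors (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, description TEXT);
CREATE TABLE industries (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, description TEXT);
CREATE TABLE tickers (
    id INTEGER PRIMARY KEY,
    symbol TEXT NOT NULL UNIQUE,
    company_name TEXT,
    sector_id INTEGER REFERENCES sectors(id),
    industry_id INTEGER REFERENCES industries(id)
);
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    reddit_id TEXT NOT NULL UNIQUE,
    title TEXT, text TEXT, author TEXT, subreddit TEXT, url TEXT,
    created_at TEXT,
    sentiment_label TEXT, sentiment_score REAL, confidence REAL
);
CREATE TABLE post_tickers (
    post_id INTEGER NOT NULL REFERENCES posts(id),
    ticker_id INTEGER NOT NULL REFERENCES tickers(id),
    PRIMARY KEY (post_id, ticker_id)
);
CREATE INDEX idx_posts_sentiment ON posts(sentiment_label);
CREATE INDEX idx_posts_created_at ON posts(created_at);
CREATE INDEX idx_post_tickers_lookup ON post_tickers(ticker_id, post_id);
"""


def create_schema(conn):
    """Create all tables and indexes on an open connection."""
    conn.executescript(SCHEMA)


def posts_for_ticker(conn, symbol):
    """Return titles of posts mentioning a ticker, newest first."""
    rows = conn.execute(
        """
        SELECT p.title
        FROM posts AS p
        JOIN post_tickers AS pt ON pt.post_id = p.id
        JOIN tickers AS t ON t.id = pt.ticker_id
        WHERE t.symbol = ?
        ORDER BY p.created_at DESC
        """,
        (symbol,),
    ).fetchall()
    return [row[0] for row in rows]
```

The junction table's composite primary key prevents the same ticker from being linked to a post twice, and the lookup index puts `ticker_id` first so ticker-filtered queries can use it.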
Backend (with auto-reload):
    cd backend
    FLASK_ENV=development python app.py
Frontend (with HMR):
    cd frontend
    npm run dev
Manual Testing:
- Fetch posts from various subreddits
- Verify sentiment classifications are reasonable
- Test all filters independently and in combination
- Validate chart data accuracy
- Check API responses against documentation
Recommended Test Scenarios:
- Positive sentiment: "AAPL earnings beat! Stock soaring"
- Negative sentiment: "TSLA recalls, stock plummeting"
- Neutral sentiment: "Fed announces rate decision today"
- Multi-ticker: "Comparing MSFT vs GOOGL performance"
Frontend Build:
    cd frontend
    npm run build
Output: frontend/dist/ (static files ready for deployment)
Backend Deployment:
- Use production WSGI server (gunicorn, uWSGI)
- Set debug: false in config.json
- Enable database backups
- Configure reverse proxy (nginx)
FinBERT Model Download Fails
    # Manually download the model
    python -c "from transformers import AutoTokenizer, AutoModelForSequenceClassification; \
    AutoTokenizer.from_pretrained('ProsusAI/finbert'); \
    AutoModelForSequenceClassification.from_pretrained('ProsusAI/finbert')"
Database Locked Error
- Ensure only one backend instance is running
- Close any SQLite browser connections
- Check file permissions on finance_sentiment.db
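If lock errors persist after these checks, a longer busy timeout and write-ahead logging are common SQLite mitigations. Whether database.py already sets either is not stated in this README, so the helper below is a hypothetical sketch rather than a description of the existing code.

```python
import sqlite3


def open_database(path, busy_timeout_seconds=10.0):
    """Open a SQLite file with a longer lock wait and write-ahead logging enabled."""
    # timeout: how long a connection waits on a locked database before raising an error.
    conn = sqlite3.connect(path, timeout=busy_timeout_seconds)
    # WAL lets readers proceed while a single writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn
```

WAL mode is persistent for the database file once set, and it creates `-wal` and `-shm` sidecar files next to the database that need the same file permissions.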
CORS Errors
- Verify Flask CORS is enabled in app.py
- Check frontend is accessing correct backend URL
- Ensure no conflicting CORS headers
No Posts Fetched from Reddit
- Verify internet connection
- Check subreddit names in config.json are valid
- Reddit RSS feeds may have rate limits (retry after delay)
Charts Not Rendering
- Check browser console for JavaScript errors
- Verify Chart.js is loaded (check network tab)
- Ensure data format matches chart component expectations
Ticker Not Detected
- Add ticker to ticker_mappings.json
- Add to known_tickers.json validation list
- Restart backend server to reload configs
Slow Sentiment Analysis:
- Use CPU-optimized PyTorch build (already configured)
- Reduce batch size if memory constrained
- Consider GPU support for high-volume deployments
Database Query Optimization:
- Indexes are pre-configured for common filters
- Use pagination (limit parameter) for large datasets
- Archive old posts to reduce table size
- User Authentication - Multi-user support with saved preferences
- Real-time Alerts - Notify on sentiment threshold changes
- Export Functionality - CSV/JSON export of filtered data
- Custom Search Queries - User-defined Reddit searches
- Historical Backtesting - Sentiment vs. actual stock performance
- WebSocket Updates - Real-time dashboard updates
- Mobile App - Native iOS/Android applications
- Advanced AI Models - Support for GPT-based analysis
- Price Integration - Live stock price overlays
- Watchlists - Save and track custom ticker lists
- Multi-language sentiment support
- Cryptocurrency ticker detection
- News aggregation from financial websites
- Social media integration (beyond Reddit)
- Sentiment-based trading signals