A machine learning project that classifies news articles as real or fake using Natural Language Processing (NLP) and serves predictions through a Flask web app. The model achieves ~99.37% accuracy using the PassiveAggressiveClassifier and TF-IDF vectorization.
The project aims to combat misinformation by building a text classification system that identifies whether a given news article or headline is real or fake. It follows a complete ML pipeline: data preprocessing, model training, evaluation, and Flask deployment.
- 🧹 Text Preprocessing with NLTK (stopword removal, stemming, normalization)
- 🧾 TF-IDF Vectorization for feature extraction
- ⚡ PassiveAggressiveClassifier for efficient text classification
- 📊 Evaluation using Accuracy, Confusion Matrix, ROC Curve, and Precision-Recall Curve
- 🌐 Flask Web App for real-time predictions
- 💾 Model Persistence using joblib
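The features above can be sketched end to end in a few lines. This is a minimal illustration rather than the contents of `src/train_model.py`: the six headlines and their labels are invented toy data, scikit-learn's built-in English stop-word list stands in for the NLTK preprocessing step (stemming is omitted), and the models are saved to a temporary directory instead of `model/`.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier

# Toy data invented for this sketch (1 = real, 0 = fake).
texts = [
    "Government publishes annual budget report",
    "Central bank holds interest rates steady",
    "City council approves new transit funding",
    "SHOCKING!!! Aliens secretly control the stock market",
    "Miracle pill cures every disease overnight, doctors furious",
    "You won't believe this secret celebrity clone conspiracy",
]
labels = [1, 1, 1, 0, 0, 0]

# Turn raw text into TF-IDF features; lowercasing and stop-word
# removal here are a stand-in for the NLTK preprocessing step.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X = vectorizer.fit_transform(texts)

# Fit the online linear classifier on the TF-IDF matrix.
model = PassiveAggressiveClassifier(max_iter=1000, random_state=42)
model.fit(X, labels)

# Persist both artifacts with joblib, then load them back as a web app would.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "fake_news_model.pkl"
    vectorizer_path = Path(tmp) / "tfidf_vectorizer.pkl"
    joblib.dump(model, model_path)
    joblib.dump(vectorizer, vectorizer_path)
    loaded_model = joblib.load(model_path)
    loaded_vectorizer = joblib.load(vectorizer_path)

# Classify an unseen headline with the reloaded artifacts.
new_features = loaded_vectorizer.transform(["Central bank publishes budget report"])
prediction = loaded_model.predict(new_features)
print("real" if prediction[0] == 1 else "fake")
```

The vectorizer has to be saved alongside the classifier because new text must be mapped into the same vocabulary and IDF weights that were learned during training.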
Fake-News-Detection/
│
├── dataSets/
│ ├── True.csv
│ ├── Fake.csv
│ └── preprocessed.csv
│
├── model/
│ ├── fake_news_model.pkl
│ └── tfidf_vectorizer.pkl
│
├── src/
│ ├── preprocess.py
│ ├── train_model.py
│ └── evaluate.py
│
├── templates/
│ └── index.html
│
├── static/
│ └── style.css
│
├── app.py
├── requirements.txt
└── README.md
git clone https://github.com/Murali-KrishnaM/Fake_News_Detector
cd Fake_News_Detector
pip install -r requirements.txt
python src/preprocess.py
python src/train_model.py
python src/evaluate.py
python app.py
Then visit http://127.0.0.1:5000/ in your browser.
| Category | Libraries |
|---|---|
| Data Handling | pandas, numpy |
| NLP | nltk |
| Machine Learning | scikit-learn |
| Visualization | matplotlib, seaborn |
| Deployment | flask |
| Model Storage | joblib |
- Accuracy: ~99.37%
- Algorithm: PassiveAggressiveClassifier
- Feature Extraction: TF-IDF Vectorization
- Evaluation Metrics: Accuracy, F1-Score, ROC, Confusion Matrix
- 🟢 Real News → Model predicts genuine articles
- 🔴 Fake News → Model flags potentially false content
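Accuracy and F1-score, two of the metrics named in the model summary above, can both be derived from the confusion matrix. The following is a small standard-library sketch of that arithmetic, not the code in `src/evaluate.py`; the label lists are made-up examples, and treating "real" (1) as the positive class is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # real articles predicted as real
    fp: int  # fake articles predicted as real
    fn: int  # real articles predicted as fake
    tn: int  # fake articles predicted as fake


def confusion_matrix(y_true, y_pred, positive=1):
    """Count the four outcome types for a binary classifier."""
    cm = ConfusionMatrix(0, 0, 0, 0)
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                cm.tp += 1
            else:
                cm.fp += 1
        else:
            if truth == positive:
                cm.fn += 1
            else:
                cm.tn += 1
    return cm


def scores(cm):
    """Derive accuracy, precision, recall, and F1 from the counts."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / total if total else 0.0
    precision = cm.tp / (cm.tp + cm.fp) if (cm.tp + cm.fp) else 0.0
    recall = cm.tp / (cm.tp + cm.fn) if (cm.tp + cm.fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration (1 = real, 0 = fake).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

cm = confusion_matrix(y_true, y_pred)
result = scores(cm)
print(cm, result)
```

On a dataset where the two classes are roughly balanced, accuracy and F1 tend to agree; they diverge when one class dominates, which is why both are worth reporting.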
The frontend is built using simple HTML/CSS with a clean and responsive design.
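For a sense of how a prediction request might flow through Flask, here is a minimal sketch. It is not the repository's `app.py`: the `/predict` route, the JSON request format, and the `predict_label` helper are all hypothetical, and the helper uses a trivial rule in place of loading the trained model and vectorizer from `model/`.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_label(text: str) -> str:
    """Hypothetical stand-in for the trained model.

    A real app would transform the text with the saved TF-IDF
    vectorizer and call the saved classifier's predict method.
    """
    return "Fake News" if "!!!" in text else "Real News"


@app.route("/predict", methods=["POST"])
def predict():
    # Accept a JSON body of the form {"text": "..."} (assumed format).
    payload = request.get_json(silent=True)
    text = payload.get("text", "") if isinstance(payload, dict) else ""
    text = str(text).strip()
    if not text:
        return jsonify(error="No text provided"), 400
    return jsonify(prediction=predict_label(text))


# Exercise the route without starting a server, using Flask's test client.
client = app.test_client()
ok_response = client.post("/predict", json={"text": "SHOCKING!!! Aliens land"})
empty_response = client.post("/predict", json={"text": "   "})
print(ok_response.get_json(), empty_response.status_code)
```

Rejecting empty input before it reaches the model keeps the classifier from returning a confident label for text that contains no features at all.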
- Integrate Deep Learning models (e.g., LSTM, BERT)
- Deploy to Render, Railway, or AWS for public access
- Expand dataset for better generalization
- Add confidence score visualization on predictions
Murali Krishna
B.Tech Artificial Intelligence & Data Science
Rajalakshmi Engineering College, Chennai
ML Enthusiast | Flask Developer
⭐ If you find this project useful, don't forget to star the repo!