A command-line machine learning tool that classifies news headlines as Real or Fake using Logistic Regression and TF-IDF vectorization.
Built as part of the Fundamentals of AI and ML course project.
You type in a news headline, and the model tells you whether it looks like real or fake news, along with a confidence score.
Enter a headline: Scientists discover cure hidden by Big Pharma
Result : FAKE
Confidence : 87.3%
Fake prob : 87.3% | Real prob: 12.7%
- A dataset of real and fake headlines is vectorized using TF-IDF (Term Frequency-Inverse Document Frequency), which converts text into numerical features based on word importance.
- A Logistic Regression model is trained on these features to learn patterns that distinguish fake headlines from real ones.
- When you enter a new headline, the model predicts its label and outputs a probability.
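The three steps above can be sketched in a few lines of scikit-learn. This is a minimal illustration, not the code in fake_news_detector.py: the four headlines and their labels are invented for the example, and the helper name classify is an assumption.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data, invented for illustration; not the project's dataset.
headlines = [
    "SHOCKING: miracle pill melts fat overnight",
    "EXPOSED: aliens secretly control the government",
    "City council approves new public transport budget",
    "Researchers publish study on climate change effects",
]
labels = ["FAKE", "FAKE", "REAL", "REAL"]

# Step 1 (vectorize) and step 2 (train) chained into one estimator.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(),
)
model.fit(headlines, labels)


def classify(headline):
    """Step 3: return (label, {class: probability}) for one headline."""
    probs = model.predict_proba([headline])[0]
    by_class = dict(zip(model.classes_, probs))
    label = max(by_class, key=by_class.get)
    return label, by_class


label, by_class = classify("Scientists discover cure hidden by Big Pharma")
print(f"Result: {label}")
print(f"Fake prob: {by_class['FAKE']:.1%} | Real prob: {by_class['REAL']:.1%}")
```

With only four training headlines the probabilities are not meaningful; the point is the shape of the flow from raw text to label and probability.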
- Python 3
- scikit-learn (Logistic Regression, TF-IDF, train/test split)
- NumPy
1. Clone the repository
git clone https://github.com/prinxeeee/fake-news-classifier.git
cd fake-news-classifier
2. Install dependencies
pip install scikit-learn numpy
3. Run the classifier
python fake_news_detector.py

When you run the script, it first trains the model and shows accuracy, then prompts you:
==================================================
FAKE NEWS HEADLINE CLASSIFIER
==================================================
Model trained on 30 headlines
Test Accuracy: 83.3%
==================================================
Try it yourself! Enter a headline below.
Type 'quit' to exit.
==================================================
Enter a headline:
Type any headline and press Enter. Type quit to exit.
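A prompt loop of this kind might look like the sketch below. It is an assumption about the general shape, not the script's actual code: run_prompt and dummy_classify are hypothetical names, and the stand-in classifier returns fixed numbers so the loop can run without a trained model.

```python
def run_prompt(classify, read=input, write=print):
    """Ask for headlines until the user types 'quit'.

    classify is any callable returning (label, fake_prob, real_prob).
    read and write are injectable so the loop can run without a terminal.
    """
    while True:
        headline = read("Enter a headline: ").strip()
        if headline.lower() == "quit":
            break
        if not headline:
            continue  # ignore empty input and ask again
        label, fake_prob, real_prob = classify(headline)
        write(f"Result : {label}")
        write(f"Confidence : {max(fake_prob, real_prob):.1%}")
        write(f"Fake prob : {fake_prob:.1%} | Real prob: {real_prob:.1%}")


def dummy_classify(headline):
    # Stand-in with fixed values, only to exercise the loop.
    return "FAKE", 0.873, 0.127


# Scripted input in place of a keyboard, so the example runs unattended.
scripted = iter(["Scientists discover cure hidden by Big Pharma", "quit"])
printed = []
run_prompt(dummy_classify, read=lambda prompt: next(scripted), write=printed.append)
print("\n".join(printed))
```

For interactive use, calling run_prompt with a real classifier and the default read and write arguments would read from the keyboard and print to the terminal.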
SHOCKING: Government puts mind control chips in vaccines
Researchers publish new study on climate change effects
EXPOSED: Secret cure for cancer hidden by Big Pharma
City council approves plan to expand public transport
Feature Extraction
- Input headlines are lowercased and stop words are removed
- TfidfVectorizer converts each headline into a numerical feature vector
- Words that are rare but distinctive get higher weight
Classification
- LogisticRegression learns the boundary between fake and real patterns
- Trained on an 80/20 train-test split
- Outputs both a label and a probability score
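The split and evaluation might be sketched as follows. The ten headlines, the random_state value, and the use of stratify are assumptions made for the example, and fitting the vectorizer on the training portion only is a common practice rather than something the project states. If all 30 of the project's headlines go through an 80/20 split, 6 are held out for testing, which would make the reported 83.3% equal to 5 of 6 correct.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data (1 = fake, 0 = real), invented for illustration.
headlines = [
    "SHOCKING: miracle pill melts fat overnight",
    "EXPOSED: aliens secretly control the government",
    "SHOCKING: secret cure hidden from the public",
    "EXPOSED: moon landing filmed in a studio",
    "You won't believe what this celebrity secretly did",
    "City council approves new public transport budget",
    "Researchers publish study on climate change effects",
    "Central bank holds interest rates steady",
    "University opens new engineering research lab",
    "Health ministry reports seasonal flu figures",
]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# 80/20 split; stratify keeps both classes present in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    headlines, labels, test_size=0.2, random_state=42, stratify=labels
)

# Fit the vectorizer on training text only so test vocabulary does not leak in.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
clf = LogisticRegression()
clf.fit(vectorizer.fit_transform(X_train), y_train)

test_features = vectorizer.transform(X_test)
accuracy = accuracy_score(y_test, clf.predict(test_features))
probabilities = clf.predict_proba(test_features)  # columns follow clf.classes_

print(f"Train size: {len(X_train)}, test size: {len(X_test)}")
print(f"Test accuracy: {accuracy:.1%}")
```

With a test set this small, a single misclassified headline moves the accuracy by a large step, so the figure is best read as a rough indication.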
fake-news-classifier/
│
├── fake_news_detector.py # Main script
└── README.md # This file
- Trained on a small dataset of 30 headlines; a real-world version would use thousands of examples
- Works best with English headlines
- Neutral-toned fake headlines may not be caught reliably
- Browser extensions that flag suspicious headlines
- Social media misinformation filters
- Journalism tools for quick credibility checks
- Educational NLP demos
Name: Prince Mahar
Registration No: 25BCE10429
Branch: B.Tech CSE (Core)
University: VIT Bhopal University
Course: CSA2001 - Fundamentals of AI and ML
This project is submitted as part of the BYOP (Bring Your Own Project) assignment for academic purposes.