This project implements and compares classic machine learning classifiers for recognizing 28x28-pixel grayscale images. The goal is to explore and optimize classification performance using interpretable, efficient models, and to produce a structured prediction output for a large, unseen test set.
- Develop a supervised learning model to classify grayscale images into predefined categories.
- Experiment with multiple classifiers and select the best based on validation accuracy.
- Generate predictions on a hidden test set for real-world-style evaluation.
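
The selection step in the objectives above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: `accuracy` and `select_best` are hypothetical helper names, and each candidate is assumed to be any callable that maps one flattened image to a predicted label (for example, a thin wrapper around a fitted classifier's prediction method).

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical type: a predictor maps one flattened image to a class label.
Predictor = Callable[[Sequence[float]], int]


def accuracy(predict: Predictor,
             images: List[Sequence[float]],
             labels: List[int]) -> float:
    """Fraction of validation images whose predicted label matches the true one."""
    if not labels:
        raise ValueError("validation set is empty")
    correct = sum(1 for x, y in zip(images, labels) if predict(x) == y)
    return correct / len(labels)


def select_best(candidates: Dict[str, Predictor],
                images: List[Sequence[float]],
                labels: List[int]) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the validation set and return the top one."""
    scores = {name: accuracy(p, images, labels) for name, p in candidates.items()}
    # max() keeps the first maximal entry, so ties go to the earlier candidate.
    best_name = max(scores, key=scores.get)
    return best_name, scores


if __name__ == "__main__":
    # Toy one-pixel "images", used only to show the call pattern.
    toy_images = [[0.0], [1.0], [2.0], [3.0]]
    toy_labels = [0, 0, 1, 1]
    toy_candidates = {
        "always_zero": lambda x: 0,
        "threshold": lambda x: int(x[0] >= 2.0),
    }
    print(select_best(toy_candidates, toy_images, toy_labels))
```

Because ties resolve to whichever candidate was listed first, two models with equal validation accuracy need an additional criterion to choose between them.
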
The dataset is available in this Google Drive folder: https://drive.google.com/drive/folders/1ctNE15BGo4FvC9w_RfdjfrgQy6tbXTnC?usp=sharing
| Model | Accuracy on Validation Set (test1.csv) |
|---|---|
| K-Nearest Neighbors (KNN) | 85.0% |
| Random Forest | 85.0% |
| Decision Tree | 76.0% |
| Naive Bayes (GaussianNB) | 61.0% |
✅ Chosen Model: KNN, which tied with Random Forest for the highest validation accuracy (85.0%), was selected based on both accuracy and generalizability.
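
To illustrate how KNN classifies an image, here is a minimal standard-library sketch. It assumes each image is flattened into a vector of pixel values (784 values for a 28x28 image), uses Euclidean distance, and takes a majority vote. The function name `knn_predict` and the default `k=3` are illustrative assumptions, not the settings used in this project.

```python
import math
from collections import Counter
from typing import List, Sequence


def knn_predict(train_images: List[Sequence[float]],
                train_labels: List[int],
                query: Sequence[float],
                k: int = 3) -> int:
    """Predict a label for `query` by majority vote among its k nearest neighbours."""
    if not 1 <= k <= len(train_images):
        raise ValueError("k must be between 1 and the number of training images")
    # Pair every training image's Euclidean distance to the query with its label,
    # then sort so the closest neighbours come first.
    neighbours = sorted(
        (math.dist(image, query), label)
        for image, label in zip(train_images, train_labels)
    )
    # Count the labels of the k closest neighbours and return the most frequent.
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Tiny 2-value "images" stand in for 784-value flattened 28x28 images.
    images = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    labels = [0, 0, 0, 1, 1, 1]
    print(knn_predict(images, labels, [0.5, 0.5]))
```

This brute-force version compares the query against every training image, so prediction cost grows linearly with the training set size. Library implementations typically offer faster neighbour search.
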
The selected model was evaluated on an unseen test set (test2.csv) through a Kaggle-style leaderboard.
| Dataset | Accuracy |
|---|---|
| Validation Set | 85.0% |
| Hidden Test Set (Kaggle) | 85.0% |
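
Leaderboard-style evaluation usually expects predictions in a CSV file. The sketch below shows one way to write such a file. The required layout is not documented here, so the column names `id` and `label`, the zero-based row index, and the file name `submission.csv` are all assumptions to be adjusted to the actual submission format.

```python
import csv
from typing import Iterable


def write_predictions(path: str,
                      predictions: Iterable[int],
                      id_column: str = "id",        # assumed header name
                      label_column: str = "label"   # assumed header name
                      ) -> None:
    """Write one row per test image: its zero-based index and predicted label."""
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([id_column, label_column])
        for row_id, label in enumerate(predictions):
            writer.writerow([row_id, label])


if __name__ == "__main__":
    # Placeholder labels; in practice these would be the model's test2.csv predictions.
    write_predictions("submission.csv", [3, 0, 7])
```
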
Clone the repository and install the dependencies:

    git clone https://github.com/your-username/image-classification-28x28.git
    cd image-classification-28x28
    pip install -r requirements.txt