ML-Driven Biomarker Identification for Early Disease Detection

This repository contains my Major Technical Project-1 (MTP-1) from the fourth year of my B.Tech. It explores the use of machine learning to identify potential biomarkers from patient metabolite data. The goal is to aid early disease diagnosis by extracting biologically significant features from large-scale sample datasets.

Problem Statement

Develop a machine learning model that classifies patient metabolite profiles as case or control and identifies a small subset of metabolites (candidate biomarkers) that are predictive of disease occurrence.
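One common way to narrow a metabolite panel to a few candidates is to rank features by a tree ensemble's importance scores. The sketch below illustrates that idea only; it is not taken from the project notebook. It assumes synthetic data from scikit-learn's `make_classification`, and the names `top_k_features` and `metabolite_names` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a metabolite table: 100 samples x 50 metabolites.
# The real data in this repository is NOT used here (assumption for illustration).
X, y = make_classification(
    n_samples=100, n_features=50, n_informative=5, n_redundant=0,
    random_state=0,
)
metabolite_names = [f"metabolite_{i}" for i in range(X.shape[1])]


def top_k_features(X, y, names, k=5, seed=0):
    """Rank features by random forest importance and return the top k names."""
    forest = RandomForestClassifier(n_estimators=300, random_state=seed)
    forest.fit(X, y)
    # Sort indices by importance, highest first, and keep the first k.
    order = np.argsort(forest.feature_importances_)[::-1][:k]
    return [names[i] for i in order]


candidates = top_k_features(X, y, metabolite_names, k=5)
print(candidates)
```

Importance rankings from a single fit can be unstable on 100 samples, so in practice it may help to repeat the ranking across cross-validation folds and keep only the metabolites that recur.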

Project Highlights

Analyzed 100 patient samples (cases + controls) for metabolite profiling.
Used MetaboAnalyst for:
- Data normalization
- Preprocessing
- PCA visualization
Applied ML models:
- Support Vector Machine (SVM)
- XGBoost
- Random Forest
Achieved 92% data similarity when validated against known datasets.
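The model comparison listed above could be run along the following lines. This is a minimal sketch under stated assumptions, not the notebook's actual code: the data is synthetic, the hyperparameters are arbitrary, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the example depends on scikit-learn alone (`xgboost.XGBClassifier` follows the same estimator interface and could take its place in the `models` dictionary).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 100 patient samples (assumption, not the project data).
X, y = make_classification(
    n_samples=100, n_features=50, n_informative=5, random_state=0
)

models = {
    # SVMs are scale-sensitive, so scaling is fitted inside each CV fold.
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    # Stand-in for XGBoost (see the note above).
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Stratified folds keep the case/control ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean accuracy = {score:.3f}")
```

Putting the scaler inside the pipeline keeps test-fold statistics out of the training step, which matters with a sample size this small.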

Tools & Technologies

Python (pandas, scikit-learn, xgboost, seaborn, matplotlib)
MetaboAnalyst (web-based metabolomics data analysis)
Jupyter Notebook
PDF report documenting the methodology and results (b20016_MTP1_Report.pdf)

Learnings

Gained hands-on experience with biomarker discovery pipelines.
Learned feature selection and model comparison in bioinformatics.
Practiced end-to-end machine learning workflows with biomedical data.
Validated findings using both biological context and model outputs.

