MEDFIT-LLM

Medical Enhancements through Domain-Focused Fine Tuning of Small Language Models

This repository contains the code, datasets, and results from our research on fine-tuning small large language models (LLMs) for AI-based healthcare chatbots. Our work demonstrates that properly fine-tuned smaller models can achieve significant improvements on healthcare-specific tasks.

📋 Overview

MEDFIT-LLM explores the efficacy of fine-tuning small LLMs for healthcare applications. We present:

  • A novel approach to dataset creation using synthetic data generation
  • Fine-tuning methodology using LoRA on MLX
  • Comprehensive performance evaluation of base models vs. fine-tuned counterparts
  • Analysis of improvements in response quality, efficiency, and structure

🔍 Research Highlights

  • Size Doesn't Always Matter: The smallest model (Llama 3.2 3B) showed the most substantial overall improvement after fine-tuning.
  • Direct Answer Improvement: Up to a 30-percentage-point increase in direct-answer capability.
  • Efficiency Gains: Up to 22.24% reduction in generation time.
  • Structural Improvements: Fine-tuned models demonstrated more organized and domain-appropriate response structures.

🧠 Models Evaluated

We fine-tuned and evaluated four LLMs:

  1. Gemma 2 9B (4-bit)
  2. Llama 3.2 3B Instruct
  3. Mistral 7B Instruct v0.3
  4. Qwen2 7B Instruct (8-bit)

📊 Key Results

| Model        | Size | Direct Answer Improvement | Generation Time Change | Response Length Change |
|--------------|------|---------------------------|------------------------|------------------------|
| Llama-3.2-3B | 3B   | +30.0%                    | +1.6%                  | +2.84%                 |
| Mistral-7B   | 7B   | +20.0%                    | -22.2%                 | -22.64%                |
| Gemma-2-9B   | 9B   | 0.0%                      | -4.7%                  | -2.72%                 |
| Qwen2-7B     | 7B   | 0.0%                      | -8.3%                  | -9.06%                 |

🛠️ Dataset Creation

Our dataset comprises:

  • 6,444 unique healthcare-related question-and-answer pairs
  • Generated using Phi-4 for synthetic data production
  • Focused on the future impact of LLMs in healthcare
  • Split into training (5,155), testing (645), and validation (644) sets
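
The reported split sizes (5,155 / 645 / 644) sum to the 6,444 total. A minimal sketch of a deterministic shuffle-and-split that reproduces those sizes, assuming a fixed seed (the function name and seed are illustrative, not taken from the repository):

```python
import random

def split_dataset(examples, train_n=5155, test_n=645, val_n=644, seed=42):
    """Shuffle Q&A pairs deterministically, then split into
    train/test/validation sets of the sizes reported above."""
    assert len(examples) == train_n + test_n + val_n
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:train_n]
    test = shuffled[train_n:train_n + test_n]
    val = shuffled[train_n + test_n:]
    return train, test, val

# Placeholder Q&A pairs standing in for the real dataset:
pairs = [{"q": f"question {i}", "a": f"answer {i}"} for i in range(6444)]
train, test, val = split_dataset(pairs)
print(len(train), len(test), len(val))  # 5155 645 644
```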

💻 Code Structure

TBA.

🚀 Getting Started

Prerequisites

  • Python 3.8+
  • MLX
  • Transformers
  • Accelerate

Installation

# Clone the repository
git clone https://github.com/adityak74/medfit-llm.git
cd medfit-llm

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Running the Code

Data Generation

Dataset on Hugging Face: https://huggingface.co/datasets/mlx-community/medfit-dataset
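
The dataset was produced with Phi-4 as the generator model. As a rough illustration of what a synthetic-generation prompt might look like (the actual prompts are not published in this README, so the wording and function name below are hypothetical):

```python
def build_generation_prompt(topic):
    """Build a synthetic Q&A generation prompt for a Phi-4-style
    instruct model. The wording is a hypothetical reconstruction."""
    return (
        "You are a medical domain expert. Write one patient-facing "
        f"question about {topic}, followed by a concise, clearly "
        "structured answer. Format:\nQ: <question>\nA: <answer>"
    )

prompt = build_generation_prompt("the future impact of LLMs in healthcare")
print(prompt)
```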

Fine-Tuning

Fine-tuned model (Llama 3.2 3B) on Hugging Face: https://huggingface.co/adityak74/medfit-llm-3B

📈 Visualization Examples

The repository includes notebooks for visualizing key metrics and comparing model performances:

  • Direct answer percentages before and after fine-tuning
  • Generation time comparison
  • Response length changes
  • Overall improvement scores

🔬 Methodology Details

Fine-Tuning Approach

  • Used LoRA (Low-Rank Adaptation) on MLX
  • Optimized for healthcare-specific responses
  • Focused on direct answer capabilities and response structure
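
A LoRA run on MLX can be driven by the `mlx_lm.lora` command-line tool from the `mlx-lm` package. The following is a sketch under assumed defaults; the model path, data layout, and hyperparameters are illustrative, not the paper's exact configuration:

```shell
# Install the MLX LLM tooling
pip install mlx-lm

# Train LoRA adapters. ./data is assumed to contain
# train.jsonl and valid.jsonl in mlx-lm's expected format.
mlx_lm.lora \
  --model mlx-community/Llama-3.2-3B-Instruct-4bit \
  --train \
  --data ./data \
  --batch-size 4 \
  --iters 1000

# Evaluate the trained adapters on ./data/test.jsonl
mlx_lm.lora \
  --model mlx-community/Llama-3.2-3B-Instruct-4bit \
  --adapter-path adapters \
  --data ./data \
  --test
```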

Evaluation Metrics

  • Direct Answer Rate: Ability to provide clear, immediate answers to healthcare questions
  • Generation Efficiency: Response generation time
  • Response Structure: Changes in formatting and organization
  • Response Length: Changes in verbosity while maintaining information quality
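
Note that the direct-answer metric is reported in percentage points, while time and length changes are relative percentages. A minimal sketch of the two calculations (function names and input values are illustrative, loosely matching the Mistral-7B row in the results table):

```python
def pct_point_change(before_rate, after_rate):
    """Direct-answer improvement, in percentage points (pp)."""
    return 100 * (after_rate - before_rate)

def pct_change(before, after):
    """Relative change, in percent (used for time and length)."""
    return 100 * (after - before) / before

# e.g. direct-answer rate rising from 50% to 70% of questions:
print(round(pct_point_change(0.50, 0.70), 1))  # 20.0 (pp)
# e.g. mean generation time dropping from 9.0 s to 7.0 s:
print(round(pct_change(9.0, 7.0), 1))          # -22.2 (%)
```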

📖 Citation

If you use this code or find our research helpful, please cite:

@INPROCEEDINGS{11042816,
  author={Rao, Aditya Karnam Gururaj and Jaggi, Arjun and Naidu, Sonam},
  booktitle={2025 2nd International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE)}, 
  title={MEDFIT-LLM: Medical Enhancements through Domain-Focused Fine Tuning of Small Language Models}, 
  year={2025},
  volume={},
  number={},
  pages={1-5},
  keywords={Training;Analytical models;Accuracy;Computational modeling;Retrieval augmented generation;Medical services;Computer architecture;Chatbots;Tuning;Synthetic data;healthcare chatbots;fine-tuning;small language models;synthetic data generation;lora;mlx;gemma;llama;mistral;qwen;response efficiency;direct answer improvement;healthcare ai;medical information dissemination;domain-specific training},
  doi={10.1109/RMKMATE64874.2025.11042816}}

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📬 Contact
