shivapreetham/indian-sign-language-detection

Sign Language Assistant

🧠 Overview

The Sign Language Assistant is a Python-based application aimed at bridging the communication gap for individuals with hearing impairments. It provides two core functionalities:

  • Voice to Sign: Converts spoken language to visual sign language using GIFs or alphabet signs.
  • Sign Detection: Detects and interprets real-time sign gestures from webcam input using a trained ML model and MediaPipe.

The app supports 35 Indian Sign Language (ISL) classes, drawing on techniques from Computer Vision, Machine Learning, and Natural Language Processing.


🚀 Features

  • 📸 Data Collection: Captures 100 images per class for 35 signs via webcam (in 10-image batches).
  • 🧪 Data Augmentation: Enhances the dataset with brightness/contrast shifts, blur, and flips (3 augmented copies per image).
  • 🧬 Feature Extraction: Uses MediaPipe Holistic landmarks (weighted hands/face/pose) as model input.
  • 🧠 Model Training: Trains a RandomForestClassifier with hyperparameter tuning.
  • 🗣️ Voice to Sign: Google Speech Recognition mapped to ISL GIFs or letters.
  • 🤟 Sign Detection: Real-time prediction with buffer-based voting and Gemini API interpretation.
  • 🖥️ UI: Built with Tkinter; interactive and easy to use.

🎥 Demo

🔧 Prerequisites

  • Python: Version 3.8+

  • Webcam: Required for real-time tasks

  • Gemini API: An interpretation service must be running at http://localhost:8000/gemini (or update the URL in the code)

  • Image/GIF Assets:

    • logo.png
    • letters/ folder with a.jpg ... z.jpg, empty.jpg
    • ISL_Gifs/ with predefined phrases (e.g., good morning.gif)

📦 Install Dependencies

pip install -r requirements.txt

๐Ÿ“ Directory Structure

sign-language-assistant/
├── data/                     # Collected raw sign images
├── augmented_data/           # Augmented images
├── letters/                  # Alphabet signs (JPGs)
├── ISL_Gifs/                 # Phrase signs (GIFs)
├── logo.png
├── data.pickle               # Extracted features
├── model.p                   # Trained model
├── confusion_matrix.png
├── feature_importances.png
├── image_gallery.html
├── *.py                      # Python scripts
└── requirements.txt

💻 Usage

▶ Launch App

python main.py

Opens a GUI with three options: Voice to Sign, Sign Detection, and Exit.

🗣 Voice to Sign

  • Converts spoken phrases → ISL GIFs or letters
  • Displays an alphabet gallery when no phrase matches

✋ Sign Detection

  • Live gesture prediction via webcam

  • Buffered voting (15 frames, 60% confidence)

  • Press:

    • G – Interpret the buffered sequence via the Gemini API
    • B – Back
    • Q – Quit
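Pressing G sends the buffered predictions to the Gemini endpoint listed under Prerequisites. A minimal sketch of that call, assuming a simple JSON {"text": ...} request body (the actual schema lives in the project's code):

```python
import json
import urllib.request

GEMINI_URL = "http://localhost:8000/gemini"  # endpoint from Prerequisites

def build_payload(letter_buffer):
    """Join the buffered predictions into one message for interpretation.
    (The {"text": ...} shape is an assumption, not the project's schema.)"""
    return json.dumps({"text": " ".join(letter_buffer)}).encode("utf-8")

def interpret_sequence(letter_buffer):
    """POST the buffered letters to the local Gemini service."""
    req = urllib.request.Request(
        GEMINI_URL,
        data=build_payload(letter_buffer),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read())
```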

🛠 Optional Scripts

📸 Data Collection

python data_collection.py

Captures 100 images per sign (10 per batch)

🧪 Data Augmentation

python data_augmentation.py

Generates 3 augmented images per original
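The three augmentations named above (brightness/contrast, blur, flip) can be sketched with plain NumPy; the script itself likely uses OpenCV, and the jitter ranges here are illustrative:

```python
import numpy as np

def augment(image: np.ndarray, seed: int = 0) -> list:
    """Return 3 augmented copies of an HxWx3 uint8 image:
    brightness/contrast jitter, box blur, and horizontal flip."""
    rng = np.random.default_rng(seed)

    # 1. Random brightness/contrast: out = alpha * img + beta, clipped to [0, 255]
    alpha = rng.uniform(0.8, 1.2)
    beta = rng.uniform(-20, 20)
    bright = np.clip(alpha * image.astype(np.float32) + beta, 0, 255).astype(np.uint8)

    # 2. Simple 3x3 box blur via shifted sums (stand-in for cv2.GaussianBlur)
    h, w = image.shape[:2]
    padded = np.pad(image.astype(np.float32), ((1, 1), (1, 1), (0, 0)), mode="edge")
    blurred = np.zeros_like(image, dtype=np.float32)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            blurred += padded[dy:dy + h, dx:dx + w]
    blurred = (blurred / 9.0).astype(np.uint8)

    # 3. Horizontal flip (mirrors left/right hands)
    flipped = image[:, ::-1].copy()

    return [bright, blurred, flipped]
```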

📊 Feature Extraction

python feature_extraction.py

Extracts MediaPipe Holistic landmarks and saves the features to data.pickle

🧠 Model Training

python model_training.py

  • Trains a RandomForest with GridSearchCV
  • Outputs model.p and evaluation plots

🧬 Technical Details

๐Ÿ–ผ๏ธ Image Handling

  • Stored in ./data/<class> or ./augmented_data/<class>
  • JPG format

โš™๏ธ Feature Extraction

  • Weights: hand = 1.0, face = 0.1, pose = 0.3
  • Normalized using min(x, y)
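A minimal sketch of this weighting and normalization, assuming landmarks arrive as (x, y) tuples and that normalization subtracts each group's minimum coordinates before weighting (function names here are illustrative):

```python
def weighted_features(landmarks, weight):
    """Flatten a list of (x, y) landmarks into a feature vector:
    shift by the minimum x/y (translation invariance), then scale by
    the group weight (hand = 1.0, face = 0.1, pose = 0.3)."""
    if not landmarks:
        return []
    min_x = min(x for x, _ in landmarks)
    min_y = min(y for _, y in landmarks)
    features = []
    for x, y in landmarks:
        features.append((x - min_x) * weight)
        features.append((y - min_y) * weight)
    return features

def frame_features(hand, face, pose):
    """Concatenate the weighted groups into one model input vector."""
    return (weighted_features(hand, 1.0)
            + weighted_features(face, 0.1)
            + weighted_features(pose, 0.3))
```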

🧠 Classifier

  • RandomForestClassifier + StandardScaler
  • Parameters: n_estimators, max_depth, min_samples_leaf, max_features
  • Outputs: confusion_matrix.png, feature_importances.png
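The scaler plus grid-searched forest can be sketched as follows; the grid below is a small illustrative subset of the tuned parameters listed above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def train(X, y):
    """Scale features, then grid-search a RandomForest. The project also
    tunes min_samples_leaf and max_features; this grid is a sketch."""
    pipeline = make_pipeline(StandardScaler(),
                             RandomForestClassifier(random_state=42))
    grid = {
        "randomforestclassifier__n_estimators": [50, 100],
        "randomforestclassifier__max_depth": [None, 10],
    }
    search = GridSearchCV(pipeline, grid, cv=3)
    search.fit(X, y)
    return search.best_estimator_
```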

🗣 Speech Module

  • Uses speech_recognition with Google API
  • Levenshtein threshold = 0.4 for matching
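A sketch of the thresholded matching with a pure-Python edit distance; it assumes 0.4 bounds the normalized distance (rather than similarity), which should be checked against the project's code:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def best_match(spoken: str, phrases, threshold: float = 0.4):
    """Return the closest phrase (e.g. an ISL_Gifs filename stem) if its
    normalized edit distance is within the threshold, else None."""
    def norm_dist(p):
        return levenshtein(spoken, p) / max(len(spoken), len(p), 1)
    best = min(phrases, key=norm_dist)
    return best if norm_dist(best) <= threshold else None
```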

📹 Detection Module

  • Real-time with OpenCV & MediaPipe
  • 15-frame buffer, 60% voting threshold
  • Gemini integration for phrase interpretation
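The 15-frame / 60% voting can be sketched as a small buffer class (the class and method names are illustrative, not the project's):

```python
from collections import Counter, deque

class VotingBuffer:
    """Smooth noisy per-frame predictions: emit a sign only when at least
    `threshold` of the last `size` frames agree (15 frames / 60% here)."""

    def __init__(self, size: int = 15, threshold: float = 0.6):
        self.frames = deque(maxlen=size)
        self.size = size
        self.threshold = threshold

    def update(self, prediction: str):
        """Add one frame's predicted class; return the winning sign or None."""
        self.frames.append(prediction)
        if len(self.frames) < self.size:
            return None  # not enough evidence yet
        sign, votes = Counter(self.frames).most_common(1)[0]
        return sign if votes / self.size >= self.threshold else None
```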

โš ๏ธ Limitations

  • ๐Ÿง Only 35 classes supported currently
  • ๐ŸŒ Speech recognition needs internet
  • ๐Ÿง  No dynamic sequence classification yet
  • ๐Ÿ“‰ Real-time inference may lag on low-end systems

🔮 Future Scope

  • Expand sign vocabulary & phrases
  • Add offline speech recognition
  • Optimize inference on low-resource devices
  • Dynamic sign sequences and sentence formation
  • Improved accessibility & GUI UX

๐Ÿค Contributing

  1. Fork the repo
  2. Create a branch: git checkout -b feature-name
  3. Commit: git commit -m 'Add feature'
  4. Push: git push origin feature-name
  5. Open a Pull Request

📜 License

MIT License – see LICENSE


๐Ÿ™ Acknowledgments
