A simple, consistent and extendable toolkit for IndicTrans2. (PyPI: https://pypi.org/project/indictranstoolkit)
MILU (Multi-task Indic Language Understanding Benchmark) is a comprehensive evaluation dataset designed to assess the performance of LLMs across 11 Indic languages.
Fine-tuned and compared 3 🤗 pre-trained Multilingual LLMs
Real-time Indic voice translation pipeline. Audio in (any Indic language) → Audio out (target language). <300ms latency. Built on AI4Bharat models.
Lightweight on-device Hindi TTS for Android & iOS — fine-tuned on AI4Bharat IndicVoices, ONNX export, runs offline on CPU in real-time.
HumanCTO's Indic Voice Pipeline — Download, transcribe, and translate audio/video in 12 Indian languages. 100% local, no API keys. Claude Code skills powered by OpenAI Whisper + vasista22 + AI4Bharat IndicWhisper fine-tuned models.
Open protocol & middleware for Indian language voice agents — STT→LLM→TTS in 22 languages
Unified API for Indian language AI — Speech-to-Text, Translation, TTS, Language ID & NLU for 22 languages. Powered by Whisper, IndicTrans2, Parler-TTS.
This repository contains Python implementations for processing multilingual text data. It addresses two distinct tasks, language classification and translation into English, each of which poses different text-processing challenges.
Setu dashboard is an all-in-one Streamlit application that lets users give feedback on the outputs of the Setu data cleaning pipeline for @AI4Bharat.
An open-source conversational AI assistant for rural & semi-urban India. Supports voice-based queries in Hindi, Punjabi, Tamil, Bengali and more. Converts speech to text (ASR), retrieves answers via RAG, translates between Indic ↔ English, and responds back with Indic TTS. Built with AI4Bharat models, Whisper, Flan-T5, IndicTrans2, and IndicLID.
Benchmarking NER on Naamapadam across 7 Indic languages. EDA + model training for Hindi/Bengali/Telugu using mBERT, XLM-R, T5, FlanT5, mT5 + LLM fine-tuning (TinyLlama, Llama-3.2, Gemma, Qwen, Mistral) + 0–5 shot inference on 9 generative models.