Tagline: Measuring and improving digital & financial literacy outcomes using machine learning, with privacy-preserving analytics and fairness audits.
This repository extends the Ujjawal Women Association's Digital Data Literacy Program into an ML-ready project. It provides reproducible pipelines to:
- Ingest anonymized training/assessment data
- Perform feature engineering for learning outcomes
- Predict retention and mastery
- Generate actionable cohort insights and dashboards
- PI: Dr. Deepa Shukla (ORCID: 0000-0003-3016-1633)
- Impact: 5,000+ women trained across India
- Ethics: De-identified, consented analytics; bias and fairness checks documented in reports/
Expected columns: participant_id (hash), age_band, region, literacy_level_baseline, module_hours, assessment_pre, assessment_post, followup_90d, dropout_flag, device_access, net_availability, income_band
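
As a rough illustration of how the columns above might be checked before ingestion, here is a minimal sketch using only the Python standard library. The helper names `missing_columns` and `pseudonymize_id` are hypothetical and not part of this repository, and the keyed SHA-256 (HMAC) scheme is an assumption: the README only states that `participant_id` is a hash, not how it is produced.

```python
import hashlib
import hmac

# Column names copied from the schema line above.
EXPECTED_COLUMNS = [
    "participant_id", "age_band", "region", "literacy_level_baseline",
    "module_hours", "assessment_pre", "assessment_post", "followup_90d",
    "dropout_flag", "device_access", "net_availability", "income_band",
]


def missing_columns(header):
    """Return the expected columns absent from a file header, in schema order."""
    present = set(header)
    return [col for col in EXPECTED_COLUMNS if col not in present]


def pseudonymize_id(raw_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 (HMAC) hex digest of a raw identifier.

    Assumption: the project's real hashing scheme is unspecified; a keyed
    hash is shown because an unkeyed hash of a guessable ID is easy to reverse.
    """
    return hmac.new(secret_key, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

A loader could call `missing_columns(reader.fieldnames)` on a `csv.DictReader` and stop early if the result is non-empty. The secret key would need to be stored outside the repository.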
- Binary classification: dropout prediction; follow-up completion
- Regression: learning gain score (assessment_post - assessment_pre)
- Uplift: treatment effect of module variants
- Clustering: learner personas
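
To make two of the targets above concrete, the sketch below derives the learning gain score and a per-group dropout rate from rows shaped like the schema. It assumes that assessment scores are numeric and that `dropout_flag` is coded 0/1, neither of which the README confirms. The function names and the three sample rows are hypothetical, invented purely for illustration.

```python
from collections import defaultdict


def learning_gain(row):
    """Regression target: post-assessment score minus pre-assessment score."""
    return float(row["assessment_post"]) - float(row["assessment_pre"])


def dropout_rate_by(rows, group_col):
    """Share of rows with dropout_flag == 1 within each value of group_col."""
    totals = defaultdict(int)
    dropouts = defaultdict(int)
    for row in rows:
        group = row[group_col]
        totals[group] += 1
        dropouts[group] += int(row["dropout_flag"])  # assumes 0/1 coding
    return {group: dropouts[group] / totals[group] for group in totals}


# Synthetic rows for illustration only; not real program data.
sample_rows = [
    {"region": "A", "assessment_pre": 40, "assessment_post": 60, "dropout_flag": 0},
    {"region": "A", "assessment_pre": 50, "assessment_post": 55, "dropout_flag": 1},
    {"region": "B", "assessment_pre": 30, "assessment_post": 45, "dropout_flag": 0},
]

gains = [learning_gain(r) for r in sample_rows]    # [20.0, 5.0, 15.0]
rates = dropout_rate_by(sample_rows, "region")     # {'A': 0.5, 'B': 0.0}
```

Comparing dropout rates across groups such as `region` or `income_band` is also a simple starting point for a fairness check, though it does not replace a full audit.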
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python src/ingest/load_data.py