irakliskhirtladze/Ka-OCR

Ka-OCR

Ka-OCR is a monorepo for an OCR project that detects Georgian (Ka) text in images and PDFs.

The project consists of several subprojects:

dataset_gen

Handles synthetic data generation, adding real image data and augmenting it, generating a unified metadata.csv, and automatically zipping and uploading the dataset to Hugging Face.
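As an illustration of the unified metadata step, here is a minimal sketch that merges synthetic and real samples into one CSV file. It is not the project's actual implementation: the column names (`file_name`, `text`, `source`) and the function name `write_unified_metadata` are assumptions made for this example.

```python
import csv
from pathlib import Path

# Hypothetical column names; the real metadata schema may differ.
FIELDNAMES = ["file_name", "text", "source"]


def write_unified_metadata(samples, out_path):
    """Write (file_name, text, source) tuples to a single CSV file.

    Returns the number of data rows written.
    """
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    # UTF-8 keeps Georgian characters intact; newline="" is what the csv module expects.
    with out_path.open("w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        for file_name, text, source in samples:
            writer.writerow({"file_name": file_name, "text": text, "source": source})
            count += 1
    return count
```

Calling `write_unified_metadata(synthetic + real, "dataset/metadata.csv")` with two lists of tuples would produce one file covering both sources, with the `source` column recording where each image came from.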

ml_training

Runs the model training script and manages checkpoints, model evaluation, and saving of the best model.
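To make the evaluation and best-model bookkeeping concrete, the sketch below computes a character error rate (CER) and tracks which epoch scored best. This README does not say which metric the project uses, so CER, the "lower is better" rule, and the names `character_error_rate` and `BestModelTracker` are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def character_error_rate(prediction: str, reference: str) -> float:
    """Edit distance normalized by reference length (assumed metric)."""
    if not reference:
        return 0.0 if not prediction else 1.0
    return edit_distance(prediction, reference) / len(reference)


@dataclass
class BestModelTracker:
    """Remembers the best (lowest) validation score seen so far."""
    best_score: Optional[float] = None
    best_epoch: Optional[int] = None

    def update(self, epoch: int, score: float) -> bool:
        """Record a score; return True when it beats the previous best."""
        if self.best_score is None or score < self.best_score:
            self.best_score = score
            self.best_epoch = epoch
            return True
        return False
```

In a training loop, a `True` result from `update` would be the signal to write the current weights to the "best model" file, while regular checkpoints are saved on their own schedule.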

api

A FastAPI-based, user-facing API for model serving.

Setup

Each subproject is managed independently with uv. Go to the root of the subproject you want to run or edit, then run the sync command to automatically set up the virtual environment and install dependencies.

For example, in case of ml_training subproject:

cd ml_training
uv sync

Then run the subproject's entry point:

uv run main

For more detailed setup and usage information, see each subproject's README file.

About

In progress: a monorepo of related projects that train ML models for Georgian OCR, wrap them in an API, and serve them online.
