A custom-made TrOCR model. It underperformed and is therefore not used.
- Defines dataset classes:
  - Bible text
  - N-grams
  - Bible with noise
  - N-grams with noise
  - Mixed datasets
- Uses `ocr/image_creator.py` to render text as images.
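
The dataset variants listed above can be pictured with a short sketch. The helper names `make_ngrams` and `add_noise` are hypothetical, and the sketch assumes that "noise" means character-level corruption of the text; it could equally mean pixel noise on the rendered image. The repository's actual dataset classes may be organised differently.

```python
import random


def make_ngrams(text: str, n: int) -> list[str]:
    """Return every run of n consecutive words in text (hypothetical helper)."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def add_noise(text: str, rate: float, rng: random.Random,
              alphabet: str = "abcdefghijklmnopqrstuvwxyz") -> str:
    """Replace each non-space character by a random letter with probability rate.

    Assumption: noise is applied to the text, not to the rendered image.
    """
    out = []
    for ch in text:
        if ch != " " and rng.random() < rate:
            out.append(rng.choice(alphabet))
        else:
            out.append(ch)
    return "".join(out)


verse = "In the beginning God created the heaven and the earth"
rng = random.Random(0)

clean_ngrams = make_ngrams(verse, 3)
noisy_ngrams = [add_noise(g, rate=0.1, rng=rng) for g in clean_ngrams]

# A "mixed" dataset could simply draw from several sources at random.
mixed = rng.sample(clean_ngrams + noisy_ngrams, k=4)
```

Each resulting string would then be passed to the image renderer to produce an (image, text) training pair.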
- Renders text into image format.
- Supports padding, grayscale conversion, and font selection.
- `Tokenizer` class for:
  - Encoding text to token IDs
  - Decoding token IDs to text
  - Vocabulary management
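
The three responsibilities above can be illustrated with a small character-level tokenizer. This is a sketch under assumptions: the class name `CharTokenizer`, the character-level granularity, and the special tokens are illustrative and may not match the project's `Tokenizer`.

```python
class CharTokenizer:
    """Hypothetical character-level tokenizer with a few special tokens."""

    SPECIALS = ["<pad>", "<bos>", "<eos>", "<unk>"]

    def __init__(self, corpus: str):
        # Vocabulary management: special tokens first, then sorted characters.
        self.id_to_token = self.SPECIALS + sorted(set(corpus))
        self.token_to_id = {tok: i for i, tok in enumerate(self.id_to_token)}

    @property
    def vocab_size(self) -> int:
        return len(self.id_to_token)

    def encode(self, text: str) -> list[int]:
        """Map text to token IDs, wrapped in <bos> ... <eos>."""
        unk = self.token_to_id["<unk>"]
        ids = [self.token_to_id.get(ch, unk) for ch in text]
        return [self.token_to_id["<bos>"], *ids, self.token_to_id["<eos>"]]

    def decode(self, ids: list[int]) -> str:
        """Map token IDs back to text, dropping <pad>, <bos> and <eos>."""
        skip = {self.token_to_id[t] for t in ("<pad>", "<bos>", "<eos>")}
        return "".join(self.id_to_token[i] for i in ids if i not in skip)


tokenizer = CharTokenizer("hello world")
ids = tokenizer.encode("hello")
text = tokenizer.decode(ids)
```

Characters that are not in the vocabulary map to `<unk>`, so encoding never fails on unseen input.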
- Custom TrOCR-based model.
- Uses a ViT encoder and an autoregressive text decoder.
- Trains the OCR model on synthetic data.
- Handles model saving, loss logging, and evaluation.
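
The notes do not say which evaluation metric is used. As an illustration only, a common choice for OCR is the character error rate (CER): the edit distance between prediction and reference divided by the reference length. The function names below are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(prediction: str, reference: str) -> float:
    """Edit distance normalised by the reference length."""
    if not reference:
        return 0.0 if not prediction else 1.0
    return edit_distance(prediction, reference) / len(reference)


cer = character_error_rate("In the begining", "In the beginning")
```

Averaging this value over a held-out set of synthetic images gives a single number that can be logged next to the training loss.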
- Loads a trained model and runs inference on new images.
- Outputs predicted text.
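
Inference with an autoregressive decoder generates one token at a time, feeding each prediction back in as input. The sketch below assumes greedy decoding, which the notes do not confirm. `step` is a hypothetical stand-in for the decoder conditioned on the encoded image, and `toy_step` is a toy replacement so that the loop can run without a trained model.

```python
from typing import Callable, Sequence


def greedy_decode(step: Callable[[list[int]], Sequence[float]],
                  bos_id: int, eos_id: int, max_len: int = 32) -> list[int]:
    """Repeatedly pick the highest-scoring next token until <eos> or max_len."""
    tokens = [bos_id]
    for _ in range(max_len):
        scores = step(tokens)  # one score per vocabulary entry
        next_id = max(range(len(scores)), key=scores.__getitem__)
        if next_id == eos_id:
            break
        tokens.append(next_id)
    return tokens[1:]  # drop the leading <bos>


def toy_step(tokens: list[int]) -> list[float]:
    """Toy decoder: emits 5, 6, 7 and then <eos> (ID 2), ignoring any image."""
    target = [5, 6, 7]
    pos = len(tokens) - 1
    want = target[pos] if pos < len(target) else 2
    return [1.0 if i == want else 0.0 for i in range(8)]


predicted_ids = greedy_decode(toy_step, bos_id=1, eos_id=2)
```

The resulting IDs would then be passed to the tokenizer's decode method to obtain the predicted text.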