Status
Deferred — depends on completion of inference externalization (#194).
Context
After #194 lands, the production OCR path will be the BentoML-backed `nvisy-inference-ocr` service called over HTTP. This works for SaaS and self-hosted Pro customers who can run a second container.
Some customers may want a zero-sidecar deployment — single Rust binary, no Python, no separate inference container. For them, an in-process Rust PaddleOCR backend remains valuable.
`oar-ocr` is the most mature option: PP-OCRv5, actively maintained, exposes raw detection polygons, Apache-2.0.
Why this is currently blocked (post-#194 perspective)
`oar-ocr` pins `ort = "2.0.0-rc.12"`. After #194:
- The workspace no longer depends on `ort` at all (in-process inference removed).
- So there is no version conflict anymore: adding `oar-ocr` would pull in `ort` for its own use only, with nothing for the resolver to reconcile against gline-rs (which is also gone from the workspace).
However, both gline-rs and SemplificaAI/gliner2-rs deliberately stay on `ort` rc.9 because rc.11 and rc.12 have shown hangs in practice. If we ever want both in-process OCR and in-process NER in the same Rust binary, the version conflict comes back. That is a constraint for future planning rather than a blocker today.
What this issue becomes
A future feature: add `OarOcrBackend` to `nvisy-ocr` behind an `in-process-ocr` cargo feature, as an alternative to the default `BentoMlBackend`. Customers who don't need GLiNER (or who use `BentoMlNerBackend` for NER but want in-process OCR) can enable it.
Triggers to revisit
- Customer demand for zero-sidecar deployment in the self-hosted Pro tier
- `ort` 2.0.0 final ships and the rc.11/rc.12 hang issues are resolved upstream
- `oar-ocr` reaches a stable v1 release
Scope when implemented
- Add `OarOcrBackend` (or similarly named) implementing `nvisy_ocr::Backend`
- Bundled PP-OCRv5 model download via the existing `nvisy_core::Downloader` pattern (mirroring `nvisy-nlp/preset/`)
- Per-call language selection in `RunParams`
- Matching `Page → Block → Line → Word` output shape
- Gated behind `in-process-ocr` cargo feature, off by default
- BentoML path stays the default
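
A minimal sketch of how the feature gate in the list above might look. Everything here is an assumption made for illustration: the real `nvisy_ocr::Backend` signature is not shown in this issue, so the trait methods, the `RunParams` and `Word` fields, the `BackendKind` enum, and `select_backend` are all hypothetical, and the actual HTTP and `oar-ocr` calls are left out.

```rust
// Hypothetical sketch: none of these signatures are taken from the real `nvisy-ocr` crate.

/// Output shape named in the scope list: Page → Block → Line → Word.
#[derive(Debug, Clone, PartialEq)]
pub struct Word {
    pub text: String,
    /// Raw detection polygon as (x, y) points; the field name is an assumption.
    pub polygon: Vec<(f32, f32)>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Line {
    pub words: Vec<Word>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub lines: Vec<Line>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Page {
    pub blocks: Vec<Block>,
}

impl Page {
    /// Joins words with spaces and lines with newlines.
    pub fn text(&self) -> String {
        self.blocks
            .iter()
            .flat_map(|b| b.lines.iter())
            .map(|l| l.words.iter().map(|w| w.text.as_str()).collect::<Vec<_>>().join(" "))
            .collect::<Vec<_>>()
            .join("\n")
    }
}

/// Per-call parameters; `language` stands in for per-call language selection.
pub struct RunParams {
    pub language: String,
}

/// Stand-in for `nvisy_ocr::Backend`; the method set is assumed.
pub trait Backend {
    fn name(&self) -> &'static str;
    fn recognize(&self, image: &[u8], params: &RunParams) -> Result<Page, String>;
}

/// Default backend: would call the `nvisy-inference-ocr` service over HTTP.
pub struct BentoMlBackend {
    pub endpoint: String,
}

impl Backend for BentoMlBackend {
    fn name(&self) -> &'static str {
        "bentoml"
    }
    fn recognize(&self, _image: &[u8], _params: &RunParams) -> Result<Page, String> {
        Err("HTTP call to nvisy-inference-ocr omitted in this sketch".to_string())
    }
}

/// Compiled only when the `in-process-ocr` feature is enabled.
#[cfg(feature = "in-process-ocr")]
pub struct OarOcrBackend;

#[cfg(feature = "in-process-ocr")]
impl Backend for OarOcrBackend {
    fn name(&self) -> &'static str {
        "oar-ocr"
    }
    fn recognize(&self, _image: &[u8], _params: &RunParams) -> Result<Page, String> {
        Err("oar-ocr pipeline call omitted in this sketch".to_string())
    }
}

pub enum BackendKind {
    BentoMl,
    InProcess,
}

/// Picks a backend; the in-process arm depends on how the crate was built.
pub fn select_backend(kind: BackendKind, endpoint: &str) -> Result<Box<dyn Backend>, String> {
    match kind {
        BackendKind::BentoMl => Ok(Box::new(BentoMlBackend { endpoint: endpoint.to_string() })),
        #[cfg(feature = "in-process-ocr")]
        BackendKind::InProcess => Ok(Box::new(OarOcrBackend)),
        #[cfg(not(feature = "in-process-ocr"))]
        BackendKind::InProcess => Err("built without the `in-process-ocr` feature".to_string()),
    }
}
```

In a default build, `select_backend(BackendKind::InProcess, ...)` returns an error instead of failing to compile, so callers can surface a clear configuration message. Whether a runtime error or a compile-time absence of the variant is preferable is a design choice this sketch does not settle.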