Future: add optional in-process PaddleOCR backend (oar-ocr) for zero-sidecar deployments

## Status

**Deferred** — depends on completion of inference externalization (#194).

## Context

After #194 lands, the production OCR path will be the BentoML-backed `nvisy-inference-ocr` service, called over HTTP. This works for SaaS and self-hosted Pro customers who can run a second container.

Some customers may want a **zero-sidecar deployment** — single Rust binary, no Python, no separate inference container. For them, an in-process Rust PaddleOCR backend remains valuable.

[`oar-ocr`](https://github.com/greatv/oar-ocr) is the most mature option: PP-OCRv5, actively maintained, exposes raw detection polygons, Apache-2.0.

## Why this is currently blocked (post-#194 perspective)

`oar-ocr` pins `ort = "2.0.0-rc.12"`. After #194:
- The workspace no longer depends on `ort` at all (in-process inference is removed).
- There is therefore no version conflict: adding `oar-ocr` would pull in `ort` only for its own use, with no resolver fight against gline-rs (which is also gone from the workspace).

**However**, both gline-rs and SemplificaAI/gliner2-rs deliberately stay on rc.9 because rc.11 and rc.12 have been observed to hang in practice. If we ever want in-process OCR and in-process NER in the same Rust binary, the `ort` version conflict comes back. Keep this constraint in mind for future planning.

## What this issue becomes

A future feature: **add `OarOcrBackend` to `nvisy-ocr` behind an `in-process-ocr` cargo feature**, as an alternative to the default `BentoMlBackend`. Customers who don't need GLiNER (or who use `BentoMlNerBackend` for NER but want in-process OCR) can enable it.
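A minimal sketch of how the feature gate could work is shown below. It is illustrative only: the `Backend` trait here is a stand-in for `nvisy_ocr::Backend`, and `select_backend`, the `name` method, and the `endpoint` field are hypothetical names that do not come from the codebase.

```rust
// Hypothetical sketch; none of these signatures are taken from the nvisy codebase.

/// Stand-in for `nvisy_ocr::Backend`; the real trait may look different.
pub trait Backend {
    /// Short identifier, used here only to show which backend was selected.
    fn name(&self) -> &'static str;
}

/// Default path: would call the external inference service over HTTP.
pub struct BentoMlBackend {
    pub endpoint: String,
}

impl Backend for BentoMlBackend {
    fn name(&self) -> &'static str {
        "bentoml"
    }
}

/// Only compiled when the `in-process-ocr` cargo feature is enabled.
#[cfg(feature = "in-process-ocr")]
pub struct OarOcrBackend;

#[cfg(feature = "in-process-ocr")]
impl Backend for OarOcrBackend {
    fn name(&self) -> &'static str {
        "oar-ocr"
    }
}

/// Picks the in-process backend if it is requested and compiled in,
/// and falls back to the BentoML backend otherwise.
pub fn select_backend(prefer_in_process: bool, endpoint: &str) -> Box<dyn Backend> {
    #[cfg(feature = "in-process-ocr")]
    {
        if prefer_in_process {
            return Box::new(OarOcrBackend);
        }
    }
    // Without the feature the flag has no effect; this silences the unused warning.
    #[cfg(not(feature = "in-process-ocr"))]
    let _ = prefer_in_process;

    Box::new(BentoMlBackend {
        endpoint: endpoint.to_string(),
    })
}
```

With the feature off (the default), `select_backend(true, ...)` still returns the BentoML backend, so builds that never enable `in-process-ocr` do not pull in `oar-ocr` or `ort`.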

## Triggers to revisit

- Customer demand for zero-sidecar deployment in the self-hosted Pro tier
- ort 2.0.0 final ships and rc.11/12 hang issues are resolved upstream
- `oar-ocr` reaches a stable v1 release

## Scope when implemented

- Add `OarOcrBackend` (or similarly named) implementing `nvisy_ocr::Backend`
- Bundled PP-OCRv5 model download via the existing `nvisy_core::Downloader` pattern (mirroring `nvisy-nlp/preset/`)
- Per-call language selection in `RunParams`
- Matching `Page → Block → Line → Word` output shape
- Gated behind `in-process-ocr` cargo feature, off by default
- BentoML path stays the default
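
The scope above could translate into types roughly like the following. This is a sketch under assumptions: the real `nvisy_ocr` definitions are not shown in this issue, so every field, the `BBox` and `OcrError` types, the `run` signature, the `text` helper, and `StubBackend` are hypothetical and only illustrate the `Page → Block → Line → Word` hierarchy and per-call language selection.

```rust
// Hypothetical shapes; the actual `nvisy_ocr` types may differ.

/// Axis-aligned bounding box in pixel coordinates (assumed representation).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BBox {
    pub x0: f32,
    pub y0: f32,
    pub x1: f32,
    pub y1: f32,
}

pub struct Word {
    pub text: String,
    pub bbox: BBox,
}
pub struct Line {
    pub words: Vec<Word>,
}
pub struct Block {
    pub lines: Vec<Line>,
}
pub struct Page {
    pub blocks: Vec<Block>,
}

/// Per-call parameters: the language is chosen per request, not per backend.
pub struct RunParams {
    pub language: String,
}

#[derive(Debug)]
pub struct OcrError(pub String);

/// Stand-in for `nvisy_ocr::Backend`.
pub trait Backend {
    fn run(&self, image: &[u8], params: &RunParams) -> Result<Page, OcrError>;
}

impl Page {
    /// Flattens the hierarchy to plain text: words joined by spaces,
    /// one output line per `Line`.
    pub fn text(&self) -> String {
        self.blocks
            .iter()
            .flat_map(|block| block.lines.iter())
            .map(|line| {
                line.words
                    .iter()
                    .map(|word| word.text.as_str())
                    .collect::<Vec<_>>()
                    .join(" ")
            })
            .collect::<Vec<_>>()
            .join("\n")
    }
}

/// Placeholder backend returning a fixed page; a real `OarOcrBackend`
/// would run detection and recognition here instead.
pub struct StubBackend;

impl Backend for StubBackend {
    fn run(&self, _image: &[u8], params: &RunParams) -> Result<Page, OcrError> {
        if params.language.is_empty() {
            return Err(OcrError("language must be set".to_string()));
        }
        let bbox = BBox { x0: 0.0, y0: 0.0, x1: 1.0, y1: 1.0 };
        let word = |text: &str| Word { text: text.to_string(), bbox };
        Ok(Page {
            blocks: vec![Block {
                lines: vec![Line {
                    words: vec![word("hello"), word("world")],
                }],
            }],
        })
    }
}
```

Keeping the output types identical across backends would let callers switch between the BentoML path and the in-process path without touching downstream code.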

Future: add optional in-process PaddleOCR backend (oar-ocr) for zero-sidecar deployments #192
