Future: reintroduce Google Cloud Vision OCR backend #202

Description

@martsokha

Status

Deferred — removed in #195 to clear the deck for the externalised inference architecture (#194). Reintroduce as opt-in when there's user demand.

Context

GoogleVisionBackend was a direct cloud OCR backend in nvisy-ocr/src/provider/google_vision/, gated behind the google-vision cargo feature (forwarded as part of google through nvisy-engine/nvisy-server/nvisy-cli — note that the google feature also enables Gemini, which stays). It called Vision's images:annotate endpoint with TEXT_DETECTION / DOCUMENT_TEXT_DETECTION features.
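For orientation, a minimal sketch of the request shape that images:annotate accepts, using only the standard library. The endpoint URL and the requests / image.content / features[].type fields are from Vision's public REST API. The names VisionFeature, ANNOTATE_URL, and annotate_request_body are hypothetical and are not taken from the removed code, which presumably used a JSON library and an HTTP client rather than string formatting.

```rust
/// The two Vision feature types the removed backend requested.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum VisionFeature {
    TextDetection,
    DocumentTextDetection,
}

impl VisionFeature {
    /// Wire name expected in the `features[].type` field.
    pub fn as_str(self) -> &'static str {
        match self {
            VisionFeature::TextDetection => "TEXT_DETECTION",
            VisionFeature::DocumentTextDetection => "DOCUMENT_TEXT_DETECTION",
        }
    }
}

/// Public REST endpoint for batch image annotation.
pub const ANNOTATE_URL: &str = "https://vision.googleapis.com/v1/images:annotate";

/// Builds the JSON body for a single-image request (hypothetical helper).
///
/// `image_base64` must already be base64-encoded. The standard base64
/// alphabet contains no characters that need JSON escaping, so the value
/// is embedded directly.
pub fn annotate_request_body(image_base64: &str, feature: VisionFeature) -> String {
    format!(
        r#"{{"requests":[{{"image":{{"content":"{}"}},"features":[{{"type":"{}"}}]}}]}}"#,
        image_base64,
        feature.as_str()
    )
}
```

A caller would POST the returned string to ANNOTATE_URL with Content-Type: application/json. DOCUMENT_TEXT_DETECTION is the variant tuned for dense text, TEXT_DETECTION for sparse text in photos.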

The new architecture (#194) centres on externalised inference services (see nvisycom/inference). Cloud OCR backends carry provider-specific auth, retry, and parsing work, and that maintenance cost isn't justified while nobody is using the backend. This issue tracks reintroducing GoogleVisionBackend once one of the triggers below applies.

What this issue becomes

Add GoogleVisionBackend back to nvisy-ocr behind a google-vision cargo feature.
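A rough sketch of what "behind a cargo feature" could look like at the type level: the variant and its dispatch arm are compiled out unless google-vision is enabled, so default builds carry no Vision code. The External variant, the api_key field, and the backend_name accessor are placeholders invented for this sketch. The real OcrBackend and OcrExtractor have their own variants and fields.

```rust
/// Hypothetical config enum; stands in for the real `OcrBackend`.
#[derive(Debug, Clone, PartialEq)]
pub enum OcrBackend {
    /// Placeholder for the externalised inference service (always compiled).
    External { endpoint: String },
    /// Exists only when built with `--features google-vision`.
    /// The `api_key` field is an assumption, not the original layout.
    #[cfg(feature = "google-vision")]
    GoogleVision { api_key: String },
}

/// Hypothetical extractor; records which backend was selected.
pub struct OcrExtractor {
    backend_name: &'static str,
}

impl OcrExtractor {
    /// Dispatches on the configured backend. The Vision arm is gated by the
    /// same feature as the variant, so the match stays exhaustive either way.
    pub fn from_config(config: &OcrBackend) -> Self {
        match config {
            OcrBackend::External { .. } => OcrExtractor {
                backend_name: "external",
            },
            #[cfg(feature = "google-vision")]
            OcrBackend::GoogleVision { .. } => OcrExtractor {
                backend_name: "google-vision",
            },
        }
    }

    pub fn backend_name(&self) -> &'static str {
        self.backend_name
    }
}
```

Gating both the variant and the match arm with the same cfg keeps a default build from referencing any Vision-specific type, which is what makes the addition opt-in.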

Triggers to revisit

  • A customer requests Vision OCR for self-hosted runtime
  • We want Vision OCR as a fallback when the externalised Bento OCR is unavailable
  • A GCP-resident deployment where calling Vision directly is simpler than running a sidecar

Scope when reintroduced

  • Restore the backend as nvisy-ocr/src/backend/google_vision_backend.rs (new backend/-based layout; the removed code lived under nvisy-ocr/src/provider/google_vision/)
  • Restore the google-vision feature on nvisy-ocr
  • Restore the OcrBackend::GoogleVision { … } variant + OcrExtractor::from_config dispatch
  • Restore feature forwarding nvisy-ocr/google-vision → nvisy-engine/google → nvisy-server/google → nvisy-cli/google (the google feature continues to also enable nvisy-agent/google-gemini; the OCR addition is purely additive)
  • Auth: API key (header or query param) or service-account JWT; both have historical precedent
  • Map polygon vertices into the new Polygon primitive on ImageLocation
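The auth and polygon items above could be sketched roughly as follows. Assumptions: Polygon here is a stand-in with an invented points field (the real primitive on ImageLocation may differ), VisionVertex, VisionAuth, and apply are hypothetical names, and the service-account path is reduced to "a bearer token already obtained from the JWT exchange". The X-Goog-Api-Key header and the key query parameter are the two standard ways Google APIs accept an API key.

```rust
/// A vertex from Vision's `boundingPoly.vertices`. Vision may omit a
/// coordinate whose value is 0, hence the `Option`s.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct VisionVertex {
    pub x: Option<i32>,
    pub y: Option<i32>,
}

/// Hypothetical stand-in for the `Polygon` primitive on `ImageLocation`.
#[derive(Debug, Clone, PartialEq)]
pub struct Polygon {
    pub points: Vec<(f32, f32)>,
}

/// Maps pixel-space vertices into a polygon, treating omitted coordinates as 0.
pub fn polygon_from_vertices(vertices: &[VisionVertex]) -> Polygon {
    Polygon {
        points: vertices
            .iter()
            .map(|v| (v.x.unwrap_or(0) as f32, v.y.unwrap_or(0) as f32))
            .collect(),
    }
}

/// The auth options listed in the scope (hypothetical type).
pub enum VisionAuth {
    ApiKeyHeader(String),
    ApiKeyQuery(String),
    /// Access token obtained from a service-account JWT exchange (not shown).
    BearerToken(String),
}

impl VisionAuth {
    /// Returns the URL to call and an optional (name, value) header to attach.
    /// Assumes the API key is URL-safe, so no percent-encoding is applied.
    pub fn apply(&self, base_url: &str) -> (String, Option<(&'static str, String)>) {
        match self {
            VisionAuth::ApiKeyHeader(key) => {
                (base_url.to_string(), Some(("X-Goog-Api-Key", key.clone())))
            }
            VisionAuth::ApiKeyQuery(key) => (format!("{}?key={}", base_url, key), None),
            VisionAuth::BearerToken(token) => (
                base_url.to_string(),
                Some(("Authorization", format!("Bearer {}", token))),
            ),
        }
    }
}
```

Preferring the header form keeps the key out of request URLs, which tend to end up in logs. The query-param form is shown only because the issue lists it as having precedent.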

Reference

The deleted code is preserved in git history; the last commit including it is the parent of the removal commit on #195.

Metadata

Labels: feat (request for or implementation of a new feature), ocr (OCR backends and providers)

Project status: Backlog
