Status
Deferred — removed in #195 to clear the deck for the externalised inference architecture (#194). Reintroduce as opt-in when there's user demand.
Context
AwsTextractBackend was a direct cloud OCR backend in nvisy-ocr/src/provider/aws_textract/, gated behind the aws-textract cargo feature (forwarded as amazon through nvisy-engine/nvisy-server/nvisy-cli). It signed requests with AWS SigV4 over reqwest-middleware, and the runtime called Textract's AnalyzeDocument API directly.
The new architecture (#194) centres on externalised inference services (see nvisycom/inference) called over HTTP. Cloud OCR backends — Textract, Vision, DocAI — are conceptually closer to that path than to the in-process model backends, but each carries provider-specific request signing, retry, and error-shape work that doesn't pay off until a user actually deploys against it.
Rather than carry three dormant cloud backends through every refactor, #195 deletes them and this issue tracks the reintroduction of the Textract one.
What this issue becomes
Add AwsTextractBackend back to nvisy-ocr behind an aws-textract cargo feature when a customer or first-party deployment wants it.
Triggers to revisit
- A customer requests Textract OCR for self-hosted runtime
- We want Textract OCR as a fallback when the externalised Bento OCR is unavailable
- We want a quick "no extra infrastructure" path for AWS-resident deployments
Scope when reintroduced
- Restore nvisy-ocr/src/backend/aws_textract_backend.rs (mirroring the new backend/-based layout)
- Restore the aws-textract feature on nvisy-ocr (with sha2 + hmac deps for SigV4)
- Restore the OcrBackend::AwsTextract { … } variant + OcrExtractor::from_config dispatch
- Restore feature forwarding nvisy-ocr/aws-textract → nvisy-engine/amazon → nvisy-server/amazon → nvisy-cli/amazon
- Auth: pull AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_REGION from env or config
- Map AWS confidence (0–100) → 0.0..=1.0 on the wire types
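The last two scope items (auth lookup and confidence mapping) could look roughly like the sketch below. It is not the deleted implementation: AwsCredentials, credentials_from, credentials_from_env, and normalize_confidence are hypothetical names chosen for illustration, and treating empty values as missing, clamping out-of-range scores, and mapping NaN to 0.0 are assumptions rather than confirmed behaviour.

```rust
use std::env;

/// Hypothetical credential bundle; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct AwsCredentials {
    pub access_key_id: String,
    pub secret_access_key: String,
    pub region: String,
}

/// Build credentials from any key/value source (env, config map, test stub).
/// Assumption: an empty value is treated the same as a missing one.
pub fn credentials_from(
    lookup: impl Fn(&str) -> Option<String>,
) -> Result<AwsCredentials, String> {
    let get = |key: &str| {
        lookup(key)
            .filter(|value| !value.is_empty())
            .ok_or_else(|| format!("missing {key}"))
    };
    Ok(AwsCredentials {
        access_key_id: get("AWS_ACCESS_KEY_ID")?,
        secret_access_key: get("AWS_SECRET_ACCESS_KEY")?,
        region: get("AWS_REGION")?,
    })
}

/// Convenience wrapper that reads the three variables from the process environment.
pub fn credentials_from_env() -> Result<AwsCredentials, String> {
    credentials_from(|key| env::var(key).ok())
}

/// Map a Textract confidence score (0-100) onto 0.0..=1.0.
/// Assumption: out-of-range values are clamped and NaN becomes 0.0.
pub fn normalize_confidence(aws_confidence: f32) -> f32 {
    if aws_confidence.is_nan() {
        return 0.0;
    }
    (aws_confidence / 100.0).clamp(0.0, 1.0)
}
```

Passing the lookup as a closure keeps the env-or-config choice out of the parsing logic: a config-backed source can supply its own closure, and tests can stub values without mutating the process environment.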
Reference
The deleted code is preserved in git history; the last commit including it is the parent of the removal commit on #195.