Future: add optional in-process GLiNER backend (gliner2-rs) for zero-sidecar deployments #193

Description

@martsokha

Status

Deferred — depends on completion of inference externalization (#194).

Context

After #194 lands, the production NER path will be the BentoML-backed `nvisy-inference-ner` service, called over HTTP. GLiNER lives inside that Python container, so the choice between the `gline-rs` and `gliner2-rs` Rust crates becomes moot: the container uses the upstream Python `gliner` package.

Some customers may want a zero-sidecar deployment — single Rust binary, no Python, no separate inference container. For them, an in-process Rust GLiNER backend remains valuable.

What this issue becomes

A future feature: add `Gliner2Backend` to `nvisy-nlp` behind an `in-process-ner` cargo feature, as an alternative to the default `BentoMlNerBackend`.
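
A minimal sketch of how the feature gate could look, assuming a much simplified `NerBackend` trait. The trait shape, the `BackendChoice` enum, the `build_backend` helper, and the endpoint string are hypothetical and only illustrate the gating; the real `nvisy-nlp` types may differ.

```rust
/// Hypothetical stand-in for `nvisy_nlp::NerBackend`; the real trait may differ.
pub trait NerBackend {
    fn name(&self) -> &'static str;
}

/// Default backend: talks to the external inference service over HTTP.
pub struct BentoMlNerBackend {
    pub endpoint: String,
}

impl NerBackend for BentoMlNerBackend {
    fn name(&self) -> &'static str {
        "bentoml"
    }
}

/// Only compiled when the crate is built with `--features in-process-ner`.
#[cfg(feature = "in-process-ner")]
pub struct Gliner2Backend;

#[cfg(feature = "in-process-ner")]
impl NerBackend for Gliner2Backend {
    fn name(&self) -> &'static str {
        "gliner2"
    }
}

/// Hypothetical runtime selector for the backend.
pub enum BackendChoice {
    BentoMl,
    InProcess,
}

/// Builds the requested backend, or reports that the in-process backend
/// was not compiled into this binary.
pub fn build_backend(
    choice: BackendChoice,
    endpoint: &str,
) -> Result<Box<dyn NerBackend>, String> {
    match choice {
        BackendChoice::BentoMl => Ok(Box::new(BentoMlNerBackend {
            endpoint: endpoint.to_string(),
        })),
        #[cfg(feature = "in-process-ner")]
        BackendChoice::InProcess => Ok(Box::new(Gliner2Backend)),
        #[cfg(not(feature = "in-process-ner"))]
        BackendChoice::InProcess => {
            Err("built without the `in-process-ner` feature".to_string())
        }
    }
}
```

With the feature off, requesting the in-process backend fails with a clear error instead of silently falling back, which keeps the default build free of the `gliner2-rs` and `ort` dependencies.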

When this becomes worth doing:

  • `SemplificaAI/gliner2-rs` publishes to crates.io with a 1.0 commitment
  • `ort` 2.0.0 final is released and stable (the hang issues in the current rc.11/rc.12 are resolved upstream)
  • Customer demand for zero-sidecar deployment

Scope when implemented

  • Add `Gliner2Backend` implementing `nvisy_nlp::NerBackend`
  • Map our `GlinerMode { Span, Token }` enum to the v2 facade
  • Wire HuggingFace download via the existing `nvisy_core::Downloader` preset machinery
  • Gate the backend behind the `in-process-ner` cargo feature, off by default
  • Keep the BentoML path as the default
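
The first two scope items could be sketched roughly as follows. This assumes a simplified `NerBackend` trait and uses a placeholder `facade` module in place of `gliner2-rs`; `facade::Decoding`, `facade::Engine`, `Entity`, and the `extract` signature are hypothetical names, not the crate's real API, which is not yet published on crates.io.

```rust
/// Our mode enum, as named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GlinerMode {
    Span,
    Token,
}

/// Hypothetical stand-in for the gliner2-rs facade; NOT the crate's real API.
pub mod facade {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Decoding {
        SpanLevel,
        TokenLevel,
    }

    pub struct Engine {
        pub decoding: Decoding,
    }

    impl Engine {
        pub fn new(decoding: Decoding) -> Self {
            Engine { decoding }
        }

        /// Placeholder: a real engine would run the model here.
        /// Returns (start, end, label) triples; always empty in this sketch.
        pub fn predict(&self, _text: &str, _labels: &[&str]) -> Vec<(usize, usize, String)> {
            Vec::new()
        }
    }
}

/// Maps our mode enum onto the (assumed) facade decoding mode.
impl From<GlinerMode> for facade::Decoding {
    fn from(mode: GlinerMode) -> Self {
        match mode {
            GlinerMode::Span => facade::Decoding::SpanLevel,
            GlinerMode::Token => facade::Decoding::TokenLevel,
        }
    }
}

/// Hypothetical entity type returned by a backend.
#[derive(Debug, PartialEq)]
pub struct Entity {
    pub start: usize,
    pub end: usize,
    pub label: String,
}

/// Hypothetical stand-in for `nvisy_nlp::NerBackend`.
pub trait NerBackend {
    fn extract(&self, text: &str, labels: &[&str]) -> Vec<Entity>;
}

pub struct Gliner2Backend {
    engine: facade::Engine,
}

impl Gliner2Backend {
    pub fn new(mode: GlinerMode) -> Self {
        Self {
            engine: facade::Engine::new(mode.into()),
        }
    }

    pub fn decoding(&self) -> facade::Decoding {
        self.engine.decoding
    }
}

impl NerBackend for Gliner2Backend {
    /// Delegates to the facade and converts its triples into `Entity` values.
    fn extract(&self, text: &str, labels: &[&str]) -> Vec<Entity> {
        self.engine
            .predict(text, labels)
            .into_iter()
            .map(|(start, end, label)| Entity { start, end, label })
            .collect()
    }
}
```

Keeping the mode conversion in a `From` impl means callers only ever see `GlinerMode`, so a later change in the facade's mode type would stay local to the backend module.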

Related

Metadata

    Labels

    feat: request for or implementation of a new feature
    nlp: NER backends, language detection, tokenization

    Projects

    Status
    Backlog
