Skip to content

refactor + feat: code quality + Presidio-aligned shipped pattern set + chunk context propagation#279

Merged
martsokha merged 14 commits into
mainfrom
refactor/code-quality
Jun 16, 2026
Merged

refactor + feat: code quality + Presidio-aligned shipped pattern set + chunk context propagation#279
martsokha merged 14 commits into
mainfrom
refactor/code-quality

Conversation

@martsokha

@martsokha martsokha commented Jun 14, 2026

Copy link
Copy Markdown
Member

Summary

13 commits split into three loose groups; full list at the bottom.

Architecture / refactor

  • Extract nvisy-context crate. Engine wires the per-label keyword-boost enhancer as a wrapping layer around recognizers. Context owns BoostRule, Enhancer, SubstringMatcher, LemmaMatcher, ContextEnhanced<R>, Tokens/Token.
  • nvisy-context module split. enhancer/{mod,context,window}.rs, matching/{mod,matcher,lemma}.rs, io/{mod,tokens,wrapper}.rs. Public surface stays at the crate root via re-exports.
  • Rename DetectorPatternRegex in nvisy-pattern. Detection rule is Regex (with Vec<Variant>), matching Presidio's recognizer shape. Inline PatternRegistry into PatternRecognizerBuilder.
  • Country scope. Add CountryCode to nvisy-core::primitive (ISO 3166-1 alpha-2 via celes); RecognizerInput.country: Option<CountryCode> + applies_to_country; per-call filtering inside PatternRecognizer::recognize. Shipped pattern + dictionary assets reorganized into world/, us/, uk/ subtrees.
  • Per-language context keywords with primary-subtag matching via LanguageTag::matches.
  • SuppressionLayer in nvisy-toolkit::deduplication — allow-list false-positive filter slotted between fuse and resolve.

Presidio-aligned shipped pattern set

  • Audited every shipped pattern (15 world + 10 US + 6 UK) against Microsoft Presidio. Findings written up and addressed.
  • Tier 1 correctness bugs: IBAN regex accepts UK/IE/MT/hyphenated forms; private_key matches the full PEM block; us/medical_license gains \b anchors; uk.nino rejects O at position 0; us/passport gains context; us/postal_code drops score 0.5→0.1 + gains context + us.postal_code validator that rejects 00000.
  • Tier 2 coverage: bitcoin Taproot (Bech32m up to 62 chars) + Base58Check validator via bs58; credit_card adds Mastercard 2-series + score 0.5→0.3 (Presidio parity); aws_key broadens to 7 prefixes + secret-key variant; github_pat_; generic_api_key accepts whitespace separator (Authorization: Bearer); uk DL + vehicle-reg validators (Presidio parity).
  • world/phone validator now uses the phonenumber crate (region-aware libphonenumber port) instead of a regex + length check.
  • 3 UK validators added: uk.driving_licence, uk.vehicle_registration, plus existing uk.nhs + uk.nino extended.
  • Drive-by: rig 0.38's Gemini provider now implements CompletionClient, so the gemini::completion::CompletionModel::new workaround is gone.

Chunk-level context propagation

  • RecognizerInput.context_hints: Vec<String> — out-of-band context strings the enhancer should treat as in-context. Threaded through ContextEnhanced wrapper.
  • Enhancer::enhance takes a Context bundle (text + tokens + language + hints) instead of four loose args. Hint-path runs as a fallback when the in-text word window doesn't fire; at most one boost per rule per entity.
  • Chunk<M> gains pub hints: Vec<String> populated by next_chunk alongside data and location. No new Handler trait method, no double traversal.
  • CSV handler populates from chunk.location.column_name.
  • HTML handler populates from the nearest block-level ancestor's text (curated is_block_element set — html5ever/scraper don't expose one). <p>...the payment card <code>4111…</code> is on file</p> gives the <code> chunk a hint of "the payment card is on file".
  • JSON handler populates from the enclosing object key (threaded through parse_value/parse_array; array elements inherit the containing key).
  • Toolkit detect() reads chunk.hints directly and forwards to RecognizerInput::with_context_hints.

Tests + style

  • tests/shipped_detection.rs split into per-region binaries (builtin.rs, builtin_us.rs, builtin_uk.rs) with shared tests/fixtures/mod.rs. 11 region+domain tests; substring + label assertions with #[track_caller].
  • testdata/inputs/ reorganized to mirror the asset tree: inputs/{contact,credentials,finance,network,personal}.txt (world) + inputs/us/{identity,finance,health}.txt + inputs/uk/{identity,contact,vehicle}.txt.
  • Workspace-wide import-style refactor: hoisted ~94 inline fully-qualified paths (std + crate paths) into use statements across 51 files. Exceptions for #[async_trait::async_trait] and tracing::* macros/attrs preserved.

Test plan

  • cargo test --workspace — all green
  • cargo clippy --workspace --all-targets -- -D warnings
  • RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps
  • cargo test -p nvisy-pattern --doc
  • e2e codec tests: CSV, HTML, JSON, XLSX, DOCX, PDF, TXT — all detect payment_card (CC=0.3) via the new context-hint path
  • 17 enhancer unit tests (including 2 new hint-path tests)
  • 11 shipped_detection e2e tests across 3 region binaries

Commits

e34babce refactor: hoist fully-qualified paths into use imports
c63a0e40 feat(codec,toolkit): chunk-level context hints for HTML + JSON
2137db58 feat(pattern,context): Presidio-aligned audit fixes + tabular context propagation
ae6a6409 test(pattern): split shipped_detection into per-region e2e binaries
85940d9f feat(pattern,core): country scope + Presidio-aligned shipped pattern set
40d49750 feat(toolkit): SuppressionLayer with allow-list false-positive filter
756a9fa0 feat(pattern,context): per-language context keywords + primary-subtag matching
ae12d261 refactor(pattern,context): rename Boosting→ContextEnhanced, split build()
628227cf refactor(pattern): inline VariantBuilder, drop Terms, normalize all docs
111bbac0 refactor(pattern,context): rename Pattern→Regex, inline registries, normalize docs
682913cb style: cargo fmt
534c1c73 chore(deps): clean up workspace deps + normalize per-crate manifests
f25f21cb refactor(context): extract nvisy-context crate + wire engine enhancer pass

🤖 Generated with Claude Code

martsokha and others added 5 commits June 14, 2026 13:29
… pass

Lifts crate::context out of nvisy-core into a sibling nvisy-context crate so
the SDK base stays primitives-only for third-party recognizer authors. Adds
NerRecognizer::context_registry (mirroring PatternRegistry::context_registry)
and wires ContextEnhancer into DetectionPhase: build_for_request now returns
DetectionResources { recognizers, enhancer }, the enhancer runs in block-local
coordinates between recognizer dispatch and modality lifting, and the
substring path runs by default (Tokens artifact wiring follows when an
NlpEngine is plumbed in).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drops 9 unused workspace deps (hmac, include_dir, quick-xml, reqwest,
serde_with, smallvec, stop-words, walkdir, zip) and reorders the root
[workspace.dependencies] foundation-first: primitives → runtime → domain
(text/document/image/audio) → integration (HTTP, AI, server, CLI) →
storage → utilities. Removes per-crate machete-flagged deps and aligns
every crate manifest with the new group names and order. Keeps calamine
and unicode-segmentation in workspace deps for upcoming xlsx + word-boundary
work. Marks humantime-serde as ignored in nvisy-llm/nvisy-engine/nvisy-server
where it's used via serde `with =` strings.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ormalize docs

- nvisy-pattern: rename Detector→Pattern→Regex; inline PatternRegistry
  into PatternRecognizerBuilder; split CompiledPattern out; export
  built-in validators by bare-noun names (luhn, iban, ssn, phone, date);
  add Scoring::get + per-column resolution; convert pattern assets to
  TOML; normalize module/function docs (returns-form for predicates,
  reference-form doc-links, # Errors + code examples for public types).
- nvisy-context: extract registry/declaration into rule + wrapper;
  trim enhancer/matcher/tokens surface.
- nvisy-toolkit: drop stale PatternRegistry usage in pipeline example;
  fix broken rustdoc links in redaction module.
- nvisy-engine, nvisy-ner: knock-on updates for the new pattern and
  context surfaces.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Variant: replace derive_builder with `new(regex)?` +
  `with_score` / `with_validator` chain (matches Term::new style).
- Drop the Terms newtype; Dictionary::terms is `Vec<Term>` and the
  parsers move to associated fns on Term:
  - Term::from_text(&str) -> Vec<Term>   (infallible)
  - Term::from_csv(&str) -> Result<Vec<Term>, Error>
  Signatures now match Regex::from_toml / Dictionary::from_toml.
- Rewrite every public-item docblock in nvisy-pattern for a
  consistent style: noun-phrase openers for types, imperative for
  constructors/setters, returns-form for predicates, reference-form
  doc-links at the bottom, `# Errors` only where fallible, code
  examples on top-level types.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@martsokha martsokha self-assigned this Jun 14, 2026
@martsokha martsokha added pattern refactor code restructuring without behavior change labels Jun 14, 2026
martsokha and others added 2 commits June 14, 2026 23:40
…ld()

- nvisy-context: `Boosting<R>` → `ContextEnhanced<R>` (more
  self-descriptive; reads as "an R that's been context-enhanced").
- nvisy-pattern: `PatternRecognizerBuilder::build()` now returns
  the bare `PatternRecognizer`; the wrapped form moves to
  `build_context_enhanced() -> ContextEnhanced<PatternRecognizer>`.
  Callers opt into the keyword-boost layer explicitly.
- Engine config, shipped-detection / user-rules / enhancer
  roundtrip tests, toolkit fixtures + example flipped to
  `build_context_enhanced()` to preserve prior behavior.
- README + module/struct docs rewritten to describe both methods
  without historical framing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… matching

- nvisy-pattern: new `Context` enum (Global | PerLanguage) replaces
  `Vec<String>` on Regex and Dictionary. Untagged serde keeps the
  flat TOML form working unchanged; new form is `[context.en] = [...]`.
  Shipped phone, credit_card, date_of_birth, datetime patterns now
  carry EN/ES/DE/FR keyword sets.
- nvisy-context: BoostRule gains `language: Option<LanguageTag>`.
  Enhancer storage flips to `HashMap<Label, Vec<BoostRule>>` —
  one bucket per label, distinct language scopes inside.
  Enhancer::enhance takes a language hint; ContextEnhanced
  threads input.language through.
- nvisy-core: LanguageTag::new returns nvisy_core::Error;
  LanguageTag::matches compares primary subtags case-insensitively
  so `en` matches `en-US` / `en-GB`. BoostRule::applies_to_language
  and RecognizerInput::applies_to_language switch from `==` to
  `matches()` so language-scoped rules fire under regional variants.
- Tests: TOML round-trip both forms; per-language boost fires for
  matching language; no boost for non-matching language; no-hint
  unions all per-language keywords; regional variants
  (`en-US`) trigger `en`-scoped rules.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
martsokha and others added 6 commits June 15, 2026 06:51
- New `nvisy-toolkit::deduplication::suppress::{SuppressionLayer,
  SuppressionParams}`. Three independent allow-list shapes apply
  by union:
  - `allow_values` — exact, ASCII case-insensitive
  - `allow_values_substring` — entity text contains the value
  - `allow_values_regex` — regex matched against entity text
- Operates on the entity's resolved text via `TextAt::text_at`.
  Fail-open when the resolver returns `None`: keep the entity
  rather than silently drop something we can't verify.
- Empty entries are filtered at construction (otherwise an empty
  substring would drop every entity via `str::contains("")`).
- `LayerParams` gains a nested `suppression: SuppressionParams`
  field; `LayerPipeline::from_params` becomes fallible and inserts
  the new layer between fuse and resolve, growing the canonical
  recipe to: calibrate → filter → fuse → suppress → resolve.
- Six `from_params` call sites updated (engine pipeline, engine
  tests, toolkit fixtures, toolkit example).
- 15 unit tests in `suppress::tests` (exact / substring / regex
  modes, case insensitivity, partial-overlap semantics, empty-
  entry filtering, union across modes, unresolved-location
  fail-open, invalid-regex error). Plus a pipeline-order test in
  `pipeline::tests` that pins the architectural intent that fuse
  collapses before suppress sees, suppress drops before resolve
  adjudicates.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- nvisy-core: new `CountryCode` (ISO 3166-1 alpha-2, validated via
  `celes`). `RecognizerInput.country: Option<CountryCode>` +
  `applies_to_country` mirror the existing language scoping.
- nvisy-pattern: `Regex.countries` / `Dictionary.countries`
  (Vec<CountryCode>, empty = world). `PatternRecognizer::recognize`
  honours per-call jurisdiction hints alongside language hints.
- Asset tree reorganized into `world/`, `us/`, `uk/` subtrees;
  shipped accessors split into per-region modules
  (`shipped::patterns::{world,us,uk}`, dictionaries::world).
  Macro helpers exported as `__shipped_pattern` /
  `__shipped_dictionary` so sub-modules resolve their own
  include_str! paths.
- Pattern count grows 23 → 34: world unchanged at 18; us 5 → 10
  (+itin, npi, mbi, bank_account, medical_license); uk added at 6
  (nhs, nino, driving_licence, postcode, vehicle_registration,
  passport).
- Validators split into per-country sub-modules with dotted names
  (`us.ssn`, `us.aba_routing`, `us.npi`, `us.dea_number`,
  `uk.nhs`, `uk.nino`). Shared `luhn`, `iban`, `phone`, `date`
  stay flat. World pattern set extended (brand-aware credit_card,
  RFC5322-loose email, Cisco-form MAC, IPv4 CIDR, comprehensive
  IPv6 alternation set).
- Pattern scores normalized to a single conservative-baseline
  scheme (most regex-only matches land at 0.1–0.5 before context
  boost). Confidence threshold in the toolkit test fixture
  lowered to 0.35 to match.
- `assets/NOTICE.md` documents third-party regex provenance for
  the shipped pattern assets.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Restructure pattern crate end-to-end tests:

- Replace single tests/shipped_detection.rs with three per-region
  binaries: tests/builtin.rs (5 world tests), tests/builtin_us.rs
  (3 US tests), tests/builtin_uk.rs (3 UK tests). 11 tests total.
- Move shared scan + assert_match + assert_label_present helpers to
  tests/fixtures/mod.rs, declared via mod fixtures; from each binary.
  Both helpers carry #[track_caller] for better failure attribution.
- Reshape testdata/inputs/ to mirror the asset tree:
  - world fixtures move from monolithic domain files into
    inputs/{contact,credentials,finance,network,personal}.txt
  - inputs/us/{identity,finance,health}.txt
  - inputs/uk/{identity,contact,vehicle}.txt (split from old uk.txt)
- Each test scans one fixture, asserting substring + label matches
  against a recognizer loaded with every shipped pattern and
  dictionary via build_context_enhanced.
- builtin_uk_identity asserts NATIONALITY (world dictionary firing
  on "British") to keep assert_label_present reachable across all
  three binaries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… propagation

Tier 1 (correctness bugs):

- world/iban regex: extend middle groups from \d{4} to [A-Z0-9]{4}
  and separator \s? to [\s\-]?; accepts IBANs with letters past
  position 8 (UK NWBK, IE, MT, MK, GI) and hyphenated forms that
  the mod-97 validator was already prepared to handle.
- world/private_key: match the full BEGIN..END PEM block instead
  of only the header line; add ENCRYPTED PRIVATE KEY, PGP, SSH2,
  and PuTTY-User-Key-File-{2,3} variants.
- us/medical_license: add \b anchors to both DEA variants; prior
  pattern matched inside longer alphanumeric tokens.
- uk.nino validator: reject O as the position-0 letter (HMRC
  reserved); the character class blocks D/F/I/Q/U/V but allows
  O via the j-p range.
- us/passport: add Presidio's context = [passport, passport#,
  travel document, us passport, united states passport] so the
  0.1-base pattern can boost above threshold.
- us/postal_code: drop score 0.5 -> 0.1, add context, ship a
  us.postal_code validator that rejects 00000.

Tier 2 (coverage + scoring):

- world/bitcoin_address: split legacy (Base58) from Bech32; bump
  Bech32 cap {25,39}->{25,59} for Taproot (bc1p...). Add a
  crypto.btc validator using bs58::decode_check.
- world/credit_card: add Mastercard 2-series (2221-2720); drop
  score 0.5 -> 0.3 to match Presidio's deliberate baseline that
  expects context boost to do the rest.
- world/aws_key: broaden access-key ID prefix to also catch
  ASIA/AIDA/AROA/ANPA/AGPA/AIPA; add a second variant for the
  40-char secret access key; ship Presidio-style context.
- world/github_token: add github_pat_[A-Z0-9_]{82} variant for
  the fine-grained PAT format introduced in 2022.
- world/generic_api_key: accept whitespace separator alongside
  [:=] so `Authorization: Bearer <token>` matches.
- uk/driving_licence + uk/vehicle/registration: add the Presidio
  validators we'd left on the table (99999 surname rejection,
  age-ID range 02-29 / 51-79 for current-format plates).

Validator infrastructure:

- world/phone: replace the regex+length validator with a
  phonenumber-crate-backed region-aware validator. The validator
  parses E.164 directly and falls back to the caller-specified
  country (via RecognizerInput.country) when present.
  Introduces a workspace-wide phonenumber = 0.3 dependency.
- ValidatorRegistry::with_simple convenience for the ten
  context-free validators; with() stays the canonical entry
  point for ctx-aware validators (only phone today).

Context-enhancer architecture:

- Add RecognizerInput.context_hints: Vec<String> for out-of-band
  context strings (CSV column headers, JSON keys, log field
  names) the caller wants treated as in-context.
- nvisy-context::Enhancer::enhance now takes a Context bundle
  (text + tokens + language + hints) instead of four loose
  arguments. The hint path runs as a fallback when the in-text
  word window doesn't fire; at most one boost per rule per
  entity.
- LiftedFromText in nvisy-toolkit gains chunk_hints; Tabular
  surfaces column_name as a hint so a `card` column header
  lifts a per-cell CC=0.3 match to ~0.65 via the existing boost
  pipeline (no synthetic score patching to clear the threshold).

nvisy-context module split:

- enhancer.rs -> enhancer/{mod, context, window}.rs
- matcher.rs -> matching/{mod, matcher, lemma}.rs
- tokens.rs + wrapper.rs -> io/{mod, tokens, wrapper}.rs
- Public surface (Context, Enhancer, BoostRule, KeywordMatcher,
  SubstringMatcher, LemmaMatcher, ContextEnhanced, Token,
  Tokens) stays at the crate root via re-exports.
- Drop 3 redundant enhancer tests (suffix-symmetry duplicate,
  unicode-too-distant duplicate, token-window symmetry); keep
  the 13 unique behaviors plus 2 new hint-path tests.

Known gap: html_codec_e2e payment_card assertion fails because
HTML chunks at text-node boundary and `<code>4111...</code>`
loses the surrounding "payment card" context. The fix requires
moving chunk_hints from LiftedFromText onto the Handler trait
and overriding it on HtmlHandler to emit parent-element text
as a hint. Tracked as follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Extend the tabular-cell hint mechanism to all chunked formats by
making hints a first-class field on Chunk<M>, populated alongside
data + location during next_chunk.

Architecture:

- Chunk<M> gains pub hints: Vec<String>. Hints are metadata the
  chunk's structural neighbours surface — CSV column headers,
  HTML parent-element text, JSON object keys — for downstream
  context-aware recognizers.
- nvisy-toolkit detect() reads chunk.hints directly. The earlier
  LiftedFromText::chunk_hints and the briefly-added
  Handler::chunk_hints methods are both removed; the field on
  Chunk avoids a second handler call to recompute information
  next_chunk already had.
- Handlers without useful out-of-band metadata (TXT, PDF, image,
  audio) initialise the field with Vec::new().

CSV handler:

- Populates chunk.hints from chunk.location.column_name.
  Replaces the prior LiftedFromText::chunk_hints<TabularLocation>
  override, which had the same effect via a less-direct path.

HTML handler:

- RedactableItem gains pub hints: Vec<String>. The DOM walk in
  build_items computes a per-text-node hint by collecting the
  text of the node's nearest block-level ancestor (excluding
  the node's own text). nearest_block_ancestor walks parents
  until it finds a tag in is_block_element's curated set (p,
  div, li, td, th, h1-h6, blockquote, dt, dd, section, article,
  aside, header, footer, main, nav, figcaption, caption).
- Stopping at the immediate inline parent would yield only the
  chunk's own text — the surrounding sentence lives in the
  enclosing block. `<p>...the payment card <code>4111…</code>
  is on file</p>` gives the <code> chunk a hint of "the payment
  card is on file", which lifts CC=0.3 above threshold via the
  existing context-enhancer.
- Note: neither html5ever, markup5ever, nor scraper exposes a
  block/inline classifier; the curated list is the simplest
  honest implementation. Future HTML elements not in the list
  graciously degrade (walk continues to root, hints stay empty)
  rather than corrupting detection.

JSON handler:

- Leaf gains pub hints: Vec<String>. parse_value and parse_array
  now thread an Option<&str> key_context; parse_object captures
  the just-parsed key and passes it to parse_value for the
  value. Array elements inherit the containing object's key so
  {"cards": ["4111…", "5555…"]} gives both PANs the "cards"
  hint. Top-level scalars and keys themselves stay hint-less.
- The leaf's hint is copied onto Chunk.hints in next_chunk so
  recognizers see it via input.context_hints.

Resolves codec_e2e_html and codec_e2e_json payment_card
assertions without touching scores.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Apply the prefer-use-imports style across the workspace: replace
inline `foo::bar::Baz::method(...)` and `impl std::fmt::Debug ...`
patterns with `use` lines at the top of each file, then refer to
the imported name directly.

Touches 51 files across nvisy-core, nvisy-context, nvisy-codec,
nvisy-engine, nvisy-fake, nvisy-llm, nvisy-pattern, nvisy-server,
nvisy-toolkit. Hoisted both crate paths (axum, aide, tower_http,
rig, image, symphonia, nvisy_core::*) and std paths
(std::fmt, std::cmp::{Ordering, Reverse}, std::marker::PhantomData,
std::any::type_name, std::io::ErrorKind, std::path::Path,
std::slice, std::vec).

Exceptions preserved:

- `#[async_trait::async_trait]` attributes stay inlined.
- `tracing::*` macros and attributes stay inlined.
- `*macros.rs` files keep fully-qualified paths for macro hygiene.
- String literals inside `#[serde(...)]` attributes are not
  code-position paths.

For collisions with locally-defined types (e.g. `Path`, `Json` in
the server crate), imports use aliasing — `axum::extract::Path
as AxumPath` rather than inlining.

Drive-by simplification in nvisy-llm/src/backend/rig/mod.rs: rig
0.38 now implements CompletionClient for Gemini, so the
`gemini::completion::CompletionModel::new(...)` workaround from
rig-core 0.31 is gone and the build site uses
`client.completion_model(...)` like the other providers.

No behavior changes; workspace test, clippy, doc all clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@martsokha martsokha changed the title refactor: rename Pattern→Regex, extract nvisy-context, normalize pattern docs refactor + feat: code quality + Presidio-aligned shipped pattern set + chunk context propagation Jun 16, 2026
CI clippy on Rust 1.95 flags the nested
  if let Node::Element(e) = node.value() {
      if is_block_element(...) { ... }
  }
as `clippy::collapsible-if`. Combine the two conditions with `&&`
so the chained let pattern + boolean check live on one expression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@martsokha martsokha merged commit 5ade4ef into main Jun 16, 2026
5 checks passed
@martsokha martsokha deleted the refactor/code-quality branch June 16, 2026 09:19
@martsokha martsokha added recognition pattern, NER, LLM, and OCR backends (elide::recognition::*) engine redaction engine, pipeline runtime, orchestration, configuration and removed pattern labels Jun 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

engine redaction engine, pipeline runtime, orchestration, configuration recognition pattern, NER, LLM, and OCR backends (elide::recognition::*) refactor code restructuring without behavior change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant