Batch facet searches into a single multi_search (per-facet engine.search fan-out) #532

@ddeboer

Context

The keyed DatasetFacets GraphQL surface (@lde/search-api-graphql) resolves each facet through its own context.engine.search(...) call, with that facet’s own where-filter removed (skip-own-filter). A typical listing page selects ~7 facets, so one page load fans out to 1 listing search + 7 facet searches against Typesense, and the reference facets additionally issue label lookups. The pre-migration direct-Typesense path issued a single multi_search for the whole sidebar.

Problem

  • ~4–5× the Typesense round-trips per page load (and per keystroke): more latency, more load, and a higher chance that any single sub-query fails.
  • In the common unfiltered browse, skip-own-filter removes nothing, so the 7 facet searches are identical except for facet_by; they could be a single faceted search.

Proposal

Batch the selected facet computations into one Typesense multi_search (the client’s multiSearch is already used for label lookups):

  • Group the selected facets by their effective where (after skip-own-filter). Facets whose own field is not filtered share the same where → one search faceting all of them; the unfiltered case collapses to a single search.
  • Expose a batch entry point on the SearchEngine port (e.g. a searchFacets(queries) / multi-facet method) so the surface dispatches once instead of per field. The per-field GraphQL resolvers then read from a per-request batched result (dataloader-style) rather than each calling the engine.
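A minimal sketch of the grouping step, assuming a simplified where shape (field → selected values). The names Where, FacetGroup, effectiveWhere, whereKey and groupFacets are hypothetical and not part of the current SearchEngine port; the real filter types are likely richer.

```typescript
// Hypothetical, simplified shapes; the real port types may differ.
type Where = Record<string, string[]>; // field -> selected values

interface FacetGroup {
  where: Where; // effective where shared by every facet in the group
  facets: string[]; // facet fields that one search can compute together
}

// Skip-own-filter: drop the facet's own field from the where.
function effectiveWhere(where: Where, facet: string): Where {
  const rest: Where = {};
  for (const [field, values] of Object.entries(where)) {
    if (field !== facet) rest[field] = values;
  }
  return rest;
}

// Stable key so equal wheres compare equal regardless of field or value order.
function whereKey(where: Where): string {
  const entries = Object.entries(where)
    .map(([field, values]): [string, string[]] => [field, [...values].sort()])
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return JSON.stringify(entries);
}

// Group the selected facets by their effective where.
function groupFacets(where: Where, facets: string[]): FacetGroup[] {
  const groups = new Map<string, FacetGroup>();
  for (const facet of facets) {
    const effective = effectiveWhere(where, facet);
    const key = whereKey(effective);
    const group = groups.get(key);
    if (group) {
      group.facets.push(facet);
    } else {
      groups.set(key, { where: effective, facets: [facet] });
    }
  }
  return Array.from(groups.values());
}
```

With no filters, groupFacets({}, ["publisher", "format", "language"]) yields one group containing all three facets. With where = { publisher: ["kb"] } it yields two groups: publisher alone with an empty where, and format + language sharing the publisher filter. Each group would become one entry in the multi_search.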

Acceptance

  • A typical page load issues ≤ 2 Typesense round-trips in the common case (listing + one batched facet search), more only when distinct skip-own-filter wheres are genuinely needed.
  • Facet counts, skip-own-filter semantics, and reference-facet labels are unchanged.
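One way the round-trip criterion could be met and checked is a per-request batcher that the per-field resolvers read from. The sketch below is an assumption, not the existing API: FacetEngine, searchFacets and FacetBatcher are hypothetical names, and where-grouping is left out for brevity. It relies on sibling field resolvers calling load() in the same tick, which is the same assumption dataloader-style batching makes.

```typescript
type FacetCounts = Record<string, number>; // facet value -> count

// Hypothetical batch entry point; not the current SearchEngine port.
interface FacetEngine {
  searchFacets(fields: string[]): Promise<Record<string, FacetCounts>>;
}

// Create one instance per GraphQL request (e.g. on the request context).
class FacetBatcher {
  private readonly engine: FacetEngine;
  private pending = new Set<string>();
  private flush: Promise<Record<string, FacetCounts>> | undefined;

  constructor(engine: FacetEngine) {
    this.engine = engine;
  }

  // Called by each per-field resolver instead of engine.search(...).
  load(field: string): Promise<FacetCounts> {
    this.pending.add(field);
    if (!this.flush) {
      // Defer dispatch to a microtask so resolvers running in the same
      // tick can register their fields before the single engine call.
      this.flush = Promise.resolve().then(() => {
        const fields = Array.from(this.pending);
        this.pending.clear();
        this.flush = undefined;
        return this.engine.searchFacets(fields);
      });
    }
    return this.flush.then((result) => result[field] ?? {});
  }
}
```

A test double for the engine that counts searchFacets calls makes the acceptance check direct: resolving seven facet fields through one FacetBatcher should record exactly one call.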

Notes

  • Surfaced during the Dataset Register search-migration code review.
  • Related follow-up: an in-memory label cache (TTL + single-flight, like the former browser createLabelResolver) would further cut the per-search label lookups.
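For the label-cache follow-up, a rough sketch of TTL + single-flight, under stated assumptions: LabelCache and FetchLabel are hypothetical names, labels are keyed by a plain string identifier, and this is not a reconstruction of the former createLabelResolver. Caching the promise rather than the resolved value is what gives single-flight: concurrent callers for the same key share one in-flight lookup.

```typescript
// Hypothetical lookup function, e.g. a wrapper around the existing label multiSearch.
type FetchLabel = (id: string) => Promise<string>;

class LabelCache {
  private readonly entries = new Map<string, { label: Promise<string>; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly fetchLabel: FetchLabel;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without real timers.
  constructor(ttlMs: number, fetchLabel: FetchLabel, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.fetchLabel = fetchLabel;
    this.now = now;
  }

  get(id: string): Promise<string> {
    const hit = this.entries.get(id);
    if (hit && hit.expiresAt > this.now()) return hit.label;

    // Store the promise immediately so concurrent callers share it.
    const label = this.fetchLabel(id);
    this.entries.set(id, { label, expiresAt: this.now() + this.ttlMs });

    // Do not keep failed lookups cached for the full TTL.
    label.catch(() => {
      const current = this.entries.get(id);
      if (current && current.label === label) this.entries.delete(id);
    });
    return label;
  }
}
```

This sketch has no size bound or eviction beyond TTL expiry on read, so a real implementation would probably want a maximum entry count as well.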
