refactor(embedding_compute): extract providers into adapter pattern #373

Open
luojiyin1987 wants to merge 2 commits into StarTrail-org:main from luojiyin1987:refactor/embedding-providers

@luojiyin1987

Closes #372

Summary

Split the 1444-line embedding_compute.py monolith into a registry-backed adapter pattern.

New structure

embedding_compute.py  (403 lines — shared utils + registry + dispatch)
providers/
├── __init__.py
├── sentence_transformers.py  (487 lines)
├── ollama.py                 (335 lines)
├── openai.py                 (136 lines)
├── gemini.py                 (89 lines)
└── mlx.py                    (82 lines)

Design

# Public API — unchanged
compute_embeddings(texts, model_name, mode="sentence-transformers", ...)

# Internal — new
_init_providers()  → lazy-import + _providers registry dict
register_provider("custom", fn)  → third-party extensibility
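A minimal sketch of how the registry and dispatch described above could fit together. It is not the code in this PR: the provider signature, the `_initialized` flag, the error message, and the `_fake_backend` stand-in are assumptions made so the example runs without the real `providers/` package.

```python
from typing import Callable, Dict, List

# Assumed provider signature: (texts, model_name, **kwargs) -> list of vectors.
ProviderFn = Callable[..., List[List[float]]]

_providers: Dict[str, ProviderFn] = {}
_initialized = False  # hypothetical guard so init runs only once


def register_provider(name: str, fn: ProviderFn) -> None:
    """Add (or override) a backend under the given mode name."""
    _providers[name] = fn


def _init_providers() -> None:
    """Populate the registry on first use."""
    global _initialized
    if _initialized:
        return

    # The real module would lazily import providers/*.py here.
    # A stand-in backend keeps this sketch self-contained.
    def _fake_backend(texts, model_name, **kwargs):
        return [[float(len(t))] for t in texts]

    # setdefault keeps any provider a third party registered before init.
    _providers.setdefault("sentence-transformers", _fake_backend)
    _initialized = True


def compute_embeddings(texts, model_name, mode="sentence-transformers", **kwargs):
    """Public entry point: look up the provider for `mode` and delegate."""
    _init_providers()
    try:
        provider = _providers[mode]
    except KeyError:
        raise ValueError(f"Unsupported embedding mode: {mode!r}") from None
    return provider(texts, model_name, **kwargs)
```

With this shape, callers keep using `compute_embeddings(...)` and only the `mode` string selects the backend, so the dispatch never needs an `elif` chain.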

Benefits

  • Adding a new provider: 1 file + 1 registry line (was: touch 1444-line elif chain)
  • Full backward compat — ~90 callers unchanged, old function names preserved as aliases
  • Each provider independently testable
  • Third-party extensibility via register_provider()

- Split embedding_compute.py (1444→403 lines) into registry + dispatch
- New providers/ package with one module per backend:
  - sentence_transformers.py (487 lines) — ST models + hardware optimizations
  - openai.py (136 lines) — OpenAI-compatible API
  - mlx.py (82 lines) — Apple Silicon MLX
  - ollama.py (335 lines) — Ollama local server
  - gemini.py (89 lines) — Google Gemini API
- Add _init_providers() lazy-import + _providers registry dict
- Add register_provider(name, fn) for third-party backends
- Full backward compat: old function names (compute_embeddings_sentence_transformers
  etc.) kept as module-level aliases, which are replaced with the provider
  implementations at provider init
- Shared utilities (_model_cache, get_model_token_limit, truncate_to_token_limit,
  _query_ollama_context_limit, _query_lmstudio_context_limit) stay in main module
- Adding a new provider now requires only a new file + 1 registry line
- Remove unused imports: subprocess, time, cast, json, etc.
- Add missing imports: _model_cache (mlx), token limit helpers (ollama), Protocol (st)
- Apply ruff format to satisfy CI pre-commit checks