From be54cf47b144a460b93b3551a8d58b2dff95688c Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 00:24:27 -0700 Subject: [PATCH 01/15] docs(plans): crate-split design and topology artifact MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Twelve-crate layer-aligned split of dspy-rs: dsrs-{core, lm, trace, cache, predict, evaluate, gepa, data, leaven} on top of the existing bamltype / bamltype-derive / dsrs-macros foundation. No facade. Key shape: - dsrs-core owns the abstract bridge traits (DynPredictor, TraceSink, CacheBackend, LmClient) and the Facet walker — the surface leaven drives. - dsrs-leaven implements leaven_core::Artifact / leaven_surface::EditSurface / leaven_engine::Evaluator for DSRs programs directly, replacing the empty leaven-dsrs stub crate. DSRs is a leaven-compatible target rather than a third party that needs a bridge owned by leaven. - GEPA-only optimizer: COPRO and MIPROv2 deleted. dsrs-gepa is a sunset candidate, dropped once leaven-gepa is a runnable optimizer and dsrs-leaven ships real impls. Companion HTML view in the same directory. --- .../2026-05-08-dsrs-crate-split-design.md | 226 +++++++++++++ .../2026-05-08-dsrs-crate-split-topology.html | 300 ++++++++++++++++++ 2 files changed, 526 insertions(+) create mode 100644 docs/plans/2026-05-08-dsrs-crate-split-design.md create mode 100644 docs/plans/2026-05-08-dsrs-crate-split-topology.html diff --git a/docs/plans/2026-05-08-dsrs-crate-split-design.md b/docs/plans/2026-05-08-dsrs-crate-split-design.md new file mode 100644 index 00000000..947bd38a --- /dev/null +++ b/docs/plans/2026-05-08-dsrs-crate-split-design.md @@ -0,0 +1,226 @@ +# DSRs · Crate Split Design + +**Date:** 2026-05-08 +**Status:** Approved (design phase) +**Successor of:** `docs/specs/modules/design_reference.md`, `docs/specs/modules/breadboard.md` + +> Companion artifact: [`2026-05-08-dsrs-crate-split-topology.html`](2026-05-08-dsrs-crate-split-topology.html) (interactive React view of the same plan). + +--- + +## 1. Motivation + +The current `crates/dspy-rs` is a monolith. Splitting it pays off on four axes simultaneously, all of which were chosen as motivators: + +1. **Layer enforcement.** The breadboard's L0 / L1 / L2 + Place P1 / P2 / P3 topology is currently social, not mechanical. A user file can `use dspy_rs::optimizer::*` from a P1 codebase. Crate boundaries make the topology load-bearing. +2. **Optional features.** Today everyone pays for `parquet`, `arrow`, `hf-hub`, `foyer`, `minijinja`, `rig-core`, `csv` whether they call them or not. Splitting lets light users skip what they don't use. +3. **Compile times.** The monolith pulls every heavy dep into every rebuild. Per-crate cargo caching + parallel codegen across small crates is a strict win. +4. **Public API hygiene.** Users get narrow, named imports per area instead of one giant `dspy_rs::*`. + +A fifth motivator emerged during design: **leaven readiness**. The user has [`leaven`](../../../leaven) — a separate Rust library for optimizing arbitrary artifacts — with `leaven-core` (cold algebra), `leaven-engine`, and concrete optimizers (`leaven-gepa`). The split prepares DSRs to be *a thing leaven optimizes* rather than its own optimizer host. + +--- + +## 2. Decisions + +| # | Decision | Rationale | +|---|----------|-----------| +| D1 | **No facade crate.** The current `dspy-rs` aggregator is dissolved; users depend on the leaf crates explicitly. | Cleanest API hygiene. The cost is a one-time migration of import paths. | +| D2 | **12 crates total** in the workspace. | Layer-aligned (L0, L1, L2, adjacent, integration) plus the three existing support crates. Coarser splits don't enforce L1/L2; finer splits add ceremony without payoff. | +| D3 | **`DynPredictor`, `TraceSink`, `CacheBackend`, `LmClient` traits live in `dsrs-core`.** | Slight deviation from the breadboard's "L2 defines the interface" framing. With Cargo crates, putting abstract bridge traits in core gives a clean DAG with no dependency inversion gymnastics. | +| D4 | **Trace and cache are their own crates** (`dsrs-trace`, `dsrs-cache`). | Maximum modularity — each can be swapped or disabled. Both implement abstract traits in `dsrs-core`. | +| D5 | **`bamltype` + `bamltype-derive` stay as-is.** | Already external-shaped; the names signal BAML lineage; renaming buys nothing. | +| D6 | **GEPA-only optimization.** COPRO and MIPROv2 source files are deleted. | Per current direction, those optimizers are outdated. `dsrs-optimize` is renamed `dsrs-gepa` for clarity. | +| D7 | **`dsrs-evaluate` is permanent and separate.** | It's the metric surface `leaven-dsrs` adapts. Even after `dsrs-gepa` sunsets, evaluate stays as the canonical typed-metric API. | +| D8 | **`dsrs-gepa` is a sunset candidate.** | Survives until the leaven path (next decision) is real. | +| D9 | **DSRs implements leaven's capability traits directly** in a `dsrs-leaven` crate inside the DSRs workspace. The skeleton `leaven-dsrs` crate in leaven's workspace is dropped or repointed. | Aligns with leaven's own routing rule ("backend crates depend on the capability crate, not on internals"). DSRs *is* a leaven-compatible target, not a third-party project that needs a bridge owned by leaven. | +| D10 | **Zero compatibility shims.** Hard cutover for the import-path change. | Per project standard: no parallel old/new paths, no `pub use` redirects, no deprecated wrappers. | + +--- + +## 3. Crate inventory + +12 crates. Three existing, eight new (extracted from `dspy-rs`), one new integration crate. + +### L0 · Foundation (existing, untouched) + +| Crate | Role | +|-------|------| +| `bamltype` | Typed value system, jsonish coercion, BAML schema rendering. | +| `bamltype-derive` | `#[derive(BamlType)]` proc-macro. | +| `dsrs-macros` | `#[derive(Signature)]`, `#[derive(Augmentation)]`, `#[derive(Module)]` proc-macros. Emitted paths get rewritten to reference the new crate names. | + +### L1 · Typed substrate (new, extracted from `dspy-rs`) + +| Crate | Public surface | Depends on | +|-------|---------------|------------| +| `dsrs-core` | `Signature`, `Module`, `SignatureSchema`, `Augmentation`, `Predicted`, `CallMetadata`, `Demo`, `Example`, `Prediction`, `PredictError` / `ParseError` / `ConversionError` / `LmError`. Abstract bridge traits: `DynPredictor`, `TraceSink`, `CacheBackend`, `LmClient`. The Facet walker (`visit_named_predictors_mut`). | `bamltype` | +| `dsrs-lm` | Concrete `LM` (rig-core wrapper), `ChatAdapter`, `GLOBAL_SETTINGS`, `configure`, `with_lm`. Implements `dsrs-core::LmClient`. | `dsrs-core` | +| `dsrs-trace` | `ExecutionGraph`, `TraceContext`, span/event types. Implements `dsrs-core::TraceSink`. | `dsrs-core` | +| `dsrs-cache` | Foyer-backed LM response cache. Implements `dsrs-core::CacheBackend`. | `dsrs-core` | +| `dsrs-predict` | `Predict` (impls `DynPredictor`), `ChainOfThought`, `ReAct`, `forward_all`, `Map` / `AndThen` combinators, library modules. | `dsrs-core`, `dsrs-lm` | + +### L2 · Evaluation & optimization + +| Crate | Public surface | Depends on | Status | +|-------|---------------|------------|--------| +| `dsrs-evaluate` | `TypedMetric`, `MetricOutcome`, `FeedbackMetric`, `ExecutionTrace`, `evaluate_trainset`, feedback helpers (`retrieval_feedback`, `code_pipeline_feedback`, `multi_objective_feedback`, `string_similarity_feedback`, `classification_feedback`). | `dsrs-core` | Permanent. | +| `dsrs-gepa` | `Optimizer` trait, `GEPA`, `GEPACandidate`, `GEPAResult`, `ParetoFrontier`. | `dsrs-core`, `dsrs-predict`, `dsrs-evaluate` | **Sunset candidate.** | + +COPRO and MIPROv2 source files (`optimizer/copro.rs`, `optimizer/mipro.rs`) are deleted as part of the split. + +### Adjacent + +| Crate | Public surface | Depends on | +|-------|---------------|------------| +| `dsrs-data` | `DataLoader`. Format readers (csv / json / parquet / hf-hub) behind feature flags so light users don't pull arrow/parquet/hf-hub. | `dsrs-core` | + +### Integration · the future + +| Crate | Public surface | Depends on | +|-------|---------------|------------| +| `dsrs-leaven` | `DsrsProgramArtifact` (impl `leaven_core::Artifact`), `DsrsProgramChange`, `DsrsProgramSurface` (impl `leaven_surface::EditSurface`), `DsrsEvaluator` (impl `leaven_engine::Evaluator

`), `DsrsEvidence` (impl `leaven_core::Evidence` + capability traits for `Casewise` and `Attributable`). | `dsrs-core`, `dsrs-evaluate`, `dsrs-predict`, `leaven-core`, `leaven-surface`, `leaven-engine`, `leaven-evidence` | + +--- + +## 4. Dependency DAG + +``` + bamltype-derive + ▼ + bamltype ◄──── dsrs-macros + ▲ ▲ + │ │ + ▼ │ + dsrs-core ◄── dsrs-trace + ▲ ▲ ▲ ◄── dsrs-cache + │ │ │ + │ │ └── dsrs-evaluate ──┐ + │ │ │ + │ └── dsrs-lm │ + │ ▲ │ + │ │ │ + └── dsrs-predict ────────┤ + ▲ │ + │ │ + dsrs-gepa ◄─────────────┘ (sunset) + + dsrs-data ──► dsrs-core + + dsrs-leaven ──► dsrs-core + ► dsrs-evaluate + ► dsrs-predict + ► leaven-{core, surface, engine, evidence} +``` + +Cargo-enforced invariants: + +- **`dsrs-core` is foundational and small.** No LM, no rig-core, no foyer, no parquet, no minijinja. Pure types + traits + Facet walker + abstract bridges. +- **`dsrs-trace` and `dsrs-cache` only depend on `dsrs-core`.** They're swap-points; concrete impls don't infect anything else. +- **`dsrs-predict` depends on `dsrs-core` + `dsrs-lm`.** You cannot construct a Predict without an LM client. This is right. +- **`dsrs-evaluate` depends only on `dsrs-core`.** Metrics are pure over typed I/O — no Predict, no LM, no optimizer. +- **`dsrs-gepa` is a leaf consumer.** Nothing depends on it. Deletion is safe. +- **`dsrs-leaven` is the only crate that pulls leaven types into the DSRs workspace.** Users who don't use leaven don't pay. + +--- + +## 5. The leaven integration story + +DSRs becomes a leaven-compatible optimization target. The user retains a typed `M: Module<...>` instance and hands it to leaven. Leaven owns the optimization run loop; DSRs owns module evaluation and the prompt format. + +Concrete shape of `dsrs-leaven`: + +```rust +// Wraps a typed module + signature; identity is content-hash of (instructions, demos) +// across all Predict leaves. +pub struct DsrsProgramArtifact> { ... } +impl leaven_core::Artifact for DsrsProgramArtifact { type Change = DsrsProgramChange; ... } + +// Structured edit: (predict_path, op) where op is set-instruction or set-demos. +pub struct DsrsProgramChange { edits: Vec<(PredictPath, Edit)> } + +// EditSurface — lets leaven proposers select Predict leaves by address and +// render them either inline (for one-shot LM proposers) or as a workspace +// directory (for agentic proposers that want selective read access). +pub struct DsrsProgramSurface; +impl leaven_surface::EditSurface for DsrsProgramSurface { ... } + +// Evaluator — runs the user's typed metric against a batch of examples through +// a (snapshot of the) module and produces leaven Assessments. +pub struct DsrsEvaluator> { ... } +impl

leaven_engine::Evaluator

for DsrsEvaluator<...> { ... } + +// Evidence — wraps DSRs's MetricOutcome (scalar score + optional textual +// FeedbackMetric + metadata). Implements Casewise (per-example feedback for +// Pareto) and Attributable (which Predict caused which signal — for credit +// assignment). +pub enum DsrsEvidence { ... } +impl leaven_core::Evidence for DsrsEvidence {} +impl leaven_evidence::CasewiseEvidence for DsrsEvidence { ... } +impl leaven_evidence::AttributableEvidence for DsrsEvidence { ... } +``` + +**Render/materialize separation** (per leaven principle 3.2): `ChatAdapter` stays internal to DSRs. Leaven only sees "call this module with this input → get a `Predicted` and its metadata." Prompt rendering happens inside `dsrs-predict` when the module executes; leaven never observes the prompt format. + +**Where state lives:** Hybrid. The user's original `M: Module` is mutable through `DynPredictor`. Leaven proposes changes that produce *new* `DsrsProgramArtifact` snapshots via `apply_change`, and the run graph carries those snapshots for lineage and caching. + +**What leaven currently lacks** (research from sub-agent investigation, 2026-05-08): +- `leaven-dsrs` crate (in leaven workspace) — empty stubs at v0.0.0 +- `leaven-mipro` — skeleton (not needed for our GEPA-only path) +- `leaven-textgrad` — skeleton (feedback aggregation needed by GEPA) +- `leaven-gepa` — partial: strategy composition layer (CandidateSelector / PartSelector / Gate slots) but no runnable optimizer, no reflection-based mutation wired +- No ergonomic `optimize(artifact, proposer, evaluator, population) -> ...` entry point + +These are what need to land in leaven before `dsrs-gepa` can be deleted. + +--- + +## 6. Sunset trigger for `dsrs-gepa` + +`dsrs-gepa` is deleted from the workspace when **both** are true: + +1. **`leaven-gepa` is a runnable optimizer.** Strategy slots are filled, reflection-based mutation works, candidate selection / gate / parts-picker are wired. Not a strategy-composition skeleton. +2. **`dsrs-leaven` ships real implementations** of `DsrsProgramArtifact` / `DsrsProgramSurface` / `DsrsEvaluator` / `DsrsEvidence`, and a parity test confirms equal-or-better optimization results vs `dsrs-gepa` on a sample DSRs program (e.g. one of the `examples/` programs). + +Originally six conditions covering MIPRO, COPRO, textgrad, etc — collapsed to two because GEPA is the only optimizer we keep. + +--- + +## 7. Migration sequence + +The split is a hard cutover. Drafted as one PR per crate extraction, in dependency order so each step compiles and tests pass. + +1. **Create `dsrs-core`.** Move `core/`, `augmentation.rs`, the legacy `Example` / `Prediction` types from `data/`, the bridge trait stubs, and the Facet walker. Update `dsrs-macros` emitted paths. Verify `cargo test -p dsrs-core` and downstream. +2. **Create `dsrs-trace`.** Move `trace/`. Verify it depends only on `dsrs-core`. +3. **Create `dsrs-cache`.** Move `utils/cache.rs` (and `telemetry.rs` if it's only used here). Verify dep on `dsrs-core` only. +4. **Create `dsrs-lm`.** Move `core/lm/`, `adapter/`, `core/settings.rs`. Implements `dsrs-core::LmClient`. Wire `dsrs-trace` and `dsrs-cache` via core's traits, not concrete deps. +5. **Create `dsrs-evaluate`.** Move `evaluate/`. Verify dep on `dsrs-core` only. +6. **Create `dsrs-predict`.** Move `predictors/`, `modules/`. Update `Predict`'s `impl DynPredictor` to use the trait from `dsrs-core`. +7. **Create `dsrs-gepa`.** Move `optimizer/gepa.rs` and `optimizer/pareto.rs`. **Delete `optimizer/copro.rs` and `optimizer/mipro.rs`** along with any tests that target them. +8. **Create `dsrs-data`.** Move `data/dataloader.rs`, `data/serialize.rs`, `data/utils.rs`. Add format feature flags (`csv`, `parquet`, `hf-hub`). +9. **Create `dsrs-leaven`.** Initial skeleton — type signatures only, `unimplemented!()` bodies. Real implementations land in a follow-up plan once the first leaven-side piece is ready. +10. **Delete `crates/dspy-rs`.** Remove from workspace `Cargo.toml`. Update `README.md`, `CURRENT_PLAN.md`, `CURRENT_SPEC.md`, doc references. +11. **Update consumers.** `examples/`, `tests/` outside crates, vendor dirs, anything that does `use dspy_rs::*`. +12. **In leaven workspace** (separate PR there): delete `crates/leaven-dsrs/` or repoint as a thin re-export pointer to DSRs's `dsrs-leaven`. + +Tests pass after each step. No step leaves the workspace in a non-compiling state. + +--- + +## 8. Open questions deferred to implementation plan + +- **Feature flag granularity in `dsrs-data`.** Default features = none vs default = `csv`+`json`? Probably default to none and document the four feature combos. +- **`dsrs-leaven` initial scope.** The first cut is type skeletons; what's the first end-to-end smoke test? Probably "GEPA-equivalent run on the QA example using leaven-gepa once it's a real optimizer." +- **MSRV alignment with leaven** (`rust-version = 1.85` in leaven's workspace). Make `dsrs-leaven` match. +- **Does `dsrs-macros` need feature flags** to emit different paths depending on whether you're targeting the new crate layout? Likely no — hard cutover, paths are unconditional. + +--- + +## 9. References + +- `docs/specs/modules/breadboard.md` — original L0 / L1 / L2 + P1 / P2 / P3 topology +- `docs/specs/modules/design_reference.md` — design principles (Facet shapes, parse-don't-validate, structure-IS-declaration, modules-as-strategies, typed-path-primary, one-adapter) +- `CURRENT_SPEC.md` — superseded baseline (Phase 2 typed-native runtime) +- [`leaven/AGENTS.md`](../../../leaven/AGENTS.md) — routing rules and the "backends depend on capability crates" principle +- [`leaven/docs/specs/guiding_principles.md`](../../../leaven/docs/specs/guiding_principles.md) — artifact-shape neutrality, render-materialize separation, evidence-shape neutrality +- [`2026-05-08-dsrs-crate-split-topology.html`](2026-05-08-dsrs-crate-split-topology.html) — interactive React view of this design diff --git a/docs/plans/2026-05-08-dsrs-crate-split-topology.html b/docs/plans/2026-05-08-dsrs-crate-split-topology.html new file mode 100644 index 00000000..d9b0a48b --- /dev/null +++ b/docs/plans/2026-05-08-dsrs-crate-split-topology.html @@ -0,0 +1,300 @@ + + + + + + DSRs · Crate Topology + + + + + + + +

+ + + + From 71113384f2b2eec41243c06595d69ef988242bc1 Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 00:35:08 -0700 Subject: [PATCH 02/15] docs(research): add Undermind LLM optimizer ingredients report Used Undermind deep search plus Report Writer instead of a hand-rolled literature scan because the request was specifically for the report/research flow and recent literature coverage. The report distills the search into DSRs implementation requirements: typed IR, trace and blame capture, MIPRO-like offline compilation, Pareto selection, and bounded online adaptation. Export note: the CLI files export path requested citation style plain and the API rejected it, so the markdown was exported through the same API with APA style. Left undone: Kha24 was not uploaded as an extra source; Report Writer cited the search-result anchor. A follow-up can tighten DSPy-specific details if that paper is uploaded. --- ...05-08-llm-program-optimizer-ingredients.md | 394 ++++++++++++++++++ 1 file changed, 394 insertions(+) create mode 100644 docs/plans/2026-05-08-llm-program-optimizer-ingredients.md diff --git a/docs/plans/2026-05-08-llm-program-optimizer-ingredients.md b/docs/plans/2026-05-08-llm-program-optimizer-ingredients.md new file mode 100644 index 00000000..8cbcb81b --- /dev/null +++ b/docs/plans/2026-05-08-llm-program-optimizer-ingredients.md @@ -0,0 +1,394 @@ +# Key ingredients for an LLM program optimizer + +##### [**Undermind**](https://undermind.ai) + +--- + + +## Table of Contents + +- [Executive design thesis](#executive-design-thesis) +- [Taxonomy of optimizer families](#taxonomy-of-optimizer-families) +- [Key ingredients with evidence and citations](#key-ingredients-with-evidence-and-citations) + - [Search space design](#search-space-design) + - [Proposal mechanisms](#proposal-mechanisms) + - [Feedback signals](#feedback-signals) + - [Trace and blame assignment](#trace-and-blame-assignment) + - [Pareto and cost tradeoffs](#pareto-and-cost-tradeoffs) + - [Typed contracts and deterministic boundaries](#typed-contracts-and-deterministic-boundaries) + - [Offline compilation and bounded online adaptation](#offline-compilation-and-bounded-online-adaptation) +- [Practical architecture requirements for DSRs and Rust](#practical-architecture-requirements-for-dsrs-and-rust) +- [What to build first versus leave as scaffolding](#what-to-build-first-versus-leave-as-scaffolding) + - [Build first](#build-first) + - [Build next](#build-next) + - [Leave as scaffolding](#leave-as-scaffolding) +- [Evaluation and benchmarking protocol](#evaluation-and-benchmarking-protocol) + - [Core benchmark matrix](#core-benchmark-matrix) + - [Reported metrics](#reported-metrics) + - [Protocol details](#protocol-details) +- [Risks and open research gaps](#risks-and-open-research-gaps) +- [Annotated bibliography of the most important papers](#annotated-bibliography-of-the-most-important-papers) +- [References](#references) + +Key Ingredients for an LLM Program Optimizer + +A strong LLM program optimizer is best treated as a compiler over a typed, editable program representation rather than as a prompt tuner. The core path should be offline and batch-oriented: optimize instructions, demonstrations, tool schemas, routing rules, validators, and a small set of graph edits against train and dev sets with explicit cost budgets, rich traces, and full artifact logging. The second layer should be bounded online adaptation: reversible updates to memory, retrieval context, tool descriptions, and thresholds, gated by validators and rollback. This design is the common denominator behind the most useful recent systems, even when they differ in search algorithm or target artifact (Agrawal et al., 2025; He et al., 2025; Lee et al., 2026; Opsahl-Ong et al., 2024; Wang et al., 2025; Q. Zhang et al., 2025). + +The DSPy line provides the anchor abstraction: declarative LM programs compiled against downstream metrics (Khattab et al., 2023, 2024). Recent work then adds the missing optimizer ingredients that matter in practice: richer search spaces, trace-based diagnostics, block or node blame assignment, Pareto-aware selection, typed contracts, and explicit separation between generative planning and deterministic execution (Cheng et al., 2024; Ghoshal et al., 2026; Harikumar, 2026; Ma et al., 2026; J. Zhang et al., 2025). For a Rust DSPy and DSRs-like system, the right architecture is therefore a typed IR, deterministic evaluator, trace store, optimizer kernel, and a narrow online adaptation controller. + +## Executive design thesis + +The optimizer should optimize a layered program IR with three rings of mutability. + +| Ring | Editable artifacts | Default status | Why it matters | +|:---|:---|:---|:---| +| Core | module instructions, demonstrations, tool docs, decoding knobs, validator thresholds | build first | highest evidence and lowest risk (Agrawal et al., 2025; Ghoshal et al., 2026; Opsahl-Ong et al., 2024) | +| Structural | decomposition, routing, graph topology, verifier placement, retrieval plan, harness policies | add after core | fixes failure modes prompt tuning cannot reach (Lee et al., 2026; Wang et al., 2025; J. Zhang et al., 2025; Zhou et al., 2025) | +| Online | memory entries, playbooks, tool-local descriptions, routing thresholds, exemplar caches | bounded and reversible only | supports safe post-deployment adaptation (Singhvi et al., 2023; Q. Zhang et al., 2025) | + +The default optimizer loop should be: + +1. Compile a typed program into an executable graph with deterministic validators. +2. Run the graph on a train split while capturing full traces. +3. Convert traces into module, block, and node diagnostics. +4. Propose local edits first, then small structural edits only when local edits saturate. +5. Select candidates on a quality, cost, latency Pareto frontier. +6. Re-evaluate promoted candidates on a larger dev slice, then on the full dev set. +7. Emit a frozen artifact bundle for deployment. +8. Allow online updates only to explicitly whitelisted state, with canarying and rollback (Harikumar, 2026; He et al., 2025; Opsahl-Ong et al., 2024; Wang et al., 2025; Q. Zhang et al., 2025). + +The key design bet is that trace quality and blame assignment matter more than exotic search. MIPRO shows that even prompt and demo optimization becomes much stronger when proposals are grounded and evaluated with a surrogate over minibatches (Opsahl-Ong et al., 2024). GEPA, CE-Graph, JudgeFlow, and Maestro all show the same pattern from a different angle: once traces expose where and why failures occur, the optimizer can spend budget on targeted edits instead of blind global search (Agrawal et al., 2025; Ma et al., 2026; Wang et al., 2025; J. Zhang et al., 2025). + +## Taxonomy of optimizer families + +| Family | Search space | Proposal mechanism | Feedback | Best use | +|:---|:---|:---|:---|:---| +| Modular prompt and demo compilers | instructions, demos per module | grounded LM proposals plus Bayesian or random search | downstream metric on minibatches and full dev | offline compile for fixed graphs (Khattab et al., 2024; Opsahl-Ong et al., 2024) | +| Reflective prompt evolution | instructions, sometimes module-local text | trace-conditioned reflection, mutation, merge, Pareto selection | scalar score plus textual critique and trajectories | low-rollout offline optimization (Agrawal et al., 2025) | +| Textual gradient methods | arbitrary text and code variables in a graph | backward LLM generates local critiques and rewrites | textual gradients over graph edges | local module updates and rapid prototyping (Cheng et al., 2024; Yuksekgonul et al., 2024) | +| Structured prompt program search | prompt sections, formats, examples, symbolic prompt structure | symbolic mutators plus beam or evolutionary search | compile-time objective | prompt programs with explicit sections (Schnabel & Neville, 2024; Spiess et al., 2025) | +| Workflow and topology optimizers | nodes, edges, control flow, config | staged or alternating graph plus config edits | traces, scores, evaluator rationale | agentic systems with structural failure modes (Ma et al., 2026; Wang et al., 2025; J. Zhang et al., 2025; Zhou et al., 2025) | +| Harness and context optimizers | retrieval policy, memory policy, orchestration code, context playbooks | coding agents or structured context evolution | full logs, code diffs, execution traces | long-horizon agents and context-heavy systems (Lee et al., 2026; Q. Zhang et al., 2025) | +| Typed and deterministic compilers | plan schemas, node registry, validators, typed interfaces | planner emits typed plan, compiler validates and assembles | structural validity and task success | high-reliability structured workflows (Harikumar, 2026; Lin et al., 2025; Singhvi et al., 2023) | +| Online adaptation systems | memory, playbooks, tool docs, thresholds, exemplars | bounded reflection, replay, retrieval updates | production traces and delayed reward | post-deployment improvement under guardrails (Banerjee et al., 2026; Hu et al., 2025; Q. Zhang et al., 2025) | + +Three practical conclusions follow from this taxonomy. + +- Prompt and demo optimization is the foundation, not the whole optimizer (Khattab et al., 2024; Opsahl-Ong et al., 2024). +- Structural search should be sparse, constrained, and trace-driven rather than always-on (Wang et al., 2025; J. Zhang et al., 2025; Zhou et al., 2025). +- Online learning should mutate context and routing long before it mutates topology or code (Lee et al., 2026; Singhvi et al., 2023; Q. Zhang et al., 2025). + +## Key ingredients with evidence and citations + +### Search space design + +The search space should be explicit, typed, and factorized. The highest value editable axes today are module instructions, demonstration sets, tool descriptions, retrieval and routing policies, validators, and a narrow set of graph edits (Ghoshal et al., 2026; Opsahl-Ong et al., 2024; Wang et al., 2025). MIPRO gives the strongest base case for fixed graphs by jointly searching instructions and few-shot demonstrations per module with grounded proposal generation and Bayesian selection (Opsahl-Ong et al., 2024). GEPA then shows that natural language rule updates can outperform rollout-heavy RL when the prompt is the dominant artifact (Agrawal et al., 2025). + +Graph and harness search matter, but only after the local text space is mature. Maestro reports consistent gains from joint graph and config optimization over config-only tuning, especially on workflows with missing intermediate nodes or poor information flow (Wang et al., 2025). CE-Graph reaches a similar conclusion by restricting structure search to operator-constrained edits such as revise prompt, insert node, and delete node, targeted at the densest failure mode rather than broadcast over the whole graph (J. Zhang et al., 2025). Meta-Harness pushes the editable boundary outward to retrieval and memory orchestration code, which is important for production systems but too open-ended for a first release (Lee et al., 2026). + +A practical Rust IR should therefore expose these parameter classes: + +| Parameter class | Examples | Type discipline | First release | +|:---|:---|:---|:---| +| Prompt text | system instruction, rubric, tool-use policy | structured text with role tags | yes | +| Demonstrations | module-local exemplars, trajectories | typed input and output records | yes | +| Tool metadata | tool descriptions, slot hints, examples | schema plus editable doc strings | yes | +| Control knobs | model choice, temperature, retry budget, top k retrieval | numeric or enum | yes | +| Contracts | regex checks, JSON schemas, custom validators | executable predicates | yes | +| Routing | fallback order, verifier gating, abstain threshold | policy objects | yes | +| Structure | insert verifier, split module, add retrieval hop, reroute edge | graph edits over typed nodes | later | +| Harness policy | memory write rules, context assembly, retrieval cache policy | code or declarative policy | later | + +### Proposal mechanisms + +Proposal quality matters as much as search strategy. MIPRO grounds proposals in program summaries, data summaries, successful traces, and bootstrapped demonstrations, then uses a Tree-structured Parzen Estimator to search combinations under minibatch evaluation (Opsahl-Ong et al., 2024). This is the right offline starting point because it combines cheap proposal generation with a robust selector. + +Reflective methods are the next ingredient. GEPA creates prompt mutations from execution traces and textual feedback, then preserves diversity through Pareto frontier maintenance and system-aware merge (Agrawal et al., 2025). TextGrad and OPTO generalize this idea by turning feedback into local textual gradients over a computation graph, which is useful when different parameter types need different update prompts (Cheng et al., 2024; Yuksekgonul et al., 2024). For structural search, staged optimization works better than fully joint search when budgets are limited. MASS warms up blocks, then searches topologies, then retunes globally (Zhou et al., 2025). Cognify adapts the same principle with hierarchical layers and budget reallocation across architecture, step, and prompt changes (He et al., 2025). + +The implementation implication is simple. The optimizer should support multiple proposal engines behind one interface: + +- grounded proposer for prompt and demo candidates +- reflective proposer for trace-conditioned rewrites +- symbolic mutator for section and template edits +- graph proposer for small typed structure edits +- code or harness proposer kept behind a feature gate (Agrawal et al., 2025; Lee et al., 2026; Opsahl-Ong et al., 2024; Schnabel & Neville, 2024; Wang et al., 2025) + +### Feedback signals + +Recent papers converge on one lesson: scalar reward alone is not enough. OPTO argues that execution traces play the role that gradients play in differentiable systems, because they expose the causal path from parameter to failure (Cheng et al., 2024). CE-Graph names the scalar-only problem directly as information collapse and replaces it with failure signatures that encode both where a failure occurred and what semantic error occurred (J. Zhang et al., 2025). JudgeFlow further shows that ranking block responsibility across failed traces yields more stable local optimization than trying to infer blame from a single end-to-end score (Ma et al., 2026). + +A serious optimizer should therefore capture four feedback layers for every run: + +| Layer | Contents | Use | +|:---|:---|:---| +| Outcome | task metric, pass or fail, quality rubric | promotion and Pareto ranking | +| Cost | tokens, dollars, latency, tool count, retries | Pareto ranking and budget gating | +| Trace | per-node inputs, outputs, tool calls, retrieved docs, exceptions | diagnosis and blame assignment | +| Judgment | evaluator rationale, LLM critique, human preference, validator messages | proposal grounding | + +Tool-using systems need an additional tool layer. JTPRO shows that tool selection accuracy, slot filling accuracy, and overall success should be measured separately because tool choice and argument correctness fail for different reasons (Ghoshal et al., 2026). That directly argues for separate blame channels for tool selection, argument shaping, and downstream answer quality. + +### Trace and blame assignment + +A Rust optimizer should treat traces as first-class data, not logging afterthoughts. Each module boundary should emit a typed record with input values, output values, prompt version, demonstrations used, retrieved evidence, tool arguments, validator outcomes, and parent edge identifiers. This is the minimum needed to support the three strongest blame strategies in the literature. + +| Blame strategy | Mechanism | Transferable lesson | +|:---|:---|:---| +| Surrogate sensitivity | learn which parameter combinations raise downstream score | useful for prompt and demo compilers (Opsahl-Ong et al., 2024) | +| Reflective local diagnosis | ask an LM to inspect failed trajectories and rewrite a targeted module | useful when errors are semantic and legible in traces (Agrawal et al., 2025; Ghoshal et al., 2026) | +| Structural failure attribution | convert traces into block or node failure signatures and rank responsibility | needed for workflow edits (Ma et al., 2026; J. Zhang et al., 2025) | + +The implementation choice is to keep blame assignment layered. First use deterministic localization when a validator or parser fails. Second use structural heuristics such as first failing node, repeated retry boundary, or wrong tool choice. Third use an LLM judge only when deterministic and heuristic signals do not isolate the cause. JudgeFlow supports this hierarchy indirectly by showing that judge signals become much more useful when the workflow is already segmented into meaningful blocks (Ma et al., 2026). + +### Pareto and cost tradeoffs + +No single objective is sufficient. LangProBe shows large optimizer by architecture interactions and a strong quality-cost Pareto story for optimized language programs, but not a universal winner across tasks and models (Tan et al., 2025). GEPA shows sample efficiency in rollouts (Agrawal et al., 2025). Cognify shows quality, cost, and latency can all be improved when the optimizer can change different layers and reallocate budget adaptively (He et al., 2025). A production optimizer should thus maintain a live Pareto frontier over at least quality, dollar cost, and latency, with optional robustness as a fourth axis (Agrawal et al., 2025; He et al., 2025; Tan et al., 2025). + +This implies two separate frontiers. + +- Training frontier over optimization cost versus candidate quality +- Deployment frontier over runtime quality, runtime cost, and latency + +These frontiers should not be collapsed. A candidate that is expensive to discover may still be cheap and strong at runtime. Compile-time search methods such as SAMMO and MIPRO assume this amortization explicitly (Opsahl-Ong et al., 2024; Schnabel & Neville, 2024). + +### Typed contracts and deterministic boundaries + +Typed contracts are not optional in a Rust system. DSPy Assertions shows that soft suggestions and hard assertions can be compiled into both demonstration filtering and bounded retry logic (Singhvi et al., 2023). TACs formalize type compliance with parse and canonicalization steps between modules, which is useful even if the full probabilistic training scheme is not adopted (Lin et al., 2025). PlanCompiler shows the strongest deterministic version of the same idea: fixed node registry, static graph validation, typed plan schema, and code generation only after validation passes (Harikumar, 2026). + +The transferable architecture is: + +1. Every module declares input type, output type, schema, and validator set. +2. Every LM output passes through parse, canonicalize, and validate steps before downstream use. +3. Hard contract failures stop candidate promotion. +4. Soft contract failures can trigger bounded repair or route to a verifier path (Harikumar, 2026; Lin et al., 2025; Singhvi et al., 2023). + +### Offline compilation and bounded online adaptation + +Offline compilation should be the first-class path. Most of the strongest results come from repeated evaluation on train and dev sets with minibatching, surrogate ranking, or staged halving (He et al., 2025; Opsahl-Ong et al., 2024; Spiess et al., 2025). This is where structural changes, prompt set search, verifier insertion, and routing changes belong. + +Online adaptation should be deliberately smaller in scope. ACE gives the clearest design for safe online evolution: contexts are represented as granular playbook items with deterministic merge and pruning, which avoids monolithic rewrite and context collapse (Q. Zhang et al., 2025). Assertions add bounded retry and repair as a second safe online primitive (Singhvi et al., 2023). The lesson is to whitelist only reversible state: + +- memory or playbook entries +- exemplar caches +- tool-local descriptions +- routing thresholds +- verifier enable or disable flags (Ghoshal et al., 2026; Singhvi et al., 2023; Q. Zhang et al., 2025) + +Unrestricted online graph mutation should stay out of scope for an initial Rust optimizer. + +## Practical architecture requirements for DSRs and Rust + +The implementation target should be a typed optimizer runtime with six core subsystems. + +| Subsystem | Responsibilities | Rust shape | Build priority | +|:---|:---|:---|:---| +| Program IR | typed nodes, edges, contracts, parameter handles | enums, traits, serde structs, graph crate | first | +| Executor and tracer | run graph, record per-node trace, collect costs | async executor plus append-only event log | first | +| Evaluator | task metrics, judges, validators, Pareto scorer | trait objects over metrics and judges | first | +| Optimizer kernel | candidate pool, proposal engines, selection, promotion | scheduler plus pluggable proposer traits | first | +| Compiler | freeze artifact bundle for deployment | deterministic serializer and manifest writer | first | +| Online controller | canary, rollback, memory updates, threshold tuning | separate state machine with audit log | second | + +A good IR is more important than a clever search loop. It should separate immutable structure from mutable parameters and expose provenance for every parameter value. + +``` rust +struct Program { + nodes: Vec, + edges: Vec, + contracts: Vec, + params: ParamStore, +} + +enum Param { + Instruction(TextParam), + DemoSet(DemoParam), + ToolDoc(ToolDocParam), + Decode(DecodeParam), + Route(RouteParam), + Threshold(ThresholdParam), +} + +struct TraceRecord { + node_id: NodeId, + param_version: ParamVersion, + input: Value, + output: Value, + retrieved: Vec, + tool_call: Option, + validators: Vec, + latency_ms: u64, + token_cost: TokenCost, +} +``` + +Three architecture requirements are non-negotiable. + +- Deterministic artifact bundles. A compiled candidate should be a frozen manifest of graph version, parameter versions, validators, and evaluation results (Harikumar, 2026; Khattab et al., 2024). +- Full artifact logging. Meta-Harness shows that optimizer quality rises when prior code, scores, and traces remain inspectable rather than compressed into short summaries (Lee et al., 2026). The same principle should hold for prompt and graph candidates. +- Small, typed edit operators. Each proposer should emit typed patches rather than free-form rewritten programs. CE-Graph and Maestro both benefit from edit libraries and trust regions over graph changes (Wang et al., 2025; J. Zhang et al., 2025). + +For DSRs-like ergonomics, expose a declarative user surface and keep optimizer internals out of the authoring API. Users should define modules, signatures, contracts, and metrics. The system should own the search, trace capture, and promotion policy (Khattab et al., 2023, 2024; Opsahl-Ong et al., 2024). + +## What to build first versus leave as scaffolding + +### Build first + +| Component | Why first | Evidence | +|:---|:---|:---| +| Typed graph IR with contracts | everything else depends on safe composition and localized edits | (Harikumar, 2026; Lin et al., 2025; Singhvi et al., 2023) | +| Executor with rich traces | trace quality is the main lever for later optimization | (Cheng et al., 2024; Ma et al., 2026; J. Zhang et al., 2025) | +| MIPRO-like prompt and demo compiler | strongest validated baseline for modular offline optimization | (Opsahl-Ong et al., 2024) | +| Candidate pool with Pareto ranking | avoids greedy collapse and supports cost-aware selection | (Agrawal et al., 2025; Tan et al., 2025) | +| Deterministic validators and retry wrappers | enables safe compile-time filtering and bounded repair | (Harikumar, 2026; Singhvi et al., 2023) | +| Benchmark harness and artifact store | necessary to avoid chasing anecdotes | (Lee et al., 2026; Tan et al., 2025) | + +### Build next + +| Component | Why next | Evidence | +|:---|:---|:---| +| Reflective rewrite engine | improves low-budget search using traces and critique | (Agrawal et al., 2025; Ghoshal et al., 2026) | +| Hierarchical budget allocator | important once the search space spans multiple edit layers | (He et al., 2025; Spiess et al., 2025) | +| Tool-doc optimizer | high leverage for agents with many tools | (Ghoshal et al., 2026) | +| Block judge | stabilizes blame for workflow-local edits | (Ma et al., 2026) | +| Playbook memory updater | safest online adaptation primitive | (Q. Zhang et al., 2025) | + +### Leave as scaffolding + +| Component | Reason to delay | Evidence | +|:---|:---|:---| +| Open-ended topology search | high payoff but large search explosion and evaluation cost | (Wang et al., 2025; Zhou et al., 2025) | +| Harness code synthesis | powerful but operationally heavy and hard to sandbox in v1 | (Lee et al., 2026) | +| RL-heavy online optimization | weaker evidence than reflective and trace-based methods at equal budget | (Agrawal et al., 2025) | +| Fully probabilistic cascade training | promising but a bigger systems and training commitment than needed for v1 | (Lin et al., 2025) | +| Unbounded self-modifying agents | safety and observability burden is too high for early deployment | (Lee et al., 2026; Q. Zhang et al., 2025) | + +The best first version is therefore not a universal optimizer. It is a strong offline compiler with rich traces, typed validators, and enough reflective local repair to avoid wasting search budget on obvious repeats. + +## Evaluation and benchmarking protocol + +Evaluation should separate optimizer quality from model quality, architecture quality, and deployment cost. LangProBe is the best benchmark anchor because it exposes optimizer by architecture interactions instead of reporting single best cases (Tan et al., 2025). + +### Core benchmark matrix + +| Axis | Minimum design | +|:---|:---| +| Tasks | include classification, extraction, reasoning, RAG, tool use, multi-step agent tasks | +| Models | at least one frontier API model and two smaller open models | +| Architectures | single call, fixed modular pipeline, verifier-augmented pipeline, one agentic workflow | +| Optimizers | no optimization, random or few-shot search, MIPRO-like, reflective, hierarchical | +| Budgets | fixed optimizer token budget and fixed rollout budget | + +### Reported metrics + +| Metric family | What to report | +|:---|:---| +| Quality | task score, robustness under seed variation, held-out test score | +| Runtime cost | input and output tokens, tool calls, dollars, latency percentile | +| Optimization cost | total rollouts, optimizer tokens, wall-clock compile time | +| Generalization | transfer across models, across nearby tasks, and across prompt seeds | +| Reliability | validator pass rate, parse success, tool selection accuracy, slot accuracy | + +### Protocol details + +- Use separate train, dev, and test splits. Do not promote candidates on test. +- Plot budgeted optimization curves, not only final best score. +- Report Pareto frontiers at runtime and compile time. +- Include ablations for trace richness, blame method, proposal engine, and search space width. +- Re-run with multiple seeds because prompt optimization variance is real (Tan et al., 2025; X. Zhang et al., 2026). +- Evaluate both fixed-model and cross-model transfer because prompt quality is model-specific more often than many systems assume (Schnabel & Neville, 2024; Tan et al., 2025). +- For tool agents, break out tool selection accuracy, slot filling accuracy, and overall success (Ghoshal et al., 2026). +- For online adaptation, require canary traffic, delayed promotion, and rollback statistics (Q. Zhang et al., 2025). + +A practical internal benchmark suite should include at least one task where structure search matters. Otherwise the optimizer will look stronger than it is by succeeding only on prompt-local problems. Maestro and CE-Graph both show that some failures are structural and remain invisible to prompt-only compilers (Wang et al., 2025; J. Zhang et al., 2025). + +## Risks and open research gaps + +The first risk is optimizer overfitting. LangProBe shows that some optimizers, especially rule-induction styles, can improve dev while losing on test (Tan et al., 2025). The second risk is blame error. Rich traces help, but judge-based attribution is still noisy and can send the optimizer to the wrong block (Ma et al., 2026). The third risk is search-space inflation. Once topology, harness code, and online state are all mutable, evaluation cost can dominate model cost (He et al., 2025; Lee et al., 2026). + +The deeper research gaps are more important than any single algorithmic choice. + +| Gap | Why unsolved | Implication | +|:---|:---|:---| +| Stable cross-model optimization | prompts and structures transfer poorly across models | keep optimizer model-specific by default (Schnabel & Neville, 2024; Tan et al., 2025) | +| Reliable structural blame | localizing root cause in long agent traces is still noisy | structural search needs stronger diagnostics (Ma et al., 2026; J. Zhang et al., 2025) | +| Multi-objective selection under drift | Pareto fronts shift with model price and latency changes | deployment policy must be recalibrated continuously (He et al., 2025; Tan et al., 2025) | +| Safe online learning | context updates help, but long-horizon credit remains weak | keep online edits local and reversible (Q. Zhang et al., 2025) | +| Typed generation under semantic constraints | syntax can be enforced more easily than semantic correctness | combine type checks with domain validators (Harikumar, 2026; Lin et al., 2025) | +| Benchmark realism | current benchmarks only partly capture production harness behavior | internal harness benchmarks remain necessary (Lee et al., 2026; Tan et al., 2025) | + +One more open point is optimizer self-reference. Meta-optimizers such as metaTextGrad and Meta-Harness suggest gains from optimizing the optimizer or the harness around the model, but they also increase system complexity fast (Lee et al., 2026; Xu et al., 2025). For a Rust v1, that complexity should remain out of the critical path. + +## Annotated bibliography of the most important papers + +| Paper | Why it matters for this build | +|:---|:---| +| (Opsahl-Ong et al., 2024) | Best current anchor for offline compilation of modular LM programs. Defines the practical baseline for joint instruction and demo search with grounded proposal generation, minibatch evaluation, and surrogate-guided selection. | +| (Khattab et al., 2024) | Canonical DSPy anchor for declarative LM programs compiled against downstream metrics. Useful for the user-facing abstraction and compiler framing even though the optimizer details are less implementation-specific here. | +| (Khattab et al., 2023) | Earlier DSPy paper that makes the self-improving pipeline idea explicit and remains useful for the design philosophy behind program-level optimization. | +| (Agrawal et al., 2025) | Strong evidence that reflective prompt evolution with trace-conditioned natural language feedback can beat rollout-heavy RL while using far fewer rollouts. Important for local rewrite engines and Pareto candidate management. | +| (Cheng et al., 2024) | Provides the cleanest conceptual model for traces as optimizer inputs. Useful for optimizer APIs, trace representation, and the idea of a minimal relevant subgraph for updates. | +| (He et al., 2025) | Best source for hierarchical budget allocation across architecture, step, and prompt changes. Important once the optimizer spans more than prompts. | +| (Wang et al., 2025) | Best evidence that joint graph plus config search fixes failures prompt-only methods cannot. Important for future structure search, trust regions, and graph edit libraries. | +| (J. Zhang et al., 2025) | Sharpest treatment of failure distribution modeling and operator-constrained workflow repair. Important for failure signatures, clustering, and targeted graph edits. | +| (Ma et al., 2026) | Best current paper for block-level blame assignment in complex workflows. Important for ranking responsibility and limiting edits to one block at a time. | +| (Ghoshal et al., 2026) | Most concrete recent treatment of tool-description optimization. Important for splitting tool selection from slot filling and for co-optimizing global instructions with tool-local schema text. | +| (Q. Zhang et al., 2025) | Best source for bounded online adaptation through structured playbooks, deterministic merge, and pruning. Important for safe memory evolution after deployment. | +| (Lin et al., 2025) | Strong typed-systems paper. Even without adopting its full training method, its parse and canonicalize discipline and type-compliance framing are directly valuable. | +| (Singhvi et al., 2023) | Best bridge between compile-time optimization and runtime repair through hard and soft assertions. Important for validators, retries, and demonstration filtering. | +| (Harikumar, 2026) | Strong deterministic systems anchor. Important for separating generative planning from deterministic compilation and for static validation before execution. | +| (Tan et al., 2025) | Best benchmark anchor for optimizer evaluation because it studies tasks, models, programs, and optimizers jointly rather than in isolation. | +| (Lee et al., 2026) | Most compelling evidence that harness logic itself is a major optimization surface. Important longer-term, but should stay out of the critical path for a first Rust release. | + +The implementation-first reading of this literature is straightforward. Build a typed offline compiler first, make traces and contracts first-class, keep candidate selection Pareto-aware, add reflective local rewrite before broad structure search, and constrain online learning to reversible context updates. That is the smallest design that matches where the 2025 and 2026 literature is actually strongest (Agrawal et al., 2025; He et al., 2025; Opsahl-Ong et al., 2024; Tan et al., 2025; Q. Zhang et al., 2025). + +--- + +## References + +Agrawal, L. A., Tan, S., Soylu, D., Ziems, N., Khare, R., Opsahl-Ong, K., Singhvi, A., Shandilya, H., Ryan, M. J., Jiang, M., Potts, C., Sen, K., Dimakis, A., Stoica, I., Klein, D., Zaharia, M. A., & Khattab, O. (2025). GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning. *ArXiv*, *abs/2507.19457*. + +Banerjee, P., Moshtaghi, M., & Chadha, A. (2026). *APEX-EM: Non-Parametric Online Learning for Autonomous Agents via Structured Procedural-Episodic Experience Replay*. + +Cheng, C.-A., Nie, A., & Swaminathan, A. (2024). Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs. *Advances in Neural Information Processing Systems 37*. + +Ghoshal, S., Mittal, A., Singh, J., Ballesteros, M., Sun, W., Tu, F., Singh, S., Benajiba, Y., Shah, F., Bharadwaj, S., Ravi, S., & Roth, D. (2026). *JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents*. + +Harikumar, P. (2026). *PlanCompiler: A Deterministic Compilation Architecture for Structured Multi-Step LLM Pipelines*. + +He, Z., Abhyankar, R., Srivatsa, V., & Zhang, Y. (2025). Cognify: Supercharging Gen-AI Workflows With Hierarchical Autotuning. *Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2*. + +Hu, M., Durme, B. V., Andreas, J., & Jhamtani, H. (2025). Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting. *ArXiv*, *abs/2510.10304*. + +Khattab, O., Singhvi, A., Maheshwari, P., Zhang, Z., Santhanam, K., Vardhamanan, S., Haq, S., Sharma, A., Joshi, T. T., Moazam, H., Miller, H., Zaharia, M., & Potts, C. (2023). DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines. *ArXiv*, *abs/2310.03714*. + +Khattab, O., Singhvi, A., Maheshwari, P., Zhang, Z., Santhanam, K., Vardhamanan, S., Haq, S., Sharma, A., Joshi, T. T., Moazam, H., Miller, H., Zaharia, M., & Potts, C. (2024). DSPy: Compiling Declarative Language Model Calls into State-of-the-Art Pipelines. *International Conference on Learning Representations*. + +Lee, Y., Nair, R., Zhang, Q., Lee, K., Khattab, O., & Finn, C. (2026). *Meta-Harness: End-to-End Optimization of Model Harnesses*. + +Lin, C., Peng, D., Lu, Y., Zhang, M., & Ie, E. (2025). Type-Compliant Adaptation Cascades: Adapting Programmatic LM Workflows to Data. *ArXiv*, *abs/2508.18244*. + +Ma, Z., Zhao, Z., Hua, C., Berto, F., & Park, J. (2026). JudgeFlow: Agentic Workflow Optimization via Block Judge. *ArXiv*, *abs/2601.07477*. + +Opsahl-Ong, K., Ryan, M. J., Purtell, J., Broman, D., Potts, C., Zaharia, M., & Khattab, O. (2024). Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs. *ArXiv*, *abs/2406.11695*. + +Schnabel, T., & Neville, J. (2024). Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization. *Conference on Empirical Methods in Natural Language Processing*, 670–686. + +Singhvi, A., Shetty, M., Tan, S., Potts, C., Sen, K., Zaharia, M., & Khattab, O. (2023). DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines. *ArXiv*, *abs/2312.13382*. + +Spiess, C., Vaziri, M., Mandel, L., & Hirzel, M. (2025). AutoPDL: Automatic Prompt Optimization for LLM Agents. *ArXiv*, *abs/2504.04365*. + +Tan, S., Agrawal, L. A., Singhvi, A., Lai, L., Ryan, M. J., Klein, D., Khattab, O., Sen, K., & Zaharia, M. (2025). LangProBe: a Language Programs Benchmark. *ArXiv*, *abs/2502.20315*. + +Wang, W., Kattakinda, P., & Feizi, S. (2025). Maestro: Joint Graph & Config Optimization for Reliable AI Agents. *ArXiv*, *abs/2509.04642*. + +Xu, G., Yuksekgonul, M., Guestrin, C., & Zou, J. (2025). metaTextGrad: Automatically optimizing language model optimizers. *ArXiv*, *abs/2505.18524*. + +Yuksekgonul, M., Bianchi, F., Boen, J., Liu, S., Huang, Z., Guestrin, C., & Zou, J. (2024). TextGrad: Automatic “Differentiation” via Text. *ArXiv*, *abs/2406.07496*. + +Zhang, J., Cai, K., Zeng, Q., Liu, N., Fan, S., Chen, Z., & Wang, K. (2025). Failure-Driven Workflow Refinement. *ArXiv*, *abs/2510.10035*. + +Zhang, Q., Hu, C., Upasani, S., Ma, B., Hong, F., Kamanuru, V., Rainton, J., Wu, C., Ji, M., Li, H., Thakker, U., Zou, J., & Olukotun, K. (2025). Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models. *ArXiv*, *abs/2510.04618*. + +Zhang, X., Wang, G., Cui, Y., Qiu, W., Li, Z., Zhu, B., & He, P.-G. (2026). *Prompt Optimization Is a Coin Flip: Diagnosing When It Helps in Compound AI Systems*. + +Zhou, H., Wan, X., Wan, X., Sun, R., Palangi, H., Iqbal, S., Vuli’c, I., Korhonen, A., & Arik, S. Ö. (2025). Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies. *ArXiv*, *abs/2502.02533*. From 329383483d8b2bd4864985ef22b82b31b689dfcf Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 00:35:08 -0700 Subject: [PATCH 03/15] chore: dsrs crate split preflight baseline cargo check --workspace succeeds before moving code. cargo test --workspace --no-run succeeds; this gives the split a green compile baseline. Scaffolding: full runtime test execution and coverage measurement happen in the next snapshots so the crate boundary changes can keep their own evidence. --- ...6-05-08-dsrs-crate-split-implementation.md | 1641 +++++++++++++++++ 1 file changed, 1641 insertions(+) create mode 100644 docs/plans/2026-05-08-dsrs-crate-split-implementation.md diff --git a/docs/plans/2026-05-08-dsrs-crate-split-implementation.md b/docs/plans/2026-05-08-dsrs-crate-split-implementation.md new file mode 100644 index 00000000..73ddb672 --- /dev/null +++ b/docs/plans/2026-05-08-dsrs-crate-split-implementation.md @@ -0,0 +1,1641 @@ +# DSRs Crate Split — Implementation Plan + +> **For Claude:** REQUIRED SUB-SKILL: Use `superpowers:executing-plans` to implement this plan task-by-task. + +**Goal:** Decompose `crates/dspy-rs` into 9 layered crates (`dsrs-core`, `dsrs-lm`, `dsrs-trace`, `dsrs-cache`, `dsrs-predict`, `dsrs-evaluate`, `dsrs-gepa`, `dsrs-data`, `dsrs-leaven`) per the design at [`2026-05-08-dsrs-crate-split-design.md`](2026-05-08-dsrs-crate-split-design.md). No facade. Hard cutover. Delete COPRO and MIPROv2 along the way. + +**Architecture:** Layer-aligned. `dsrs-core` is the small foundation that exposes abstract bridge traits (`DynPredictor`, `TraceSink`, `CacheBackend`, `LmClient`) and the Facet walker. Concrete trace/cache/lm crates implement those traits. `dsrs-predict` depends on core+lm. `dsrs-evaluate` depends on core only. `dsrs-gepa` is a leaf optimizer (sunset candidate). `dsrs-leaven` provides the leaven integration. The `dspy-rs` aggregator is dissolved. + +**Tech Stack:** Rust 2024, `cargo` workspace, `jj` for VCS, `uv`-managed Python out of scope here. Existing crates use `bamltype`, `bamltype-derive`, `dsrs-macros`, `facet`, `rig-core`, `foyer`, `parquet`, `arrow`, `hf-hub`. + +**Verification discipline:** Each task ends with `cargo check --workspace`, `cargo test --workspace` (or a tighter scope when justified), and a `jj` commit. The discipline is *preserve the test suite while moving code*. If a test was load-bearing for COPRO/MIPROv2 specifically, it gets deleted with the optimizer. Otherwise, every test that passes today must pass at the end of every task. + +**One-time skill reads (engineer should do these once before starting):** +- `using-jj` (this is a jj repo; no `git add`, no staging area) +- `systematic-debugging` (for when something doesn't compile and you need to find why) + +--- + +## Task 0: Preflight — branch, snapshot, baseline + +**Files:** none modified. + +**Step 1: Create a working change for this work, off `main`.** + +```bash +jj new main -m "wip: dsrs crate split" +``` + +**Step 2: Confirm a clean baseline.** + +```bash +cargo check --workspace +cargo test --workspace --no-run +``` + +Expected: both succeed. If `cargo test --workspace --no-run` (compile-only) fails, **stop**. The split assumes a green baseline. Investigate before continuing. + +**Step 3: Capture the baseline test count for parity-checking later.** + +```bash +cargo test --workspace -- --list 2>/dev/null | grep -c ': test$' > /tmp/dsrs-baseline-test-count +cat /tmp/dsrs-baseline-test-count +``` + +Record the number. After the split (minus deleted COPRO/MIPRO tests), the count must be `baseline − (count of deleted tests)` exactly. + +**Step 4: Identify the COPRO/MIPRO tests that will be deleted.** + +```bash +ls crates/dspy-rs/tests/ | grep -iE "copro|mipro" +``` + +Expected: `test_optimize_mipro.rs` (or similar). Write the names into `/tmp/dsrs-deleted-tests` for accounting. + +**Step 5: Commit the (no-op) preflight as a marker.** + +```bash +jj describe -m "chore: dsrs crate split — preflight baseline (no code changes)" +jj new +``` + +(The describe attaches a marker to the empty change so the work has a clear starting point in `jj log`.) + +--- + +## Task 1: Create empty crate skeletons in workspace + +**Goal:** Register all 9 new crates in the workspace `Cargo.toml` with empty `lib.rs` files. The workspace builds. No code has moved yet. + +**Files:** +- Create: `crates/dsrs-core/Cargo.toml`, `crates/dsrs-core/src/lib.rs` +- Create: `crates/dsrs-lm/Cargo.toml`, `crates/dsrs-lm/src/lib.rs` +- Create: `crates/dsrs-trace/Cargo.toml`, `crates/dsrs-trace/src/lib.rs` +- Create: `crates/dsrs-cache/Cargo.toml`, `crates/dsrs-cache/src/lib.rs` +- Create: `crates/dsrs-predict/Cargo.toml`, `crates/dsrs-predict/src/lib.rs` +- Create: `crates/dsrs-evaluate/Cargo.toml`, `crates/dsrs-evaluate/src/lib.rs` +- Create: `crates/dsrs-gepa/Cargo.toml`, `crates/dsrs-gepa/src/lib.rs` +- Create: `crates/dsrs-data/Cargo.toml`, `crates/dsrs-data/src/lib.rs` +- Create: `crates/dsrs-leaven/Cargo.toml`, `crates/dsrs-leaven/src/lib.rs` +- Modify: `Cargo.toml` (workspace root) — register the new members + +**Step 1: Create each new crate's `Cargo.toml` with the minimum viable manifest.** + +For each crate, the manifest looks like (substitute crate name and dependencies per design § 3): + +```toml +[package] +name = "dsrs-core" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "DSRs core: signature, module, schema, abstract bridges." + +[dependencies] +# Empty for now; deps land as code is moved into the crate. +``` + +For `dsrs-leaven`, add path-dep stubs for the leaven crates: + +```toml +[dependencies] +# Path-dep into the sibling leaven workspace. +leaven-core = { path = "../../../leaven/crates/leaven-core" } +leaven-surface = { path = "../../../leaven/crates/leaven-surface" } +leaven-engine = { path = "../../../leaven/crates/leaven-engine" } +leaven-evidence = { path = "../../../leaven/crates/leaven-evidence" } +``` + +(Once `dsrs-leaven` actually imports types, add `dsrs-core`, `dsrs-evaluate`, `dsrs-predict` too.) + +**Step 2: Each `src/lib.rs` is a single line.** + +```rust +//! Empty placeholder — code is migrated into this crate by a later task. +``` + +**Step 3: Register members in workspace `Cargo.toml`.** + +Modify `Cargo.toml:3-7`: + +```toml +[workspace] +resolver = "3" +members = [ + "crates/*", + "vendor/baml/crates/*", +] +``` + +Already uses `crates/*` glob, so the new crates auto-register. Verify with: + +```bash +cargo metadata --format-version=1 --no-deps | python3 -c "import sys,json; m=json.load(sys.stdin); print('\n'.join(p['name'] for p in m['packages']))" | sort +``` + +Expected: lists all current crates plus the 9 new ones. + +**Step 4: Verify the workspace still builds with the empty crates registered.** + +```bash +cargo check --workspace +``` + +Expected: success. The new crates have no code, no deps, build instantly. + +**Step 5: Run the baseline tests.** + +```bash +cargo test --workspace +``` + +Expected: same pass count as Task 0 step 3. + +**Step 6: Commit.** + +```bash +jj describe -m "feat(workspace): register 9 empty dsrs-* crate skeletons + +dsrs-core, dsrs-lm, dsrs-trace, dsrs-cache, dsrs-predict, dsrs-evaluate, +dsrs-gepa, dsrs-data, dsrs-leaven. Empty lib.rs each. No code moved yet." +jj new +``` + +--- + +## Task 2: Extract `dsrs-core` — types and traits foundation + +**Goal:** Move the typed-substrate types out of `dspy-rs/src/core/`, `dspy-rs/src/augmentation.rs`, and the legacy boundary types out of `dspy-rs/src/data/{example,prediction}.rs` into `dsrs-core`. Re-export from `dspy-rs/src/lib.rs` so downstream code is unaffected for now. + +**Why this re-export step:** This is the hard task. Doing it cleanly first means subsequent extractions just move re-exports around. + +**Files:** +- Move: `crates/dspy-rs/src/core/{mod.rs, signature.rs, module.rs, module_ext.rs, schema.rs, predicted.rs, errors.rs, dyn_predictor.rs, specials.rs, settings.rs}` → `crates/dsrs-core/src/` +- Move: `crates/dspy-rs/src/augmentation.rs` → `crates/dsrs-core/src/augmentation.rs` +- Move: `crates/dspy-rs/src/data/example.rs` → `crates/dsrs-core/src/example.rs` +- Move: `crates/dspy-rs/src/data/prediction.rs` → `crates/dsrs-core/src/prediction.rs` +- Modify: `crates/dsrs-core/Cargo.toml` (add `bamltype`, `facet`, `serde`, `serde_json`, `thiserror`, `async-trait`, `indexmap`, `tokio`, `bon`, `tracing`) +- Modify: `crates/dsrs-core/src/lib.rs` (declare modules + pub re-exports) +- Modify: `crates/dspy-rs/Cargo.toml` (add `dsrs-core = { path = "../dsrs-core" }`) +- Modify: `crates/dspy-rs/src/lib.rs` (delete moved `mod core; mod augmentation;` lines, replace with `pub use dsrs_core::*;` for compatibility within the crate) +- Modify: `crates/dspy-rs/src/data/mod.rs` (drop `mod example; mod prediction;`) + +**Step 1: Read the existing module hierarchy to confirm what's where.** + +```bash +ls crates/dspy-rs/src/core/ +cat crates/dspy-rs/src/core/mod.rs +cat crates/dspy-rs/src/lib.rs | head -100 +``` + +This task touches every module currently re-exported by `dspy-rs/src/core/mod.rs` *except* `lm/`, which stays for Task 4. Note what `core/mod.rs` re-exports — the same things must be re-exported from `dsrs-core/src/lib.rs` and from `dspy-rs/src/lib.rs` (via `pub use dsrs_core::*`) for the re-export step to be transparent. + +**Step 2: Move files.** Use `jj` for the moves so file history is preserved. + +```bash +jj file track crates/dsrs-core/src/lib.rs # ensure tracked + +mkdir -p crates/dsrs-core/src +git mv crates/dspy-rs/src/core/signature.rs crates/dsrs-core/src/signature.rs +git mv crates/dspy-rs/src/core/module.rs crates/dsrs-core/src/module.rs +git mv crates/dspy-rs/src/core/module_ext.rs crates/dsrs-core/src/module_ext.rs +git mv crates/dspy-rs/src/core/schema.rs crates/dsrs-core/src/schema.rs +git mv crates/dspy-rs/src/core/predicted.rs crates/dsrs-core/src/predicted.rs +git mv crates/dspy-rs/src/core/errors.rs crates/dsrs-core/src/errors.rs +git mv crates/dspy-rs/src/core/dyn_predictor.rs crates/dsrs-core/src/dyn_predictor.rs +git mv crates/dspy-rs/src/core/specials.rs crates/dsrs-core/src/specials.rs +git mv crates/dspy-rs/src/core/settings.rs crates/dsrs-core/src/settings.rs +git mv crates/dspy-rs/src/augmentation.rs crates/dsrs-core/src/augmentation.rs +git mv crates/dspy-rs/src/data/example.rs crates/dsrs-core/src/example.rs +git mv crates/dspy-rs/src/data/prediction.rs crates/dsrs-core/src/prediction.rs +``` + +(`jj` snapshots the workspace after each command; renames inside a colocated repo are picked up correctly.) + +`crates/dspy-rs/src/core/mod.rs` is left behind — keep it for now, it'll be deleted in step 6 when nothing references it. + +**Step 3: Add the abstract bridge trait stubs to `dsrs-core`.** + +These traits don't exist yet — they're being introduced as part of the split, replacing today's tighter coupling. + +Create `crates/dsrs-core/src/bridges.rs`: + +```rust +//! Abstract bridge traits implemented by dsrs-trace, dsrs-cache, dsrs-lm. +//! +//! These exist so downstream crates (dsrs-predict, dsrs-evaluate, dsrs-gepa, +//! dsrs-leaven) depend only on dsrs-core, not on concrete observability or LM +//! crates. Each capability crate provides a concrete impl. + +use async_trait::async_trait; +use std::sync::Arc; + +/// Sink for execution-graph events. Implemented by `dsrs-trace::ExecutionGraph`. +pub trait TraceSink: Send + Sync + 'static { + fn record(&self, event: TraceEvent); +} + +/// One trace event. Concrete shape lives in dsrs-trace; this is the public +/// boundary type so producers don't depend on the concrete crate. +#[derive(Debug, Clone)] +pub struct TraceEvent { + pub kind: TraceEventKind, + pub at_ns: u64, + pub payload: serde_json::Value, +} + +#[derive(Debug, Clone)] +pub enum TraceEventKind { + PredictStart, + PredictEnd, + LmRequest, + LmResponse, + ParseFailure, + Custom(&'static str), +} + +/// LM response cache backend. Implemented by `dsrs-cache::LmCache`. +#[async_trait] +pub trait CacheBackend: Send + Sync + 'static { + async fn get(&self, key: &str) -> Option; + async fn put(&self, key: String, value: String); +} + +/// LM client trait. Implemented by `dsrs-lm::LM` (which wraps rig-core). +/// Predict and ChainOfThought depend on this trait, not on rig directly. +#[async_trait] +pub trait LmClient: Send + Sync + 'static { + async fn complete(&self, request: LmRequest) -> Result; +} + +#[derive(Debug, Clone)] +pub struct LmRequest { /* fill in from existing core/lm/mod.rs request shape */ } + +#[derive(Debug, Clone)] +pub struct LmResponse { /* fill in from existing core/lm/mod.rs response shape */ } +``` + +Leave `LmRequest` / `LmResponse` shapes as TODO — they get filled in when Task 4 (`dsrs-lm`) extracts the concrete LM and you can see the existing shape. Mark with: + +```rust +// TODO(dsrs-bridges): fill from crates/dspy-rs/src/core/lm/mod.rs after Task 4. +``` + +**Step 4: Write `crates/dsrs-core/src/lib.rs`.** + +```rust +//! DSRs core: typed signatures, modules, predicted outputs, augmentation, +//! abstract bridge traits, and the Facet walker. No LM, no formats, no +//! observability — those are concrete crates that depend on this one. + +pub mod augmentation; +pub mod bridges; +pub mod dyn_predictor; +pub mod errors; +pub mod example; +pub mod module; +pub mod module_ext; +pub mod predicted; +pub mod prediction; +pub mod schema; +pub mod settings; +pub mod signature; +pub mod specials; + +// Stable public surface — match what the old dspy-rs/src/lib.rs re-exported. +// (Refer to crates/dspy-rs/src/lib.rs at HEAD~1 to enumerate.) +pub use augmentation::*; +pub use bridges::{CacheBackend, LmClient, LmRequest, LmResponse, TraceEvent, TraceEventKind, TraceSink}; +pub use dyn_predictor::*; +pub use errors::{ConversionError, ErrorClass, LmError, ParseError, PredictError}; +pub use example::Example; +pub use module::Module; +pub use module_ext::*; +pub use predicted::{CallMetadata, Predicted}; +pub use prediction::Prediction; +pub use schema::SignatureSchema; +pub use settings::*; +pub use signature::Signature; +pub use specials::*; +``` + +**Step 5: Update `crates/dsrs-core/Cargo.toml` deps.** + +Read the current `crates/dspy-rs/Cargo.toml` `[dependencies]` and copy across only what core actually needs (the moved files import them): + +```toml +[dependencies] +async-trait = "0.1.83" +bamltype = { path = "../bamltype" } +bon = "3.7.0" +facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } +indexmap = "2.10.0" +serde = { version = "1.0.219", features = ["derive"] } +serde_json = { version = "1.0.140", features = ["preserve_order"] } +thiserror = "2.0.17" +tokio = { version = "1.46.1", features = ["sync"] } +tracing = "0.1.44" +``` + +`dsrs_macros` is NOT a dependency of `dsrs-core`; only end-user crates depend on the macros. + +**Step 6: Wire `dsrs-core` back into `dspy-rs` as a transparent re-export.** + +Modify `crates/dspy-rs/Cargo.toml`: + +```toml +[dependencies] +dsrs-core = { path = "../dsrs-core" } +# ... existing deps stay (still used by lm/, predictors/, optimizer/, etc.) +``` + +Modify `crates/dspy-rs/src/lib.rs` — delete `mod augmentation;` and `mod core;` lines, replace with: + +```rust +// Transparent re-export of dsrs-core (extracted in Task 2). Subsequent tasks +// will move more code into dedicated crates and shrink this file further. +pub use dsrs_core::*; +``` + +The `pub mod core;` declaration in `lib.rs` is gone. But code inside `dspy-rs` that says `use crate::core::Foo` needs updating to `use dsrs_core::Foo`. Find and update: + +```bash +grep -rn "crate::core::" crates/dspy-rs/src/ | wc -l +``` + +Replace `crate::core::` with `dsrs_core::` (and `crate::augmentation::` with `dsrs_core::`). + +```bash +grep -rln "crate::core::\|crate::augmentation::" crates/dspy-rs/src/ | xargs sed -i '' \ + -e 's|crate::core::|dsrs_core::|g' \ + -e 's|crate::augmentation::|dsrs_core::|g' +``` + +(macOS BSD `sed` syntax above. On Linux: `sed -i 's|...|...|g'`.) + +Modify `crates/dspy-rs/src/data/mod.rs` — drop `pub mod example;` and `pub mod prediction;` lines (the files are gone). Add `pub use dsrs_core::{Example, Prediction};` if anything inside `data/` imports them. + +Delete `crates/dspy-rs/src/core/mod.rs` once you've confirmed nothing in `dspy-rs` still says `use crate::core`: + +```bash +grep -rn "crate::core" crates/dspy-rs/src/ # expect empty +rm crates/dspy-rs/src/core/mod.rs +rmdir crates/dspy-rs/src/core # only if empty (lm/ should still be inside) +``` + +`crates/dspy-rs/src/core/lm/` stays — it moves in Task 4. + +If `core/` still has `lm/` in it, that's fine — `mod core { pub mod lm; }` in `lib.rs` handles it. Check the current state of `lib.rs` and adjust. + +**Step 7: Build the new crate first, then the workspace.** + +```bash +cargo check -p dsrs-core +``` + +Expected: success. If errors mention missing imports, the moved files reference siblings (e.g. `signature.rs` uses `super::module`). Fix to use `crate::module` (now flat in `dsrs-core`). + +```bash +cargo check --workspace +``` + +Expected: success. Any errors here mean some `dspy-rs` consumer still references `crate::core::X` or `crate::augmentation::X`. Fix. + +**Step 8: Run the full test suite.** + +```bash +cargo test --workspace +``` + +Expected: same pass count as Task 0. The re-export keeps tests working unchanged. + +**Step 9: Commit.** + +```bash +jj describe -m "refactor(dsrs-core): extract typed-substrate foundation from dspy-rs + +Moves core/{signature, module, module_ext, schema, predicted, errors, +dyn_predictor, specials, settings}.rs, augmentation.rs, and data/{example, +prediction}.rs into the new dsrs-core crate. Adds abstract bridge traits +(TraceSink, CacheBackend, LmClient) ready for concrete impls in subsequent +tasks. dspy-rs becomes a transparent re-export shell — every existing import +path still resolves." +jj new +``` + +--- + +## Task 3: Update `dsrs-macros` to emit `dsrs-core` paths + +**Goal:** Macros (`#[derive(Signature)]`, etc.) currently emit code that resolves `dspy_rs::TypeIR`, `dspy_rs::Constraint`, etc. After Task 2 those re-export from `dsrs-core`, so it works — but the proper path is `dsrs_core::*`. Fix the resolution at the source. + +**Files:** +- Modify: `crates/dsrs-macros/src/runtime_path.rs` +- Modify: `crates/dsrs-macros/Cargo.toml` (rename the path-resolution target) + +**Step 1: Read current resolver.** + +```bash +cat crates/dsrs-macros/src/runtime_path.rs +``` + +**Step 2: Rewrite to resolve `dsrs-core`.** + +```rust +use proc_macro_crate::{FoundCrate, crate_name}; +use proc_macro2::Span; + +pub(crate) fn resolve_dsrs_core_path() -> syn::Result { + match crate_name("dsrs-core") { + Ok(FoundCrate::Itself) => Ok(syn::parse_quote!(::dsrs_core)), + Ok(FoundCrate::Name(name)) => { + let ident = syn::Ident::new(&name.replace('-', "_"), Span::call_site()); + Ok(syn::parse_quote!(::#ident)) + } + Err(_) => Err(syn::Error::new( + Span::call_site(), + "could not resolve `dsrs-core`; add it as a dependency (renamed dependencies are supported)", + )), + } +} +``` + +**Step 3: Update every callsite of the old function name.** + +```bash +grep -rn "resolve_dspy_rs_path" crates/dsrs-macros/src/ +``` + +Replace each call with `resolve_dsrs_core_path`. Update any local variable names that mention `dspy_rs` — call them `dsrs_core_path` for clarity. + +**Step 4: Verify the macros still expand correctly by building a downstream user.** + +```bash +cargo check -p dspy-rs +``` + +Expected: success. Macro-generated code now targets `dsrs_core::*` paths, which `dspy-rs` re-exports. + +**Step 5: Run macro contract tests.** + +```bash +cargo test -p dspy-rs --test test_field_macro --test test_bamltype_attr_contract --test test_bamltype_docs_contract +``` + +Expected: all pass. + +**Step 6: Commit.** + +```bash +jj describe -m "refactor(dsrs-macros): emit dsrs-core paths instead of dspy-rs + +Macro-generated code now references ::dsrs_core::* directly. The dspy-rs +re-export still works for source-level imports, but the canonical path the +proc macro emits is the new core crate." +jj new +``` + +--- + +## Task 4: Extract `dsrs-trace` + +**Goal:** Move execution-graph recording into `dsrs-trace`. Implements `dsrs_core::TraceSink`. + +**Files:** +- Move: `crates/dspy-rs/src/trace/{mod, dag, executor, value, context}.rs` → `crates/dsrs-trace/src/` +- Modify: `crates/dsrs-trace/Cargo.toml` +- Modify: `crates/dsrs-trace/src/lib.rs` +- Modify: `crates/dspy-rs/Cargo.toml` +- Modify: `crates/dspy-rs/src/lib.rs` (drop `pub mod trace;`, add `pub use dsrs_trace as trace;`) + +**Step 1: Move files.** + +```bash +git mv crates/dspy-rs/src/trace/mod.rs crates/dsrs-trace/src/lib.rs +git mv crates/dspy-rs/src/trace/dag.rs crates/dsrs-trace/src/dag.rs +git mv crates/dspy-rs/src/trace/executor.rs crates/dsrs-trace/src/executor.rs +git mv crates/dspy-rs/src/trace/value.rs crates/dsrs-trace/src/value.rs +git mv crates/dspy-rs/src/trace/context.rs crates/dsrs-trace/src/context.rs +rmdir crates/dspy-rs/src/trace +``` + +**Step 2: Update `dsrs-trace/Cargo.toml`.** + +```toml +[dependencies] +dsrs-core = { path = "../dsrs-core" } +serde = { version = "1.0.219", features = ["derive"] } +serde_json = "1.0.140" +tokio = { version = "1.46.1", features = ["sync"] } +tracing = "0.1.44" +``` + +(Add others as compile errors reveal.) + +**Step 3: Implement `TraceSink` for the concrete graph.** + +In `crates/dsrs-trace/src/lib.rs`, after the `pub mod` declarations: + +```rust +use dsrs_core::{TraceEvent, TraceSink}; + +impl TraceSink for ExecutionGraph { + fn record(&self, event: TraceEvent) { + // Existing recording logic moved here. + } +} +``` + +Adjust based on the actual `ExecutionGraph` type from `dag.rs`. If `record` doesn't quite fit today's API, add a thin adapter method. + +**Step 4: Wire `dsrs-trace` back into `dspy-rs`.** + +`crates/dspy-rs/Cargo.toml`: + +```toml +dsrs-trace = { path = "../dsrs-trace" } +``` + +`crates/dspy-rs/src/lib.rs`: + +```rust +// Replace existing `pub mod trace;` with: +pub use dsrs_trace as trace; +``` + +Update internal imports inside `dspy-rs/src/`: + +```bash +grep -rn "crate::trace::" crates/dspy-rs/src/ | wc -l +grep -rln "crate::trace::" crates/dspy-rs/src/ | xargs sed -i '' 's|crate::trace::|dsrs_trace::|g' +``` + +**Step 5: Build and test.** + +```bash +cargo check --workspace +cargo test --workspace +``` + +Expected: same pass count. + +**Step 6: Commit.** + +```bash +jj describe -m "refactor(dsrs-trace): extract execution-graph recording + +Moves trace/ into the dsrs-trace crate. ExecutionGraph implements +dsrs_core::TraceSink so producers depend on the trait, not the concrete +crate. dspy-rs re-exports dsrs-trace as ::trace for compatibility within +this transitional period." +jj new +``` + +--- + +## Task 5: Extract `dsrs-cache` + +**Goal:** Move foyer-backed LM response cache into `dsrs-cache`. Implements `dsrs_core::CacheBackend`. + +**Files:** +- Move: `crates/dspy-rs/src/utils/cache.rs` → `crates/dsrs-cache/src/lib.rs` +- Possibly move: `crates/dspy-rs/src/utils/telemetry.rs` (if it's only used by cache; otherwise leave) +- Modify: `crates/dsrs-cache/Cargo.toml` +- Modify: `crates/dspy-rs/src/utils/mod.rs` (drop `pub mod cache;`) +- Modify: `crates/dspy-rs/Cargo.toml` + +**Step 1: Audit `telemetry.rs` usage.** + +```bash +grep -rn "utils::telemetry\|crate::utils::telemetry" crates/dspy-rs/src/ +``` + +If it's only referenced from cache: move both. If broader: leave `telemetry.rs` in `dspy-rs/utils/` for now (a later task can fold it into `dsrs-trace` or `dsrs-core` as appropriate). + +**Step 2: Move and wire.** + +```bash +git mv crates/dspy-rs/src/utils/cache.rs crates/dsrs-cache/src/lib.rs +``` + +`crates/dsrs-cache/Cargo.toml`: + +```toml +[dependencies] +dsrs-core = { path = "../dsrs-core" } +async-trait = "0.1.83" +foyer = { version = "0.20.0", features = ["serde"] } +serde = { version = "1.0.219", features = ["derive"] } +serde_json = "1.0.140" +tempfile = "3.23.0" +tokio = { version = "1.46.1", features = ["full"] } +``` + +In `crates/dsrs-cache/src/lib.rs`, add `impl CacheBackend for LmCache { ... }` (use the existing put/get methods). + +**Step 3: Update `dspy-rs`.** + +`crates/dspy-rs/Cargo.toml`: + +```toml +dsrs-cache = { path = "../dsrs-cache" } +``` + +`crates/dspy-rs/src/utils/mod.rs`: drop `pub mod cache;`. Add `pub use dsrs_cache as cache;` if any external code references `dspy_rs::utils::cache`. + +```bash +grep -rn "crate::utils::cache\|dspy_rs::utils::cache" crates/dspy-rs/ +``` + +Adjust imports. + +**Step 4: Build and test.** + +```bash +cargo check --workspace +cargo test --workspace +``` + +**Step 5: Commit.** + +```bash +jj describe -m "refactor(dsrs-cache): extract foyer-backed LM response cache + +Moves utils/cache.rs into dsrs-cache. Implements dsrs_core::CacheBackend so +dsrs-lm depends on the trait, not the foyer-backed concrete impl." +jj new +``` + +--- + +## Task 6: Extract `dsrs-lm` + +**Goal:** Move the LM client (rig-core wrapper), `ChatAdapter`, `GLOBAL_SETTINGS`, `configure`, `with_lm` into `dsrs-lm`. + +**Files:** +- Move: `crates/dspy-rs/src/core/lm/{mod, chat, client_registry, usage}.rs` → `crates/dsrs-lm/src/` +- Move: `crates/dspy-rs/src/adapter/{mod, chat}.rs` → `crates/dsrs-lm/src/adapter/{mod, chat}.rs` +- Modify: `crates/dsrs-lm/Cargo.toml` +- Modify: `crates/dsrs-lm/src/lib.rs` +- Modify: `crates/dspy-rs/src/lib.rs` (drop `pub mod adapter;`, drop `pub mod core { pub mod lm; }` references) +- Modify: `crates/dspy-rs/Cargo.toml` + +**Step 1: Move LM files.** + +```bash +git mv crates/dspy-rs/src/core/lm/mod.rs crates/dsrs-lm/src/lib.rs +git mv crates/dspy-rs/src/core/lm/chat.rs crates/dsrs-lm/src/chat.rs +git mv crates/dspy-rs/src/core/lm/client_registry.rs crates/dsrs-lm/src/client_registry.rs +git mv crates/dspy-rs/src/core/lm/usage.rs crates/dsrs-lm/src/usage.rs +rmdir crates/dspy-rs/src/core/lm +rmdir crates/dspy-rs/src/core 2>/dev/null || true +``` + +**Step 2: Move adapter files into `dsrs-lm/src/adapter/`.** + +```bash +mkdir -p crates/dsrs-lm/src/adapter +git mv crates/dspy-rs/src/adapter/mod.rs crates/dsrs-lm/src/adapter/mod.rs +git mv crates/dspy-rs/src/adapter/chat.rs crates/dsrs-lm/src/adapter/chat.rs +rmdir crates/dspy-rs/src/adapter +``` + +**Step 3: Wire `dsrs-lm/src/lib.rs`.** + +The existing `mod.rs` (now `lib.rs`) needs adjustment — it's becoming a crate root. Add module declarations at the top: + +```rust +//! DSRs LM crate: rig-core wrapper, ChatAdapter, settings. + +pub mod adapter; +pub mod chat; +pub mod client_registry; +pub mod usage; + +// Existing `mod.rs` content (LM struct, configure, with_lm, GLOBAL_SETTINGS, ...) +``` + +**Step 4: Update `dsrs-lm/Cargo.toml`.** + +Copy the rig-core, reqwest, regex, minijinja, anyhow, tokio, async-trait, schemars deps from `dspy-rs/Cargo.toml`: + +```toml +[dependencies] +dsrs-core = { path = "../dsrs-core" } +dsrs-cache = { path = "../dsrs-cache" } +dsrs-trace = { path = "../dsrs-trace" } +anyhow = "1.0.99" +async-trait = "0.1.83" +bamltype = { path = "../bamltype" } +bon = "3.7.0" +facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } +indexmap = "2.10.0" +minijinja = { git = "https://github.com/boundaryml/minijinja.git", branch = "main", default-features = false, features = ["builtins", "serde"] } +regex = "1.11.2" +reqwest = { version = "0.13", features = ["blocking"] } +rig-core = { git = "https://github.com/0xPlaygrounds/rig", rev = "aee3b8bf6576ce41c9ac1dd82520752a65fa0127" } +schemars = "1.0.4" +serde = { version = "1.0.219", features = ["derive"] } +serde_json = { version = "1.0.140", features = ["preserve_order"] } +thiserror = "2.0.17" +tokio = { version = "1.46.1", features = ["full"] } +tracing = "0.1.44" +``` + +**Step 5: Update `LmClient` trait in `dsrs-core/src/bridges.rs`.** + +Now that you can see `crates/dsrs-lm/src/lib.rs`, fill in `LmRequest` / `LmResponse` shapes (see TODO from Task 2 step 3). Then have the concrete `LM` struct in `dsrs-lm` implement `dsrs_core::LmClient`: + +```rust +use dsrs_core::{LmClient, LmRequest, LmResponse}; + +#[async_trait::async_trait] +impl LmClient for LM { + async fn complete(&self, request: LmRequest) -> Result { + // Adapt to existing call path. + } +} +``` + +**Step 6: Update `dspy-rs`.** + +`crates/dspy-rs/Cargo.toml`: + +```toml +dsrs-lm = { path = "../dsrs-lm" } +``` + +`crates/dspy-rs/src/lib.rs`: +- Drop `pub mod adapter;` (and `pub mod core { pub mod lm; }` if it still exists). +- Add `pub use dsrs_lm as lm;` and `pub use dsrs_lm::adapter;` to keep external imports working during the transition. + +Update imports: + +```bash +grep -rn "crate::core::lm::\|crate::adapter::" crates/dspy-rs/src/ +sed -i '' \ + -e 's|crate::core::lm::|dsrs_lm::|g' \ + -e 's|crate::adapter::|dsrs_lm::adapter::|g' \ + $(grep -rln "crate::core::lm::\|crate::adapter::" crates/dspy-rs/src/) +``` + +**Step 7: Build and test.** + +```bash +cargo check --workspace +cargo test --workspace +``` + +Expected: all tests pass. The LM extraction is the heaviest concrete move (~1700 lines). + +**Step 8: Commit.** + +```bash +jj describe -m "refactor(dsrs-lm): extract LM client + ChatAdapter into dsrs-lm + +Moves core/lm/ and adapter/ into the new dsrs-lm crate. The LM struct +implements dsrs_core::LmClient so Predict (extracted later) depends on the +trait, not on rig-core directly. dsrs-lm pulls in rig, reqwest, minijinja, +schemars — none of those are pulled by dsrs-core consumers anymore." +jj new +``` + +--- + +## Task 7: Extract `dsrs-evaluate` + +**Goal:** Move the typed-metric surface into `dsrs-evaluate`. + +**Files:** +- Move: `crates/dspy-rs/src/evaluate/{mod, evaluator, feedback, feedback_helpers, metrics}.rs` → `crates/dsrs-evaluate/src/` +- Modify: `crates/dsrs-evaluate/Cargo.toml` +- Modify: `crates/dspy-rs/src/lib.rs` (drop `pub mod evaluate;`, add re-export) +- Modify: `crates/dspy-rs/Cargo.toml` + +**Step 1: Move.** + +```bash +git mv crates/dspy-rs/src/evaluate/mod.rs crates/dsrs-evaluate/src/lib.rs +git mv crates/dspy-rs/src/evaluate/evaluator.rs crates/dsrs-evaluate/src/evaluator.rs +git mv crates/dspy-rs/src/evaluate/feedback.rs crates/dsrs-evaluate/src/feedback.rs +git mv crates/dspy-rs/src/evaluate/feedback_helpers.rs crates/dsrs-evaluate/src/feedback_helpers.rs +git mv crates/dspy-rs/src/evaluate/metrics.rs crates/dsrs-evaluate/src/metrics.rs +rmdir crates/dspy-rs/src/evaluate +``` + +**Step 2: `dsrs-evaluate/Cargo.toml`.** + +```toml +[dependencies] +dsrs-core = { path = "../dsrs-core" } +async-trait = "0.1.83" +serde = { version = "1.0.219", features = ["derive"] } +serde_json = "1.0.140" +tokio = { version = "1.46.1", features = ["full"] } +tracing = "0.1.44" +futures = "0.3.31" +``` + +**Step 3: Update `dsrs-evaluate/src/lib.rs` to declare the modules.** + +```rust +//! DSRs typed-metric surface. Permanent (leaven-dsrs adapts this). + +pub mod evaluator; +pub mod feedback; +pub mod feedback_helpers; +pub mod metrics; + +pub use evaluator::*; +pub use feedback::*; +pub use feedback_helpers::*; +pub use metrics::*; +``` + +(Mirror what the old `evaluate/mod.rs` re-exported.) + +**Step 4: Update `dspy-rs` re-export.** + +```rust +// crates/dspy-rs/src/lib.rs +pub use dsrs_evaluate as evaluate; +``` + +```bash +grep -rn "crate::evaluate::" crates/dspy-rs/src/ | wc -l +sed -i '' 's|crate::evaluate::|dsrs_evaluate::|g' $(grep -rln "crate::evaluate::" crates/dspy-rs/src/) +``` + +**Step 5: Build and test.** + +```bash +cargo check --workspace +cargo test --workspace +``` + +`tests/test_evaluate_trainset_typed.rs` is the load-bearing parity check here. + +**Step 6: Commit.** + +```bash +jj describe -m "refactor(dsrs-evaluate): extract TypedMetric and feedback helpers" +jj new +``` + +--- + +## Task 8: Extract `dsrs-predict` + +**Goal:** Move `Predict`, `ChainOfThought`, `ReAct` into `dsrs-predict`. This is the L1 leaf — only thing that calls the LM. + +**Files:** +- Move: `crates/dspy-rs/src/predictors/{mod, predict}.rs` → `crates/dsrs-predict/src/` +- Move: `crates/dspy-rs/src/modules/{mod, chain_of_thought, react}.rs` → `crates/dsrs-predict/src/modules/` +- Modify: `crates/dsrs-predict/Cargo.toml` +- Modify: `crates/dspy-rs/src/lib.rs`, `Cargo.toml` + +**Step 1: Move.** + +```bash +mkdir -p crates/dsrs-predict/src/modules +git mv crates/dspy-rs/src/predictors/predict.rs crates/dsrs-predict/src/predict.rs +git mv crates/dspy-rs/src/predictors/mod.rs crates/dsrs-predict/src/predictors_mod_DELETE.rs +# The old predictors/mod.rs is just `pub mod predict; pub use predict::*;` — fold into lib.rs. +rm crates/dsrs-predict/src/predictors_mod_DELETE.rs +rmdir crates/dspy-rs/src/predictors + +git mv crates/dspy-rs/src/modules/chain_of_thought.rs crates/dsrs-predict/src/modules/chain_of_thought.rs +git mv crates/dspy-rs/src/modules/react.rs crates/dsrs-predict/src/modules/react.rs +git mv crates/dspy-rs/src/modules/mod.rs crates/dsrs-predict/src/modules/mod.rs +rmdir crates/dspy-rs/src/modules +``` + +**Step 2: `dsrs-predict/src/lib.rs`.** + +```rust +//! DSRs predict crate: the L1 leaf. Predict, ChainOfThought, ReAct. +//! Only crate that actually calls the LM. + +pub mod modules; +pub mod predict; + +pub use modules::*; +pub use predict::*; +``` + +**Step 3: `dsrs-predict/Cargo.toml`.** + +```toml +[dependencies] +dsrs-core = { path = "../dsrs-core" } +dsrs-lm = { path = "../dsrs-lm" } +dsrs-trace = { path = "../dsrs-trace" } +async-trait = "0.1.83" +bamltype = { path = "../bamltype" } +bon = "3.7.0" +facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } +futures = "0.3.31" +indexmap = "2.10.0" +serde = { version = "1.0.219", features = ["derive"] } +serde_json = { version = "1.0.140", features = ["preserve_order"] } +tokio = { version = "1.46.1", features = ["full"] } +tracing = "0.1.44" +dsrs_macros = { path = "../dsrs-macros" } # ChainOfThought derives Augmentation etc. +``` + +**Step 4: Wire and update imports.** + +```bash +grep -rn "crate::predictors::\|crate::modules::" crates/dspy-rs/src/ +sed -i '' \ + -e 's|crate::predictors::|dsrs_predict::|g' \ + -e 's|crate::modules::|dsrs_predict::|g' \ + $(grep -rln "crate::predictors::\|crate::modules::" crates/dspy-rs/src/) +``` + +`crates/dspy-rs/src/lib.rs`: + +```rust +pub use dsrs_predict as predict_crate; +pub use dsrs_predict::*; // Predict, ChainOfThought, ReAct at top level +pub mod modules { pub use dsrs_predict::modules::*; } +``` + +`crates/dspy-rs/Cargo.toml`: + +```toml +dsrs-predict = { path = "../dsrs-predict" } +``` + +**Step 5: Build and test.** + +```bash +cargo check --workspace +cargo test --workspace +``` + +The Predict test surface is the largest. Tests `test_chain_of_thought_swap`, `test_chat_prompt_composition`, `test_chat_prompt_golden` are core. + +**Step 6: Commit.** + +```bash +jj describe -m "refactor(dsrs-predict): extract Predict, ChainOfThought, ReAct" +jj new +``` + +--- + +## Task 9: Extract `dsrs-gepa`, delete COPRO and MIPROv2 + +**Goal:** Move GEPA + pareto into `dsrs-gepa`. Delete COPRO and MIPROv2 source files. Delete COPRO/MIPRO test files. Delete the `08-optimize-mipro.rs` example. + +**Files:** +- Move: `crates/dspy-rs/src/optimizer/{gepa, pareto}.rs` → `crates/dsrs-gepa/src/` +- Move: `crates/dspy-rs/src/optimizer/mod.rs` → `crates/dsrs-gepa/src/lib.rs` (with COPRO/MIPRO declarations removed) +- **Delete:** `crates/dspy-rs/src/optimizer/copro.rs` +- **Delete:** `crates/dspy-rs/src/optimizer/mipro.rs` +- **Delete:** `crates/dspy-rs/tests/test_optimize_mipro.rs` (or whichever exists) +- **Delete:** any COPRO test +- **Delete:** `crates/dspy-rs/examples/04-optimize-hotpotqa.rs` (this uses COPRO — verify, possibly rewrite to GEPA later as separate work) +- **Delete:** `crates/dspy-rs/examples/08-optimize-mipro.rs` + +**Step 1: Confirm what to delete.** + +```bash +grep -ln "COPRO\|MIPROv2\|MIPRO" crates/dspy-rs/{tests,examples}/*.rs +``` + +Inventory the hits and confirm with the user before deleting if unsure. Per the design, COPRO and MIPROv2 are out — their tests and examples go with them. + +**Step 2: Delete the optimizer source files first.** + +```bash +rm crates/dspy-rs/src/optimizer/copro.rs +rm crates/dspy-rs/src/optimizer/mipro.rs +``` + +**Step 3: Move GEPA + pareto.** + +```bash +git mv crates/dspy-rs/src/optimizer/gepa.rs crates/dsrs-gepa/src/gepa.rs +git mv crates/dspy-rs/src/optimizer/pareto.rs crates/dsrs-gepa/src/pareto.rs +git mv crates/dspy-rs/src/optimizer/mod.rs crates/dsrs-gepa/src/lib.rs +rmdir crates/dspy-rs/src/optimizer +``` + +**Step 4: Edit `dsrs-gepa/src/lib.rs` to drop COPRO/MIPRO references.** + +Open the file, delete every `pub mod copro;`, `pub mod mipro;`, `pub use copro::*;`, `pub use mipro::*;`. Keep the `Optimizer` trait, the GEPA exports, the pareto exports. + +**Step 5: `dsrs-gepa/Cargo.toml`.** + +```toml +[package] +name = "dsrs-gepa" +description = "GEPA optimizer for DSRs (sunset candidate; replaced by leaven once dsrs-leaven is real)." +# ... usual fields ... + +[dependencies] +dsrs-core = { path = "../dsrs-core" } +dsrs-predict = { path = "../dsrs-predict" } +dsrs-evaluate = { path = "../dsrs-evaluate" } +async-trait = "0.1.83" +indexmap = "2.10.0" +rand = "0.8.5" +rayon = "1.10.0" +serde = { version = "1.0.219", features = ["derive"] } +serde_json = "1.0.140" +tokio = { version = "1.46.1", features = ["full"] } +tracing = "0.1.44" +kdam = "0.6.3" +``` + +**Step 6: Delete COPRO/MIPRO tests and examples.** + +```bash +rm -f crates/dspy-rs/tests/test_optimize_mipro.rs +# (Add others as inventory in step 1 reveals — confirm each name first.) +rm -f crates/dspy-rs/examples/08-optimize-mipro.rs +# 04-optimize-hotpotqa: read first; if it's COPRO-only, delete; if salvageable for GEPA, leave a TODO comment in the file. +head -30 crates/dspy-rs/examples/04-optimize-hotpotqa.rs +``` + +If `04-optimize-hotpotqa.rs` is hard-wired to COPRO, delete it. Removing examples is fine — they're not in the test suite gate. + +**Step 7: Update `dspy-rs` re-export.** + +```rust +// crates/dspy-rs/src/lib.rs +pub use dsrs_gepa as optimizer; +``` + +```bash +grep -rn "crate::optimizer::" crates/dspy-rs/src/ +sed -i '' 's|crate::optimizer::|dsrs_gepa::|g' $(grep -rln "crate::optimizer::" crates/dspy-rs/src/) +``` + +`crates/dspy-rs/Cargo.toml`: + +```toml +dsrs-gepa = { path = "../dsrs-gepa" } +``` + +**Step 8: Build and test.** + +```bash +cargo check --workspace +cargo test --workspace +``` + +Expected pass count = baseline − (deleted COPRO/MIPRO test count). If anything else fails, an import wasn't migrated. + +**Step 9: Commit.** + +```bash +jj describe -m "refactor(dsrs-gepa): extract GEPA; delete COPRO and MIPROv2 + +Moves optimizer/gepa.rs and optimizer/pareto.rs into dsrs-gepa. Deletes +COPRO (optimizer/copro.rs) and MIPROv2 (optimizer/mipro.rs) along with +their tests (test_optimize_mipro.rs et al) and the 08-optimize-mipro +example. Both were marked outdated. + +dsrs-gepa is a sunset candidate — deleted once leaven-gepa ships a +runnable optimizer and dsrs-leaven ships real impls." +jj new +``` + +--- + +## Task 10: Extract `dsrs-data` with feature flags + +**Goal:** Move `DataLoader` + format readers into `dsrs-data`. Format-specific deps (parquet, hf-hub, csv) go behind feature flags so light users don't pay. + +**Files:** +- Move: `crates/dspy-rs/src/data/{mod, dataloader, serialize, utils}.rs` → `crates/dsrs-data/src/` +- Modify: `crates/dsrs-data/Cargo.toml` + +(Note: `example.rs` and `prediction.rs` already moved in Task 2.) + +**Step 1: Move.** + +```bash +git mv crates/dspy-rs/src/data/dataloader.rs crates/dsrs-data/src/dataloader.rs +git mv crates/dspy-rs/src/data/serialize.rs crates/dsrs-data/src/serialize.rs +git mv crates/dspy-rs/src/data/utils.rs crates/dsrs-data/src/utils.rs +git mv crates/dspy-rs/src/data/mod.rs crates/dsrs-data/src/lib.rs +rmdir crates/dspy-rs/src/data +``` + +**Step 2: Edit `dsrs-data/src/lib.rs`** — remove the `pub mod example;` and `pub mod prediction;` lines (those types live in `dsrs-core` now). Add `pub use dsrs_core::{Example, Prediction};` if anything in `dsrs-data` references them. + +**Step 3: `dsrs-data/Cargo.toml` with feature flags.** + +```toml +[package] +name = "dsrs-data" +# ... usual fields ... + +[features] +default = ["json"] +json = [] +csv = ["dep:csv"] +parquet = ["dep:parquet", "dep:arrow"] +hf = ["dep:hf-hub", "dep:reqwest"] +all = ["json", "csv", "parquet", "hf"] + +[dependencies] +dsrs-core = { path = "../dsrs-core" } +serde = { version = "1.0.219", features = ["derive"] } +serde_json = "1.0.140" +tokio = { version = "1.46.1", features = ["full"] } +tracing = "0.1.44" +indexmap = "2.10.0" + +# Optional, gated: +csv = { version = "1.3.1", optional = true } +parquet = { version = "56.1.0", optional = true } +arrow = { version = "56.1.0", optional = true } +hf-hub = { version = "0.4.3", features = ["tokio"], optional = true } +reqwest = { version = "0.13", features = ["blocking"], optional = true } +``` + +**Step 4: Gate format-specific code in `dataloader.rs`.** + +The existing `dataloader.rs` unconditionally uses csv, parquet, hf-hub. Wrap each format-specific function/impl with `#[cfg(feature = "csv")]` etc: + +```rust +#[cfg(feature = "csv")] +pub fn load_csv(...) -> ... { ... } + +#[cfg(feature = "parquet")] +pub fn load_parquet(...) -> ... { ... } + +#[cfg(feature = "hf")] +pub async fn load_hf_dataset(...) -> ... { ... } +``` + +**Step 5: Update `dspy-rs` re-export and consumer.** + +```rust +// crates/dspy-rs/src/lib.rs — replace `pub mod data;` +pub use dsrs_data as data; +``` + +`crates/dspy-rs/Cargo.toml`: depend on `dsrs-data` with `features = ["all"]` so the existing test suite (which exercises all formats) still works. + +```toml +dsrs-data = { path = "../dsrs-data", features = ["all"] } +``` + +```bash +grep -rn "crate::data::" crates/dspy-rs/src/ +sed -i '' 's|crate::data::|dsrs_data::|g' $(grep -rln "crate::data::" crates/dspy-rs/src/) +``` + +**Step 6: Build matrix.** + +```bash +cargo check -p dsrs-data --no-default-features +cargo check -p dsrs-data --no-default-features --features json +cargo check -p dsrs-data --no-default-features --features csv +cargo check -p dsrs-data --no-default-features --features parquet +cargo check -p dsrs-data --no-default-features --features hf +cargo check -p dsrs-data --features all +``` + +Expected: each succeeds. + +**Step 7: Workspace tests.** + +```bash +cargo test --workspace +``` + +`tests/test_dataloader.rs` is the gate — it should still pass with `dspy-rs` requesting `features = ["all"]`. + +**Step 8: Commit.** + +```bash +jj describe -m "refactor(dsrs-data): extract DataLoader with feature-gated format readers + +Moves data/{dataloader,serialize,utils}.rs into dsrs-data. Format-specific +deps (csv, parquet, arrow, hf-hub, reqwest) are now feature-gated: + - default = json + - csv, parquet, hf, all +Light users skip arrow/parquet/hf-hub. dspy-rs depends with features=[all] +during the transitional period." +jj new +``` + +--- + +## Task 11: Skeleton `dsrs-leaven` + +**Goal:** Lay down `dsrs-leaven` with type signatures and `unimplemented!()` bodies. Real implementations land in a follow-up plan once at least one leaven-side piece (e.g. `leaven-gepa` real impl) is in place. + +**Files:** +- Modify: `crates/dsrs-leaven/Cargo.toml` +- Modify: `crates/dsrs-leaven/src/lib.rs` +- Create: `crates/dsrs-leaven/src/{artifact, change, surface, evaluator, evidence}.rs` + +**Step 1: Verify leaven path.** + +```bash +ls /Users/darin/src/personal/leaven/crates/{leaven-core,leaven-surface,leaven-engine,leaven-evidence}/ +``` + +If any are missing, **stop** and confirm with the user. + +**Step 2: `dsrs-leaven/Cargo.toml`.** + +```toml +[package] +name = "dsrs-leaven" +description = "DSRs's implementation of leaven's capability traits — Artifact, EditSurface, Evaluator, Evidence." + +[dependencies] +dsrs-core = { path = "../dsrs-core" } +dsrs-evaluate = { path = "../dsrs-evaluate" } +dsrs-predict = { path = "../dsrs-predict" } + +leaven-core = { path = "../../../leaven/crates/leaven-core" } +leaven-surface = { path = "../../../leaven/crates/leaven-surface" } +leaven-engine = { path = "../../../leaven/crates/leaven-engine" } +leaven-evidence = { path = "../../../leaven/crates/leaven-evidence" } + +async-trait = "0.1.83" +serde = { version = "1.0.219", features = ["derive"] } +serde_json = "1.0.140" +thiserror = "2.0.17" +``` + +**Step 3: `src/lib.rs`.** + +```rust +//! DSRs ⇄ leaven integration. DSRs implements leaven's capability traits +//! directly so leaven optimizers (leaven-gepa, etc.) can drive DSRs programs. +//! +//! Bodies are `unimplemented!()` until the leaven side is real. See +//! `docs/plans/2026-05-08-dsrs-crate-split-design.md` § 6 for the sunset +//! trigger that lets us delete dsrs-gepa. + +pub mod artifact; +pub mod change; +pub mod surface; +pub mod evaluator; +pub mod evidence; + +pub use artifact::DsrsProgramArtifact; +pub use change::DsrsProgramChange; +pub use surface::DsrsProgramSurface; +pub use evaluator::DsrsEvaluator; +pub use evidence::DsrsEvidence; +``` + +**Step 4: Each module is a stub** with type signatures only, no bodies yet: + +```rust +// artifact.rs +use dsrs_core::{Module, Signature}; + +pub struct DsrsProgramArtifact +where + S: Signature, + M: Module, +{ + _phantom: std::marker::PhantomData<(S, M)>, +} + +impl leaven_core::Artifact for DsrsProgramArtifact +where + S: Signature, + M: Module + Send + Sync + 'static, +{ + type Change = crate::change::DsrsProgramChange; + + fn identity(&self) -> leaven_core::ArtifactIdentity { + unimplemented!("dsrs-leaven: identity — fill once leaven-gepa is real") + } + + fn apply_change(&self, _change: &Self::Change) + -> Result + { + unimplemented!("dsrs-leaven: apply_change") + } +} +``` + +Repeat the pattern for the other 4 files. The point is: they compile against the current leaven trait shapes. If leaven changes its trait API, this crate breaks immediately and we adjust. + +**Step 5: Build.** + +```bash +cargo check -p dsrs-leaven +cargo check --workspace +``` + +If `leaven-core` traits have moved or changed shape, fix the impls to match. The `unimplemented!()` bodies don't run, so failure to build = signature mismatch only. + +**Step 6: Commit.** + +```bash +jj describe -m "feat(dsrs-leaven): add skeleton crate with leaven trait impls + +Type signatures only — bodies are unimplemented!(). Compiles against the +current leaven-core / leaven-surface / leaven-engine / leaven-evidence +trait shapes. Real impls land in a follow-up plan once leaven-gepa is a +runnable optimizer (see design doc § 6 sunset trigger)." +jj new +``` + +--- + +## Task 12: Dissolve `dspy-rs` — relocate examples and tests + +**Goal:** With every code module moved, the `dspy-rs` crate is now a thin re-export shell. Per design D1 (no facade), it gets deleted. Examples and tests need a home. + +**Files:** +- Delete: `crates/dspy-rs/` entirely +- Move: `crates/dspy-rs/examples/*.rs` → `crates/dspy-rs/examples/` lives where? Pick a destination per step below. +- Move: `crates/dspy-rs/tests/*.rs` → distribute across the new crates that own the surfaces being tested. +- Modify: workspace `Cargo.toml` (no change needed — `crates/*` glob) + +**Step 1: Inventory the remaining `dspy-rs/src/lib.rs`.** + +```bash +cat crates/dspy-rs/src/lib.rs +wc -l crates/dspy-rs/src/lib.rs +``` + +By this point it should be ~30 lines: just `pub use` re-exports of the new crates. Confirm there's no leftover code that didn't get extracted. + +**Step 2: Pick destinations for tests.** + +```bash +ls crates/dspy-rs/tests/ +``` + +Distribute by what's tested: + +| Test | Destination | +|------|-------------| +| `test_field_macro.rs`, `test_bamltype_*` | `crates/dsrs-macros/tests/` | +| `test_chat*.rs`, `test_lm.rs`, `test_adapters.rs`, `test_input_format.rs`, `test_message_roundtrip.rs` | `crates/dsrs-lm/tests/` | +| `test_evaluate_trainset_typed.rs` | `crates/dsrs-evaluate/tests/` | +| `test_call_outcome.rs`, `test_caller_managed_conversation.rs`, `test_chain_of_thought_swap.rs`, `test_example.rs`, `test_flatten_roundtrip.rs` | `crates/dsrs-predict/tests/` | +| `test_dataloader.rs` | `crates/dsrs-data/tests/` | +| `test_gepa.rs`, `test_gepa_typed_metric_feedback.rs` | `crates/dsrs-gepa/tests/` | +| (rest) | inspect each — most belong to `dsrs-predict` or `dsrs-lm` | + +For each destination, `mkdir -p crates//tests` then `git mv`. Update the test's `use` statements: `dspy_rs::X` → the right new crate's path. + +```bash +# Example for one test: +mkdir -p crates/dsrs-lm/tests +git mv crates/dspy-rs/tests/test_chat.rs crates/dsrs-lm/tests/test_chat.rs +sed -i '' 's|use dspy_rs::|use dsrs_lm::|g; s|dspy_rs::|dsrs_lm::|g' crates/dsrs-lm/tests/test_chat.rs +``` + +(Each test will probably need 2-3 different replacements as it imports from multiple crates. Read the test, do the imports manually.) + +**Step 3: Pick a destination for examples.** + +Two options: +- (A) New top-level `examples/` directory with each example mapped to the crate that owns its surface. +- (B) Distribute under `crates//examples/`. + +(B) is more idiomatic for cargo workspaces — examples auto-discover from each crate. Pick (B). + +```bash +# Example: +mkdir -p crates/dsrs-predict/examples +git mv crates/dspy-rs/examples/01-simple.rs crates/dsrs-predict/examples/01-simple.rs +sed -i '' 's|use dspy_rs::|use dsrs_predict::|g' crates/dsrs-predict/examples/01-simple.rs +# (Likely needs additional crates pulled — read the example to see what it imports.) +``` + +The examples that touch GEPA go to `dsrs-gepa`; tracing examples to `dsrs-trace`; the smoke-* slices to wherever the surface they smoke-test lives. + +**Step 4: Delete `dspy-rs`.** + +```bash +rm -rf crates/dspy-rs +``` + +**Step 5: Build and test.** + +```bash +cargo check --workspace +cargo test --workspace +``` + +Expected: `cargo metadata` no longer lists `dspy-rs`. Test pass count = baseline − deleted COPRO/MIPRO tests. Any test that fails because an import didn't get rewritten — fix it. + +**Step 6: Commit.** + +```bash +jj describe -m "refactor: dissolve dspy-rs aggregator crate + +All code has migrated to dsrs-{core, lm, trace, cache, predict, evaluate, +gepa, data, leaven}. Tests and examples redistributed to the crates that +own their surfaces. No facade re-export — users depend on the leaf crates +explicitly." +jj new +``` + +--- + +## Task 13: Update docs and READMEs + +**Files:** +- Modify: `README.md` (top-level) +- Modify: `CURRENT_PLAN.md`, `CURRENT_SPEC.md` (mark superseded; point to the new design doc) +- Modify: `sub-agents.md` +- Modify: `docs/specs/modules/breadboard.md`, `docs/specs/modules/design_reference.md` (annotate that the topology is now mechanically enforced via crate boundaries; cross-link to the new design doc) +- Modify: `docs/docs/getting-started/*.md` (any quick-start showing `use dspy_rs::*` — replace with the new imports) + +**Step 1: Update top-level `README.md`.** + +Replace any `dspy-rs = "..."` snippets and `use dspy_rs::*` examples with the new layout. The mental model section should reference the layered crates: + +```markdown +## Crates + +| Crate | Purpose | +|-------|---------| +| `dsrs-core` | Signatures, modules, schema, errors, abstract bridges. | +| `dsrs-lm` | LM client + ChatAdapter. | +| `dsrs-predict` | Predict, ChainOfThought, ReAct. | +| `dsrs-evaluate` | TypedMetric and feedback helpers. | +| `dsrs-gepa` | GEPA optimizer (sunset; replaced by leaven). | +| `dsrs-data` | DataLoader (csv / parquet / hf-hub feature-gated). | +| `dsrs-trace` | Execution-graph recording. | +| `dsrs-cache` | Foyer LM cache. | +| `dsrs-leaven` | Leaven integration. | +``` + +**Step 2: Mark `CURRENT_PLAN.md` and `CURRENT_SPEC.md` superseded.** + +Add a banner at the top of each: + +```markdown +> **Superseded** by [`docs/plans/2026-05-08-dsrs-crate-split-design.md`](docs/plans/2026-05-08-dsrs-crate-split-design.md). Retained for historical context. +``` + +**Step 3: Verify no doc still says `crates/dspy-rs`.** + +```bash +grep -rn "crates/dspy-rs\|dspy-rs/src" docs/ README.md *.md +``` + +Inventory hits and decide per-line whether to update or annotate as historical. + +**Step 4: Update example imports** in `docs/docs/getting-started/`, `docs/docs/tutorials/`, `docs/docs/building-blocks/`, `docs/docs/optimizers/`. + +```bash +grep -rln "use dspy_rs::" docs/ +# For each hit, decide: rewrite with the new crate paths, or annotate as +# historical (if the doc page is itself outdated). +``` + +**Step 5: Commit.** + +```bash +jj describe -m "docs: update READMEs and references for the dsrs crate split + +Top-level README lists the 9 crates and their roles. CURRENT_PLAN.md and +CURRENT_SPEC.md flagged as superseded by the new split design doc. Docs +under docs/docs/* updated where they showed dspy_rs imports." +jj new +``` + +--- + +## Task 14: Clean up the leaven side + +**Files (in the leaven workspace, separate jj repo):** +- Modify: `/Users/darin/src/personal/leaven/Cargo.toml` (drop `crates/leaven-dsrs` member, drop the workspace-deps entry) +- Delete: `/Users/darin/src/personal/leaven/crates/leaven-dsrs/` + +**Step 1: Coordinate with the user before touching leaven.** + +This step modifies a sibling repository. Confirm with the user that it's OK to delete `leaven-dsrs` from the leaven workspace, since DSRs's `dsrs-leaven` now owns those impls. + +**Step 2: In leaven repo:** + +```bash +cd /Users/darin/src/personal/leaven +jj st +jj new -m "chore: drop leaven-dsrs in favor of dsrs-leaven (in DSRs workspace)" + +# Remove from workspace members and workspace deps +# (Edit Cargo.toml manually — drop both lines.) + +rm -rf crates/leaven-dsrs + +cargo check --workspace +``` + +Expected: leaven workspace builds without `leaven-dsrs`. (Nothing inside leaven imports it; it was a stub.) + +**Step 3: Commit (in leaven repo).** + +```bash +jj describe -m "chore: drop leaven-dsrs + +DSRs now owns the integration via crates/dsrs-leaven in the DSRs +workspace. Bridge crate ownership flipped: a downstream consumer +implements leaven's capability traits, rather than leaven owning a +third-party-shaped bridge." +jj new +``` + +--- + +## Task 15: Final verification and squash + +**Step 1: From the DSRs repo:** + +```bash +cargo check --workspace +cargo test --workspace +cargo build --release --workspace # full release build smoke-test +``` + +Expected: all green. The release build catches anything that's wrong with cfg-gates or feature interactions. + +**Step 2: Per-crate smoke checks.** + +```bash +for c in dsrs-core dsrs-lm dsrs-trace dsrs-cache dsrs-predict dsrs-evaluate dsrs-gepa dsrs-data dsrs-leaven; do + echo "=== $c ===" + cargo check -p $c + cargo test -p $c +done +``` + +Expected: each crate builds and tests independently. + +**Step 3: Test count parity.** + +```bash +cargo test --workspace -- --list 2>/dev/null | grep -c ': test$' +``` + +Expected: `baseline_count − deleted_copro_mipro_count`. Any other delta means a test was lost in transit. Investigate. + +**Step 4: Look at the change graph.** + +```bash +jj log -r 'trunk()..@' +``` + +Expected: a chain of ~14 commits, one per task, each with a clear message. + +**Step 5: Decide on commit shape.** + +Two options: + +(A) **Keep the chain.** 14 commits, easier review and easier revert per-task. + +(B) **Squash into one big commit.** + +```bash +jj squash --from --into +# repeat as needed +``` + +Recommendation: keep the chain (option A). Each task is a logical unit; merge as a stack of commits. + +**Step 6: Final commit if anything was tweaked in step 1-3.** + +```bash +jj describe -m "chore: final cleanup after split" +``` + +If the working copy is empty, you're done. + +--- + +## Open follow-ups (not in this plan — separate work) + +1. **Real `dsrs-leaven` impls.** Once `leaven-gepa` is runnable and the leaven optimization run-loop has an ergonomic entry point, replace the `unimplemented!()` bodies in Task 11 with real implementations. Write a parity test: optimize the same DSRs program via `dsrs-gepa` and via leaven-driven `dsrs-leaven`, confirm equal-or-better results from leaven. + +2. **Delete `dsrs-gepa`.** When the parity test passes (per design § 6), remove the crate from the workspace. Update the topology doc and README. + +3. **Move `dsrs-macros` to emit per-target paths.** Today the macros emit `::dsrs_core::*`. If `dsrs-leaven` ever needs different macro emit-paths (unlikely), the resolver in `dsrs-macros/src/runtime_path.rs` becomes a small lookup table. + +4. **Rewrite `04-optimize-hotpotqa.rs` (deleted in Task 9) as a GEPA example** if the COPRO version is missed. + +5. **Audit `dsrs-data` features** after a few weeks of use. If `default = ["json"]` is wrong (e.g. csv is more common), adjust. + +--- + +## References + +- Design: [`2026-05-08-dsrs-crate-split-design.md`](2026-05-08-dsrs-crate-split-design.md) +- Visual companion: [`2026-05-08-dsrs-crate-split-topology.html`](2026-05-08-dsrs-crate-split-topology.html) +- Predecessor topology spec: `docs/specs/modules/breadboard.md`, `docs/specs/modules/design_reference.md` +- Leaven principles: `/Users/darin/src/personal/leaven/AGENTS.md`, `/Users/darin/src/personal/leaven/docs/specs/guiding_principles.md` From b26a17ec5cffb41e12b7680a0269b076162eb97e Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 00:47:42 -0700 Subject: [PATCH 04/15] feat(workspace): register empty dsrs crate skeletons Adds the nine planned crates as real workspace members: dsrs-core, dsrs-lm, dsrs-trace, dsrs-cache, dsrs-predict, dsrs-evaluate, dsrs-gepa, dsrs-data, and dsrs-leaven. No code moved yet; this isolates workspace topology from the later extraction churn. cargo metadata sees all dsrs-* crates and cargo check --workspace succeeds, including the dsrs-leaven path deps against ../leaven. Scaffolding: every lib.rs is intentionally empty until its extraction task owns the crate surface. --- Cargo.lock | 198 +++++++++++++++++++++++++++++++- crates/dsrs-cache/Cargo.toml | 10 ++ crates/dsrs-cache/src/lib.rs | 1 + crates/dsrs-core/Cargo.toml | 10 ++ crates/dsrs-core/src/lib.rs | 1 + crates/dsrs-data/Cargo.toml | 10 ++ crates/dsrs-data/src/lib.rs | 1 + crates/dsrs-evaluate/Cargo.toml | 10 ++ crates/dsrs-evaluate/src/lib.rs | 1 + crates/dsrs-gepa/Cargo.toml | 10 ++ crates/dsrs-gepa/src/lib.rs | 1 + crates/dsrs-leaven/Cargo.toml | 14 +++ crates/dsrs-leaven/src/lib.rs | 1 + crates/dsrs-lm/Cargo.toml | 10 ++ crates/dsrs-lm/src/lib.rs | 1 + crates/dsrs-predict/Cargo.toml | 10 ++ crates/dsrs-predict/src/lib.rs | 1 + crates/dsrs-trace/Cargo.toml | 10 ++ crates/dsrs-trace/src/lib.rs | 1 + 19 files changed, 300 insertions(+), 1 deletion(-) create mode 100644 crates/dsrs-cache/Cargo.toml create mode 100644 crates/dsrs-cache/src/lib.rs create mode 100644 crates/dsrs-core/Cargo.toml create mode 100644 crates/dsrs-core/src/lib.rs create mode 100644 crates/dsrs-data/Cargo.toml create mode 100644 crates/dsrs-data/src/lib.rs create mode 100644 crates/dsrs-evaluate/Cargo.toml create mode 100644 crates/dsrs-evaluate/src/lib.rs create mode 100644 crates/dsrs-gepa/Cargo.toml create mode 100644 crates/dsrs-gepa/src/lib.rs create mode 100644 crates/dsrs-leaven/Cargo.toml create mode 100644 crates/dsrs-leaven/src/lib.rs create mode 100644 crates/dsrs-lm/Cargo.toml create mode 100644 crates/dsrs-lm/src/lib.rs create mode 100644 crates/dsrs-predict/Cargo.toml create mode 100644 crates/dsrs-predict/src/lib.rs create mode 100644 crates/dsrs-trace/Cargo.toml create mode 100644 crates/dsrs-trace/src/lib.rs diff --git a/Cargo.lock b/Cargo.lock index 11e8b33b..0aaf2420 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -135,6 +135,12 @@ version = "1.7.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "69f7f8c3906b62b754cd5326047894316021dcfe5a194c8ea52bdd94934a3457" +[[package]] +name = "arrayref" +version = "0.3.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "76a2e8124351fda1ef8aaaa3bbd7ebbcb486bbcd4225aca0aa0d84bb2db8fecb" + [[package]] name = "arrayvec" version = "0.5.2" @@ -580,6 +586,20 @@ version = "2.9.4" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "2261d10cca569e4643e526d8dc2e62e433cc8aba21ab764233731f8d369bf394" +[[package]] +name = "blake3" +version = "1.8.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0aa83c34e62843d924f905e0f5c866eb1dd6545fc4d719e803d9ba6030371fce" +dependencies = [ + "arrayref", + "arrayvec 0.7.6", + "cc", + "cfg-if", + "constant_time_eq", + "cpufeatures 0.3.0", +] + [[package]] name = "block-buffer" version = "0.10.4" @@ -614,6 +634,16 @@ dependencies = [ "syn 2.0.106", ] +[[package]] +name = "borsh" +version = "1.6.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "cfd1e3f8955a5d7de9fab72fc8373fade9fb8a703968cb200ae3dc6cf08e185a" +dependencies = [ + "bytes", + "cfg_aliases", +] + [[package]] name = "brotli" version = "8.0.2" @@ -728,7 +758,10 @@ checksum = "c469d952047f47f91b68d1cba3f10d63c11d73e4636f24f08daf0278abf01c4d" dependencies = [ "android-tzdata", "iana-time-zone", + "js-sys", "num-traits", + "serde", + "wasm-bindgen", "windows-link 0.1.3", ] @@ -901,6 +934,12 @@ dependencies = [ "tiny-keccak", ] +[[package]] +name = "constant_time_eq" +version = "0.4.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3d52eff69cd5e647efe296129160853a42795992097e8af39800e1060caeea9b" + [[package]] name = "convert_case" version = "0.6.0" @@ -956,6 +995,15 @@ dependencies = [ "libc", ] +[[package]] +name = "cpufeatures" +version = "0.3.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8b2a41393f66f16b0823bb79094d54ac5fbd34ab292ddafb9a0456ac9f87d201" +dependencies = [ + "libc", +] + [[package]] name = "crc32fast" version = "1.5.0" @@ -1244,6 +1292,48 @@ dependencies = [ "tracing-subscriber", ] +[[package]] +name = "dsrs-cache" +version = "0.0.0" + +[[package]] +name = "dsrs-core" +version = "0.0.0" + +[[package]] +name = "dsrs-data" +version = "0.0.0" + +[[package]] +name = "dsrs-evaluate" +version = "0.0.0" + +[[package]] +name = "dsrs-gepa" +version = "0.0.0" + +[[package]] +name = "dsrs-leaven" +version = "0.0.0" +dependencies = [ + "leaven-core", + "leaven-engine", + "leaven-evidence", + "leaven-surface", +] + +[[package]] +name = "dsrs-lm" +version = "0.0.0" + +[[package]] +name = "dsrs-predict" +version = "0.0.0" + +[[package]] +name = "dsrs-trace" +version = "0.0.0" + [[package]] name = "dsrs_macros" version = "0.7.2" @@ -1884,6 +1974,12 @@ version = "0.5.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "fc0fef456e4baa96da950455cd02c081ca953b141298e41db3fc7e36b1da849c" +[[package]] +name = "hex" +version = "0.4.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7f24254aa9a54b5c858eaee2f5bccdb46aaf0e486a595ed5fd8f86ba55232a70" + [[package]] name = "hf-hub" version = "0.4.3" @@ -2401,6 +2497,83 @@ version = "1.5.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "bbd2bcb4c963f2ddae06a2efc7e9f3591312473c50c6685e1f298068316e66fe" +[[package]] +name = "leaven-core" +version = "0.0.0" +dependencies = [ + "chrono", + "leaven-kernel", + "serde", + "smol_str", +] + +[[package]] +name = "leaven-engine" +version = "0.0.0" +dependencies = [ + "futures", + "indexmap", + "leaven-core", + "leaven-kernel", + "leaven-store", + "leaven-surface", + "leaven-workspace", + "thiserror 2.0.17", +] + +[[package]] +name = "leaven-evidence" +version = "0.0.0" +dependencies = [ + "leaven-core", + "leaven-kernel", + "ordered-float 4.6.0", + "thiserror 2.0.17", +] + +[[package]] +name = "leaven-kernel" +version = "0.0.0" +dependencies = [ + "blake3", + "chrono", + "hex", + "serde", + "serde_json", + "thiserror 2.0.17", + "uuid", +] + +[[package]] +name = "leaven-store" +version = "0.0.0" +dependencies = [ + "bytes", + "leaven-core", + "leaven-kernel", + "thiserror 2.0.17", +] + +[[package]] +name = "leaven-surface" +version = "0.0.0" +dependencies = [ + "leaven-core", + "leaven-kernel", + "thiserror 2.0.17", +] + +[[package]] +name = "leaven-workspace" +version = "0.0.0" +dependencies = [ + "futures", + "leaven-kernel", + "parking_lot", + "serde", + "thiserror 2.0.17", +] + [[package]] name = "lexical-core" version = "1.0.5" @@ -2956,6 +3129,17 @@ dependencies = [ "num-traits", ] +[[package]] +name = "ordered-float" +version = "4.6.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7bb71e1b3fa6ca1c61f383464aaf2bb0e2f8e772a1f01d486832464de363b951" +dependencies = [ + "num-traits", + "rand 0.8.5", + "serde", +] + [[package]] name = "ordered-float" version = "5.1.0" @@ -3338,6 +3522,7 @@ dependencies = [ "libc", "rand_chacha 0.3.1", "rand_core 0.6.4", + "serde", ] [[package]] @@ -3377,6 +3562,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "ec0be4795e2f6a28069bec0b5ff3e2ac9bafc99e6a9a7dc3547996c5c816922c" dependencies = [ "getrandom 0.2.16", + "serde", ] [[package]] @@ -4025,7 +4211,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "a7507d819769d01a365ab707794a4084392c824f54a7a6a7862f8c3d0892b283" dependencies = [ "cfg-if", - "cpufeatures", + "cpufeatures 0.2.17", "digest", ] @@ -4083,6 +4269,16 @@ version = "2.0.0-alpha.12" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "ef784004ca8777809dcdad6ac37629f0a97caee4c685fcea805278d81dd8b857" +[[package]] +name = "smol_str" +version = "0.3.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9676b89cd56310a87b93dec47b11af744f34d5fc9f367b829474eec0a891350d" +dependencies = [ + "borsh", + "serde", +] + [[package]] name = "snap" version = "1.1.1" diff --git a/crates/dsrs-cache/Cargo.toml b/crates/dsrs-cache/Cargo.toml new file mode 100644 index 00000000..71451109 --- /dev/null +++ b/crates/dsrs-cache/Cargo.toml @@ -0,0 +1,10 @@ +[package] +name = "dsrs-cache" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "DSRs response cache support." + +[dependencies] diff --git a/crates/dsrs-cache/src/lib.rs b/crates/dsrs-cache/src/lib.rs new file mode 100644 index 00000000..89442986 --- /dev/null +++ b/crates/dsrs-cache/src/lib.rs @@ -0,0 +1 @@ +//! Empty placeholder; code is migrated into this crate by a later task. diff --git a/crates/dsrs-core/Cargo.toml b/crates/dsrs-core/Cargo.toml new file mode 100644 index 00000000..e3476bac --- /dev/null +++ b/crates/dsrs-core/Cargo.toml @@ -0,0 +1,10 @@ +[package] +name = "dsrs-core" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "DSRs core: signature, module, schema, and abstract bridge traits." + +[dependencies] diff --git a/crates/dsrs-core/src/lib.rs b/crates/dsrs-core/src/lib.rs new file mode 100644 index 00000000..89442986 --- /dev/null +++ b/crates/dsrs-core/src/lib.rs @@ -0,0 +1 @@ +//! Empty placeholder; code is migrated into this crate by a later task. diff --git a/crates/dsrs-data/Cargo.toml b/crates/dsrs-data/Cargo.toml new file mode 100644 index 00000000..236d2d49 --- /dev/null +++ b/crates/dsrs-data/Cargo.toml @@ -0,0 +1,10 @@ +[package] +name = "dsrs-data" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "DSRs dataset loading support." + +[dependencies] diff --git a/crates/dsrs-data/src/lib.rs b/crates/dsrs-data/src/lib.rs new file mode 100644 index 00000000..89442986 --- /dev/null +++ b/crates/dsrs-data/src/lib.rs @@ -0,0 +1 @@ +//! Empty placeholder; code is migrated into this crate by a later task. diff --git a/crates/dsrs-evaluate/Cargo.toml b/crates/dsrs-evaluate/Cargo.toml new file mode 100644 index 00000000..12b2f56c --- /dev/null +++ b/crates/dsrs-evaluate/Cargo.toml @@ -0,0 +1,10 @@ +[package] +name = "dsrs-evaluate" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "DSRs typed evaluation and metric support." + +[dependencies] diff --git a/crates/dsrs-evaluate/src/lib.rs b/crates/dsrs-evaluate/src/lib.rs new file mode 100644 index 00000000..89442986 --- /dev/null +++ b/crates/dsrs-evaluate/src/lib.rs @@ -0,0 +1 @@ +//! Empty placeholder; code is migrated into this crate by a later task. diff --git a/crates/dsrs-gepa/Cargo.toml b/crates/dsrs-gepa/Cargo.toml new file mode 100644 index 00000000..3ba8a3b6 --- /dev/null +++ b/crates/dsrs-gepa/Cargo.toml @@ -0,0 +1,10 @@ +[package] +name = "dsrs-gepa" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "DSRs GEPA optimizer support." + +[dependencies] diff --git a/crates/dsrs-gepa/src/lib.rs b/crates/dsrs-gepa/src/lib.rs new file mode 100644 index 00000000..89442986 --- /dev/null +++ b/crates/dsrs-gepa/src/lib.rs @@ -0,0 +1 @@ +//! Empty placeholder; code is migrated into this crate by a later task. diff --git a/crates/dsrs-leaven/Cargo.toml b/crates/dsrs-leaven/Cargo.toml new file mode 100644 index 00000000..8605c9a4 --- /dev/null +++ b/crates/dsrs-leaven/Cargo.toml @@ -0,0 +1,14 @@ +[package] +name = "dsrs-leaven" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "Leaven integration for DSRs programs." + +[dependencies] +leaven-core = { path = "../../../leaven/crates/leaven-core" } +leaven-surface = { path = "../../../leaven/crates/leaven-surface" } +leaven-engine = { path = "../../../leaven/crates/leaven-engine" } +leaven-evidence = { path = "../../../leaven/crates/leaven-evidence" } diff --git a/crates/dsrs-leaven/src/lib.rs b/crates/dsrs-leaven/src/lib.rs new file mode 100644 index 00000000..89442986 --- /dev/null +++ b/crates/dsrs-leaven/src/lib.rs @@ -0,0 +1 @@ +//! Empty placeholder; code is migrated into this crate by a later task. diff --git a/crates/dsrs-lm/Cargo.toml b/crates/dsrs-lm/Cargo.toml new file mode 100644 index 00000000..f5553384 --- /dev/null +++ b/crates/dsrs-lm/Cargo.toml @@ -0,0 +1,10 @@ +[package] +name = "dsrs-lm" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "DSRs LM integration and chat adapter." + +[dependencies] diff --git a/crates/dsrs-lm/src/lib.rs b/crates/dsrs-lm/src/lib.rs new file mode 100644 index 00000000..89442986 --- /dev/null +++ b/crates/dsrs-lm/src/lib.rs @@ -0,0 +1 @@ +//! Empty placeholder; code is migrated into this crate by a later task. diff --git a/crates/dsrs-predict/Cargo.toml b/crates/dsrs-predict/Cargo.toml new file mode 100644 index 00000000..8cb1196d --- /dev/null +++ b/crates/dsrs-predict/Cargo.toml @@ -0,0 +1,10 @@ +[package] +name = "dsrs-predict" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "DSRs typed predictors and module implementations." + +[dependencies] diff --git a/crates/dsrs-predict/src/lib.rs b/crates/dsrs-predict/src/lib.rs new file mode 100644 index 00000000..89442986 --- /dev/null +++ b/crates/dsrs-predict/src/lib.rs @@ -0,0 +1 @@ +//! Empty placeholder; code is migrated into this crate by a later task. diff --git a/crates/dsrs-trace/Cargo.toml b/crates/dsrs-trace/Cargo.toml new file mode 100644 index 00000000..e6cfe017 --- /dev/null +++ b/crates/dsrs-trace/Cargo.toml @@ -0,0 +1,10 @@ +[package] +name = "dsrs-trace" +version = "0.0.0" +edition = "2024" +rust-version = "1.85" +license = "Apache-2.0" +repository = "https://github.com/krypticmouse/DSRs" +description = "DSRs tracing and execution graph support." + +[dependencies] diff --git a/crates/dsrs-trace/src/lib.rs b/crates/dsrs-trace/src/lib.rs new file mode 100644 index 00000000..89442986 --- /dev/null +++ b/crates/dsrs-trace/src/lib.rs @@ -0,0 +1 @@ +//! Empty placeholder; code is migrated into this crate by a later task. From ffa6861d5582381c98ee5b9d57732aac6d11eae3 Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 00:50:02 -0700 Subject: [PATCH 05/15] feat(core): extract typed substrate into dsrs-core Moves Signature, Module, ModuleExt, SignatureSchema, Predicted, error types, augmentation primitives, raw Example/Prediction, DynPredictor discovery, and LmUsage out of dspy-rs into dsrs-core. Why this shape: LmUsage had to move with Predicted/Prediction/errors because otherwise the old LM crate and new core crate produced distinct usage types at call boundaries. LM client state, adapters, and global settings stay in dspy-rs for the later dsrs-lm extraction so core does not learn concrete LM behavior. Kept temporary internal dspy-rs module aliases only so the workspace remains compiling between extraction steps; final hard cutover still deletes the dspy-rs facade. Verification: cargo check --workspace; cargo test -p dsrs-core; cargo test --workspace --no-run. Scaffolding: dsrs-core owns the discovery trait but Predict itself still lives in dspy-rs until the dsrs-predict extraction, so two real Predict-specific dyn walker tests remain deferred to that crate. --- Cargo.lock | 17 ++++++ crates/dspy-rs/Cargo.toml | 1 + crates/dspy-rs/src/core/lm/mod.rs | 3 +- crates/dspy-rs/src/core/mod.rs | 25 +++----- crates/dspy-rs/src/data/mod.rs | 10 +++- crates/dspy-rs/src/lib.rs | 8 +-- .../dspy-rs/src/modules/chain_of_thought.rs | 2 +- crates/dspy-rs/src/optimizer/copro.rs | 2 +- crates/dspy-rs/src/optimizer/mod.rs | 2 +- crates/dspy-rs/src/predictors/predict.rs | 4 +- crates/dspy-rs/src/trace/value.rs | 18 +----- crates/dsrs-core/Cargo.toml | 14 +++++ .../src/augmentation.rs | 4 +- .../core => dsrs-core/src}/dyn_predictor.rs | 58 +++---------------- .../src/core => dsrs-core/src}/errors.rs | 0 .../src/data => dsrs-core/src}/example.rs | 0 crates/dsrs-core/src/lib.rs | 55 +++++++++++++++++- .../src/core => dsrs-core/src}/module.rs | 5 +- .../src/core => dsrs-core/src}/module_ext.rs | 0 .../src/core => dsrs-core/src}/predicted.rs | 4 +- .../src/data => dsrs-core/src}/prediction.rs | 4 +- .../src/core => dsrs-core/src}/schema.rs | 0 .../src/core => dsrs-core/src}/signature.rs | 5 +- .../src/core => dsrs-core/src}/specials.rs | 0 .../src/core/lm => dsrs-core/src}/usage.rs | 0 25 files changed, 132 insertions(+), 109 deletions(-) rename crates/{dspy-rs => dsrs-core}/src/augmentation.rs (98%) rename crates/{dspy-rs/src/core => dsrs-core/src}/dyn_predictor.rs (90%) rename crates/{dspy-rs/src/core => dsrs-core/src}/errors.rs (100%) rename crates/{dspy-rs/src/data => dsrs-core/src}/example.rs (100%) rename crates/{dspy-rs/src/core => dsrs-core/src}/module.rs (99%) rename crates/{dspy-rs/src/core => dsrs-core/src}/module_ext.rs (100%) rename crates/{dspy-rs/src/core => dsrs-core/src}/predicted.rs (98%) rename crates/{dspy-rs/src/data => dsrs-core/src}/prediction.rs (95%) rename crates/{dspy-rs/src/core => dsrs-core/src}/schema.rs (100%) rename crates/{dspy-rs/src/core => dsrs-core/src}/signature.rs (98%) rename crates/{dspy-rs/src/core => dsrs-core/src}/specials.rs (100%) rename crates/{dspy-rs/src/core/lm => dsrs-core/src}/usage.rs (100%) diff --git a/Cargo.lock b/Cargo.lock index 0aaf2420..6392c1b8 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1265,6 +1265,7 @@ dependencies = [ "bamltype", "bon", "csv", + "dsrs-core", "dsrs_macros", "enum_dispatch", "facet", @@ -1299,6 +1300,22 @@ version = "0.0.0" [[package]] name = "dsrs-core" version = "0.0.0" +dependencies = [ + "anyhow", + "async-trait", + "bamltype", + "bon", + "facet", + "futures", + "indexmap", + "kdam", + "rig-core", + "schemars", + "serde", + "serde_json", + "thiserror 2.0.17", + "tracing", +] [[package]] name = "dsrs-data" diff --git a/crates/dspy-rs/Cargo.toml b/crates/dspy-rs/Cargo.toml index 94f32610..d83b7ade 100644 --- a/crates/dspy-rs/Cargo.toml +++ b/crates/dspy-rs/Cargo.toml @@ -26,6 +26,7 @@ async-trait = "0.1.83" anyhow = "1.0.99" bon = "3.7.0" bamltype = { path = "../bamltype" } +dsrs-core = { path = "../dsrs-core" } # Keep this direct pin in sync with workspace [patch.crates-io] for self-sufficient external path consumers. facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } thiserror = "2.0.17" diff --git a/crates/dspy-rs/src/core/lm/mod.rs b/crates/dspy-rs/src/core/lm/mod.rs index 52273430..c96cc603 100644 --- a/crates/dspy-rs/src/core/lm/mod.rs +++ b/crates/dspy-rs/src/core/lm/mod.rs @@ -1,10 +1,9 @@ pub mod chat; pub mod client_registry; -pub mod usage; pub use chat::*; pub use client_registry::*; -pub use usage::*; +pub use dsrs_core::LmUsage; use anyhow::Result; use rig::{completion::AssistantContent, message::ToolCall, message::ToolChoice, tool::ToolDyn}; diff --git a/crates/dspy-rs/src/core/mod.rs b/crates/dspy-rs/src/core/mod.rs index 64bf2cb8..966b6742 100644 --- a/crates/dspy-rs/src/core/mod.rs +++ b/crates/dspy-rs/src/core/mod.rs @@ -20,24 +20,17 @@ //! who need fine-grained prompt control also use [`SignatureSchema`] and the adapter //! building blocks directly. -pub(crate) mod dyn_predictor; -mod errors; pub mod lm; -pub mod module; -mod module_ext; -mod predicted; -mod schema; pub mod settings; -pub mod signature; -pub mod specials; -pub(crate) use dyn_predictor::*; -pub use errors::{ConversionError, ErrorClass, JsonishError, LmError, ParseError, PredictError}; +pub use dsrs_core::{ + Augmentation, Augmented, BamlConvertError, BamlType, BamlValue, CallMetadata, Constraint, + ConstraintKind, ConstraintLevel, ConstraintResult, ConstraintSpec, ConversionError, + DynPredictor, ErrorClass, Facet, FieldMeta, FieldMetadataSpec, FieldPath, FieldSchema, Flag, + InputRenderSpec, JsonishError, LmError, LmUsage, Module, ModuleExt, NamedParametersError, + OutputFormatContent, ParseError, PredictError, PredictState, Predicted, Prediction, + RawExample, RenderOptions, ResponseCheck, Shape, Signature, SignatureSchema, StreamingMode, + TrackedValue, TypeIR, visit_named_predictors_mut, +}; pub use lm::*; -pub use module::*; -pub use module_ext::*; -pub use predicted::{CallMetadata, ConstraintResult, FieldMeta, Predicted}; -pub use schema::{FieldMetadataSpec, FieldPath, FieldSchema, InputRenderSpec, SignatureSchema}; pub use settings::*; -pub use signature::*; -pub use specials::*; diff --git a/crates/dspy-rs/src/data/mod.rs b/crates/dspy-rs/src/data/mod.rs index 09c73049..ec0d8539 100644 --- a/crates/dspy-rs/src/data/mod.rs +++ b/crates/dspy-rs/src/data/mod.rs @@ -9,8 +9,12 @@ //! The untyped row type (`RawExample`) remains for internal runtime/tracing/cache bridges. pub mod dataloader; -pub mod example; -pub mod prediction; +pub mod example { + pub use dsrs_core::RawExample as Example; +} +pub mod prediction { + pub use dsrs_core::Prediction; +} pub mod serialize; pub mod utils; @@ -20,4 +24,4 @@ pub use prediction::*; pub use serialize::*; pub use utils::*; -pub type RawExample = example::Example; +pub type RawExample = dsrs_core::RawExample; diff --git a/crates/dspy-rs/src/lib.rs b/crates/dspy-rs/src/lib.rs index c8e5e2af..c15a1e74 100644 --- a/crates/dspy-rs/src/lib.rs +++ b/crates/dspy-rs/src/lib.rs @@ -97,7 +97,9 @@ extern crate self as dspy_rs; pub mod adapter; -pub mod augmentation; +pub mod augmentation { + pub use dsrs_core::*; +} pub mod core; pub mod data; pub mod evaluate; @@ -108,11 +110,9 @@ pub mod trace; pub mod utils; pub use adapter::chat::*; -pub use augmentation::*; +pub use dsrs_core::*; pub use core::*; pub use data::dataloader::*; -pub(crate) use data::example::Example as RawExample; -pub use data::prediction::*; pub use data::serialize::*; pub use data::utils::*; pub use evaluate::*; diff --git a/crates/dspy-rs/src/modules/chain_of_thought.rs b/crates/dspy-rs/src/modules/chain_of_thought.rs index 56b7e769..db8e3aad 100644 --- a/crates/dspy-rs/src/modules/chain_of_thought.rs +++ b/crates/dspy-rs/src/modules/chain_of_thought.rs @@ -1,5 +1,5 @@ use crate::Augmentation; -use crate::augmentation::Augmented; +use dsrs_core::Augmented; use crate::core::{Module, Signature}; use crate::predictors::{Example, Predict, PredictBuilder}; use crate::{BamlType, PredictError, Predicted}; diff --git a/crates/dspy-rs/src/optimizer/copro.rs b/crates/dspy-rs/src/optimizer/copro.rs index 736f0f87..1d97374b 100644 --- a/crates/dspy-rs/src/optimizer/copro.rs +++ b/crates/dspy-rs/src/optimizer/copro.rs @@ -1,7 +1,7 @@ use anyhow::{Result, anyhow}; use bon::Builder; -use crate::core::DynPredictor; +use dsrs_core::DynPredictor; use crate::evaluate::{TypedMetric, average_score}; use crate::optimizer::{ Optimizer, evaluate_module_with_metric, predictor_names, with_named_predictor, diff --git a/crates/dspy-rs/src/optimizer/mod.rs b/crates/dspy-rs/src/optimizer/mod.rs index 7d04a961..fcd761be 100644 --- a/crates/dspy-rs/src/optimizer/mod.rs +++ b/crates/dspy-rs/src/optimizer/mod.rs @@ -44,7 +44,7 @@ use anyhow::Result; use anyhow::anyhow; use std::ops::ControlFlow; -use crate::core::{DynPredictor, visit_named_predictors_mut}; +use dsrs_core::{DynPredictor, visit_named_predictors_mut}; use crate::evaluate::{MetricOutcome, TypedMetric, evaluate_trainset}; use crate::predictors::Example; use crate::{Facet, Module, Signature}; diff --git a/crates/dspy-rs/src/predictors/predict.rs b/crates/dspy-rs/src/predictors/predict.rs index 1b35f902..e88e3749 100644 --- a/crates/dspy-rs/src/predictors/predict.rs +++ b/crates/dspy-rs/src/predictors/predict.rs @@ -9,8 +9,8 @@ use std::sync::Arc; use tracing::{debug, trace}; use crate as dsrs; -use crate::core::{DynPredictor, Module, PredictAccessorFns, PredictState, Signature}; -use crate::data::example::Example as RawExample; +use dsrs_core::{DynPredictor, Module, PredictAccessorFns, PredictState, Signature}; +use dsrs_core::RawExample; use crate::{ BamlType, BamlValue, CallMetadata, Chat, ChatAdapter, GLOBAL_SETTINGS, LmError, LmUsage, PredictError, Predicted, Prediction, SignatureSchema, diff --git a/crates/dspy-rs/src/trace/value.rs b/crates/dspy-rs/src/trace/value.rs index f7cf9590..14512934 100644 --- a/crates/dspy-rs/src/trace/value.rs +++ b/crates/dspy-rs/src/trace/value.rs @@ -1,18 +1,6 @@ -use serde::Serialize; use serde_json::Value; -#[derive(Clone, Debug, Serialize)] -pub struct TrackedValue { - pub value: Value, - #[serde(skip)] - pub source: Option<(usize, String)>, // (node_id, key) -} - -impl TrackedValue { - pub fn new(value: Value, source: Option<(usize, String)>) -> Self { - Self { value, source } - } -} +pub use dsrs_core::TrackedValue; pub trait IntoTracked { fn into_tracked(self) -> TrackedValue; @@ -24,10 +12,10 @@ impl IntoTracked for TrackedValue { } } -impl> IntoTracked for T { +impl IntoTracked for Value { fn into_tracked(self) -> TrackedValue { TrackedValue { - value: self.into(), + value: self, source: None, } } diff --git a/crates/dsrs-core/Cargo.toml b/crates/dsrs-core/Cargo.toml index e3476bac..ac107f94 100644 --- a/crates/dsrs-core/Cargo.toml +++ b/crates/dsrs-core/Cargo.toml @@ -8,3 +8,17 @@ repository = "https://github.com/krypticmouse/DSRs" description = "DSRs core: signature, module, schema, and abstract bridge traits." [dependencies] +anyhow = "1.0.99" +async-trait = "0.1.83" +bamltype = { path = "../bamltype" } +bon = "3.7.0" +facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } +futures = "0.3.31" +indexmap = "2.10.0" +kdam = "0.6.3" +rig-core = { git = "https://github.com/0xPlaygrounds/rig", rev = "aee3b8bf6576ce41c9ac1dd82520752a65fa0127" } +schemars = "1.0.4" +serde = { version = "1.0.219", features = ["derive"] } +serde_json = { version = "1.0.140", features = ["preserve_order"] } +thiserror = "2.0.17" +tracing = "0.1.44" diff --git a/crates/dspy-rs/src/augmentation.rs b/crates/dsrs-core/src/augmentation.rs similarity index 98% rename from crates/dspy-rs/src/augmentation.rs rename to crates/dsrs-core/src/augmentation.rs index 96e37861..5637d40f 100644 --- a/crates/dspy-rs/src/augmentation.rs +++ b/crates/dsrs-core/src/augmentation.rs @@ -13,8 +13,8 @@ use facet::Facet; /// /// Usually derived: /// -/// ``` -/// use dspy_rs::*; +/// ```ignore +/// use dsrs_core::*; /// /// #[derive(Augmentation, Clone, Debug)] /// #[augment(output, prepend)] diff --git a/crates/dspy-rs/src/core/dyn_predictor.rs b/crates/dsrs-core/src/dyn_predictor.rs similarity index 90% rename from crates/dspy-rs/src/core/dyn_predictor.rs rename to crates/dsrs-core/src/dyn_predictor.rs index d5a436a8..69db59e4 100644 --- a/crates/dspy-rs/src/core/dyn_predictor.rs +++ b/crates/dsrs-core/src/dyn_predictor.rs @@ -5,8 +5,7 @@ use anyhow::Result; use bamltype::facet_reflect::Peek; use facet::{ConstTypeId, Def, Facet, KnownPointer, Shape, Type, UserType}; -use crate::SignatureSchema; -use crate::data::example::Example as RawExample; +use crate::{RawExample, SignatureSchema}; /// Type-erased optimizer handle to a [`crate::Predict`] leaf. /// @@ -17,7 +16,7 @@ use crate::data::example::Example as RawExample; /// /// Normal users never touch this — you pass your module to `optimizer.compile()` /// and it uses `DynPredictor` internally. -pub(crate) trait DynPredictor: Send + Sync { +pub trait DynPredictor: Send + Sync { /// Returns the [`SignatureSchema`] for this predictor's signature. fn schema(&self) -> &SignatureSchema; @@ -55,7 +54,7 @@ pub(crate) trait DynPredictor: Send + Sync { /// Used by [`DynPredictor::dump_state`]/[`DynPredictor::load_state`] for /// saving and restoring optimized parameters. #[derive(Clone, Debug, Default)] -pub(crate) struct PredictState { +pub struct PredictState { /// The demos as type-erased examples. pub demos: Vec, /// The instruction override, if any. @@ -67,7 +66,7 @@ type VisitMutFn = #[derive(Clone, Copy, Debug, facet::Facet)] #[facet(opaque)] -pub(crate) struct PredictAccessorFns { +pub struct PredictAccessorFns { pub visit_mut: VisitMutFn, } @@ -81,7 +80,7 @@ impl Eq for PredictAccessorFns {} facet::define_attr_grammar! { ns "dsrs"; - crate_path $crate::core::dyn_predictor; + crate_path $crate::dyn_predictor; pub enum Attr { PredictAccessor(Option<&'static PredictAccessorFns>), @@ -90,7 +89,7 @@ facet::define_attr_grammar! { /// Error from [`visit_named_predictors_mut`] when the Facet walker encounters an unsupported structure. #[derive(Debug, thiserror::Error, PartialEq, Eq)] -pub(crate) enum NamedParametersError { +pub enum NamedParametersError { /// A `Predict` leaf was found inside an unsupported container (`Rc`, `Arc`, etc.). #[error("container `{ty}` at `{path}` contains a parameter leaf")] Container { path: String, ty: &'static str }, @@ -117,7 +116,7 @@ pub(crate) enum NamedParametersError { name = "dsrs.visit_named_predictors_mut", skip(module, visitor) )] -pub(crate) fn visit_named_predictors_mut( +pub fn visit_named_predictors_mut( module: &mut M, mut visitor: F, ) -> std::result::Result<(), NamedParametersError> @@ -397,45 +396,7 @@ fn pointer_name(pointer: Option) -> &'static str { mod tests { use super::*; use crate as dsrs; - use crate::Signature; - use crate::predictors::Predict as RealPredict; use std::ops::ControlFlow; - use std::rc::Rc; - use std::sync::Arc; - - #[derive(Signature, Clone, Debug)] - struct DummySig { - #[input] - value: String, - - #[output] - done: bool, - } - - #[derive(facet::Facet)] - #[facet(crate = facet)] - struct SharedPointerModule { - rc_predictor: Rc>, - arc_predictor: Arc>, - } - - #[test] - fn named_parameters_rejects_shared_pointers() { - let mut module = SharedPointerModule { - rc_predictor: Rc::new(RealPredict::::new()), - arc_predictor: Arc::new(RealPredict::::new()), - }; - - match visit_named_predictors_mut(&mut module, |_path, _predictor| ControlFlow::Continue(())) - { - Err(NamedParametersError::Container { path, ty }) => { - assert_eq!(path, "rc_predictor"); - assert_eq!(ty, "Rc"); - } - Ok(_) => panic!("walk unexpectedly succeeded"), - Err(other) => panic!("unexpected error: {other:?}"), - } - } #[derive(facet::Facet)] #[facet(crate = facet, dsrs::predict_accessor)] @@ -519,11 +480,6 @@ mod tests { } } - #[test] - fn real_predict_shape_has_strict_identity_marker() { - assert!(is_predict_shape_identity(RealPredict::::SHAPE)); - } - #[derive(facet::Facet)] #[facet(crate = facet)] struct Predict; diff --git a/crates/dspy-rs/src/core/errors.rs b/crates/dsrs-core/src/errors.rs similarity index 100% rename from crates/dspy-rs/src/core/errors.rs rename to crates/dsrs-core/src/errors.rs diff --git a/crates/dspy-rs/src/data/example.rs b/crates/dsrs-core/src/example.rs similarity index 100% rename from crates/dspy-rs/src/data/example.rs rename to crates/dsrs-core/src/example.rs diff --git a/crates/dsrs-core/src/lib.rs b/crates/dsrs-core/src/lib.rs index 89442986..58d18d21 100644 --- a/crates/dsrs-core/src/lib.rs +++ b/crates/dsrs-core/src/lib.rs @@ -1 +1,54 @@ -//! Empty placeholder; code is migrated into this crate by a later task. +//! Core typed substrate for DSRs. + +// TODO(dsrs-facet-lint-scope): remove this crate-level allow once Facet's generated +// extension-attr dispatch no longer triggers rust-lang/rust#52234 on in-crate usage. +#![allow(macro_expanded_macro_exports_accessed_by_absolute_paths)] + +mod augmentation; +pub mod dyn_predictor; +mod errors; +mod example; +mod module; +mod module_ext; +mod predicted; +mod prediction; +mod schema; +mod signature; +mod specials; +mod usage; + +pub use augmentation::*; +pub use dyn_predictor::*; +pub use errors::{ConversionError, ErrorClass, JsonishError, LmError, ParseError, PredictError}; +pub use example::Example as RawExample; +pub use module::*; +pub use module_ext::*; +pub use predicted::{CallMetadata, ConstraintResult, FieldMeta, Predicted}; +pub use prediction::*; +pub use schema::{FieldMetadataSpec, FieldPath, FieldSchema, InputRenderSpec, SignatureSchema}; +pub use signature::*; +pub use specials::*; +pub use usage::*; + +pub use bamltype::BamlConvertError; +pub use bamltype::BamlType; +pub use bamltype::Shape; +pub use bamltype::baml_types::{ + BamlValue, Constraint, ConstraintLevel, ResponseCheck, StreamingMode, TypeIR, +}; +pub use bamltype::internal_baml_jinja::types::{OutputFormatContent, RenderOptions}; +pub use bamltype::jsonish::deserializer::deserialize_flags::Flag; +pub use facet::Facet; + +#[derive(Clone, Debug, serde::Serialize)] +pub struct TrackedValue { + pub value: serde_json::Value, + #[serde(skip)] + pub source: Option<(usize, String)>, +} + +impl TrackedValue { + pub fn new(value: serde_json::Value, source: Option<(usize, String)>) -> Self { + Self { value, source } + } +} diff --git a/crates/dspy-rs/src/core/module.rs b/crates/dsrs-core/src/module.rs similarity index 99% rename from crates/dspy-rs/src/core/module.rs rename to crates/dsrs-core/src/module.rs index 234998e3..5c08418e 100644 --- a/crates/dspy-rs/src/core/module.rs +++ b/crates/dsrs-core/src/module.rs @@ -87,10 +87,9 @@ pub trait Module: Send + Sync { /// /// Shows a progress bar on stderr. Use [`forward_all_with_progress`] to disable it. /// -/// ```no_run +/// ```ignore /// # async fn example() -> Result<(), Box> { -/// use dspy_rs::*; -/// use dspy_rs::doctest::*; +/// use dsrs_core::*; /// /// let predict = Predict::::new(); /// let inputs = vec![ diff --git a/crates/dspy-rs/src/core/module_ext.rs b/crates/dsrs-core/src/module_ext.rs similarity index 100% rename from crates/dspy-rs/src/core/module_ext.rs rename to crates/dsrs-core/src/module_ext.rs diff --git a/crates/dspy-rs/src/core/predicted.rs b/crates/dsrs-core/src/predicted.rs similarity index 98% rename from crates/dspy-rs/src/core/predicted.rs rename to crates/dsrs-core/src/predicted.rs index c40940c2..2dd5c67d 100644 --- a/crates/dspy-rs/src/core/predicted.rs +++ b/crates/dsrs-core/src/predicted.rs @@ -37,7 +37,7 @@ pub struct ConstraintResult { /// all live here. /// /// ``` -/// use dspy_rs::CallMetadata; +/// use dsrs_core::CallMetadata; /// /// let meta = CallMetadata::default(); /// assert_eq!(meta.lm_usage.total_tokens, 0); @@ -151,7 +151,7 @@ impl CallMetadata { /// limitation. /// /// ``` -/// use dspy_rs::{Predicted, CallMetadata}; +/// use dsrs_core::{Predicted, CallMetadata}; /// /// #[derive(Debug)] /// struct QAOutput { answer: String } diff --git a/crates/dspy-rs/src/data/prediction.rs b/crates/dsrs-core/src/prediction.rs similarity index 95% rename from crates/dspy-rs/src/data/prediction.rs rename to crates/dsrs-core/src/prediction.rs index 62180db4..5e7501fe 100644 --- a/crates/dspy-rs/src/data/prediction.rs +++ b/crates/dsrs-core/src/prediction.rs @@ -39,9 +39,9 @@ impl Prediction { .clone() } - pub fn get_tracked(&self, key: &str) -> crate::trace::TrackedValue { + pub fn get_tracked(&self, key: &str) -> crate::TrackedValue { let val = self.get(key, None); - crate::trace::TrackedValue { + crate::TrackedValue { value: val, source: self.node_id.map(|id| (id, key.to_string())), } diff --git a/crates/dspy-rs/src/core/schema.rs b/crates/dsrs-core/src/schema.rs similarity index 100% rename from crates/dspy-rs/src/core/schema.rs rename to crates/dsrs-core/src/schema.rs diff --git a/crates/dspy-rs/src/core/signature.rs b/crates/dsrs-core/src/signature.rs similarity index 98% rename from crates/dspy-rs/src/core/signature.rs rename to crates/dsrs-core/src/signature.rs index 616917a9..b44acc0d 100644 --- a/crates/dspy-rs/src/core/signature.rs +++ b/crates/dsrs-core/src/signature.rs @@ -28,9 +28,8 @@ pub enum ConstraintKind { /// following this instruction." You define it, the system handles prompt formatting, /// response parsing, and type checking. /// -/// ``` -/// use dspy_rs::*; -/// use dspy_rs::doctest::*; +/// ```ignore +/// use dsrs_core::*; /// /// // The derive generates QAInput { question } and QAOutput { answer } /// let _input = QAInput { question: "What is 2+2?".into() }; diff --git a/crates/dspy-rs/src/core/specials.rs b/crates/dsrs-core/src/specials.rs similarity index 100% rename from crates/dspy-rs/src/core/specials.rs rename to crates/dsrs-core/src/specials.rs diff --git a/crates/dspy-rs/src/core/lm/usage.rs b/crates/dsrs-core/src/usage.rs similarity index 100% rename from crates/dspy-rs/src/core/lm/usage.rs rename to crates/dsrs-core/src/usage.rs From bae0342cce0d82fab3892a471fd563ac1e400d94 Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 00:56:01 -0700 Subject: [PATCH 06/15] feat(trace): extract execution graph into dsrs-trace Moves trace context, DAG, executor, and tracked-value conversion into dsrs-trace with a direct dependency on dsrs-core row and prediction types. Why now: trace is a concrete leaf over core data, so extracting it before cache/LM prevents observability code from staying entangled with the monolith while later crates move. The old dspy-rs trace module is temporarily a pass-through to keep intermediate workspace tests compiling; the final split still removes dspy-rs. Verification: cargo check --workspace; cargo test -p dsrs-trace; cargo test --workspace --no-run. Scaffolding: dsrs-trace has no dedicated tests yet because the existing trace coverage is still exercised through dspy-rs examples/tests until dsrs-predict owns call recording. --- Cargo.lock | 8 ++++++ crates/dspy-rs/Cargo.toml | 1 + crates/dspy-rs/src/trace/mod.rs | 25 +------------------ crates/dsrs-trace/Cargo.toml | 5 ++++ .../src/trace => dsrs-trace/src}/context.rs | 6 ++--- .../src/trace => dsrs-trace/src}/dag.rs | 2 +- .../src/trace => dsrs-trace/src}/executor.rs | 4 +-- crates/dsrs-trace/src/lib.rs | 12 ++++++++- .../src/trace => dsrs-trace/src}/value.rs | 0 9 files changed, 32 insertions(+), 31 deletions(-) rename crates/{dspy-rs/src/trace => dsrs-trace/src}/context.rs (96%) rename crates/{dspy-rs/src/trace => dsrs-trace/src}/dag.rs (98%) rename crates/{dspy-rs/src/trace => dsrs-trace/src}/executor.rs (97%) rename crates/{dspy-rs/src/trace => dsrs-trace/src}/value.rs (100%) diff --git a/Cargo.lock b/Cargo.lock index 6392c1b8..70c7a51a 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1266,6 +1266,7 @@ dependencies = [ "bon", "csv", "dsrs-core", + "dsrs-trace", "dsrs_macros", "enum_dispatch", "facet", @@ -1350,6 +1351,13 @@ version = "0.0.0" [[package]] name = "dsrs-trace" version = "0.0.0" +dependencies = [ + "anyhow", + "dsrs-core", + "serde_json", + "tokio", + "tracing", +] [[package]] name = "dsrs_macros" diff --git a/crates/dspy-rs/Cargo.toml b/crates/dspy-rs/Cargo.toml index d83b7ade..fe1b034b 100644 --- a/crates/dspy-rs/Cargo.toml +++ b/crates/dspy-rs/Cargo.toml @@ -27,6 +27,7 @@ anyhow = "1.0.99" bon = "3.7.0" bamltype = { path = "../bamltype" } dsrs-core = { path = "../dsrs-core" } +dsrs-trace = { path = "../dsrs-trace" } # Keep this direct pin in sync with workspace [patch.crates-io] for self-sufficient external path consumers. facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } thiserror = "2.0.17" diff --git a/crates/dspy-rs/src/trace/mod.rs b/crates/dspy-rs/src/trace/mod.rs index ff12a365..526262cf 100644 --- a/crates/dspy-rs/src/trace/mod.rs +++ b/crates/dspy-rs/src/trace/mod.rs @@ -1,24 +1 @@ -//! Execution graph recording for debugging and inspection. -//! -//! Wrap a module call in [`trace()`] to capture a DAG of every [`Predict`](crate::Predict) -//! invocation, with inputs and outputs at each node. The trace is scoped — only calls -//! within the closure are recorded. The resulting [`Graph`] can be inspected or replayed -//! via the [`Executor`]. -//! -//! ```ignore -//! let (result, graph) = dspy_rs::trace::trace(|| module.call(input)).await; -//! println!("{} nodes recorded", graph.nodes.len()); -//! ``` -//! -//! This is a debugging tool, not a performance tool. The `Mutex` inside the -//! trace scope adds synchronization overhead. Don't trace in production hot paths. - -pub mod context; -pub mod dag; -pub mod executor; -pub mod value; - -pub use context::*; -pub use dag::*; -pub use executor::*; -pub use value::*; +pub use dsrs_trace::*; diff --git a/crates/dsrs-trace/Cargo.toml b/crates/dsrs-trace/Cargo.toml index e6cfe017..8c81af74 100644 --- a/crates/dsrs-trace/Cargo.toml +++ b/crates/dsrs-trace/Cargo.toml @@ -8,3 +8,8 @@ repository = "https://github.com/krypticmouse/DSRs" description = "DSRs tracing and execution graph support." [dependencies] +anyhow = "1.0.99" +dsrs-core = { path = "../dsrs-core" } +serde_json = { version = "1.0.140", features = ["preserve_order"] } +tokio = { version = "1.46.1", features = ["rt", "macros", "sync"] } +tracing = "0.1.44" diff --git a/crates/dspy-rs/src/trace/context.rs b/crates/dsrs-trace/src/context.rs similarity index 96% rename from crates/dspy-rs/src/trace/context.rs rename to crates/dsrs-trace/src/context.rs index fdf550bc..3287ce05 100644 --- a/crates/dspy-rs/src/trace/context.rs +++ b/crates/dsrs-trace/src/context.rs @@ -1,5 +1,5 @@ -use crate::Prediction; -use crate::trace::dag::{Graph, NodeType}; +use crate::dag::{Graph, NodeType}; +use dsrs_core::{Prediction, RawExample}; use std::sync::{Arc, Mutex}; use tokio::task_local; use tracing::{debug, trace}; @@ -54,7 +54,7 @@ pub fn is_tracing() -> bool { pub fn record_node( node_type: NodeType, inputs: Vec, - input_data: Option, + input_data: Option, ) -> Option { let input_count = inputs.len(); let has_input_data = input_data.is_some(); diff --git a/crates/dspy-rs/src/trace/dag.rs b/crates/dsrs-trace/src/dag.rs similarity index 98% rename from crates/dspy-rs/src/trace/dag.rs rename to crates/dsrs-trace/src/dag.rs index 3a98e7ea..edb165f8 100644 --- a/crates/dspy-rs/src/trace/dag.rs +++ b/crates/dsrs-trace/src/dag.rs @@ -1,4 +1,4 @@ -use crate::{Prediction, RawExample}; +use dsrs_core::{Prediction, RawExample}; use std::fmt; /// The kind of operation a trace node represents. diff --git a/crates/dspy-rs/src/trace/executor.rs b/crates/dsrs-trace/src/executor.rs similarity index 97% rename from crates/dspy-rs/src/trace/executor.rs rename to crates/dsrs-trace/src/executor.rs index 65071106..c2b88835 100644 --- a/crates/dspy-rs/src/trace/executor.rs +++ b/crates/dsrs-trace/src/executor.rs @@ -1,5 +1,5 @@ -use crate::trace::dag::{Graph, NodeType}; -use crate::{Prediction, RawExample}; +use crate::dag::{Graph, NodeType}; +use dsrs_core::{Prediction, RawExample}; use anyhow::Result; use std::collections::HashMap; diff --git a/crates/dsrs-trace/src/lib.rs b/crates/dsrs-trace/src/lib.rs index 89442986..a4571119 100644 --- a/crates/dsrs-trace/src/lib.rs +++ b/crates/dsrs-trace/src/lib.rs @@ -1 +1,11 @@ -//! Empty placeholder; code is migrated into this crate by a later task. +//! Execution graph recording for debugging and inspection. + +pub mod context; +pub mod dag; +pub mod executor; +pub mod value; + +pub use context::*; +pub use dag::*; +pub use executor::*; +pub use value::*; diff --git a/crates/dspy-rs/src/trace/value.rs b/crates/dsrs-trace/src/value.rs similarity index 100% rename from crates/dspy-rs/src/trace/value.rs rename to crates/dsrs-trace/src/value.rs From ef7d58b90e056530705aca58fe8e1062a1d7324b Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 00:58:08 -0700 Subject: [PATCH 07/15] feat(cache): extract response cache into dsrs-cache Moves the Foyer-backed response cache, Cache trait, and CacheEntry into dsrs-cache. The crate now depends on dsrs-core for RawExample and Prediction instead of reaching through dspy-rs. Why this, not waiting for LM: cache is a concrete leaf capability over core row types. Extracting it first lets the later dsrs-lm move depend on cache by name instead of dragging utils along. The old dspy-rs utils::cache module is temporarily a pass-through to keep intermediate imports compiling; final hard cutover removes it with dspy-rs. Verification: cargo check --workspace; cargo test -p dsrs-cache; cargo test --workspace --no-run. Scaffolding: no cache-specific tests moved yet; existing dspy-rs LM cache tests still exercise this code through the temporary pass-through. --- Cargo.lock | 12 ++++++++++++ crates/dspy-rs/Cargo.toml | 1 + crates/dspy-rs/src/utils/mod.rs | 4 +++- crates/dsrs-cache/Cargo.toml | 9 +++++++++ .../{dspy-rs/src/utils => dsrs-cache/src}/cache.rs | 2 +- crates/dsrs-cache/src/lib.rs | 6 +++++- 6 files changed, 31 insertions(+), 3 deletions(-) rename crates/{dspy-rs/src/utils => dsrs-cache/src}/cache.rs (99%) diff --git a/Cargo.lock b/Cargo.lock index 70c7a51a..41bbabc5 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1265,6 +1265,7 @@ dependencies = [ "bamltype", "bon", "csv", + "dsrs-cache", "dsrs-core", "dsrs-trace", "dsrs_macros", @@ -1297,6 +1298,17 @@ dependencies = [ [[package]] name = "dsrs-cache" version = "0.0.0" +dependencies = [ + "anyhow", + "async-trait", + "dsrs-core", + "foyer", + "serde", + "serde_json", + "tempfile", + "tokio", + "tracing", +] [[package]] name = "dsrs-core" diff --git a/crates/dspy-rs/Cargo.toml b/crates/dspy-rs/Cargo.toml index fe1b034b..6e8b5f77 100644 --- a/crates/dspy-rs/Cargo.toml +++ b/crates/dspy-rs/Cargo.toml @@ -27,6 +27,7 @@ anyhow = "1.0.99" bon = "3.7.0" bamltype = { path = "../bamltype" } dsrs-core = { path = "../dsrs-core" } +dsrs-cache = { path = "../dsrs-cache" } dsrs-trace = { path = "../dsrs-trace" } # Keep this direct pin in sync with workspace [patch.crates-io] for self-sufficient external path consumers. facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } diff --git a/crates/dspy-rs/src/utils/mod.rs b/crates/dspy-rs/src/utils/mod.rs index 9462711b..b548bede 100644 --- a/crates/dspy-rs/src/utils/mod.rs +++ b/crates/dspy-rs/src/utils/mod.rs @@ -7,7 +7,9 @@ //! Caching is per-LM-instance and keyed on the full prompt content. Cache entries //! are not shared across LM instances. -pub mod cache; +pub mod cache { + pub use dsrs_cache::*; +} pub mod serde_utils; pub mod telemetry; diff --git a/crates/dsrs-cache/Cargo.toml b/crates/dsrs-cache/Cargo.toml index 71451109..16934c21 100644 --- a/crates/dsrs-cache/Cargo.toml +++ b/crates/dsrs-cache/Cargo.toml @@ -8,3 +8,12 @@ repository = "https://github.com/krypticmouse/DSRs" description = "DSRs response cache support." [dependencies] +anyhow = "1.0.99" +async-trait = "0.1.83" +dsrs-core = { path = "../dsrs-core" } +foyer = { version = "0.20.0", features = ["serde"] } +serde = { version = "1.0.219", features = ["derive"] } +serde_json = { version = "1.0.140", features = ["preserve_order"] } +tempfile = "3.23.0" +tokio = { version = "1.46.1", features = ["sync"] } +tracing = "0.1.44" diff --git a/crates/dspy-rs/src/utils/cache.rs b/crates/dsrs-cache/src/cache.rs similarity index 99% rename from crates/dspy-rs/src/utils/cache.rs rename to crates/dsrs-cache/src/cache.rs index 866c8cf0..37ee3c1c 100644 --- a/crates/dspy-rs/src/utils/cache.rs +++ b/crates/dsrs-cache/src/cache.rs @@ -7,7 +7,7 @@ use tempfile; use tokio::sync::mpsc; use tracing::{debug, trace, warn}; -use crate::{Prediction, RawExample}; +use dsrs_core::{Prediction, RawExample}; type CacheKey = Vec<(String, Value)>; diff --git a/crates/dsrs-cache/src/lib.rs b/crates/dsrs-cache/src/lib.rs index 89442986..8fee7dec 100644 --- a/crates/dsrs-cache/src/lib.rs +++ b/crates/dsrs-cache/src/lib.rs @@ -1 +1,5 @@ -//! Empty placeholder; code is migrated into this crate by a later task. +//! LM response caching. + +pub mod cache; + +pub use cache::{Cache, CacheEntry, ResponseCache}; From 93cf840995bb5c205c91df4c8ff95e6170c75add Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 00:59:51 -0700 Subject: [PATCH 08/15] feat(lm): extract LM client and chat adapter into dsrs-lm Moves core/lm, adapter, and global settings together into dsrs-lm. They move as one unit because Settings stores Arc, Predict needs both ChatAdapter and LM, and the adapter owns typed parse/format behavior over dsrs-core schemas. Also lifts typed Example into dsrs-core so dsrs-lm can format demos without depending on dsrs-predict. That keeps the crate DAG pointed upward: predict depends on lm/core, lm depends on core/cache, not the reverse. The dspy-rs adapter/core modules are temporary pass-throughs while later extractions remove the old crate entirely. Verification: cargo check --workspace; cargo test -p dsrs-lm; cargo test --workspace --no-run. Scaffolding: ChatAdapter docs still mention Predict conceptually, but no code dependency on dsrs-predict is introduced. --- Cargo.lock | 19 ++++++ crates/dspy-rs/Cargo.toml | 1 + crates/dspy-rs/src/adapter/mod.rs | 24 +------ crates/dspy-rs/src/core/mod.rs | 8 ++- crates/dspy-rs/src/lib.rs | 2 +- crates/dspy-rs/src/predictors/mod.rs | 1 + crates/dspy-rs/src/predictors/predict.rs | 31 +-------- crates/dsrs-core/src/demo.rs | 15 ++++ crates/dsrs-core/src/lib.rs | 2 + crates/dsrs-lm/Cargo.toml | 16 +++++ crates/dsrs-lm/src/adapter.rs | 18 +++++ .../src/adapter => dsrs-lm/src}/chat.rs | 68 +++++++++++-------- crates/dsrs-lm/src/lib.rs | 12 +++- .../src/core => dsrs-lm/src}/lm/chat.rs | 0 .../src}/lm/client_registry.rs | 0 .../src/core => dsrs-lm/src}/lm/mod.rs | 4 +- .../src/core => dsrs-lm/src}/settings.rs | 3 +- 17 files changed, 136 insertions(+), 88 deletions(-) create mode 100644 crates/dsrs-core/src/demo.rs create mode 100644 crates/dsrs-lm/src/adapter.rs rename crates/{dspy-rs/src/adapter => dsrs-lm/src}/chat.rs (95%) rename crates/{dspy-rs/src/core => dsrs-lm/src}/lm/chat.rs (100%) rename crates/{dspy-rs/src/core => dsrs-lm/src}/lm/client_registry.rs (100%) rename crates/{dspy-rs/src/core => dsrs-lm/src}/lm/mod.rs (99%) rename crates/{dspy-rs/src/core => dsrs-lm/src}/settings.rs (93%) diff --git a/Cargo.lock b/Cargo.lock index 41bbabc5..eaa9b45c 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1267,6 +1267,7 @@ dependencies = [ "csv", "dsrs-cache", "dsrs-core", + "dsrs-lm", "dsrs-trace", "dsrs_macros", "enum_dispatch", @@ -1355,6 +1356,24 @@ dependencies = [ [[package]] name = "dsrs-lm" version = "0.0.0" +dependencies = [ + "anyhow", + "bamltype", + "bon", + "dsrs-cache", + "dsrs-core", + "enum_dispatch", + "facet", + "indexmap", + "minijinja", + "regex", + "reqwest 0.13.2", + "rig-core", + "serde", + "serde_json", + "tokio", + "tracing", +] [[package]] name = "dsrs-predict" diff --git a/crates/dspy-rs/Cargo.toml b/crates/dspy-rs/Cargo.toml index 6e8b5f77..6b8fa918 100644 --- a/crates/dspy-rs/Cargo.toml +++ b/crates/dspy-rs/Cargo.toml @@ -28,6 +28,7 @@ bon = "3.7.0" bamltype = { path = "../bamltype" } dsrs-core = { path = "../dsrs-core" } dsrs-cache = { path = "../dsrs-cache" } +dsrs-lm = { path = "../dsrs-lm" } dsrs-trace = { path = "../dsrs-trace" } # Keep this direct pin in sync with workspace [patch.crates-io] for self-sufficient external path consumers. facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } diff --git a/crates/dspy-rs/src/adapter/mod.rs b/crates/dspy-rs/src/adapter/mod.rs index 5e27576c..12fa552d 100644 --- a/crates/dspy-rs/src/adapter/mod.rs +++ b/crates/dspy-rs/src/adapter/mod.rs @@ -1,22 +1,2 @@ -//! Prompt formatting and LM response parsing. -//! -//! The adapter turns a [`SignatureSchema`](crate::SignatureSchema) into prompts and parses -//! LM responses back into typed values. All prompts use the `[[ ## field_name ## ]]` -//! delimiter protocol — input fields, output fields, and the `[[ ## completed ## ]]` -//! marker that signals the end of the response. -//! -//! Most users never touch this — [`Predict`](crate::Predict) calls the adapter internally. -//! Module authors who need fine-grained control over prompt construction use the -//! building blocks directly: [`build_system`](ChatAdapter::build_system), -//! [`format_input`](ChatAdapter::format_input), -//! [`parse_output`](ChatAdapter::parse_output). - -pub mod chat; - -pub use chat::*; - -/// Marker trait for configurable adapters. -/// -/// Typed call paths currently use `ChatAdapter` directly, while global settings keep -/// an adapter instance to preserve public configuration shape. -pub trait Adapter: Send + Sync + 'static {} +pub use dsrs_lm::adapter::*; +pub use dsrs_lm::chat::*; diff --git a/crates/dspy-rs/src/core/mod.rs b/crates/dspy-rs/src/core/mod.rs index 966b6742..4546defc 100644 --- a/crates/dspy-rs/src/core/mod.rs +++ b/crates/dspy-rs/src/core/mod.rs @@ -20,8 +20,12 @@ //! who need fine-grained prompt control also use [`SignatureSchema`] and the adapter //! building blocks directly. -pub mod lm; -pub mod settings; +pub mod lm { + pub use dsrs_lm::lm::*; +} +pub mod settings { + pub use dsrs_lm::settings::*; +} pub use dsrs_core::{ Augmentation, Augmented, BamlConvertError, BamlType, BamlValue, CallMetadata, Constraint, diff --git a/crates/dspy-rs/src/lib.rs b/crates/dspy-rs/src/lib.rs index c15a1e74..ec76d90d 100644 --- a/crates/dspy-rs/src/lib.rs +++ b/crates/dspy-rs/src/lib.rs @@ -109,7 +109,7 @@ pub mod predictors; pub mod trace; pub mod utils; -pub use adapter::chat::*; +pub use adapter::*; pub use dsrs_core::*; pub use core::*; pub use data::dataloader::*; diff --git a/crates/dspy-rs/src/predictors/mod.rs b/crates/dspy-rs/src/predictors/mod.rs index 691e5c76..349c0255 100644 --- a/crates/dspy-rs/src/predictors/mod.rs +++ b/crates/dspy-rs/src/predictors/mod.rs @@ -1,3 +1,4 @@ pub mod predict; +pub use dsrs_core::Example; pub use predict::*; diff --git a/crates/dspy-rs/src/predictors/predict.rs b/crates/dspy-rs/src/predictors/predict.rs index e88e3749..02b9b455 100644 --- a/crates/dspy-rs/src/predictors/predict.rs +++ b/crates/dspy-rs/src/predictors/predict.rs @@ -9,41 +9,12 @@ use std::sync::Arc; use tracing::{debug, trace}; use crate as dsrs; -use dsrs_core::{DynPredictor, Module, PredictAccessorFns, PredictState, Signature}; -use dsrs_core::RawExample; +use dsrs_core::{DynPredictor, Example, Module, PredictAccessorFns, PredictState, RawExample, Signature}; use crate::{ BamlType, BamlValue, CallMetadata, Chat, ChatAdapter, GLOBAL_SETTINGS, LmError, LmUsage, PredictError, Predicted, Prediction, SignatureSchema, }; -/// A typed input/output pair for few-shot prompting. -/// -/// Demos are formatted as user/assistant exchanges in the prompt, showing the LM -/// what good responses look like. The types enforce that demos match the signature — -/// you can't accidentally pass a `QAOutput` demo to a `Predict`. -/// -/// ``` -/// use dspy_rs::*; -/// use dspy_rs::doctest::*; -/// -/// let example = Example::::new( -/// QAInput { question: "What is 2+2?".into() }, -/// QAOutput { answer: "4".into() }, -/// ); -/// ``` -#[derive(Clone, Debug, facet::Facet)] -#[facet(crate = facet)] -pub struct Example { - pub input: S::Input, - pub output: S::Output, -} - -impl Example { - pub fn new(input: S::Input, output: S::Output) -> Self { - Self { input, output } - } -} - fn predict_dyn_visit( value: *mut (), visitor: &mut dyn FnMut(&mut dyn DynPredictor) -> ControlFlow<()>, diff --git a/crates/dsrs-core/src/demo.rs b/crates/dsrs-core/src/demo.rs new file mode 100644 index 00000000..16afaf17 --- /dev/null +++ b/crates/dsrs-core/src/demo.rs @@ -0,0 +1,15 @@ +use crate::Signature; + +/// A typed input/output pair for few-shot prompting. +#[derive(Clone, Debug, facet::Facet)] +#[facet(crate = facet)] +pub struct Example { + pub input: S::Input, + pub output: S::Output, +} + +impl Example { + pub fn new(input: S::Input, output: S::Output) -> Self { + Self { input, output } + } +} diff --git a/crates/dsrs-core/src/lib.rs b/crates/dsrs-core/src/lib.rs index 58d18d21..4ce873a9 100644 --- a/crates/dsrs-core/src/lib.rs +++ b/crates/dsrs-core/src/lib.rs @@ -5,6 +5,7 @@ #![allow(macro_expanded_macro_exports_accessed_by_absolute_paths)] mod augmentation; +mod demo; pub mod dyn_predictor; mod errors; mod example; @@ -18,6 +19,7 @@ mod specials; mod usage; pub use augmentation::*; +pub use demo::*; pub use dyn_predictor::*; pub use errors::{ConversionError, ErrorClass, JsonishError, LmError, ParseError, PredictError}; pub use example::Example as RawExample; diff --git a/crates/dsrs-lm/Cargo.toml b/crates/dsrs-lm/Cargo.toml index f5553384..4de1cb6e 100644 --- a/crates/dsrs-lm/Cargo.toml +++ b/crates/dsrs-lm/Cargo.toml @@ -8,3 +8,19 @@ repository = "https://github.com/krypticmouse/DSRs" description = "DSRs LM integration and chat adapter." [dependencies] +anyhow = "1.0.99" +bon = "3.7.0" +bamltype = { path = "../bamltype" } +dsrs-cache = { path = "../dsrs-cache" } +dsrs-core = { path = "../dsrs-core" } +enum_dispatch = "0.3.13" +facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } +indexmap = "2.10.0" +minijinja = { git = "https://github.com/boundaryml/minijinja.git", branch = "main", default-features = false, features = ["builtins", "serde"] } +regex = "1.11.2" +reqwest = { version = "0.13", features = ["blocking"] } +rig-core = { git = "https://github.com/0xPlaygrounds/rig", rev = "aee3b8bf6576ce41c9ac1dd82520752a65fa0127" } +serde = { version = "1.0.219", features = ["derive"] } +serde_json = { version = "1.0.140", features = ["preserve_order"] } +tokio = { version = "1.46.1", features = ["sync"] } +tracing = "0.1.44" diff --git a/crates/dsrs-lm/src/adapter.rs b/crates/dsrs-lm/src/adapter.rs new file mode 100644 index 00000000..38f2b548 --- /dev/null +++ b/crates/dsrs-lm/src/adapter.rs @@ -0,0 +1,18 @@ +//! Prompt formatting and LM response parsing. +//! +//! The adapter turns a [`SignatureSchema`](crate::SignatureSchema) into prompts and parses +//! LM responses back into typed values. All prompts use the `[[ ## field_name ## ]]` +//! delimiter protocol — input fields, output fields, and the `[[ ## completed ## ]]` +//! marker that signals the end of the response. +//! +//! Most users never touch this — [`Predict`](crate::Predict) calls the adapter internally. +//! Module authors who need fine-grained control over prompt construction use the +//! building blocks directly: [`build_system`](ChatAdapter::build_system), +//! [`format_input`](ChatAdapter::format_input), +//! [`parse_output`](ChatAdapter::parse_output). + +/// Marker trait for configurable adapters. +/// +/// Typed call paths currently use `ChatAdapter` directly, while global settings keep +/// an adapter instance to preserve public configuration shape. +pub trait Adapter: Send + Sync + 'static {} diff --git a/crates/dspy-rs/src/adapter/chat.rs b/crates/dsrs-lm/src/chat.rs similarity index 95% rename from crates/dspy-rs/src/adapter/chat.rs rename to crates/dsrs-lm/src/chat.rs index 8ff78cb8..0b47ca60 100644 --- a/crates/dspy-rs/src/adapter/chat.rs +++ b/crates/dsrs-lm/src/chat.rs @@ -12,17 +12,17 @@ use std::collections::HashMap; use std::sync::{LazyLock, Mutex}; use tracing::{debug, trace}; -use super::Adapter; -use crate::CallMetadata; -use crate::{ - BamlType, BamlValue, ConstraintLevel, ConstraintResult, FieldMeta, Flag, InputRenderSpec, - JsonishError, Message, OutputFormatContent, ParseError, PredictError, Predicted, RenderOptions, - Signature, TypeIR, +use crate::Adapter; +use crate::Message; +use dsrs_core::{ + BamlType, BamlValue, CallMetadata, ConstraintLevel, ConstraintResult, Example, FieldMeta, + FieldPath, FieldSchema, Flag, InputRenderSpec, JsonishError, LmUsage, OutputFormatContent, + ParseError, PredictError, Predicted, RenderOptions, Signature, SignatureSchema, TypeIR, }; /// Builds prompts and parses responses using the `[[ ## field ## ]]` delimiter protocol. /// -/// The adapter is stateless — all state comes from the [`SignatureSchema`](crate::SignatureSchema) +/// The adapter is stateless — all state comes from the [`SignatureSchema`](SignatureSchema) /// passed to each method. Two usage patterns: /// /// - **High-level** (what [`Predict`](crate::Predict) uses): `format_system_message_typed`, @@ -112,6 +112,18 @@ fn truncate_filter( Ok(format!("{truncated}{end}")) } +fn truncate_preview(value: &str, max_chars: usize) -> &str { + if value.chars().count() <= max_chars { + return value; + } + let byte_end = value + .char_indices() + .nth(max_chars) + .map(|(idx, _)| idx) + .unwrap_or(value.len()); + &value[..byte_end] +} + fn build_input_render_environment() -> minijinja::Environment<'static> { // Keep this setup aligned with BAML's jinja env defaults, then add contrib filters. let mut env = minijinja::Environment::new(); @@ -302,7 +314,7 @@ fn format_schema_for_prompt(schema: &str) -> String { impl ChatAdapter { fn format_task_description_schema( &self, - schema: &crate::SignatureSchema, + schema: &SignatureSchema, instruction_override: Option<&str>, ) -> String { let instruction = instruction_override.unwrap_or(schema.instruction()); @@ -334,7 +346,7 @@ impl ChatAdapter { format!("In adhering to this structure, your objective is: {indented}") } - fn format_response_instructions_schema(&self, schema: &crate::SignatureSchema) -> String { + fn format_response_instructions_schema(&self, schema: &SignatureSchema) -> String { let mut output_fields = schema.output_fields().iter(); let Some(first_field) = output_fields.next() else { return "Respond with the marker for `[[ ## completed ## ]]`.".to_string(); @@ -382,7 +394,7 @@ impl ChatAdapter { self.build_system(S::schema(), instruction_override) } - /// Builds a system message from a [`SignatureSchema`](crate::SignatureSchema) directly. + /// Builds a system message from a [`SignatureSchema`](SignatureSchema) directly. /// /// The schema-based equivalent of [`format_system_message_typed_with_instruction`](ChatAdapter::format_system_message_typed_with_instruction). /// Use this when you have a schema but not a concrete `S: Signature` type (e.g. @@ -393,7 +405,7 @@ impl ChatAdapter { /// Returns an error if the output format rendering fails (malformed type IR). pub fn build_system( &self, - schema: &crate::SignatureSchema, + schema: &SignatureSchema, instruction_override: Option<&str>, ) -> Result { let parts = [ @@ -408,7 +420,7 @@ impl ChatAdapter { Ok(system) } - fn format_field_descriptions_schema(&self, schema: &crate::SignatureSchema) -> String { + fn format_field_descriptions_schema(&self, schema: &SignatureSchema) -> String { let output_format = schema.output_format(); let mut lines = Vec::new(); @@ -438,7 +450,7 @@ impl ChatAdapter { lines.join("\n") } - fn format_field_structure_schema(&self, schema: &crate::SignatureSchema) -> Result { + fn format_field_structure_schema(&self, schema: &SignatureSchema) -> Result { let mut lines = vec![ "All interactions will be structured in the following way, with the appropriate values filled in.".to_string(), String::new(), @@ -485,12 +497,12 @@ impl ChatAdapter { /// Formats an input value using a schema — the building-block version of /// [`format_user_message_typed`](ChatAdapter::format_user_message_typed). /// - /// Navigates the `BamlValue` using each field's [`FieldPath`](crate::FieldPath) to + /// Navigates the `BamlValue` using each field's [`FieldPath`](FieldPath) to /// handle flattened structs correctly. A field with path `["inner", "question"]` is /// extracted from the nested structure but rendered as a flat `[[ ## question ## ]]` /// section in the prompt. Appends response instructions so the LM sees /// output-field ordering guidance in the latest user turn. - pub fn format_input(&self, schema: &crate::SignatureSchema, input: &I) -> String + pub fn format_input(&self, schema: &SignatureSchema, input: &I) -> String where I: BamlType + for<'a> facet::Facet<'a>, { @@ -532,7 +544,7 @@ impl ChatAdapter { /// Formats an output value using a schema — the building-block version of /// [`format_assistant_message_typed`](ChatAdapter::format_assistant_message_typed). - pub fn format_output(&self, schema: &crate::SignatureSchema, output: &O) -> String + pub fn format_output(&self, schema: &SignatureSchema, output: &O) -> String where O: BamlType + for<'a> facet::Facet<'a>, { @@ -560,7 +572,7 @@ impl ChatAdapter { /// and [`format_assistant_message_typed`](ChatAdapter::format_assistant_message_typed). pub fn format_demo_typed( &self, - demo: &crate::predictors::Example, + demo: &Example, ) -> (String, String) where S::Input: BamlType, @@ -619,7 +631,7 @@ impl ChatAdapter { /// Same as [`parse_response_typed`](ChatAdapter::parse_response_typed). pub fn parse_output_with_meta( &self, - schema: &crate::SignatureSchema, + schema: &SignatureSchema, response: &Message, ) -> std::result::Result<(O, IndexMap), ParseError> where @@ -664,7 +676,7 @@ impl ChatAdapter { ); trace!( field = %rust_name, - raw_preview = %crate::truncate(&raw_text, 160), + raw_preview = %truncate_preview(&raw_text, 160), "typed coercion failed preview" ); errors.push(ParseError::CoercionFailed { @@ -792,7 +804,7 @@ impl ChatAdapter { /// Convenience wrapper around [`parse_output_with_meta`](ChatAdapter::parse_output_with_meta). pub fn parse_output( &self, - schema: &crate::SignatureSchema, + schema: &SignatureSchema, response: &Message, ) -> std::result::Result where @@ -808,7 +820,7 @@ impl ChatAdapter { /// is included as a section (usually empty). Duplicate section names keep the first /// occurrence. Content before the first delimiter is discarded. pub fn parse_sections(content: &str) -> IndexMap { - crate::adapter::chat::parse_sections(content) + parse_sections(content) } /// Parses a raw [`Message`] into a [`Predicted`](crate::Predicted). @@ -834,11 +846,11 @@ impl ChatAdapter { .map_err(|source| PredictError::Parse { source, raw_response: raw_response.clone(), - lm_usage: crate::LmUsage::default(), + lm_usage: LmUsage::default(), })?; let metadata = CallMetadata::new( raw_response, - crate::LmUsage::default(), + LmUsage::default(), Vec::new(), Vec::new(), None, @@ -886,7 +898,7 @@ fn parse_sections(content: &str) -> IndexMap { fn value_for_path_relaxed<'a>( value: &'a BamlValue, - path: &crate::FieldPath, + path: &FieldPath, ) -> Option<&'a BamlValue> { let mut current = value; let parts: Vec<_> = path.iter().collect(); @@ -924,7 +936,7 @@ fn value_for_path_relaxed<'a>( fn insert_baml_at_path( root: &mut bamltype::baml_types::BamlMap, - path: &crate::FieldPath, + path: &FieldPath, value: BamlValue, ) { let parts: Vec<_> = path.iter().collect(); @@ -970,7 +982,7 @@ fn format_baml_value_for_prompt(value: &BamlValue) -> String { } fn render_input_field( - field_spec: &crate::FieldSchema, + field_spec: &FieldSchema, value: &BamlValue, input: &Value, output_format: &OutputFormatContent, @@ -992,7 +1004,7 @@ fn render_input_field( } } -fn build_input_context_value(schema: &crate::SignatureSchema, root: &BamlValue) -> Value { +fn build_input_context_value(schema: &SignatureSchema, root: &BamlValue) -> Value { let mut input_json = baml_value_to_render_json(root); let Some(root_map) = input_json.as_object_mut() else { return input_json; @@ -1023,7 +1035,7 @@ fn baml_value_to_render_json(value: &BamlValue) -> Value { fn render_input_field_jinja( template: &'static str, - field_spec: &crate::FieldSchema, + field_spec: &FieldSchema, value: &BamlValue, input: &Value, _output_format: &OutputFormatContent, diff --git a/crates/dsrs-lm/src/lib.rs b/crates/dsrs-lm/src/lib.rs index 89442986..ba5fc6d3 100644 --- a/crates/dsrs-lm/src/lib.rs +++ b/crates/dsrs-lm/src/lib.rs @@ -1 +1,11 @@ -//! Empty placeholder; code is migrated into this crate by a later task. +//! LM client, chat adapter, and global LM settings for DSRs. + +pub mod adapter; +pub mod chat; +pub mod lm; +pub mod settings; + +pub use adapter::*; +pub use chat::*; +pub use lm::*; +pub use settings::*; diff --git a/crates/dspy-rs/src/core/lm/chat.rs b/crates/dsrs-lm/src/lm/chat.rs similarity index 100% rename from crates/dspy-rs/src/core/lm/chat.rs rename to crates/dsrs-lm/src/lm/chat.rs diff --git a/crates/dspy-rs/src/core/lm/client_registry.rs b/crates/dsrs-lm/src/lm/client_registry.rs similarity index 100% rename from crates/dspy-rs/src/core/lm/client_registry.rs rename to crates/dsrs-lm/src/lm/client_registry.rs diff --git a/crates/dspy-rs/src/core/lm/mod.rs b/crates/dsrs-lm/src/lm/mod.rs similarity index 99% rename from crates/dspy-rs/src/core/lm/mod.rs rename to crates/dsrs-lm/src/lm/mod.rs index c96cc603..ba5e46b6 100644 --- a/crates/dspy-rs/src/core/lm/mod.rs +++ b/crates/dsrs-lm/src/lm/mod.rs @@ -13,8 +13,8 @@ use std::{collections::HashMap, sync::Arc}; use tokio::sync::Mutex; use tracing::{Instrument, debug, trace, warn}; -use crate::utils::cache::CacheEntry; -use crate::{Cache, Prediction, RawExample, ResponseCache}; +use dsrs_cache::{Cache, CacheEntry, ResponseCache}; +use dsrs_core::{Prediction, RawExample}; #[derive(Clone, Debug)] pub struct LMResponse { diff --git a/crates/dspy-rs/src/core/settings.rs b/crates/dsrs-lm/src/settings.rs similarity index 93% rename from crates/dspy-rs/src/core/settings.rs rename to crates/dsrs-lm/src/settings.rs index 8a3416e3..abcd5091 100644 --- a/crates/dspy-rs/src/core/settings.rs +++ b/crates/dsrs-lm/src/settings.rs @@ -1,7 +1,6 @@ use std::sync::{Arc, LazyLock, RwLock}; -use super::LM; -use crate::adapter::Adapter; +use crate::{Adapter, LM}; pub struct Settings { pub lm: Arc, From 11b3d79c2337c20a8f3260ed7d4c510a42507ece Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 01:04:52 -0700 Subject: [PATCH 09/15] feat(evaluate): extract typed metrics into dsrs-evaluate Moves evaluator, MetricOutcome/TypedMetric, feedback metrics, execution traces, and feedback helper functions into dsrs-evaluate. Boundary check: dsrs-evaluate depends only on dsrs-core plus serde/anyhow. The typed Example move from the LM extraction lets evaluation stay independent of dsrs-predict and LM, matching the design's permanent pure metric surface. Fixed the moved doctest import from dspy_rs to dsrs_evaluate and cleared a rebuild-only target/ disk-full blocker before rerunning verification. Verification: cargo check --workspace; cargo test -p dsrs-evaluate; cargo test --workspace --no-run. Scaffolding: dspy-rs::evaluate is a temporary pass-through until the final dspy-rs deletion. --- Cargo.lock | 7 ++++++ crates/dspy-rs/Cargo.toml | 1 + crates/dspy-rs/src/evaluate/mod.rs | 23 +------------------ crates/dsrs-evaluate/Cargo.toml | 4 ++++ .../src}/evaluator.rs | 6 ++--- .../src}/feedback.rs | 4 ++-- .../src}/feedback_helpers.rs | 0 crates/dsrs-evaluate/src/lib.rs | 11 ++++++++- .../evaluate => dsrs-evaluate/src}/metrics.rs | 0 9 files changed, 27 insertions(+), 29 deletions(-) rename crates/{dspy-rs/src/evaluate => dsrs-evaluate/src}/evaluator.rs (97%) rename crates/{dspy-rs/src/evaluate => dsrs-evaluate/src}/feedback.rs (99%) rename crates/{dspy-rs/src/evaluate => dsrs-evaluate/src}/feedback_helpers.rs (100%) rename crates/{dspy-rs/src/evaluate => dsrs-evaluate/src}/metrics.rs (100%) diff --git a/Cargo.lock b/Cargo.lock index eaa9b45c..9100f7e6 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1267,6 +1267,7 @@ dependencies = [ "csv", "dsrs-cache", "dsrs-core", + "dsrs-evaluate", "dsrs-lm", "dsrs-trace", "dsrs_macros", @@ -1338,6 +1339,12 @@ version = "0.0.0" [[package]] name = "dsrs-evaluate" version = "0.0.0" +dependencies = [ + "anyhow", + "dsrs-core", + "serde", + "serde_json", +] [[package]] name = "dsrs-gepa" diff --git a/crates/dspy-rs/Cargo.toml b/crates/dspy-rs/Cargo.toml index 6b8fa918..273166d5 100644 --- a/crates/dspy-rs/Cargo.toml +++ b/crates/dspy-rs/Cargo.toml @@ -29,6 +29,7 @@ bamltype = { path = "../bamltype" } dsrs-core = { path = "../dsrs-core" } dsrs-cache = { path = "../dsrs-cache" } dsrs-lm = { path = "../dsrs-lm" } +dsrs-evaluate = { path = "../dsrs-evaluate" } dsrs-trace = { path = "../dsrs-trace" } # Keep this direct pin in sync with workspace [patch.crates-io] for self-sufficient external path consumers. facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } diff --git a/crates/dspy-rs/src/evaluate/mod.rs b/crates/dspy-rs/src/evaluate/mod.rs index 410eb298..4af05aa8 100644 --- a/crates/dspy-rs/src/evaluate/mod.rs +++ b/crates/dspy-rs/src/evaluate/mod.rs @@ -1,22 +1 @@ -//! Evaluation and metrics for measuring module performance. -//! -//! The evaluation loop is simple: run the module on each training example, score the -//! result with a [`TypedMetric`], collect [`MetricOutcome`]s. Optimizers use this -//! internally, but you can also call [`evaluate_trainset`] directly to benchmark -//! your module before and after optimization. -//! -//! Two kinds of metrics: -//! - **Score-only** — return [`MetricOutcome::score()`] with a `f32`. Enough for -//! [`COPRO`](crate::COPRO) and [`MIPROv2`](crate::MIPROv2). -//! - **Score + feedback** — return [`MetricOutcome::with_feedback()`] with a -//! [`FeedbackMetric`]. Required by [`GEPA`](crate::GEPA), which uses the textual -//! feedback to guide evolutionary search. - -pub mod evaluator; -pub mod feedback; -pub mod feedback_helpers; -pub mod metrics; - -pub use evaluator::*; -pub use feedback::*; -pub use feedback_helpers::*; +pub use dsrs_evaluate::*; diff --git a/crates/dsrs-evaluate/Cargo.toml b/crates/dsrs-evaluate/Cargo.toml index 12b2f56c..d284bba2 100644 --- a/crates/dsrs-evaluate/Cargo.toml +++ b/crates/dsrs-evaluate/Cargo.toml @@ -8,3 +8,7 @@ repository = "https://github.com/krypticmouse/DSRs" description = "DSRs typed evaluation and metric support." [dependencies] +anyhow = "1.0.99" +dsrs-core = { path = "../dsrs-core" } +serde = { version = "1.0.219", features = ["derive"] } +serde_json = { version = "1.0.140", features = ["preserve_order"] } diff --git a/crates/dspy-rs/src/evaluate/evaluator.rs b/crates/dsrs-evaluate/src/evaluator.rs similarity index 97% rename from crates/dspy-rs/src/evaluate/evaluator.rs rename to crates/dsrs-evaluate/src/evaluator.rs index 8e7052ca..2f9dd32b 100644 --- a/crates/dspy-rs/src/evaluate/evaluator.rs +++ b/crates/dsrs-evaluate/src/evaluator.rs @@ -1,10 +1,8 @@ use anyhow::{Result, anyhow}; -use crate::core::Module; -use crate::predictors::Example; -use crate::{Predicted, Signature}; +use dsrs_core::{Example, Module, Predicted, Signature}; -use super::FeedbackMetric; +use crate::FeedbackMetric; /// Result of evaluating a single example: a score and optional textual feedback. /// diff --git a/crates/dspy-rs/src/evaluate/feedback.rs b/crates/dsrs-evaluate/src/feedback.rs similarity index 99% rename from crates/dspy-rs/src/evaluate/feedback.rs rename to crates/dsrs-evaluate/src/feedback.rs index 25ae4593..619f5a8e 100644 --- a/crates/dspy-rs/src/evaluate/feedback.rs +++ b/crates/dsrs-evaluate/src/feedback.rs @@ -1,4 +1,4 @@ -use crate::{BamlValue, RawExample}; +use dsrs_core::{BamlValue, RawExample}; use serde::{Deserialize, Serialize}; use std::collections::HashMap; @@ -16,7 +16,7 @@ use std::collections::HashMap; /// // and the Pareto frontier can operate on the full vector instead of a scalar collapse. /// /// ``` -/// use dspy_rs::FeedbackMetric; +/// use dsrs_evaluate::FeedbackMetric; /// /// let fb = FeedbackMetric::new(0.7, "Correct answer but verbose explanation"); /// assert_eq!(fb.score, 0.7); diff --git a/crates/dspy-rs/src/evaluate/feedback_helpers.rs b/crates/dsrs-evaluate/src/feedback_helpers.rs similarity index 100% rename from crates/dspy-rs/src/evaluate/feedback_helpers.rs rename to crates/dsrs-evaluate/src/feedback_helpers.rs diff --git a/crates/dsrs-evaluate/src/lib.rs b/crates/dsrs-evaluate/src/lib.rs index 89442986..e6d3c2e2 100644 --- a/crates/dsrs-evaluate/src/lib.rs +++ b/crates/dsrs-evaluate/src/lib.rs @@ -1 +1,10 @@ -//! Empty placeholder; code is migrated into this crate by a later task. +//! Evaluation and metrics for measuring module performance. + +pub mod evaluator; +pub mod feedback; +pub mod feedback_helpers; +pub mod metrics; + +pub use evaluator::*; +pub use feedback::*; +pub use feedback_helpers::*; diff --git a/crates/dspy-rs/src/evaluate/metrics.rs b/crates/dsrs-evaluate/src/metrics.rs similarity index 100% rename from crates/dspy-rs/src/evaluate/metrics.rs rename to crates/dsrs-evaluate/src/metrics.rs From 5caf2f36b5932c055873d964d0c2c1fca5fdad08 Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 01:10:28 -0700 Subject: [PATCH 10/15] feat(predict): extract typed predictors and strategy modules Predict, ChainOfThought, and ReAct now live in dsrs-predict instead of the dspy-rs source tree. The old crate keeps only pass-through module exports for this intermediate checkpoint so the rest of the workspace can keep compiling while tests and examples are redistributed later. The obvious split hit the macro runtime first: derives inside dsrs-predict could not resolve dspy-rs because the new crate rightly does not depend on the old facade. This commit cuts dsrs-macros over to dsrs-core paths and moves macro support exports into dsrs-core, so generated Signature/Augmentation impls no longer require the aggregator. Verification: - cargo check --workspace - cargo test -p dsrs-predict - cargo test --workspace --no-run Scaffolding: dspy-rs still has temporary re-export modules; the final hard cutover deletes that crate after GEPA/data/tests/examples move. --- Cargo.lock | 16 ++++++- crates/dspy-rs/Cargo.toml | 1 + crates/dspy-rs/src/modules/mod.rs | 6 +-- crates/dspy-rs/src/predictors/mod.rs | 5 +-- crates/dsrs-core/src/lib.rs | 12 +++++- crates/dsrs-macros/Cargo.toml | 3 +- crates/dsrs-macros/src/lib.rs | 6 +-- crates/dsrs-macros/src/runtime_path.rs | 12 +++--- crates/dsrs-macros/tests/signature_derive.rs | 2 +- crates/dsrs-predict/Cargo.toml | 10 +++++ .../src}/chain_of_thought.rs | 23 ++++++---- crates/dsrs-predict/src/lib.rs | 11 ++++- .../src}/predict.rs | 42 +++++++++++-------- .../src/modules => dsrs-predict/src}/react.rs | 8 ++-- 14 files changed, 104 insertions(+), 53 deletions(-) rename crates/{dspy-rs/src/modules => dsrs-predict/src}/chain_of_thought.rs (91%) rename crates/{dspy-rs/src/predictors => dsrs-predict/src}/predict.rs (96%) rename crates/{dspy-rs/src/modules => dsrs-predict/src}/react.rs (98%) diff --git a/Cargo.lock b/Cargo.lock index 9100f7e6..2b984e86 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1269,6 +1269,7 @@ dependencies = [ "dsrs-core", "dsrs-evaluate", "dsrs-lm", + "dsrs-predict", "dsrs-trace", "dsrs_macros", "enum_dispatch", @@ -1385,6 +1386,18 @@ dependencies = [ [[package]] name = "dsrs-predict" version = "0.0.0" +dependencies = [ + "anyhow", + "bamltype", + "dsrs-core", + "dsrs-lm", + "dsrs-trace", + "dsrs_macros", + "facet", + "rig-core", + "serde_json", + "tracing", +] [[package]] name = "dsrs-trace" @@ -1401,7 +1414,8 @@ dependencies = [ name = "dsrs_macros" version = "0.7.2" dependencies = [ - "dspy-rs", + "bamltype", + "dsrs-core", "minijinja", "proc-macro-crate", "proc-macro2", diff --git a/crates/dspy-rs/Cargo.toml b/crates/dspy-rs/Cargo.toml index 273166d5..4d1133d4 100644 --- a/crates/dspy-rs/Cargo.toml +++ b/crates/dspy-rs/Cargo.toml @@ -29,6 +29,7 @@ bamltype = { path = "../bamltype" } dsrs-core = { path = "../dsrs-core" } dsrs-cache = { path = "../dsrs-cache" } dsrs-lm = { path = "../dsrs-lm" } +dsrs-predict = { path = "../dsrs-predict" } dsrs-evaluate = { path = "../dsrs-evaluate" } dsrs-trace = { path = "../dsrs-trace" } # Keep this direct pin in sync with workspace [patch.crates-io] for self-sufficient external path consumers. diff --git a/crates/dspy-rs/src/modules/mod.rs b/crates/dspy-rs/src/modules/mod.rs index bb78415a..858fedc9 100644 --- a/crates/dspy-rs/src/modules/mod.rs +++ b/crates/dspy-rs/src/modules/mod.rs @@ -1,5 +1 @@ -pub mod chain_of_thought; -pub mod react; - -pub use chain_of_thought::{ChainOfThought, ChainOfThoughtOutput, Reasoning, WithReasoning}; -pub use react::ReAct; +pub use dsrs_predict::{ChainOfThought, ChainOfThoughtOutput, ReAct, Reasoning, WithReasoning}; diff --git a/crates/dspy-rs/src/predictors/mod.rs b/crates/dspy-rs/src/predictors/mod.rs index 349c0255..db271454 100644 --- a/crates/dspy-rs/src/predictors/mod.rs +++ b/crates/dspy-rs/src/predictors/mod.rs @@ -1,4 +1 @@ -pub mod predict; - -pub use dsrs_core::Example; -pub use predict::*; +pub use dsrs_predict::*; diff --git a/crates/dsrs-core/src/lib.rs b/crates/dsrs-core/src/lib.rs index 4ce873a9..df764619 100644 --- a/crates/dsrs-core/src/lib.rs +++ b/crates/dsrs-core/src/lib.rs @@ -4,7 +4,7 @@ // extension-attr dispatch no longer triggers rust-lang/rust#52234 on in-crate usage. #![allow(macro_expanded_macro_exports_accessed_by_absolute_paths)] -mod augmentation; +pub mod augmentation; mod demo; pub mod dyn_predictor; mod errors; @@ -42,6 +42,16 @@ pub use bamltype::internal_baml_jinja::types::{OutputFormatContent, RenderOption pub use bamltype::jsonish::deserializer::deserialize_flags::Flag; pub use facet::Facet; +#[doc(hidden)] +pub mod __macro_support { + pub use anyhow; + pub use bamltype; + pub use indexmap; + pub use schemars; + pub use serde; + pub use serde_json; +} + #[derive(Clone, Debug, serde::Serialize)] pub struct TrackedValue { pub value: serde_json::Value, diff --git a/crates/dsrs-macros/Cargo.toml b/crates/dsrs-macros/Cargo.toml index 4b666638..d633513b 100644 --- a/crates/dsrs-macros/Cargo.toml +++ b/crates/dsrs-macros/Cargo.toml @@ -22,5 +22,6 @@ serde_json = { version = "1.0.143", features = ["preserve_order"] } minijinja = { git = "https://github.com/boundaryml/minijinja.git", branch = "main", default-features = false, features = ["serde"] } [dev-dependencies] -dspy-rs = { path = "../dspy-rs" } +bamltype = { path = "../bamltype" } +dsrs-core = { path = "../dsrs-core" } trybuild = "1.0.110" diff --git a/crates/dsrs-macros/src/lib.rs b/crates/dsrs-macros/src/lib.rs index c00f3fba..a5e9e958 100644 --- a/crates/dsrs-macros/src/lib.rs +++ b/crates/dsrs-macros/src/lib.rs @@ -11,7 +11,7 @@ use syn::{ mod runtime_path; -use runtime_path::resolve_dspy_rs_path; +use runtime_path::resolve_dsrs_core_path; #[proc_macro_derive( Signature, @@ -19,7 +19,7 @@ use runtime_path::resolve_dspy_rs_path; )] pub fn derive_signature(input: TokenStream) -> TokenStream { let input = parse_macro_input!(input as DeriveInput); - let runtime = match resolve_dspy_rs_path() { + let runtime = match resolve_dsrs_core_path() { Ok(path) => path, Err(err) => return err.to_compile_error().into(), }; @@ -33,7 +33,7 @@ pub fn derive_signature(input: TokenStream) -> TokenStream { #[proc_macro_derive(Augmentation, attributes(output, augment, alias))] pub fn derive_augmentation(input: TokenStream) -> TokenStream { let input = parse_macro_input!(input as DeriveInput); - let runtime = match resolve_dspy_rs_path() { + let runtime = match resolve_dsrs_core_path() { Ok(path) => path, Err(err) => return err.to_compile_error().into(), }; diff --git a/crates/dsrs-macros/src/runtime_path.rs b/crates/dsrs-macros/src/runtime_path.rs index e97307d3..36df33dc 100644 --- a/crates/dsrs-macros/src/runtime_path.rs +++ b/crates/dsrs-macros/src/runtime_path.rs @@ -1,19 +1,19 @@ use proc_macro_crate::{FoundCrate, crate_name}; use proc_macro2::Span; -pub(crate) fn resolve_dspy_rs_path() -> syn::Result { - match crate_name("dspy-rs") { - // `crate` fails in examples/binaries inside the dspy-rs package because +pub(crate) fn resolve_dsrs_core_path() -> syn::Result { + match crate_name("dsrs-core") { + // `crate` fails in examples/binaries inside the dsrs-core package because // there it points at the example crate, not the library. Use the crate - // alias (`extern crate self as dspy_rs`) for a stable path. - Ok(FoundCrate::Itself) => Ok(syn::parse_quote!(::dspy_rs)), + // alias (`extern crate self as dsrs_core`) for a stable path. + Ok(FoundCrate::Itself) => Ok(syn::parse_quote!(::dsrs_core)), Ok(FoundCrate::Name(name)) => { let ident = syn::Ident::new(&name.replace('-', "_"), Span::call_site()); Ok(syn::parse_quote!(::#ident)) } Err(_) => Err(syn::Error::new( Span::call_site(), - "could not resolve `dspy-rs`; add it as a dependency (renamed dependencies are supported)", + "could not resolve `dsrs-core`; add it as a dependency (renamed dependencies are supported)", )), } } diff --git a/crates/dsrs-macros/tests/signature_derive.rs b/crates/dsrs-macros/tests/signature_derive.rs index abe825f6..d6db9640 100644 --- a/crates/dsrs-macros/tests/signature_derive.rs +++ b/crates/dsrs-macros/tests/signature_derive.rs @@ -1,4 +1,4 @@ -use dspy_rs::{BamlType, Facet, InputRenderSpec, Signature as SignatureTrait, SignatureSchema}; +use dsrs_core::{BamlType, Facet, InputRenderSpec, Signature as SignatureTrait, SignatureSchema}; /// Test instruction #[derive(dsrs_macros::Signature, Clone, Debug)] diff --git a/crates/dsrs-predict/Cargo.toml b/crates/dsrs-predict/Cargo.toml index 8cb1196d..a7eccdb6 100644 --- a/crates/dsrs-predict/Cargo.toml +++ b/crates/dsrs-predict/Cargo.toml @@ -8,3 +8,13 @@ repository = "https://github.com/krypticmouse/DSRs" description = "DSRs typed predictors and module implementations." [dependencies] +anyhow = "1.0.99" +bamltype = { path = "../bamltype" } +dsrs-core = { path = "../dsrs-core" } +dsrs-lm = { path = "../dsrs-lm" } +dsrs_macros = { path = "../dsrs-macros" } +dsrs-trace = { path = "../dsrs-trace" } +facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } +rig-core = { git = "https://github.com/0xPlaygrounds/rig", rev = "aee3b8bf6576ce41c9ac1dd82520752a65fa0127" } +serde_json = { version = "1.0.140", features = ["preserve_order"] } +tracing = "0.1.44" diff --git a/crates/dspy-rs/src/modules/chain_of_thought.rs b/crates/dsrs-predict/src/chain_of_thought.rs similarity index 91% rename from crates/dspy-rs/src/modules/chain_of_thought.rs rename to crates/dsrs-predict/src/chain_of_thought.rs index db8e3aad..9966f1f4 100644 --- a/crates/dspy-rs/src/modules/chain_of_thought.rs +++ b/crates/dsrs-predict/src/chain_of_thought.rs @@ -1,8 +1,7 @@ -use crate::Augmentation; use dsrs_core::Augmented; -use crate::core::{Module, Signature}; -use crate::predictors::{Example, Predict, PredictBuilder}; -use crate::{BamlType, PredictError, Predicted}; +use dsrs_core::{BamlType, Example, Module, PredictError, Predicted, Signature}; +use crate::{Predict, PredictBuilder}; +use dsrs_lm::LM; /// Augmentation that prepends a `reasoning: String` field to a signature's output. /// @@ -10,7 +9,7 @@ use crate::{BamlType, PredictError, Predicted}; /// field and generates it before the actual answer — this matters because the reasoning /// text is in the context window when the LM produces subsequent fields, so it literally /// has its own chain of thought to draw on. Used by [`ChainOfThought`]. -#[derive(Augmentation, Clone, Debug)] +#[derive(dsrs_macros::Augmentation, Clone, Debug)] #[augment(output, prepend)] pub struct Reasoning { #[output] @@ -28,9 +27,15 @@ pub type ChainOfThoughtOutput = WithReasoning<::Output>; /// real output field, not hidden metadata. /// /// ```no_run -/// # async fn example() -> Result<(), dspy_rs::PredictError> { -/// use dspy_rs::*; -/// use dspy_rs::doctest::*; +/// # async fn example() -> Result<(), dsrs_core::PredictError> { +/// use dsrs_predict::ChainOfThought; +/// #[derive(dsrs_macros::Signature, Clone, Debug)] +/// struct QA { +/// #[input] +/// question: String, +/// #[output] +/// answer: String, +/// } /// /// let cot = ChainOfThought::::new(); /// let result = cot.call(QAInput { question: "What is 2+2?".into() }).await?; @@ -165,7 +170,7 @@ impl ChainOfThoughtBuilder { } /// Sets a per-instance LM, bypassing the global. See [`PredictBuilder::lm`]. - pub fn lm(mut self, lm: crate::core::LM) -> Self { + pub fn lm(mut self, lm: LM) -> Self { self.inner = self.inner.lm(lm); self } diff --git a/crates/dsrs-predict/src/lib.rs b/crates/dsrs-predict/src/lib.rs index 89442986..360b1c69 100644 --- a/crates/dsrs-predict/src/lib.rs +++ b/crates/dsrs-predict/src/lib.rs @@ -1 +1,10 @@ -//! Empty placeholder; code is migrated into this crate by a later task. +//! Typed predictors and prompting modules. + +pub mod chain_of_thought; +pub mod predict; +pub mod react; + +pub use chain_of_thought::{ChainOfThought, ChainOfThoughtOutput, Reasoning, WithReasoning}; +pub use dsrs_core::Example; +pub use predict::*; +pub use react::ReAct; diff --git a/crates/dspy-rs/src/predictors/predict.rs b/crates/dsrs-predict/src/predict.rs similarity index 96% rename from crates/dspy-rs/src/predictors/predict.rs rename to crates/dsrs-predict/src/predict.rs index 02b9b455..28c97ca9 100644 --- a/crates/dspy-rs/src/predictors/predict.rs +++ b/crates/dsrs-predict/src/predict.rs @@ -8,12 +8,13 @@ use std::ops::ControlFlow; use std::sync::Arc; use tracing::{debug, trace}; -use crate as dsrs; +use dsrs_core as dsrs; use dsrs_core::{DynPredictor, Example, Module, PredictAccessorFns, PredictState, RawExample, Signature}; -use crate::{ - BamlType, BamlValue, CallMetadata, Chat, ChatAdapter, GLOBAL_SETTINGS, LmError, LmUsage, - PredictError, Predicted, Prediction, SignatureSchema, +use dsrs_core::{ + BamlType, BamlValue, CallMetadata, LmError, LmUsage, PredictError, Predicted, Prediction, + SignatureSchema, }; +use dsrs_lm::{Chat, ChatAdapter, GLOBAL_SETTINGS, LM, LMResponse}; fn predict_dyn_visit( value: *mut (), @@ -66,9 +67,16 @@ where /// There is no runtime registration side effect in `new()` or `build()`. /// /// ```no_run -/// # async fn example() -> Result<(), dspy_rs::PredictError> { -/// use dspy_rs::*; -/// use dspy_rs::doctest::*; +/// # async fn example() -> Result<(), dsrs_core::PredictError> { +/// use dsrs_core::Example; +/// use dsrs_predict::Predict; +/// #[derive(dsrs_macros::Signature, Clone, Debug)] +/// struct QA { +/// #[input] +/// question: String, +/// #[output] +/// answer: String, +/// } /// /// // Minimal /// let predict = Predict::::new(); @@ -98,7 +106,7 @@ pub struct Predict { demos: Vec>, instruction_override: Option, #[facet(skip, opaque)] - lm: Option>, + lm: Option>, #[facet(skip, opaque)] _marker: PhantomData, } @@ -133,7 +141,7 @@ impl Predict { demo_count = self.demos.len(), tool_count = self.tools.len(), instruction_override = self.instruction_override.is_some(), - tracing_graph = crate::trace::is_tracing() + tracing_graph = dsrs_trace::is_tracing() ) )] pub async fn call(&self, input: S::Input) -> Result, PredictError> @@ -278,7 +286,7 @@ impl Predict { "lm response received" ); - let crate::core::lm::LMResponse { + let LMResponse { output, usage, chat, @@ -286,9 +294,9 @@ impl Predict { tool_executions, } = response; - let node_id = if crate::trace::is_tracing() { - crate::trace::record_node( - crate::trace::NodeType::Predict { + let node_id = if dsrs_trace::is_tracing() { + dsrs_trace::record_node( + dsrs_trace::NodeType::Predict { signature_name: std::any::type_name::().to_string(), }, vec![], @@ -341,7 +349,7 @@ impl Predict { if let Some(id) = node_id { match prediction_from_output::(&typed_output, lm_usage.clone(), Some(id)) { Ok(prediction) => { - crate::trace::record_output(id, prediction); + dsrs_trace::record_output(id, prediction); trace!(node_id = id, "recorded typed predictor output"); } Err(err) => { @@ -383,7 +391,7 @@ pub struct PredictBuilder { tools: Vec>, demos: Vec>, instruction_override: Option, - lm: Option>, + lm: Option>, _marker: PhantomData, } @@ -440,7 +448,7 @@ impl PredictBuilder { /// .lm(LM::builder().model("anthropic:claude-sonnet-4-20250514").build().await?) /// .build(); /// ``` - pub fn lm(mut self, lm: crate::core::LM) -> Self { + pub fn lm(mut self, lm: LM) -> Self { self.lm = Some(Arc::new(lm)); self } @@ -655,7 +663,7 @@ mod tests { use super::*; use serde_json::json; - #[derive(crate::Signature, Clone, Debug)] + #[derive(dsrs_macros::Signature, Clone, Debug)] struct PredictConversionSig { #[input] prompt: String, diff --git a/crates/dspy-rs/src/modules/react.rs b/crates/dsrs-predict/src/react.rs similarity index 98% rename from crates/dspy-rs/src/modules/react.rs rename to crates/dsrs-predict/src/react.rs index d96aceab..6b67d668 100644 --- a/crates/dspy-rs/src/modules/react.rs +++ b/crates/dsrs-predict/src/react.rs @@ -7,9 +7,9 @@ use rig::message::{ToolCall, ToolFunction}; use rig::tool::{ToolDyn, ToolError}; use rig::wasm_compat::WasmBoxedFuture; -use crate::core::{Module, Signature}; -use crate::predictors::{Predict, PredictBuilder}; -use crate::{BamlType, PredictError, Predicted}; +use dsrs_core::{BamlType, Module, PredictError, Predicted, Signature}; +use dsrs_lm::LM; +use crate::{Predict, PredictBuilder}; /// ReAct action-step schema. #[derive(dsrs_macros::Signature, Clone, Debug)] @@ -326,7 +326,7 @@ where /// Sets a per-instance LM on both the action and extract predictors, /// bypassing the global. See [`PredictBuilder::lm`]. - pub fn lm(mut self, lm: crate::core::LM) -> Self { + pub fn lm(mut self, lm: LM) -> Self { self.action = self.action.lm(lm.clone()); self.extract = self.extract.lm(lm); self From 86c0a48602b20025c67fd3ef9d52c5df9ab0efb2 Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 01:17:24 -0700 Subject: [PATCH 11/15] feat(gepa): extract GEPA and delete old optimizers GEPA and the Pareto frontier now live in dsrs-gepa. COPRO and MIPROv2 source files are deleted instead of kept behind compatibility exports, matching the hard-cutover design. The non-obvious break was predictor discovery: the Facet walker still recognized only the old dspy_rs::predictors::predict::Predict shape. Updating that identity to dsrs_predict::predict makes GEPA usable after the split, and the extracted GEPA test now exercises state restoration through the new crate boundary. Deleted optimizer-only coverage and examples: - test_miprov2.rs - test_optimizer_named_parameters_integration.rs - test_optimizer_typed_metric.rs - examples/02-module-iteration-and-updation.rs - examples/04-optimize-hotpotqa.rs - examples/08-optimize-mipro.rs - examples/94-smoke-slice5-optimizer-interface.rs Verification: - cargo test -p dsrs-gepa - cargo check --workspace - cargo test --workspace --no-run Scaffolding: dspy-rs still has a temporary optimizer.rs re-export for GEPA while the final tests/examples relocation and crate deletion are pending. --- Cargo.lock | 15 + crates/dspy-rs/Cargo.toml | 1 + .../02-module-iteration-and-updation.rs | 105 ---- .../dspy-rs/examples/04-optimize-hotpotqa.rs | 95 ---- crates/dspy-rs/examples/08-optimize-mipro.rs | 131 ----- .../94-smoke-slice5-optimizer-interface.rs | 73 --- crates/dspy-rs/src/optimizer.rs | 1 + crates/dspy-rs/src/optimizer/copro.rs | 303 ----------- crates/dspy-rs/src/optimizer/mipro.rs | 509 ------------------ crates/dspy-rs/src/optimizer/mod.rs | 147 ----- crates/dspy-rs/tests/test_dataloader.rs | 12 +- crates/dspy-rs/tests/test_miprov2.rs | 140 ----- ..._optimizer_named_parameters_integration.rs | 86 --- .../tests/test_optimizer_typed_metric.rs | 194 ------- .../tests/test_public_api_compile_fail.rs | 4 +- crates/dsrs-core/src/dyn_predictor.rs | 2 +- crates/dsrs-gepa/Cargo.toml | 14 + .../src/optimizer => dsrs-gepa/src}/gepa.rs | 29 +- crates/dsrs-gepa/src/lib.rs | 139 ++++- .../src/optimizer => dsrs-gepa/src}/pareto.rs | 2 +- 20 files changed, 192 insertions(+), 1810 deletions(-) delete mode 100644 crates/dspy-rs/examples/02-module-iteration-and-updation.rs delete mode 100644 crates/dspy-rs/examples/04-optimize-hotpotqa.rs delete mode 100644 crates/dspy-rs/examples/08-optimize-mipro.rs delete mode 100644 crates/dspy-rs/examples/94-smoke-slice5-optimizer-interface.rs create mode 100644 crates/dspy-rs/src/optimizer.rs delete mode 100644 crates/dspy-rs/src/optimizer/copro.rs delete mode 100644 crates/dspy-rs/src/optimizer/mipro.rs delete mode 100644 crates/dspy-rs/src/optimizer/mod.rs delete mode 100644 crates/dspy-rs/tests/test_miprov2.rs delete mode 100644 crates/dspy-rs/tests/test_optimizer_named_parameters_integration.rs delete mode 100644 crates/dspy-rs/tests/test_optimizer_typed_metric.rs rename crates/{dspy-rs/src/optimizer => dsrs-gepa/src}/gepa.rs (96%) rename crates/{dspy-rs/src/optimizer => dsrs-gepa/src}/pareto.rs (99%) diff --git a/Cargo.lock b/Cargo.lock index 2b984e86..69f0bbb0 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1268,6 +1268,7 @@ dependencies = [ "dsrs-cache", "dsrs-core", "dsrs-evaluate", + "dsrs-gepa", "dsrs-lm", "dsrs-predict", "dsrs-trace", @@ -1350,6 +1351,20 @@ dependencies = [ [[package]] name = "dsrs-gepa" version = "0.0.0" +dependencies = [ + "anyhow", + "bon", + "dsrs-core", + "dsrs-evaluate", + "dsrs-lm", + "dsrs-predict", + "dsrs_macros", + "facet", + "rand 0.8.5", + "serde", + "tokio", + "tracing", +] [[package]] name = "dsrs-leaven" diff --git a/crates/dspy-rs/Cargo.toml b/crates/dspy-rs/Cargo.toml index 4d1133d4..bbdaf6fc 100644 --- a/crates/dspy-rs/Cargo.toml +++ b/crates/dspy-rs/Cargo.toml @@ -31,6 +31,7 @@ dsrs-cache = { path = "../dsrs-cache" } dsrs-lm = { path = "../dsrs-lm" } dsrs-predict = { path = "../dsrs-predict" } dsrs-evaluate = { path = "../dsrs-evaluate" } +dsrs-gepa = { path = "../dsrs-gepa" } dsrs-trace = { path = "../dsrs-trace" } # Keep this direct pin in sync with workspace [patch.crates-io] for self-sufficient external path consumers. facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } diff --git a/crates/dspy-rs/examples/02-module-iteration-and-updation.rs b/crates/dspy-rs/examples/02-module-iteration-and-updation.rs deleted file mode 100644 index d0d09893..00000000 --- a/crates/dspy-rs/examples/02-module-iteration-and-updation.rs +++ /dev/null @@ -1,105 +0,0 @@ -/* -Script to optimize a module via the typed optimizer API. - -Run with: -``` -cargo run --example 02-module-iteration-and-updation -``` -*/ - -use anyhow::Result; -use bon::Builder; -use dspy_rs::{ - COPRO, ChatAdapter, Example, LM, MetricOutcome, Module, Optimizer, Predict, PredictError, - Predicted, Signature, TypedMetric, average_score, configure, evaluate_trainset, init_tracing, -}; - -#[derive(Signature, Clone, Debug)] -struct QA { - #[input] - question: String, - - #[output] - answer: String, -} - -#[derive(Builder, facet::Facet)] -#[facet(crate = facet)] -struct QAModule { - #[builder(default = Predict::::builder().instruction("Answer clearly.").build())] - answerer: Predict, -} - -impl Module for QAModule { - type Input = QAInput; - type Output = QAOutput; - - async fn forward(&self, input: QAInput) -> Result, PredictError> { - self.answerer.call(input).await - } -} - -struct ExactMatch; - -impl TypedMetric for ExactMatch { - async fn evaluate( - &self, - example: &Example, - prediction: &Predicted, - ) -> Result { - let expected = example.output.answer.trim().to_lowercase(); - let actual = prediction.answer.trim().to_lowercase(); - Ok(MetricOutcome::score((expected == actual) as u8 as f32)) - } -} - -fn trainset() -> Vec> { - vec![ - Example::new( - QAInput { - question: "What is 2+2?".to_string(), - }, - QAOutput { - answer: "4".to_string(), - }, - ), - Example::new( - QAInput { - question: "Capital of France?".to_string(), - }, - QAOutput { - answer: "Paris".to_string(), - }, - ), - ] -} - -#[tokio::main] -async fn main() -> Result<()> { - init_tracing()?; - - configure( - LM::builder() - .model("openai:gpt-4o-mini".to_string()) - .build() - .await?, - ChatAdapter, - ); - - let metric = ExactMatch; - let mut module = QAModule::builder().build(); - let trainset = trainset(); - - let baseline = average_score(&evaluate_trainset(&module, &trainset, &metric).await?); - println!("baseline score: {baseline:.3}"); - - let optimizer = COPRO::builder().breadth(4).depth(1).build(); - optimizer - .compile(&mut module, trainset.clone(), &metric) - .await?; - - let optimized = average_score(&evaluate_trainset(&module, &trainset, &metric).await?); - println!("optimized score: {optimized:.3}"); - - Ok(()) -} diff --git a/crates/dspy-rs/examples/04-optimize-hotpotqa.rs b/crates/dspy-rs/examples/04-optimize-hotpotqa.rs deleted file mode 100644 index 0907db86..00000000 --- a/crates/dspy-rs/examples/04-optimize-hotpotqa.rs +++ /dev/null @@ -1,95 +0,0 @@ -/* -Script to optimize a typed QA module for a HotpotQA subset with COPRO. - -Run with: -``` -cargo run --example 04-optimize-hotpotqa --features dataloaders -``` -*/ - -use anyhow::Result; -use bon::Builder; -use dspy_rs::{ - COPRO, ChatAdapter, DataLoader, Example, LM, MetricOutcome, Module, Optimizer, Predict, - PredictError, Predicted, Signature, TypedLoadOptions, TypedMetric, average_score, configure, - evaluate_trainset, init_tracing, -}; - -#[derive(Signature, Clone, Debug)] -struct QA { - /// Concisely answer the question, but be accurate. - - #[input] - question: String, - - #[output(desc = "Answer in less than 5 words.")] - answer: String, -} - -#[derive(Builder, facet::Facet)] -#[facet(crate = facet)] -struct QAModule { - #[builder(default = Predict::::builder().instruction("Answer clearly and briefly.").build())] - answerer: Predict, -} - -impl Module for QAModule { - type Input = QAInput; - type Output = QAOutput; - - async fn forward(&self, input: QAInput) -> Result, PredictError> { - self.answerer.call(input).await - } -} - -struct ExactMatchMetric; - -impl TypedMetric for ExactMatchMetric { - async fn evaluate( - &self, - example: &Example, - prediction: &Predicted, - ) -> Result { - let expected = example.output.answer.trim().to_lowercase(); - let actual = prediction.answer.trim().to_lowercase(); - Ok(MetricOutcome::score((expected == actual) as u8 as f32)) - } -} - -#[tokio::main] -async fn main() -> Result<()> { - init_tracing()?; - - configure( - LM::builder() - .model("openai:gpt-4o-mini".to_string()) - .build() - .await?, - ChatAdapter, - ); - - let examples = DataLoader::load_hf::( - "hotpotqa/hotpot_qa", - "fullwiki", - "validation", - true, - TypedLoadOptions::default(), - )?[..10] - .to_vec(); - - let metric = ExactMatchMetric; - let mut module = QAModule::builder().build(); - - let baseline = average_score(&evaluate_trainset(&module, &examples, &metric).await?); - println!("baseline score: {baseline:.3}"); - - let optimizer = COPRO::builder().breadth(10).depth(1).build(); - optimizer - .compile(&mut module, examples.clone(), &metric) - .await?; - - let optimized = average_score(&evaluate_trainset(&module, &examples, &metric).await?); - println!("optimized score: {optimized:.3}"); - - Ok(()) -} diff --git a/crates/dspy-rs/examples/08-optimize-mipro.rs b/crates/dspy-rs/examples/08-optimize-mipro.rs deleted file mode 100644 index 6fab8439..00000000 --- a/crates/dspy-rs/examples/08-optimize-mipro.rs +++ /dev/null @@ -1,131 +0,0 @@ -/* -Example: optimize a typed QA module using MIPROv2. - -Run with: -``` -cargo run --example 08-optimize-mipro --features dataloaders -``` -*/ - -use anyhow::Result; -use bon::Builder; -use dspy_rs::{ - ChatAdapter, DataLoader, Example, LM, MIPROv2, MetricOutcome, Module, Optimizer, Predict, - PredictError, Predicted, Signature, TypedLoadOptions, TypedMetric, average_score, configure, - evaluate_trainset, init_tracing, -}; - -#[derive(Signature, Clone, Debug)] -struct QuestionAnswering { - /// Answer the question accurately and concisely. - - #[input] - question: String, - - #[output] - answer: String, -} - -#[derive(Builder, facet::Facet)] -#[facet(crate = facet)] -struct SimpleQA { - #[builder(default = Predict::::builder().instruction("Answer clearly.").build())] - answerer: Predict, -} - -impl Module for SimpleQA { - type Input = QuestionAnsweringInput; - type Output = QuestionAnsweringOutput; - - async fn forward( - &self, - input: QuestionAnsweringInput, - ) -> Result, PredictError> { - self.answerer.call(input).await - } -} - -struct ExactMatchMetric; - -impl TypedMetric for ExactMatchMetric { - async fn evaluate( - &self, - example: &Example, - prediction: &Predicted, - ) -> Result { - let expected = example.output.answer.trim().to_lowercase(); - let actual = prediction.answer.trim().to_lowercase(); - - let score = if expected == actual { - 1.0 - } else if expected.contains(&actual) || actual.contains(&expected) { - 0.5 - } else { - 0.0 - }; - - Ok(MetricOutcome::score(score)) - } -} - -#[tokio::main] -async fn main() -> Result<()> { - init_tracing()?; - - println!("=== MIPROv2 Optimizer Example ===\n"); - - configure(LM::default(), ChatAdapter); - - println!("Loading training data from HuggingFace..."); - let train_examples = DataLoader::load_hf::( - "hotpotqa/hotpot_qa", - "fullwiki", - "validation", - true, - TypedLoadOptions::default(), - )?; - - let train_subset = train_examples[..15].to_vec(); - println!("Using {} training examples\n", train_subset.len()); - - let metric = ExactMatchMetric; - let mut qa_module = SimpleQA::builder().build(); - - println!("Evaluating baseline performance..."); - let baseline_score = - average_score(&evaluate_trainset(&qa_module, &train_subset[..5], &metric).await?); - println!("Baseline score: {:.3}\n", baseline_score); - - let optimizer = MIPROv2::builder() - .num_candidates(8) - .num_trials(15) - .minibatch_size(10) - .build(); - - println!("Starting MIPROv2 optimization..."); - optimizer - .compile(&mut qa_module, train_subset.clone(), &metric) - .await?; - - println!("Evaluating optimized performance..."); - let optimized_score = - average_score(&evaluate_trainset(&qa_module, &train_subset[..5], &metric).await?); - println!("Optimized score: {:.3}", optimized_score); - - let improvement = ((optimized_score - baseline_score) / baseline_score.max(1e-6)) * 100.0; - println!( - "\nImprovement: {:.1}% ({:.3} -> {:.3})", - improvement, baseline_score, optimized_score - ); - - let result = qa_module - .call(QuestionAnsweringInput { - question: "What is the capital of France?".to_string(), - }) - .await? - .into_inner(); - println!("Question: What is the capital of France?"); - println!("Answer: {}", result.answer); - - Ok(()) -} diff --git a/crates/dspy-rs/examples/94-smoke-slice5-optimizer-interface.rs b/crates/dspy-rs/examples/94-smoke-slice5-optimizer-interface.rs deleted file mode 100644 index 97d7fec1..00000000 --- a/crates/dspy-rs/examples/94-smoke-slice5-optimizer-interface.rs +++ /dev/null @@ -1,73 +0,0 @@ -use anyhow::{Result, bail}; -use dspy_rs::{ - COPRO, ChainOfThought, ChatAdapter, Example, LM, MetricOutcome, Optimizer, Predicted, - Signature, TypedMetric, WithReasoning, configure, -}; - -#[derive(Signature, Clone, Debug, facet::Facet)] -#[facet(crate = facet)] -struct SmokeSig { - #[input] - prompt: String, - - #[output] - answer: String, -} - -struct SmokeMetric; - -impl TypedMetric> for SmokeMetric { - async fn evaluate( - &self, - _example: &Example, - prediction: &Predicted>, - ) -> Result { - let answer = prediction.answer.to_ascii_lowercase(); - Ok(MetricOutcome::score( - (answer.contains("smoke") || answer.contains("ok")) as u8 as f32, - )) - } -} - -#[tokio::main] -async fn main() -> Result<()> { - // Smoke Label: Slice 5 Optimizer Interface - configure( - LM::builder() - .model("openai:gpt-5.2".to_string()) - .build() - .await?, - ChatAdapter, - ); - - let mut module = ChainOfThought::::new(); - let trainset = vec![Example::new( - SmokeSigInput { - prompt: "Return exactly smoke-ok.".to_string(), - }, - SmokeSigOutput { - answer: "smoke-ok".to_string(), - }, - )]; - - let optimizer = COPRO::builder().breadth(4).depth(1).build(); - optimizer - .compile(&mut module, trainset, &SmokeMetric) - .await?; - - let output = module - .call(SmokeSigInput { - prompt: "Return exactly smoke-ok.".to_string(), - }) - .await? - .into_inner(); - - println!("reasoning: {}", output.reasoning); - println!("answer: {}", output.answer); - - if output.answer.trim().is_empty() { - bail!("unexpected empty answer"); - } - - Ok(()) -} diff --git a/crates/dspy-rs/src/optimizer.rs b/crates/dspy-rs/src/optimizer.rs new file mode 100644 index 00000000..191b2540 --- /dev/null +++ b/crates/dspy-rs/src/optimizer.rs @@ -0,0 +1 @@ +pub use dsrs_gepa::*; diff --git a/crates/dspy-rs/src/optimizer/copro.rs b/crates/dspy-rs/src/optimizer/copro.rs deleted file mode 100644 index 1d97374b..00000000 --- a/crates/dspy-rs/src/optimizer/copro.rs +++ /dev/null @@ -1,303 +0,0 @@ -use anyhow::{Result, anyhow}; -use bon::Builder; - -use dsrs_core::DynPredictor; -use crate::evaluate::{TypedMetric, average_score}; -use crate::optimizer::{ - Optimizer, evaluate_module_with_metric, predictor_names, with_named_predictor, -}; -use crate::predictors::Example; -use crate::{Facet, Module, Signature}; - -/// Breadth-first instruction optimizer. -/// -/// COPRO (Collaborative Prompt Optimization) generates `breadth` candidate instructions -/// per predictor, evaluates each on the trainset, keeps the best, then repeats for -/// `depth` rounds. Simple and predictable — good for quick iteration when you want -/// better instructions without complex search. -/// -/// Does not use feedback from the metric — only the numerical score matters. If you -/// have rich textual feedback, use [`GEPA`](crate::GEPA) instead. -/// -/// # Hyperparameters -/// -/// - **`breadth`** (default: 10) — candidates per round per predictor. Higher = more -/// exploration but proportionally more LM calls. Must be > 1. -/// - **`depth`** (default: 3) — optimization rounds. Each round refines the previous -/// best instruction. Diminishing returns beyond ~5. -/// - **`init_temperature`** (default: 1.4) — **currently unused.** Reserved for LM-generated -/// candidate diversity. Setting this has no effect. -/// - **`prompt_model`** — optional separate LM for generating candidate instructions. -/// Falls back to the global LM if unset. -/// -/// # Cost -/// -/// Total LM calls ≈ `breadth × depth × num_predictors × trainset_size`. For a module -/// with 2 predictors, breadth=10, depth=3, and 50 training examples: ~3000 calls. -/// -/// ```ignore -/// let copro = COPRO::builder().breadth(10).depth(3).build(); -/// copro.compile(&mut module, trainset, &metric).await?; -/// ``` -#[derive(Builder)] -pub struct COPRO { - /// Candidate instructions generated per round (must be > 1). - #[builder(default = 10)] - pub breadth: usize, - /// Optimization rounds — each refines the previous best. - #[builder(default = 3)] - pub depth: usize, - /// **Currently unused.** Reserved for controlling LM-generated candidate diversity. - /// Setting this has no effect. - #[builder(default = 1.4)] - pub init_temperature: f32, - /// Whether to track per-round statistics. - #[builder(default = false)] - pub track_stats: bool, - /// Optional separate LM for generating candidate instructions. - pub prompt_model: Option, -} - -impl COPRO { - fn current_instruction(module: &mut M, predictor_name: &str) -> Result - where - M: for<'a> Facet<'a>, - { - with_named_predictor(module, predictor_name, |predictor| { - Ok(predictor.instruction()) - }) - } - - fn set_instruction(module: &mut M, predictor_name: &str, instruction: String) -> Result<()> - where - M: for<'a> Facet<'a>, - { - with_named_predictor(module, predictor_name, |predictor| { - predictor.set_instruction(instruction); - Ok(()) - }) - } - - async fn score_candidate( - &self, - module: &mut M, - predictor_name: &str, - candidate_instruction: &str, - trainset: &[Example], - metric: &MT, - ) -> Result - where - S: Signature, - S::Input: Clone, - M: Module + for<'a> Facet<'a>, - MT: TypedMetric, - { - let original_state = with_named_predictor(module, predictor_name, |predictor| { - Ok(predictor.dump_state()) - })?; - - Self::set_instruction(module, predictor_name, candidate_instruction.to_string())?; - let evaluation = evaluate_module_with_metric(&*module, trainset, metric).await; - - match evaluation { - Ok(outcomes) => { - with_named_predictor(module, predictor_name, |predictor| { - predictor.load_state(original_state.clone()) - })?; - Ok(average_score(&outcomes)) - } - Err(eval_err) => { - if let Err(restore_err) = - with_named_predictor(module, predictor_name, |predictor| { - predictor.load_state(original_state) - }) - { - return Err(anyhow!( - "candidate evaluation failed: {eval_err}; failed to restore predictor state: {restore_err}" - )); - } - Err(eval_err) - } - } - } - - fn candidate_instructions( - &self, - base_instruction: &str, - predictor: &dyn DynPredictor, - depth: usize, - ) -> Vec { - let mut candidates = Vec::with_capacity(self.breadth.max(1)); - candidates.push(base_instruction.to_string()); - - let output_hint = predictor - .schema() - .output_fields() - .last() - .map(|field| field.lm_name) - .unwrap_or("output"); - - for idx in 0..self.breadth.saturating_sub(1) { - candidates.push(format!( - "{base_instruction}\n\nOptimization hint (d{} c{}): Be explicit and concise for `{}`.", - depth + 1, - idx + 1, - output_hint, - )); - } - - candidates - } -} - -impl Optimizer for COPRO { - type Report = (); - - async fn compile( - &self, - module: &mut M, - trainset: Vec>, - metric: &MT, - ) -> Result - where - S: Signature, - S::Input: Clone, - M: Module + for<'a> Facet<'a>, - MT: TypedMetric, - { - if self.breadth <= 1 { - return Err(anyhow!("breadth must be greater than 1")); - } - - let predictor_names = predictor_names(module)?; - - if predictor_names.is_empty() { - return Err(anyhow!("no optimizable predictors found")); - } - - for depth in 0..self.depth { - for predictor_name in &predictor_names { - let base_instruction = Self::current_instruction(module, predictor_name)?; - - let candidates = with_named_predictor(module, predictor_name, |predictor| { - Ok(self.candidate_instructions(&base_instruction, predictor, depth)) - })?; - - let mut best_instruction = base_instruction.clone(); - let mut best_score = f32::MIN; - - for candidate in candidates { - let score = self - .score_candidate::( - module, - predictor_name, - &candidate, - &trainset, - metric, - ) - .await?; - if score > best_score { - best_score = score; - best_instruction = candidate; - } - } - - Self::set_instruction(module, predictor_name, best_instruction)?; - } - } - - Ok(()) - } -} - -#[cfg(test)] -mod tests { - use anyhow::{Result, anyhow}; - - use super::*; - use crate::evaluate::{MetricOutcome, TypedMetric}; - use crate::{CallMetadata, Predict, PredictError, Predicted, Signature}; - - #[derive(Signature, Clone, Debug)] - struct CoproStateSig { - #[input] - prompt: String, - - #[output] - answer: String, - } - - #[derive(facet::Facet)] - #[facet(crate = facet)] - struct CoproStateModule { - predictor: Predict, - } - - impl Module for CoproStateModule { - type Input = CoproStateSigInput; - type Output = CoproStateSigOutput; - - async fn forward( - &self, - input: CoproStateSigInput, - ) -> Result, PredictError> { - Ok(Predicted::new( - CoproStateSigOutput { - answer: input.prompt, - }, - CallMetadata::default(), - )) - } - } - - struct AlwaysFailMetric; - - impl TypedMetric for AlwaysFailMetric { - async fn evaluate( - &self, - _example: &Example, - _prediction: &Predicted, - ) -> Result { - Err(anyhow!("metric failure")) - } - } - - fn trainset() -> Vec> { - vec![Example::new( - CoproStateSigInput { - prompt: "one".to_string(), - }, - CoproStateSigOutput { - answer: "one".to_string(), - }, - )] - } - - #[tokio::test] - async fn score_candidate_restores_state_when_metric_errors() { - let optimizer = COPRO::builder().breadth(2).depth(1).build(); - let mut module = CoproStateModule { - predictor: Predict::::builder() - .instruction("seed-instruction") - .build(), - }; - - let err = optimizer - .score_candidate::( - &mut module, - "predictor", - "candidate instruction", - &trainset(), - &AlwaysFailMetric, - ) - .await - .expect_err("candidate scoring should propagate metric failure"); - assert!(err.to_string().contains("metric failure")); - - let instruction = with_named_predictor(&mut module, "predictor", |predictor| { - Ok(predictor.instruction()) - }) - .expect("predictor lookup should succeed"); - assert_eq!(instruction, "seed-instruction"); - } -} diff --git a/crates/dspy-rs/src/optimizer/mipro.rs b/crates/dspy-rs/src/optimizer/mipro.rs deleted file mode 100644 index 6f2b4136..00000000 --- a/crates/dspy-rs/src/optimizer/mipro.rs +++ /dev/null @@ -1,509 +0,0 @@ -use anyhow::{Result, anyhow}; -use bon::Builder; - -use crate::evaluate::{TypedMetric, average_score}; -use crate::optimizer::{ - Optimizer, evaluate_module_with_metric, predictor_names, with_named_predictor, -}; -use crate::predictors::Example; -use crate::{BamlType, BamlValue, Facet, Module, Signature, SignatureSchema}; - -/// A single program execution trace: input, outputs, and score. -/// -/// Used internally by [`MIPROv2`] to collect execution data that informs -/// candidate instruction generation. Traces with higher scores guide the -/// optimizer toward better instructions. -#[derive(Clone, Debug)] -pub struct Trace { - pub input: S::Input, - pub outputs: BamlValue, - pub score: Option, -} - -impl Trace { - pub fn new(input: S::Input, outputs: BamlValue, score: Option) -> Self { - Self { - input, - outputs, - score, - } - } - - pub fn format_for_prompt(&self) -> String { - let mut result = String::new(); - result.push_str("Input:\n"); - - result.push_str(&format!(" {}\n", self.input.to_baml_value())); - - result.push_str("Output:\n"); - result.push_str(&format!(" {}\n", self.outputs)); - - if let Some(score) = self.score { - result.push_str(&format!("Score: {:.3}\n", score)); - } - - result - } -} - -/// An instruction candidate with its evaluated score. -/// -/// Generated by [`MIPROv2`]'s candidate generation step, then scored by -/// evaluating the module with this instruction on a minibatch. -#[derive(Clone, Debug)] -pub struct PromptCandidate { - pub instruction: String, - pub score: f32, -} - -impl PromptCandidate { - pub fn new(instruction: String) -> Self { - Self { - instruction, - score: 0.0, - } - } - - pub fn with_score(mut self, score: f32) -> Self { - self.score = score; - self - } -} - -/// Library of general prompting best practices used to seed candidate generation. -/// -/// These tips are appended to candidate instructions during [`MIPROv2`] optimization -/// to introduce diversity. Each candidate gets a different tip from the rotation. -pub struct PromptingTips { - pub tips: Vec, -} - -impl PromptingTips { - pub fn default_tips() -> Self { - Self { - tips: vec![ - "Use clear and specific language".to_string(), - "Provide context about the task domain".to_string(), - "Specify the desired output format".to_string(), - "Use chain-of-thought reasoning for complex tasks".to_string(), - "Include few-shot examples when helpful".to_string(), - "Break down complex instructions into steps".to_string(), - "Use role-playing (e.g., 'You are an expert...') when appropriate".to_string(), - "Specify constraints and edge cases".to_string(), - "Request explanations or reasoning when needed".to_string(), - "Use structured output formats (JSON, lists, etc.) when applicable".to_string(), - "Consider the model's strengths and limitations".to_string(), - "Be explicit about what to avoid or exclude".to_string(), - "Use positive framing (what to do vs. what not to do)".to_string(), - "Provide examples of both correct and incorrect outputs when useful".to_string(), - "Use delimiters or markers to separate different sections".to_string(), - ], - } - } - - pub fn format_for_prompt(&self) -> String { - self.tips - .iter() - .enumerate() - .map(|(i, tip)| format!("{}. {}", i + 1, tip)) - .collect::>() - .join("\n") - } -} - -/// Trace-guided instruction optimizer. -/// -/// MIPROv2 (Multi-prompt Instruction PRoposal Optimizer v2) works in three phases: -/// -/// 1. **Trace collection** — runs the module on the trainset to collect execution -/// traces with scores -/// 2. **Candidate generation** — uses the traces and prompting tips to generate -/// `num_candidates` instruction variants per predictor -/// 3. **Trial evaluation** — evaluates up to `num_trials` candidates on a minibatch, -/// keeps the best -/// -/// Unlike [`GEPA`](crate::GEPA), MIPROv2 does not require feedback — only numerical scores. -/// Unlike [`COPRO`](crate::COPRO), it uses execution traces to inform candidate generation -/// rather than -/// blind search. -/// -/// # What it doesn't do -/// -/// MIPRO only optimizes instructions, not demos. Per-predictor demo mutation from -/// trace data is the next step — Python DSPy does this and it matters. The -/// `TODO(trace-demos)` markers in the source track this gap. -/// -/// # Hyperparameters -/// -/// - **`num_candidates`** (default: 10) — instruction variants generated per predictor. -/// - **`num_trials`** (default: 20) — maximum candidates evaluated per predictor. -/// If `num_trials` < `num_candidates`, only the first `num_trials` are evaluated. -/// - **`minibatch_size`** (default: 25) — examples per candidate evaluation. -/// -/// # Cost -/// -/// Roughly `num_predictors × (trainset_size + num_trials × minibatch_size)` LM calls. -/// -/// ```ignore -/// let mipro = MIPROv2::builder() -/// .num_candidates(10) -/// .num_trials(20) -/// .build(); -/// mipro.compile(&mut module, trainset, &metric).await?; -/// ``` -#[derive(Builder)] -pub struct MIPROv2 { - /// Instruction variants generated per predictor. - #[builder(default = 10)] - pub num_candidates: usize, - - /// Maximum candidates evaluated per predictor. - #[builder(default = 20)] - pub num_trials: usize, - - /// Examples per candidate evaluation. - #[builder(default = 25)] - pub minibatch_size: usize, -} - -impl MIPROv2 { - async fn generate_traces( - &self, - module: &M, - examples: &[Example], - metric: &MT, - ) -> Result>> - where - S: Signature, - S::Input: Clone, - M: Module, - MT: TypedMetric, - { - let mut traces = Vec::with_capacity(examples.len()); - for example in examples { - let input = example.input.clone(); - let predicted = module.call(input).await.map_err(|err| anyhow!("{err}"))?; - let outcome = metric.evaluate(example, &predicted).await?; - let (output, _) = predicted.into_parts(); - traces.push(Trace::new( - example.input.clone(), - output.to_baml_value(), - Some(outcome.score), - )); - } - - Ok(traces) - } - - pub fn select_best_traces<'a, S: Signature>( - &self, - traces: &'a [Trace], - num_select: usize, - ) -> Vec<&'a Trace> { - let mut scored_traces: Vec<_> = traces.iter().filter(|t| t.score.is_some()).collect(); - - scored_traces.sort_by(|a, b| { - b.score - .partial_cmp(&a.score) - .unwrap_or(std::cmp::Ordering::Equal) - }); - - scored_traces.into_iter().take(num_select).collect() - } - - fn generate_candidate_instructions( - &self, - program_description: &str, - traces: &[Trace], - num_candidates: usize, - ) -> Vec { - let tips = PromptingTips::default_tips(); - let score_hint = traces.iter().filter_map(|t| t.score).fold(0.0f32, f32::max); - - (0..num_candidates) - .map(|idx| { - let tip = &tips.tips[idx % tips.tips.len()]; - format!( - "{program_description}\n\nOptimization candidate {}:\n- {}\n- Target score >= {:.3}", - idx + 1, - tip, - score_hint - ) - }) - .collect() - } - - pub fn create_prompt_candidates(&self, instructions: Vec) -> Vec { - instructions.into_iter().map(PromptCandidate::new).collect() - } - - async fn evaluate_candidate( - &self, - module: &mut M, - candidate: &PromptCandidate, - eval_examples: &[Example], - predictor_name: &str, - metric: &MT, - ) -> Result - where - S: Signature, - S::Input: Clone, - M: Module + for<'a> Facet<'a>, - MT: TypedMetric, - { - let original_state = with_named_predictor(module, predictor_name, |predictor| { - Ok(predictor.dump_state()) - })?; - - with_named_predictor(module, predictor_name, |predictor| { - predictor.set_instruction(candidate.instruction.clone()); - // TODO(trace-demos): derive per-predictor demos from successful traces. - // MIPRO is intentionally instruction-only in this release. - Ok(()) - })?; - - let minibatch_end = eval_examples.len().min(self.minibatch_size); - let minibatch = &eval_examples[..minibatch_end]; - let evaluation = evaluate_module_with_metric(&*module, minibatch, metric).await; - - match evaluation { - Ok(outcomes) => { - with_named_predictor(module, predictor_name, |predictor| { - predictor.load_state(original_state.clone()) - })?; - Ok(average_score(&outcomes)) - } - Err(eval_err) => { - if let Err(restore_err) = - with_named_predictor(module, predictor_name, |predictor| { - predictor.load_state(original_state) - }) - { - return Err(anyhow!( - "candidate evaluation failed: {eval_err}; failed to restore predictor state: {restore_err}" - )); - } - Err(eval_err) - } - } - } - - async fn evaluate_and_select_best( - &self, - module: &mut M, - candidates: Vec, - eval_examples: &[Example], - predictor_name: &str, - metric: &MT, - ) -> Result - where - S: Signature, - S::Input: Clone, - M: Module + for<'a> Facet<'a>, - MT: TypedMetric, - { - let mut evaluated = Vec::new(); - - let num_trials = self.num_trials.max(1); - for candidate in candidates.into_iter().take(num_trials) { - let score = self - .evaluate_candidate::( - module, - &candidate, - eval_examples, - predictor_name, - metric, - ) - .await?; - evaluated.push(candidate.with_score(score)); - } - - evaluated - .into_iter() - .max_by(|a, b| { - a.score - .partial_cmp(&b.score) - .unwrap_or(std::cmp::Ordering::Equal) - }) - .ok_or_else(|| anyhow!("no candidates to evaluate")) - } - - pub fn format_schema_fields(&self, signature: &SignatureSchema) -> String { - let mut result = String::new(); - - result.push_str("Input Fields:\n"); - for field in signature.input_fields() { - let desc = if field.docs.is_empty() { - "No description" - } else { - field.docs.as_str() - }; - result.push_str(&format!(" - {}: {}\n", field.lm_name, desc)); - } - - result.push_str("\nOutput Fields:\n"); - for field in signature.output_fields() { - let desc = if field.docs.is_empty() { - "No description" - } else { - field.docs.as_str() - }; - result.push_str(&format!(" - {}: {}\n", field.lm_name, desc)); - } - - result - } -} - -impl Optimizer for MIPROv2 { - type Report = (); - - async fn compile( - &self, - module: &mut M, - trainset: Vec>, - metric: &MT, - ) -> Result - where - S: Signature, - S::Input: Clone, - M: Module + for<'a> Facet<'a>, - MT: TypedMetric, - { - let predictor_names = predictor_names(module)?; - - if predictor_names.is_empty() { - return Err(anyhow!("no optimizable predictors found")); - } - - for predictor_name in predictor_names { - let signature_desc = { - with_named_predictor(module, &predictor_name, |predictor| { - Ok(self.format_schema_fields(predictor.schema())) - })? - }; - - let traces = self - .generate_traces::(module, &trainset, metric) - .await?; - let instructions = - self.generate_candidate_instructions(&signature_desc, &traces, self.num_candidates); - let candidates = self.create_prompt_candidates(instructions); - let best_candidate = self - .evaluate_and_select_best::( - module, - candidates, - &trainset, - &predictor_name, - metric, - ) - .await?; - - with_named_predictor(module, &predictor_name, |predictor| { - predictor.set_instruction(best_candidate.instruction.clone()); - // TODO(trace-demos): apply per-predictor demos derived from traces. - // MIPRO is intentionally instruction-only in this release. - Ok(()) - })?; - } - - Ok(()) - } -} - -#[cfg(test)] -mod tests { - use anyhow::{Result, anyhow}; - - use super::*; - use crate::evaluate::{MetricOutcome, TypedMetric}; - use crate::{CallMetadata, Predict, PredictError, Predicted, Signature}; - - #[derive(Signature, Clone, Debug)] - struct MiproStateSig { - #[input] - prompt: String, - - #[output] - answer: String, - } - - #[derive(facet::Facet)] - #[facet(crate = facet)] - struct MiproStateModule { - predictor: Predict, - } - - impl Module for MiproStateModule { - type Input = MiproStateSigInput; - type Output = MiproStateSigOutput; - - async fn forward( - &self, - input: MiproStateSigInput, - ) -> Result, PredictError> { - Ok(Predicted::new( - MiproStateSigOutput { - answer: input.prompt, - }, - CallMetadata::default(), - )) - } - } - - struct AlwaysFailMetric; - - impl TypedMetric for AlwaysFailMetric { - async fn evaluate( - &self, - _example: &Example, - _prediction: &Predicted, - ) -> Result { - Err(anyhow!("metric failure")) - } - } - - fn trainset() -> Vec> { - vec![Example::new( - MiproStateSigInput { - prompt: "one".to_string(), - }, - MiproStateSigOutput { - answer: "one".to_string(), - }, - )] - } - - #[tokio::test] - async fn evaluate_candidate_restores_state_when_metric_errors() { - let optimizer = MIPROv2::builder() - .num_candidates(2) - .num_trials(1) - .minibatch_size(1) - .build(); - let mut module = MiproStateModule { - predictor: Predict::::builder() - .instruction("seed-instruction") - .build(), - }; - let candidate = PromptCandidate::new("candidate instruction".to_string()); - - let err = optimizer - .evaluate_candidate::( - &mut module, - &candidate, - &trainset(), - "predictor", - &AlwaysFailMetric, - ) - .await - .expect_err("candidate evaluation should propagate metric failure"); - assert!(err.to_string().contains("metric failure")); - - let instruction = with_named_predictor(&mut module, "predictor", |predictor| { - Ok(predictor.instruction()) - }) - .expect("predictor lookup should succeed"); - assert_eq!(instruction, "seed-instruction"); - } -} diff --git a/crates/dspy-rs/src/optimizer/mod.rs b/crates/dspy-rs/src/optimizer/mod.rs deleted file mode 100644 index fcd761be..00000000 --- a/crates/dspy-rs/src/optimizer/mod.rs +++ /dev/null @@ -1,147 +0,0 @@ -//! Automatic prompt optimization. -//! -//! An optimizer takes a module, a training set, and a metric, then searches for better -//! instructions (and in some cases, demos) for each [`Predict`](crate::Predict) leaf. -//! The module is mutated in-place — after optimization, calling it produces better results -//! without any code changes. -//! -//! The [`Optimizer::compile`] method takes `&mut module` (exclusive access — no concurrent -//! `call()` during optimization) and returns a report. The specific report type depends -//! on the optimizer: [`COPRO`] returns `()`, [`GEPA`] returns [`GEPAResult`] with full -//! evolution history, [`MIPROv2`] returns `()`. -//! -//! # How it works internally -//! -//! 1. The optimizer calls `visit_named_predictors_mut` to discover all `Predict` -//! leaves via Facet reflection -//! 2. For each leaf, it reads the current instruction and generates candidates -//! 3. Each candidate is evaluated by setting the instruction, running the module on the -//! trainset, and scoring with the metric -//! 4. The best instruction (per optimizer's strategy) is kept -//! -//! Users never see this machinery — they call `optimizer.compile(&mut module, trainset, &metric)` -//! and their module gets better. -//! -//! # Choosing an optimizer -//! -//! | Optimizer | Strategy | Needs feedback? | Cost | -//! |-----------|----------|-----------------|------| -//! | [`COPRO`] | Breadth-first instruction search | No | Low (breadth × depth × trainset) | -//! | [`GEPA`] | Genetic-Pareto evolution with feedback | **Yes** | Medium-high (iterations × eval) | -//! | [`MIPROv2`] | Trace-guided candidate generation | No | Medium (candidates × trials × trainset) | - -pub mod copro; -pub mod gepa; -pub mod mipro; -pub mod pareto; - -pub use copro::*; -pub use gepa::*; -pub use mipro::*; -pub use pareto::*; - -use anyhow::Result; -use anyhow::anyhow; -use std::ops::ControlFlow; - -use dsrs_core::{DynPredictor, visit_named_predictors_mut}; -use crate::evaluate::{MetricOutcome, TypedMetric, evaluate_trainset}; -use crate::predictors::Example; -use crate::{Facet, Module, Signature}; - -/// Tunes a module's [`Predict`](crate::Predict) leaves for better performance. -/// -/// Takes exclusive `&mut` access to the module during optimization — you cannot call -/// the module concurrently. After `compile` returns, the module's instructions and/or -/// demos have been mutated in-place. Just call the module as before; no code changes needed. -/// -/// ```ignore -/// let optimizer = COPRO::builder().breadth(10).depth(3).build(); -/// optimizer.compile(&mut module, trainset, &metric).await?; -/// // module is now optimized — call it as usual -/// let result = module.call(input).await?; -/// ``` -/// -/// # Errors -/// -/// Returns an error if: -/// - No optimizable `Predict` leaves are found in the module -/// - The metric evaluation fails on any training example -/// - An LM call fails during candidate evaluation -#[allow(async_fn_in_trait)] -pub trait Optimizer { - type Report; - - async fn compile( - &self, - module: &mut M, - trainset: Vec>, - metric: &MT, - ) -> Result - where - S: Signature, - S::Input: Clone, - M: Module + for<'a> Facet<'a>, - MT: TypedMetric; -} - -/// Evaluates a module on a trainset using a typed metric. -/// -/// Thin wrapper around [`evaluate_trainset`](crate::evaluate::evaluate_trainset) for -/// internal optimizer use. Returns one [`MetricOutcome`] per training example. -pub(crate) async fn evaluate_module_with_metric( - module: &M, - trainset: &[Example], - metric: &MT, -) -> Result> -where - S: Signature, - S::Input: Clone, - M: Module, - MT: TypedMetric, -{ - evaluate_trainset(module, trainset, metric).await -} - -/// Returns the dotted-path names of all [`Predict`](crate::Predict) leaves in a module. -/// -/// Convenience wrapper around -/// [`visit_named_predictors_mut`](crate::core::dyn_predictor::visit_named_predictors_mut) -/// that collects discovered paths. -pub(crate) fn predictor_names(module: &mut M) -> Result> -where - M: for<'a> Facet<'a>, -{ - let mut names = Vec::new(); - visit_named_predictors_mut(module, |name, _predictor| { - names.push(name.to_string()); - ControlFlow::Continue(()) - })?; - Ok(names) -} - -/// Looks up a single named predictor and applies a closure to it. -/// -/// # Errors -/// -/// Returns an error if the predictor name doesn't match any discovered leaf. -pub(crate) fn with_named_predictor(module: &mut M, predictor_name: &str, f: F) -> Result -where - M: for<'a> Facet<'a>, - F: FnOnce(&mut dyn DynPredictor) -> Result, -{ - let mut apply = Some(f); - let mut result = None; - - visit_named_predictors_mut(module, |name, predictor| { - if name != predictor_name { - return ControlFlow::Continue(()); - } - - let f = apply.take().expect("selector closure should only run once"); - result = Some(f(predictor)); - ControlFlow::Break(()) - })?; - - result.unwrap_or_else(|| Err(anyhow!("predictor `{predictor_name}` not found"))) -} diff --git a/crates/dspy-rs/tests/test_dataloader.rs b/crates/dspy-rs/tests/test_dataloader.rs index e98d8db8..55b1f1aa 100644 --- a/crates/dspy-rs/tests/test_dataloader.rs +++ b/crates/dspy-rs/tests/test_dataloader.rs @@ -4,9 +4,8 @@ use arrow::datatypes::{DataType, Field, Schema}; use arrow::record_batch::RecordBatch; use bon::Builder; use dspy_rs::{ - COPRO, CallMetadata, DataLoader, Example, MetricOutcome, Module, Optimizer, Predict, - PredictError, Predicted, Signature, TypedLoadOptions, TypedMetric, UnknownFieldPolicy, - average_score, evaluate_trainset, + CallMetadata, DataLoader, Example, MetricOutcome, Module, Predict, PredictError, Predicted, + Signature, TypedLoadOptions, TypedMetric, UnknownFieldPolicy, average_score, evaluate_trainset, }; use parquet::arrow::ArrowWriter; use std::collections::HashMap; @@ -505,16 +504,11 @@ async fn typed_loader_outputs_feed_evaluator_and_optimizer_paths() -> Result<()> )?; let metric = ExactMatch; - let mut module = EchoModule::builder().build(); + let module = EchoModule::builder().build(); let outcomes = evaluate_trainset(&module, &trainset, &metric).await?; assert_eq!(outcomes.len(), 2); assert_eq!(average_score(&outcomes), 1.0); - let optimizer = COPRO::builder().breadth(2).depth(1).build(); - optimizer - .compile::(&mut module, trainset, &metric) - .await?; - Ok(()) } diff --git a/crates/dspy-rs/tests/test_miprov2.rs b/crates/dspy-rs/tests/test_miprov2.rs deleted file mode 100644 index 79a5435b..00000000 --- a/crates/dspy-rs/tests/test_miprov2.rs +++ /dev/null @@ -1,140 +0,0 @@ -use dspy_rs::{BamlValue, MIPROv2, PromptCandidate, PromptingTips, Signature, Trace}; -use rstest::*; - -#[derive(Signature, Clone, Debug)] -struct TestSignature { - #[input] - question: String, - - #[output] - answer: String, -} - -fn input(question: &str) -> TestSignatureInput { - TestSignatureInput { - question: question.to_string(), - } -} - -#[rstest] -fn test_trace_formatting() { - let trace = Trace::::new( - input("What is 2+2?"), - BamlValue::String("4".to_string()), - Some(1.0), - ); - let formatted = trace.format_for_prompt(); - - assert!(formatted.contains("question")); - assert!(formatted.contains("What is 2+2?")); - assert!(formatted.contains("4")); - assert!(formatted.contains("Score: 1.000")); -} - -#[rstest] -fn test_trace_formatting_without_score() { - let trace = Trace::::new( - input("input"), - BamlValue::String("result".to_string()), - None, - ); - let formatted = trace.format_for_prompt(); - - assert!(formatted.contains("Input:")); - assert!(formatted.contains("Output:")); - assert!(!formatted.contains("Score:")); -} - -#[rstest] -fn test_prompting_tips_default() { - let tips = PromptingTips::default_tips(); - - assert!(!tips.tips.is_empty()); - assert!(tips.tips.len() >= 15); -} - -#[rstest] -fn test_prompting_tips_formatting() { - let tips = PromptingTips::default_tips(); - let formatted = tips.format_for_prompt(); - - assert!(formatted.contains("1.")); - assert!(formatted.contains("\n")); -} - -#[rstest] -fn test_prompt_candidate_creation() { - let candidate = PromptCandidate::new("Test instruction".to_string()); - - assert_eq!(candidate.instruction, "Test instruction"); - assert_eq!(candidate.score, 0.0); -} - -#[rstest] -fn test_prompt_candidate_with_score() { - let candidate = PromptCandidate::new("test".to_string()).with_score(0.85); - assert_eq!(candidate.score, 0.85); -} - -#[rstest] -fn test_miprov2_default_configuration() { - let optimizer = MIPROv2::builder().build(); - - assert_eq!(optimizer.num_candidates, 10); - assert_eq!(optimizer.num_trials, 20); - assert_eq!(optimizer.minibatch_size, 25); -} - -#[rstest] -fn test_select_best_traces_descending_order() { - let optimizer = MIPROv2::builder().build(); - - let traces = vec![ - Trace::::new(input("a"), BamlValue::String("a".to_string()), Some(0.1)), - Trace::::new(input("b"), BamlValue::String("b".to_string()), Some(0.5)), - Trace::::new(input("c"), BamlValue::String("c".to_string()), Some(0.3)), - ]; - - let best = optimizer.select_best_traces(&traces, 2); - assert_eq!(best.len(), 2); - assert_eq!(best[0].score, Some(0.5)); - assert_eq!(best[1].score, Some(0.3)); -} - -#[rstest] -fn test_select_best_traces_ignores_none_scores() { - let optimizer = MIPROv2::builder().build(); - - let traces = vec![ - Trace::::new(input("a"), BamlValue::String("a".to_string()), None), - Trace::::new(input("b"), BamlValue::String("b".to_string()), Some(0.8)), - ]; - - let best = optimizer.select_best_traces(&traces, 2); - assert_eq!(best.len(), 1); - assert_eq!(best[0].score, Some(0.8)); -} - -#[rstest] -fn test_create_prompt_candidates_uses_all_instructions() { - let optimizer = MIPROv2::builder().build(); - let candidates = optimizer.create_prompt_candidates(vec![ - "instruction-1".to_string(), - "instruction-2".to_string(), - ]); - - assert_eq!(candidates.len(), 2); - assert_eq!(candidates[0].instruction, "instruction-1"); - assert_eq!(candidates[1].instruction, "instruction-2"); -} - -#[rstest] -fn test_format_schema_fields_reads_typed_schema() { - let optimizer = MIPROv2::builder().build(); - let rendered = optimizer.format_schema_fields(TestSignature::schema()); - - assert!(rendered.contains("Input Fields:")); - assert!(rendered.contains("question")); - assert!(rendered.contains("Output Fields:")); - assert!(rendered.contains("answer")); -} diff --git a/crates/dspy-rs/tests/test_optimizer_named_parameters_integration.rs b/crates/dspy-rs/tests/test_optimizer_named_parameters_integration.rs deleted file mode 100644 index 8d4abdcd..00000000 --- a/crates/dspy-rs/tests/test_optimizer_named_parameters_integration.rs +++ /dev/null @@ -1,86 +0,0 @@ -use anyhow::Result; -use dspy_rs::{ - COPRO, CallMetadata, Example, MetricOutcome, Module, Optimizer, Predict, PredictError, - Predicted, Signature, TypedMetric, -}; - -#[derive(Signature, Clone, Debug)] -struct OptimizerSig { - #[input] - prompt: String, - - #[output] - answer: String, -} - -#[derive(facet::Facet)] -#[facet(crate = facet)] -struct InstructionEchoModule { - predictor: Predict, -} - -impl Module for InstructionEchoModule { - type Input = OptimizerSigInput; - type Output = OptimizerSigOutput; - - async fn forward( - &self, - input: OptimizerSigInput, - ) -> Result, PredictError> { - let _ = &self.predictor; - Ok(Predicted::new( - OptimizerSigOutput { - answer: input.prompt, - }, - CallMetadata::default(), - )) - } -} - -struct InstructionLengthMetric; - -impl TypedMetric for InstructionLengthMetric { - async fn evaluate( - &self, - _example: &Example, - prediction: &Predicted, - ) -> Result { - Ok(MetricOutcome::score(prediction.answer.len() as f32)) - } -} - -fn trainset() -> Vec> { - vec![ - Example::new( - OptimizerSigInput { - prompt: "one".to_string(), - }, - OptimizerSigOutput { - answer: "one".to_string(), - }, - ), - Example::new( - OptimizerSigInput { - prompt: "two".to_string(), - }, - OptimizerSigOutput { - answer: "two".to_string(), - }, - ), - ] -} - -#[tokio::test] -async fn optimizer_compile_succeeds_without_public_named_parameter_access() { - let mut module = InstructionEchoModule { - predictor: Predict::::builder() - .instruction("seed") - .build(), - }; - - let optimizer = COPRO::builder().breadth(4).depth(1).build(); - optimizer - .compile::(&mut module, trainset(), &InstructionLengthMetric) - .await - .expect("COPRO compile should succeed with internal predictor discovery"); -} diff --git a/crates/dspy-rs/tests/test_optimizer_typed_metric.rs b/crates/dspy-rs/tests/test_optimizer_typed_metric.rs deleted file mode 100644 index c05a590d..00000000 --- a/crates/dspy-rs/tests/test_optimizer_typed_metric.rs +++ /dev/null @@ -1,194 +0,0 @@ -use anyhow::{Result, anyhow}; -use dspy_rs::{ - COPRO, CallMetadata, Example, MIPROv2, MetricOutcome, Module, Optimizer, Predict, PredictError, - Predicted, Signature, TypedMetric, -}; -use std::collections::HashSet; -use std::sync::{Arc, Mutex}; - -#[derive(Signature, Clone, Debug)] -struct OptimizerSig { - #[input] - prompt: String, - - #[output] - answer: String, -} - -#[derive(facet::Facet)] -#[facet(crate = facet)] -struct InstructionEchoModule { - predictor: Predict, -} - -impl Module for InstructionEchoModule { - type Input = OptimizerSigInput; - type Output = OptimizerSigOutput; - - async fn forward( - &self, - input: OptimizerSigInput, - ) -> Result, PredictError> { - let _ = &self.predictor; - Ok(Predicted::new( - OptimizerSigOutput { - answer: input.prompt, - }, - CallMetadata::default(), - )) - } -} - -struct RecordingMetric { - seen_answers: Arc>>, -} - -impl TypedMetric for RecordingMetric { - async fn evaluate( - &self, - example: &Example, - prediction: &Predicted, - ) -> Result { - self.seen_answers - .lock() - .expect("metric lock should not be poisoned") - .push(prediction.answer.clone()); - - let score = (prediction.answer == example.input.prompt) as u8 as f32; - Ok(MetricOutcome::score(score)) - } -} - -struct FailingMetric; - -impl TypedMetric for FailingMetric { - async fn evaluate( - &self, - _example: &Example, - _prediction: &Predicted, - ) -> Result { - Err(anyhow!("metric failure")) - } -} - -fn trainset() -> Vec> { - vec![ - Example::new( - OptimizerSigInput { - prompt: "one".to_string(), - }, - OptimizerSigOutput { - answer: "one".to_string(), - }, - ), - Example::new( - OptimizerSigInput { - prompt: "two".to_string(), - }, - OptimizerSigOutput { - answer: "two".to_string(), - }, - ), - ] -} - -#[tokio::test] -async fn copro_compile_uses_typed_metric_predictions() { - let seen_answers = Arc::new(Mutex::new(Vec::new())); - let metric = RecordingMetric { - seen_answers: Arc::clone(&seen_answers), - }; - - let mut module = InstructionEchoModule { - predictor: Predict::::builder() - .instruction("seed") - .build(), - }; - - let optimizer = COPRO::builder().breadth(3).depth(1).build(); - optimizer - .compile::(&mut module, trainset(), &metric) - .await - .expect("COPRO compile should succeed on typed metric"); - - let seen = seen_answers - .lock() - .expect("metric lock should not be poisoned"); - assert!(!seen.is_empty(), "metric should receive typed predictions"); - let expected_prompts = HashSet::from(["one".to_string(), "two".to_string()]); - assert!(seen.iter().all(|answer| expected_prompts.contains(answer))); - assert!(seen.iter().any(|answer| answer == "one")); - assert!(seen.iter().any(|answer| answer == "two")); -} - -#[tokio::test] -async fn mipro_compile_uses_typed_metric_predictions() { - let seen_answers = Arc::new(Mutex::new(Vec::new())); - let metric = RecordingMetric { - seen_answers: Arc::clone(&seen_answers), - }; - - let mut module = InstructionEchoModule { - predictor: Predict::::builder() - .instruction("seed") - .build(), - }; - - let optimizer = MIPROv2::builder() - .num_candidates(4) - .num_trials(2) - .minibatch_size(2) - .build(); - - optimizer - .compile::(&mut module, trainset(), &metric) - .await - .expect("MIPRO compile should succeed on typed metric"); - - let seen = seen_answers - .lock() - .expect("metric lock should not be poisoned"); - assert!(!seen.is_empty(), "metric should receive typed predictions"); - let expected_prompts = HashSet::from(["one".to_string(), "two".to_string()]); - assert!(seen.iter().all(|answer| expected_prompts.contains(answer))); - assert!(seen.iter().any(|answer| answer == "one")); - assert!(seen.iter().any(|answer| answer == "two")); -} - -#[tokio::test] -async fn copro_compile_propagates_metric_errors() { - let mut module = InstructionEchoModule { - predictor: Predict::::builder() - .instruction("seed") - .build(), - }; - let optimizer = COPRO::builder().breadth(3).depth(1).build(); - - let err = optimizer - .compile::(&mut module, trainset(), &FailingMetric) - .await - .expect_err("COPRO should propagate typed metric errors"); - - assert!(err.to_string().contains("metric failure")); -} - -#[tokio::test] -async fn mipro_compile_propagates_metric_errors() { - let mut module = InstructionEchoModule { - predictor: Predict::::builder() - .instruction("seed") - .build(), - }; - let optimizer = MIPROv2::builder() - .num_candidates(4) - .num_trials(2) - .minibatch_size(2) - .build(); - - let err = optimizer - .compile::(&mut module, trainset(), &FailingMetric) - .await - .expect_err("MIPRO should propagate typed metric errors"); - - assert!(err.to_string().contains("metric failure")); -} diff --git a/crates/dspy-rs/tests/test_public_api_compile_fail.rs b/crates/dspy-rs/tests/test_public_api_compile_fail.rs index 83c387bc..e515284f 100644 --- a/crates/dspy-rs/tests/test_public_api_compile_fail.rs +++ b/crates/dspy-rs/tests/test_public_api_compile_fail.rs @@ -87,7 +87,7 @@ fn optimizer_compile_rejects_wrong_signature_input_type() { "wrong_signature_case", r#" use anyhow::Result; -use dspy_rs::{COPRO, ChainOfThought, Example, MetricOutcome, Optimizer, Predicted, Signature, TypedMetric, WithReasoning}; +use dspy_rs::{ChainOfThought, Example, GEPA, MetricOutcome, Optimizer, Predicted, Signature, TypedMetric, WithReasoning}; #[derive(Signature, Clone, Debug)] struct RightSig { @@ -120,7 +120,7 @@ impl TypedMetric> for Metric { fn main() { let mut module = ChainOfThought::::new(); let trainset: Vec> = Vec::new(); - let optimizer = COPRO::builder().breadth(1).depth(1).build(); + let optimizer = GEPA::builder().num_iterations(1).minibatch_size(1).build(); let _future = optimizer.compile::(&mut module, trainset, &Metric); } "#, diff --git a/crates/dsrs-core/src/dyn_predictor.rs b/crates/dsrs-core/src/dyn_predictor.rs index 69db59e4..c7b00b7d 100644 --- a/crates/dsrs-core/src/dyn_predictor.rs +++ b/crates/dsrs-core/src/dyn_predictor.rs @@ -334,7 +334,7 @@ fn resolve_predict_leaf(shape: &'static Shape) -> PredictLeafResolution { } fn is_predict_shape_identity(shape: &'static Shape) -> bool { - shape.type_identifier == "Predict" && shape.module_path == Some("dspy_rs::predictors::predict") + shape.type_identifier == "Predict" && shape.module_path == Some("dsrs_predict::predict") } fn push_field(path: &str, field: &str) -> String { diff --git a/crates/dsrs-gepa/Cargo.toml b/crates/dsrs-gepa/Cargo.toml index 3ba8a3b6..f4f9d02a 100644 --- a/crates/dsrs-gepa/Cargo.toml +++ b/crates/dsrs-gepa/Cargo.toml @@ -8,3 +8,17 @@ repository = "https://github.com/krypticmouse/DSRs" description = "DSRs GEPA optimizer support." [dependencies] +anyhow = "1.0.99" +bon = "3.7.0" +dsrs-core = { path = "../dsrs-core" } +dsrs-evaluate = { path = "../dsrs-evaluate" } +dsrs-lm = { path = "../dsrs-lm" } +dsrs-predict = { path = "../dsrs-predict" } +facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } +rand = "0.8.5" +serde = { version = "1.0.219", features = ["derive"] } +tracing = "0.1.44" + +[dev-dependencies] +dsrs_macros = { version = "0.7.2", path = "../dsrs-macros" } +tokio = { version = "1.46.1", features = ["macros", "rt"] } diff --git a/crates/dspy-rs/src/optimizer/gepa.rs b/crates/dsrs-gepa/src/gepa.rs similarity index 96% rename from crates/dspy-rs/src/optimizer/gepa.rs rename to crates/dsrs-gepa/src/gepa.rs index e4c799c6..cfdecd26 100644 --- a/crates/dspy-rs/src/optimizer/gepa.rs +++ b/crates/dsrs-gepa/src/gepa.rs @@ -2,14 +2,11 @@ use anyhow::{Context, Result, anyhow}; use bon::Builder; use serde::{Deserialize, Serialize}; -use crate::evaluate::{MetricOutcome, TypedMetric, average_score}; -use crate::optimizer::{ - Optimizer, evaluate_module_with_metric, predictor_names, with_named_predictor, -}; -use crate::predictors::Example; -use crate::{BamlType, BamlValue, Facet, Module, Signature}; +use dsrs_core::{BamlType, BamlValue, Example, Facet, Module, Signature}; +use dsrs_evaluate::{MetricOutcome, TypedMetric, average_score}; -use super::pareto::ParetoFrontier; +use crate::{Optimizer, evaluate_module_with_metric, predictor_names, with_named_predictor}; +use crate::pareto::ParetoFrontier; /// A single instruction candidate tracked through GEPA's evolutionary search. /// @@ -75,7 +72,7 @@ pub use super::pareto::ParetoStatistics; /// Genetic-Pareto instruction optimizer with feedback-driven evolution. /// /// GEPA uses an evolutionary search guided by per-example feedback from your metric. -/// Unlike [`COPRO`](crate::COPRO) which only uses numerical scores, GEPA requires your +/// GEPA requires your /// [`TypedMetric`] to return [`MetricOutcome::with_feedback`] — textual feedback /// explaining *why* each example scored the way it did. This feedback gets appended /// to the instruction as a mutation prompt for the next generation, so the quality @@ -155,7 +152,7 @@ pub struct GEPA { /// Hard cap on total LM calls (rollouts + generation). pub max_lm_calls: Option, /// Optional separate LM for candidate generation. - pub prompt_model: Option, + pub prompt_model: Option, } impl GEPA { @@ -505,8 +502,10 @@ mod tests { use anyhow::{Result, anyhow}; use super::*; - use crate::evaluate::{MetricOutcome, TypedMetric}; - use crate::{CallMetadata, Predict, PredictError, Predicted, Signature}; + use dsrs_core::{CallMetadata, PredictError, Predicted}; + use dsrs_evaluate::{MetricOutcome, TypedMetric}; + use dsrs_macros::Signature; + use dsrs_predict::Predict; #[derive(Signature, Clone, Debug)] struct GepaStateSig { @@ -571,11 +570,15 @@ mod tests { .instruction("seed-instruction") .build(), }; + let predictor_name = predictor_names(&mut module) + .expect("predictor discovery should succeed") + .pop() + .expect("test module should expose one predictor"); let err = optimizer .evaluate_candidate::( &mut module, - "predictor", + &predictor_name, "candidate instruction", &eval_set(), &AlwaysFailMetric, @@ -584,7 +587,7 @@ mod tests { .expect_err("candidate evaluation should propagate metric failure"); assert!(err.to_string().contains("metric failure")); - let instruction = with_named_predictor(&mut module, "predictor", |predictor| { + let instruction = with_named_predictor(&mut module, &predictor_name, |predictor| { Ok(predictor.instruction()) }) .expect("predictor lookup should succeed"); diff --git a/crates/dsrs-gepa/src/lib.rs b/crates/dsrs-gepa/src/lib.rs index 89442986..ecbfbdd2 100644 --- a/crates/dsrs-gepa/src/lib.rs +++ b/crates/dsrs-gepa/src/lib.rs @@ -1 +1,138 @@ -//! Empty placeholder; code is migrated into this crate by a later task. +//! Automatic prompt optimization. +//! +//! An optimizer takes a module, a training set, and a metric, then searches for better +//! instructions (and in some cases, demos) for each `Predict` leaf. +//! The module is mutated in-place — after optimization, calling it produces better results +//! without any code changes. +//! +//! The [`Optimizer::compile`] method takes `&mut module` (exclusive access — no concurrent +//! `call()` during optimization) and returns a report. The specific report type depends +//! on the optimizer. This crate carries GEPA and its Pareto frontier support. +//! +//! # How it works internally +//! +//! 1. The optimizer calls `visit_named_predictors_mut` to discover all `Predict` +//! leaves via Facet reflection +//! 2. For each leaf, it reads the current instruction and generates candidates +//! 3. Each candidate is evaluated by setting the instruction, running the module on the +//! trainset, and scoring with the metric +//! 4. The best instruction (per optimizer's strategy) is kept +//! +//! Users never see this machinery — they call `optimizer.compile(&mut module, trainset, &metric)` +//! and their module gets better. +//! +//! # Choosing an optimizer +//! +//! | Optimizer | Strategy | Needs feedback? | Cost | +//! |-----------|----------|-----------------|------| +//! | [`GEPA`] | Genetic-Pareto evolution with feedback | **Yes** | Medium-high (iterations × eval) | + +pub mod gepa; +pub mod pareto; + +pub use gepa::*; +pub use pareto::*; + +use anyhow::Result; +use anyhow::anyhow; +use std::ops::ControlFlow; + +use dsrs_core::{DynPredictor, Example, Facet, Module, Signature, visit_named_predictors_mut}; +use dsrs_evaluate::{MetricOutcome, TypedMetric, evaluate_trainset}; + +/// Tunes a module's `Predict` leaves for better performance. +/// +/// Takes exclusive `&mut` access to the module during optimization — you cannot call +/// the module concurrently. After `compile` returns, the module's instructions and/or +/// demos have been mutated in-place. Just call the module as before; no code changes needed. +/// +/// ```ignore +/// let optimizer = GEPA::builder().num_generations(3).build(); +/// optimizer.compile(&mut module, trainset, &metric).await?; +/// // module is now optimized — call it as usual +/// let result = module.call(input).await?; +/// ``` +/// +/// # Errors +/// +/// Returns an error if: +/// - No optimizable `Predict` leaves are found in the module +/// - The metric evaluation fails on any training example +/// - An LM call fails during candidate evaluation +#[allow(async_fn_in_trait)] +pub trait Optimizer { + type Report; + + async fn compile( + &self, + module: &mut M, + trainset: Vec>, + metric: &MT, + ) -> Result + where + S: Signature, + S::Input: Clone, + M: Module + for<'a> Facet<'a>, + MT: TypedMetric; +} + +/// Evaluates a module on a trainset using a typed metric. +/// +/// Thin wrapper around [`evaluate_trainset`](dsrs_evaluate::evaluate_trainset) for +/// internal optimizer use. Returns one [`MetricOutcome`] per training example. +pub(crate) async fn evaluate_module_with_metric( + module: &M, + trainset: &[Example], + metric: &MT, +) -> Result> +where + S: Signature, + S::Input: Clone, + M: Module, + MT: TypedMetric, +{ + evaluate_trainset(module, trainset, metric).await +} + +/// Returns the dotted-path names of all `Predict` leaves in a module. +/// +/// Convenience wrapper around +/// [`visit_named_predictors_mut`](dsrs_core::visit_named_predictors_mut) +/// that collects discovered paths. +pub(crate) fn predictor_names(module: &mut M) -> Result> +where + M: for<'a> Facet<'a>, +{ + let mut names = Vec::new(); + visit_named_predictors_mut(module, |name, _predictor| { + names.push(name.to_string()); + ControlFlow::Continue(()) + })?; + Ok(names) +} + +/// Looks up a single named predictor and applies a closure to it. +/// +/// # Errors +/// +/// Returns an error if the predictor name doesn't match any discovered leaf. +pub(crate) fn with_named_predictor(module: &mut M, predictor_name: &str, f: F) -> Result +where + M: for<'a> Facet<'a>, + F: FnOnce(&mut dyn DynPredictor) -> Result, +{ + let mut apply = Some(f); + let mut result = None; + + visit_named_predictors_mut(module, |name, predictor| { + if name != predictor_name { + return ControlFlow::Continue(()); + } + + let f = apply.take().expect("selector closure should only run once"); + result = Some(f(predictor)); + ControlFlow::Break(()) + })?; + + result.unwrap_or_else(|| Err(anyhow!("predictor `{predictor_name}` not found"))) +} diff --git a/crates/dspy-rs/src/optimizer/pareto.rs b/crates/dsrs-gepa/src/pareto.rs similarity index 99% rename from crates/dspy-rs/src/optimizer/pareto.rs rename to crates/dsrs-gepa/src/pareto.rs index ecdec4f7..56544a98 100644 --- a/crates/dspy-rs/src/optimizer/pareto.rs +++ b/crates/dsrs-gepa/src/pareto.rs @@ -2,7 +2,7 @@ use rand::Rng; use serde::{Deserialize, Serialize}; use std::collections::{HashMap, HashSet}; -use crate::optimizer::gepa::GEPACandidate; +use crate::gepa::GEPACandidate; /// Per-example dominance frontier for [`GEPA`](crate::GEPA)'s evolutionary search. /// From fab2e1c38b00b1a4d46d6cd7ce30a33dd938b9a1 Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 01:22:15 -0700 Subject: [PATCH 12/15] feat(data): extract typed loaders into dsrs-data DataLoader, typed row mapping, JSONL serialization, and URL detection now live in dsrs-data. The new crate owns the heavy data stack through explicit features so light users do not pay for CSV/Parquet/HuggingFace unless they opt in. The old dspy-rs data module is reduced to a temporary pass-through, because final test/example relocation still needs a compileable bridge before the aggregator crate is deleted. Verification: - cargo check -p dsrs-data --features all - cargo check --workspace - cargo test -p dsrs-data --features all - cargo test --workspace --no-run Scaffolding: feature gates are in place, but tests still live under dspy-rs until the final hard cutover redistributes them. --- Cargo.lock | 18 +++++++++++ crates/dspy-rs/Cargo.toml | 1 + crates/dspy-rs/src/data.rs | 1 + crates/dspy-rs/src/data/mod.rs | 27 ----------------- crates/dsrs-data/Cargo.toml | 23 ++++++++++++++ .../src/data => dsrs-data/src}/dataloader.rs | 5 ++-- crates/dsrs-data/src/lib.rs | 30 ++++++++++++++++++- .../src/data => dsrs-data/src}/serialize.rs | 2 +- .../src/data => dsrs-data/src}/utils.rs | 0 9 files changed, 75 insertions(+), 32 deletions(-) create mode 100644 crates/dspy-rs/src/data.rs delete mode 100644 crates/dspy-rs/src/data/mod.rs rename crates/{dspy-rs/src/data => dsrs-data/src}/dataloader.rs (99%) rename crates/{dspy-rs/src/data => dsrs-data/src}/serialize.rs (96%) rename crates/{dspy-rs/src/data => dsrs-data/src}/utils.rs (100%) diff --git a/Cargo.lock b/Cargo.lock index 69f0bbb0..2c8bd218 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1267,6 +1267,7 @@ dependencies = [ "csv", "dsrs-cache", "dsrs-core", + "dsrs-data", "dsrs-evaluate", "dsrs-gepa", "dsrs-lm", @@ -1337,6 +1338,23 @@ dependencies = [ [[package]] name = "dsrs-data" version = "0.0.0" +dependencies = [ + "anyhow", + "arrow", + "bamltype", + "csv", + "dsrs-core", + "dsrs-predict", + "hf-hub", + "parquet", + "rayon", + "regex", + "reqwest 0.13.2", + "serde", + "serde_json", + "thiserror 2.0.17", + "tracing", +] [[package]] name = "dsrs-evaluate" diff --git a/crates/dspy-rs/Cargo.toml b/crates/dspy-rs/Cargo.toml index bbdaf6fc..10ae7089 100644 --- a/crates/dspy-rs/Cargo.toml +++ b/crates/dspy-rs/Cargo.toml @@ -32,6 +32,7 @@ dsrs-lm = { path = "../dsrs-lm" } dsrs-predict = { path = "../dsrs-predict" } dsrs-evaluate = { path = "../dsrs-evaluate" } dsrs-gepa = { path = "../dsrs-gepa" } +dsrs-data = { path = "../dsrs-data", features = ["all"] } dsrs-trace = { path = "../dsrs-trace" } # Keep this direct pin in sync with workspace [patch.crates-io] for self-sufficient external path consumers. facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } diff --git a/crates/dspy-rs/src/data.rs b/crates/dspy-rs/src/data.rs new file mode 100644 index 00000000..ae93c622 --- /dev/null +++ b/crates/dspy-rs/src/data.rs @@ -0,0 +1 @@ +pub use dsrs_data::*; diff --git a/crates/dspy-rs/src/data/mod.rs b/crates/dspy-rs/src/data/mod.rs deleted file mode 100644 index ec0d8539..00000000 --- a/crates/dspy-rs/src/data/mod.rs +++ /dev/null @@ -1,27 +0,0 @@ -//! Data loading and runtime row types. -//! -//! Typed ingestion is now first-class: -//! -//! - [`DataLoader`] provides `load_*` methods that return -//! [`Example`](crate::predictors::Example) directly. -//! - Typed examples flow directly into evaluation and optimizer APIs. -//! -//! The untyped row type (`RawExample`) remains for internal runtime/tracing/cache bridges. - -pub mod dataloader; -pub mod example { - pub use dsrs_core::RawExample as Example; -} -pub mod prediction { - pub use dsrs_core::Prediction; -} -pub mod serialize; -pub mod utils; - -pub use dataloader::*; -pub use example::*; -pub use prediction::*; -pub use serialize::*; -pub use utils::*; - -pub type RawExample = dsrs_core::RawExample; diff --git a/crates/dsrs-data/Cargo.toml b/crates/dsrs-data/Cargo.toml index 236d2d49..594a4a6d 100644 --- a/crates/dsrs-data/Cargo.toml +++ b/crates/dsrs-data/Cargo.toml @@ -8,3 +8,26 @@ repository = "https://github.com/krypticmouse/DSRs" description = "DSRs dataset loading support." [dependencies] +anyhow = "1.0.99" +bamltype = { path = "../bamltype" } +csv = { version = "1.3.1", optional = true } +dsrs-core = { path = "../dsrs-core" } +dsrs-predict = { path = "../dsrs-predict" } +parquet = { version = "56.1.0", optional = true } +arrow = { version = "56.1.0", optional = true } +hf-hub = { version = "0.4.3", features = ["tokio"], optional = true } +rayon = "1.10.0" +regex = "1.11.2" +reqwest = { version = "0.13", features = ["blocking"], optional = true } +serde = { version = "1.0.219", features = ["derive"] } +serde_json = { version = "1.0.140", features = ["preserve_order"] } +thiserror = "2.0.17" +tracing = "0.1.44" + +[features] +default = ["csv", "json"] +all = ["csv", "json", "parquet", "hf-hub"] +csv = ["dep:csv"] +json = [] +parquet = ["dep:arrow", "dep:parquet"] +hf-hub = ["dep:hf-hub", "dep:reqwest", "parquet"] diff --git a/crates/dspy-rs/src/data/dataloader.rs b/crates/dsrs-data/src/dataloader.rs similarity index 99% rename from crates/dspy-rs/src/data/dataloader.rs rename to crates/dsrs-data/src/dataloader.rs index 94b690b0..48228363 100644 --- a/crates/dspy-rs/src/data/dataloader.rs +++ b/crates/dsrs-data/src/dataloader.rs @@ -15,9 +15,8 @@ use std::io::Cursor; use std::path::{Path, PathBuf}; use tracing::debug; -use crate::data::utils::is_url; -use crate::predictors::Example as TypedExample; -use crate::{BamlType, BamlValue, Signature}; +use crate::utils::is_url; +use dsrs_core::{BamlType, BamlValue, Example as TypedExample, Signature}; /// Controls how typed loaders handle source fields that are not part of the target signature. #[derive(Debug, Clone, Copy, PartialEq, Eq, Default)] diff --git a/crates/dsrs-data/src/lib.rs b/crates/dsrs-data/src/lib.rs index 89442986..6bc5ec9a 100644 --- a/crates/dsrs-data/src/lib.rs +++ b/crates/dsrs-data/src/lib.rs @@ -1 +1,29 @@ -//! Empty placeholder; code is migrated into this crate by a later task. +//! Data loading and runtime row types. +//! +//! Typed ingestion is now first-class: +//! +//! - [`DataLoader`] provides `load_*` methods that return +//! [`Example`](dsrs_core::Example) directly. +//! - Typed examples flow directly into evaluation and optimizer APIs. +//! +//! The untyped row type (`RawExample`) remains for internal runtime/tracing/cache bridges. + +#[cfg(any(feature = "csv", feature = "json", feature = "parquet", feature = "hf-hub"))] +pub mod dataloader; +pub mod example { + pub use dsrs_core::RawExample as Example; +} +pub mod prediction { + pub use dsrs_core::Prediction; +} +pub mod serialize; +pub mod utils; + +#[cfg(any(feature = "csv", feature = "json", feature = "parquet", feature = "hf-hub"))] +pub use dataloader::*; +pub use example::*; +pub use prediction::*; +pub use serialize::*; +pub use utils::*; + +pub type RawExample = dsrs_core::RawExample; diff --git a/crates/dspy-rs/src/data/serialize.rs b/crates/dsrs-data/src/serialize.rs similarity index 96% rename from crates/dspy-rs/src/data/serialize.rs rename to crates/dsrs-data/src/serialize.rs index bde1724f..10d73b74 100644 --- a/crates/dspy-rs/src/data/serialize.rs +++ b/crates/dsrs-data/src/serialize.rs @@ -2,7 +2,7 @@ use rayon::prelude::*; use std::fs::File; use std::io::{BufRead, BufReader, BufWriter, Write}; -use crate::data::example::Example; +use crate::example::Example; #[allow(clippy::lines_filter_map_ok)] pub fn load_jsonl(path: &str, input_keys: Vec, output_key: Vec) -> Vec { diff --git a/crates/dspy-rs/src/data/utils.rs b/crates/dsrs-data/src/utils.rs similarity index 100% rename from crates/dspy-rs/src/data/utils.rs rename to crates/dsrs-data/src/utils.rs From 82ce149f69554a8a30fc9e3fbcbecfbde1cedbbf Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 01:25:06 -0700 Subject: [PATCH 13/15] feat(leaven): add compiling integration scaffold Adds dsrs-leaven modules for artifact, change, surface, evaluator/problem, and evidence against the current local leaven crates. The bodies intentionally stay unimplemented; the value here is signature pressure against leaven-core/leaven-surface/leaven-engine/leaven-evidence while the real leaven optimizer path is still pending. The plan sketch expected older trait shapes. Current leaven has Artifact::ApplyError, EditSurface as a separate capability, and OptimizationProblem as the evaluator binding point, so the scaffold follows those actual APIs instead of stale names. Verification: - cargo check -p dsrs-leaven - cargo check --workspace Scaffolding: no runtime bridge yet; this is only a compile-time contract for the follow-up leaven implementation. --- Cargo.lock | 6 +++ crates/dsrs-leaven/Cargo.toml | 6 +++ crates/dsrs-leaven/src/artifact.rs | 53 +++++++++++++++++++++++++ crates/dsrs-leaven/src/change.rs | 5 +++ crates/dsrs-leaven/src/evaluator.rs | 46 ++++++++++++++++++++++ crates/dsrs-leaven/src/evidence.rs | 6 +++ crates/dsrs-leaven/src/lib.rs | 24 +++++++++++- crates/dsrs-leaven/src/surface.rs | 61 +++++++++++++++++++++++++++++ 8 files changed, 206 insertions(+), 1 deletion(-) create mode 100644 crates/dsrs-leaven/src/artifact.rs create mode 100644 crates/dsrs-leaven/src/change.rs create mode 100644 crates/dsrs-leaven/src/evaluator.rs create mode 100644 crates/dsrs-leaven/src/evidence.rs create mode 100644 crates/dsrs-leaven/src/surface.rs diff --git a/Cargo.lock b/Cargo.lock index 2c8bd218..51eccccc 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1388,10 +1388,16 @@ dependencies = [ name = "dsrs-leaven" version = "0.0.0" dependencies = [ + "dsrs-core", + "dsrs-evaluate", + "dsrs-predict", "leaven-core", "leaven-engine", "leaven-evidence", "leaven-surface", + "serde", + "serde_json", + "thiserror 2.0.17", ] [[package]] diff --git a/crates/dsrs-leaven/Cargo.toml b/crates/dsrs-leaven/Cargo.toml index 8605c9a4..d2bef48f 100644 --- a/crates/dsrs-leaven/Cargo.toml +++ b/crates/dsrs-leaven/Cargo.toml @@ -8,7 +8,13 @@ repository = "https://github.com/krypticmouse/DSRs" description = "Leaven integration for DSRs programs." [dependencies] +dsrs-core = { path = "../dsrs-core" } +dsrs-evaluate = { path = "../dsrs-evaluate" } +dsrs-predict = { path = "../dsrs-predict" } leaven-core = { path = "../../../leaven/crates/leaven-core" } leaven-surface = { path = "../../../leaven/crates/leaven-surface" } leaven-engine = { path = "../../../leaven/crates/leaven-engine" } leaven-evidence = { path = "../../../leaven/crates/leaven-evidence" } +serde = { version = "1.0.219", features = ["derive"] } +serde_json = "1.0.140" +thiserror = "2.0.17" diff --git a/crates/dsrs-leaven/src/artifact.rs b/crates/dsrs-leaven/src/artifact.rs new file mode 100644 index 00000000..16f40608 --- /dev/null +++ b/crates/dsrs-leaven/src/artifact.rs @@ -0,0 +1,53 @@ +use std::marker::PhantomData; + +use dsrs_core::{Module, Signature}; + +use crate::{DsrsLeavenError, DsrsProgramChange}; + +#[derive(Debug)] +pub struct DsrsProgramArtifact +where + S: Signature, + M: Module, +{ + _phantom: PhantomData<(S, M)>, +} + +impl Clone for DsrsProgramArtifact +where + S: Signature, + M: Module, +{ + fn clone(&self) -> Self { + Self::scaffold() + } +} + +impl DsrsProgramArtifact +where + S: Signature, + M: Module, +{ + pub const fn scaffold() -> Self { + Self { + _phantom: PhantomData, + } + } +} + +impl leaven_core::Artifact for DsrsProgramArtifact +where + S: Signature, + M: Module + Send + Sync + 'static, +{ + type Change = DsrsProgramChange; + type ApplyError = DsrsLeavenError; + + fn identity(&self) -> leaven_core::ArtifactIdentity { + unimplemented!("dsrs-leaven: artifact identity") + } + + fn apply_change(&self, _change: &Self::Change) -> Result { + unimplemented!("dsrs-leaven: artifact apply_change") + } +} diff --git a/crates/dsrs-leaven/src/change.rs b/crates/dsrs-leaven/src/change.rs new file mode 100644 index 00000000..f95d038a --- /dev/null +++ b/crates/dsrs-leaven/src/change.rs @@ -0,0 +1,5 @@ +#[derive(Clone, Debug, serde::Deserialize, serde::Serialize)] +pub struct DsrsProgramChange { + pub address: String, + pub replacement: serde_json::Value, +} diff --git a/crates/dsrs-leaven/src/evaluator.rs b/crates/dsrs-leaven/src/evaluator.rs new file mode 100644 index 00000000..07fc2282 --- /dev/null +++ b/crates/dsrs-leaven/src/evaluator.rs @@ -0,0 +1,46 @@ +use std::marker::PhantomData; + +use dsrs_core::{Module, Signature}; + +use crate::{DsrsEvidence, DsrsProgramArtifact}; + +#[derive(Clone, Debug)] +pub struct DsrsEvaluator +where + S: Signature, + M: Module, +{ + _phantom: PhantomData<(S, M)>, +} + +impl DsrsEvaluator +where + S: Signature, + M: Module, +{ + pub const fn scaffold() -> Self { + Self { + _phantom: PhantomData, + } + } +} + +#[derive(Clone, Debug)] +pub struct DsrsLeavenProblem +where + S: Signature, + M: Module, +{ + _phantom: PhantomData<(S, M)>, +} + +impl leaven_core::OptimizationProblem for DsrsLeavenProblem +where + S: Signature, + M: Module + Send + Sync + 'static, +{ + type Artifact = DsrsProgramArtifact; + type Case = serde_json::Value; + type Evidence = DsrsEvidence; + type ProposalAnnotations = serde_json::Value; +} diff --git a/crates/dsrs-leaven/src/evidence.rs b/crates/dsrs-leaven/src/evidence.rs new file mode 100644 index 00000000..3a9f18c8 --- /dev/null +++ b/crates/dsrs-leaven/src/evidence.rs @@ -0,0 +1,6 @@ +#[derive(Clone, Debug, serde::Deserialize, serde::Serialize)] +pub struct DsrsEvidence { + pub payload: serde_json::Value, +} + +impl leaven_core::Evidence for DsrsEvidence {} diff --git a/crates/dsrs-leaven/src/lib.rs b/crates/dsrs-leaven/src/lib.rs index 89442986..4fef693b 100644 --- a/crates/dsrs-leaven/src/lib.rs +++ b/crates/dsrs-leaven/src/lib.rs @@ -1 +1,23 @@ -//! Empty placeholder; code is migrated into this crate by a later task. +//! DSRs to leaven integration scaffolding. +//! +//! Bodies are deliberately `unimplemented!()` until the leaven-side optimizer +//! path is real. This crate exists to keep the capability trait signatures +//! compiling against the current leaven crates. + +pub mod artifact; +pub mod change; +pub mod evaluator; +pub mod evidence; +pub mod surface; + +pub use artifact::DsrsProgramArtifact; +pub use change::DsrsProgramChange; +pub use evaluator::{DsrsEvaluator, DsrsLeavenProblem}; +pub use evidence::DsrsEvidence; +pub use surface::DsrsProgramSurface; + +#[derive(Debug, thiserror::Error)] +pub enum DsrsLeavenError { + #[error("dsrs-leaven scaffold is not implemented yet: {0}")] + Unimplemented(&'static str), +} diff --git a/crates/dsrs-leaven/src/surface.rs b/crates/dsrs-leaven/src/surface.rs new file mode 100644 index 00000000..6d73347e --- /dev/null +++ b/crates/dsrs-leaven/src/surface.rs @@ -0,0 +1,61 @@ +use std::marker::PhantomData; + +use dsrs_core::{Module, Signature}; + +use crate::{DsrsProgramArtifact, DsrsProgramChange}; + +#[derive(Clone, Debug)] +pub struct DsrsProgramSurface +where + S: Signature, + M: Module, +{ + _phantom: PhantomData<(S, M)>, +} + +impl DsrsProgramSurface +where + S: Signature, + M: Module, +{ + pub const fn scaffold() -> Self { + Self { + _phantom: PhantomData, + } + } +} + +impl leaven_surface::EditSurface> for DsrsProgramSurface +where + S: Signature, + M: Module + Send + Sync + 'static, +{ + type PartId = String; + type Address = String; + type View<'a> + = &'a str + where + DsrsProgramArtifact: 'a; + type Edit = serde_json::Value; + + fn fingerprint(&self) -> leaven_surface::SurfaceFingerprint { + unimplemented!("dsrs-leaven: surface fingerprint") + } + + fn parts<'a>( + &self, + _artifact: &'a DsrsProgramArtifact, + ) -> Result>>, leaven_surface::SurfaceError> + { + unimplemented!("dsrs-leaven: surface parts") + } + + fn change_part( + &self, + _artifact: &DsrsProgramArtifact, + _id: Self::PartId, + _edit: Self::Edit, + ) -> Result { + unimplemented!("dsrs-leaven: surface change_part") + } +} From 6c138bd93140295f2e9df68996e80b69e7e2c8ae Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 01:27:01 -0700 Subject: [PATCH 14/15] feat: dissolve dspy-rs compatibility crate The split crates now own their tests and examples directly, so the old dspy-rs shell is gone instead of staying as a facade. cargo metadata no longer reports dspy-rs, and cargo check --workspace passes against the new crate graph. Moved the remaining example programs to their owning crates and rewired imports to dsrs-core, dsrs-lm, dsrs-predict, dsrs-evaluate, dsrs-gepa, dsrs-data, and dsrs-trace. The missing telemetry helper now lives in dsrs-trace because examples and runtime tracing need one canonical owner. Tried leaving dsrs-data with narrow defaults, but cargo check --workspace exposed that the current dataloader module still compiles every loader path together. This snapshot defaults dsrs-data to all loader features so the workspace gate is honest; scaffolding remains to split the dataloader module behind per-format cfgs later. Verification: cargo test -p dsrs-predict --no-run; cargo test -p dsrs-evaluate --no-run; cargo test -p dsrs-data --features all --no-run; cargo test -p dsrs-gepa --no-run; cargo test -p dsrs_macros --no-run; cargo check -p dsrs-{predict,evaluate,gepa,lm,trace} --examples; cargo metadata --format-version 1 --no-deps | rg '"name":"dspy-rs"|dspy-rs' (no matches); cargo check --workspace. --- Cargo.lock | 83 +++-- crates/dspy-rs/Cargo.toml | 64 ---- crates/dspy-rs/src/adapter/mod.rs | 2 - crates/dspy-rs/src/core/mod.rs | 40 --- crates/dspy-rs/src/data.rs | 1 - crates/dspy-rs/src/evaluate/mod.rs | 1 - crates/dspy-rs/src/lib.rs | 300 ------------------ crates/dspy-rs/src/modules/mod.rs | 1 - crates/dspy-rs/src/optimizer.rs | 1 - crates/dspy-rs/src/predictors/mod.rs | 1 - crates/dspy-rs/src/trace/mod.rs | 1 - crates/dspy-rs/src/utils/mod.rs | 18 -- crates/dspy-rs/src/utils/serde_utils.rs | 11 - crates/dsrs-core/Cargo.toml | 5 + crates/dsrs-core/src/lib.rs | 83 +++++ .../tests/test_call_outcome.rs | 2 +- .../tests/test_module_ext.rs | 8 +- .../tests/test_module_forward_all.rs | 2 +- .../tests/test_predictions.rs | 2 +- .../tests/test_signature.rs | 2 +- .../tests/test_signature_macro.rs | 2 +- .../tests/test_signature_schema.rs | 2 +- crates/dsrs-data/Cargo.toml | 11 +- .../tests/test_dataloader.rs | 8 +- .../tests/test_example.rs | 6 +- crates/dsrs-evaluate/Cargo.toml | 9 + .../examples/03-evaluate-hotpotqa.rs | 10 +- .../tests/test_evaluate_trainset_typed.rs | 6 +- crates/dsrs-gepa/Cargo.toml | 3 +- .../examples/09-gepa-sentiment.rs | 11 +- .../examples/10-gepa-llm-judge.rs | 11 +- .../{dspy-rs => dsrs-gepa}/tests/test_gepa.rs | 2 +- .../tests/test_gepa_typed_metric_feedback.rs | 8 +- .../tests/test_pareto.rs | 3 +- crates/dsrs-lm/Cargo.toml | 9 + .../examples/11-custom-client.rs | 4 +- crates/dsrs-lm/src/lib.rs | 2 + .../tests/test_adapters.rs | 2 +- .../tests/test_bamltype_docs_contract.rs | 2 +- .../{dspy-rs => dsrs-lm}/tests/test_chat.rs | 2 +- .../tests/test_chat_adapter_schema.rs | 4 +- .../tests/test_chat_prompt_composition.rs | 2 +- .../tests/test_chat_prompt_golden.rs | 2 +- .../tests/test_input_format.rs | 2 +- crates/{dspy-rs => dsrs-lm}/tests/test_lm.rs | 16 +- .../tests/test_message_roundtrip.rs | 2 +- .../tests/test_settings.rs | 2 +- .../tests/test_tool_call.rs | 6 +- .../tests/test_typed_alias.rs | 2 +- .../tests/test_typed_prompt_format.rs | 2 +- crates/dsrs-macros/Cargo.toml | 4 + .../tests/test_bamltype_attr_contract.rs | 5 +- .../tests/test_field_macro.rs | 2 +- .../tests/test_public_api_compile_fail.rs | 19 +- crates/dsrs-predict/Cargo.toml | 7 + .../examples/01-simple.rs | 11 +- .../examples/05-heterogenous-examples.rs | 5 +- .../examples/06-other-providers-batch.rs | 3 +- .../examples/07-inspect-history.rs | 3 +- .../examples/15-tools.rs | 3 +- .../examples/16-insurance-claim-prompt.rs | 3 +- .../examples/90-smoke-slice1-typed-predict.rs | 2 +- .../91-smoke-slice2-chain-of-thought.rs | 2 +- .../92-smoke-slice3-module-authoring.rs | 2 +- .../93-smoke-slice4-react-operational.rs | 2 +- crates/dsrs-predict/src/lib.rs | 3 +- .../tests/test_caller_managed_conversation.rs | 10 +- .../tests/test_chain_of_thought_swap.rs | 6 +- .../tests/test_flatten_roundtrip.rs | 2 +- .../tests/test_module_facet_shapes.rs | 10 +- .../tests/test_predict_conversation.rs | 2 +- .../tests/test_predict_conversation_live.rs | 2 +- .../tests/test_predict_lm_override.rs | 2 +- .../tests/test_react_builder.rs | 2 +- .../tests/test_with_reasoning_deref.rs | 2 +- .../tests/typed_integration.rs | 2 +- crates/dsrs-trace/Cargo.toml | 9 +- .../examples/12-tracing.rs | 12 +- .../examples/17-pretty-tracing.rs | 5 +- crates/dsrs-trace/src/lib.rs | 2 + .../src/utils => dsrs-trace/src}/telemetry.rs | 9 +- 81 files changed, 319 insertions(+), 615 deletions(-) delete mode 100644 crates/dspy-rs/Cargo.toml delete mode 100644 crates/dspy-rs/src/adapter/mod.rs delete mode 100644 crates/dspy-rs/src/core/mod.rs delete mode 100644 crates/dspy-rs/src/data.rs delete mode 100644 crates/dspy-rs/src/evaluate/mod.rs delete mode 100644 crates/dspy-rs/src/lib.rs delete mode 100644 crates/dspy-rs/src/modules/mod.rs delete mode 100644 crates/dspy-rs/src/optimizer.rs delete mode 100644 crates/dspy-rs/src/predictors/mod.rs delete mode 100644 crates/dspy-rs/src/trace/mod.rs delete mode 100644 crates/dspy-rs/src/utils/mod.rs delete mode 100644 crates/dspy-rs/src/utils/serde_utils.rs rename crates/{dspy-rs => dsrs-core}/tests/test_call_outcome.rs (99%) rename crates/{dspy-rs => dsrs-core}/tests/test_module_ext.rs (93%) rename crates/{dspy-rs => dsrs-core}/tests/test_module_forward_all.rs (94%) rename crates/{dspy-rs => dsrs-core}/tests/test_predictions.rs (98%) rename crates/{dspy-rs => dsrs-core}/tests/test_signature.rs (98%) rename crates/{dspy-rs => dsrs-core}/tests/test_signature_macro.rs (98%) rename crates/{dspy-rs => dsrs-core}/tests/test_signature_schema.rs (97%) rename crates/{dspy-rs => dsrs-data}/tests/test_dataloader.rs (98%) rename crates/{dspy-rs => dsrs-data}/tests/test_example.rs (97%) rename crates/{dspy-rs => dsrs-evaluate}/examples/03-evaluate-hotpotqa.rs (85%) rename crates/{dspy-rs => dsrs-evaluate}/tests/test_evaluate_trainset_typed.rs (94%) rename crates/{dspy-rs => dsrs-gepa}/examples/09-gepa-sentiment.rs (93%) rename crates/{dspy-rs => dsrs-gepa}/examples/10-gepa-llm-judge.rs (95%) rename crates/{dspy-rs => dsrs-gepa}/tests/test_gepa.rs (94%) rename crates/{dspy-rs => dsrs-gepa}/tests/test_gepa_typed_metric_feedback.rs (98%) rename crates/{dspy-rs => dsrs-gepa}/tests/test_pareto.rs (96%) rename crates/{dspy-rs => dsrs-lm}/examples/11-custom-client.rs (91%) rename crates/{dspy-rs => dsrs-lm}/tests/test_adapters.rs (98%) rename crates/{dspy-rs => dsrs-lm}/tests/test_bamltype_docs_contract.rs (98%) rename crates/{dspy-rs => dsrs-lm}/tests/test_chat.rs (99%) rename crates/{dspy-rs => dsrs-lm}/tests/test_chat_adapter_schema.rs (94%) rename crates/{dspy-rs => dsrs-lm}/tests/test_chat_prompt_composition.rs (99%) rename crates/{dspy-rs => dsrs-lm}/tests/test_chat_prompt_golden.rs (98%) rename crates/{dspy-rs => dsrs-lm}/tests/test_input_format.rs (99%) rename crates/{dspy-rs => dsrs-lm}/tests/test_lm.rs (96%) rename crates/{dspy-rs => dsrs-lm}/tests/test_message_roundtrip.rs (99%) rename crates/{dspy-rs => dsrs-lm}/tests/test_settings.rs (93%) rename crates/{dspy-rs => dsrs-lm}/tests/test_tool_call.rs (96%) rename crates/{dspy-rs => dsrs-lm}/tests/test_typed_alias.rs (97%) rename crates/{dspy-rs => dsrs-lm}/tests/test_typed_prompt_format.rs (99%) rename crates/{dspy-rs => dsrs-macros}/tests/test_bamltype_attr_contract.rs (84%) rename crates/{dspy-rs => dsrs-macros}/tests/test_field_macro.rs (99%) rename crates/{dspy-rs => dsrs-macros}/tests/test_public_api_compile_fail.rs (82%) rename crates/{dspy-rs => dsrs-predict}/examples/01-simple.rs (97%) rename crates/{dspy-rs => dsrs-predict}/examples/05-heterogenous-examples.rs (92%) rename crates/{dspy-rs => dsrs-predict}/examples/06-other-providers-batch.rs (93%) rename crates/{dspy-rs => dsrs-predict}/examples/07-inspect-history.rs (88%) rename crates/{dspy-rs => dsrs-predict}/examples/15-tools.rs (97%) rename crates/{dspy-rs => dsrs-predict}/examples/16-insurance-claim-prompt.rs (98%) rename crates/{dspy-rs => dsrs-predict}/examples/90-smoke-slice1-typed-predict.rs (93%) rename crates/{dspy-rs => dsrs-predict}/examples/91-smoke-slice2-chain-of-thought.rs (92%) rename crates/{dspy-rs => dsrs-predict}/examples/92-smoke-slice3-module-authoring.rs (93%) rename crates/{dspy-rs => dsrs-predict}/examples/93-smoke-slice4-react-operational.rs (97%) rename crates/{dspy-rs => dsrs-predict}/tests/test_caller_managed_conversation.rs (96%) rename crates/{dspy-rs => dsrs-predict}/tests/test_chain_of_thought_swap.rs (92%) rename crates/{dspy-rs => dsrs-predict}/tests/test_flatten_roundtrip.rs (92%) rename crates/{dspy-rs => dsrs-predict}/tests/test_module_facet_shapes.rs (89%) rename crates/{dspy-rs => dsrs-predict}/tests/test_predict_conversation.rs (99%) rename crates/{dspy-rs => dsrs-predict}/tests/test_predict_conversation_live.rs (95%) rename crates/{dspy-rs => dsrs-predict}/tests/test_predict_lm_override.rs (96%) rename crates/{dspy-rs => dsrs-predict}/tests/test_react_builder.rs (98%) rename crates/{dspy-rs => dsrs-predict}/tests/test_with_reasoning_deref.rs (93%) rename crates/{dspy-rs => dsrs-predict}/tests/typed_integration.rs (99%) rename crates/{dspy-rs => dsrs-trace}/examples/12-tracing.rs (92%) rename crates/{dspy-rs => dsrs-trace}/examples/17-pretty-tracing.rs (88%) rename crates/{dspy-rs/src/utils => dsrs-trace/src}/telemetry.rs (84%) diff --git a/Cargo.lock b/Cargo.lock index 51eccccc..8be90e1d 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1255,51 +1255,6 @@ version = "1.2.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "75b325c5dbd37f80359721ad39aca5a29fb04c89279657cffdda8736d0c0b9d2" -[[package]] -name = "dspy-rs" -version = "0.7.3" -dependencies = [ - "anyhow", - "arrow", - "async-trait", - "bamltype", - "bon", - "csv", - "dsrs-cache", - "dsrs-core", - "dsrs-data", - "dsrs-evaluate", - "dsrs-gepa", - "dsrs-lm", - "dsrs-predict", - "dsrs-trace", - "dsrs_macros", - "enum_dispatch", - "facet", - "foyer", - "futures", - "hf-hub", - "indexmap", - "kdam", - "minijinja", - "parquet", - "rand 0.8.5", - "rayon", - "regex", - "reqwest 0.13.2", - "rig-core", - "rstest 0.25.0", - "schemars", - "serde", - "serde_json", - "temp-env", - "tempfile", - "thiserror 2.0.17", - "tokio", - "tracing", - "tracing-subscriber", -] - [[package]] name = "dsrs-cache" version = "0.0.0" @@ -1323,15 +1278,18 @@ dependencies = [ "async-trait", "bamltype", "bon", + "dsrs_macros", "facet", "futures", "indexmap", "kdam", "rig-core", + "rstest 0.25.0", "schemars", "serde", "serde_json", "thiserror 2.0.17", + "tokio", "tracing", ] @@ -1342,17 +1300,24 @@ dependencies = [ "anyhow", "arrow", "bamltype", + "bon", "csv", "dsrs-core", + "dsrs-evaluate", "dsrs-predict", + "dsrs_macros", + "facet", "hf-hub", "parquet", "rayon", "regex", "reqwest 0.13.2", + "rstest 0.25.0", "serde", "serde_json", + "tempfile", "thiserror 2.0.17", + "tokio", "tracing", ] @@ -1361,9 +1326,16 @@ name = "dsrs-evaluate" version = "0.0.0" dependencies = [ "anyhow", + "bamltype", "dsrs-core", + "dsrs-data", + "dsrs-lm", + "dsrs-predict", + "dsrs-trace", + "dsrs_macros", "serde", "serde_json", + "tokio", ] [[package]] @@ -1376,6 +1348,7 @@ dependencies = [ "dsrs-evaluate", "dsrs-lm", "dsrs-predict", + "dsrs-trace", "dsrs_macros", "facet", "rand 0.8.5", @@ -1409,6 +1382,9 @@ dependencies = [ "bon", "dsrs-cache", "dsrs-core", + "dsrs-predict", + "dsrs-trace", + "dsrs_macros", "enum_dispatch", "facet", "indexmap", @@ -1416,8 +1392,10 @@ dependencies = [ "regex", "reqwest 0.13.2", "rig-core", + "rstest 0.25.0", "serde", "serde_json", + "temp-env", "tokio", "tracing", ] @@ -1428,13 +1406,18 @@ version = "0.0.0" dependencies = [ "anyhow", "bamltype", + "bon", "dsrs-core", "dsrs-lm", "dsrs-trace", "dsrs_macros", "facet", "rig-core", + "rstest 0.25.0", + "serde", "serde_json", + "temp-env", + "tokio", "tracing", ] @@ -1443,10 +1426,15 @@ name = "dsrs-trace" version = "0.0.0" dependencies = [ "anyhow", + "bon", "dsrs-core", + "dsrs-lm", + "dsrs-predict", "serde_json", + "thiserror 2.0.17", "tokio", "tracing", + "tracing-subscriber", ] [[package]] @@ -1455,12 +1443,16 @@ version = "0.7.2" dependencies = [ "bamltype", "dsrs-core", + "facet", "minijinja", "proc-macro-crate", "proc-macro2", "quote", + "rstest 0.25.0", + "schemars", "serde_json", "syn 2.0.106", + "tempfile", "trybuild", ] @@ -4766,7 +4758,6 @@ dependencies = [ "bytes", "libc", "mio", - "parking_lot", "pin-project-lite", "signal-hook-registry", "socket2", diff --git a/crates/dspy-rs/Cargo.toml b/crates/dspy-rs/Cargo.toml deleted file mode 100644 index 10ae7089..00000000 --- a/crates/dspy-rs/Cargo.toml +++ /dev/null @@ -1,64 +0,0 @@ -[package] -name = "dspy-rs" -authors = ["Herumb Shandilya "] -version = "0.7.3" -edition = "2024" -description = "A DSPy rewrite(not port) to Rust." -readme = "../../README.md" -documentation = "https://dsrs.herumbshandilya.com" -homepage = "https://dsrs.herumbshandilya.com" -repository = "https://github.com/krypticmouse/DSRs" -license = "Apache-2.0" -exclude = [ - "docs/*", -] - -[dependencies] -futures = "0.3.31" -indexmap = "2.10.0" -rayon = "1.10.0" -rstest = "0.25.0" -schemars = "1.0.4" -serde = { version = "1.0.219", features = ["derive"] } -serde_json = { version = "1.0.140", features = ["preserve_order"] } -tokio = { version = "1.46.1", features = ["full"] } -async-trait = "0.1.83" -anyhow = "1.0.99" -bon = "3.7.0" -bamltype = { path = "../bamltype" } -dsrs-core = { path = "../dsrs-core" } -dsrs-cache = { path = "../dsrs-cache" } -dsrs-lm = { path = "../dsrs-lm" } -dsrs-predict = { path = "../dsrs-predict" } -dsrs-evaluate = { path = "../dsrs-evaluate" } -dsrs-gepa = { path = "../dsrs-gepa" } -dsrs-data = { path = "../dsrs-data", features = ["all"] } -dsrs-trace = { path = "../dsrs-trace" } -# Keep this direct pin in sync with workspace [patch.crates-io] for self-sufficient external path consumers. -facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } -thiserror = "2.0.17" -dsrs_macros = { version = "0.7.2", path = "../dsrs-macros" } -csv = { version = "1.3.1" } -hf-hub = { version = "0.4.3", features = ["tokio"] } -parquet = { version = "56.1.0" } -arrow = { version = "56.1.0" } -regex = "1.11.2" -reqwest = { version = "0.13", features = ["blocking"] } -kdam = "0.6.3" -rand = "0.8.5" -foyer = { version = "0.20.0", features = ["serde"]} -tempfile = "3.23.0" -rig-core = { git = "https://github.com/0xPlaygrounds/rig", rev="aee3b8bf6576ce41c9ac1dd82520752a65fa0127" } -enum_dispatch = "0.3.13" -tracing = "0.1.44" -tracing-subscriber = { version = "0.3.22", features = ["env-filter", "fmt"] } -minijinja = { git = "https://github.com/boundaryml/minijinja.git", branch = "main", default-features = false, features = ["builtins", "serde"] } - -[package.metadata.cargo-machete] -ignored = ["rig-core"] - -[features] -default = [] - -[dev-dependencies] -temp-env = { version = "0.3.6", features = ["async_closure"] } diff --git a/crates/dspy-rs/src/adapter/mod.rs b/crates/dspy-rs/src/adapter/mod.rs deleted file mode 100644 index 12fa552d..00000000 --- a/crates/dspy-rs/src/adapter/mod.rs +++ /dev/null @@ -1,2 +0,0 @@ -pub use dsrs_lm::adapter::*; -pub use dsrs_lm::chat::*; diff --git a/crates/dspy-rs/src/core/mod.rs b/crates/dspy-rs/src/core/mod.rs deleted file mode 100644 index 4546defc..00000000 --- a/crates/dspy-rs/src/core/mod.rs +++ /dev/null @@ -1,40 +0,0 @@ -//! The foundational abstractions everything else is built on. -//! -//! A [`Signature`] declares what you want the LM to do — input fields, output fields, -//! and an instruction. [`SignatureSchema`] is the Facet-derived metadata for those fields, -//! cached once per type and shared by the adapter and optimizer. [`Module`] is the trait -//! every prompting strategy implements — it's deliberately narrow (`forward` takes an -//! input, returns a predicted output) so that strategies are interchangeable. -//! -//! [`Predicted`] wraps a typed output with [`CallMetadata`] (raw response text, token -//! usage, per-field parse results). The error hierarchy — [`PredictError`], [`ParseError`], -//! [`LmError`] — distinguishes LM failures from parse failures so callers can handle -//! retries differently. [`LM`] is the language model client itself. -//! -//! Optimizer leaf discovery is internal (`visit_named_predictors_mut`) and currently -//! traverses struct fields plus `Option`, `Vec`, `HashMap`, and `Box`. -//! `Rc`/`Arc` wrappers that contain `Predict` leaves are rejected with explicit -//! container errors. -//! -//! Most users import these through the crate root (`use dspy_rs::*`). Module authors -//! who need fine-grained prompt control also use [`SignatureSchema`] and the adapter -//! building blocks directly. - -pub mod lm { - pub use dsrs_lm::lm::*; -} -pub mod settings { - pub use dsrs_lm::settings::*; -} - -pub use dsrs_core::{ - Augmentation, Augmented, BamlConvertError, BamlType, BamlValue, CallMetadata, Constraint, - ConstraintKind, ConstraintLevel, ConstraintResult, ConstraintSpec, ConversionError, - DynPredictor, ErrorClass, Facet, FieldMeta, FieldMetadataSpec, FieldPath, FieldSchema, Flag, - InputRenderSpec, JsonishError, LmError, LmUsage, Module, ModuleExt, NamedParametersError, - OutputFormatContent, ParseError, PredictError, PredictState, Predicted, Prediction, - RawExample, RenderOptions, ResponseCheck, Shape, Signature, SignatureSchema, StreamingMode, - TrackedValue, TypeIR, visit_named_predictors_mut, -}; -pub use lm::*; -pub use settings::*; diff --git a/crates/dspy-rs/src/data.rs b/crates/dspy-rs/src/data.rs deleted file mode 100644 index ae93c622..00000000 --- a/crates/dspy-rs/src/data.rs +++ /dev/null @@ -1 +0,0 @@ -pub use dsrs_data::*; diff --git a/crates/dspy-rs/src/evaluate/mod.rs b/crates/dspy-rs/src/evaluate/mod.rs deleted file mode 100644 index 4af05aa8..00000000 --- a/crates/dspy-rs/src/evaluate/mod.rs +++ /dev/null @@ -1 +0,0 @@ -pub use dsrs_evaluate::*; diff --git a/crates/dspy-rs/src/lib.rs b/crates/dspy-rs/src/lib.rs deleted file mode 100644 index ec76d90d..00000000 --- a/crates/dspy-rs/src/lib.rs +++ /dev/null @@ -1,300 +0,0 @@ -//! Typed prompt engineering and LM program optimization. -//! -//! DSRs is a Rust port of [DSPy](https://github.com/stanfordnlp/dspy): you declare what -//! you want the LM to produce (a [`Signature`]), pick a prompting strategy (a [`Module`] -//! like [`Predict`] or [`ChainOfThought`]), and let an [`Optimizer`] tune the program's -//! instructions and demos on your training data. The type system enforces correctness -//! at every layer — field types, strategy swaps, and augmentation composition are all -//! compile-time checked. -//! -//! # The mental model -//! -//! Three concepts, three layers: -//! -//! | Layer | Concept | Key types | Who | -//! |-------|---------|-----------|-----| -//! | **Signatures** | "Given these inputs, produce these outputs" | [`Signature`], `#[derive(Signature)]` | Everyone | -//! | **Modules** | Prompting strategies that implement a signature | [`Module`], [`Predict`], [`ChainOfThought`] | Everyone | -//! | **Optimization** | Auto-tuning instructions and demos | [`Optimizer`], [`COPRO`], [`GEPA`], [`MIPROv2`] | When you need better results | -//! -//! A [`Predict`] is the leaf — the only thing that actually calls the LM. Every other -//! module ([`ChainOfThought`], custom pipelines) delegates to one or more `Predict` leaves. -//! Optimizers discover these leaves automatically via Facet reflection and mutate their -//! instructions and few-shot demos. -//! -//! # Quick start -//! -//! ```no_run -//! use dspy_rs::*; -//! -//! #[derive(Signature, Clone, Debug)] -//! /// Answer questions accurately and concisely. -//! struct QA { -//! #[input] question: String, -//! #[output] answer: String, -//! } -//! -//! # async fn example() -> Result<(), PredictError> { -//! // 1. Configure the LM -//! let lm = LM::builder() -//! .model("openai:gpt-4o-mini".to_string()) -//! .build() -//! .await -//! .unwrap(); -//! dspy_rs::configure(lm, ChatAdapter); -//! -//! // 2. Pick a strategy -//! let cot = ChainOfThought::::new(); -//! -//! // 3. Call it -//! let result = cot.call(QAInput { question: "What is 2+2?".into() }).await?; -//! println!("{}", result.reasoning); // chain-of-thought text -//! println!("{}", result.answer); // the actual answer, via Deref -//! # Ok(()) -//! # } -//! ``` -//! -//! `ChainOfThought` returns [`Predicted>`](Predicted), not -//! `Predicted`. You access `.reasoning` directly and `.answer` through auto-deref -//! ([`WithReasoning`] derefs to `O`). This pattern holds for all augmentations — the -//! compiler tells you what changed when you swap strategies. -//! -//! # What doesn't work (yet) -//! -//! - **No dynamic graph / structural optimization.** The type-erased `ProgramGraph`, -//! `DynModule`, `StrategyFactory` layer was prototyped and intentionally removed. -//! Everything here is statically typed, which is both the strength and the constraint. -//! - **MIPRO is instruction-only.** It should also mutate demos per-predictor based on -//! trace data — Python DSPy does this — but it doesn't yet (`TODO(trace-demos)`). -//! - **No `ReAct`, `BestOfN`, `Refine`, or other advanced modules** beyond `ChainOfThought`. -//! The module trait and augmentation system are designed for them, but nobody's built -//! them yet. -//! - **`CallMetadata` is not extensible.** Modules can't attach custom metadata (e.g. -//! "which attempt won in BestOfN"). This should probably be a trait with associated -//! types, but it isn't. -//! - **Container traversal is partial.** The optimizer walker handles `Option`, `Vec`, -//! `HashMap`, and `Box`. `Rc`/`Arc` containing `Predict` leaves return -//! explicit container errors (not silent skips), and `Predict` discovery requires -//! a valid shape-local accessor payload (`TODO(dsrs-shared-ptr-policy)`). -//! -//! # Crate organization -//! -//! - [`adapter`] — Prompt formatting and LM response parsing ([`ChatAdapter`]) -//! - [`core`] — [`Module`] trait, [`Signature`] trait, [`SignatureSchema`], error types, -//! LM client, [`Predicted`] and [`CallMetadata`] -//! - [`predictors`] — [`Predict`] (the leaf module) and typed [`Example`] -//! - [`modules`] — [`ChainOfThought`] and augmentation types -//! - [`evaluate`] — [`TypedMetric`] trait, [`evaluate_trainset`], scoring utilities -//! - [`optimizer`] — [`Optimizer`] trait, [`COPRO`], [`GEPA`], [`MIPROv2`] -//! - [`data`] — [`DataLoader`] for JSON/CSV/Parquet/HuggingFace datasets -//! - [`trace`] — Execution graph recording for debugging -//! - [`utils`] — Response caching - -// TODO(dsrs-facet-lint-scope): remove this crate-level allow once Facet's generated -// extension-attr dispatch no longer triggers rust-lang/rust#52234 on in-crate usage. -#![allow(macro_expanded_macro_exports_accessed_by_absolute_paths)] - -extern crate self as dspy_rs; - -pub mod adapter; -pub mod augmentation { - pub use dsrs_core::*; -} -pub mod core; -pub mod data; -pub mod evaluate; -pub mod modules; -pub mod optimizer; -pub mod predictors; -pub mod trace; -pub mod utils; - -pub use adapter::*; -pub use dsrs_core::*; -pub use core::*; -pub use data::dataloader::*; -pub use data::serialize::*; -pub use data::utils::*; -pub use evaluate::*; -pub use modules::*; -pub use optimizer::*; -pub use predictors::*; -pub use utils::*; - -pub use bamltype::BamlConvertError; -pub use bamltype::BamlType; // attribute macro -pub use bamltype::Shape; -pub use bamltype::baml_types::{ - BamlValue, Constraint, ConstraintLevel, ResponseCheck, StreamingMode, TypeIR, -}; -pub use bamltype::internal_baml_jinja::types::{OutputFormatContent, RenderOptions}; -pub use bamltype::jsonish::deserializer::deserialize_flags::Flag; -pub use dsrs_macros::*; -pub use facet::Facet; - -/// Pre-built signature for use in doc examples. Not part of the public API. -#[doc(hidden)] -pub mod doctest { - #[derive(crate::Signature, Clone, Debug)] - /// Answer questions accurately and concisely. - pub struct QA { - #[input] - pub question: String, - #[output] - pub answer: String, - } -} - -#[doc(hidden)] -pub mod __macro_support { - pub use anyhow; - pub use bamltype; - pub use indexmap; - pub use schemars; - pub use serde; - pub use serde_json; -} - -#[macro_export] -macro_rules! field { - // Example Usage: field! { - // input["Description"] => question: String - // } - // - // Example Output: - // - // { - // "question": { - // "type": "String", - // "desc": "Description", - // "schema": "" - // }, - // ... - // } - - // Pattern for field definitions with descriptions - { $($field_type:ident[$desc:literal] => $field_name:ident : $field_ty:ty),* $(,)? } => {{ - use $crate::__macro_support::serde_json::json; - - let mut result = $crate::__macro_support::serde_json::Map::new(); - - $( - let type_str = stringify!($field_ty); - let schema = { - let schema = $crate::__macro_support::schemars::schema_for!($field_ty); - let schema_json = $crate::__macro_support::serde_json::to_value(schema).unwrap(); - // Extract just the properties if it's an object schema - if let Some(obj) = schema_json.as_object() { - if obj.contains_key("properties") { - schema_json["properties"].clone() - } else { - "".to_string().into() - } - } else { - "".to_string().into() - } - }; - result.insert( - stringify!($field_name).to_string(), - json!({ - "type": type_str, - "desc": $desc, - "schema": schema, - "__dsrs_field_type": stringify!($field_type) - }) - ); - )* - - $crate::__macro_support::serde_json::Value::Object(result) - }}; - - // Pattern for field definitions without descriptions - { $($field_type:ident => $field_name:ident : $field_ty:ty),* $(,)? } => {{ - use $crate::__macro_support::serde_json::json; - - let mut result = $crate::__macro_support::serde_json::Map::new(); - - $( - let type_str = stringify!($field_ty); - let schema = { - let schema = $crate::__macro_support::schemars::schema_for!($field_ty); - let schema_json = $crate::__macro_support::serde_json::to_value(schema).unwrap(); - // Extract just the properties if it's an object schema - if let Some(obj) = schema_json.as_object() { - if obj.contains_key("properties") { - schema_json["properties"].clone() - } else { - "".to_string().into() - } - } else { - "".to_string().into() - } - }; - result.insert( - stringify!($field_name).to_string(), - json!({ - "type": type_str, - "desc": "", - "schema": schema, - "__dsrs_field_type": stringify!($field_type) - }) - ); - )* - - $crate::__macro_support::serde_json::Value::Object(result) - }}; -} - -#[macro_export] -macro_rules! sign { - // Example Usage: signature! { - // question: String, random: bool -> answer: String - // } - // - // Example Output: - // - // #[derive(Signature, Clone)] - // struct InlineSignature { - // #[input] - // question: String, - // #[input] - // random: bool, - // #[output] - // answer: String, - // } - // - // Predict::::new() - - // Pattern: input fields -> output fields - { ($($input_name:ident : $input_type:ty),* $(,)?) -> $($output_name:ident : $output_type:ty),* $(,)? } => {{ - #[derive($crate::Signature, Clone)] - struct __InlineSignature { - $( - #[input] - $input_name: $input_type, - )* - $( - #[output] - $output_name: $output_type, - )* - } - - $crate::Predict::<__InlineSignature>::new() - }}; -} - -/// Source: -/// Author: -/// License: MIT -/// Description: This macro creates a HashMap from a list of key-value pairs. -/// Reason for Reuse: Want to avoid adding a dependency for a simple macro. -#[macro_export] -macro_rules! hashmap { - () => { - ::std::collections::HashMap::new() - }; - - ($($key:expr => $value:expr),+ $(,)?) => { - ::std::collections::HashMap::from([ $(($key, $value)),* ]) - }; -} diff --git a/crates/dspy-rs/src/modules/mod.rs b/crates/dspy-rs/src/modules/mod.rs deleted file mode 100644 index 858fedc9..00000000 --- a/crates/dspy-rs/src/modules/mod.rs +++ /dev/null @@ -1 +0,0 @@ -pub use dsrs_predict::{ChainOfThought, ChainOfThoughtOutput, ReAct, Reasoning, WithReasoning}; diff --git a/crates/dspy-rs/src/optimizer.rs b/crates/dspy-rs/src/optimizer.rs deleted file mode 100644 index 191b2540..00000000 --- a/crates/dspy-rs/src/optimizer.rs +++ /dev/null @@ -1 +0,0 @@ -pub use dsrs_gepa::*; diff --git a/crates/dspy-rs/src/predictors/mod.rs b/crates/dspy-rs/src/predictors/mod.rs deleted file mode 100644 index db271454..00000000 --- a/crates/dspy-rs/src/predictors/mod.rs +++ /dev/null @@ -1 +0,0 @@ -pub use dsrs_predict::*; diff --git a/crates/dspy-rs/src/trace/mod.rs b/crates/dspy-rs/src/trace/mod.rs deleted file mode 100644 index 526262cf..00000000 --- a/crates/dspy-rs/src/trace/mod.rs +++ /dev/null @@ -1 +0,0 @@ -pub use dsrs_trace::*; diff --git a/crates/dspy-rs/src/utils/mod.rs b/crates/dspy-rs/src/utils/mod.rs deleted file mode 100644 index b548bede..00000000 --- a/crates/dspy-rs/src/utils/mod.rs +++ /dev/null @@ -1,18 +0,0 @@ -//! LM response caching. -//! -//! The [`ResponseCache`] provides a hybrid memory + disk cache backed by -//! [foyer](https://docs.rs/foyer). It also maintains a sliding window of recent -//! entries for [`LM::inspect_history`](crate::LM::inspect_history). -//! -//! Caching is per-LM-instance and keyed on the full prompt content. Cache entries -//! are not shared across LM instances. - -pub mod cache { - pub use dsrs_cache::*; -} -pub mod serde_utils; -pub mod telemetry; - -pub use cache::{Cache, CacheEntry, ResponseCache}; -pub use serde_utils::get_iter_from_value; -pub use telemetry::{TelemetryInitError, init_tracing, truncate}; diff --git a/crates/dspy-rs/src/utils/serde_utils.rs b/crates/dspy-rs/src/utils/serde_utils.rs deleted file mode 100644 index 9bda5a78..00000000 --- a/crates/dspy-rs/src/utils/serde_utils.rs +++ /dev/null @@ -1,11 +0,0 @@ -pub fn get_iter_from_value( - value: &serde_json::Value, -) -> impl Iterator { - value - .as_object() - .unwrap() - .iter() - .map(|(k, v)| (k.to_string(), v.clone())) - .collect::>() - .into_iter() -} diff --git a/crates/dsrs-core/Cargo.toml b/crates/dsrs-core/Cargo.toml index ac107f94..dc6b28bf 100644 --- a/crates/dsrs-core/Cargo.toml +++ b/crates/dsrs-core/Cargo.toml @@ -12,6 +12,7 @@ anyhow = "1.0.99" async-trait = "0.1.83" bamltype = { path = "../bamltype" } bon = "3.7.0" +dsrs_macros = { version = "0.7.2", path = "../dsrs-macros" } facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } futures = "0.3.31" indexmap = "2.10.0" @@ -22,3 +23,7 @@ serde = { version = "1.0.219", features = ["derive"] } serde_json = { version = "1.0.140", features = ["preserve_order"] } thiserror = "2.0.17" tracing = "0.1.44" + +[dev-dependencies] +rstest = "0.25.0" +tokio = { version = "1.46.1", features = ["macros", "rt", "time"] } diff --git a/crates/dsrs-core/src/lib.rs b/crates/dsrs-core/src/lib.rs index df764619..857e567b 100644 --- a/crates/dsrs-core/src/lib.rs +++ b/crates/dsrs-core/src/lib.rs @@ -40,6 +40,7 @@ pub use bamltype::baml_types::{ }; pub use bamltype::internal_baml_jinja::types::{OutputFormatContent, RenderOptions}; pub use bamltype::jsonish::deserializer::deserialize_flags::Flag; +pub use dsrs_macros::*; pub use facet::Facet; #[doc(hidden)] @@ -52,6 +53,88 @@ pub mod __macro_support { pub use serde_json; } +#[macro_export] +macro_rules! field { + { $($field_type:ident[$desc:literal] => $field_name:ident : $field_ty:ty),* $(,)? } => {{ + use $crate::__macro_support::serde_json::json; + + let mut result = $crate::__macro_support::serde_json::Map::new(); + + $( + let type_str = stringify!($field_ty); + let schema = { + let schema = $crate::__macro_support::schemars::schema_for!($field_ty); + let schema_json = $crate::__macro_support::serde_json::to_value(schema).unwrap(); + if let Some(obj) = schema_json.as_object() { + if obj.contains_key("properties") { + schema_json["properties"].clone() + } else { + "".to_string().into() + } + } else { + "".to_string().into() + } + }; + result.insert( + stringify!($field_name).to_string(), + json!({ + "type": type_str, + "desc": $desc, + "schema": schema, + "__dsrs_field_type": stringify!($field_type) + }) + ); + )* + + $crate::__macro_support::serde_json::Value::Object(result) + }}; + + { $($field_type:ident => $field_name:ident : $field_ty:ty),* $(,)? } => {{ + use $crate::__macro_support::serde_json::json; + + let mut result = $crate::__macro_support::serde_json::Map::new(); + + $( + let type_str = stringify!($field_ty); + let schema = { + let schema = $crate::__macro_support::schemars::schema_for!($field_ty); + let schema_json = $crate::__macro_support::serde_json::to_value(schema).unwrap(); + if let Some(obj) = schema_json.as_object() { + if obj.contains_key("properties") { + schema_json["properties"].clone() + } else { + "".to_string().into() + } + } else { + "".to_string().into() + } + }; + result.insert( + stringify!($field_name).to_string(), + json!({ + "type": type_str, + "desc": "", + "schema": schema, + "__dsrs_field_type": stringify!($field_type) + }) + ); + )* + + $crate::__macro_support::serde_json::Value::Object(result) + }}; +} + +#[macro_export] +macro_rules! hashmap { + () => { + ::std::collections::HashMap::new() + }; + + ($($key:expr => $value:expr),+ $(,)?) => { + ::std::collections::HashMap::from([ $(($key, $value)),* ]) + }; +} + #[derive(Clone, Debug, serde::Serialize)] pub struct TrackedValue { pub value: serde_json::Value, diff --git a/crates/dspy-rs/tests/test_call_outcome.rs b/crates/dsrs-core/tests/test_call_outcome.rs similarity index 99% rename from crates/dspy-rs/tests/test_call_outcome.rs rename to crates/dsrs-core/tests/test_call_outcome.rs index ee8183c1..effb2e81 100644 --- a/crates/dspy-rs/tests/test_call_outcome.rs +++ b/crates/dsrs-core/tests/test_call_outcome.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ +use dsrs_core::{ CallMetadata, ConstraintResult, FieldMeta, LmUsage, ParseError, PredictError, Predicted, }; use indexmap::IndexMap; diff --git a/crates/dspy-rs/tests/test_module_ext.rs b/crates/dsrs-core/tests/test_module_ext.rs similarity index 93% rename from crates/dspy-rs/tests/test_module_ext.rs rename to crates/dsrs-core/tests/test_module_ext.rs index c7bb1c16..1bea0e97 100644 --- a/crates/dspy-rs/tests/test_module_ext.rs +++ b/crates/dsrs-core/tests/test_module_ext.rs @@ -1,4 +1,4 @@ -use dspy_rs::{BamlType, CallMetadata, Module, ModuleExt, ParseError, PredictError, Predicted}; +use dsrs_core::{BamlType, CallMetadata, Module, ModuleExt, ParseError, PredictError, Predicted}; struct MaybeFails; @@ -22,7 +22,7 @@ impl Module for MaybeFails { let input_value = input.value; let metadata = CallMetadata::new( format!("raw:{input_value}"), - dspy_rs::LmUsage::default(), + dsrs_core::LmUsage::default(), Vec::new(), Vec::new(), Some(input_value.max(0) as usize), @@ -36,7 +36,7 @@ impl Module for MaybeFails { raw_response: format!("raw:{input_value}"), }, raw_response: format!("raw:{input_value}"), - lm_usage: dspy_rs::LmUsage::default(), + lm_usage: dsrs_core::LmUsage::default(), }) } else { Ok(Predicted::new( @@ -65,7 +65,7 @@ fn transform_int_payload(value: IntPayload) -> Result raw_response: "transform".to_string(), }, raw_response: "transform".to_string(), - lm_usage: dspy_rs::LmUsage::default(), + lm_usage: dsrs_core::LmUsage::default(), }) } } diff --git a/crates/dspy-rs/tests/test_module_forward_all.rs b/crates/dsrs-core/tests/test_module_forward_all.rs similarity index 94% rename from crates/dspy-rs/tests/test_module_forward_all.rs rename to crates/dsrs-core/tests/test_module_forward_all.rs index a2376455..63495bf2 100644 --- a/crates/dspy-rs/tests/test_module_forward_all.rs +++ b/crates/dsrs-core/tests/test_module_forward_all.rs @@ -1,6 +1,6 @@ use std::time::Duration; -use dspy_rs::{BamlType, CallMetadata, Module, PredictError, Predicted, forward_all}; +use dsrs_core::{BamlType, CallMetadata, Module, PredictError, Predicted, forward_all}; use tokio::time::sleep; struct DelayEcho; diff --git a/crates/dspy-rs/tests/test_predictions.rs b/crates/dsrs-core/tests/test_predictions.rs similarity index 98% rename from crates/dspy-rs/tests/test_predictions.rs rename to crates/dsrs-core/tests/test_predictions.rs index 815ebdba..4fd9aa03 100644 --- a/crates/dspy-rs/tests/test_predictions.rs +++ b/crates/dsrs-core/tests/test_predictions.rs @@ -1,7 +1,7 @@ use rstest::*; use serde_json::json; -use dspy_rs::{LmUsage, Prediction}; +use dsrs_core::{LmUsage, Prediction}; use std::collections::HashMap; #[rstest] diff --git a/crates/dspy-rs/tests/test_signature.rs b/crates/dsrs-core/tests/test_signature.rs similarity index 98% rename from crates/dspy-rs/tests/test_signature.rs rename to crates/dsrs-core/tests/test_signature.rs index 26becc59..cd6d23ac 100644 --- a/crates/dspy-rs/tests/test_signature.rs +++ b/crates/dsrs-core/tests/test_signature.rs @@ -1,4 +1,4 @@ -use dspy_rs::Signature; +use dsrs_core::Signature; #[derive(Signature, Clone, Debug)] struct BasicSignature { diff --git a/crates/dspy-rs/tests/test_signature_macro.rs b/crates/dsrs-core/tests/test_signature_macro.rs similarity index 98% rename from crates/dspy-rs/tests/test_signature_macro.rs rename to crates/dsrs-core/tests/test_signature_macro.rs index 01f527d4..0f446b86 100644 --- a/crates/dspy-rs/tests/test_signature_macro.rs +++ b/crates/dsrs-core/tests/test_signature_macro.rs @@ -1,4 +1,4 @@ -use dspy_rs::{InputRenderSpec, Signature}; +use dsrs_core::{InputRenderSpec, Signature}; #[derive(Signature, Clone, Debug)] struct AliasAndFormatSignature { diff --git a/crates/dspy-rs/tests/test_signature_schema.rs b/crates/dsrs-core/tests/test_signature_schema.rs similarity index 97% rename from crates/dspy-rs/tests/test_signature_schema.rs rename to crates/dsrs-core/tests/test_signature_schema.rs index bb0a9eb5..366dbd0b 100644 --- a/crates/dspy-rs/tests/test_signature_schema.rs +++ b/crates/dsrs-core/tests/test_signature_schema.rs @@ -1,4 +1,4 @@ -use dspy_rs::{BamlType, Signature, SignatureSchema}; +use dsrs_core::{BamlType, Signature, SignatureSchema}; #[derive(Clone, Debug)] #[BamlType] diff --git a/crates/dsrs-data/Cargo.toml b/crates/dsrs-data/Cargo.toml index 594a4a6d..d5ec99b9 100644 --- a/crates/dsrs-data/Cargo.toml +++ b/crates/dsrs-data/Cargo.toml @@ -25,9 +25,18 @@ thiserror = "2.0.17" tracing = "0.1.44" [features] -default = ["csv", "json"] +default = ["all"] all = ["csv", "json", "parquet", "hf-hub"] csv = ["dep:csv"] json = [] parquet = ["dep:arrow", "dep:parquet"] hf-hub = ["dep:hf-hub", "dep:reqwest", "parquet"] + +[dev-dependencies] +bon = "3.7.0" +dsrs-evaluate = { path = "../dsrs-evaluate" } +dsrs_macros = { version = "0.7.2", path = "../dsrs-macros" } +facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } +rstest = "0.25.0" +tempfile = "3.23.0" +tokio = { version = "1.46.1", features = ["macros", "rt"] } diff --git a/crates/dspy-rs/tests/test_dataloader.rs b/crates/dsrs-data/tests/test_dataloader.rs similarity index 98% rename from crates/dspy-rs/tests/test_dataloader.rs rename to crates/dsrs-data/tests/test_dataloader.rs index 55b1f1aa..15dc624b 100644 --- a/crates/dspy-rs/tests/test_dataloader.rs +++ b/crates/dsrs-data/tests/test_dataloader.rs @@ -3,10 +3,10 @@ use arrow::array::{ArrayRef, Int64Array, StringArray}; use arrow::datatypes::{DataType, Field, Schema}; use arrow::record_batch::RecordBatch; use bon::Builder; -use dspy_rs::{ - CallMetadata, DataLoader, Example, MetricOutcome, Module, Predict, PredictError, Predicted, - Signature, TypedLoadOptions, TypedMetric, UnknownFieldPolicy, average_score, evaluate_trainset, -}; +use dsrs_core::{CallMetadata, Example, Module, PredictError, Predicted, Signature}; +use dsrs_data::{DataLoader, TypedLoadOptions, UnknownFieldPolicy}; +use dsrs_evaluate::{MetricOutcome, TypedMetric, average_score, evaluate_trainset}; +use dsrs_predict::Predict; use parquet::arrow::ArrowWriter; use std::collections::HashMap; use std::fs; diff --git a/crates/dspy-rs/tests/test_example.rs b/crates/dsrs-data/tests/test_example.rs similarity index 97% rename from crates/dspy-rs/tests/test_example.rs rename to crates/dsrs-data/tests/test_example.rs index 439ce535..57157ffa 100644 --- a/crates/dspy-rs/tests/test_example.rs +++ b/crates/dsrs-data/tests/test_example.rs @@ -1,6 +1,6 @@ -use dspy_rs::data::example::Example; -use dspy_rs::data::serialize::{load_jsonl, save_examples_as_jsonl}; -use dspy_rs::hashmap; +use dsrs_data::example::Example; +use dsrs_data::serialize::{load_jsonl, save_examples_as_jsonl}; +use dsrs_core::hashmap; use rstest::*; #[rstest] diff --git a/crates/dsrs-evaluate/Cargo.toml b/crates/dsrs-evaluate/Cargo.toml index d284bba2..29814923 100644 --- a/crates/dsrs-evaluate/Cargo.toml +++ b/crates/dsrs-evaluate/Cargo.toml @@ -12,3 +12,12 @@ anyhow = "1.0.99" dsrs-core = { path = "../dsrs-core" } serde = { version = "1.0.219", features = ["derive"] } serde_json = { version = "1.0.140", features = ["preserve_order"] } + +[dev-dependencies] +bamltype = { path = "../bamltype" } +dsrs-data = { path = "../dsrs-data", features = ["all"] } +dsrs-lm = { path = "../dsrs-lm" } +dsrs_macros = { version = "0.7.2", path = "../dsrs-macros" } +dsrs-predict = { path = "../dsrs-predict" } +dsrs-trace = { path = "../dsrs-trace" } +tokio = { version = "1.46.1", features = ["macros", "rt", "rt-multi-thread"] } diff --git a/crates/dspy-rs/examples/03-evaluate-hotpotqa.rs b/crates/dsrs-evaluate/examples/03-evaluate-hotpotqa.rs similarity index 85% rename from crates/dspy-rs/examples/03-evaluate-hotpotqa.rs rename to crates/dsrs-evaluate/examples/03-evaluate-hotpotqa.rs index f9cf6a69..0f6308cc 100644 --- a/crates/dspy-rs/examples/03-evaluate-hotpotqa.rs +++ b/crates/dsrs-evaluate/examples/03-evaluate-hotpotqa.rs @@ -8,10 +8,12 @@ cargo run --example 03-evaluate-hotpotqa --features dataloaders */ use anyhow::Result; -use dspy_rs::{ - ChatAdapter, DataLoader, Example, LM, MetricOutcome, Predict, Predicted, Signature, - TypedLoadOptions, TypedMetric, average_score, configure, evaluate_trainset, init_tracing, -}; +use dsrs_core::{Example, Predicted, Signature}; +use dsrs_data::{DataLoader, TypedLoadOptions}; +use dsrs_evaluate::{MetricOutcome, TypedMetric, average_score, evaluate_trainset}; +use dsrs_lm::{ChatAdapter, LM, configure}; +use dsrs_predict::Predict; +use dsrs_trace::init_tracing; #[derive(Signature, Clone, Debug)] struct QA { diff --git a/crates/dspy-rs/tests/test_evaluate_trainset_typed.rs b/crates/dsrs-evaluate/tests/test_evaluate_trainset_typed.rs similarity index 94% rename from crates/dspy-rs/tests/test_evaluate_trainset_typed.rs rename to crates/dsrs-evaluate/tests/test_evaluate_trainset_typed.rs index 95e3f26b..5d4fe8fe 100644 --- a/crates/dspy-rs/tests/test_evaluate_trainset_typed.rs +++ b/crates/dsrs-evaluate/tests/test_evaluate_trainset_typed.rs @@ -1,8 +1,6 @@ use anyhow::{Result, anyhow}; -use dspy_rs::{ - CallMetadata, Example, MetricOutcome, Module, PredictError, Predicted, Signature, TypedMetric, - average_score, evaluate_trainset, -}; +use dsrs_core::{CallMetadata, Example, Module, PredictError, Predicted, Signature}; +use dsrs_evaluate::{MetricOutcome, TypedMetric, average_score, evaluate_trainset}; use std::sync::{Arc, Mutex}; #[derive(Signature, Clone, Debug)] diff --git a/crates/dsrs-gepa/Cargo.toml b/crates/dsrs-gepa/Cargo.toml index f4f9d02a..5909a892 100644 --- a/crates/dsrs-gepa/Cargo.toml +++ b/crates/dsrs-gepa/Cargo.toml @@ -21,4 +21,5 @@ tracing = "0.1.44" [dev-dependencies] dsrs_macros = { version = "0.7.2", path = "../dsrs-macros" } -tokio = { version = "1.46.1", features = ["macros", "rt"] } +dsrs-trace = { path = "../dsrs-trace" } +tokio = { version = "1.46.1", features = ["macros", "rt", "rt-multi-thread"] } diff --git a/crates/dspy-rs/examples/09-gepa-sentiment.rs b/crates/dsrs-gepa/examples/09-gepa-sentiment.rs similarity index 93% rename from crates/dspy-rs/examples/09-gepa-sentiment.rs rename to crates/dsrs-gepa/examples/09-gepa-sentiment.rs index 515fe70b..c697a511 100644 --- a/crates/dspy-rs/examples/09-gepa-sentiment.rs +++ b/crates/dsrs-gepa/examples/09-gepa-sentiment.rs @@ -9,11 +9,12 @@ OPENAI_API_KEY=your_key cargo run --example 09-gepa-sentiment use anyhow::Result; use bon::Builder; -use dspy_rs::{ - ChatAdapter, Example, FeedbackMetric, GEPA, LM, MetricOutcome, Module, Optimizer, Predict, - PredictError, Predicted, Signature, TypedMetric, average_score, configure, evaluate_trainset, - init_tracing, -}; +use dsrs_core::{Example, Module, PredictError, Predicted, Signature}; +use dsrs_evaluate::{FeedbackMetric, MetricOutcome, TypedMetric, average_score, evaluate_trainset}; +use dsrs_gepa::{GEPA, Optimizer}; +use dsrs_lm::{ChatAdapter, LM, configure}; +use dsrs_predict::Predict; +use dsrs_trace::init_tracing; #[derive(Signature, Clone, Debug)] struct SentimentSignature { diff --git a/crates/dspy-rs/examples/10-gepa-llm-judge.rs b/crates/dsrs-gepa/examples/10-gepa-llm-judge.rs similarity index 95% rename from crates/dspy-rs/examples/10-gepa-llm-judge.rs rename to crates/dsrs-gepa/examples/10-gepa-llm-judge.rs index 95255284..f0e977d8 100644 --- a/crates/dspy-rs/examples/10-gepa-llm-judge.rs +++ b/crates/dsrs-gepa/examples/10-gepa-llm-judge.rs @@ -9,11 +9,12 @@ OPENAI_API_KEY=your_key cargo run --example 10-gepa-llm-judge use anyhow::Result; use bon::Builder; -use dspy_rs::{ - ChatAdapter, Example, FeedbackMetric, GEPA, LM, MetricOutcome, Module, Optimizer, Predict, - PredictError, Predicted, Signature, TypedMetric, average_score, configure, evaluate_trainset, - init_tracing, -}; +use dsrs_core::{Example, Module, PredictError, Predicted, Signature}; +use dsrs_evaluate::{FeedbackMetric, MetricOutcome, TypedMetric, average_score, evaluate_trainset}; +use dsrs_gepa::{GEPA, Optimizer}; +use dsrs_lm::{ChatAdapter, LM, configure}; +use dsrs_predict::Predict; +use dsrs_trace::init_tracing; #[derive(Signature, Clone, Debug)] struct MathWordProblem { diff --git a/crates/dspy-rs/tests/test_gepa.rs b/crates/dsrs-gepa/tests/test_gepa.rs similarity index 94% rename from crates/dspy-rs/tests/test_gepa.rs rename to crates/dsrs-gepa/tests/test_gepa.rs index 0f081f5b..aa063ee4 100644 --- a/crates/dspy-rs/tests/test_gepa.rs +++ b/crates/dsrs-gepa/tests/test_gepa.rs @@ -1,4 +1,4 @@ -use dspy_rs::optimizer::gepa::GEPACandidate; +use dsrs_gepa::GEPACandidate; #[test] fn test_candidate_creation() { diff --git a/crates/dspy-rs/tests/test_gepa_typed_metric_feedback.rs b/crates/dsrs-gepa/tests/test_gepa_typed_metric_feedback.rs similarity index 98% rename from crates/dspy-rs/tests/test_gepa_typed_metric_feedback.rs rename to crates/dsrs-gepa/tests/test_gepa_typed_metric_feedback.rs index f6ab2a63..8e304171 100644 --- a/crates/dspy-rs/tests/test_gepa_typed_metric_feedback.rs +++ b/crates/dsrs-gepa/tests/test_gepa_typed_metric_feedback.rs @@ -1,8 +1,8 @@ use anyhow::Result; -use dspy_rs::{ - CallMetadata, Example, FeedbackMetric, GEPA, MetricOutcome, Module, Optimizer, Predict, - PredictError, Predicted, Signature, TypedMetric, -}; +use dsrs_core::{CallMetadata, Example, Module, PredictError, Predicted, Signature}; +use dsrs_evaluate::{FeedbackMetric, MetricOutcome, TypedMetric}; +use dsrs_gepa::{GEPA, Optimizer}; +use dsrs_predict::Predict; use std::sync::atomic::{AtomicUsize, Ordering}; use std::sync::{Arc, Mutex}; diff --git a/crates/dspy-rs/tests/test_pareto.rs b/crates/dsrs-gepa/tests/test_pareto.rs similarity index 96% rename from crates/dspy-rs/tests/test_pareto.rs rename to crates/dsrs-gepa/tests/test_pareto.rs index a7c5f7bb..62367049 100644 --- a/crates/dspy-rs/tests/test_pareto.rs +++ b/crates/dsrs-gepa/tests/test_pareto.rs @@ -1,5 +1,4 @@ -use dspy_rs::optimizer::gepa::GEPACandidate; -use dspy_rs::optimizer::pareto::ParetoFrontier; +use dsrs_gepa::{GEPACandidate, ParetoFrontier}; fn make_test_candidate(instruction: &str) -> GEPACandidate { GEPACandidate { diff --git a/crates/dsrs-lm/Cargo.toml b/crates/dsrs-lm/Cargo.toml index 4de1cb6e..05d94212 100644 --- a/crates/dsrs-lm/Cargo.toml +++ b/crates/dsrs-lm/Cargo.toml @@ -24,3 +24,12 @@ serde = { version = "1.0.219", features = ["derive"] } serde_json = { version = "1.0.140", features = ["preserve_order"] } tokio = { version = "1.46.1", features = ["sync"] } tracing = "0.1.44" + +[dev-dependencies] +dsrs_macros = { version = "0.7.2", path = "../dsrs-macros" } +dsrs-predict = { path = "../dsrs-predict" } +dsrs-trace = { path = "../dsrs-trace" } +rig-core = { git = "https://github.com/0xPlaygrounds/rig", rev = "aee3b8bf6576ce41c9ac1dd82520752a65fa0127" } +rstest = "0.25.0" +temp-env = { version = "0.3.6", features = ["async_closure"] } +tokio = { version = "1.46.1", features = ["macros", "rt", "sync"] } diff --git a/crates/dspy-rs/examples/11-custom-client.rs b/crates/dsrs-lm/examples/11-custom-client.rs similarity index 91% rename from crates/dspy-rs/examples/11-custom-client.rs rename to crates/dsrs-lm/examples/11-custom-client.rs index 8bdcb6b0..111b763f 100644 --- a/crates/dspy-rs/examples/11-custom-client.rs +++ b/crates/dsrs-lm/examples/11-custom-client.rs @@ -8,7 +8,9 @@ cargo run --example 11-custom-client */ use anyhow::Result; -use dspy_rs::{ChatAdapter, LM, LMClient, Predict, Signature, configure, init_tracing}; +use dsrs_lm::{ChatAdapter, LM, LMClient, Signature, configure}; +use dsrs_predict::Predict; +use dsrs_trace::init_tracing; use rig::providers::azure; use std::env; diff --git a/crates/dsrs-lm/src/lib.rs b/crates/dsrs-lm/src/lib.rs index ba5fc6d3..dccac4d5 100644 --- a/crates/dsrs-lm/src/lib.rs +++ b/crates/dsrs-lm/src/lib.rs @@ -7,5 +7,7 @@ pub mod settings; pub use adapter::*; pub use chat::*; +pub use dsrs_cache::*; +pub use dsrs_core::*; pub use lm::*; pub use settings::*; diff --git a/crates/dspy-rs/tests/test_adapters.rs b/crates/dsrs-lm/tests/test_adapters.rs similarity index 98% rename from crates/dspy-rs/tests/test_adapters.rs rename to crates/dsrs-lm/tests/test_adapters.rs index 65ee7279..bf43197f 100644 --- a/crates/dspy-rs/tests/test_adapters.rs +++ b/crates/dsrs-lm/tests/test_adapters.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ChatAdapter, Message, Signature}; +use dsrs_lm::{ChatAdapter, Message, Signature}; #[derive(Signature, Clone, Debug, PartialEq)] struct BasicSignature { diff --git a/crates/dspy-rs/tests/test_bamltype_docs_contract.rs b/crates/dsrs-lm/tests/test_bamltype_docs_contract.rs similarity index 98% rename from crates/dspy-rs/tests/test_bamltype_docs_contract.rs rename to crates/dsrs-lm/tests/test_bamltype_docs_contract.rs index f1216b73..ed8364e5 100644 --- a/crates/dspy-rs/tests/test_bamltype_docs_contract.rs +++ b/crates/dsrs-lm/tests/test_bamltype_docs_contract.rs @@ -1,5 +1,5 @@ use bamltype::HoistClasses; -use dspy_rs::{BamlType, ChatAdapter, RenderOptions, Signature}; +use dsrs_lm::{BamlType, ChatAdapter, RenderOptions, Signature}; #[derive(Clone, Debug)] #[BamlType] diff --git a/crates/dspy-rs/tests/test_chat.rs b/crates/dsrs-lm/tests/test_chat.rs similarity index 99% rename from crates/dspy-rs/tests/test_chat.rs rename to crates/dsrs-lm/tests/test_chat.rs index fd8e0fe9..9644d304 100644 --- a/crates/dspy-rs/tests/test_chat.rs +++ b/crates/dsrs-lm/tests/test_chat.rs @@ -1,4 +1,4 @@ -use dspy_rs::core::lm::chat::{Chat, ContentBlock, Message, Role}; +use dsrs_lm::{Chat, ContentBlock, Message, Role}; use rig::OneOrMany; use rig::message::{ AssistantContent, Message as RigMessage, Reasoning, ToolCall, ToolFunction, ToolResult, diff --git a/crates/dspy-rs/tests/test_chat_adapter_schema.rs b/crates/dsrs-lm/tests/test_chat_adapter_schema.rs similarity index 94% rename from crates/dspy-rs/tests/test_chat_adapter_schema.rs rename to crates/dsrs-lm/tests/test_chat_adapter_schema.rs index 388218a7..c8526198 100644 --- a/crates/dspy-rs/tests/test_chat_adapter_schema.rs +++ b/crates/dsrs-lm/tests/test_chat_adapter_schema.rs @@ -1,4 +1,4 @@ -use dspy_rs::{CallMetadata, ChatAdapter, Message, Predicted, Signature}; +use dsrs_lm::{CallMetadata, ChatAdapter, Message, Predicted, Signature}; #[derive(Signature, Clone, Debug)] /// Adapter schema parse fixture. @@ -36,7 +36,7 @@ fn parse_response_typed_uses_schema_field_names() { let metadata = CallMetadata::new( response.content(), - dspy_rs::LmUsage::default(), + dsrs_lm::LmUsage::default(), Vec::new(), Vec::new(), None, diff --git a/crates/dspy-rs/tests/test_chat_prompt_composition.rs b/crates/dsrs-lm/tests/test_chat_prompt_composition.rs similarity index 99% rename from crates/dspy-rs/tests/test_chat_prompt_composition.rs rename to crates/dsrs-lm/tests/test_chat_prompt_composition.rs index e216c15a..76fbb791 100644 --- a/crates/dspy-rs/tests/test_chat_prompt_composition.rs +++ b/crates/dsrs-lm/tests/test_chat_prompt_composition.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ChatAdapter, Example, Signature}; +use dsrs_lm::{ChatAdapter, Example, Signature}; #[derive(Signature, Clone, Debug)] /// Answer the prompt using the provided context. diff --git a/crates/dspy-rs/tests/test_chat_prompt_golden.rs b/crates/dsrs-lm/tests/test_chat_prompt_golden.rs similarity index 98% rename from crates/dspy-rs/tests/test_chat_prompt_golden.rs rename to crates/dsrs-lm/tests/test_chat_prompt_golden.rs index 0cca5ece..a5f10d46 100644 --- a/crates/dspy-rs/tests/test_chat_prompt_golden.rs +++ b/crates/dsrs-lm/tests/test_chat_prompt_golden.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ChatAdapter, Example, Signature}; +use dsrs_lm::{ChatAdapter, Example, Signature}; #[derive(Signature, Clone, Debug)] struct GoldenSig { diff --git a/crates/dspy-rs/tests/test_input_format.rs b/crates/dsrs-lm/tests/test_input_format.rs similarity index 99% rename from crates/dspy-rs/tests/test_input_format.rs rename to crates/dsrs-lm/tests/test_input_format.rs index 8170ae9c..c71ec819 100644 --- a/crates/dspy-rs/tests/test_input_format.rs +++ b/crates/dsrs-lm/tests/test_input_format.rs @@ -1,4 +1,4 @@ -use dspy_rs::{BamlType, BamlValue, ChatAdapter, Signature}; +use dsrs_lm::{BamlType, BamlValue, ChatAdapter, Signature}; #[derive(Clone, Debug)] #[BamlType] diff --git a/crates/dspy-rs/tests/test_lm.rs b/crates/dsrs-lm/tests/test_lm.rs similarity index 96% rename from crates/dspy-rs/tests/test_lm.rs rename to crates/dsrs-lm/tests/test_lm.rs index 9fed309c..f739454d 100644 --- a/crates/dspy-rs/tests/test_lm.rs +++ b/crates/dsrs-lm/tests/test_lm.rs @@ -1,5 +1,5 @@ -use dspy_rs::data::RawExample; -use dspy_rs::{Cache, Chat, DummyLM, LM, LmUsage, Message, hashmap}; +use dsrs_lm::RawExample; +use dsrs_lm::{Cache, Chat, DummyLM, LM, LmUsage, Message, hashmap}; use rstest::*; #[cfg_attr(miri, ignore)] // Miri doesn't support tokio's I/O driver @@ -135,8 +135,8 @@ async fn test_lm_cache_initialization_on_first_call() { #[tokio::test] #[cfg_attr(miri, ignore)] async fn test_lm_cache_direct_operations() { - use dspy_rs::Prediction; - use dspy_rs::data::RawExample; + use dsrs_lm::Prediction; + use dsrs_lm::RawExample; use std::collections::HashMap; let lm = temp_env::async_with_vars( @@ -175,7 +175,7 @@ async fn test_lm_cache_direct_operations() { // Create a channel to send the result let (tx, rx) = tokio::sync::mpsc::channel(1); - use dspy_rs::utils::cache::CacheEntry; + use dsrs_lm::CacheEntry; tx.send(CacheEntry { prompt: "test prompt".to_string(), prediction: value.clone(), @@ -229,8 +229,8 @@ async fn test_lm_cache_with_different_models() { #[tokio::test] #[cfg_attr(miri, ignore)] async fn test_cache_with_complex_inputs() { - use dspy_rs::Prediction; - use dspy_rs::data::RawExample; + use dsrs_lm::Prediction; + use dsrs_lm::RawExample; use std::collections::HashMap; let lm = temp_env::async_with_vars( @@ -292,7 +292,7 @@ async fn test_cache_with_complex_inputs() { // Insert and retrieve let (tx, rx) = tokio::sync::mpsc::channel(1); - use dspy_rs::utils::cache::CacheEntry; + use dsrs_lm::CacheEntry; tx.send(CacheEntry { prompt: "complex test prompt".to_string(), prediction: value.clone(), diff --git a/crates/dspy-rs/tests/test_message_roundtrip.rs b/crates/dsrs-lm/tests/test_message_roundtrip.rs similarity index 99% rename from crates/dspy-rs/tests/test_message_roundtrip.rs rename to crates/dsrs-lm/tests/test_message_roundtrip.rs index 96461483..5d40f428 100644 --- a/crates/dspy-rs/tests/test_message_roundtrip.rs +++ b/crates/dsrs-lm/tests/test_message_roundtrip.rs @@ -4,7 +4,7 @@ //! all content through: DSRs Message → rig Message → DSRs Message, and //! through JSON serialization/deserialization. -use dspy_rs::core::lm::chat::{Chat, ContentBlock, Message, Role}; +use dsrs_lm::{Chat, ContentBlock, Message, Role}; use rig::OneOrMany; use rig::message::{ Message as RigMessage, Reasoning, ToolCall, ToolFunction, ToolResult, ToolResultContent, diff --git a/crates/dspy-rs/tests/test_settings.rs b/crates/dsrs-lm/tests/test_settings.rs similarity index 93% rename from crates/dspy-rs/tests/test_settings.rs rename to crates/dsrs-lm/tests/test_settings.rs index 3bc328fd..50aab421 100644 --- a/crates/dspy-rs/tests/test_settings.rs +++ b/crates/dsrs-lm/tests/test_settings.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ChatAdapter, LM, configure, get_lm}; +use dsrs_lm::{ChatAdapter, LM, configure, get_lm}; #[tokio::test] #[cfg_attr(miri, ignore)] diff --git a/crates/dspy-rs/tests/test_tool_call.rs b/crates/dsrs-lm/tests/test_tool_call.rs similarity index 96% rename from crates/dspy-rs/tests/test_tool_call.rs rename to crates/dsrs-lm/tests/test_tool_call.rs index 80a8b741..20cf3c7b 100644 --- a/crates/dspy-rs/tests/test_tool_call.rs +++ b/crates/dsrs-lm/tests/test_tool_call.rs @@ -1,4 +1,4 @@ -use dspy_rs::{Chat, LM, Message}; +use dsrs_lm::{Chat, LM, Message}; use rig::completion::ToolDefinition; use rig::tool::ToolDyn; use std::error::Error; @@ -108,7 +108,7 @@ async fn test_tool_call_with_no_tools() { } let response = response.unwrap(); - assert_eq!(response.output.role, dspy_rs::Role::Assistant); + assert_eq!(response.output.role, dsrs_lm::Role::Assistant); let content = response.output.content(); // The response should contain some mention of 4 println!("Assistant response: {}", content); @@ -137,7 +137,7 @@ async fn test_tool_call_with_calculator() { // Call with the calculator tool let response = lm.call(chat, tools).await.unwrap(); - assert_eq!(response.output.role, dspy_rs::Role::Assistant); + assert_eq!(response.output.role, dsrs_lm::Role::Assistant); let content = response.output.content(); println!("Assistant response after tool use: {}", content); // The response should mention the result (100) or that the tool was called diff --git a/crates/dspy-rs/tests/test_typed_alias.rs b/crates/dsrs-lm/tests/test_typed_alias.rs similarity index 97% rename from crates/dspy-rs/tests/test_typed_alias.rs rename to crates/dsrs-lm/tests/test_typed_alias.rs index 55118527..b3060a22 100644 --- a/crates/dspy-rs/tests/test_typed_alias.rs +++ b/crates/dsrs-lm/tests/test_typed_alias.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ChatAdapter, Message, Signature}; +use dsrs_lm::{ChatAdapter, Message, Signature}; #[derive(Signature, Clone, Debug)] /// Provide an answer using aliases. diff --git a/crates/dspy-rs/tests/test_typed_prompt_format.rs b/crates/dsrs-lm/tests/test_typed_prompt_format.rs similarity index 99% rename from crates/dspy-rs/tests/test_typed_prompt_format.rs rename to crates/dsrs-lm/tests/test_typed_prompt_format.rs index c8e9f7dd..8f9aa5d7 100644 --- a/crates/dspy-rs/tests/test_typed_prompt_format.rs +++ b/crates/dsrs-lm/tests/test_typed_prompt_format.rs @@ -3,7 +3,7 @@ reason = "Signature derive emits multi-field constructors for schema coverage tests." )] -use dspy_rs::{BamlType, ChatAdapter, Signature}; +use dsrs_lm::{BamlType, ChatAdapter, Signature}; #[derive(Clone, Debug)] #[BamlType] diff --git a/crates/dsrs-macros/Cargo.toml b/crates/dsrs-macros/Cargo.toml index d633513b..c010bcde 100644 --- a/crates/dsrs-macros/Cargo.toml +++ b/crates/dsrs-macros/Cargo.toml @@ -24,4 +24,8 @@ minijinja = { git = "https://github.com/boundaryml/minijinja.git", branch = "mai [dev-dependencies] bamltype = { path = "../bamltype" } dsrs-core = { path = "../dsrs-core" } +facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03e63659db34a947989b45c8a5", default-features = false, features = ["std"] } +rstest = "0.25.0" +schemars = "1.0.4" +tempfile = "3.23.0" trybuild = "1.0.110" diff --git a/crates/dspy-rs/tests/test_bamltype_attr_contract.rs b/crates/dsrs-macros/tests/test_bamltype_attr_contract.rs similarity index 84% rename from crates/dspy-rs/tests/test_bamltype_attr_contract.rs rename to crates/dsrs-macros/tests/test_bamltype_attr_contract.rs index 1d661b3b..85e1640a 100644 --- a/crates/dspy-rs/tests/test_bamltype_attr_contract.rs +++ b/crates/dsrs-macros/tests/test_bamltype_attr_contract.rs @@ -1,4 +1,5 @@ -use dspy_rs::{BamlType, RenderOptions}; +use dsrs_core::{BamlType, RenderOptions}; +use facet as _; #[derive(Debug, Clone, PartialEq)] #[BamlType] @@ -10,7 +11,7 @@ struct DsrsUser { } #[test] -fn bamltype_attribute_macro_works_from_dspy_rs() { +fn bamltype_attribute_macro_works_from_dsrs_core() { let schema = ::baml_output_format() .render(RenderOptions::default()) .expect("render schema") diff --git a/crates/dspy-rs/tests/test_field_macro.rs b/crates/dsrs-macros/tests/test_field_macro.rs similarity index 99% rename from crates/dspy-rs/tests/test_field_macro.rs rename to crates/dsrs-macros/tests/test_field_macro.rs index 3b7ae167..f909b252 100644 --- a/crates/dspy-rs/tests/test_field_macro.rs +++ b/crates/dsrs-macros/tests/test_field_macro.rs @@ -1,4 +1,4 @@ -use dspy_rs::field; +use dsrs_core::field; use rstest::*; use serde_json::json; diff --git a/crates/dspy-rs/tests/test_public_api_compile_fail.rs b/crates/dsrs-macros/tests/test_public_api_compile_fail.rs similarity index 82% rename from crates/dspy-rs/tests/test_public_api_compile_fail.rs rename to crates/dsrs-macros/tests/test_public_api_compile_fail.rs index e515284f..aa941a8b 100644 --- a/crates/dspy-rs/tests/test_public_api_compile_fail.rs +++ b/crates/dsrs-macros/tests/test_public_api_compile_fail.rs @@ -8,9 +8,15 @@ fn run_compile_fail_case(name: &str, source: &str) -> String { fs::create_dir_all(case_dir.join("src")).expect("case src dir should be creatable"); let manifest_path = Path::new(env!("CARGO_MANIFEST_DIR")); + let crates_dir = manifest_path + .parent() + .expect("dsrs-macros should live under crates/"); let cargo_toml = format!( - "[package]\nname = \"{name}\"\nversion = \"0.1.0\"\nedition = \"2024\"\n\n[dependencies]\ndspy-rs = {{ path = \"{}\" }}\nanyhow = \"1\"\n", - manifest_path.display() + "[package]\nname = \"{name}\"\nversion = \"0.1.0\"\nedition = \"2024\"\n\n[dependencies]\nanyhow = \"1\"\ndsrs-core = {{ path = \"{}\" }}\ndsrs-evaluate = {{ path = \"{}\" }}\ndsrs-gepa = {{ path = \"{}\" }}\ndsrs-predict = {{ path = \"{}\" }}\n", + crates_dir.join("dsrs-core").display(), + crates_dir.join("dsrs-evaluate").display(), + crates_dir.join("dsrs-gepa").display(), + crates_dir.join("dsrs-predict").display(), ); fs::write(case_dir.join("Cargo.toml"), cargo_toml).expect("cargo manifest should be writable"); @@ -44,7 +50,7 @@ fn dyn_predictor_is_not_publicly_importable() { let stderr = run_compile_fail_case( "private_dyn_predictor_case", r#" -use dspy_rs::DynPredictor; +use dsrs_core::DynPredictor; fn main() { let _ = std::any::type_name::>(); @@ -65,7 +71,7 @@ fn named_parameters_is_not_publicly_importable() { let stderr = run_compile_fail_case( "private_named_parameters_case", r#" -use dspy_rs::named_parameters; +use dsrs_core::named_parameters; fn main() { let _ = named_parameters; @@ -87,7 +93,10 @@ fn optimizer_compile_rejects_wrong_signature_input_type() { "wrong_signature_case", r#" use anyhow::Result; -use dspy_rs::{ChainOfThought, Example, GEPA, MetricOutcome, Optimizer, Predicted, Signature, TypedMetric, WithReasoning}; +use dsrs_core::{Example, Predicted, Signature}; +use dsrs_evaluate::{MetricOutcome, TypedMetric}; +use dsrs_gepa::{GEPA, Optimizer}; +use dsrs_predict::{ChainOfThought, WithReasoning}; #[derive(Signature, Clone, Debug)] struct RightSig { diff --git a/crates/dsrs-predict/Cargo.toml b/crates/dsrs-predict/Cargo.toml index a7eccdb6..9e0dac94 100644 --- a/crates/dsrs-predict/Cargo.toml +++ b/crates/dsrs-predict/Cargo.toml @@ -18,3 +18,10 @@ facet = { git = "https://github.com/darinkishore/facet", rev = "cc8613c97cd1ec03 rig-core = { git = "https://github.com/0xPlaygrounds/rig", rev = "aee3b8bf6576ce41c9ac1dd82520752a65fa0127" } serde_json = { version = "1.0.140", features = ["preserve_order"] } tracing = "0.1.44" + +[dev-dependencies] +bon = "3.7.0" +rstest = "0.25.0" +serde = { version = "1.0.219", features = ["derive"] } +temp-env = { version = "0.3.6", features = ["async_closure"] } +tokio = { version = "1.46.1", features = ["macros", "rt", "sync", "time"] } diff --git a/crates/dspy-rs/examples/01-simple.rs b/crates/dsrs-predict/examples/01-simple.rs similarity index 97% rename from crates/dspy-rs/examples/01-simple.rs rename to crates/dsrs-predict/examples/01-simple.rs index d662d92e..cf93afd4 100644 --- a/crates/dspy-rs/examples/01-simple.rs +++ b/crates/dsrs-predict/examples/01-simple.rs @@ -15,16 +15,17 @@ cargo run --example 01-simple use anyhow::Result; use bon::Builder; -use dspy_rs::data::RawExample; -use dspy_rs::{ +use dsrs_core::RawExample; +use dsrs_predict::{ CallMetadata, ChatAdapter, Example, LM, LmError, Module, Predict, PredictError, Predicted, - Prediction, configure, init_tracing, + Prediction, configure, }; +use dsrs_trace::init_tracing; const QA_INSTRUCTION: &str = "Answer the question step by step."; const RATE_INSTRUCTION: &str = "Rate the answer on a scale of 1 (very bad) to 10 (very good)."; -#[derive(dspy_rs::Signature, Clone, Debug)] +#[derive(dsrs_core::Signature, Clone, Debug)] pub struct QA { #[input] pub question: String, @@ -36,7 +37,7 @@ pub struct QA { pub answer: String, } -#[derive(dspy_rs::Signature, Clone, Debug)] +#[derive(dsrs_core::Signature, Clone, Debug)] pub struct Rate { #[input] pub question: String, diff --git a/crates/dspy-rs/examples/05-heterogenous-examples.rs b/crates/dsrs-predict/examples/05-heterogenous-examples.rs similarity index 92% rename from crates/dspy-rs/examples/05-heterogenous-examples.rs rename to crates/dsrs-predict/examples/05-heterogenous-examples.rs index d32d01ea..510018e8 100644 --- a/crates/dspy-rs/examples/05-heterogenous-examples.rs +++ b/crates/dsrs-predict/examples/05-heterogenous-examples.rs @@ -8,8 +8,9 @@ cargo run --example 05-heterogenous-examples */ use anyhow::Result; -use dspy_rs::data::RawExample; -use dspy_rs::{ChatAdapter, LM, Predict, Signature, configure, init_tracing}; +use dsrs_core::RawExample; +use dsrs_predict::{ChatAdapter, LM, Predict, Signature, configure}; +use dsrs_trace::init_tracing; use serde_json::json; use std::collections::HashMap; diff --git a/crates/dspy-rs/examples/06-other-providers-batch.rs b/crates/dsrs-predict/examples/06-other-providers-batch.rs similarity index 93% rename from crates/dspy-rs/examples/06-other-providers-batch.rs rename to crates/dsrs-predict/examples/06-other-providers-batch.rs index 57cf792b..ed7ec638 100644 --- a/crates/dspy-rs/examples/06-other-providers-batch.rs +++ b/crates/dsrs-predict/examples/06-other-providers-batch.rs @@ -8,7 +8,8 @@ cargo run --example 06-other-providers-batch */ use anyhow::Result; -use dspy_rs::{ChatAdapter, LM, Predict, Signature, configure, forward_all, init_tracing}; +use dsrs_predict::{ChatAdapter, LM, Predict, Signature, configure, forward_all}; +use dsrs_trace::init_tracing; #[derive(Signature, Clone, Debug)] struct QA { diff --git a/crates/dspy-rs/examples/07-inspect-history.rs b/crates/dsrs-predict/examples/07-inspect-history.rs similarity index 88% rename from crates/dspy-rs/examples/07-inspect-history.rs rename to crates/dsrs-predict/examples/07-inspect-history.rs index b15b5cec..4cde051d 100644 --- a/crates/dspy-rs/examples/07-inspect-history.rs +++ b/crates/dsrs-predict/examples/07-inspect-history.rs @@ -8,7 +8,8 @@ cargo run --example 07-inspect-history */ use anyhow::Result; -use dspy_rs::{ChatAdapter, LM, Predict, Signature, configure, get_lm, init_tracing}; +use dsrs_predict::{ChatAdapter, LM, Predict, Signature, configure, get_lm}; +use dsrs_trace::init_tracing; #[derive(Signature, Clone, Debug)] struct QA { diff --git a/crates/dspy-rs/examples/15-tools.rs b/crates/dsrs-predict/examples/15-tools.rs similarity index 97% rename from crates/dspy-rs/examples/15-tools.rs rename to crates/dsrs-predict/examples/15-tools.rs index c2170238..af74781f 100644 --- a/crates/dspy-rs/examples/15-tools.rs +++ b/crates/dsrs-predict/examples/15-tools.rs @@ -8,7 +8,8 @@ cargo run --example 15-tools */ use anyhow::Result; -use dspy_rs::{ChatAdapter, LM, Predict, Signature, configure, init_tracing}; +use dsrs_predict::{ChatAdapter, LM, Predict, Signature, configure}; +use dsrs_trace::init_tracing; use rig::completion::ToolDefinition; use rig::tool::Tool; use serde::{Deserialize, Serialize}; diff --git a/crates/dspy-rs/examples/16-insurance-claim-prompt.rs b/crates/dsrs-predict/examples/16-insurance-claim-prompt.rs similarity index 98% rename from crates/dspy-rs/examples/16-insurance-claim-prompt.rs rename to crates/dsrs-predict/examples/16-insurance-claim-prompt.rs index da712dc9..6dd626e6 100644 --- a/crates/dspy-rs/examples/16-insurance-claim-prompt.rs +++ b/crates/dsrs-predict/examples/16-insurance-claim-prompt.rs @@ -5,7 +5,8 @@ Run with: cargo run --example 16-insurance-claim-prompt */ -use dspy_rs::{BamlType, ChatAdapter, Signature, init_tracing}; +use dsrs_predict::{BamlType, ChatAdapter, Signature}; +use dsrs_trace::init_tracing; // Keep the example self-contained; dates are represented as YYYY-MM-DD strings. type NaiveDate = String; diff --git a/crates/dspy-rs/examples/90-smoke-slice1-typed-predict.rs b/crates/dsrs-predict/examples/90-smoke-slice1-typed-predict.rs similarity index 93% rename from crates/dspy-rs/examples/90-smoke-slice1-typed-predict.rs rename to crates/dsrs-predict/examples/90-smoke-slice1-typed-predict.rs index 7b485756..d857a870 100644 --- a/crates/dspy-rs/examples/90-smoke-slice1-typed-predict.rs +++ b/crates/dsrs-predict/examples/90-smoke-slice1-typed-predict.rs @@ -1,5 +1,5 @@ use anyhow::{Result, bail}; -use dspy_rs::{ChatAdapter, LM, Predict, PredictError, Signature, configure}; +use dsrs_predict::{ChatAdapter, LM, Predict, PredictError, Signature, configure}; #[derive(Signature, Clone, Debug)] struct SmokeSig { diff --git a/crates/dspy-rs/examples/91-smoke-slice2-chain-of-thought.rs b/crates/dsrs-predict/examples/91-smoke-slice2-chain-of-thought.rs similarity index 92% rename from crates/dspy-rs/examples/91-smoke-slice2-chain-of-thought.rs rename to crates/dsrs-predict/examples/91-smoke-slice2-chain-of-thought.rs index 12b90e56..3d8c04af 100644 --- a/crates/dspy-rs/examples/91-smoke-slice2-chain-of-thought.rs +++ b/crates/dsrs-predict/examples/91-smoke-slice2-chain-of-thought.rs @@ -1,5 +1,5 @@ use anyhow::{Result, bail}; -use dspy_rs::{ChainOfThought, ChatAdapter, LM, PredictError, Signature, configure}; +use dsrs_predict::{ChainOfThought, ChatAdapter, LM, PredictError, Signature, configure}; #[derive(Signature, Clone, Debug)] struct SmokeSig { diff --git a/crates/dspy-rs/examples/92-smoke-slice3-module-authoring.rs b/crates/dsrs-predict/examples/92-smoke-slice3-module-authoring.rs similarity index 93% rename from crates/dspy-rs/examples/92-smoke-slice3-module-authoring.rs rename to crates/dsrs-predict/examples/92-smoke-slice3-module-authoring.rs index 50da034c..683bbd24 100644 --- a/crates/dspy-rs/examples/92-smoke-slice3-module-authoring.rs +++ b/crates/dsrs-predict/examples/92-smoke-slice3-module-authoring.rs @@ -1,5 +1,5 @@ use anyhow::{Result, bail}; -use dspy_rs::{ChatAdapter, LM, Module, Predict, PredictError, Predicted, Signature, configure}; +use dsrs_predict::{ChatAdapter, LM, Module, Predict, PredictError, Predicted, Signature, configure}; #[derive(Signature, Clone, Debug)] struct SmokeSig { diff --git a/crates/dspy-rs/examples/93-smoke-slice4-react-operational.rs b/crates/dsrs-predict/examples/93-smoke-slice4-react-operational.rs similarity index 97% rename from crates/dspy-rs/examples/93-smoke-slice4-react-operational.rs rename to crates/dsrs-predict/examples/93-smoke-slice4-react-operational.rs index 90c358f7..0e290814 100644 --- a/crates/dspy-rs/examples/93-smoke-slice4-react-operational.rs +++ b/crates/dsrs-predict/examples/93-smoke-slice4-react-operational.rs @@ -1,5 +1,5 @@ use anyhow::{Result, bail}; -use dspy_rs::{ChatAdapter, LM, PredictError, ReAct, Signature, configure, forward_all}; +use dsrs_predict::{ChatAdapter, LM, PredictError, ReAct, Signature, configure, forward_all}; use serde_json::Value; #[derive(Signature, Clone, Debug)] diff --git a/crates/dsrs-predict/src/lib.rs b/crates/dsrs-predict/src/lib.rs index 360b1c69..5cf0efb2 100644 --- a/crates/dsrs-predict/src/lib.rs +++ b/crates/dsrs-predict/src/lib.rs @@ -5,6 +5,7 @@ pub mod predict; pub mod react; pub use chain_of_thought::{ChainOfThought, ChainOfThoughtOutput, Reasoning, WithReasoning}; -pub use dsrs_core::Example; +pub use dsrs_core::*; +pub use dsrs_lm::*; pub use predict::*; pub use react::ReAct; diff --git a/crates/dspy-rs/tests/test_caller_managed_conversation.rs b/crates/dsrs-predict/tests/test_caller_managed_conversation.rs similarity index 96% rename from crates/dspy-rs/tests/test_caller_managed_conversation.rs rename to crates/dsrs-predict/tests/test_caller_managed_conversation.rs index dc2f9b2f..72367535 100644 --- a/crates/dspy-rs/tests/test_caller_managed_conversation.rs +++ b/crates/dsrs-predict/tests/test_caller_managed_conversation.rs @@ -3,9 +3,9 @@ //! This is the RLM critical path: the caller controls tool execution and //! manages the conversation loop, not the LM layer's auto tool loop. -use dspy_rs::{ - ChatAdapter, LM, LMClient, Message, Predict, Role, Signature, TestCompletionModel, - ToolLoopMode, configure, +use dsrs_predict::{ + Chat, ChatAdapter, LM, LMClient, Message, Predict, PredictError, Role, Signature, + TestCompletionModel, ToolLoopMode, configure, }; use rig::completion::AssistantContent; use rig::message::{Text, ToolCall, ToolFunction}; @@ -137,7 +137,7 @@ async fn lm_caller_managed_returns_tool_calls_in_chat_history() { let (lm, _client) = build_test_lm(vec![tool_call_content]).await; - let chat = dspy_rs::Chat::new(vec![Message::user("Run some code")]); + let chat = Chat::new(vec![Message::user("Run some code")]); let response = lm .call_with_tool_loop_mode(chat, vec![], ToolLoopMode::CallerManaged) .await @@ -190,7 +190,7 @@ async fn parse_failure_on_second_turn_includes_correct_raw_response() { .expect_err("second turn should fail"); match err { - dspy_rs::PredictError::Parse { + PredictError::Parse { raw_response, source, .. diff --git a/crates/dspy-rs/tests/test_chain_of_thought_swap.rs b/crates/dsrs-predict/tests/test_chain_of_thought_swap.rs similarity index 92% rename from crates/dspy-rs/tests/test_chain_of_thought_swap.rs rename to crates/dsrs-predict/tests/test_chain_of_thought_swap.rs index 80819747..0bb68ec9 100644 --- a/crates/dspy-rs/tests/test_chain_of_thought_swap.rs +++ b/crates/dsrs-predict/tests/test_chain_of_thought_swap.rs @@ -1,5 +1,5 @@ -use dspy_rs::{ - ChainOfThought, ChatAdapter, LM, LMClient, Module, Predict, Reasoning, Signature, +use dsrs_predict::{ + Augmented, ChainOfThought, ChatAdapter, LM, LMClient, Module, Predict, Reasoning, Signature, TestCompletionModel, WithReasoning, configure, }; use rig::completion::AssistantContent; @@ -73,5 +73,5 @@ async fn chain_of_thought_swaps_and_returns_with_reasoning() { assert_eq!(result.reasoning, "Think"); assert_eq!(result.answer, "Paris"); - let _predict = Predict::>::new(); + let _predict = Predict::>::new(); } diff --git a/crates/dspy-rs/tests/test_flatten_roundtrip.rs b/crates/dsrs-predict/tests/test_flatten_roundtrip.rs similarity index 92% rename from crates/dspy-rs/tests/test_flatten_roundtrip.rs rename to crates/dsrs-predict/tests/test_flatten_roundtrip.rs index 78874ff9..83640ae5 100644 --- a/crates/dspy-rs/tests/test_flatten_roundtrip.rs +++ b/crates/dsrs-predict/tests/test_flatten_roundtrip.rs @@ -1,4 +1,4 @@ -use dspy_rs::{Augmented, ChatAdapter, Example, Message, Reasoning, Signature, WithReasoning}; +use dsrs_predict::{Augmented, ChatAdapter, Example, Message, Reasoning, Signature, WithReasoning}; #[derive(Signature, Clone, Debug)] struct QA { diff --git a/crates/dspy-rs/tests/test_module_facet_shapes.rs b/crates/dsrs-predict/tests/test_module_facet_shapes.rs similarity index 89% rename from crates/dspy-rs/tests/test_module_facet_shapes.rs rename to crates/dsrs-predict/tests/test_module_facet_shapes.rs index 9aaa8d07..3d77996b 100644 --- a/crates/dspy-rs/tests/test_module_facet_shapes.rs +++ b/crates/dsrs-predict/tests/test_module_facet_shapes.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ChainOfThought, Facet, ModuleExt, PredictError, ReAct, Signature}; +use dsrs_predict::{ChainOfThought, Facet, ModuleExt, PredictError, ReAct, Signature}; use facet::{self, Type, UserType}; #[derive(Signature, Clone, Debug, facet::Facet)] @@ -41,7 +41,7 @@ fn find_field(shape: &'static facet::Shape, name: &str) -> &'static facet::Field }) } -fn drop_reasoning(output: dspy_rs::WithReasoning) -> QAOutput { +fn drop_reasoning(output: dsrs_predict::WithReasoning) -> QAOutput { output.inner } @@ -50,7 +50,7 @@ fn drop_reasoning(output: dspy_rs::WithReasoning) -> QAOutput { reason = "Test verifies ModuleExt::and_then shape with the crate's public PredictError." )] fn drop_reasoning_checked( - output: dspy_rs::WithReasoning, + output: dsrs_predict::WithReasoning, ) -> Result { Ok(output.inner) } @@ -86,7 +86,7 @@ fn react_shape_exposes_action_and_extract_and_skips_non_parameters() { #[test] fn map_shape_exposes_inner_chain_of_thought_shape() { let mapped = ChainOfThought::::new() - .map(drop_reasoning as fn(dspy_rs::WithReasoning) -> QAOutput); + .map(drop_reasoning as fn(dsrs_predict::WithReasoning) -> QAOutput); let map_shape = shape_of(&mapped); let inner = find_field(map_shape, "inner"); @@ -101,7 +101,7 @@ fn map_shape_exposes_inner_chain_of_thought_shape() { fn and_then_shape_exposes_inner_chain_of_thought_shape() { let chained = ChainOfThought::::new().and_then( drop_reasoning_checked - as fn(dspy_rs::WithReasoning) -> Result, + as fn(dsrs_predict::WithReasoning) -> Result, ); let and_then_shape = shape_of(&chained); let inner = find_field(and_then_shape, "inner"); diff --git a/crates/dspy-rs/tests/test_predict_conversation.rs b/crates/dsrs-predict/tests/test_predict_conversation.rs similarity index 99% rename from crates/dspy-rs/tests/test_predict_conversation.rs rename to crates/dsrs-predict/tests/test_predict_conversation.rs index 5ec381a4..be364c71 100644 --- a/crates/dspy-rs/tests/test_predict_conversation.rs +++ b/crates/dsrs-predict/tests/test_predict_conversation.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ +use dsrs_predict::{ ChatAdapter, LM, LMClient, Message, Predict, Role, Signature, TestCompletionModel, configure, }; use rig::completion::{AssistantContent, CompletionRequest}; diff --git a/crates/dspy-rs/tests/test_predict_conversation_live.rs b/crates/dsrs-predict/tests/test_predict_conversation_live.rs similarity index 95% rename from crates/dspy-rs/tests/test_predict_conversation_live.rs rename to crates/dsrs-predict/tests/test_predict_conversation_live.rs index 984fe7a7..45c7c1ea 100644 --- a/crates/dspy-rs/tests/test_predict_conversation_live.rs +++ b/crates/dsrs-predict/tests/test_predict_conversation_live.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ChatAdapter, LM, Message, Predict, Signature, configure}; +use dsrs_predict::{ChatAdapter, LM, Message, Predict, Signature, configure}; use std::sync::LazyLock; use tokio::sync::Mutex; diff --git a/crates/dspy-rs/tests/test_predict_lm_override.rs b/crates/dsrs-predict/tests/test_predict_lm_override.rs similarity index 96% rename from crates/dspy-rs/tests/test_predict_lm_override.rs rename to crates/dsrs-predict/tests/test_predict_lm_override.rs index 4d2ab610..9caafe10 100644 --- a/crates/dspy-rs/tests/test_predict_lm_override.rs +++ b/crates/dsrs-predict/tests/test_predict_lm_override.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ChatAdapter, LM, LMClient, Predict, Signature, TestCompletionModel, configure}; +use dsrs_predict::{ChatAdapter, LM, LMClient, Predict, Signature, TestCompletionModel, configure}; use rig::completion::AssistantContent; use rig::message::Text; use std::sync::LazyLock; diff --git a/crates/dspy-rs/tests/test_react_builder.rs b/crates/dsrs-predict/tests/test_react_builder.rs similarity index 98% rename from crates/dspy-rs/tests/test_react_builder.rs rename to crates/dsrs-predict/tests/test_react_builder.rs index 40f69e33..3bd2e3ee 100644 --- a/crates/dspy-rs/tests/test_react_builder.rs +++ b/crates/dsrs-predict/tests/test_react_builder.rs @@ -1,7 +1,7 @@ use std::sync::LazyLock; use std::sync::atomic::{AtomicUsize, Ordering}; -use dspy_rs::{ChatAdapter, LM, LMClient, ReAct, Signature, TestCompletionModel, configure}; +use dsrs_predict::{ChatAdapter, LM, LMClient, ReAct, Signature, TestCompletionModel, configure}; use rig::completion::AssistantContent; use rig::message::Text; use serde_json::Value; diff --git a/crates/dspy-rs/tests/test_with_reasoning_deref.rs b/crates/dsrs-predict/tests/test_with_reasoning_deref.rs similarity index 93% rename from crates/dspy-rs/tests/test_with_reasoning_deref.rs rename to crates/dsrs-predict/tests/test_with_reasoning_deref.rs index 7cc03493..96a48ce7 100644 --- a/crates/dspy-rs/tests/test_with_reasoning_deref.rs +++ b/crates/dsrs-predict/tests/test_with_reasoning_deref.rs @@ -1,4 +1,4 @@ -use dspy_rs::{Signature, WithReasoning}; +use dsrs_predict::{Signature, WithReasoning}; #[derive(Signature, Clone, Debug, PartialEq)] #[expect( diff --git a/crates/dspy-rs/tests/typed_integration.rs b/crates/dsrs-predict/tests/typed_integration.rs similarity index 99% rename from crates/dspy-rs/tests/typed_integration.rs rename to crates/dsrs-predict/tests/typed_integration.rs index 973dc0a4..cabb5997 100644 --- a/crates/dspy-rs/tests/typed_integration.rs +++ b/crates/dsrs-predict/tests/typed_integration.rs @@ -1,4 +1,4 @@ -use dspy_rs::{ +use dsrs_predict::{ ChatAdapter, LM, LMClient, ParseError, Predict, PredictError, Signature, TestCompletionModel, configure, }; diff --git a/crates/dsrs-trace/Cargo.toml b/crates/dsrs-trace/Cargo.toml index 8c81af74..9d011825 100644 --- a/crates/dsrs-trace/Cargo.toml +++ b/crates/dsrs-trace/Cargo.toml @@ -11,5 +11,12 @@ description = "DSRs tracing and execution graph support." anyhow = "1.0.99" dsrs-core = { path = "../dsrs-core" } serde_json = { version = "1.0.140", features = ["preserve_order"] } -tokio = { version = "1.46.1", features = ["rt", "macros", "sync"] } +thiserror = "2.0.17" +tokio = { version = "1.46.1", features = ["rt", "rt-multi-thread", "macros", "sync"] } tracing = "0.1.44" +tracing-subscriber = { version = "0.3.20", features = ["env-filter", "fmt"] } + +[dev-dependencies] +bon = "3.7.0" +dsrs-lm = { path = "../dsrs-lm" } +dsrs-predict = { path = "../dsrs-predict" } diff --git a/crates/dspy-rs/examples/12-tracing.rs b/crates/dsrs-trace/examples/12-tracing.rs similarity index 92% rename from crates/dspy-rs/examples/12-tracing.rs rename to crates/dsrs-trace/examples/12-tracing.rs index f1e3d412..b9179ff9 100644 --- a/crates/dspy-rs/examples/12-tracing.rs +++ b/crates/dsrs-trace/examples/12-tracing.rs @@ -9,12 +9,12 @@ cargo run --example 12-tracing use anyhow::Result; use bon::Builder; -use dspy_rs::data::RawExample; -use dspy_rs::{ - CallMetadata, ChatAdapter, LM, LmUsage, Module, Predict, PredictError, Predicted, Prediction, - Signature, configure, init_tracing, - trace::{self, Executor}, +use dsrs_core::{ + CallMetadata, LmUsage, Module, PredictError, Predicted, Prediction, RawExample, Signature, }; +use dsrs_lm::{ChatAdapter, LM, configure}; +use dsrs_predict::Predict; +use dsrs_trace::{Executor, init_tracing, trace}; use serde_json::json; use std::collections::HashMap; @@ -102,7 +102,7 @@ async fn main() -> Result<()> { let module = QARater::builder().build(); println!("Starting trace..."); - let (result, graph) = trace::trace(|| async { + let (result, graph) = trace(|| async { module .call(QASignatureInput { question: "Hello".to_string(), diff --git a/crates/dspy-rs/examples/17-pretty-tracing.rs b/crates/dsrs-trace/examples/17-pretty-tracing.rs similarity index 88% rename from crates/dspy-rs/examples/17-pretty-tracing.rs rename to crates/dsrs-trace/examples/17-pretty-tracing.rs index 1dbc112c..68c9a287 100644 --- a/crates/dspy-rs/examples/17-pretty-tracing.rs +++ b/crates/dsrs-trace/examples/17-pretty-tracing.rs @@ -1,6 +1,7 @@ use anyhow::Result; -use dspy_rs::data::RawExample; -use dspy_rs::{Chat, DummyLM, Message, hashmap, init_tracing}; +use dsrs_core::{RawExample, hashmap}; +use dsrs_lm::{Chat, DummyLM, Message}; +use dsrs_trace::init_tracing; #[tokio::main] async fn main() -> Result<()> { diff --git a/crates/dsrs-trace/src/lib.rs b/crates/dsrs-trace/src/lib.rs index a4571119..0a66a3d8 100644 --- a/crates/dsrs-trace/src/lib.rs +++ b/crates/dsrs-trace/src/lib.rs @@ -3,9 +3,11 @@ pub mod context; pub mod dag; pub mod executor; +pub mod telemetry; pub mod value; pub use context::*; pub use dag::*; pub use executor::*; +pub use telemetry::*; pub use value::*; diff --git a/crates/dspy-rs/src/utils/telemetry.rs b/crates/dsrs-trace/src/telemetry.rs similarity index 84% rename from crates/dspy-rs/src/utils/telemetry.rs rename to crates/dsrs-trace/src/telemetry.rs index 43cb8184..b815e25b 100644 --- a/crates/dspy-rs/src/utils/telemetry.rs +++ b/crates/dsrs-trace/src/telemetry.rs @@ -1,8 +1,9 @@ use std::sync::OnceLock; + use thiserror::Error; use tracing_subscriber::EnvFilter; -const DEFAULT_PRETTY_FILTER: &str = "dspy_rs=debug"; +const DEFAULT_PRETTY_FILTER: &str = "dsrs=debug"; static TRACING_INITIALIZED: OnceLock<()> = OnceLock::new(); #[derive(Debug, Error)] @@ -16,12 +17,6 @@ pub enum TelemetryInitError { SetGlobalDefault(#[from] tracing::subscriber::SetGlobalDefaultError), } -/// Installs process-global, pretty tracing output for DSRs. -/// -/// Behavior: -/// - Uses `RUST_LOG` when present. -/// - Falls back to `dspy_rs=debug` when `RUST_LOG` is unset/invalid. -/// - Is idempotent: repeated calls are no-ops after first successful init. pub fn init_tracing() -> Result<(), TelemetryInitError> { if TRACING_INITIALIZED.get().is_some() { return Ok(()); From 88d0c887628ab41c5dfef16cbcb0676d60c5593e Mon Sep 17 00:00:00 2001 From: Darin Kishore Date: Fri, 8 May 2026 01:43:51 -0700 Subject: [PATCH 15/15] test: add split-crate coverage harness and harden docs After dissolving dspy-rs, the low-coverage split crates had no durable way to measure progress independently. A monolithic cargo llvm-cov workspace report also proved fragile: branch report generation can segfault in llvm-cov, and full branch instrumentation on this Mac hit disk pressure when run across data/leaven dependencies. This adds tools/coverage-runtime.mjs. It records line coverage per split crate, supports --package for focused runs, and makes branch coverage explicit via --branch / --strict-branch instead of hiding reporter failures. The harness writes JSON summaries plus target/llvm-cov/runtime-coverage-summary.md. Coverage evidence from this slice: - full line run completed: dsrs-cache 43/43 lines, dsrs-trace 170/272 lines, dsrs-leaven 48/81 lines; the prior baseline for those crates was 0-line coverage. - focused branch run completed for dsrs-core: 22/77 branches, 28.57%. Test harness improvements: - cache insertion/history/noop channel tests. - trace graph/context/value tests. - leaven scaffold contract and serde payload tests. - removed obsolete macro compile-fail expectation that DynPredictor is private; dsrs-core now deliberately exports it. Hard cutover cleanup: - bamltype-derive no longer falls back to dspy-rs macro support. - active README/Mintlify docs now teach split crate imports. - COPRO/MIPROv2 docs are deleted from active navigation because those optimizers were deleted with the split. Verification: - cargo fmt - node --check tools/coverage-runtime.mjs - cargo test -p dsrs-cache - cargo test -p dsrs-trace - cargo test -p dsrs-leaven - cargo test -p bamltype -p bamltype-derive -p dsrs_macros --test test_public_api_compile_fail --test test_bamltype_attr_contract --test test_field_macro - tools/coverage-runtime.mjs - tools/coverage-runtime.mjs --branch --package dsrs-core - cargo test --workspace --no-run --- CURRENT_PLAN.md | 2 + CURRENT_SPEC.md | 2 + Cargo.lock | 2 + README.md | 138 +++++--------- crates/bamltype-derive/src/lib.rs | 6 +- crates/bamltype/AGENTS.md | 2 +- crates/dsrs-cache/Cargo.toml | 3 + crates/dsrs-cache/src/cache.rs | 59 ++++++ crates/dsrs-data/src/lib.rs | 14 +- crates/dsrs-data/tests/test_example.rs | 2 +- crates/dsrs-gepa/src/gepa.rs | 2 +- crates/dsrs-leaven/Cargo.toml | 4 + crates/dsrs-leaven/src/lib.rs | 74 +++++++ crates/dsrs-leaven/src/surface.rs | 6 +- crates/dsrs-lm/src/chat.rs | 10 +- .../tests/test_public_api_compile_fail.rs | 21 -- .../92-smoke-slice3-module-authoring.rs | 4 +- crates/dsrs-predict/src/chain_of_thought.rs | 2 +- crates/dsrs-predict/src/predict.rs | 4 +- crates/dsrs-predict/src/react.rs | 2 +- .../tests/test_predict_lm_override.rs | 4 +- crates/dsrs-trace/src/context.rs | 45 +++++ crates/dsrs-trace/src/dag.rs | 68 +++++++ crates/dsrs-trace/src/executor.rs | 2 +- crates/dsrs-trace/src/value.rs | 23 +++ docs/docs.json | 2 - docs/docs/building-blocks/adapter.mdx | 4 +- docs/docs/building-blocks/lm.mdx | 10 +- docs/docs/building-blocks/module.mdx | 13 +- docs/docs/building-blocks/predictors.mdx | 7 +- docs/docs/building-blocks/signature.mdx | 5 +- docs/docs/building-blocks/types.mdx | 7 +- docs/docs/data/dataloader.mdx | 12 +- docs/docs/data/examples.mdx | 3 +- docs/docs/getting-started/introduction.mdx | 6 +- docs/docs/getting-started/quickstart.mdx | 25 ++- docs/docs/optimizers/copro.mdx | 180 ------------------ docs/docs/optimizers/gepa-llm-judge.mdx | 2 +- docs/docs/optimizers/gepa.mdx | 43 ++--- docs/docs/optimizers/miprov2.mdx | 153 --------------- docs/index.mdx | 2 +- tools/coverage-runtime.mjs | 139 ++++++++++++++ 42 files changed, 565 insertions(+), 549 deletions(-) delete mode 100644 docs/docs/optimizers/copro.mdx delete mode 100644 docs/docs/optimizers/miprov2.mdx create mode 100755 tools/coverage-runtime.mjs diff --git a/CURRENT_PLAN.md b/CURRENT_PLAN.md index 33eafe2b..d7173215 100644 --- a/CURRENT_PLAN.md +++ b/CURRENT_PLAN.md @@ -1,3 +1,5 @@ +> **Superseded** by [`docs/plans/2026-05-08-dsrs-crate-split-design.md`](docs/plans/2026-05-08-dsrs-crate-split-design.md) and its implementation plan. Retained for historical context. + > Status Update (2026-02-08): **Superseded historical plan**. > > Phase 1 (Bridge Root Excision) is now the active baseline: legacy bridge crates are removed from the workspace. diff --git a/CURRENT_SPEC.md b/CURRENT_SPEC.md index 4f390c5c..0650b792 100644 --- a/CURRENT_SPEC.md +++ b/CURRENT_SPEC.md @@ -4,6 +4,8 @@ **Status:** Draft **Last Updated:** 2026-01-08 +> **Superseded** by [`docs/plans/2026-05-08-dsrs-crate-split-design.md`](docs/plans/2026-05-08-dsrs-crate-split-design.md) and its implementation plan for runtime crate topology. Retained for historical context. + > Status Update (2026-02-08): > Legacy bridge crates are removed from the workspace. > Current typed and optimizer contracts remain unchanged in Phase 1. diff --git a/Cargo.lock b/Cargo.lock index 8be90e1d..468f751c 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1361,9 +1361,11 @@ dependencies = [ name = "dsrs-leaven" version = "0.0.0" dependencies = [ + "anyhow", "dsrs-core", "dsrs-evaluate", "dsrs-predict", + "dsrs_macros", "leaven-core", "leaven-engine", "leaven-evidence", diff --git a/README.md b/README.md index 47cfd12f..0a61caa6 100644 --- a/README.md +++ b/README.md @@ -2,54 +2,52 @@ logo # DSRs -A high-performance DSPy rewrite in Rust for building LM-powered applications +A high-performance Rust runtime for building typed LM-powered applications [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) [![Rust](https://img.shields.io/badge/rust-1.70+-orange.svg)](https://www.rust-lang.org) -[![Crates.io](https://img.shields.io/crates/v/dspy-rs)](https://crates.io/crates/dspy-rs) -[![Documentation](https://docs.rs/dspy-rs/badge.svg)](https://docs.rs/dspy-rs) +[![Crates.io](https://img.shields.io/badge/crates-dsrs--core%20%7C%20dsrs--predict-orange)](#crates) +[![Documentation](https://img.shields.io/badge/docs-DSRs-blue)](https://dsrs.herumbshandilya.com) [![Build Status](https://img.shields.io/badge/build-passing-green.svg)](#) -[Documentation](https://dsrs.herumbshandilya.com) • [API Reference](https://docs.rs/dspy-rs) • [Examples](crates/dspy-rs/examples/) • [Issues](https://github.com/krypticmouse/dsrs/issues) • [Discord](https://discord.com/invite/ZAEGgxjPUe) +[Documentation](https://dsrs.herumbshandilya.com) • [Crates](#crates) • [Examples](crates/dsrs-predict/examples/) • [Issues](https://github.com/krypticmouse/dsrs/issues) • [Discord](https://discord.com/invite/ZAEGgxjPUe) --- -## 🚀 Overview +## Overview -**DSRs** (DSPy Rust) is a ground-up rewrite of the [DSPy framework](https://github.com/stanfordnlp/dspy) in Rust, designed for building robust, high-performance applications powered by Language Models. Unlike a simple port, DSRs leverages Rust's type system, memory safety, and concurrency features to provide a more efficient and reliable foundation for LM applications. +**DSRs** is a ground-up Rust runtime for building robust, high-performance applications powered by language models. It uses Rust's type system, memory safety, and concurrency features to provide a reliable foundation for typed LM pipelines. -## 📦 Installation +## Installation -Add DSRs to your `Cargo.toml`: +Depend on the crates you use: ```toml [dependencies] -# Option 1: Use the shorter alias (recommended) -dsrs = { package = "dspy-rs", version = "0.7.3" } - -# Option 2: Use the full name -dspy-rs = "0.7.3" +dsrs-core = "0.7" +dsrs-lm = "0.7" +dsrs-predict = "0.7" +dsrs-trace = "0.7" ``` Or use cargo: ```bash -# Option 1: Add with alias (recommended) -cargo add dsrs --package dspy-rs - -# Option 2: Add with full name -cargo add dspy-rs +cargo add dsrs-core dsrs-lm dsrs-predict dsrs-trace ``` -## 🔧 Quick Start +## Quick Start Here's a simple example to get you started: ```rust use anyhow::Result; -use dspy_rs::{configure, init_tracing, ChatAdapter, LM, Predict, Signature}; +use dsrs_lm::{configure, ChatAdapter, LM}; +use dsrs_macros::Signature; +use dsrs_predict::Predict; +use dsrs_trace::init_tracing; #[derive(Signature, Clone)] struct SentimentAnalyzer { @@ -98,19 +96,22 @@ Result: Answer: "Positive" ``` -## 🏗️ Architecture +## Crates -DSRs follows a modular architecture with clear separation of concerns: +DSRs is split into layer-aligned crates. There is no facade crate; depend on the leaf crates directly. -``` -dsrs/ -├── core/ # Core abstractions (LM, Module, Signature) -├── adapter/ # LM provider adapters (OpenAI, etc.) -├── data/ # Data structures (Example, Prediction) -├── predictors/ # Built-in predictors (Predict, Chain, etc.) -├── evaluate/ # Evaluation framework and metrics -└── macros/ # Derive macros for signatures -``` +| Crate | Purpose | +|-------|---------| +| `dsrs-core` | Signatures, modules, schema, errors, typed data, and abstract bridge traits. | +| `dsrs-lm` | LM client, client registry, usage accounting, and `ChatAdapter`. | +| `dsrs-predict` | `Predict`, `ChainOfThought`, and ReAct predictors. | +| `dsrs-evaluate` | Evaluation framework, typed metrics, and feedback helpers. | +| `dsrs-gepa` | GEPA optimizer. | +| `dsrs-data` | DataLoader with feature-gated CSV, Parquet, and Hugging Face support. | +| `dsrs-trace` | Execution graph recording and tracing helpers. | +| `dsrs-cache` | Foyer-backed LM cache. | +| `dsrs-leaven` | Leaven integration scaffold. | +| `dsrs-macros` | Derive macros for signatures and field metadata. | ### Core Components @@ -203,45 +204,12 @@ println!("Average score: {}", score); #### 6. **Optimization** - Optimize your Modules -DSRs provides two powerful optimizers: - -**COPRO (Collaborative Prompt Optimization)** -```rust -#[derive(Builder, facet::Facet)] -#[facet(crate = facet)] -pub struct MyModule { - predictor: Predict, -} - -// Create and configure the optimizer -let optimizer = COPRO::builder() - .breadth(10) // Number of candidates per iteration - .depth(3) // Number of refinement iterations - .build(); - -// Prepare training data -let train_examples = load_training_data(); -let metric = ExactMatchMetric; +DSRs keeps GEPA as the active optimizer crate while the Leaven integration is being built out. COPRO and MIPROv2 were deleted with the crate split. -// Compile optimizes the module in-place -let mut module = MyModule::new(); -optimizer.compile(&mut module, train_examples, &metric).await?; -``` - -**MIPROv2 (Multi-prompt Instruction Proposal Optimizer v2)** - Advanced optimizer using LLMs ```rust -// MIPROv2 uses a 3-stage process: -// 1. Generate execution traces -// 2. LLM generates candidate prompts with best practices -// 3. Evaluate and select the best prompt - -let optimizer = MIPROv2::builder() - .num_candidates(10) // Number of candidate prompts to generate - .num_trials(20) // Number of evaluation trials - .minibatch_size(25) // Examples per evaluation - .temperature(1.0) // Temperature for prompt generation - .build(); +use dsrs_gepa::GEPAOptimizer; +let optimizer = GEPAOptimizer::builder().build(); optimizer.compile(&mut module, train_examples, &metric).await?; ``` @@ -253,7 +221,8 @@ Default behavior is: - Missing signature-required fields return an error with row + field context. ```rust -use dspy_rs::{DataLoader, Signature, TypedLoadOptions}; +use dsrs_data::{DataLoader, TypedLoadOptions}; +use dsrs_macros::Signature; #[derive(Signature, Clone, Debug)] struct QA { @@ -280,7 +249,7 @@ let trainset = DataLoader::load_csv_with::( true, TypedLoadOptions::default(), |row| { - Ok(dspy_rs::Example::new( + Ok(dsrs_core::Example::new( QAInput { question: row.get::("prompt")?, }, @@ -297,7 +266,7 @@ Migration note: - `save_json` / `save_csv` were removed from `DataLoader`. - Use typed `load_*` / `load_*_with` APIs. -See `examples/08-optimize-mipro.rs` for a complete example (requires `parquet` feature). +See the `dsrs-data` crate tests and examples for complete loader coverage. **Component Discovery:** ```rust @@ -375,7 +344,7 @@ cargo run --example 01-simple ### Chain of Thought (CoT) Reasoning ```rust -use dspy_rs::ChainOfThought; +use dsrs_predict::ChainOfThought; // ChainOfThought wraps any signature, adding a `reasoning` field let cot = ChainOfThought::::new(); @@ -393,27 +362,7 @@ DSRs includes a tracing system that captures the dataflow through modules as a D See `examples/12-tracing.rs` for a complete example. -### Optimizer Comparison - -| Feature | COPRO | MIPROv2 | GEPA | -|---------|-------|---------|------| -| **Approach** | Iterative refinement | LLM-guided generation | Evolutionary search with textual feedback | -| **Complexity** | Simple | Advanced | Advanced | -| **Best For** | Quick optimization | Best results | Complex tasks with subtle failure modes | -| **Training Data** | Uses scores | Uses traces & descriptions | Uses rich textual feedback | -| **Prompting Tips** | No | Yes (15+ best practices) | No | -| **Program Understanding** | Basic | LLM-generated descriptions | LLM-judge feedback | -| **Few-shot Examples** | No | Yes (auto-selected) | No | - -**When to use COPRO:** -- Fast iteration needed -- Simple tasks -- Limited compute budget - -**When to use MIPROv2:** -- Best possible results needed -- Complex reasoning tasks -- Have good training data (15+ examples recommended) +### GEPA **When to use GEPA:** - Tasks where score alone doesn't explain what went wrong @@ -462,13 +411,12 @@ This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENS - Inspired by the original [DSPy](https://github.com/stanfordnlp/dspy) framework - Built with the amazing Rust ecosystem - Special thanks to the DSPy community for the discussion and ideas -- MIPROv2 implementation ## 🔗 Resources - [Documentation](https://dsrs.herumbshandilya.com) -- [API Reference](https://docs.rs/dspy-rs) -- [Examples](crates/dspy-rs/examples/) +- [Crates](#crates) +- [Examples](crates/dsrs-predict/examples/) - [GitHub Issues](https://github.com/krypticmouse/dsrs/issues) - [Discord Community](https://discord.com/invite/ZAEGgxjPUe) - [Original DSPy Paper](https://arxiv.org/abs/2310.03714) diff --git a/crates/bamltype-derive/src/lib.rs b/crates/bamltype-derive/src/lib.rs index a34963ed..8254e7db 100644 --- a/crates/bamltype-derive/src/lib.rs +++ b/crates/bamltype-derive/src/lib.rs @@ -168,13 +168,9 @@ fn resolve_runtime_crate() -> syn::Result { return Ok(path); } - if let Some(dspy_path) = find_crate_path("dspy-rs") { - return Ok(syn::parse_quote!(#dspy_path::__macro_support::bamltype)); - } - Err(syn::Error::new( Span::call_site(), - "could not resolve bamltype runtime crate; expected dependency on `bamltype` or `dspy-rs`", + "could not resolve bamltype runtime crate; expected dependency on `bamltype`", )) } diff --git a/crates/bamltype/AGENTS.md b/crates/bamltype/AGENTS.md index f9a680c8..c95dc9a2 100644 --- a/crates/bamltype/AGENTS.md +++ b/crates/bamltype/AGENTS.md @@ -3,7 +3,7 @@ ## Baseline Commands 1. `cargo test -p bamltype --tests` -2. `cargo test -p dspy-rs --test typed_integration --test test_typed_alias --test test_typed_prompt_format` +2. `cargo test -p bamltype && cargo test -p dsrs_macros --test test_bamltype_attr_contract --test test_field_macro` ## Test Layers diff --git a/crates/dsrs-cache/Cargo.toml b/crates/dsrs-cache/Cargo.toml index 16934c21..d90e1377 100644 --- a/crates/dsrs-cache/Cargo.toml +++ b/crates/dsrs-cache/Cargo.toml @@ -17,3 +17,6 @@ serde_json = { version = "1.0.140", features = ["preserve_order"] } tempfile = "3.23.0" tokio = { version = "1.46.1", features = ["sync"] } tracing = "0.1.44" + +[dev-dependencies] +tokio = { version = "1.46.1", features = ["macros", "rt", "sync"] } diff --git a/crates/dsrs-cache/src/cache.rs b/crates/dsrs-cache/src/cache.rs index 37ee3c1c..efc8cab2 100644 --- a/crates/dsrs-cache/src/cache.rs +++ b/crates/dsrs-cache/src/cache.rs @@ -132,3 +132,62 @@ impl Cache for ResponseCache { Ok(self.history_window[..actual_n].to_vec()) } } + +#[cfg(test)] +mod tests { + use super::*; + use dsrs_core::{LmUsage, hashmap}; + + fn raw_key(question: &str) -> RawExample { + RawExample::new( + hashmap! { + "question".to_string() => question.into(), + }, + vec!["question".to_string()], + vec![], + ) + } + + fn prediction(answer: &str) -> Prediction { + Prediction::new( + hashmap! { + "answer".to_string() => answer.into(), + }, + LmUsage::default(), + ) + } + + #[tokio::test] + async fn insert_get_and_history_round_trip_cached_prediction() { + let mut cache = ResponseCache::new().await; + let key = raw_key("capital?"); + assert!(cache.get(key.clone()).await.unwrap().is_none()); + + let (tx, rx) = mpsc::channel(1); + let entry = CacheEntry { + prompt: "prompt".to_string(), + prediction: prediction("Paris"), + }; + tx.send(entry.clone()).await.unwrap(); + drop(tx); + + cache.insert(key.clone(), rx).await.unwrap(); + + let cached = cache.get(key).await.unwrap().unwrap(); + assert_eq!(cached.get("answer", None), "Paris"); + let history = cache.get_history(10).await.unwrap(); + assert_eq!(history.len(), 1); + assert_eq!(history[0].prompt, entry.prompt); + } + + #[tokio::test] + async fn insert_with_closed_channel_is_noop() { + let mut cache = ResponseCache::new().await; + let (_tx, rx) = mpsc::channel(1); + drop(_tx); + + cache.insert(raw_key("missing"), rx).await.unwrap(); + + assert!(cache.get_history(1).await.unwrap().is_empty()); + } +} diff --git a/crates/dsrs-data/src/lib.rs b/crates/dsrs-data/src/lib.rs index 6bc5ec9a..ffda8253 100644 --- a/crates/dsrs-data/src/lib.rs +++ b/crates/dsrs-data/src/lib.rs @@ -8,7 +8,12 @@ //! //! The untyped row type (`RawExample`) remains for internal runtime/tracing/cache bridges. -#[cfg(any(feature = "csv", feature = "json", feature = "parquet", feature = "hf-hub"))] +#[cfg(any( + feature = "csv", + feature = "json", + feature = "parquet", + feature = "hf-hub" +))] pub mod dataloader; pub mod example { pub use dsrs_core::RawExample as Example; @@ -19,7 +24,12 @@ pub mod prediction { pub mod serialize; pub mod utils; -#[cfg(any(feature = "csv", feature = "json", feature = "parquet", feature = "hf-hub"))] +#[cfg(any( + feature = "csv", + feature = "json", + feature = "parquet", + feature = "hf-hub" +))] pub use dataloader::*; pub use example::*; pub use prediction::*; diff --git a/crates/dsrs-data/tests/test_example.rs b/crates/dsrs-data/tests/test_example.rs index 57157ffa..a994aafa 100644 --- a/crates/dsrs-data/tests/test_example.rs +++ b/crates/dsrs-data/tests/test_example.rs @@ -1,6 +1,6 @@ +use dsrs_core::hashmap; use dsrs_data::example::Example; use dsrs_data::serialize::{load_jsonl, save_examples_as_jsonl}; -use dsrs_core::hashmap; use rstest::*; #[rstest] diff --git a/crates/dsrs-gepa/src/gepa.rs b/crates/dsrs-gepa/src/gepa.rs index cfdecd26..1775b928 100644 --- a/crates/dsrs-gepa/src/gepa.rs +++ b/crates/dsrs-gepa/src/gepa.rs @@ -5,8 +5,8 @@ use serde::{Deserialize, Serialize}; use dsrs_core::{BamlType, BamlValue, Example, Facet, Module, Signature}; use dsrs_evaluate::{MetricOutcome, TypedMetric, average_score}; -use crate::{Optimizer, evaluate_module_with_metric, predictor_names, with_named_predictor}; use crate::pareto::ParetoFrontier; +use crate::{Optimizer, evaluate_module_with_metric, predictor_names, with_named_predictor}; /// A single instruction candidate tracked through GEPA's evolutionary search. /// diff --git a/crates/dsrs-leaven/Cargo.toml b/crates/dsrs-leaven/Cargo.toml index d2bef48f..b0dabb1a 100644 --- a/crates/dsrs-leaven/Cargo.toml +++ b/crates/dsrs-leaven/Cargo.toml @@ -18,3 +18,7 @@ leaven-evidence = { path = "../../../leaven/crates/leaven-evidence" } serde = { version = "1.0.219", features = ["derive"] } serde_json = "1.0.140" thiserror = "2.0.17" + +[dev-dependencies] +anyhow = "1.0.99" +dsrs_macros = { version = "0.7.2", path = "../dsrs-macros" } diff --git a/crates/dsrs-leaven/src/lib.rs b/crates/dsrs-leaven/src/lib.rs index 4fef693b..dba68cff 100644 --- a/crates/dsrs-leaven/src/lib.rs +++ b/crates/dsrs-leaven/src/lib.rs @@ -21,3 +21,77 @@ pub enum DsrsLeavenError { #[error("dsrs-leaven scaffold is not implemented yet: {0}")] Unimplemented(&'static str), } + +#[cfg(test)] +mod tests { + use super::*; + use anyhow::Result; + use dsrs_core::{CallMetadata, Module, PredictError, Predicted, Signature}; + use leaven_core::{Artifact, OptimizationProblem}; + + #[derive(Signature, Clone, Debug)] + struct TestSig { + #[input] + prompt: String, + + #[output] + answer: String, + } + + struct TestModule; + + impl Module for TestModule { + type Input = TestSigInput; + type Output = TestSigOutput; + + async fn forward( + &self, + input: Self::Input, + ) -> Result, PredictError> { + Ok(Predicted::new( + TestSigOutput { + answer: input.prompt, + }, + CallMetadata::default(), + )) + } + } + + #[test] + fn scaffold_constructors_are_cloneable_markers() { + let artifact = DsrsProgramArtifact::::scaffold(); + let _clone = artifact.clone(); + let _surface = DsrsProgramSurface::::scaffold(); + let _evaluator = DsrsEvaluator::::scaffold(); + } + + #[test] + fn problem_associated_types_match_dsrs_scaffold() { + fn assert_problem() {} + assert_problem::>(); + } + + #[test] + #[should_panic(expected = "dsrs-leaven: artifact identity")] + fn artifact_identity_is_explicit_scaffold_panic() { + let artifact = DsrsProgramArtifact::::scaffold(); + let _ = artifact.identity(); + } + + #[test] + fn change_and_evidence_round_trip_json_payloads() { + let change = DsrsProgramChange { + address: "predictor.instruction".to_string(), + replacement: serde_json::json!("new instruction"), + }; + let encoded = serde_json::to_string(&change).unwrap(); + let decoded: DsrsProgramChange = serde_json::from_str(&encoded).unwrap(); + assert_eq!(decoded.address, "predictor.instruction"); + assert_eq!(decoded.replacement, "new instruction"); + + let evidence = DsrsEvidence { + payload: serde_json::json!({"score": 1.0}), + }; + assert_eq!(evidence.payload["score"], 1.0); + } +} diff --git a/crates/dsrs-leaven/src/surface.rs b/crates/dsrs-leaven/src/surface.rs index 6d73347e..1bde122b 100644 --- a/crates/dsrs-leaven/src/surface.rs +++ b/crates/dsrs-leaven/src/surface.rs @@ -45,8 +45,10 @@ where fn parts<'a>( &self, _artifact: &'a DsrsProgramArtifact, - ) -> Result>>, leaven_surface::SurfaceError> - { + ) -> Result< + Vec>>, + leaven_surface::SurfaceError, + > { unimplemented!("dsrs-leaven: surface parts") } diff --git a/crates/dsrs-lm/src/chat.rs b/crates/dsrs-lm/src/chat.rs index 0b47ca60..3163d4f1 100644 --- a/crates/dsrs-lm/src/chat.rs +++ b/crates/dsrs-lm/src/chat.rs @@ -570,10 +570,7 @@ impl ChatAdapter { /// /// Convenience method that calls [`format_user_message_typed`](ChatAdapter::format_user_message_typed) /// and [`format_assistant_message_typed`](ChatAdapter::format_assistant_message_typed). - pub fn format_demo_typed( - &self, - demo: &Example, - ) -> (String, String) + pub fn format_demo_typed(&self, demo: &Example) -> (String, String) where S::Input: BamlType, S::Output: BamlType, @@ -896,10 +893,7 @@ fn parse_sections(content: &str) -> IndexMap { parsed } -fn value_for_path_relaxed<'a>( - value: &'a BamlValue, - path: &FieldPath, -) -> Option<&'a BamlValue> { +fn value_for_path_relaxed<'a>(value: &'a BamlValue, path: &FieldPath) -> Option<&'a BamlValue> { let mut current = value; let parts: Vec<_> = path.iter().collect(); let mut idx = 0usize; diff --git a/crates/dsrs-macros/tests/test_public_api_compile_fail.rs b/crates/dsrs-macros/tests/test_public_api_compile_fail.rs index aa941a8b..7c48e09b 100644 --- a/crates/dsrs-macros/tests/test_public_api_compile_fail.rs +++ b/crates/dsrs-macros/tests/test_public_api_compile_fail.rs @@ -45,27 +45,6 @@ fn assert_not_masked_by_e0401(stderr: &str) { ); } -#[test] -fn dyn_predictor_is_not_publicly_importable() { - let stderr = run_compile_fail_case( - "private_dyn_predictor_case", - r#" -use dsrs_core::DynPredictor; - -fn main() { - let _ = std::any::type_name::>(); -} -"#, - ); - - assert_not_masked_by_e0401(&stderr); - assert!( - stderr.contains("DynPredictor") - && (stderr.contains("private") || stderr.contains("no `DynPredictor` in the root")), - "expected DynPredictor import failure, got:\n{stderr}" - ); -} - #[test] fn named_parameters_is_not_publicly_importable() { let stderr = run_compile_fail_case( diff --git a/crates/dsrs-predict/examples/92-smoke-slice3-module-authoring.rs b/crates/dsrs-predict/examples/92-smoke-slice3-module-authoring.rs index 683bbd24..f002e6f1 100644 --- a/crates/dsrs-predict/examples/92-smoke-slice3-module-authoring.rs +++ b/crates/dsrs-predict/examples/92-smoke-slice3-module-authoring.rs @@ -1,5 +1,7 @@ use anyhow::{Result, bail}; -use dsrs_predict::{ChatAdapter, LM, Module, Predict, PredictError, Predicted, Signature, configure}; +use dsrs_predict::{ + ChatAdapter, LM, Module, Predict, PredictError, Predicted, Signature, configure, +}; #[derive(Signature, Clone, Debug)] struct SmokeSig { diff --git a/crates/dsrs-predict/src/chain_of_thought.rs b/crates/dsrs-predict/src/chain_of_thought.rs index 9966f1f4..0967eff6 100644 --- a/crates/dsrs-predict/src/chain_of_thought.rs +++ b/crates/dsrs-predict/src/chain_of_thought.rs @@ -1,6 +1,6 @@ +use crate::{Predict, PredictBuilder}; use dsrs_core::Augmented; use dsrs_core::{BamlType, Example, Module, PredictError, Predicted, Signature}; -use crate::{Predict, PredictBuilder}; use dsrs_lm::LM; /// Augmentation that prepends a `reasoning: String` field to a signature's output. diff --git a/crates/dsrs-predict/src/predict.rs b/crates/dsrs-predict/src/predict.rs index 28c97ca9..6fd5bdff 100644 --- a/crates/dsrs-predict/src/predict.rs +++ b/crates/dsrs-predict/src/predict.rs @@ -9,11 +9,13 @@ use std::sync::Arc; use tracing::{debug, trace}; use dsrs_core as dsrs; -use dsrs_core::{DynPredictor, Example, Module, PredictAccessorFns, PredictState, RawExample, Signature}; use dsrs_core::{ BamlType, BamlValue, CallMetadata, LmError, LmUsage, PredictError, Predicted, Prediction, SignatureSchema, }; +use dsrs_core::{ + DynPredictor, Example, Module, PredictAccessorFns, PredictState, RawExample, Signature, +}; use dsrs_lm::{Chat, ChatAdapter, GLOBAL_SETTINGS, LM, LMResponse}; fn predict_dyn_visit( diff --git a/crates/dsrs-predict/src/react.rs b/crates/dsrs-predict/src/react.rs index 6b67d668..1b3d3afa 100644 --- a/crates/dsrs-predict/src/react.rs +++ b/crates/dsrs-predict/src/react.rs @@ -7,9 +7,9 @@ use rig::message::{ToolCall, ToolFunction}; use rig::tool::{ToolDyn, ToolError}; use rig::wasm_compat::WasmBoxedFuture; +use crate::{Predict, PredictBuilder}; use dsrs_core::{BamlType, Module, PredictError, Predicted, Signature}; use dsrs_lm::LM; -use crate::{Predict, PredictBuilder}; /// ReAct action-step schema. #[derive(dsrs_macros::Signature, Clone, Debug)] diff --git a/crates/dsrs-predict/tests/test_predict_lm_override.rs b/crates/dsrs-predict/tests/test_predict_lm_override.rs index 9caafe10..33c48d7b 100644 --- a/crates/dsrs-predict/tests/test_predict_lm_override.rs +++ b/crates/dsrs-predict/tests/test_predict_lm_override.rs @@ -61,9 +61,7 @@ async fn predict_uses_per_instance_lm_over_global() { let (override_lm, _override_client) = make_test_lm(vec![override_response]).await; // Predict with per-instance LM override - let predict = Predict::::builder() - .lm(override_lm) - .build(); + let predict = Predict::::builder().lm(override_lm).build(); let result = predict .call(QAInput { diff --git a/crates/dsrs-trace/src/context.rs b/crates/dsrs-trace/src/context.rs index 3287ce05..d7027a7d 100644 --- a/crates/dsrs-trace/src/context.rs +++ b/crates/dsrs-trace/src/context.rs @@ -85,3 +85,48 @@ pub fn record_output(node_id: usize, output: Prediction) { trace!(node_id, "trace output recorded"); }); } + +#[cfg(test)] +mod tests { + use super::*; + use dsrs_core::{LmUsage, hashmap}; + + #[tokio::test] + async fn trace_scope_records_nodes_and_outputs() { + assert!(!is_tracing()); + + let (result, graph) = trace(|| async { + assert!(is_tracing()); + let id = record_node( + NodeType::Operator { + name: "normalize".to_string(), + }, + vec![], + None, + ) + .unwrap(); + record_output( + id, + Prediction::new( + hashmap! { + "normalized".to_string() => true.into(), + }, + LmUsage::default(), + ), + ); + 7 + }) + .await; + + assert_eq!(result, 7); + assert!(!is_tracing()); + assert_eq!(graph.nodes.len(), 1); + assert_eq!(graph.nodes[0].output.as_ref().unwrap()["normalized"], true); + } + + #[test] + fn record_outside_trace_is_noop() { + assert!(record_node(NodeType::Root, vec![], None).is_none()); + record_output(0, Prediction::new(hashmap! {}, LmUsage::default())); + } +} diff --git a/crates/dsrs-trace/src/dag.rs b/crates/dsrs-trace/src/dag.rs index edb165f8..9ff05f0e 100644 --- a/crates/dsrs-trace/src/dag.rs +++ b/crates/dsrs-trace/src/dag.rs @@ -119,3 +119,71 @@ impl Graph { self.nodes.get(id) } } + +#[cfg(test)] +mod tests { + use super::*; + use dsrs_core::{LmUsage, hashmap}; + + #[test] + fn graph_assigns_ids_and_ignores_missing_outputs() { + let mut graph = Graph::new(); + let input = RawExample::new( + hashmap! { + "question".to_string() => "2+2?".into(), + }, + vec!["question".to_string()], + vec![], + ); + let root = graph.add_node(NodeType::Root, vec![], Some(input)); + let predict = graph.add_node( + NodeType::Predict { + signature_name: "QA".to_string(), + }, + vec![root], + None, + ); + let output = Prediction::new( + hashmap! { + "answer".to_string() => "4".into(), + }, + LmUsage::default(), + ); + + graph.set_output(predict, output.clone()); + graph.set_output(99, output); + + assert_eq!(root, 0); + assert_eq!(predict, 1); + assert_eq!(graph.get_node(root).unwrap().inputs, Vec::::new()); + assert_eq!(graph.get_node(predict).unwrap().inputs, vec![root]); + assert_eq!( + graph.get_node(predict).unwrap().output.as_ref().unwrap()["answer"], + "4" + ); + assert!(graph.get_node(99).is_none()); + } + + #[test] + fn node_type_debug_includes_variant_payloads() { + assert_eq!(format!("{:?}", NodeType::Root), "Root"); + assert!( + format!( + "{:?}", + NodeType::Operator { + name: "step".into() + } + ) + .contains("step") + ); + assert!( + format!( + "{:?}", + NodeType::Map { + mapping: vec![("x".into(), (0, "y".into()))], + } + ) + .contains("mapping") + ); + } +} diff --git a/crates/dsrs-trace/src/executor.rs b/crates/dsrs-trace/src/executor.rs index c2b88835..d3b7a85f 100644 --- a/crates/dsrs-trace/src/executor.rs +++ b/crates/dsrs-trace/src/executor.rs @@ -1,6 +1,6 @@ use crate::dag::{Graph, NodeType}; -use dsrs_core::{Prediction, RawExample}; use anyhow::Result; +use dsrs_core::{Prediction, RawExample}; use std::collections::HashMap; /// Replays a traced execution graph with new input data. diff --git a/crates/dsrs-trace/src/value.rs b/crates/dsrs-trace/src/value.rs index 14512934..f60df4b9 100644 --- a/crates/dsrs-trace/src/value.rs +++ b/crates/dsrs-trace/src/value.rs @@ -20,3 +20,26 @@ impl IntoTracked for Value { } } } + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn serde_value_becomes_unlinked_tracked_value() { + let tracked = serde_json::json!({"answer": 42}).into_tracked(); + assert_eq!(tracked.value["answer"], 42); + assert!(tracked.source.is_none()); + } + + #[test] + fn tracked_value_identity_conversion_preserves_source() { + let original = TrackedValue { + value: serde_json::json!("x"), + source: Some((3, "field".to_string())), + }; + let tracked = original.clone().into_tracked(); + assert_eq!(tracked.value, original.value); + assert_eq!(tracked.source, original.source); + } +} diff --git a/docs/docs.json b/docs/docs.json index 53455c87..9b6afc25 100644 --- a/docs/docs.json +++ b/docs/docs.json @@ -43,8 +43,6 @@ { "group": "Optimizers", "pages": [ - "docs/optimizers/copro", - "docs/optimizers/miprov2", "docs/optimizers/gepa", "docs/optimizers/gepa-llm-judge" ] diff --git a/docs/docs/building-blocks/adapter.mdx b/docs/docs/building-blocks/adapter.mdx index 3e2f5157..01466b3f 100644 --- a/docs/docs/building-blocks/adapter.mdx +++ b/docs/docs/building-blocks/adapter.mdx @@ -132,7 +132,7 @@ ParseError::Multiple { Usually you don't touch the adapter - `Predict` handles it. But for debugging: ```rust -use dspy_rs::ChatAdapter; +use dsrs_lm::ChatAdapter; let adapter = ChatAdapter; @@ -190,7 +190,7 @@ Compiled templates are cached process-wide by template string. ## Real example: Insurance claim extraction -Here's what a prompt looks like for a complex nested type (from [`examples/16-insurance-claim-prompt.rs`](https://github.com/darinkishore/DSRs/blob/main/crates/dspy-rs/examples/16-insurance-claim-prompt.rs)): +Here's what a prompt looks like for a complex nested type (from [`16-insurance-claim-prompt.rs`](https://github.com/darinkishore/DSRs/blob/main/crates/dsrs-lm/examples/16-insurance-claim-prompt.rs)): ```rust #[derive(Signature, Clone, Debug)] diff --git a/docs/docs/building-blocks/lm.mdx b/docs/docs/building-blocks/lm.mdx index 02d83f57..17c5ae25 100644 --- a/docs/docs/building-blocks/lm.mdx +++ b/docs/docs/building-blocks/lm.mdx @@ -48,7 +48,8 @@ The `LM::builder()` must be awaited with `.build().await` because client initial ### Basic usage ```rust -use dspy_rs::{init_tracing, LM}; +use dsrs_lm::LM; +use dsrs_trace::init_tracing; #[tokio::main] async fn main() -> anyhow::Result<()> { @@ -102,7 +103,7 @@ let lm = LM::builder() ## API Reference -You can browse the full `LM` module reference on [docs.rs](https://docs.rs/dspy-rs/latest/dspy_rs/core/lm/index.html). +You can browse the full `LM` module reference in the `dsrs-lm` crate docs once the split crates are published. ## Global vs explicit usage @@ -118,7 +119,8 @@ You can browse the full `LM` module reference on [docs.rs](https://docs.rs/dspy- ```rust -use dspy_rs::{init_tracing, LM}; +use dsrs_lm::LM; +use dsrs_trace::init_tracing; #[tokio::main] async fn main() -> anyhow::Result<()> { @@ -137,7 +139,7 @@ async fn main() -> anyhow::Result<()> { ```rust fn main() -> anyhow::Result<()> { - dspy_rs::init_tracing()?; + dsrs_trace::init_tracing()?; let rt = tokio::runtime::Runtime::new()?; rt.block_on(async move { diff --git a/docs/docs/building-blocks/module.mdx b/docs/docs/building-blocks/module.mdx index 9cd2d59e..da971937 100644 --- a/docs/docs/building-blocks/module.mdx +++ b/docs/docs/building-blocks/module.mdx @@ -53,7 +53,8 @@ async fn answer_with_reasoning(q: &str) -> Result::builder() Demos for ChainOfThought include reasoning — they're `Example>`. The reasoning field shows the LM what good chain-of-thought looks like. ```rust -use dspy_rs::{Example, Augmented, Reasoning, WithReasoning}; +use dsrs_core::{Augmented, Example, Reasoning, WithReasoning}; let cot = ChainOfThought::::builder() .demo(Example::>::new( @@ -106,7 +107,9 @@ In practice you rarely write demos by hand. Optimizers generate them automatical Define a struct, derive Facet, implement Module. ```rust -use dspy_rs::{Module, Predict, ChainOfThought, Predicted, PredictError, Signature}; +use dsrs_core::{Module, PredictError, Predicted}; +use dsrs_macros::Signature; +use dsrs_predict::{ChainOfThought, Predict}; #[derive(Signature, Clone, Debug)] /// Retrieve relevant passages for a question. @@ -191,7 +194,7 @@ async fn forward(&self, input: Self::Input) -> Result, P For simple post-processing, use `.map()` instead of writing a full module: ```rust -use dspy_rs::ModuleExt; +use dsrs_core::ModuleExt; let cot = ChainOfThought::::new(); @@ -219,7 +222,7 @@ let inputs: Vec = questions.iter() .map(|q| QAInput { question: q.clone() }) .collect(); -let results = dspy_rs::forward_all(&cot, inputs, 10).await; +let results = dsrs_core::forward_all(&cot, inputs, 10).await; // Vec>, PredictError>> ``` diff --git a/docs/docs/building-blocks/predictors.mdx b/docs/docs/building-blocks/predictors.mdx index 518f6b6a..9c78f01b 100644 --- a/docs/docs/building-blocks/predictors.mdx +++ b/docs/docs/building-blocks/predictors.mdx @@ -9,7 +9,8 @@ A `Predict` takes a [signature](/docs/building-blocks/signature) and actually ca ## Basic usage ```rust -use dspy_rs::{Predict, Signature}; +use dsrs_macros::Signature; +use dsrs_predict::Predict; #[derive(Signature, Clone, Debug)] /// Answer questions accurately. @@ -55,7 +56,7 @@ This overrides the docstring instruction on the signature. ### With demos (few-shot) ```rust -use dspy_rs::Example; +use dsrs_core::Example; let predict = Predict::::builder() .demo(Example::::new( @@ -131,7 +132,7 @@ if let Some(field) = result.metadata().field_meta.get("answer") { ## Error handling ```rust -use dspy_rs::PredictError; +use dsrs_core::PredictError; match predict.call(input).await { Ok(output) => println!("{}", output.answer), diff --git a/docs/docs/building-blocks/signature.mdx b/docs/docs/building-blocks/signature.mdx index 21a8a60e..7c158b42 100644 --- a/docs/docs/building-blocks/signature.mdx +++ b/docs/docs/building-blocks/signature.mdx @@ -11,7 +11,7 @@ You write a struct with `#[derive(Signature)]`, mark fields as `#[input]` or `#[ ## Basic syntax ```rust -use dspy_rs::Signature; +use dsrs_macros::Signature; /// Answer questions accurately and concisely. #[derive(Signature, Clone, Debug)] @@ -96,7 +96,8 @@ struct SpamAnalysis { When you have a non-standard type in a field, add `#[BamlType]` on it: ```rust -use dspy_rs::{Signature, BamlType}; +use bamltype::BamlType; +use dsrs_macros::Signature; #[BamlType] #[derive(Clone, Debug)] diff --git a/docs/docs/building-blocks/types.mdx b/docs/docs/building-blocks/types.mdx index 1d3fd584..0222cec0 100644 --- a/docs/docs/building-blocks/types.mdx +++ b/docs/docs/building-blocks/types.mdx @@ -34,7 +34,8 @@ Custom types need `#[BamlType]`: Example: ```rust -use dspy_rs::{BamlType, Signature}; +use bamltype::BamlType; +use dsrs_macros::Signature; #[derive(Clone, Debug)] #[BamlType] @@ -214,7 +215,7 @@ Visible effect: The user-visible claims on this page are locked by tests: -- Prompt/render effects: `crates/dspy-rs/tests/test_bamltype_docs_contract.rs` -- Basic `#[BamlType]` end-user contract: `crates/dspy-rs/tests/test_bamltype_attr_contract.rs` +- Prompt/render effects: `crates/dsrs-macros/tests/test_bamltype_docs_contract.rs` +- Basic `#[BamlType]` end-user contract: `crates/dsrs-macros/tests/test_bamltype_attr_contract.rs` - Unsupported/compile-fail type shapes: `crates/bamltype/tests/ui.rs` - Signature macro compile-fail coverage: `crates/dsrs-macros/tests/ui.rs` diff --git a/docs/docs/data/dataloader.mdx b/docs/docs/data/dataloader.mdx index 53963ce7..d5e05eb5 100644 --- a/docs/docs/data/dataloader.mdx +++ b/docs/docs/data/dataloader.mdx @@ -15,7 +15,9 @@ No manual `RawExample -> Example` conversion is required. ## Core API ```rust -use dspy_rs::{DataLoader, Example, Signature, TypedLoadOptions}; +use dsrs_core::Example; +use dsrs_data::{DataLoader, TypedLoadOptions}; +use dsrs_macros::Signature; ``` Typed loaders: @@ -39,7 +41,8 @@ Mapper overloads: - Uses signature field names directly unless remapped. ```rust -use dspy_rs::{DataLoader, Signature, TypedLoadOptions}; +use dsrs_data::{DataLoader, TypedLoadOptions}; +use dsrs_macros::Signature; #[derive(Signature, Clone, Debug)] struct QA { @@ -63,7 +66,7 @@ Use `TypedLoadOptions.field_map` when source column names differ from signature ```rust use std::collections::HashMap; -use dspy_rs::{DataLoader, TypedLoadOptions, UnknownFieldPolicy}; +use dsrs_data::{DataLoader, TypedLoadOptions, UnknownFieldPolicy}; let mut field_map = HashMap::new(); field_map.insert("question".to_string(), "prompt".to_string()); @@ -85,7 +88,8 @@ let trainset = DataLoader::load_csv::( Use mapper overloads for fully custom row conversion logic. ```rust -use dspy_rs::{DataLoader, Example, TypedLoadOptions}; +use dsrs_core::Example; +use dsrs_data::{DataLoader, TypedLoadOptions}; let trainset = DataLoader::load_json_with::( "data/train.jsonl", diff --git a/docs/docs/data/examples.mdx b/docs/docs/data/examples.mdx index 93297830..6cf668a6 100644 --- a/docs/docs/data/examples.mdx +++ b/docs/docs/data/examples.mdx @@ -7,7 +7,8 @@ icon: "table" `Example` is the typed training/evaluation row for a signature `S`. ```rust -use dspy_rs::{Example, Signature}; +use dsrs_core::Example; +use dsrs_macros::Signature; #[derive(Signature, Clone, Debug)] struct QA { diff --git a/docs/docs/getting-started/introduction.mdx b/docs/docs/getting-started/introduction.mdx index e713a9e3..893e5216 100644 --- a/docs/docs/getting-started/introduction.mdx +++ b/docs/docs/getting-started/introduction.mdx @@ -52,11 +52,7 @@ An optimizer takes your module, a training set (input/output examples), and a me The analogy is a compiler: you write the program (your module), define what "correct" means (your metric), provide training data, and the optimizer produces a better version. This is why the entry point is called `compile`. -Three optimizers exist, each with different tradeoffs: - -- **COPRO** iterates: generate candidate instructions, evaluate, refine, repeat. Fast, simple, good enough for straightforward tasks. -- **MIPROv2** uses an LM to understand your program and generate candidates informed by prompting best practices. Slower, higher quality. -- **GEPA** uses rich textual feedback (not just scores) to guide evolutionary search over a Pareto frontier of candidates. Best for complex tasks with subtle failure modes. +GEPA uses rich textual feedback, not just scores, to guide evolutionary search over a Pareto frontier of candidates. Leaven integration scaffolding lives in `dsrs-leaven` as the next optimizer runtime. COPRO and MIPROv2 were removed with the crate split. The optimizer does not see your Rust code. It sees the predictor leaves inside your module -- their schemas, demos, and instructions -- and mutates only those. After optimization, you call your module exactly as before. The optimized state is invisible to your calling code. diff --git a/docs/docs/getting-started/quickstart.mdx b/docs/docs/getting-started/quickstart.mdx index 85b18dc7..c41f6375 100644 --- a/docs/docs/getting-started/quickstart.mdx +++ b/docs/docs/getting-started/quickstart.mdx @@ -16,7 +16,11 @@ Add to your `Cargo.toml`: ```toml [dependencies] -dspy-rs = "0.7" +dsrs-core = "0.7" +dsrs-lm = "0.7" +dsrs-predict = "0.7" +dsrs-trace = "0.7" +dsrs-macros = "0.7" tokio = { version = "1", features = ["full"] } anyhow = "1" ``` @@ -24,7 +28,7 @@ anyhow = "1" Or via cargo: ```bash -cargo add dspy-rs tokio anyhow +cargo add dsrs-core dsrs-lm dsrs-predict dsrs-trace dsrs-macros tokio anyhow ``` @@ -34,7 +38,8 @@ cargo add dspy-rs tokio anyhow Tell DSRs which model to use. This sets a global default that all predictors will use: ```rust -use dspy_rs::{configure, init_tracing, ChatAdapter, LM}; +use dsrs_lm::{configure, ChatAdapter, LM}; +use dsrs_trace::init_tracing; #[tokio::main] async fn main() -> anyhow::Result<()> { @@ -63,7 +68,7 @@ Set `OPENAI_API_KEY` in your environment. For other providers, use the appropria A [signature](/docs/building-blocks/signature) declares your task's inputs and outputs: ```rust -use dspy_rs::Signature; +use dsrs_macros::Signature; /// Answer questions accurately and concisely. #[derive(Signature, Clone, Debug)] @@ -89,7 +94,7 @@ The doc comments become: Create a [predictor](/docs/building-blocks/predictors) and call it: ```rust -use dspy_rs::Predict; +use dsrs_predict::Predict; let predict = Predict::::new(); @@ -109,7 +114,10 @@ The `#[derive(Signature)]` macro generates `QAInput` from your `#[input]` fields ## Complete example ```rust -use dspy_rs::{configure, init_tracing, ChatAdapter, LM, Predict, Signature}; +use dsrs_lm::{configure, ChatAdapter, LM}; +use dsrs_macros::Signature; +use dsrs_predict::Predict; +use dsrs_trace::init_tracing; /// Answer questions accurately and concisely. #[derive(Signature, Clone, Debug)] @@ -174,7 +182,8 @@ See the full attribute reference in [Signatures](/docs/building-blocks/signature When you need more than primitives, add [`#[BamlType]`](/docs/building-blocks/types): ```rust -use dspy_rs::{Signature, BamlType}; +use bamltype::BamlType; +use dsrs_macros::Signature; #[BamlType] #[derive(Clone, Debug)] @@ -203,7 +212,7 @@ struct SentimentAnalysis { Add examples to guide the LM: ```rust -use dspy_rs::Example; +use dsrs_core::Example; let predict = Predict::::builder() .demo(Example::::new( diff --git a/docs/docs/optimizers/copro.mdx b/docs/docs/optimizers/copro.mdx deleted file mode 100644 index 5e46917e..00000000 --- a/docs/docs/optimizers/copro.mdx +++ /dev/null @@ -1,180 +0,0 @@ ---- -title: 'COPRO' -description: 'Fast iterative prompt refinement' -icon: 'rotate' ---- - -COPRO (Collaborative Prompt Optimizer) is a simple but effective optimizer that iteratively refines prompts through generation and evaluation cycles. - -## How it Works - -COPRO uses a straightforward approach: - -1. **Generate candidates**: Create multiple prompt variations using an LLM -2. **Evaluate**: Test each candidate on your training data -3. **Refine**: Use the best candidates to generate improved versions -4. **Repeat**: Continue for a fixed number of depth iterations - -## Configuration - -```rust -let copro = COPRO::builder() - .breadth(10) // Number of candidates per iteration - .depth(3) // Number of refinement iterations - .init_temperature(1.4) // Temperature for generation - .track_stats(false) // Track optimization statistics - .build(); -``` - -## Usage Example - -```rust -use anyhow::Result; -use bon::Builder; -use facet; -use dspy_rs::{ - COPRO, ChatAdapter, Example, LM, MetricOutcome, Module, Optimizer, Predict, PredictError, - Predicted, Signature, TypedMetric, configure, init_tracing, -}; - -#[derive(Signature, Clone, Debug)] -struct QA { - #[input] - question: String, - - #[output] - answer: String, -} - -#[derive(Builder, facet::Facet)] -#[facet(crate = facet)] -struct MyModule { - #[builder(default = Predict::::new())] - predictor: Predict, -} - -impl Module for MyModule { - type Input = QAInput; - type Output = QAOutput; - - async fn forward(&self, inputs: QAInput) -> Result, PredictError> { - self.predictor.call(inputs).await - } -} - -struct ExactMatchMetric; - -impl TypedMetric for ExactMatchMetric { - async fn evaluate(&self, example: &Example, prediction: &Predicted) -> Result { - let expected = example.output.answer.trim().to_lowercase(); - let actual = prediction.answer.trim().to_lowercase(); - Ok(MetricOutcome::score((expected == actual) as u8 as f32)) - } -} - -#[tokio::main] -async fn main() -> Result<()> { - init_tracing()?; - - // API key automatically read from OPENAI_API_KEY env var - configure( - LM::builder() - .model("gpt-4o-mini".to_string()) - .build() - .await?, - ChatAdapter, - ); - - let mut module = MyModule::builder().build(); - let trainset = vec![ - Example::new( - QAInput { - question: "What is 2+2?".to_string(), - }, - QAOutput { - answer: "4".to_string(), - }, - ), - Example::new( - QAInput { - question: "Capital of France?".to_string(), - }, - QAOutput { - answer: "Paris".to_string(), - }, - ), - ]; - - let copro = COPRO::builder() - .breadth(10) - .depth(3) - .build(); - let metric = ExactMatchMetric; - -copro.compile::(&mut module, trainset, &metric).await?; - - Ok(()) -} -``` - -### Typed Data Loading - -Use the shared data ingress guide: [`DataLoader`](/docs/data/dataloader). - -## When to Use COPRO - -**Best for:** -- Quick iteration cycles -- Simple tasks -- Limited compute budget -- When you need results fast - -**Avoid when:** -- You need best possible quality (use MIPROv2 or GEPA) -- Task has complex failure modes (use GEPA) -- You want to leverage prompting best practices (use MIPROv2) - -## Comparison with Other Optimizers - -| Feature | COPRO | MIPROv2 | GEPA | -|---------|-------|---------|------| -| **Speed** | Fast | Slow | Medium | -| **Quality** | Good | Better | Best | -| **Feedback** | Score | Score | Score + Text | -| **Diversity** | Low | Medium | High | -| **Setup** | Simple | Moderate | Moderate | - -## Configuration Details - -### Breadth -Number of candidate prompts generated at each iteration. Higher breadth means more exploration but more compute. - -Recommended: 5-15 - -### Depth -Number of refinement iterations. Each iteration builds on the best candidates from the previous one. - -Recommended: 2-5 - -### Temperature -Controls randomness in prompt generation. Higher temperature means more diverse candidates. - -Recommended: 1.0-1.5 - -### Track Stats -When enabled, COPRO tracks detailed statistics about all evaluated candidates and their scores over time. - -## Implementation Notes - -COPRO maintains: -- All evaluated candidates with their scores -- Best candidates from each iteration -- History of improvements over iterations - -The algorithm avoids re-evaluating candidates it has already seen by caching results. - -## Examples - - - See examples 03-evaluate-hotpotqa.rs and 04-optimize-hotpotqa.rs - diff --git a/docs/docs/optimizers/gepa-llm-judge.mdx b/docs/docs/optimizers/gepa-llm-judge.mdx index d56f5fa6..2439b56b 100644 --- a/docs/docs/optimizers/gepa-llm-judge.mdx +++ b/docs/docs/optimizers/gepa-llm-judge.mdx @@ -39,7 +39,7 @@ Better Task LM prompt ## Complete Example Walkthrough - + See the complete implementation with step-by-step comments diff --git a/docs/docs/optimizers/gepa.mdx b/docs/docs/optimizers/gepa.mdx index 0f702772..d2d7bb1b 100644 --- a/docs/docs/optimizers/gepa.mdx +++ b/docs/docs/optimizers/gepa.mdx @@ -16,7 +16,7 @@ GEPA is a reflective optimizer that adaptively evolves textual components (such ## What Makes GEPA Unique? -Unlike traditional optimizers (COPRO, MIPROv2), GEPA introduces several key innovations: +Unlike score-only prompt search, GEPA uses rich textual feedback to guide candidate evolution: ### 1. Rich Textual Feedback Instead of just scalar scores (0.8, 0.9), GEPA leverages detailed explanations: @@ -49,7 +49,12 @@ Can optimize at test time, not just training time. ### 1. Implement a Typed Metric with Feedback ```rust -use dspy_rs::*; +use dsrs_core::{Example, Module, PredictError}; +use dsrs_evaluate::{feedback_helpers, MetricOutcome, TypedMetric}; +use dsrs_gepa::GEPAOptimizer; +use dsrs_lm::{ChatAdapter, LM}; +use dsrs_macros::Signature; +use dsrs_predict::Predict; #[derive(Builder, facet::Facet)] #[facet(crate = facet)] @@ -117,7 +122,7 @@ DSRs provides utilities for common feedback patterns: ### Document Retrieval ```rust -use dspy_rs::feedback_helpers::retrieval_feedback; +use dsrs_evaluate::feedback_helpers::retrieval_feedback; let feedback = retrieval_feedback( &retrieved_docs, @@ -133,7 +138,7 @@ let feedback = retrieval_feedback( ### Code Generation ```rust -use dspy_rs::feedback_helpers::{code_pipeline_feedback, CodeStage, StageResult}; +use dsrs_evaluate::feedback_helpers::{code_pipeline_feedback, CodeStage, StageResult}; let stages = vec![ (CodeStage::Parse, StageResult::Success), @@ -153,7 +158,7 @@ let feedback = code_pipeline_feedback(&stages, 0.6); ### Multi-Objective Optimization ```rust -use dspy_rs::feedback_helpers::multi_objective_feedback; +use dsrs_evaluate::feedback_helpers::multi_objective_feedback; let mut objectives = HashMap::new(); objectives.insert("accuracy".to_string(), (0.9, "High accuracy".to_string())); @@ -330,7 +335,7 @@ GEPA::builder() ## Examples - + Basic GEPA usage with explicit feedback for sentiment classification @@ -338,30 +343,6 @@ GEPA::builder() Using an LLM judge to automatically generate feedback -## Comparison with Other Optimizers - -| Feature | COPRO | MIPROv2 | GEPA | -|----------------------|-------|---------|------| -| **Feedback Type** | Score | Score | Score + Text | -| **Selection Strategy** | Best | Batch | Pareto | -| **Diversity** | Low | Medium | High | -| **Actionability** | Low | Medium | High | -| **Compute Cost** | Low | Medium | Medium-High | -| **Sample Efficiency** | Medium | High | Very High | - -### When to Use GEPA - -- Complex tasks with subtle failure modes -- When you can provide rich feedback -- Multi-objective optimization -- Need for diverse solutions -- Inference-time search - -### When to Use Alternatives - -- **COPRO**: Simple tasks, quick iteration -- **MIPROv2**: Best prompting practices, single objective - ## Troubleshooting ### Issue: "GEPA requires feedback for every evaluated example" @@ -423,4 +404,4 @@ let best_outputs = result.best_outputs_valset; - [GEPA Paper](https://arxiv.org/abs/2507.19457) - [GEPA GitHub](https://github.com/gepa-ai/gepa) - [LLM-as-Judge Pattern](/docs/optimizers/gepa-llm-judge) -- [Example Code](https://github.com/krypticmouse/DSRs/blob/main/crates/dspy-rs/examples/09-gepa-sentiment.rs) +- [Example Code](https://github.com/krypticmouse/DSRs/blob/main/crates/dsrs-gepa/examples/09-gepa-sentiment.rs) diff --git a/docs/docs/optimizers/miprov2.mdx b/docs/docs/optimizers/miprov2.mdx deleted file mode 100644 index 9c3e9655..00000000 --- a/docs/docs/optimizers/miprov2.mdx +++ /dev/null @@ -1,153 +0,0 @@ ---- -title: 'MIPROv2' -description: 'LLM-guided prompt generation with best practices' -icon: 'wand-magic-sparkles' ---- - -MIPROv2 (Multi-prompt Instruction Proposal Optimizer v2) is an optimizer that uses LLMs to generate and refine prompts automatically. It differs from COPRO by using an LLM to understand the program and apply prompting best practices, rather than just iterating on existing instructions. - -## How it Works - -### Stage 1: Trace Generation - -```rust -async fn generate_traces( - &self, - module: &M, - examples: &[Example], -) -> Result> -``` - -- Runs the existing program with training examples -- Captures input/output pairs along with evaluation scores -- Creates a dataset of execution traces that show current behavior - -### Stage 2: Candidate Prompt Generation - -This stage has two sub-steps: - -First, an LLM analyzes the signature and traces to generate a program description. Then it uses that description along with prompting tips to generate candidate instructions. - -The prompting tips library includes: -- Use clear, specific language -- Consider chain-of-thought for complex tasks -- Specify output formats -- Use role-playing when appropriate -- Handle edge cases explicitly -- Request structured outputs when needed - -### Stage 3: Evaluation and Selection - -- Evaluates each candidate on a minibatch of examples -- Computes performance scores -- Selects the best performing candidate -- Applies it to the module - -## Configuration - -Default settings: -```rust -let optimizer = MIPROv2::builder() - .num_candidates(10) - .minibatch_size(25) - .temperature(1.0) - .track_stats(true) - .build(); -``` - -You can adjust: -- Number of candidates to generate -- Minibatch size for evaluation -- Temperature for generation diversity -- Whether to display progress stats - -## Usage Example - -```rust -use dspy_rs::{MIPROv2, Optimizer}; - -// Create optimizer -let optimizer = MIPROv2::builder() - .num_candidates(10) - .num_trials(20) - .minibatch_size(25) - .build(); - -// Typed metric implementing TypedMetric -let metric = ExactMatchMetric; - -// Optimize your module -optimizer.compile(&mut module, train_examples, &metric).await?; -``` - -### Typed Data Loading - -Use the shared data ingress guide: [`DataLoader`](/docs/data/dataloader). - -## Comparison: COPRO vs MIPROv2 vs GEPA - -| Feature | COPRO | MIPROv2 | GEPA | -|---------|-------|---------|------| -| **Approach** | Iterative refinement | LLM-guided generation | Reflective evolution | -| **Feedback** | Score only | Score only | Score + Text | -| **Selection** | Best candidate | Batch evaluation | Pareto frontier | -| **LLM calls** | Moderate | High | Medium-High | -| **Speed** | Faster | Slower | Medium | -| **Diversity** | Low | Medium | High | -| **Best for** | Quick iteration | Best results | Complex tasks | - -### When to Use MIPROv2 - -- You have decent training data (15+ examples recommended) -- Quality matters more than speed -- Task benefits from prompting best practices -- Need LLM-generated program understanding - -### When to Use COPRO - -- You need fast iteration -- Compute budget is limited -- Task is straightforward - -### When to Use GEPA - -- Complex tasks with subtle failure modes -- You can provide rich feedback -- Multi-objective optimization -- Need diverse solutions - -## Implementation Notes - -The code follows standard Rust practices: -- No unsafe blocks -- Results for error handling with context via anyhow -- Strong types (Trace, PromptCandidate, PromptingTips) -- Builder pattern for configuration -- Async throughout, no blocking calls - -Key types: -- `Trace` - Input/output pair with evaluation score -- `PromptCandidate` - Instruction text with score -- `PromptingTips` - Library of best practices - -## Testing - -Run tests: -```bash -cargo test test_miprov2 -``` - -There are 29 test cases covering trace generation, candidate selection, configuration, and edge cases. - -## Example - - - Complete working example with HuggingFace data loading - - -The example loads data, measures baseline performance, runs optimization, and shows the improvement. - -## References - -- [DSPy Framework](https://github.com/stanfordnlp/dspy) -- [DSPy Paper](https://arxiv.org/abs/2310.03714) diff --git a/docs/index.mdx b/docs/index.mdx index b1afda58..45d7f95c 100644 --- a/docs/index.mdx +++ b/docs/index.mdx @@ -63,6 +63,6 @@ Learn about the foundational concepts of DSRs icon="wand-magic-sparkles" href="/docs/optimizers/copro" > - Optimize your LM applications with COPRO, MIPROv2, and GEPA. + Optimize your LM applications with GEPA today, with Leaven integration scaffolding separated in `dsrs-leaven`. diff --git a/tools/coverage-runtime.mjs b/tools/coverage-runtime.mjs new file mode 100755 index 00000000..c5f75c37 --- /dev/null +++ b/tools/coverage-runtime.mjs @@ -0,0 +1,139 @@ +#!/usr/bin/env node +import { mkdirSync, readFileSync, writeFileSync } from "node:fs"; +import { spawnSync } from "node:child_process"; + +const allPackages = [ + "dsrs-core", + "dsrs-cache", + "dsrs-trace", + "dsrs-lm", + "dsrs-predict", + "dsrs-evaluate", + "dsrs-gepa", + "dsrs-data", + "dsrs-leaven", +]; + +const outDir = "target/llvm-cov"; +const packageArgs = process.argv + .map((arg, index, args) => (arg === "--package" || arg === "-p" ? args[index + 1] : null)) + .filter(Boolean); +const packages = packageArgs.length > 0 ? packageArgs : allPackages; +const runBranch = process.argv.includes("--branch") || process.argv.includes("--strict-branch"); +const strictBranch = process.argv.includes("--strict-branch"); +mkdirSync(outDir, { recursive: true }); + +function run(args) { + const result = spawnSync("cargo", args, { stdio: "inherit" }); + return result.status ?? 1; +} + +function coveragePercent(summary, key) { + const totals = summary.data[0].totals[key]; + return { + covered: totals.covered, + count: totals.count, + percent: totals.percent.toFixed(2), + }; +} + +function readSummary(path) { + return JSON.parse(readFileSync(path, "utf8")); +} + +function formatMetric(metric) { + return `${metric.covered}/${metric.count} (${metric.percent}%)`; +} + +const lines = [ + "# Runtime Coverage Summary", + "", + "Generated by `tools/coverage-runtime.mjs`.", + "", + "## Line Coverage", + "", + "| Package | Lines | Functions | Regions |", + "|---|---:|---:|---:|", +]; + +let failed = false; + +run(["llvm-cov", "clean", "--workspace"]); + +for (const pkg of packages) { + const outputPath = `${outDir}/${pkg}-line-summary.json`; + const status = run([ + "llvm-cov", + "-p", + pkg, + "--json", + "--summary-only", + "--output-path", + outputPath, + ]); + + if (status !== 0) { + failed = true; + lines.push(`| \`${pkg}\` | FAILED | FAILED | FAILED |`); + continue; + } + + const summary = readSummary(outputPath); + lines.push( + `| \`${pkg}\` | ${formatMetric(coveragePercent(summary, "lines"))} | ${formatMetric( + coveragePercent(summary, "functions"), + )} | ${formatMetric(coveragePercent(summary, "regions"))} |`, + ); +} + +lines.push("", "## Branch Coverage", ""); + +if (!runBranch) { + lines.push( + "Not run in this invocation. Pass `--branch` to record branch summaries, or `--strict-branch` to fail on any branch-report failure.", + ); + writeFileSync(`${outDir}/runtime-coverage-summary.md`, `${lines.join("\n")}\n`); + console.log(`Wrote ${outDir}/runtime-coverage-summary.md`); + process.exit(failed ? 1 : 0); +} + +lines.push( + "Branch coverage is attempted package-by-package. The current llvm-cov reporter can segfault on some crates; failures are recorded here instead of being hidden. Pass `--strict-branch` to make any branch-report failure fail this script.", + "", + "| Package | Branches | Status |", + "|---|---:|---|", +); + +let branchFailed = false; + +run(["llvm-cov", "clean", "--workspace"]); + +for (const pkg of packages) { + const outputPath = `${outDir}/${pkg}-branch-summary.json`; + const status = run([ + "llvm-cov", + "-p", + pkg, + "--json", + "--summary-only", + "--branch", + "--output-path", + outputPath, + ]); + + if (status !== 0) { + branchFailed = true; + lines.push(`| \`${pkg}\` | n/a | reporter failed with status ${status} |`); + continue; + } + + const summary = readSummary(outputPath); + lines.push(`| \`${pkg}\` | ${formatMetric(coveragePercent(summary, "branches"))} | ok |`); +} + +writeFileSync(`${outDir}/runtime-coverage-summary.md`, `${lines.join("\n")}\n`); +console.log(`Wrote ${outDir}/runtime-coverage-summary.md`); + +if (failed || (strictBranch && branchFailed)) { + process.exit(1); +}