
Commit 733cc0c

fix: resolve merge conflicts

Signed-off-by: Jake LoRocco <jake.lorocco@ibm.com>
2 parents 798ac76 + 642ec7c

174 files changed: 10217 additions & 5282 deletions


.github/CODEOWNERS

Lines changed: 2 additions & 2 deletions
```diff
@@ -1,5 +1,5 @@
-# Default: request review from contributors
-* @generative-computing/mellea-contributors
+# Default: request review from maintainers
+* @generative-computing/mellea-maintainers
 
 # Mellea Core requires special review
 /mellea/core/ @nrfulton @jakelorocco
```

.github/workflows/docs-publish.yml

Lines changed: 5 additions & 0 deletions
```diff
@@ -81,6 +81,11 @@ jobs:
       - name: Generate API documentation
         run: uv run python tooling/docs-autogen/build.py
 
+      # -- Run docs-autogen unit tests ------------------------------------------
+
+      - name: Run CLI reference tests
+        run: uv run pytest tooling/docs-autogen/test_cli_reference.py -v --tb=short
+
       # -- Validate static docs ------------------------------------------------
 
       - name: Lint static docs (markdownlint)
```

.github/workflows/quality.yml

Lines changed: 2 additions & 0 deletions
```diff
@@ -41,6 +41,8 @@ jobs:
       - name: Send failure message pre-commit
         if: failure() # This step will only run if a previous step failed
         run: echo "The quality verification failed. Please run precommit "
+      - name: Download NLTK data
+        run: uv run python -m nltk.downloader punkt_tab
       - name: Install Ollama
         run: curl -fsSL https://ollama.com/install.sh | sh
       - name: Start serving ollama
```

.gitignore

Lines changed: 3 additions & 1 deletion
```diff
@@ -381,6 +381,7 @@ celerybeat.pid
 # Environments
 .env
 .venv
+.*-venv
 env/
 venv/
 ENV/
@@ -454,7 +455,8 @@ pyrightconfig.json
 .claude/*
 !.claude/settings.json
 
-# Generated API documentation (built by tooling/docs-autogen/)
+# Generated documentation (built by tooling/docs-autogen/)
 docs/docs/api/
 docs/docs/api-reference.mdx
+docs/docs/reference/cli.md
 .venv-docs-autogen/
```
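The new `.*-venv` pattern ignores any dot-prefixed directory ending in `-venv`. A quick sketch of the matching behavior using Python's stdlib `fnmatch` (an approximation; real gitignore globbing differs in edge cases such as `**`), which also shows why `.venv-docs-autogen/` still needs its own explicit entry:

```python
# Illustration only (not part of the commit): gitignore-style glob
# matching approximated with Python's stdlib fnmatch. For a flat
# pattern like ".*-venv" the behavior matches gitignore: a leading
# dot, anything in the middle, and a "-venv" suffix.
from fnmatch import fnmatch

pattern = ".*-venv"
names = [".venv-docs-autogen", ".docs-venv", ".venv", "my-venv"]

# Only ".docs-venv" matches: ".venv-docs-autogen" lacks the "-venv"
# suffix, ".venv" lacks the hyphen, and "my-venv" lacks the dot.
ignored = [n for n in names if fnmatch(n, pattern)]
print(ignored)  # ['.docs-venv']
```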

AGENTS.md

Lines changed: 68 additions & 2 deletions
````diff
@@ -51,7 +51,7 @@ Tests use a four-tier granularity system (`unit`, `integration`, `e2e`, `qualita
 
 See **[test/MARKERS_GUIDE.md](test/MARKERS_GUIDE.md)** for the full marker reference (tier definitions, backend markers, resource gates, auto-skip logic, common patterns).
 
-**Examples in `docs/examples/`** use comment-based markers:
+**Examples in `docs/examples/`** are opt-in — unlike `test/` files (auto-collected, default `unit`), examples require an explicit `# pytest:` comment to be collected. Files without this comment are silently ignored (they won't appear in skip summaries either). This is because examples have variable dependencies and limited setup:
 ```python
 # pytest: e2e, ollama, qualitative
 """Example description..."""
@@ -88,7 +88,8 @@ mkdir -p .bob && ln -s ../.agents/skills .bob/skills
 - Use `...` in `@generative` function bodies
 - Prefer primitives over classes
 - **Friendly Dependency Errors**: Wraps optional backend imports in `try/except ImportError` with a helpful message (e.g., "Please pip install mellea[hf]"). See `mellea/stdlib/session.py` for examples.
-- **Backend telemetry fields**: All backends must populate `mot.usage` (dict with `prompt_tokens`, `completion_tokens`, `total_tokens`), `mot.model` (str), and `mot.provider` (str) in their `post_processing()` method. Metrics are automatically recorded by `TokenMetricsPlugin` — don't add manual `record_token_usage_metrics()` calls.
+- **CLI command docstrings**: Typer command functions in `cli/` follow an enriched convention with `Prerequisites:` and `See Also:` sections — these feed the auto-generated CLI reference page. See [`docs/docs/guide/CONTRIBUTING.md`](docs/docs/guide/CONTRIBUTING.md) for the full pattern. Regenerate after changes: `uv run poe clidocs`. Test the generator: `uv run pytest tooling/docs-autogen/test_cli_reference.py -v`. Full pipeline docs: [`tooling/docs-autogen/README.md`](tooling/docs-autogen/README.md).
+- **Backend telemetry fields**: All backends must populate `mot.usage` (dict with `prompt_tokens`, `completion_tokens`, `total_tokens`), `mot.model` (str), and `mot.provider` (str) in their `post_processing()` method. `mot.streaming` (bool) and `mot.ttfb_ms` (float | None) are set automatically in `astream()` — backends do not need to set them. Metrics are automatically recorded by `TokenMetricsPlugin` and `LatencyMetricsPlugin` — don't add manual `record_token_usage_metrics()` or `record_request_duration()` calls.
 
 ## 6. Commits & Hooks
 [Angular format](https://github.com/angular/angular/blob/main/CONTRIBUTING.md#commit): `feat:`, `fix:`, `docs:`, `test:`, `refactor:`, `release:`
@@ -146,3 +147,68 @@ Found a bug, workaround, or pattern? Update the docs:
 - **Issue/workaround?** → Add to Section 7 (Common Issues) in this file
 - **Usage pattern?** → Add to [`docs/AGENTS_TEMPLATE.md`](docs/AGENTS_TEMPLATE.md)
 - **New pitfall?** → Add warning near relevant section
+
+## 13. Working with Intrinsics
+
+Intrinsics are specialized LoRA adapters that add task-specific capabilities (RAG evaluation, safety checks, calibration, etc.) to Granite models. Mellea handles adapter loading and input formatting automatically — you just call the right function.
+
+### Using Intrinsics in Mellea
+
+**Prefer the high-level wrappers** in `mellea/stdlib/components/intrinsic/`. These handle adapter loading, context formatting, and output parsing for you:
+
+| Module | Function | Description |
+|--------|----------|-------------|
+| `core` | `check_certainty(context, backend)` | Model certainty about its last response (0–1) |
+| `core` | `requirement_check(context, backend, requirement)` | Whether text meets a requirement (0–1) |
+| `core` | `find_context_attributions(response, documents, context, backend)` | Sentences that influenced the response |
+| `rag` | `check_answerability(question, documents, context, backend)` | Whether documents can answer a question (0–1) |
+| `rag` | `rewrite_question(question, context, backend)` | Rewrite question into a retrieval query |
+| `rag` | `clarify_query(question, documents, context, backend)` | Generate clarification or return "CLEAR" |
+| `rag` | `find_citations(response, documents, context, backend)` | Document sentences supporting the response |
+| `rag` | `check_context_relevance(question, document, context, backend)` | Whether a document is relevant (0–1) |
+| `rag` | `flag_hallucinated_content(response, documents, context, backend)` | Flag potentially hallucinated sentences |
+
+```python
+from mellea.backends.huggingface import LocalHFBackend
+from mellea.stdlib.components import Message
+from mellea.stdlib.components.intrinsic import core
+from mellea.stdlib.context import ChatContext
+
+backend = LocalHFBackend(model_id="ibm-granite/granite-4.0-micro")
+context = (
+    ChatContext()
+    .add(Message("user", "What is the square root of 4?"))
+    .add(Message("assistant", "The square root of 4 is 2."))
+)
+score = core.check_certainty(context, backend)
+```
+
+For lower-level control (custom adapters, model options), use `mfuncs.act()` with `Intrinsic` directly — see examples in `docs/examples/intrinsics/`.
+
+### Project Resources
+
+- **Canonical catalog**: `mellea/backends/adapters/catalog.py` — source of truth for intrinsic names, HF repo IDs, and adapter types
+- **Usage examples**: `docs/examples/intrinsics/` — working code for every intrinsic
+- **Helper functions**: `mellea/stdlib/components/intrinsic/rag.py` and `core.py`
+
+### Adding New Intrinsics
+
+When adding support for a new intrinsic (not just using an existing one), fetch its README from Hugging Face first. Each README contains the authoritative spec for input/output format, intended use, and examples.
+
+**Writing examples?** The HF READMEs also document intended usage patterns and example inputs — useful reference when writing code in `docs/examples/intrinsics/`.
+
+| Repo | Purpose | Intrinsics |
+|------|---------|------------|
+| [`ibm-granite/granitelib-rag-r1.0`](https://huggingface.co/ibm-granite/granitelib-rag-r1.0) | RAG pipeline | answerability, citations, context_relevance, hallucination_detection, query_rewrite, query_clarification |
+| [`ibm-granite/granitelib-core-r1.0`](https://huggingface.co/ibm-granite/granitelib-core-r1.0) | Core capabilities | context-attribution, requirement-check, uncertainty |
+| [`ibm-granite/granitelib-guardian-r1.0`](https://huggingface.co/ibm-granite/granitelib-guardian-r1.0) | Safety & compliance | guardian-core, policy-guardrails, factuality-detection, factuality-correction |
+
+**README URLs** — RAG intrinsics (no model subfolder):
+```
+https://huggingface.co/ibm-granite/granitelib-rag-r1.0/blob/main/{intrinsic_name}/README.md
+```
+
+Core and Guardian intrinsics (include model subfolder):
+```
+https://huggingface.co/ibm-granite/granitelib-{core,guardian}-r1.0/blob/main/{intrinsic_name}/granite-4.0-micro/README.md
+```
````

CHANGELOG.md

Lines changed: 47 additions & 0 deletions
```diff
@@ -1,3 +1,50 @@
+## [v0.4.2](https://github.com/generative-computing/mellea/releases/tag/v0.4.2) - 2026-04-08
+
+<!-- Release notes generated using configuration in .github/release.yml at main -->
+
+## What's Changed
+### New Features
+* feat: add tests for mellea optional dependencies by @jakelorocco in https://github.com/generative-computing/mellea/pull/724
+* feat: further vram optimizations by @avinash2692 in https://github.com/generative-computing/mellea/pull/765
+* feat: (m decomp) M Decompose Readme and Docstring Updates by @csbobby in https://github.com/generative-computing/mellea/pull/767
+* feat: add top level async streaming by @jakelorocco in https://github.com/generative-computing/mellea/pull/655
+* feat(serve): improve OpenAI API compatibility with usage, finish_reas… by @markstur in https://github.com/generative-computing/mellea/pull/771
+* feat: removing vllm backend by @avinash2692 in https://github.com/generative-computing/mellea/pull/781
+### Bug Fixes
+* fix: modifications to granite formatter tests by @jakelorocco in https://github.com/generative-computing/mellea/pull/703
+* fix: exclude tooling from mypy check by @planetf1 in https://github.com/generative-computing/mellea/pull/748
+* fix: setting ollama host in conftest by @avinash2692 in https://github.com/generative-computing/mellea/pull/751
+* fix: Add qualitative and slow markers so the example is skipped by @markstur in https://github.com/generative-computing/mellea/pull/764
+* fix(tools): correct args validation in langchain tool wrapper by @markstur in https://github.com/generative-computing/mellea/pull/761
+* fix: remove references to old pytest markers by @jakelorocco in https://github.com/generative-computing/mellea/pull/776
+* fix: add error handling to OpenAI-compatible serve endpoint by @markstur in https://github.com/generative-computing/mellea/pull/774
+* fix: assertion for test_find_context_attributions and range for hallucination detection by @jakelorocco in https://github.com/generative-computing/mellea/pull/779
+* fix: add xfail to citation test; functionality is tested elsewhere by @jakelorocco in https://github.com/generative-computing/mellea/pull/787
+### Documentation
+* docs: remove discord link in main readme by @AngeloDanducci in https://github.com/generative-computing/mellea/pull/720
+* docs: note virtual environment requirement for pre-commit hooks by @ajbozarth in https://github.com/generative-computing/mellea/pull/745
+* docs: condense README to elevator pitch (#478) by @planetf1 in https://github.com/generative-computing/mellea/pull/688
+* docs: update qiskit_code_validation example defaults by @ajbozarth in https://github.com/generative-computing/mellea/pull/743
+* docs: remove pre-IVR validation and update readme with v2 benchmark results by @ajbozarth in https://github.com/generative-computing/mellea/pull/769
+### Other Changes
+* docs: add multi-turn strategy option to Qiskit code validation example by @vabarbosa in https://github.com/generative-computing/mellea/pull/717
+* chore: use github tooling to build release notes by @psschwei in https://github.com/generative-computing/mellea/pull/710
+* docs: add release.md by @psschwei in https://github.com/generative-computing/mellea/pull/723
+* fix: proper permissions on pr labeling job by @psschwei in https://github.com/generative-computing/mellea/pull/741
+* ci: memory management in tests by @avinash2692 in https://github.com/generative-computing/mellea/pull/721
+* chore: enforce commit formatting on PR titles by @psschwei in https://github.com/generative-computing/mellea/pull/750
+* chore: Update HF repo names by @frreiss in https://github.com/generative-computing/mellea/pull/753
+* ci: drop mergify, add release entry to pr-labels action by @psschwei in https://github.com/generative-computing/mellea/pull/752
+* ci: fix to make pr label job required check by @psschwei in https://github.com/generative-computing/mellea/pull/756
+* test: agent skills infrastructure and marker taxonomy audit (#727, #728) by @planetf1 in https://github.com/generative-computing/mellea/pull/742
+* chore: add governance doc by @psschwei in https://github.com/generative-computing/mellea/pull/786
+* chore: updating governance doc to use maintainers by @psschwei in https://github.com/generative-computing/mellea/pull/791
+
+## New Contributors
+* @markstur made their first contribution in https://github.com/generative-computing/mellea/pull/764
+
+**Full Changelog**: https://github.com/generative-computing/mellea/compare/v0.4.1...v0.4.2
+
 ## [v0.4.1](https://github.com/generative-computing/mellea/releases/tag/v0.4.1) - 2026-03-23
 
 ### Feature
```

CONTRIBUTING.md

Lines changed: 37 additions & 5 deletions
````diff
@@ -354,12 +354,44 @@ uv run ruff check .
 ### Required Models
 
 #### Ollama
-- `granite4:micro-h`
-- `granite3.2-vision`
-- `granite4:micro`
-- `qwen2.5vl:7b`
 
-_Note: ollama models can be obtained by running `ollama pull <model>`_
+HuggingFace and cloud backends download or host models automatically. Ollama
+models must be pulled locally before running the tests that need them.
+
+**CI (unit + integration tests):**
+
+- `granite4:micro` — default model for `start_session()` and most examples
+- `granite4:micro-h` — hybrid variant used by conftest fixtures
+
+**Examples (`docs/examples/`):**
+
+- `deepseek-r1:8b` — safety / guardian examples
+- `granite3-guardian:2b` — mini-researcher guardian backend
+- `granite3.2-vision` — vision (Ollama chat) example
+- `granite3.3:8b` — m\_decompose example
+- `granite4:latest` — melp examples
+- `llama3.2` — repair-with-guardian example
+- `llama3.2:3b` — tutorial / mify examples (via `META_LLAMA_3_2_3B`)
+- `qwen2.5vl:7b` — vision (OpenAI-via-Ollama) example
+
+**Additional test models (`test/`):**
+
+- `granite4:small-h` — hybrid-small tests
+- `llama3.2:1b` — lightweight inference tests
+- `llama3:8b` — legacy Llama 3 tests
+- `llava` — multimodal tests
+- `mistral:7b` — Mistral backend tests
+- `smollm2:1.7b` — SmolLM tests
+
+Pull everything:
+
+```bash
+for m in granite4:micro granite4:micro-h deepseek-r1:8b \
+  granite3-guardian:2b granite3.2-vision granite3.3:8b granite4:latest \
+  llama3.2 llama3.2:3b \
+  qwen2.5vl:7b granite4:small-h llama3.2:1b llama3:8b llava mistral:7b \
+  smollm2:1.7b; do ollama pull "$m"; done
+```
 
 ### Test Markers
 
````