
Commit 733cc0c

fix: resolve merge conflicts

Signed-off-by: Jake LoRocco <jake.lorocco@ibm.com>
2 parents 798ac76 + 642ec7c

174 files changed: 10217 additions & 5282 deletions


.github/CODEOWNERS

Lines changed: 2 additions & 2 deletions
```diff
@@ -1,5 +1,5 @@
-# Default: request review from contributors
-* @generative-computing/mellea-contributors
+# Default: request review from maintainers
+* @generative-computing/mellea-maintainers
 
 # Mellea Core requires special review
 /mellea/core/ @nrfulton @jakelorocco
```

.github/workflows/docs-publish.yml

Lines changed: 5 additions & 0 deletions
```diff
@@ -81,6 +81,11 @@ jobs:
       - name: Generate API documentation
         run: uv run python tooling/docs-autogen/build.py
 
+      # -- Run docs-autogen unit tests ------------------------------------------
+
+      - name: Run CLI reference tests
+        run: uv run pytest tooling/docs-autogen/test_cli_reference.py -v --tb=short
+
       # -- Validate static docs ------------------------------------------------
 
       - name: Lint static docs (markdownlint)
```

.github/workflows/quality.yml

Lines changed: 2 additions & 0 deletions
```diff
@@ -41,6 +41,8 @@ jobs:
       - name: Send failure message pre-commit
         if: failure() # This step will only run if a previous step failed
         run: echo "The quality verification failed. Please run precommit "
+      - name: Download NLTK data
+        run: uv run python -m nltk.downloader punkt_tab
       - name: Install Ollama
         run: curl -fsSL https://ollama.com/install.sh | sh
       - name: Start serving ollama
```

.gitignore

Lines changed: 3 additions & 1 deletion
```diff
@@ -381,6 +381,7 @@ celerybeat.pid
 # Environments
 .env
 .venv
+.*-venv
 env/
 venv/
 ENV/
@@ -454,7 +455,8 @@ pyrightconfig.json
 .claude/*
 !.claude/settings.json
 
-# Generated API documentation (built by tooling/docs-autogen/)
+# Generated documentation (built by tooling/docs-autogen/)
 docs/docs/api/
 docs/docs/api-reference.mdx
+docs/docs/reference/cli.md
 .venv-docs-autogen/
```
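The new `.*-venv` pattern ignores any dot-prefixed directory ending in `-venv`. A quick sketch of the matching behavior using Python's stdlib `fnmatch` (an approximation; real gitignore globbing differs in edge cases such as `**`), which also shows why `.venv-docs-autogen/` still needs its own explicit entry:

```python
# Illustration only (not part of the commit): gitignore-style glob
# matching approximated with Python's stdlib fnmatch. For a flat
# pattern like ".*-venv" the behavior matches gitignore: a leading
# dot, anything in the middle, and a "-venv" suffix.
from fnmatch import fnmatch

pattern = ".*-venv"
names = [".venv-docs-autogen", ".docs-venv", ".venv", "my-venv"]

# Only ".docs-venv" matches: ".venv-docs-autogen" lacks the "-venv"
# suffix, ".venv" lacks the hyphen, and "my-venv" lacks the dot.
ignored = [n for n in names if fnmatch(n, pattern)]
print(ignored)  # ['.docs-venv']
```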

AGENTS.md

Lines changed: 68 additions & 2 deletions
````diff
@@ -51,7 +51,7 @@ Tests use a four-tier granularity system (`unit`, `integration`, `e2e`, `qualita
 
 See **[test/MARKERS_GUIDE.md](test/MARKERS_GUIDE.md)** for the full marker reference (tier definitions, backend markers, resource gates, auto-skip logic, common patterns).
 
-**Examples in `docs/examples/`** use comment-based markers:
+**Examples in `docs/examples/`** are opt-in — unlike `test/` files (auto-collected, default `unit`), examples require an explicit `# pytest:` comment to be collected. Files without this comment are silently ignored (they won't appear in skip summaries either). This is because examples have variable dependencies and limited setup:
 ```python
 # pytest: e2e, ollama, qualitative
 """Example description..."""
@@ -88,7 +88,8 @@ mkdir -p .bob && ln -s ../.agents/skills .bob/skills
 - Use `...` in `@generative` function bodies
 - Prefer primitives over classes
 - **Friendly Dependency Errors**: Wraps optional backend imports in `try/except ImportError` with a helpful message (e.g., "Please pip install mellea[hf]"). See `mellea/stdlib/session.py` for examples.
-- **Backend telemetry fields**: All backends must populate `mot.usage` (dict with `prompt_tokens`, `completion_tokens`, `total_tokens`), `mot.model` (str), and `mot.provider` (str) in their `post_processing()` method. Metrics are automatically recorded by `TokenMetricsPlugin` — don't add manual `record_token_usage_metrics()` calls.
+- **CLI command docstrings**: Typer command functions in `cli/` follow an enriched convention with `Prerequisites:` and `See Also:` sections — these feed the auto-generated CLI reference page. See [`docs/docs/guide/CONTRIBUTING.md`](docs/docs/guide/CONTRIBUTING.md) for the full pattern. Regenerate after changes: `uv run poe clidocs`. Test the generator: `uv run pytest tooling/docs-autogen/test_cli_reference.py -v`. Full pipeline docs: [`tooling/docs-autogen/README.md`](tooling/docs-autogen/README.md).
+- **Backend telemetry fields**: All backends must populate `mot.usage` (dict with `prompt_tokens`, `completion_tokens`, `total_tokens`), `mot.model` (str), and `mot.provider` (str) in their `post_processing()` method. `mot.streaming` (bool) and `mot.ttfb_ms` (float | None) are set automatically in `astream()` — backends do not need to set them. Metrics are automatically recorded by `TokenMetricsPlugin` and `LatencyMetricsPlugin` — don't add manual `record_token_usage_metrics()` or `record_request_duration()` calls.
 
 ## 6. Commits & Hooks
 [Angular format](https://github.com/angular/angular/blob/main/CONTRIBUTING.md#commit): `feat:`, `fix:`, `docs:`, `test:`, `refactor:`, `release:`
@@ -146,3 +147,68 @@ Found a bug, workaround, or pattern? Update the docs:
 - **Issue/workaround?** → Add to Section 7 (Common Issues) in this file
 - **Usage pattern?** → Add to [`docs/AGENTS_TEMPLATE.md`](docs/AGENTS_TEMPLATE.md)
 - **New pitfall?** → Add warning near relevant section
+
+## 13. Working with Intrinsics
+
+Intrinsics are specialized LoRA adapters that add task-specific capabilities (RAG evaluation, safety checks, calibration, etc.) to Granite models. Mellea handles adapter loading and input formatting automatically — you just call the right function.
+
+### Using Intrinsics in Mellea
+
+**Prefer the high-level wrappers** in `mellea/stdlib/components/intrinsic/`. These handle adapter loading, context formatting, and output parsing for you:
+
+| Module | Function | Description |
+|--------|----------|-------------|
+| `core` | `check_certainty(context, backend)` | Model certainty about its last response (0–1) |
+| `core` | `requirement_check(context, backend, requirement)` | Whether text meets a requirement (0–1) |
+| `core` | `find_context_attributions(response, documents, context, backend)` | Sentences that influenced the response |
+| `rag` | `check_answerability(question, documents, context, backend)` | Whether documents can answer a question (0–1) |
+| `rag` | `rewrite_question(question, context, backend)` | Rewrite question into a retrieval query |
+| `rag` | `clarify_query(question, documents, context, backend)` | Generate clarification or return "CLEAR" |
+| `rag` | `find_citations(response, documents, context, backend)` | Document sentences supporting the response |
+| `rag` | `check_context_relevance(question, document, context, backend)` | Whether a document is relevant (0–1) |
+| `rag` | `flag_hallucinated_content(response, documents, context, backend)` | Flag potentially hallucinated sentences |
+
+```python
+from mellea.backends.huggingface import LocalHFBackend
+from mellea.stdlib.components import Message
+from mellea.stdlib.components.intrinsic import core
+from mellea.stdlib.context import ChatContext
+
+backend = LocalHFBackend(model_id="ibm-granite/granite-4.0-micro")
+context = (
+    ChatContext()
+    .add(Message("user", "What is the square root of 4?"))
+    .add(Message("assistant", "The square root of 4 is 2."))
+)
+score = core.check_certainty(context, backend)
+```
+
+For lower-level control (custom adapters, model options), use `mfuncs.act()` with `Intrinsic` directly — see examples in `docs/examples/intrinsics/`.
+
+### Project Resources
+
+- **Canonical catalog**: `mellea/backends/adapters/catalog.py` — source of truth for intrinsic names, HF repo IDs, and adapter types
+- **Usage examples**: `docs/examples/intrinsics/` — working code for every intrinsic
+- **Helper functions**: `mellea/stdlib/components/intrinsic/rag.py` and `core.py`
+
+### Adding New Intrinsics
+
+When adding support for a new intrinsic (not just using an existing one), fetch its README from Hugging Face first. Each README contains the authoritative spec for input/output format, intended use, and examples.
+
+**Writing examples?** The HF READMEs also document intended usage patterns and example inputs — useful reference when writing code in `docs/examples/intrinsics/`.
+
+| Repo | Purpose | Intrinsics |
+|------|---------|------------|
+| [`ibm-granite/granitelib-rag-r1.0`](https://huggingface.co/ibm-granite/granitelib-rag-r1.0) | RAG pipeline | answerability, citations, context_relevance, hallucination_detection, query_rewrite, query_clarification |
+| [`ibm-granite/granitelib-core-r1.0`](https://huggingface.co/ibm-granite/granitelib-core-r1.0) | Core capabilities | context-attribution, requirement-check, uncertainty |
+| [`ibm-granite/granitelib-guardian-r1.0`](https://huggingface.co/ibm-granite/granitelib-guardian-r1.0) | Safety & compliance | guardian-core, policy-guardrails, factuality-detection, factuality-correction |
+
+**README URLs** — RAG intrinsics (no model subfolder):
+```
+https://huggingface.co/ibm-granite/granitelib-rag-r1.0/blob/main/{intrinsic_name}/README.md
+```
+
+Core and Guardian intrinsics (include model subfolder):
+```
+https://huggingface.co/ibm-granite/granitelib-{core,guardian}-r1.0/blob/main/{intrinsic_name}/granite-4.0-micro/README.md
+```
````

CHANGELOG.md

Lines changed: 47 additions & 0 deletions
```diff
@@ -1,3 +1,50 @@
+## [v0.4.2](https://github.com/generative-computing/mellea/releases/tag/v0.4.2) - 2026-04-08
+
+<!-- Release notes generated using configuration in .github/release.yml at main -->
+
+## What's Changed
+### New Features
+* feat: add tests for mellea optional dependencies by @jakelorocco in https://github.com/generative-computing/mellea/pull/724
+* feat: further vram optimizations by @avinash2692 in https://github.com/generative-computing/mellea/pull/765
+* feat: (m decomp) M Decompose Readme and Docstring Updates by @csbobby in https://github.com/generative-computing/mellea/pull/767
+* feat: add top level async streaming by @jakelorocco in https://github.com/generative-computing/mellea/pull/655
+* feat(serve): improve OpenAI API compatibility with usage, finish_reas… by @markstur in https://github.com/generative-computing/mellea/pull/771
+* feat: removing vllm backend by @avinash2692 in https://github.com/generative-computing/mellea/pull/781
+### Bug Fixes
+* fix: modifications to granite formatter tests by @jakelorocco in https://github.com/generative-computing/mellea/pull/703
+* fix: exclude tooling from mypy check by @planetf1 in https://github.com/generative-computing/mellea/pull/748
+* fix: setting ollama host in conftest by @avinash2692 in https://github.com/generative-computing/mellea/pull/751
+* fix: Add qualitative and slow markers so the example is skipped by @markstur in https://github.com/generative-computing/mellea/pull/764
+* fix(tools): correct args validation in langchain tool wrapper by @markstur in https://github.com/generative-computing/mellea/pull/761
+* fix: remove references to old pytest markers by @jakelorocco in https://github.com/generative-computing/mellea/pull/776
+* fix: add error handling to OpenAI-compatible serve endpoint by @markstur in https://github.com/generative-computing/mellea/pull/774
+* fix: assertion for test_find_context_attributions and range for hallucination detection by @jakelorocco in https://github.com/generative-computing/mellea/pull/779
+* fix: add xfail to citation test; functionality is tested elsewhere by @jakelorocco in https://github.com/generative-computing/mellea/pull/787
+### Documentation
+* docs: remove discord link in main readme by @AngeloDanducci in https://github.com/generative-computing/mellea/pull/720
+* docs: note virtual environment requirement for pre-commit hooks by @ajbozarth in https://github.com/generative-computing/mellea/pull/745
+* docs: condense README to elevator pitch (#478) by @planetf1 in https://github.com/generative-computing/mellea/pull/688
+* docs: update qiskit_code_validation example defaults by @ajbozarth in https://github.com/generative-computing/mellea/pull/743
+* docs: remove pre-IVR validation and update readme with v2 benchmark results by @ajbozarth in https://github.com/generative-computing/mellea/pull/769
+### Other Changes
+* docs: add multi-turn strategy option to Qiskit code validation example by @vabarbosa in https://github.com/generative-computing/mellea/pull/717
+* chore: use github tooling to build release notes by @psschwei in https://github.com/generative-computing/mellea/pull/710
+* docs: add release.md by @psschwei in https://github.com/generative-computing/mellea/pull/723
+* fix: proper permissions on pr labeling job by @psschwei in https://github.com/generative-computing/mellea/pull/741
+* ci: memory management in tests by @avinash2692 in https://github.com/generative-computing/mellea/pull/721
+* chore: enforce commit formatting on PR titles by @psschwei in https://github.com/generative-computing/mellea/pull/750
+* chore: Update HF repo names by @frreiss in https://github.com/generative-computing/mellea/pull/753
+* ci: drop mergify, add release entry to pr-labels action by @psschwei in https://github.com/generative-computing/mellea/pull/752
+* ci: fix to make pr label job required check by @psschwei in https://github.com/generative-computing/mellea/pull/756
+* test: agent skills infrastructure and marker taxonomy audit (#727, #728) by @planetf1 in https://github.com/generative-computing/mellea/pull/742
+* chore: add governance doc by @psschwei in https://github.com/generative-computing/mellea/pull/786
+* chore: updating governance doc to use maintainers by @psschwei in https://github.com/generative-computing/mellea/pull/791
+
+## New Contributors
+* @markstur made their first contribution in https://github.com/generative-computing/mellea/pull/764
+
+**Full Changelog**: https://github.com/generative-computing/mellea/compare/v0.4.1...v0.4.2
+
 ## [v0.4.1](https://github.com/generative-computing/mellea/releases/tag/v0.4.1) - 2026-03-23
 
 ### Feature
```

CONTRIBUTING.md

Lines changed: 37 additions & 5 deletions
````diff
@@ -354,12 +354,44 @@ uv run ruff check .
 ### Required Models
 
 #### Ollama
-- `granite4:micro-h`
-- `granite3.2-vision`
-- `granite4:micro`
-- `qwen2.5vl:7b`
 
-_Note: ollama models can be obtained by running `ollama pull <model>`_
+HuggingFace and cloud backends download or host models automatically. Ollama
+models must be pulled locally before running the tests that need them.
+
+**CI (unit + integration tests):**
+
+- `granite4:micro` — default model for `start_session()` and most examples
+- `granite4:micro-h` — hybrid variant used by conftest fixtures
+
+**Examples (`docs/examples/`):**
+
+- `deepseek-r1:8b` — safety / guardian examples
+- `granite3-guardian:2b` — mini-researcher guardian backend
+- `granite3.2-vision` — vision (Ollama chat) example
+- `granite3.3:8b` — m\_decompose example
+- `granite4:latest` — melp examples
+- `llama3.2` — repair-with-guardian example
+- `llama3.2:3b` — tutorial / mify examples (via `META_LLAMA_3_2_3B`)
+- `qwen2.5vl:7b` — vision (OpenAI-via-Ollama) example
+
+**Additional test models (`test/`):**
+
+- `granite4:small-h` — hybrid-small tests
+- `llama3.2:1b` — lightweight inference tests
+- `llama3:8b` — legacy Llama 3 tests
+- `llava` — multimodal tests
+- `mistral:7b` — Mistral backend tests
+- `smollm2:1.7b` — SmolLM tests
+
+Pull everything:
+
+```bash
+for m in granite4:micro granite4:micro-h deepseek-r1:8b \
+  granite3-guardian:2b granite3.2-vision granite3.3:8b granite4:latest \
+  llama3.2 llama3.2:3b \
+  qwen2.5vl:7b granite4:small-h llama3.2:1b llama3:8b llava mistral:7b \
+  smollm2:1.7b; do ollama pull "$m"; done
+```
 
 ### Test Markers
 
````