feat: RAG implementation and benchmark for Nabledge v6 (#383) #386
Draft
kiyotis wants to merge 50 commits into
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… tasks Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…v1.18.0 (#383)

- Add `--model` CLI arg (default: cohere.embed-multilingual-v3) for v4 swap without code changes
- Truncate texts to 2048 chars for v3 models (Bedrock rejects longer inputs)
- Bump Qdrant Docker image v1.13.4 → v1.18.0 to match qdrant-client 1.18.0
- Add _MODEL_VECTOR_SIZES dict and _MODEL_MAX_CHARS dict for per-model config
- Add 9 new unit tests (TestModelVectorSizes + TestEmbedTextsModelMaxChars): 36 total

complete task #1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…fied Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Single upsert of 9376 chunks (122 MB) exceeded Qdrant's limit. Batch into 500-point chunks per upsert call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
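The batching fix could look roughly like the sketch below. The batch size of 500 is from the commit message; `iter_batches` and `upsert_in_batches` are hypothetical names, and `client` is assumed to be any object exposing `upsert(collection_name=..., points=...)` as qdrant-client's `QdrantClient` does.

```python
from typing import Iterator, Sequence

UPSERT_BATCH_SIZE = 500  # batch size named in the commit message


def iter_batches(points: Sequence, size: int = UPSERT_BATCH_SIZE) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` points."""
    for start in range(0, len(points), size):
        yield points[start:start + size]


def upsert_in_batches(client, collection_name: str, points: Sequence) -> int:
    """Upsert points one batch at a time; returns the number of upsert calls made."""
    calls = 0
    for batch in iter_batches(points):
        # Assumption: `client` behaves like qdrant_client.QdrantClient here.
        client.upsert(collection_name=collection_name, points=batch)
        calls += 1
    return calls
```

For the 9376 chunks mentioned above this makes 19 calls: 18 full batches plus a final batch of 376 points.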
17 tests covering embed_query (search_query input_type), build_processing_type_filter (nablarch-batch+none OR filter), search_qdrant (query_points API), and format_results (path.json:sN section_ref format).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Embeds question with Cohere Embed (input_type=search_query) via Bedrock, builds processing_type OR "none" filter, queries Qdrant via query_points API, and returns QueryResult list with section_ref in path.json:sN format.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
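Two of the pure steps in this pipeline, the OR filter and the section_ref format, can be sketched without any network dependency. The filter is written here as a plain dict in Qdrant's JSON filter shape (`should` conditions with `match.value`); the PR may well build it with qdrant-client model classes instead. The payload key `processing_type`, the function signatures, and `format_section_ref` are assumptions for illustration.

```python
def build_processing_type_filter(processing_type: str) -> dict:
    """OR filter: match the requested processing type or the generic 'none' bucket."""
    values = [processing_type]
    if processing_type != "none":
        values.append("none")
    return {
        "should": [
            # Assumption: the payload field is called "processing_type".
            {"key": "processing_type", "match": {"value": value}}
            for value in values
        ]
    }


def format_section_ref(path: str, section_index: int) -> str:
    """Build a section reference in the path.json:sN format."""
    return f"{path}:s{section_index}"
```

Including "none" in the filter keeps sections that are not tied to any processing type eligible for every query.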
…ied)

Retrieves top-k sections via query.py, loads section content, builds RAG prompt with context, calls LLM (claude -p), parses e2e-prompt.md format output, saves workflow_details/answer/metrics/evaluation.json. Compatible with run_qa.py output structure.

Verified on pre-01: scores 0.90 correctness, 0.92 relevancy, 1.0 faithfulness.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
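The "builds RAG prompt with context" step might be sketched as below. This is an illustration only: `build_rag_prompt`, the prompt wording, and the `(section_ref, content)` input shape are all hypothetical, and the sketch does not reproduce the e2e-prompt.md format used by the PR.

```python
from typing import List, Tuple


def build_rag_prompt(question: str, sections: List[Tuple[str, str]]) -> str:
    """Assemble a prompt from (section_ref, content) pairs followed by the question."""
    # Label every context block with its section_ref so answers can cite sources.
    blocks = [f"[{ref}]\n{content.strip()}" for ref, content in sections]
    context = "\n\n".join(blocks) if blocks else "(no context retrieved)"
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )
```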
…, harden error handling (#383)

- Fix A: remove dead boto3 try/except at module level; rename _DEFAULT_EMBED_MODEL_ID, _DEFAULT_TOP_K, _QDRANT_HOST, _QDRANT_PORT → public (no leading underscore)
- Fix B: query() now accepts optional qdrant_client param (for DI / testing)
- Fix C: run_rag_qa.py imports public constants and delegates to rag_query() instead of duplicating the 4-step embed→filter→search→format pipeline
- Fix D: call_llm() wraps json.loads and subprocess.run in try/except; raises RuntimeError with clear messages on TimeoutExpired and JSONDecodeError
- Fix E: format_results() skips hits with empty page_id (logs warning to stderr)
- Fix F (test_query.py): module-level import json, full-path assertions, value check for filter test, simplified test_passes_text_in_texts_array; add truncation tests, a TestQuery orchestration test, and test_empty_page_id_hit_is_skipped (17 → 21 tests, all pass)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
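Fix D describes a common hardening pattern, sketched here under stated assumptions. The signature is hypothetical: the command is passed in as a list so the sketch stays testable, the prompt is fed on stdin, and the non-zero exit check is an addition not mentioned in the commit.

```python
import json
import subprocess
from typing import Any, List


def call_llm(cmd: List[str], prompt: str, timeout: float = 300.0) -> Any:
    """Run an LLM CLI with the prompt on stdin and parse its stdout as JSON."""
    try:
        proc = subprocess.run(cmd, input=prompt, capture_output=True,
                              text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"LLM call timed out after {timeout}s") from exc
    # Assumption: a non-zero exit code is also treated as a failure.
    if proc.returncode != 0:
        raise RuntimeError(
            f"LLM call failed (exit {proc.returncode}): {proc.stderr.strip()}"
        )
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"LLM output is not valid JSON: {exc}") from exc
```

Converting both failure modes to `RuntimeError` gives callers a single exception type to handle, while `from exc` preserves the original traceback.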
…fallback (#383) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sign (#383) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
#383) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tion Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
#383) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…py (#383)

- find_truncated_pages: wrap open+json.load in try/except (json.JSONDecodeError, OSError), print [WARN] to stderr and continue on corrupt files
- page_id_from_section_ref: warn to stderr when ref has no ':' separator, return path stripped of .json suffix (graceful degradation)
- main(): replace data["scenarios"] with data.get("scenarios") guarded by sys.exit(1) when key is missing
- _V3_MAX_CHARS: clarify comment: this is a quality threshold (truncation reduces retrieval accuracy), not an absence marker

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
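The first two bullets can be illustrated with a short sketch. `page_id_from_section_ref` is named in the commit, but its body here is a guess: in particular, stripping the `.json` suffix in the normal (colon present) case is an assumption. `load_json_or_none` is a hypothetical helper standing in for the guarded open+json.load inside find_truncated_pages.

```python
import json
import sys
from pathlib import Path


def page_id_from_section_ref(section_ref: str) -> str:
    """'path.json:sN' -> 'path'; warn and degrade gracefully when ':' is missing."""
    if ":" not in section_ref:
        print(f"[WARN] section_ref has no ':' separator: {section_ref!r}",
              file=sys.stderr)
        path = section_ref
    else:
        path = section_ref.rsplit(":", 1)[0]
    # Assumption: the page ID is the path without its .json suffix in both cases.
    return path[:-len(".json")] if path.endswith(".json") else path


def load_json_or_none(path: Path):
    """Return parsed JSON, or None with a warning if the file is unreadable or corrupt."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (json.JSONDecodeError, OSError) as exc:
        print(f"[WARN] skipping {path}: {exc}", file=sys.stderr)
        return None
```

Returning `None` instead of raising lets a scan over many pages continue past a single corrupt file.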
…rios.py (#383)

- Fix sys.path.insert to use repo-root approach (4x .parent hops) matching test_index.py, and import via tools.rag.scripts.select_scenarios
- Add TestPageIdFromSectionRef::test_ref_with_no_colon_returns_path_stripped_of_json_suffix to cover the warning + graceful-degradation path added to page_id_from_section_ref

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
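The repo-root approach with four `.parent` hops might look like this sketch. The helper names are hypothetical, and the example layout (a test file three directories below the repository root, e.g. under `tools/rag/tests/`) is an assumption inferred from the `tools.rag.scripts` import path.

```python
import sys
from pathlib import Path


def repo_root_from(test_file: Path, hops: int = 4) -> Path:
    """Walk up `hops` parents: file -> tests dir -> rag -> tools -> repo root."""
    root = test_file
    for _ in range(hops):
        root = root.parent
    return root


def add_repo_root_to_sys_path(test_file: Path) -> Path:
    """Put the repo root first on sys.path so `tools.rag...` imports resolve."""
    root = repo_root_from(test_file)
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root
```

In a test module this would typically be called as `add_repo_root_to_sys_path(Path(__file__).resolve())` before the package import.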
…er benchmark Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-report Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…iting SCP unlock) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
See steering.