Commit 55d675e

phernandez and claude authored
feat: min_similarity override, cloud promo improvements (#570)
Signed-off-by: phernandez <paul@basicmachines.co> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 6afe4fd commit 55d675e

106 files changed

Lines changed: 7504 additions & 856 deletions

.claude/settings.json

Lines changed: 1 addition & 3 deletions
@@ -1,5 +1,3 @@
 {
-  "enabledPlugins": {
-    "basic-memory@basicmachines": true
-  }
+  "enabledPlugins": {}
 }

.github/workflows/test.yml

Lines changed: 3 additions & 0 deletions
@@ -15,6 +15,7 @@ on:
 jobs:
   test-sqlite:
     name: Test SQLite (${{ matrix.os }}, Python ${{ matrix.python-version }})
+    timeout-minutes: 30
     strategy:
       fail-fast: false
       matrix:
@@ -62,6 +63,7 @@ jobs:
 
   test-postgres:
     name: Test Postgres (Python ${{ matrix.python-version }})
+    timeout-minutes: 30
     strategy:
       fail-fast: false
       matrix:
@@ -102,6 +104,7 @@ jobs:
 
   coverage:
     name: Coverage Summary (combined, Python 3.12)
+    timeout-minutes: 30
     runs-on: ubuntu-latest
 
     steps:

docs/post-v0.18.0-test-plan.md

Lines changed: 344 additions & 0 deletions
Large diffs are not rendered by default.

docs/semantic-search-test-log.md

Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
# Semantic Search Manual Test Log

## Overview

Manual test session for semantic (vector) search on the main project.
- Date: 2026-02-15
- Database: ~/.basic-memory/memory.db (SQLite)
- Entities: 456 embedded, 2714 vector chunks
- Search index: 2390 FTS entries
- Embedding model: default (384-dim, sqlite-vec)

## Test Plan

1. **Search Type Routing** — verify vector/hybrid/text dispatch, invalid search_type handling
2. **Conceptual Queries** — natural language where vector should beat FTS
3. **Keyword Queries** — exact terms where FTS should be strong
4. **Hybrid Ranking** — queries where both FTS and vector contribute
5. **Result Types** — entities, observations, relations in vector results
6. **Filters + Vector** — combine vector with types/entity_types/after_date
7. **Edge Cases** — short queries, long queries, empty, special chars, no-match
8. **Pagination** — page > 1, page_size respected

---

## Test Results

### Test 1: Search Type Routing

#### 1a: search_type="semantic" (invalid value)
- **Input:** query="how does the knowledge graph work", search_type="semantic"
- **Expected:** error or explicit fallback
- **Actual:** Silently falls through to text search (else branch in search.py:430)
- **Verdict:** BUG — should either be a recognized alias for "vector" or return an error
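
One way the fix suggested in this verdict could look is sketched below. It is not the project's code: the `SearchType` enum, the `ALIASES` table, and `parse_search_type` are hypothetical names, and treating "semantic" as an alias for "vector" is only one of the two options the verdict mentions.

```python
from enum import Enum


class SearchType(str, Enum):
    """Hypothetical enum of the three modes named in this log."""

    TEXT = "text"
    VECTOR = "vector"
    HYBRID = "hybrid"


# Assumption: "semantic" is accepted as an alias for "vector".
ALIASES = {"semantic": SearchType.VECTOR}


def parse_search_type(raw: str) -> SearchType:
    """Resolve a user-supplied search_type, raising instead of silently defaulting."""
    key = raw.strip().lower()
    if key in ALIASES:
        return ALIASES[key]
    try:
        return SearchType(key)
    except ValueError:
        allowed = sorted([t.value for t in SearchType] + list(ALIASES))
        raise ValueError(
            f"Unknown search_type {raw!r}; expected one of {allowed}"
        ) from None
```

With this shape, an unrecognized value fails loudly at the boundary rather than reaching an `else` branch in the dispatch code.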

#### 1b: search_type="vector"
- **Input:** query="keeping AI context between sessions", search_type="vector"
- **Actual:** 5 results, scores ~0.58-0.59, found "Maintaining context across conversation boundaries" observation
- **Verdict:** PASS

#### 1c: search_type="text" with conceptual query
- **Input:** query="keeping AI context between sessions", search_type="text"
- **Actual:** 0 results (no exact keyword match)
- **Verdict:** PASS (expected — FTS requires token overlap)

#### 1d: search_type="hybrid" with conceptual query
- **Input:** query="keeping AI context between sessions", search_type="hybrid"
- **Actual:** 5 results, same ranking as vector (FTS contributed nothing here)
- **Verdict:** PASS

#### 1e: search_type="text" with keyword query
- **Input:** query="OAuth authentication", search_type="text"
- **Actual:** 3 results — AUTH.md Supabase OAuth, OAuth Rip-and-Replace, OAuth Integration Analysis
- **Verdict:** PASS

#### 1f: search_type="vector" with keyword query
- **Input:** query="OAuth authentication", search_type="vector"
- **Actual:** Same top results as text (keyword-rich content also scores well in vector space)
- **Verdict:** PASS

---

### Test 2: Conceptual Queries (vector advantage)

#### 2a: Natural language question
- **Input:** query="why do AI assistants forget things", search_type="vector"
- **Actual:** 5 results — Manual Testing Session, "Balance security and usability" observation, "Tools should match thought patterns" observation. Scores ~0.56-0.57
- **Vector advantage:** Found conceptually related content despite no exact keyword overlap
- **Verdict:** PASS

#### 2b: Same query, text search
- **Input:** query="why do AI assistants forget things", search_type="text"
- **Actual:** 1 result — "What is Basic Memory?" (likely matched on "AI" token)
- **Verdict:** PASS (demonstrates vector advantage — text barely matched)

#### 2c: Domain concept with no jargon
- **Input:** query="pricing strategy for cloud product", search_type="vector"
- **Actual:** 3 results — SPEC-16 MCP Cloud Service Consolidation, knowledge architecture observation, Visual Knowledge Spaces relation. Scores ~0.56-0.57
- **Verdict:** PASS (found cloud-related content conceptually)

#### 2d: Technical concept, long query
- **Input:** query="SQLite performance optimization WAL mode concurrent writes", search_type="vector"
- **Actual:** 3 results — SPEC-11 API Performance Optimization, Real-Time Updates with WebSockets, marketing status update. Scores ~0.55-0.58
- **Verdict:** PASS (found performance-related content)

---

### Test 3: Keyword Queries (FTS strength)

#### 3a: Exact term match — "OAuth authentication"
- **Text:** 3 results with high relevance (exact matches in titles)
- **Vector:** Same top results (keyword overlap helps vector too)
- **Verdict:** PASS — FTS and vector converge on keyword-rich queries

#### 3b: "OAuth" single keyword, hybrid mode
- **Input:** query="OAuth", search_type="hybrid"
- **Actual:** 5 results — Basic Memory Coding Guide, AI Collaboration Examples, SPEC-18, daily note, Manual Testing Session. FTS + vector blended. Scores ~0.016-0.032
- **Note:** Top hybrid result is "Basic Memory Coding Guide" not an OAuth-specific doc — suggests hybrid scoring may dilute strong FTS matches
- **Verdict:** PASS but hybrid ranking questionable for single-keyword queries

---

### Test 4: Hybrid Ranking

#### 4a: Hybrid vs vector on "OAuth authentication"
- **Hybrid with entity_types=["entity"]:** 5 results — RLS Implementation Lessons, Cloud Readiness Assessment, AUTH.md OAuth, Core Service Implementation, OAuth Rip-and-Replace. Scores ~0.016-0.023
- **Vector with entity_types=["entity"]:** 5 results — Core Service Implementation, SPEC-13 CLI Auth, Coding Guide, Authentication Service, ADR Production Auth. Scores ~0.55-0.60
- **Observation:** Hybrid surfaces different top results than vector-only. Hybrid found RLS and Cloud Readiness docs that vector didn't prioritize. Different ranking is expected from RRF fusion.
- **Verdict:** PASS — hybrid produces meaningfully different ranking
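
For readers unfamiliar with RRF, the sketch below shows textbook reciprocal rank fusion. It is an illustration, not the project's implementation: the function name `rrf_fuse`, the document IDs, and the constant `k=60` (a commonly used default) are all assumptions. With `k=60`, a document ranked first in one list scores 1/61 ≈ 0.0164 and a document ranked first in both scores 2/61 ≈ 0.0328, which would be consistent with the ~0.016-0.032 hybrid scores recorded above.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked ID lists; each appearance adds 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rankings for one query; the IDs are placeholders.
fts_ranking = ["auth-md", "oauth-rip-and-replace", "oauth-analysis"]
vector_ranking = ["core-service", "auth-md", "spec-13"]
fused = rrf_fuse([fts_ranking, vector_ranking])
```

Because only ranks enter the formula, a document that is mid-ranked in both lists can outscore one that is ranked first in a single list, which is one plausible reading of the single-keyword behaviour noted in Test 3b.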

---

### Test 5: Result Types

#### 5a: Vector returns all result types
- **Input:** query="keeping AI context between sessions", search_type="vector"
- **Entities:** SPEC-18 AI Memory Management Tool (type=entity)
- **Relations:** Prompt Builder integrates_with (type=relation)
- **Observations:** "Translation layer is key" (type=observation), "Maintaining context across conversation boundaries" (type=observation)
- **Verdict:** PASS — all three types appear in vector results

#### 5b: Observations carry metadata
- **Observation result:** category="challenge", content="Maintaining context across conversation boundaries", from_entity="research/ai-knowledge-management-research"
- **Verdict:** PASS — category, content, from_entity, tags all present

#### 5c: Relations carry link info
- **Relation result:** relation_type="integrates_with", from_entity="development/features/prompt-builder...", to_entity (present but truncated in some)
- **Verdict:** PASS — relation metadata present

---

### Test 6: Filters + Vector Search

#### 6a: entity_types=["entity"] with vector
- **Input:** query="OAuth authentication", search_type="vector", entity_types=["entity"]
- **Actual:** 5 results, all type="entity" (Core Service Implementation, SPEC-13, Coding Guide, Authentication Service, ADR Auth)
- **Verdict:** PASS — filter correctly restricts to entities only

#### 6b: types=["note"] with vector
- **Input:** query="OAuth authentication", search_type="vector", types=["note"]
- **Actual:** Same 5 results (all have entity_type="note" in metadata)
- **Verdict:** PASS — types filter works with vector search

#### 6c: after_date with vector
- **Input:** query="OAuth authentication", search_type="vector", after_date="2025-06-01"
- **Actual:** 3 results — Core Service Implementation, Cloud Web App analysis observation, SPEC-13. Filtered out older OAuth docs.
- **Verdict:** PASS — date filter applied correctly

#### 6d: entity_types=["entity"] with hybrid
- **Input:** query="OAuth authentication", search_type="hybrid", entity_types=["entity"]
- **Actual:** 5 results, all type="entity" — RLS lessons, Cloud Readiness, AUTH.md OAuth, Core Service, OAuth Rip-and-Replace
- **Verdict:** PASS — filter works with hybrid mode too

#### 6e: types=["entity"] with vector (WRONG filter name)
- **Input:** query="OAuth authentication", search_type="vector", types=["entity"]
- **Actual:** 0 results
- **Note:** `types` filters by entity_type metadata (e.g., "note", "person"), NOT by SearchItemType. Using types=["entity"] looks for entity_type="entity" which few/no notes have. This is a UX confusion point — the param names are ambiguous.
- **Verdict:** PASS (correct behavior) but USABILITY ISSUE — easy to confuse types vs entity_types
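
The distinction described in the note can be made concrete with a small in-memory model. This is a sketch only: the `Row` dataclass, its field names, and `apply_filters` are hypothetical stand-ins, not the project's schema or query code.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Hypothetical search-index row; field names are illustrative."""

    title: str
    item_type: str    # SearchItemType: "entity", "observation", or "relation"
    entity_type: str  # note metadata, e.g. "note" or "person"


def apply_filters(rows, types=None, entity_types=None):
    """`types` matches entity_type metadata; `entity_types` matches the item type."""
    kept = []
    for row in rows:
        if types is not None and row.entity_type not in types:
            continue
        if entity_types is not None and row.item_type not in entity_types:
            continue
        kept.append(row)
    return kept


rows = [
    Row("Core Service Implementation", item_type="entity", entity_type="note"),
    Row("Maintaining context", item_type="observation", entity_type="note"),
]
by_item_type = apply_filters(rows, entity_types=["entity"])  # one match
by_metadata = apply_filters(rows, types=["entity"])          # no matches
```

In this model `types=["entity"]` returns nothing because no row has `entity_type="entity"`, mirroring the 0 results seen in 6e.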

---

### Test 7: Edge Cases

#### 7a: Single character query
- **Input:** query="x", search_type="vector"
- **Actual:** 3 results — "Self-contained application bundle" observation, Non-Markdown File Support relation, quick-win-tools entity. Scores ~0.57-0.59
- **Note:** Single character still produces an embedding and returns results. Quality is low/random as expected.
- **Verdict:** PASS (no crash, returns results)

#### 7b: Whitespace-only query
- **Input:** query=" ", search_type="vector"
- **Actual:** 0 results
- **Verdict:** PASS (handled gracefully — _check_vector_eligible strips and rejects empty)
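
The guard described in this verdict might look roughly like the following. The log names `_check_vector_eligible` but does not show its signature, so `is_vector_eligible` and `vector_search` below are hypothetical stand-ins, and `embed` and `search` are injected callables invented for the example.

```python
def is_vector_eligible(query):
    """Return True only when the query has non-whitespace content."""
    return query is not None and query.strip() != ""


def vector_search(query, embed, search):
    """Skip embedding entirely for empty queries and return no results."""
    if not is_vector_eligible(query):
        return []
    return search(embed(query.strip()))


# Stub embedder and index that record whether they were called.
calls = []


def fake_embed(text):
    calls.append(text)
    return [float(len(text))]


def fake_search(vector):
    return [("some-note", 0.58)]


empty_result = vector_search("   ", fake_embed, fake_search)
normal_result = vector_search(" x ", fake_embed, fake_search)
```

Checking eligibility before calling the embedder avoids spending an embedding call on input that cannot produce a meaningful vector.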

#### 7c: Query with no relevant content
- **Input:** query="quantum computing blockchain", search_type="vector"
- **Actual:** 3 results — Inter-Agent Communication relation, Self-contained bundle observation, JSON-LD interop observation. Scores ~0.54
- **Note:** Still returns results because vector search always finds nearest neighbors. Scores are lower (~0.54) than relevant queries (~0.58-0.60). No relevance threshold applied.
- **Verdict:** PASS (expected behavior) but NOTE — no relevance cutoff means irrelevant queries always return something
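
A relevance cutoff of the kind this note asks for can be sketched as a post-filter on scored results. The function name `filter_by_similarity`, the result shape, and the 0.55 threshold are assumptions chosen for illustration; the scores are taken from the ranges reported in this log.

```python
def filter_by_similarity(results, min_similarity):
    """Keep only (item, score) pairs whose score meets the threshold."""
    return [(item, score) for item, score in results if score >= min_similarity]


# Scores in the ranges observed above: relevant ~0.58-0.60, irrelevant ~0.54.
relevant = [("Maintaining context across conversation boundaries", 0.587)]
irrelevant = [("Inter-Agent Communication", 0.54), ("JSON-LD interop", 0.54)]

kept_relevant = filter_by_similarity(relevant, min_similarity=0.55)
kept_irrelevant = filter_by_similarity(irrelevant, min_similarity=0.55)
```

Given how narrow the observed spread is, any fixed threshold would sit close to legitimate results (Test 2d reports scores as low as ~0.55), so a per-query override is likely more practical than one global value.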

---

### Test 8: Pagination

#### 8a: Vector search page 2
- **Input:** query="keeping AI context between sessions", search_type="vector", page=2, page_size=3
- **Actual:** 3 results on page 2, current_page=2. Different results from page 1. Top: "Maintaining context across conversation boundaries" observation (score 0.587)
- **Note:** Interestingly, page 2 had a higher-scoring result than some page 1 results. This may indicate pagination doesn't sort globally — it might be paginating within a pre-scored set.
- **Verdict:** PASS (pagination works) but POSSIBLE ISSUE — result ordering across pages needs investigation
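
If the ordering concern in this note turns out to be real, the invariant to check is that every score on page N is at least as high as every score on page N+1. The sketch below sorts globally before slicing; `paginate` and its argument shape are hypothetical and say nothing about how the project currently pages results.

```python
def paginate(scored, page, page_size):
    """Sort all (item, score) pairs by score, then slice out one page."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    ordered = sorted(scored, key=lambda pair: pair[1], reverse=True)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


# Hypothetical candidate set, deliberately not pre-sorted.
candidates = [("a", 0.57), ("b", 0.587), ("c", 0.56), ("d", 0.58), ("e", 0.55)]
page_1 = paginate(candidates, page=1, page_size=3)
page_2 = paginate(candidates, page=2, page_size=3)
```

A regression test could assert that the lowest score on page 1 is greater than or equal to the highest score on page 2 for the same query.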

---

## Summary

### Passing Tests: 20/21

### Bugs Found
1. **search_type="semantic" silently falls through** (Test 1a) — Invalid search_type values fall to the `else` branch and default to text search without any warning. Should either alias "semantic" to "vector" or raise an error.

### Usability Issues
2. **types vs entity_types confusion** (Test 6e) — `types` filters by entity_type metadata (note, person, etc.) while `entity_types` filters by SearchItemType (entity, observation, relation). The naming is ambiguous and easy to mix up.
3. **No relevance threshold** (Test 7c) — Vector search always returns nearest neighbors even for completely irrelevant queries. Consider adding a minimum score threshold or at least documenting expected score ranges.
4. **Hybrid ranking for single keywords** (Test 3b) — Hybrid mode on simple keyword queries produced less intuitive rankings than pure FTS or pure vector. The RRF fusion may dilute strong FTS signals.

### Observations
- Vector search successfully finds conceptually related content that FTS misses entirely
- Score ranges: relevant queries ~0.56-0.60, irrelevant queries ~0.54 (narrow spread)
- All three result types (entity, observation, relation) appear correctly in vector results
- Filters (entity_types, types, after_date) all work correctly with vector and hybrid modes
- Pagination works but cross-page ordering may need investigation
