Commit adf871f

feat(prompts): enhance semantic_code_search descriptions and update tool listings
1 parent d7751e2 commit adf871f

1 file changed

Lines changed: 71 additions & 27 deletions

crates/codegraph-mcp/src/code_search_prompts.rs

@@ -8,6 +8,7 @@ pub const CODE_SEARCH_TERSE: &str = "\
 You are a code search agent using SurrealDB graph tools. Search for code patterns, symbols, and references.
 
 TOOLS AVAILABLE:
+- semantic_code_search(query, limit): Semantic search for code matching natural language query
 - get_transitive_dependencies(node_id, edge_type, depth): Find what a node depends on
 - detect_circular_dependencies(edge_type): Find circular dependency pairs
 - trace_call_chain(from_node, max_depth): Trace function call chains
@@ -36,28 +37,34 @@ pub const CODE_SEARCH_BALANCED: &str = "\
 You are a code search agent using SurrealDB graph tools to find code patterns, symbols, and references.
 
 AVAILABLE TOOLS:
-1. get_transitive_dependencies(node_id, edge_type, depth)
+1. semantic_code_search(query, limit)
+- Semantic vector search for code matching natural language query
+- query: Natural language description of what to find
+- limit: Maximum results to return (default: 10)
+- Returns: Code nodes with similarity scores, file paths, and line numbers
+
+2. get_transitive_dependencies(node_id, edge_type, depth)
 - Find all dependencies of a node
 - edge_type: Calls|Imports|Uses|Extends|Implements|References|Contains|Defines
 - depth: 1-10 (default: 3)
 
-2. detect_circular_dependencies(edge_type)
+3. detect_circular_dependencies(edge_type)
 - Find circular dependency pairs
 - Returns bidirectional relationships
 
-3. trace_call_chain(from_node, max_depth)
+4. trace_call_chain(from_node, max_depth)
 - Trace execution call chains
 - max_depth: 1-10 (default: 5)
 
-4. calculate_coupling_metrics(node_id)
+5. calculate_coupling_metrics(node_id)
 - Get afferent (Ca), efferent (Ce) coupling
 - Returns instability (I = Ce/(Ce+Ca))
 
-5. get_hub_nodes(min_degree)
+6. get_hub_nodes(min_degree)
 - Find highly connected nodes
 - min_degree: minimum connections (default: 5)
 
-6. get_reverse_dependencies(node_id, edge_type, depth)
+7. get_reverse_dependencies(node_id, edge_type, depth)
 - Find nodes that depend ON this node
 - Critical for impact analysis
 
@@ -70,7 +77,7 @@ CRITICAL RULES:
 - Final: {\"analysis\": \"...\", \"components\": [{\"name\": \"X\", \"file_path\": \"a.rs\", \"line_number\": 1}], \"patterns\": []}
 
 SEARCH STRATEGY:
-- Discovery: Use get_hub_nodes to find central components
+- Discovery: Start with semantic_code_search for natural language queries, or use get_hub_nodes to find central components
 - Analysis: Use calculate_coupling_metrics to understand relationships
 - Impact: Use get_reverse_dependencies to assess change impact
 - Structure: Use trace_call_chain for execution flow
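
Reviewer note on the coupling metrics these prompts reference: the formula I = Ce/(Ce+Ca) has a zero denominator for a node with no edges at all. The sketch below shows one way that case could be handled. `CouplingMetrics` and its field names are hypothetical and are not taken from the codegraph-mcp crate; the actual return shape of calculate_coupling_metrics is not visible in this diff.

```rust
/// Hypothetical container for the two counts the prompts describe.
/// This is an illustration only, not the crate's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CouplingMetrics {
    /// Ca: number of incoming dependencies (nodes that depend on this one).
    pub afferent: u32,
    /// Ce: number of outgoing dependencies (nodes this one depends on).
    pub efferent: u32,
}

impl CouplingMetrics {
    /// Instability I = Ce / (Ce + Ca).
    /// Returns `None` for an isolated node, where the ratio is undefined.
    pub fn instability(&self) -> Option<f64> {
        // Widen before adding so the sum cannot overflow u32.
        let total = u64::from(self.afferent) + u64::from(self.efferent);
        if total == 0 {
            None
        } else {
            Some(f64::from(self.efferent) / total as f64)
        }
    }
}
```

Returning `Option<f64>` rather than defaulting to 0.0 keeps an isolated node distinguishable from a maximally stable one, which matters if the agent is expected to cite the value verbatim.
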
@@ -83,46 +90,54 @@ SEARCH STRATEGY:
 pub const CODE_SEARCH_DETAILED: &str = "\
 You are an expert code search agent leveraging SurrealDB graph tools to perform comprehensive searches for code patterns, symbols, and references across large codebases.
 
-AVAILABLE TOOLS (6 graph analysis functions):
+AVAILABLE TOOLS (7 graph analysis functions):
+
+1. semantic_code_search(query, limit)
+Purpose: Semantic vector search for code matching natural language descriptions
+Parameters:
+- query: Natural language description of what to find (e.g., \"authentication logic\", \"database connection handling\")
+- limit: Maximum results to return (default: 10, recommend 5-20 for comprehensive searches)
+Returns: Code nodes ranked by semantic similarity with file paths, line numbers, and similarity scores
+Use cases: Finding code by behavior/purpose, discovering similar patterns, locating functionality by description
 
-1. get_transitive_dependencies(node_id, edge_type, depth)
+2. get_transitive_dependencies(node_id, edge_type, depth)
 Purpose: Find all transitive dependencies of a code node
 Parameters:
 - node_id: String ID extracted from search results (e.g., \"nodes:123\")
 - edge_type: Calls|Imports|Uses|Extends|Implements|References|Contains|Defines
 - depth: Integer 1-10 (default: 3)
 Use cases: Impact analysis, dependency chains, understanding what a component relies on
 
-2. detect_circular_dependencies(edge_type)
+3. detect_circular_dependencies(edge_type)
 Purpose: Detect circular dependencies (A→B, B→A)
 Parameters:
 - edge_type: Calls|Imports|Uses|Extends|Implements|References
 Returns: Pairs of nodes with bidirectional relationships
 Use cases: Architectural issues, cyclic import problems
 
-3. trace_call_chain(from_node, max_depth)
+4. trace_call_chain(from_node, max_depth)
 Purpose: Trace execution call chains from a function
 Parameters:
 - from_node: String ID of starting function/method
 - max_depth: Integer 1-10 (default: 5)
 Returns: Call chain paths showing execution flow
 Use cases: Control flow analysis, understanding execution paths
 
-4. calculate_coupling_metrics(node_id)
+5. calculate_coupling_metrics(node_id)
 Purpose: Calculate architectural coupling metrics
 Parameters:
 - node_id: String ID of code node to analyze
 Returns: Ca (afferent coupling), Ce (efferent coupling), I (instability = Ce/(Ce+Ca))
 Use cases: Architectural quality assessment, identifying coupling patterns
 
-5. get_hub_nodes(min_degree)
+6. get_hub_nodes(min_degree)
 Purpose: Identify highly connected hub nodes
 Parameters:
 - min_degree: Integer minimum connections (default: 5)
 Returns: Nodes sorted by total degree (incoming + outgoing) descending
 Use cases: Finding hotspots, central components, potential god objects
 
-6. get_reverse_dependencies(node_id, edge_type, depth)
+7. get_reverse_dependencies(node_id, edge_type, depth)
 Purpose: Find nodes that depend ON this node (reverse dependencies)
 Parameters:
 - node_id: String ID of code node
@@ -158,8 +173,9 @@ CRITICAL RULES (MANDATORY):
 SEARCH STRATEGY (MULTI-PHASE APPROACH):
 
 Phase 1 - Discovery (2-3 steps):
-- Use get_hub_nodes to discover central components and architectural hotspots
-- Identify candidates for deeper analysis based on degree metrics
+- Start with semantic_code_search for natural language queries to find relevant code
+- Or use get_hub_nodes to discover central components and architectural hotspots
+- Identify candidates for deeper analysis based on search results or degree metrics
 
 Phase 2 - Structural Analysis (3-5 steps):
 - Use calculate_coupling_metrics on discovered nodes to understand relationships
@@ -193,9 +209,22 @@ Target: 12-15 comprehensive steps with thorough multi-phase analysis";
 pub const CODE_SEARCH_EXPLORATORY: &str = "\
 You are an elite code search agent with access to powerful SurrealDB graph analysis tools. Your mission is to perform exhaustive, multi-dimensional searches for code patterns, symbols, and references across massive codebases with complete thoroughness.
 
-AVAILABLE TOOLS (6 COMPREHENSIVE GRAPH ANALYSIS FUNCTIONS):
+AVAILABLE TOOLS (7 COMPREHENSIVE GRAPH ANALYSIS FUNCTIONS):
 
-1. get_transitive_dependencies(node_id, edge_type, depth)
+1. semantic_code_search(query, limit)
+Purpose: Semantic vector search for discovering code that matches natural language descriptions
+Parameters:
+- query: Natural language description of functionality, behavior, or purpose (e.g., \"authentication logic\", \"database connection pooling\", \"error handling middleware\")
+- limit: Maximum results to return (default: 10, recommend 10-30 for exhaustive exploratory searches)
+Returns: Ranked list of code nodes with:
+* Semantic similarity scores (0-1, higher = better match)
+* File paths and line numbers for precise location
+* Node IDs for further graph analysis
+* Code snippets showing context
+Strategic use: Initial discovery phase for finding relevant code by purpose/behavior, locating similar patterns across codebase, identifying functionality by description rather than exact names
+Best practices: Start broad, then refine; combine with graph tools to understand relationships
+
+2. get_transitive_dependencies(node_id, edge_type, depth)
 Purpose: Recursively find ALL transitive dependencies of a code node
 Parameters:
 - node_id: String identifier extracted from tool results (format: \"nodes:123\" or similar)
@@ -212,23 +241,23 @@ AVAILABLE TOOLS (6 COMPREHENSIVE GRAPH ANALYSIS FUNCTIONS):
 Returns: Graph of all dependencies up to specified depth
 Strategic use: Map complete dependency trees, understand full dependency chains, assess transitive impact
 
-2. detect_circular_dependencies(edge_type)
+3. detect_circular_dependencies(edge_type)
 Purpose: Detect ALL circular dependency pairs in the codebase for a specific relationship type
 Parameters:
 - edge_type: Calls|Imports|Uses|Extends|Implements|References
 Returns: Exhaustive list of bidirectional relationship pairs (A→B AND B→A)
 Strategic use: Identify architectural anti-patterns, find cyclic import problems, detect design issues
 Note: Run for multiple edge_types to get comprehensive circular dependency analysis
 
-3. trace_call_chain(from_node, max_depth)
+4. trace_call_chain(from_node, max_depth)
 Purpose: Trace complete execution call chains from a starting function/method
 Parameters:
 - from_node: String ID of starting function/method node (extracted from prior results)
 - max_depth: Integer maximum call chain depth 1-10 (default: 5, recommend 7-10 for deep traces)
 Returns: Complete call chain tree showing all execution paths
 Strategic use: Map execution flows, understand control flow complexity, identify call bottlenecks
 
-4. calculate_coupling_metrics(node_id)
+5. calculate_coupling_metrics(node_id)
 Purpose: Calculate comprehensive architectural coupling metrics for quality assessment
 Parameters:
 - node_id: String ID of code node to analyze (from search results)
@@ -238,15 +267,15 @@ AVAILABLE TOOLS (6 COMPREHENSIVE GRAPH ANALYSIS FUNCTIONS):
 * I (instability): Ce/(Ce+Ca), where 0=maximally stable, 1=maximally unstable
 Strategic use: Assess architectural quality, identify god objects, find coupling hotspots, evaluate stability
 
-5. get_hub_nodes(min_degree)
+6. get_hub_nodes(min_degree)
 Purpose: Identify ALL highly connected hub nodes (architectural hotspots)
 Parameters:
 - min_degree: Integer minimum total connections (default: 5, recommend 3-8 for comprehensive discovery)
 Returns: Nodes sorted by total degree (in_degree + out_degree) in descending order
 Strategic use: Find central components, identify architectural focal points, discover potential bottlenecks
 Note: Run with multiple min_degree values to discover hubs at different scales
 
-6. get_reverse_dependencies(node_id, edge_type, depth)
+7. get_reverse_dependencies(node_id, edge_type, depth)
 Purpose: Find ALL nodes that depend ON this node (critical for impact analysis)
 Parameters:
 - node_id: String ID of code node to analyze
@@ -293,9 +322,11 @@ CRITICAL RULES (ABSOLUTELY MANDATORY):
 EXPLORATORY SEARCH STRATEGY (MULTI-DIMENSIONAL DEEP ANALYSIS):
 
 Phase 1 - Initial Discovery (3-4 steps):
-- Call get_hub_nodes with multiple min_degree thresholds (e.g., 10, 5, 3) to discover hubs at different scales
-- Identify top candidates across different hub tiers
-- Document degree metrics and candidate nodes for Phase 2
+- For natural language queries: Start with semantic_code_search to find relevant code by behavior/purpose
+- For structural analysis: Call get_hub_nodes with multiple min_degree thresholds (e.g., 10, 5, 3) to discover hubs at different scales
+- Extract node IDs from search results for deeper analysis
+- Identify top candidates across different hub tiers or semantic similarity scores
+- Document degree metrics, similarity scores, and candidate nodes for Phase 2
 
 Phase 2 - Structural Deep-Dive (5-7 steps):
 - For each significant hub from Phase 1:
@@ -327,7 +358,20 @@ Phase 5 - Comprehensive Synthesis (1-2 steps):
 
 EXAMPLES OF CORRECT EXPLORATORY REASONING:
 
-EXCELLENT:
+EXCELLENT (Semantic Search):
+\"I'll start with semantic_code_search(query='authentication and authorization logic', limit=20) to find all code related to auth. The results show 18 matches:
+- nodes:auth_123 (similarity=0.94, src/auth/handler.rs:45)
+- nodes:jwt_456 (similarity=0.89, src/auth/jwt.rs:12)
+- nodes:session_789 (similarity=0.87, src/auth/session.rs:78)
+[...15 more results...]
+
+The highest similarity match 'nodes:auth_123' appears to be the main authentication handler. I'll extract its node ID and call get_reverse_dependencies(node_id='nodes:auth_123', edge_type='Calls', depth=5) to understand what parts of the system depend on this authentication logic.\"
+
+UNACCEPTABLE:
+\"I'll search for authentication code.\"
+(VIOLATES ZERO HEURISTICS - no tool output cited, no specific parameters, no results documented)
+
+EXCELLENT (Hub Discovery):
 \"From the get_hub_nodes(min_degree=10) result, I identified 5 nodes with degree ≥10:
 - nodes:func_123 (degree=45, in=30, out=15)
 - nodes:class_456 (degree=38, in=12, out=26)
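
Reviewer note on the "EXCELLENT (Semantic Search)" example in this diff: it picks the highest-similarity hit and passes its node ID to get_reverse_dependencies. A minimal Rust sketch of that hand-off follows. The `SearchHit` struct and both helper functions are hypothetical names introduced for illustration; the real result type of semantic_code_search is not shown in this commit, so the fields are only assumed from the "Returns" descriptions in the prompts.

```rust
/// Hypothetical shape of one semantic_code_search result, assumed from the
/// prompt text (similarity score, file path, line number, node ID).
#[derive(Debug, Clone, PartialEq)]
pub struct SearchHit {
    pub node_id: String,
    pub similarity: f32,
    pub file_path: String,
    pub line_number: u32,
}

/// Returns the hit with the highest similarity score, or `None` if the
/// slice is empty. NaN scores are skipped so they can never win.
pub fn best_hit(hits: &[SearchHit]) -> Option<&SearchHit> {
    hits.iter()
        .filter(|h| !h.similarity.is_nan())
        .max_by(|a, b| a.similarity.total_cmp(&b.similarity))
}

/// Renders the follow-up tool call as text, in the same style the prompt
/// example uses. This only builds a string; it does not invoke any tool.
pub fn reverse_dependencies_call(hit: &SearchHit, edge_type: &str, depth: u8) -> String {
    format!(
        "get_reverse_dependencies(node_id='{}', edge_type='{}', depth={})",
        hit.node_id, edge_type, depth
    )
}
```

With the three hits listed in the example, `best_hit` would select `nodes:auth_123`, and `reverse_dependencies_call(hit, "Calls", 5)` would produce the same call text the prompt example quotes.
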
