Commit db9f57e

Add scenario-guided MCP prompt validation
1 parent 87c01d8 commit db9f57e

11 files changed

Lines changed: 258 additions & 138 deletions

docs/mcp-prompt-validation.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# MCP Prompt Validation
+
+This document records the prompt-validation scenario set for the ServiceControl MCP surface.
+
+The validation perspective is intentionally narrow: assume the agent only sees discovered tool names, tool descriptions, and parameter descriptions. It does not rely on `docs/mcp-investigation-guide.md` or repository source code.
+
+## Error Scenarios
+
+| Prompt | Expected tool choice | Validation notes |
+| --- | --- | --- |
+| What are the biggest current failure categories? | `get_errors_summary` or `get_failure_groups` | `get_failure_groups` is positioned as the first step for root-cause analysis; detail and mutating tools are not framed as starting points. |
+| Why are messages failing in Billing? | `get_failure_groups` -> `get_failed_messages_by_endpoint` -> `get_failed_message_last_attempt` | The metadata separates grouped root-cause analysis, endpoint-scoped inspection, and last-attempt detail lookup. |
+| Retry only the timeout-related failures | `get_failure_groups` -> `retry_failure_group` | `retry_failure_group` is described as the grouped retry for one root cause, while broader retry tools explicitly warn about broad impact. |
+| Show me details for this failed message | `get_failed_message_by_id` | The tool description says it is for a specific failed message and points agents to list/group tools only when an ID is not yet known. |
+| Retry everything | `retry_all_failed_messages` | The metadata allows the broad tool when explicitly requested, while warning that it changes system state and may affect a large number of messages. |
+
+## Audit Scenarios
+
+| Prompt | Expected tool choice | Validation notes |
+| --- | --- | --- |
+| Find messages related to order 12345 | `search_audit_messages` | The description explicitly says it is for a specific business identifier or text, and browsing tools point agents toward search for targeted lookups. |
+| Show me what happened in this conversation | `get_audit_messages_by_conversation` | The description frames it as tracing a full flow across multiple endpoints once a conversation ID is known. |
+| What is endpoint Billing doing? | `get_audit_messages_by_endpoint` | The metadata positions this as the single-endpoint activity view rather than a cross-endpoint trace. |
+| Show recent system activity | `get_audit_messages` | The browsing tool is positioned for recent activity and timeline exploration. |
+| Show the payload of this message | `get_audit_message_body` | The description explicitly says it is for inspecting payload or message data after locating a specific audit message. |
+
+## Outcome
+
+- Summary and grouping tools are preferred before detail tools for error investigation.
+- Search and browse are clearly separated for audit scenarios.
+- Conversation tracing and endpoint-centric inspection are differentiated.
+- Broad mutating tools remain discoverable but are framed as explicit, risky choices rather than defaults.
+- Identifier and endpoint parameter descriptions support the scenario selection by clarifying where IDs and names come from.
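Reviewer note: the scenario tables in this document could also be kept as data and checked mechanically. The sketch below is one possible shape, not something this commit contains. `SCENARIOS`, `unknown_tools`, and `is_expected` are hypothetical names, and only a subset of the prompts is encoded. It assumes the agent's choice is available as an ordered list of tool names.

```python
# Hypothetical encoding of a few rows from the scenario tables; not part of the commit.
# Each prompt maps to its acceptable tool chains (a chain is an ordered list of tool names).
SCENARIOS = {
    "What are the biggest current failure categories?": [
        ["get_errors_summary"],
        ["get_failure_groups"],
    ],
    "Why are messages failing in Billing?": [
        [
            "get_failure_groups",
            "get_failed_messages_by_endpoint",
            "get_failed_message_last_attempt",
        ],
    ],
    "Retry only the timeout-related failures": [
        ["get_failure_groups", "retry_failure_group"],
    ],
    "Retry everything": [["retry_all_failed_messages"]],
    "Find messages related to order 12345": [["search_audit_messages"]],
}


def unknown_tools(scenarios, listed_tools):
    """Return tool names that a scenario references but the tool listing lacks."""
    referenced = {
        tool
        for chains in scenarios.values()
        for chain in chains
        for tool in chain
    }
    return referenced - set(listed_tools)


def is_expected(scenarios, prompt, chosen_chain):
    """True when the chosen chain equals one of the acceptable chains for the prompt."""
    return list(chosen_chain) in scenarios.get(prompt, [])
```

A check like `unknown_tools` would catch a scenario row that names a tool the server no longer lists, which is the kind of drift a hand-maintained table can hide.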

src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt

Lines changed: 15 additions & 15 deletions
@@ -96,7 +96,7 @@
 },
 {
 "name": "get_failed_message_by_id",
-"description": "Read-only. Use this tool to get the full details of a specific failed message, including all processing attempts and exception information. Good for questions like: \u0027show me details for this failed message\u0027, \u0027what exception caused this failure?\u0027, or \u0027how many times has this message failed?\u0027. You need a failed message ID, which you can get from GetFailedMessages or GetFailureGroups results. If you only need the most recent failure attempt, use GetFailedMessageLastAttempt instead \u2014 it returns less data.",
+"description": "Get detailed information about a specific failed message. Use this when you already know the failed message ID and need to inspect its contents or failure details. Use GetFailedMessages or GetFailureGroups to locate relevant messages before calling this tool. Read-only.",
 "inputSchema": {
 "type": "object",
 "properties": {
@@ -121,7 +121,7 @@
 },
 {
 "name": "get_failed_message_last_attempt",
-"description": "Read-only. Use this tool to see how a specific message failed most recently. Good for questions like: \u0027what was the last error for this message?\u0027, \u0027show me the latest exception\u0027, or \u0027what happened on the last attempt?\u0027. Returns the latest processing attempt with its exception, stack trace, and headers. Lighter than GetFailedMessageById when you only care about the most recent failure rather than the full history.",
+"description": "Retrieve the last processing attempt for a failed message. Use this to understand the most recent failure behavior, including exception details and processing context. Typically used after identifying a failed message via GetFailedMessages or GetFailedMessageById. Read-only.",
 "inputSchema": {
 "type": "object",
 "properties": {
@@ -146,7 +146,7 @@
 },
 {
 "name": "get_failed_messages",
-"description": "Read-only. Use this tool to retrieve failed messages for investigation when the user wants to see what is failing. Good for questions like: \u0027what messages are currently failing?\u0027, \u0027are there failures in a specific queue?\u0027, or \u0027what failed recently?\u0027. Returns a paged list of failed messages with their status, exception details, and queue information. For broad requests, call with no parameters to get the most recent failures \u2014 only add filters when you need to narrow the scope. Prefer GetFailedMessagesByEndpoint when the user mentions a specific endpoint.",
+"description": "Retrieve failed messages for investigation. Use this when exploring recent failures or narrowing down failures by queue, status, or time range. Prefer GetFailureGroups when starting root-cause analysis across many failures. Use GetFailedMessageById when inspecting a specific failed message. Read-only.",
 "inputSchema": {
 "type": "object",
 "properties": {
@@ -159,7 +159,7 @@
 "default": null
 },
 "modified": {
-"description": "Filter failed messages to entries modified after this ISO 8601 date/time. Omit this filter to include older results.",
+"description": "Restricts failed-message results to entries modified after this ISO 8601 date/time. Omitting this may return a large result set.",
 "type": [
 "string",
 "null"
@@ -208,12 +208,12 @@
 },
 {
 "name": "get_failed_messages_by_endpoint",
-"description": "Read-only. Use this tool to see failed messages for a specific NServiceBus endpoint. Good for questions like: \u0027what is failing in the Sales endpoint?\u0027, \u0027show errors for Shipping\u0027, or \u0027are there failures in this endpoint?\u0027. Returns the same paged failure data as GetFailedMessages but scoped to one endpoint. Prefer this tool over GetFailedMessages when the user mentions a specific endpoint name.",
+"description": "Retrieve failed messages for a specific endpoint. Use this when investigating failures in a named endpoint such as Billing or Sales. Prefer GetFailureGroups when you need root-cause analysis across many failures. Use GetFailedMessageLastAttempt after this when you need the most recent failure details for a specific message. Read-only.",
 "inputSchema": {
 "type": "object",
 "properties": {
 "endpointName": {
-"description": "The NServiceBus endpoint name to investigate, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.",
+"description": "The endpoint name that owns the failed messages. Use values obtained from endpoint-aware failed-message results.",
 "type": "string"
 },
 "status": {
@@ -225,7 +225,7 @@
 "default": null
 },
 "modified": {
-"description": "Filter endpoint results to failed messages modified after this ISO 8601 date/time. Omit this filter to include older results.",
+"description": "Restricts endpoint failed-message results to entries modified after this ISO 8601 date/time. Omitting this may return a large result set.",
 "type": [
 "string",
 "null"
@@ -269,7 +269,7 @@
 },
 {
 "name": "get_failure_groups",
-"description": "Read-only. Use this tool to understand why messages are failing by seeing failures grouped by root cause. Good for questions like: \u0027why are messages failing?\u0027, \u0027what errors are happening?\u0027, \u0027group failures by exception\u0027, or \u0027what are the top failure causes?\u0027. Each group represents a distinct exception type and stack trace, showing how many messages are affected and when failures started and last occurred. This is usually the best starting point for diagnosing production issues \u2014 call it before drilling into individual messages. Call with no parameters to use the default grouping by exception type and stack trace.",
+"description": "Retrieve failure groups, where failed messages are grouped by exception type and stack trace. Use this as the first step when analyzing large numbers of failures to identify dominant root causes. Prefer GetFailedMessages when you need individual message details. Read-only.",
 "inputSchema": {
 "type": "object",
 "properties": {
@@ -317,7 +317,7 @@
 },
 {
 "name": "retry_all_failed_messages",
-"description": "Use this tool to retry every unresolved failed message across all queues and endpoints. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry everything\u0027, \u0027reprocess all failures\u0027, or \u0027retry all failed messages\u0027. It affects all unresolved failed messages across the instance. This is a broad operation \u2014 prefer RetryFailedMessagesByQueue, RetryAllFailedMessagesByEndpoint, or RetryFailureGroup when you can scope the retry more narrowly.",
+"description": "Retry all currently failed messages across all queues. Use only when the user explicitly requests a broad retry operation. Prefer narrower retry tools such as RetryFailureGroup or RetryFailedMessages when possible. This operation changes system state. It may affect many messages. It affects all unresolved failed messages across the instance and may affect a large number of messages.",
 "inputSchema": {
 "type": "object",
 "properties": {}
@@ -334,12 +334,12 @@
 },
 {
 "name": "retry_all_failed_messages_by_endpoint",
-"description": "Use this tool to retry all failed messages for a specific NServiceBus endpoint. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry all failures in the Sales endpoint\u0027, \u0027the bug in Shipping is fixed, retry its failures\u0027, or \u0027reprocess all errors for this endpoint\u0027. Useful when a bug in one endpoint has been fixed and all its failures should be reprocessed.",
+"description": "Retry all failed messages for a specific endpoint. Use this when the user explicitly wants an endpoint-scoped retry after an endpoint-specific issue is fixed. Prefer RetryFailureGroup or RetryFailedMessages when you can retry a narrower set of failures. This operation changes system state. It may affect many messages. Use the endpoint name from failed-message results.",
 "inputSchema": {
 "type": "object",
 "properties": {
 "endpointName": {
-"description": "The NServiceBus endpoint name, e.g. \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027",
+"description": "The endpoint name whose failed messages should be retried. Use values obtained from failed-message results.",
 "type": "string"
 }
 },
@@ -384,7 +384,7 @@
 },
 {
 "name": "retry_failed_messages",
-"description": "Use this tool to reprocess multiple specific failed messages at once. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry these messages\u0027, \u0027reprocess messages msg-1, msg-2, msg-3\u0027, or \u0027retry this batch\u0027. Prefer RetryFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to retry.",
+"description": "Retry a selected set of failed messages by their IDs. Use this when the user explicitly wants to retry specific known messages. Prefer RetryFailureGroup when retrying all messages with the same root cause. This operation changes system state. It may affect many messages. Use values obtained from failed-message investigation tools.",
 "inputSchema": {
 "type": "object",
 "properties": {
@@ -412,12 +412,12 @@
 },
 {
 "name": "retry_failed_messages_by_queue",
-"description": "Use this tool to retry all unresolved failed messages from a specific queue. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry all failures in the Sales queue\u0027, \u0027reprocess everything from this queue\u0027, or \u0027the queue consumer is back, retry its failures\u0027. Useful when a queue\u0027s consumer was down or misconfigured and is now fixed. Only retries messages with unresolved status.",
+"description": "Retry all unresolved failed messages from a specific queue. Use this when the user explicitly wants a queue-scoped retry after a queue or consumer issue is fixed. Prefer RetryFailureGroup or RetryFailedMessages when you can retry a narrower set of failures. This operation changes system state. It may affect many messages. Use the queue address from failed-message results.",
 "inputSchema": {
 "type": "object",
 "properties": {
 "queueAddress": {
-"description": "The full queue address including machine name, e.g. \u0027Sales@machine\u0027",
+"description": "Queue address whose unresolved failed messages should be retried. Use values obtained from failed-message results.",
 "type": "string"
 }
 },
@@ -437,7 +437,7 @@
 },
 {
 "name": "retry_failure_group",
-"description": "Use this tool to retry all failed messages that share the same exception type and stack trace. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry this failure group\u0027, \u0027the bug causing these NullReferenceExceptions is fixed, retry them\u0027, or \u0027retry all messages in this group\u0027. This is the most targeted way to retry related failures after fixing a specific bug. You need a failure group ID, which you can get from GetFailureGroups. Returns InProgress if a retry is already running for this group.",
+"description": "Retry all failed messages in a failure group that share the same root cause. Use this when multiple failures are caused by the same issue and can be retried together. Prefer RetryFailedMessages for more granular control. This operation changes system state. It may affect many messages. Use the failure group ID from GetFailureGroups. Returns InProgress if a retry is already running for this group.",
 "inputSchema": {
 "type": "object",
 "properties": {

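Reviewer note: in the hunks above, every rewritten `get_*` description still carries the phrase "Read-only." and every `retry_*` description still carries "This operation changes system state." A small lint over the tool listing could keep that convention from regressing. The sketch below is an assumption about how such a check might look. The prefix-based rule is inferred only from the tools visible in this diff, and `metadata_problems` is a hypothetical helper, not code from the repository.

```python
# Marker phrases copied from the descriptions shown in the diff.
STATE_WARNING = "This operation changes system state."
READ_ONLY = "Read-only."


def metadata_problems(tools):
    """Return one message per tool whose description lacks the expected marker.

    Assumption: tools named retry_* mutate state and tools named get_* are
    read-only. This rule is inferred from the hunks above, not from the source.
    """
    problems = []
    for tool in tools:
        name = tool["name"]
        description = tool["description"]
        if name.startswith("retry_") and STATE_WARNING not in description:
            problems.append(f"{name}: missing state-change warning")
        if name.startswith("get_") and READ_ONLY not in description:
            problems.append(f"{name}: missing read-only marker")
    return problems
```

Here `tools` is assumed to be the parsed tool array from the approved listing, where each entry has at least `name` and `description` keys.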