-
Notifications
You must be signed in to change notification settings - Fork 8
tester.ts required fields for open ai models #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -546,8 +546,8 @@ export class Tester extends TaskAgent implements Agent { | |||||
| const schema = z.object({ | ||||||
| assessment: z.string().describe('Short review of current progress toward the main scenario goal'), | ||||||
| suggestion: z.string().describe('Specific next action recommendation'), | ||||||
| recommendReset: z.boolean().optional().describe('Recommend reset() if persistent failures suggest navigation issues'), | ||||||
| recommendStop: z.boolean().optional().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), | ||||||
| recommendReset: z.boolean().describe('Recommend reset() if persistent failures suggest navigation issues'), | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This can be confusing as it requires LLM to fill all fields and it would provide recommendation which it was previously about to avoid Probably best way to remove all that recommended* fields alltogether And keep only suggestion Also probably we can use enum if this allowed: |
||||||
| recommendStop: z.boolean().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), | ||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Suggestion: The Severity Level: Major
|
||||||
| recommendStop: z.boolean().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), | |
| recommendStop: z.boolean().optional().describe('Recommend stop() if test is fundamentally incompatible or cannot proceed'), |
Steps of Reproduction ✅
1. Trigger the same code path as previous step: execute a test that leads to
Tester.analyzeProgress() being called from the main test loop (the conditional call to
analyzeProgress is in src/ai/tester.ts within the main loop).
2. Inspect the schema in analyzeProgress where recommendStop is declared (schema block at
lines 546-551, with recommendStop at line 550).
3. The function calls the provider to parse the LLM output via
this.provider.generateObject(..., schema, model). If the LLM response includes assessment
and suggestion but omits recommendStop, Zod will reject the object because recommendStop
is required (line 550).
4. When validation fails, generateObject yields no usable response.object, analyzeProgress
treats that as no result (const result = response?.object; if (!result) return ''), and
the function exits without adding the expected progress note or advising stop/reset.
Reproduce by running any test that causes analyzeProgress to run and configuring the LLM
(or using a prompt/seed) so it omits the recommendStop boolean.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/ai/tester.ts
**Line:** 550:550
**Comment:**
*Possible Bug: The `recommendStop` field is also required in the Zod schema without the prompt explicitly enforcing its presence, so if the model does not include it, `generateObject` may fail validation and result handling will be skipped.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: In finalReview, the new recommendation field is required in the Zod schema but the prompt never asks the model to provide it, so generateObject can fail schema validation if the field is omitted, preventing summaries from being processed. [possible bug]
Severity Level: Major ⚠️
- ❌ Final summary not recorded in task.summary.
- ❌ finalReview may not call task.finish().
- ⚠️ Test session ends without final notes.| recommendation: z.string().describe('Follow-up suggestion if needed'), | |
| recommendation: z.string().optional().describe('Follow-up suggestion if needed'), |
Steps of Reproduction ✅
1. Execute a test that reaches the end-of-test cleanup so Tester.finalReview(task) is
invoked. finalReview is defined in src/ai/tester.ts (the function signature is private
async finalReview(task: Test): Promise<void; see the finalReview function block).
2. In finalReview, inspect the Zod schema declared for the final summary (schema block at
lines 664-668, where recommendation is currently required at line 667).
3. finalReview then calls the provider to parse the LLM output: const model =
this.provider.getModelForAgent('tester'); const response = await
this.provider.generateObject(..., schema, model); (the generateObject invocation follows
the schema definition in the same function).
4. If the LLM provides summary and scenarioAchieved but does not emit a recommendation
string, Zod validation will fail because recommendation is required (line 667). As a
result response?.object is undefined and finalReview returns early (if (!result) return;),
preventing task.summary assignment and the task.finish() logic from executing. Reproduce
by running a complete test and using an LLM reply that contains summary and
scenarioAchieved but omits recommendation.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/ai/tester.ts
**Line:** 667:667
**Comment:**
*Possible Bug: In `finalReview`, the new `recommendation` field is required in the Zod schema but the prompt never asks the model to provide it, so `generateObject` can fail schema validation if the field is omitted, preventing summaries from being processed.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: The
recommendResetfield is required in the Zod schema, but the model prompt does not explicitly guarantee it will always be returned, sogenerateObjectcan fail validation if the model omits it, causingresponseto be undefined and silently skipping the progress report logic. [possible bug]Severity Level: Major⚠️
Steps of Reproduction ✅
Prompt for AI Agent 🤖