Merged
14 changes: 7 additions & 7 deletions src/mastra/a2a/a2aCoordinatorAgent.ts
@@ -1,4 +1,4 @@
import type { GoogleGenerativeAIProviderOptions } from '@ai-sdk/google'
import type { GoogleLanguageModelOptions } from '@ai-sdk/google'
import { Agent } from '@mastra/core/agent'
import {
createAnswerRelevancyScorer,
@@ -16,7 +16,6 @@ import { editorAgent } from '../agents/editorAgent'
import { knowledgeIndexingAgent } from '../agents/knowledgeIndexingAgent'
import { projectManagementAgent } from '../agents/projectManagementAgent'
import { researchAgent } from '../agents/researchAgent'
import { googleAI, googleAIFlashLite } from '../config/google'
import { pgMemory } from '../config/pg-storage'
import { repoIngestionWorkflow } from '../workflows/repo-ingestion-workflow'
import { researchSynthesisWorkflow } from '../workflows/research-synthesis-workflow'
@@ -96,15 +95,14 @@ Use knowledgeIndexingAgent to provide semantic context for complex queries.
thinkingConfig: {
thinkingLevel: 'high',
includeThoughts: true,
thinkingBudget: -1,
},
mediaResolution: 'MEDIA_RESOLUTION_MEDIUM',
responseModalities: ['TEXT', 'IMAGE'],
} satisfies GoogleGenerativeAIProviderOptions,
} satisfies GoogleLanguageModelOptions,
},
}
},
model: googleAI,
model: 'google/gemini-3.1-flash-preview',
memory: pgMemory,
options: {},
agents: {
@@ -128,11 +126,13 @@ Use knowledgeIndexingAgent to provide semantic context for complex queries.
tools: {},
scorers: {
relevancy: {
scorer: createAnswerRelevancyScorer({ model: googleAIFlashLite }),
scorer: createAnswerRelevancyScorer({ model:
'google/gemini-3.1-flash-lite-preview'
}),
sampling: { type: 'ratio', rate: 0.4 },
},
safety: {
scorer: createToxicityScorer({ model: googleAIFlashLite }),
scorer: createToxicityScorer({ model: 'google/gemini-3.1-flash-lite-preview' }),
sampling: { type: 'ratio', rate: 0.3 },
},
},
40 changes: 8 additions & 32 deletions src/mastra/a2a/codingA2ACoordinator.ts
@@ -4,32 +4,17 @@ import {
createToxicityScorer,
} from '@mastra/evals/scorers/prebuilt'

import { googleAIFlashLite } from '../config/google'
import { log } from '../config/logger'
import { pgMemory } from '../config/pg-storage'

import type { GoogleGenerativeAIProviderOptions } from '@ai-sdk/google'
import { google, type GoogleLanguageModelOptions } from '@ai-sdk/google'
import { InternalSpans } from '@mastra/core/observability'
import {
codeArchitectAgent,
codeReviewerAgent,
refactoringAgent,
testEngineerAgent,
} from '../agents/codingAgents'
import {
checkFileExists,
createDirectory,
createSandbox,
deleteFile,
getFileInfo,
getFileSize,
listFiles,
runCode,
runCommand,
watchDirectory,
writeFile,
writeFiles,
} from '../tools/e2b'
import { automatedReportingWorkflow } from '../workflows/automated-reporting-workflow'
import { dataAnalysisWorkflow } from '../workflows/data-analysis-workflow'
import { financialReportWorkflow } from '../workflows/financial-report-workflow'
@@ -197,11 +182,11 @@ When a user's request requires prolonged, structured work across multiple subtas
},
mediaResolution: 'MEDIA_RESOLUTION_MEDIUM',
responseModalities: ['TEXT', 'IMAGE'],
} satisfies GoogleGenerativeAIProviderOptions,
} satisfies GoogleLanguageModelOptions,
},
}
},
model: googleAIFlashLite,
model: 'google/gemini-3.1-flash-preview',
memory: pgMemory,

agents: {
@@ -222,18 +207,9 @@ When a user's request requires prolonged, structured work across multiple subtas
automatedReportingWorkflow,
},
tools: {
createSandbox,
writeFile,
writeFiles,
listFiles,
deleteFile,
createDirectory,
getFileInfo,
checkFileExists,
getFileSize,
watchDirectory,
runCommand,
runCode,
google_search: google.tools.googleSearch({}),
url_context: google.tools.urlContext({}),
code_execution: google.tools.codeExecution({}),
Copilot AI Mar 21, 2026


This coordinator’s instructions heavily emphasize using E2B sandbox tools, but the tools map now only exposes Google tools (google_search, url_context, code_execution). Update the instructions to reflect the available execution mechanism, or re-add the sandbox tools so the coordinator can actually perform the described verification steps.

Suggested change
-code_execution: google.tools.codeExecution({}),
+code_execution: google.tools.codeExecution({}),
+// Aliases to support "E2B sandbox" style instructions while using Google's code execution.
+e2b_sandbox: google.tools.codeExecution({}),
+e2b_code_execution: google.tools.codeExecution({}),
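
The mismatch this comment describes can be checked mechanically. The following is a minimal sketch and is not part of the PR: `findMissingToolReferences` and `escapeRegExp` are hypothetical helpers, the sample instruction text is invented for illustration, and the tool values are stubs. Only the three tool keys come from the diff.

```typescript
// Hypothetical helper (not in the PR): escape a string for use in a RegExp.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Hypothetical helper (not in the PR): report tool names that the prompt
// mentions but the agent's `tools` map does not expose.
function findMissingToolReferences(
  instructions: string,
  candidateNames: readonly string[],
  tools: Record<string, unknown>,
): string[] {
  return candidateNames.filter((name) => {
    // Whole-word match so `writeFile` does not match inside `writeFiles`.
    const mentioned = new RegExp(`\\b${escapeRegExp(name)}\\b`).test(instructions)
    return mentioned && !(name in tools)
  })
}

// Illustrative prompt fragment, assumed for this example.
const sampleInstructions =
  'Use createSandbox, then writeFiles and runCommand to verify changes.'

// Tool keys as they appear in the updated coordinator; values are stubbed.
const coordinatorTools = {
  google_search: {},
  url_context: {},
  code_execution: {},
}

const missing = findMissingToolReferences(
  sampleInstructions,
  ['createSandbox', 'writeFile', 'writeFiles', 'runCommand', 'code_execution'],
  coordinatorTools,
)
console.log(missing)
```

A check like this could run in a unit test so that a prompt and its tool map are updated in the same change.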

},
Comment on lines 209 to 213

⚠️ Potential issue | 🔴 Critical

E2B sandbox tools replaced with Google tools, but instructions still reference E2B extensively.

The tools configuration now provides google_search, url_context, and code_execution from Google's SDK. However, the agent instructions (lines 108-118, 136-140, 162-165, 173-174) contain extensive E2B sandbox documentation including:

  • "createSandbox", "writeFiles", "runCommand", "readFile", "deleteFile"
  • "Use E2B sandboxes for any execution-related tasks"
  • "verifying in E2B sandbox", "run tests in sandbox"

The google.tools.codeExecution({}) tool is not equivalent to E2B—it provides isolated code execution within Google's infrastructure but lacks filesystem operations (writeFiles, readFile, deleteFile) and sandbox lifecycle management (createSandbox).

This mismatch will cause the coordinator to generate plans referencing unavailable tools.

🐛 Instructions need significant updates to reflect new tool capabilities

Key sections to update:

  • Lines 108-110: Remove E2B sandbox pattern references
  • Lines 112-118: Update "Sandbox Code Execution" pattern to reflect code_execution limitations
  • Lines 135-140: Remove E2B sandbox usage documentation
  • Lines 162-163: Update refactoring verification approach
  • Line 165: Update to reflect that code_execution doesn't support sandbox test runs

The instructions should document what google_search, url_context, and code_execution actually provide instead of E2B capabilities.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mastra/a2a/codingA2ACoordinator.ts` around lines 209 - 213, The agent
instructions still describe E2B sandbox APIs (createSandbox, writeFiles,
readFile, deleteFile, run tests in sandbox) but the tools config now uses
google.tools.googleSearch, google.tools.urlContext and
google.tools.codeExecution; update the instructional text in
codingA2ACoordinator.ts to remove E2B-specific patterns and replace them with
accurate capabilities: describe that google_search provides web search/links,
url_context fetches page context/metadata, and code_execution runs isolated code
snippets only (no persistent filesystem, no create/delete sandbox lifecycle, no
multi-file writes or running test suites that expect file I/O). Specifically,
remove references to createSandbox/writeFiles/readFile/deleteFile and any “use
E2B sandbox” guidance, change the “Sandbox Code Execution” and refactoring
verification sections to instruct using code_execution for short-lived snippet
runs and to rely on external CI or mocked tests for multi-file or
filesystem-dependent test runs, and update examples and planning templates to
avoid suggesting unavailable APIs.
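
One way to keep the instructions and the tool map from drifting apart is to render the tool section of the prompt from a single description table. This is a sketch under assumptions, not the PR's approach: `toolCapabilities` and `renderToolSection` are hypothetical names, and the descriptions paraphrase the capabilities listed in the comment above.

```typescript
// Hypothetical sketch: one description per exposed tool. The wording
// paraphrases the review comment and is not taken from the codebase.
const toolCapabilities: Record<string, string> = {
  google_search: 'web search that returns results and links',
  url_context: 'fetches context and metadata for a given page',
  code_execution:
    'runs short, isolated code snippets (no persistent filesystem, no sandbox lifecycle)',
}

// Render the prompt's tool section from the table, so the text cannot
// mention a tool that is not listed.
function renderToolSection(capabilities: Record<string, string>): string {
  const lines = Object.entries(capabilities).map(
    ([name, description]) => `- ${name}: ${description}`,
  )
  return ['Available tools:', ...lines].join('\n')
}

const toolSection = renderToolSection(toolCapabilities)
console.log(toolSection)
```

The rendered string could then be interpolated into the coordinator's instructions in place of the hand-written E2B section.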

maxRetries: 5,
options: {
@@ -244,11 +220,11 @@ When a user's request requires prolonged, structured work across multiple subtas

scorers: {
relevancy: {
scorer: createAnswerRelevancyScorer({ model: googleAIFlashLite }),
scorer: createAnswerRelevancyScorer({ model: 'google/gemini-3.1-flash-lite-preview' }),
sampling: { type: 'ratio', rate: 0.4 },
},
safety: {
scorer: createToxicityScorer({ model: googleAIFlashLite }),
scorer: createToxicityScorer({ model: 'google/gemini-3.1-flash-lite-preview' }),
sampling: { type: 'ratio', rate: 0.3 },
},
},
4 changes: 2 additions & 2 deletions src/mastra/agents/academicResearchAgent.ts
@@ -1,4 +1,4 @@
import type { GoogleGenerativeAIProviderOptions } from '@ai-sdk/google'
import type { GoogleLanguageModelOptions } from '@ai-sdk/google'
import { Agent } from '@mastra/core/agent'
import type { RequestContext } from '@mastra/core/request-context'
import type { AgentRequestContext } from './request-context'
@@ -106,7 +106,7 @@ ${
includeThoughts: true,
thinkingLevel: 'high',
},
} satisfies GoogleGenerativeAIProviderOptions,
} satisfies GoogleLanguageModelOptions,
},
}
},
20 changes: 3 additions & 17 deletions src/mastra/agents/acpAgent.ts
@@ -2,14 +2,6 @@ import { Agent } from '@mastra/core/agent'
import { pgMemory, pgQueryTool } from '../config'
import { arxivTool } from '../tools/arxiv.tool'
import { csvToJsonTool } from '../tools/csv-to-json.tool'
import {
createDataDirTool,
getDataFileInfoTool,
listDataDirTool,
moveDataFileTool,
searchDataFilesTool,
writeDataFileTool,
} from '../tools/data-file-manager'
import {
csvToExcalidrawTool,
readCSVDataTool,
@@ -25,7 +17,7 @@ import {
} from '../tools/github'
import { jsonToCsvTool } from '../tools/json-to-csv.tool'
import { pdfToMarkdownTool } from '../tools/pdf-data-conversion.tool'
import type { GoogleGenerativeAIProviderOptions } from '@ai-sdk/google'
import type { GoogleLanguageModelOptions } from '@ai-sdk/google'
import type { RequestContext } from '@mastra/core/request-context'
import { PGVECTOR_PROMPT } from '@mastra/pg'
import { TokenLimiterProcessor } from '@mastra/core/processors'
@@ -74,10 +66,10 @@ User: ${userId} | Role: ${roleConstraint}
google: {
thinkingConfig: {
includeThoughts: true,
thinkingBudget: -1,
thinkingLevel: 'medium',
},
responseModalities: ['TEXT'],
} satisfies GoogleGenerativeAIProviderOptions,
} satisfies GoogleLanguageModelOptions,
},
}
},
@@ -96,12 +88,6 @@ User: ${userId} | Role: ${roleConstraint}
csvToExcalidrawTool,
readCSVDataTool,
// convertDataFormatTool,
writeDataFileTool,
listDataDirTool,
searchDataFilesTool,
moveDataFileTool,
getDataFileInfoTool,
createDataDirTool,
searchCode,
getFileContent,
getRepositoryInfo,
58 changes: 9 additions & 49 deletions src/mastra/agents/codingAgents.ts
@@ -1,39 +1,20 @@
import { Agent } from '@mastra/core/agent'

import type { GoogleGenerativeAIProviderOptions } from '@ai-sdk/google'
import type { GoogleGenerativeAIProviderOptions, GoogleLanguageModelOptions } from '@ai-sdk/google'
import {
TokenLimiterProcessor
} from '@mastra/core/processors'
import {
createAnswerRelevancyScorer,
createToxicityScorer,
} from '@mastra/evals/scorers/prebuilt'
import {
google3,
googleAI3,
googleAIFlashLite,
} from '../config/google'
import { log } from '../config/logger'

import { InternalSpans } from '@mastra/core/observability'
import { pgMemory } from '../config/pg-storage'
import { codeAnalysisTool } from '../tools/code-analysis.tool'
import { codeSearchTool } from '../tools/code-search.tool'
import { diffReviewTool } from '../tools/diff-review.tool'
import {
checkFileExists,
createDirectory,
createSandbox,
deleteFile,
getFileInfo,
getFileSize,
listFiles,
runCode,
runCommand,
watchDirectory,
writeFile,
writeFiles,
} from '../tools/e2b'
import { findReferencesTool } from '../tools/find-references.tool'
import { findSymbolTool } from '../tools/find-symbol.tool'
import {
@@ -87,34 +68,13 @@ const codeReviewerTools = {
const testEngineerTools = {
codeAnalysisTool,
testGeneratorTool,
createSandbox,
runCommand,
runCode,
writeFile,
writeFiles,
deleteFile,
listFiles,
getFileInfo,
getFileSize,
checkFileExists,

Copilot AI Mar 21, 2026


testEngineerTools/refactoringTools no longer expose the E2B sandbox/file tools, but the agent prompts later in this file still instruct using writeFiles, runCommand, sandboxes, etc. Either restore the removed tools or update the instructions to match the current toolset so the agents can execute their own workflow steps.

Suggested change
codeSearchTool,
multiStringEditTool,
searchCode,
getFileContent,
getRepositoryInfo,

}
Comment on lines 68 to 72

⚠️ Potential issue | 🟠 Major

Don't remove the execution backend while these agents still promise sandbox verification.

Both tool maps now omit every execution/sandbox primitive, but testEngineerAgent still instructs the model to createSandbox, writeFiles, and runCommand, refactoringAgent still tells it to verify changes in a sandbox, and src/mastra/a2a/codingA2ACoordinator.ts:103-119,135-141 still routes work assuming those capabilities exist. After this change, the agents can generate tests/diffs, but they can no longer perform the verification step they advertise.

Either wire in the replacement execution tool in this PR or remove the sandbox-verification promises from the prompts/coordinator in the same pass.

Also applies to: 74-81

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mastra/agents/codingAgents.ts` around lines 68 - 72, The agents advertise
sandbox verification but the tool map testEngineerTools no longer exposes any
execution primitives; either re-add or wire a replacement execution tool (e.g.,
an ExecutionTool/TestRunner with methods createSandbox, writeFiles, runCommand)
into testEngineerTools alongside codeAnalysisTool and testGeneratorTool and
update testEngineerAgent and refactoringAgent to call that tool’s methods, or
remove all sandbox-verification wording and coordinator routing that expects
createSandbox/writeFiles/runCommand from testEngineerAgent, refactoringAgent and
the coordinator logic so prompts and routing match the available tools. Ensure
references to createSandbox, writeFiles and runCommand in testEngineerAgent,
refactoringAgent and the coordinator are made consistent with whichever approach
you choose.
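
If the second option is taken (removing the sandbox-verification promises), the prompt wording can be made conditional on the tool map instead of being edited by hand. The sketch below is an assumption-laden illustration, not code from the PR: `buildVerificationInstructions` is a hypothetical helper, both instruction strings are invented, and the tool values are stubs. The primitive names are the three cited in the comment above.

```typescript
// Execution primitives named in the review comment.
const EXECUTION_PRIMITIVES = ['createSandbox', 'writeFiles', 'runCommand'] as const

// Hypothetical helper: only promise sandbox verification when every
// execution primitive is present in the agent's tool map.
function buildVerificationInstructions(tools: Record<string, unknown>): string {
  const canExecute = EXECUTION_PRIMITIVES.every((name) => name in tools)
  return canExecute
    ? 'Verify changes by creating a sandbox, writing the files, and running the tests.'
    : 'Do not claim to have executed code. Return the tests or diffs and state that verification must happen externally (for example, in CI).'
}

// The two tool keys visible in the updated testEngineerTools hunk; values are stubbed.
const testEngineerToolsStub = { codeAnalysisTool: {}, testGeneratorTool: {} }

// The same map with the removed primitives restored, for comparison.
const withSandboxStub = {
  ...testEngineerToolsStub,
  createSandbox: {},
  writeFiles: {},
  runCommand: {},
}

console.log(buildVerificationInstructions(testEngineerToolsStub))
console.log(buildVerificationInstructions(withSandboxStub))
```

With this shape, restoring or replacing the execution backend later changes the prompt automatically, so the agent never advertises a step it cannot perform.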


const refactoringTools = {
codeAnalysisTool,
diffReviewTool,
multiStringEditTool,
createSandbox,
runCode,
runCommand,
writeFile,
writeFiles,
deleteFile,
listFiles,
getFileInfo,
getFileSize,
checkFileExists,
createDirectory,
watchDirectory,
searchCode,
getFileContent,
getRepositoryInfo,
@@ -178,19 +138,19 @@ Always consider maintainability, scalability, and testability in your recommenda
responseModalities: ['TEXT'],
cachedContent:
'Repo Name, Description, Key Modules, Recent Commits',
} satisfies GoogleGenerativeAIProviderOptions,
} satisfies GoogleLanguageModelOptions,
},
}
},
model: ({ requestContext }) => {
const userTier = getUserTierFromContext(requestContext)
return userTier === 'enterprise' ? googleAI3 : google3
return userTier === 'enterprise' ? 'google/gemini-3.1-flash-preview' : 'google/gemini-3.1-flash-lite-preview'
},
tools: codeArchitectTools,
memory: pgMemory,
scorers: {
relevancy: {
scorer: createAnswerRelevancyScorer({ model: googleAIFlashLite }),
scorer: createAnswerRelevancyScorer({ model: 'google/gemini-3.1-flash-lite-preview' }),
sampling: { type: 'ratio', rate: 0.5 },
},
},
@@ -304,11 +264,11 @@ Be constructive and educational in feedback.`,
},
scorers: {
relevancy: {
scorer: createAnswerRelevancyScorer({ model: googleAIFlashLite }),
scorer: createAnswerRelevancyScorer({ model: 'google/gemini-3.1-flash-lite-preview' }),
sampling: { type: 'ratio', rate: 0.5 },
},
safety: {
scorer: createToxicityScorer({ model: googleAIFlashLite }),
scorer: createToxicityScorer({ model: 'google/gemini-3.1-flash-lite-preview' }),
sampling: { type: 'ratio', rate: 0.3 },
},
},
@@ -427,7 +387,7 @@ Always use Vitest syntax: describe, it, expect, vi.mock, vi.fn.`,
},
scorers: {
relevancy: {
scorer: createAnswerRelevancyScorer({ model: googleAIFlashLite }),
scorer: createAnswerRelevancyScorer({ model: 'google/gemini-3.1-flash-lite-preview' }),
sampling: { type: 'ratio', rate: 0.5 },
},
},
@@ -554,7 +514,7 @@ For each refactoring:
},
scorers: {
relevancy: {
scorer: createAnswerRelevancyScorer({ model: googleAIFlashLite }),
scorer: createAnswerRelevancyScorer({ model: 'google/gemini-3.1-flash-lite-preview' }),
sampling: { type: 'ratio', rate: 0.5 },
},
},
8 changes: 5 additions & 3 deletions src/mastra/agents/contentStrategistAgent.ts
@@ -2,10 +2,9 @@ import { Agent } from '@mastra/core/agent'
import { pgMemory } from '../config/pg-storage'
import {
scrapingSchedulerTool,
webScraperTool,
} from '../tools/web-scraper-tool'

import type { GoogleGenerativeAIProviderOptions } from '@ai-sdk/google'
import { google, type GoogleGenerativeAIProviderOptions } from '@ai-sdk/google'

⚠️ Potential issue | 🟡 Minor

Inconsistent type and thinking config compared to other agents.

This agent imports GoogleGenerativeAIProviderOptions and uses thinkingBudget: -1, while other agents in this PR (e.g., reportAgent, acpAgent, editorAgent) were updated to use GoogleLanguageModelOptions with thinkingLevel: 'medium'.

♻️ Suggested fix for consistency
-import { google, type GoogleGenerativeAIProviderOptions } from '@ai-sdk/google'
+import { google, type GoogleLanguageModelOptions } from '@ai-sdk/google'
       providerOptions: {
         google: {
           thinkingConfig: {
             includeThoughts: true,
-            thinkingBudget: -1,
+            thinkingLevel: 'medium',
           },
           mediaResolution: 'MEDIA_RESOLUTION_MEDIUM',
           responseModalities: ['TEXT'],
-        } satisfies GoogleGenerativeAIProviderOptions,
+        } satisfies GoogleLanguageModelOptions,
       },

Also applies to: 132-141

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mastra/agents/contentStrategistAgent.ts` at line 7, The agent is using
the old GoogleGenerativeAIProviderOptions type and thinkingBudget:-1, which is
inconsistent with other agents; update the import to use
GoogleLanguageModelOptions from '@ai-sdk/google', change any provider/config
typing from GoogleGenerativeAIProviderOptions to GoogleLanguageModelOptions, and
replace the thinkingBudget setting with thinkingLevel: 'medium' (apply the same
change where the provider/config is constructed in the ContentStrategistAgent
class or its create/config function and in the block corresponding to the
previous thinkingBudget usage).
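
A shared factory is one possible way to prevent this kind of per-agent drift. The sketch below is hypothetical and deliberately self-contained: `GoogleOptionsSketch` is a local stand-in that mirrors only the fields visible in this diff, not the real `GoogleLanguageModelOptions` type from '@ai-sdk/google', and `makeGoogleOptions` is an assumed helper name.

```typescript
// Only the thinking levels that appear in this diff; the real type may allow more.
type ThinkingLevel = 'medium' | 'high'

// Local stand-in for the fields this PR touches. The real type is
// `GoogleLanguageModelOptions` from '@ai-sdk/google'; it is not imported
// here so the sketch stays self-contained.
interface GoogleOptionsSketch {
  thinkingConfig: { includeThoughts: boolean; thinkingLevel: ThinkingLevel }
  mediaResolution?: 'MEDIA_RESOLUTION_MEDIUM'
  responseModalities: Array<'TEXT' | 'IMAGE'>
}

// Hypothetical factory: one place to build the Google provider options, so
// every agent uses thinkingLevel rather than the older thinkingBudget.
function makeGoogleOptions(thinkingLevel: ThinkingLevel): GoogleOptionsSketch {
  return {
    thinkingConfig: { includeThoughts: true, thinkingLevel },
    mediaResolution: 'MEDIA_RESOLUTION_MEDIUM',
    responseModalities: ['TEXT'],
  }
}

const contentStrategistOptions = makeGoogleOptions('medium')
console.log(contentStrategistOptions.thinkingConfig)
```

In the real codebase the return value would presumably be checked with `satisfies GoogleLanguageModelOptions`, as the other agents in this PR do.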

import { InternalSpans } from '@mastra/core/observability'
import { TokenLimiterProcessor } from '@mastra/core/processors'
import {
@@ -19,6 +18,7 @@ import {
createToneScorer,
} from '../evals/scorers/prebuilt'
import { chartSupervisorTool } from '../tools/financial-chart-tools'
import { fetchTool } from '../tools/fetch.tool'

const STAGGERED_OUTPUT_CONTEXT_KEY = 'staggeredOutput' as const
const SECTION_COUNT_CONTEXT_KEY = 'sectionCount' as const
@@ -79,9 +79,11 @@ function getBackupDataToolsFromContext(requestContext: {
}

const contentStrategistTools = {
webScraperTool,
fetchTool,
chartSupervisorTool,
scrapingSchedulerTool,
google_search: google.tools.googleSearch({}),
url_context: google.tools.urlContext({}),
}
Comment on lines 81 to 87
Copilot AI Mar 21, 2026


contentStrategistTools no longer includes webScraperTool, but the system prompt still instructs “Use 'webScraperTool' for trends, audience, and competitors.” Update the prompt to reference fetchTool/google_search (or re-add webScraperTool) to avoid tool-call failures and confusion.


export const contentStrategistAgent = new Agent({