feat(server): agent-definition loader (Phase 1)#363
Conversation
dimakis
left a comment
There was a problem hiding this comment.
Centaur Review
Found 8 issue(s) (1 critical) (4 warning).
server/agent-loader.ts
The agent-loader module is well-structured with good test coverage, but has a critical bug: it imports from the yaml package which is not installed (the project uses js-yaml). The local YAML fallback path will fail at runtime, hidden by test mocks. Secondary concerns: const → let without mutation, hardcoded fallback identity name, and cache key missing the server URL.
- 🔴 bugs (L14): Imports
parsefrom'yaml'but the project depends on'js-yaml'(see root package.json andserver/session-index.tswhich imports from'js-yaml'). Theyamlpackage is not installed —loadFromLocal()will fail at runtime with a module-not-found error. The test mocks theyamlmodule so this is invisible in tests. Should beimport { load as parseYaml } from 'js-yaml'(note:js-yamlexportsload, notparse).[fixable] - 🟡 unsafe_assumptions (L208): Cache key is
${agentName}:${cwd}but does not includecontexginUrl. IfloadAgentDefis ever called with differentcontexginUrlvalues for the same agent+cwd pair, the cache will return the result from the first server. Currently safe becausecontexginUrlis always the default, but fragile if the API surface expands.[fixable] - 🟡 bugs (L239): The bundled fallback always returns
DEFAULT_AGENT_DEFINITIONwhich hardcodesidentity.name: 'mitzo-conversational', regardless of theagentNameparameter. If a different agent name is requested and all sources fail, the returned definition has a mismatched name. Consider spreading the constant and overridingidentity.namewith the requestedagentName.[fixable] - 🟡 unsafe_assumptions (L127): ContexGin response fields (
identity,provider,governance,memory) are cast directly to typed interfaces viaaswithout runtime validation. A malformed ContexGin response (e.g.,identity.nameis a number) would pass the null-check but produce a silently incorrectAgentDefinition. Theidentity?.nameandidentity?.descriptiontruthy checks catch null/undefined but not wrong types.[fixable]
server/chat.ts
The agent-loader module is well-structured with good test coverage, but has a critical bug: it imports from the yaml package which is not installed (the project uses js-yaml). The local YAML fallback path will fail at runtime, hidden by test mocks. Secondary concerns: const → let without mutation, hardcoded fallback identity name, and cache key missing the server URL.
- 🔵 style (L888):
systemPromptAppendwas changed fromconsttoletbut is never reassigned anywhere in the function. Should remainconst.[fixable] - 🟡 regressions (L920): The refactored
fetchBootContextno longer passesrepoRootwhen called from_startChatInner— it uses the defaultBASE_REPO. The old code usedcwd(which may be a worktree path). If the local Python fallback script (build_boot_context.py) needs to run from the worktree rather than BASE_REPO, this changes behavior. Verify the script works correctly when run from BASE_REPO even when the session is in a worktree.[fixable]
server/__tests__/agent-loader.test.ts
The agent-loader module is well-structured with good test coverage, but has a critical bug: it imports from the yaml package which is not installed (the project uses js-yaml). The local YAML fallback path will fail at runtime, hidden by test mocks. Secondary concerns: const → let without mutation, hardcoded fallback identity name, and cache key missing the server URL.
- 🔵 missing_tests: No test for cache TTL expiration — a cached entry past its 5-minute TTL should trigger a re-fetch. The existing cache tests only verify basic hit/miss and
clearCache(), not time-based expiry.[fixable]
packages/harness/src/session-registry.ts
The agent-loader module is well-structured with good test coverage, but has a critical bug: it imports from the yaml package which is not installed (the project uses js-yaml). The local YAML fallback path will fail at runtime, hidden by test mocks. Secondary concerns: const → let without mutation, hardcoded fallback identity name, and cache key missing the server URL.
- 🔵 style (L34):
agentDefinitionis typed asRecord<string, unknown>in ManagedSession but the loader produces a well-typedAgentDefinition. Consider importing and usingAgentDefinitionfrom agent-loader to preserve type safety end-to-end, instead of the double cast throughunknownat chat.ts:903.[fixable]
|
|
||
| import { readFileSync } from 'fs'; | ||
| import { join } from 'path'; | ||
| import { parse as parseYaml } from 'yaml'; |
There was a problem hiding this comment.
🔴 bugs: Imports parse from 'yaml' but the project depends on 'js-yaml' (see root package.json and server/session-index.ts which imports from 'js-yaml'). The yaml package is not installed — loadFromLocal() will fail at runtime with a module-not-found error. The test mocks the yaml module so this is invisible in tests. Should be import { load as parseYaml } from 'js-yaml' (note: js-yaml exports load, not parse). [fixable]
| contexginUrl: string = process.env.CONTEXGIN_URL || 'http://localhost:8321', | ||
| ): Promise<LoadedAgentDefinition> { | ||
| // Check cache | ||
| const cacheKey = `${agentName}:${cwd}`; |
There was a problem hiding this comment.
🟡 unsafe_assumptions: Cache key is ${agentName}:${cwd} but does not include contexginUrl. If loadAgentDef is ever called with different contexginUrl values for the same agent+cwd pair, the cache will return the result from the first server. Currently safe because contexginUrl is always the default, but fragile if the API surface expands. [fixable]
| // 3. Bundled fallback | ||
| log.info('using bundled agent definition fallback', { agent: agentName }); | ||
| const fallback: LoadedAgentDefinition = { | ||
| definition: DEFAULT_AGENT_DEFINITION, |
There was a problem hiding this comment.
🟡 bugs: The bundled fallback always returns DEFAULT_AGENT_DEFINITION which hardcodes identity.name: 'mitzo-conversational', regardless of the agentName parameter. If a different agent name is requested and all sources fail, the returned definition has a mismatched name. Consider spreading the constant and overriding identity.name with the requested agentName. [fixable]
|
|
||
| if (!res.ok) return null; | ||
|
|
||
| const data = (await res.json()) as Record<string, unknown>; |
There was a problem hiding this comment.
🟡 unsafe_assumptions: ContexGin response fields (identity, provider, governance, memory) are cast directly to typed interfaces via as without runtime validation. A malformed ContexGin response (e.g., identity.name is a number) would pass the null-check but produce a silently incorrect AgentDefinition. The identity?.name and identity?.description truthy checks catch null/undefined but not wrong types. [fixable]
|
|
||
| // Build the system prompt append string (used by both query and comparison) | ||
| const systemPromptAppend = | ||
| let systemPromptAppend = |
There was a problem hiding this comment.
🔵 style: systemPromptAppend was changed from const to let but is never reassigned anywhere in the function. Should remain const. [fixable]
| }); | ||
|
|
||
| // Fire-and-forget: fetch boot context from running ContexGin server | ||
| fetchBootContext(agentName) |
There was a problem hiding this comment.
🟡 regressions: The refactored fetchBootContext no longer passes repoRoot when called from _startChatInner — it uses the default BASE_REPO. The old code used cwd (which may be a worktree path). If the local Python fallback script (build_boot_context.py) needs to run from the worktree rather than BASE_REPO, this changes behavior. Verify the script works correctly when run from BASE_REPO even when the session is in a worktree. [fixable]
| /** All worktrees created for this session, keyed by repo name. */ | ||
| worktreePaths: Map<string, { path: string; wtId: string }>; | ||
| /** Parsed agent definition (recipe). Null if loading failed. */ | ||
| agentDefinition: Record<string, unknown> | null; |
There was a problem hiding this comment.
🔵 style: agentDefinition is typed as Record<string, unknown> in ManagedSession but the loader produces a well-typed AgentDefinition. Consider importing and using AgentDefinition from agent-loader to preserve type safety end-to-end, instead of the double cast through unknown at chat.ts:903. [fixable]
Load full agent definitions from ContexGin API, local .agents/ overrides, or bundled fallback at session start. Store parsed recipe in session registry for downstream consumption (governance, provider tiering, memory scope in future phases). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8066d4d to
cbca1e5
Compare
dimakis
left a comment
There was a problem hiding this comment.
Centaur Review
Found 6 issue(s) (2 warning).
server/agent-loader.ts
Solid Phase 1 with good test coverage and graceful degradation; main concerns are type erasure through Record<string, unknown> on the session interface (will compound in Phase 2) and an inherited path-traversal pattern on the user-controlled agentName.
- 🟡 unsafe_assumptions (L164): Path traversal via user-controlled
agentName:join(cwd, '.agents',${agentName}.yaml)does not validate that the resolved path stays undercwd/.agents/. An authenticated WebSocket client sendingagentName: '../../../etc/something'could read arbitrary.yamlfiles on disk. The same pattern already exists atchat.ts:825, so this is pre-existing, but worth fixing as part of this new module — e.g.resolve()the path and check it starts with the expected prefix.[fixable] - 🔵 bugs (L239): The bundled fallback always returns
DEFAULT_AGENT_DEFINITIONwithidentity.name: 'mitzo-conversational'regardless of whichagentNamewas requested. If a session starts with a non-default agent and all sources fail, the stored definition's name won't match the session'sagentName. Consider overridingidentity.namewith the requestedagentNamein the fallback path.[fixable] - 🔵 style (L165):
readFileSyncblocks the event loop inside what is otherwise a fully async resolution chain. SinceloadAgentDefis already async, consider usingreadFilefromfs/promisesand makingloadFromLocalasync to match.[fixable]
server/chat.ts
Solid Phase 1 with good test coverage and graceful degradation; main concerns are type erasure through Record<string, unknown> on the session interface (will compound in Phase 2) and an inherited path-traversal pattern on the user-controlled agentName.
- 🟡 style (L777):
loaded.definition as unknown as Record<string, unknown>double-casts away the typedAgentDefinitioninto an untyped record. This happens becauseManagedSession.agentDefinitionis declared asRecord<string, unknown> | nullinstead of importing and usingAgentDefinition. Phase 2 consumers will have to cast it back. Consider typing the session field asAgentDefinition | null(import fromagent-loader.js) to preserve type safety end-to-end.[fixable]
packages/harness/src/session-registry.ts
Solid Phase 1 with good test coverage and graceful degradation; main concerns are type erasure through Record<string, unknown> on the session interface (will compound in Phase 2) and an inherited path-traversal pattern on the user-controlled agentName.
- 🔵 style (L56):
agentDefinitionSourceis typed asstring?but the actual values are the union'contexgin' | 'local' | 'fallback'(AgentDefinitionSourcein agent-loader.ts). Using the concrete union type would prevent invalid values and give downstream consumers better autocomplete/exhaustiveness checking.[fixable]
server/__tests__/agent-loader.test.ts
Solid Phase 1 with good test coverage and graceful degradation; main concerns are type erasure through Record<string, unknown> on the session interface (will compound in Phase 2) and an inherited path-traversal pattern on the user-controlled agentName.
- 🔵 missing_tests: Cache TTL expiry is untested — the tests verify cache hits and
clearCache(), but don't verify that an expired entry is re-fetched (e.g. by advancingDate.now()withvi.useFakeTimers). Also, theContexGin 200 with missing providerpath (exercising theprovider ?? { default: 'claude-opus-4' }fallback at line 146) has no test.[fixable]
| */ | ||
| function loadFromLocal(agentName: string, cwd: string): LoadedAgentDefinition | null { | ||
| try { | ||
| const filePath = join(cwd, '.agents', `${agentName}.yaml`); |
There was a problem hiding this comment.
🟡 unsafe_assumptions: Path traversal via user-controlled agentName: join(cwd, '.agents', ${agentName}.yaml) does not validate that the resolved path stays under cwd/.agents/. An authenticated WebSocket client sending agentName: '../../../etc/something' could read arbitrary .yaml files on disk. The same pattern already exists at chat.ts:825, so this is pre-existing, but worth fixing as part of this new module — e.g. resolve() the path and check it starts with the expected prefix. [fixable]
| // 3. Bundled fallback | ||
| log.info('using bundled agent definition fallback', { agent: agentName }); | ||
| const fallback: LoadedAgentDefinition = { | ||
| definition: DEFAULT_AGENT_DEFINITION, |
There was a problem hiding this comment.
🔵 bugs: The bundled fallback always returns DEFAULT_AGENT_DEFINITION with identity.name: 'mitzo-conversational' regardless of which agentName was requested. If a session starts with a non-default agent and all sources fail, the stored definition's name won't match the session's agentName. Consider overriding identity.name with the requested agentName in the fallback path. [fixable]
| function loadFromLocal(agentName: string, cwd: string): LoadedAgentDefinition | null { | ||
| try { | ||
| const filePath = join(cwd, '.agents', `${agentName}.yaml`); | ||
| const raw = readFileSync(filePath, 'utf-8'); |
There was a problem hiding this comment.
🔵 style: readFileSync blocks the event loop inside what is otherwise a fully async resolution chain. Since loadAgentDef is already async, consider using readFile from fs/promises and making loadFromLocal async to match. [fixable]
| .then((loaded) => { | ||
| const s = registry.get(clientId); | ||
| if (s) { | ||
| s.agentDefinition = loaded.definition as unknown as Record<string, unknown>; |
There was a problem hiding this comment.
🟡 style: loaded.definition as unknown as Record<string, unknown> double-casts away the typed AgentDefinition into an untyped record. This happens because ManagedSession.agentDefinition is declared as Record<string, unknown> | null instead of importing and using AgentDefinition. Phase 2 consumers will have to cast it back. Consider typing the session field as AgentDefinition | null (import from agent-loader.js) to preserve type safety end-to-end. [fixable]
| /** Parsed agent definition (recipe). Null if loading failed. */ | ||
| agentDefinition: Record<string, unknown> | null; | ||
| /** Source of the agent definition: 'contexgin' | 'local' | 'fallback'. */ | ||
| agentDefinitionSource?: string; |
There was a problem hiding this comment.
🔵 style: agentDefinitionSource is typed as string? but the actual values are the union 'contexgin' | 'local' | 'fallback' (AgentDefinitionSource in agent-loader.ts). Using the concrete union type would prevent invalid values and give downstream consumers better autocomplete/exhaustiveness checking. [fixable]
Summary
server/agent-loader.ts— three-tier recipe discovery (ContexGin API → local.agents/override → bundled fallback) with 5-minute TTL cacheagentDefinition+agentDefinitionSourceinManagedSession(session-registry)loadAgentDef()fire-and-forget into session start inchat.tsDEFAULT_AGENT_DEFINITIONbundled fallback toconstants.tsThis is Phase 1 of the agent-definition-loading spec. Future phases will consume the stored recipe for governance enforcement, provider tiering, memory scoping, and output conventions.
Test plan
agent-loader.test.ts— all sources, caching, error handling🤖 Generated with Claude Code