Skip to content

feat(server): agent-definition loader (Phase 1)#363

Open
dimakis wants to merge 1 commit into
mainfrom
session/2026-05-30-e90bfb59aa9c
Open

feat(server): agent-definition loader (Phase 1)#363
dimakis wants to merge 1 commit into
mainfrom
session/2026-05-30-e90bfb59aa9c

Conversation

@dimakis
Copy link
Copy Markdown
Owner

@dimakis dimakis commented May 30, 2026

Summary

  • Add server/agent-loader.ts — three-tier recipe discovery (ContexGin API → local .agents/ override → bundled fallback) with 5-minute TTL cache
  • Store parsed agentDefinition + agentDefinitionSource in ManagedSession (session-registry)
  • Wire loadAgentDef() fire-and-forget into session start in chat.ts
  • Add DEFAULT_AGENT_DEFINITION bundled fallback to constants.ts

This is Phase 1 of the agent-definition-loading spec. Future phases will consume the stored recipe for governance enforcement, provider tiering, memory scoping, and output conventions.

Test plan

  • 13 new tests in agent-loader.test.ts — all sources, caching, error handling
  • 45 existing tests (boot-context, session-registry) pass with no regressions
  • Manual: verify session starts with ContexGin running (source=contexgin)
  • Manual: verify session starts with ContexGin down (source=fallback)

🤖 Generated with Claude Code

Copy link
Copy Markdown
Owner Author

@dimakis dimakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Centaur Review

Found 8 issue(s) (1 critical) (4 warning).

server/agent-loader.ts

The agent-loader module is well-structured with good test coverage, but has a critical bug: it imports from the yaml package which is not installed (the project uses js-yaml). The local YAML fallback path will fail at runtime, hidden by test mocks. Secondary concerns: constlet without mutation, hardcoded fallback identity name, and cache key missing the server URL.

  • 🔴 bugs (L14): Imports parse from 'yaml' but the project depends on 'js-yaml' (see root package.json and server/session-index.ts which imports from 'js-yaml'). The yaml package is not installed — loadFromLocal() will fail at runtime with a module-not-found error. The test mocks the yaml module so this is invisible in tests. Should be import { load as parseYaml } from 'js-yaml' (note: js-yaml exports load, not parse). [fixable]
  • 🟡 unsafe_assumptions (L208): Cache key is ${agentName}:${cwd} but does not include contexginUrl. If loadAgentDef is ever called with different contexginUrl values for the same agent+cwd pair, the cache will return the result from the first server. Currently safe because contexginUrl is always the default, but fragile if the API surface expands. [fixable]
  • 🟡 bugs (L239): The bundled fallback always returns DEFAULT_AGENT_DEFINITION which hardcodes identity.name: 'mitzo-conversational', regardless of the agentName parameter. If a different agent name is requested and all sources fail, the returned definition has a mismatched name. Consider spreading the constant and overriding identity.name with the requested agentName. [fixable]
  • 🟡 unsafe_assumptions (L127): ContexGin response fields (identity, provider, governance, memory) are cast directly to typed interfaces via as without runtime validation. A malformed ContexGin response (e.g., identity.name is a number) would pass the null-check but produce a silently incorrect AgentDefinition. The identity?.name and identity?.description truthy checks catch null/undefined but not wrong types. [fixable]

server/chat.ts

The agent-loader module is well-structured with good test coverage, but has a critical bug: it imports from the yaml package which is not installed (the project uses js-yaml). The local YAML fallback path will fail at runtime, hidden by test mocks. Secondary concerns: constlet without mutation, hardcoded fallback identity name, and cache key missing the server URL.

  • 🔵 style (L888): systemPromptAppend was changed from const to let but is never reassigned anywhere in the function. Should remain const. [fixable]
  • 🟡 regressions (L920): The refactored fetchBootContext no longer passes repoRoot when called from _startChatInner — it uses the default BASE_REPO. The old code used cwd (which may be a worktree path). If the local Python fallback script (build_boot_context.py) needs to run from the worktree rather than BASE_REPO, this changes behavior. Verify the script works correctly when run from BASE_REPO even when the session is in a worktree. [fixable]

server/__tests__/agent-loader.test.ts

The agent-loader module is well-structured with good test coverage, but has a critical bug: it imports from the yaml package which is not installed (the project uses js-yaml). The local YAML fallback path will fail at runtime, hidden by test mocks. Secondary concerns: constlet without mutation, hardcoded fallback identity name, and cache key missing the server URL.

  • 🔵 missing_tests: No test for cache TTL expiration — a cached entry past its 5-minute TTL should trigger a re-fetch. The existing cache tests only verify basic hit/miss and clearCache(), not time-based expiry. [fixable]

packages/harness/src/session-registry.ts

The agent-loader module is well-structured with good test coverage, but has a critical bug: it imports from the yaml package which is not installed (the project uses js-yaml). The local YAML fallback path will fail at runtime, hidden by test mocks. Secondary concerns: constlet without mutation, hardcoded fallback identity name, and cache key missing the server URL.

  • 🔵 style (L34): agentDefinition is typed as Record<string, unknown> in ManagedSession but the loader produces a well-typed AgentDefinition. Consider importing and using AgentDefinition from agent-loader to preserve type safety end-to-end, instead of the double cast through unknown at chat.ts:903. [fixable]

Comment thread server/agent-loader.ts

import { readFileSync } from 'fs';
import { join } from 'path';
import { parse as parseYaml } from 'yaml';
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 bugs: Imports parse from 'yaml' but the project depends on 'js-yaml' (see root package.json and server/session-index.ts which imports from 'js-yaml'). The yaml package is not installed — loadFromLocal() will fail at runtime with a module-not-found error. The test mocks the yaml module so this is invisible in tests. Should be import { load as parseYaml } from 'js-yaml' (note: js-yaml exports load, not parse). [fixable]

Comment thread server/agent-loader.ts
contexginUrl: string = process.env.CONTEXGIN_URL || 'http://localhost:8321',
): Promise<LoadedAgentDefinition> {
// Check cache
const cacheKey = `${agentName}:${cwd}`;
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 unsafe_assumptions: Cache key is ${agentName}:${cwd} but does not include contexginUrl. If loadAgentDef is ever called with different contexginUrl values for the same agent+cwd pair, the cache will return the result from the first server. Currently safe because contexginUrl is always the default, but fragile if the API surface expands. [fixable]

Comment thread server/agent-loader.ts
// 3. Bundled fallback
log.info('using bundled agent definition fallback', { agent: agentName });
const fallback: LoadedAgentDefinition = {
definition: DEFAULT_AGENT_DEFINITION,
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 bugs: The bundled fallback always returns DEFAULT_AGENT_DEFINITION which hardcodes identity.name: 'mitzo-conversational', regardless of the agentName parameter. If a different agent name is requested and all sources fail, the returned definition has a mismatched name. Consider spreading the constant and overriding identity.name with the requested agentName. [fixable]

Comment thread server/agent-loader.ts

if (!res.ok) return null;

const data = (await res.json()) as Record<string, unknown>;
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 unsafe_assumptions: ContexGin response fields (identity, provider, governance, memory) are cast directly to typed interfaces via as without runtime validation. A malformed ContexGin response (e.g., identity.name is a number) would pass the null-check but produce a silently incorrect AgentDefinition. The identity?.name and identity?.description truthy checks catch null/undefined but not wrong types. [fixable]

Comment thread server/chat.ts Outdated

// Build the system prompt append string (used by both query and comparison)
const systemPromptAppend =
let systemPromptAppend =
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 style: systemPromptAppend was changed from const to let but is never reassigned anywhere in the function. Should remain const. [fixable]

Comment thread server/chat.ts Outdated
});

// Fire-and-forget: fetch boot context from running ContexGin server
fetchBootContext(agentName)
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 regressions: The refactored fetchBootContext no longer passes repoRoot when called from _startChatInner — it uses the default BASE_REPO. The old code used cwd (which may be a worktree path). If the local Python fallback script (build_boot_context.py) needs to run from the worktree rather than BASE_REPO, this changes behavior. Verify the script works correctly when run from BASE_REPO even when the session is in a worktree. [fixable]

/** All worktrees created for this session, keyed by repo name. */
worktreePaths: Map<string, { path: string; wtId: string }>;
/** Parsed agent definition (recipe). Null if loading failed. */
agentDefinition: Record<string, unknown> | null;
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 style: agentDefinition is typed as Record<string, unknown> in ManagedSession but the loader produces a well-typed AgentDefinition. Consider importing and using AgentDefinition from agent-loader to preserve type safety end-to-end, instead of the double cast through unknown at chat.ts:903. [fixable]

Load full agent definitions from ContexGin API, local .agents/ overrides,
or bundled fallback at session start. Store parsed recipe in session
registry for downstream consumption (governance, provider tiering, memory
scope in future phases).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dimakis dimakis force-pushed the session/2026-05-30-e90bfb59aa9c branch from 8066d4d to cbca1e5 Compare May 30, 2026 11:27
Copy link
Copy Markdown
Owner Author

@dimakis dimakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Centaur Review

Found 6 issue(s) (2 warning).

server/agent-loader.ts

Solid Phase 1 with good test coverage and graceful degradation; main concerns are type erasure through Record<string, unknown> on the session interface (will compound in Phase 2) and an inherited path-traversal pattern on the user-controlled agentName.

  • 🟡 unsafe_assumptions (L164): Path traversal via user-controlled agentName: join(cwd, '.agents', ${agentName}.yaml) does not validate that the resolved path stays under cwd/.agents/. An authenticated WebSocket client sending agentName: '../../../etc/something' could read arbitrary .yaml files on disk. The same pattern already exists at chat.ts:825, so this is pre-existing, but worth fixing as part of this new module — e.g. resolve() the path and check it starts with the expected prefix. [fixable]
  • 🔵 bugs (L239): The bundled fallback always returns DEFAULT_AGENT_DEFINITION with identity.name: 'mitzo-conversational' regardless of which agentName was requested. If a session starts with a non-default agent and all sources fail, the stored definition's name won't match the session's agentName. Consider overriding identity.name with the requested agentName in the fallback path. [fixable]
  • 🔵 style (L165): readFileSync blocks the event loop inside what is otherwise a fully async resolution chain. Since loadAgentDef is already async, consider using readFile from fs/promises and making loadFromLocal async to match. [fixable]

server/chat.ts

Solid Phase 1 with good test coverage and graceful degradation; main concerns are type erasure through Record<string, unknown> on the session interface (will compound in Phase 2) and an inherited path-traversal pattern on the user-controlled agentName.

  • 🟡 style (L777): loaded.definition as unknown as Record<string, unknown> double-casts away the typed AgentDefinition into an untyped record. This happens because ManagedSession.agentDefinition is declared as Record<string, unknown> | null instead of importing and using AgentDefinition. Phase 2 consumers will have to cast it back. Consider typing the session field as AgentDefinition | null (import from agent-loader.js) to preserve type safety end-to-end. [fixable]

packages/harness/src/session-registry.ts

Solid Phase 1 with good test coverage and graceful degradation; main concerns are type erasure through Record<string, unknown> on the session interface (will compound in Phase 2) and an inherited path-traversal pattern on the user-controlled agentName.

  • 🔵 style (L56): agentDefinitionSource is typed as string? but the actual values are the union 'contexgin' | 'local' | 'fallback' (AgentDefinitionSource in agent-loader.ts). Using the concrete union type would prevent invalid values and give downstream consumers better autocomplete/exhaustiveness checking. [fixable]

server/__tests__/agent-loader.test.ts

Solid Phase 1 with good test coverage and graceful degradation; main concerns are type erasure through Record<string, unknown> on the session interface (will compound in Phase 2) and an inherited path-traversal pattern on the user-controlled agentName.

  • 🔵 missing_tests: Cache TTL expiry is untested — the tests verify cache hits and clearCache(), but don't verify that an expired entry is re-fetched (e.g. by advancing Date.now() with vi.useFakeTimers). Also, the ContexGin 200 with missing provider path (exercising the provider ?? { default: 'claude-opus-4' } fallback at line 146) has no test. [fixable]

Comment thread server/agent-loader.ts
*/
function loadFromLocal(agentName: string, cwd: string): LoadedAgentDefinition | null {
try {
const filePath = join(cwd, '.agents', `${agentName}.yaml`);
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 unsafe_assumptions: Path traversal via user-controlled agentName: join(cwd, '.agents', ${agentName}.yaml) does not validate that the resolved path stays under cwd/.agents/. An authenticated WebSocket client sending agentName: '../../../etc/something' could read arbitrary .yaml files on disk. The same pattern already exists at chat.ts:825, so this is pre-existing, but worth fixing as part of this new module — e.g. resolve() the path and check it starts with the expected prefix. [fixable]

Comment thread server/agent-loader.ts
// 3. Bundled fallback
log.info('using bundled agent definition fallback', { agent: agentName });
const fallback: LoadedAgentDefinition = {
definition: DEFAULT_AGENT_DEFINITION,
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 bugs: The bundled fallback always returns DEFAULT_AGENT_DEFINITION with identity.name: 'mitzo-conversational' regardless of which agentName was requested. If a session starts with a non-default agent and all sources fail, the stored definition's name won't match the session's agentName. Consider overriding identity.name with the requested agentName in the fallback path. [fixable]

Comment thread server/agent-loader.ts
function loadFromLocal(agentName: string, cwd: string): LoadedAgentDefinition | null {
try {
const filePath = join(cwd, '.agents', `${agentName}.yaml`);
const raw = readFileSync(filePath, 'utf-8');
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 style: readFileSync blocks the event loop inside what is otherwise a fully async resolution chain. Since loadAgentDef is already async, consider using readFile from fs/promises and making loadFromLocal async to match. [fixable]

Comment thread server/chat.ts
.then((loaded) => {
const s = registry.get(clientId);
if (s) {
s.agentDefinition = loaded.definition as unknown as Record<string, unknown>;
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 style: loaded.definition as unknown as Record<string, unknown> double-casts away the typed AgentDefinition into an untyped record. This happens because ManagedSession.agentDefinition is declared as Record<string, unknown> | null instead of importing and using AgentDefinition. Phase 2 consumers will have to cast it back. Consider typing the session field as AgentDefinition | null (import from agent-loader.js) to preserve type safety end-to-end. [fixable]

/** Parsed agent definition (recipe). Null if loading failed. */
agentDefinition: Record<string, unknown> | null;
/** Source of the agent definition: 'contexgin' | 'local' | 'fallback'. */
agentDefinitionSource?: string;
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 style: agentDefinitionSource is typed as string? but the actual values are the union 'contexgin' | 'local' | 'fallback' (AgentDefinitionSource in agent-loader.ts). Using the concrete union type would prevent invalid values and give downstream consumers better autocomplete/exhaustiveness checking. [fixable]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant