feat: add google discovery api backends #126

Open
ian-pascoe wants to merge 5 commits into main from feat/google-discovery-apis

Conversation

@ian-pascoe

@ian-pascoe ian-pascoe commented Jun 16, 2026

Contributor

Summary

Adds first-class Google Discovery API Caplets alongside the existing MCP, OpenAPI, GraphQL, HTTP, CLI, and Caplet-set backends.

What changed

  • Added googleDiscoveryApis config and googleDiscoveryApi Markdown Caplet frontmatter support, including generated JSON schemas and docs references.
  • Implemented native Google Discovery parsing, operation filtering, request construction, OAuth scope inference, and Google media upload/download handling.
  • Added shared media artifact handling for HTTP-like responses and wired HTTP/OpenAPI/Google Discovery through it.
  • Wired Google Discovery through engine dispatch, progressive/direct/native exposure, nested Caplet sets, CLI add/list/auth flows, remote add, doctor/setup/completion discovery, and runtime planning.
  • Added docs, architecture notes, troubleshooting guidance, and a changeset for the public behavior.

Validation

  • pnpm verify

Summary by CodeRabbit

Release Notes

  • New Features
    • Introduced Google Discovery API as a first-class backend, generating tools from Discovery documents with include/exclude operation filtering, OAuth scope inference, and media upload/download support.
    • Added caplets add google-discovery to create Discovery-backed caplets from local or remote inputs.
    • Added googleDiscoveryApis configuration for managing multiple Discovery-backed caplets.
  • Documentation
    • Updated capabilities/reference docs, added troubleshooting for auth scope mismatches and artifact-based downloads, and introduced an ADR for media artifacts.
  • Bug Fixes
    • Strengthened OAuth scope matching for Discovery-backed login/refresh flows.
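The include/exclude operation filtering mentioned in the release notes can be pictured with a small sketch. The PR text does not show the real filter shape or matching rules, so the `OperationFilter` interface, the `*` wildcard syntax, and the function names below are assumptions made for illustration only.

```typescript
// Hypothetical filter shape; the actual config keys are not shown in this PR page.
interface OperationFilter {
  include?: string[]; // patterns such as "drive.files.*"
  exclude?: string[];
}

// Convert a simple pattern with "*" wildcards into an anchored RegExp.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Keep an operation when it matches some include pattern (or no include list
// was given) and matches no exclude pattern.
function filterOperations(ids: string[], filter: OperationFilter): string[] {
  const include = (filter.include ?? []).map(patternToRegExp);
  const exclude = (filter.exclude ?? []).map(patternToRegExp);
  return ids.filter((id) => {
    const included = include.length === 0 || include.some((re) => re.test(id));
    const excluded = exclude.some((re) => re.test(id));
    return included && !excluded;
  });
}
```

In this sketch exclusion wins over inclusion, which is one reasonable choice for keeping destructive operations out of a tool list; the PR may resolve conflicts differently.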

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 14abb5c9-8c87-46e6-a777-5ca687a2d7f0

📥 Commits

Reviewing files that changed from the base of the PR and between 8f39ac6 and 4d6c21b.

📒 Files selected for processing (5)
  • packages/core/src/google-discovery/manager.ts
  • packages/core/src/google-discovery/operations.ts
  • packages/core/src/media/input.ts
  • packages/core/test/google-discovery.test.ts
  • packages/core/test/media-artifacts.test.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • packages/core/src/media/input.ts
  • packages/core/src/google-discovery/operations.ts
  • packages/core/test/google-discovery.test.ts
  • packages/core/test/media-artifacts.test.ts

📝 Walkthrough


This PR adds Google Discovery backend support, shared media-artifact handling for non-inline HTTP-like responses, and the required config, auth, CLI, runtime, and test wiring across Caplets.

Changes

Google Discovery backend

| Layer | File(s) | Summary |
| --- | --- | --- |
| Docs, specs, and schema contracts | .changeset/*, CONTEXT.md, apps/docs/src/content/docs/..., docs/adr/*, docs/architecture.md, docs/plans/*, docs/specs/*, schemas/*, apps/landing/public/* | Adds release notes, glossary and troubleshooting entries, backend and media artifact specifications, implementation plan documents, and schema definitions for googleDiscoveryApi and googleDiscoveryApis. |
| Config models and caplet loading | packages/core/src/config*.ts, packages/core/src/config/paths.ts, packages/core/src/caplet-files-bundle.ts, packages/core/src/caplet-source/parse.ts, packages/core/src/registry.ts, packages/core/test/config.test.ts, packages/core/test/caplet-source.test.ts, packages/core/test/fixtures/google-discovery/* | Introduces Google Discovery config types and validation, adds caplet frontmatter loading and path normalization, includes the backend in registry and source parsing, and validates those paths and schemas in tests. |
| Shared media artifact response handling | packages/core/src/media/*, packages/core/src/http/response.ts, packages/core/src/http-actions.ts, packages/core/src/openapi.ts, packages/core/test/media-artifacts.test.ts, packages/core/test/http-actions.test.ts, packages/core/test/openapi.test.ts | Adds artifact directory helpers, artifact read/write and media input utilities, a shared HTTP-like response reader that writes large or binary bodies as artifacts, and updates HTTP/OpenAPI managers and tests to use it. |
| Google Discovery parsing and execution | packages/core/src/google-discovery/*, packages/core/test/google-discovery.test.ts | Adds typed Discovery document parsing, schema conversion, operation filtering and scope aggregation, safe request construction, and a manager that lists, describes, and calls tools including media upload and artifact-based download flows. |
| Engine, tools, CLI, and auth wiring | packages/core/src/auth.ts, packages/core/src/cli*.ts, packages/core/src/engine.ts, packages/core/src/runtime*.ts, packages/core/src/tools.ts, packages/core/src/caplet-sets.ts, packages/core/src/cloud/runtime-adapter.ts, packages/core/src/native/service.ts, packages/core/src/runtime-plan/planner.ts, packages/core/src/serve/http.ts, packages/core/src/remote-control/dispatch.ts, packages/core/test/auth.test.ts, packages/core/test/cli*.ts, packages/core/test/engine.test.ts, packages/core/test/caplet-sets.test.ts, packages/core/test/native.test.ts, packages/core/test/tools.test.ts, packages/core/test/exposure-discovery.test.ts, packages/core/test/remote-control-dispatch.test.ts | Wires Google Discovery through engine and child runtimes, tool dispatch, worker-safe planning, CLI add/list/setup/completion flows, remote add dispatch, OAuth target resolution, and the related tests for listing, completion, setup, auth, nested runtimes, and remote control. |

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant Engine
  participant GoogleDiscoveryManager
  participant GoogleAPI

  CLI->>Engine: call_tool for Google Discovery caplet
  Engine->>GoogleDiscoveryManager: callTool(api, toolName, args)
  GoogleDiscoveryManager->>GoogleAPI: fetch discovery or API request
  GoogleAPI-->>GoogleDiscoveryManager: discovery doc / HTTP response
  GoogleDiscoveryManager-->>Engine: structured content
  Engine-->>CLI: tool result

sequenceDiagram
  participant HttpActionManager
  participant readHttpLikeResponse
  participant writeMediaArtifact
  participant ArtifactStore

  HttpActionManager->>readHttpLikeResponse: parse downstream Response
  readHttpLikeResponse->>writeMediaArtifact: persist non-inline body bytes
  writeMediaArtifact->>ArtifactStore: write artifact file and metadata
  ArtifactStore-->>readHttpLikeResponse: MediaArtifact
  readHttpLikeResponse-->>HttpActionManager: response envelope with body.artifact

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~110 minutes

Possibly related PRs

  • spiritledsoftware/caplets#21: Extends the shared generic OAuth auth plumbing that this PR uses for Google Discovery scope resolution and token bundle checks.
  • spiritledsoftware/caplets#27: Touches the same handleServerTool/backendFor dispatch path that now routes Google Discovery managers.
  • spiritledsoftware/caplets#52: Updates caplet-sets child-runtime wiring in the same area that now instantiates GoogleDiscoveryManager.

Poem

🐇 I found a map in Discovery's glade,
with tools and scopes in careful braid.
Big bytes hopped off to artifact burrows,
while tiny JSON stayed sans sorrows.
I thumped the CLI, “new paths, hooray!”
and nibble-tested every way.


@github-actions

github-actions Bot commented Jun 16, 2026

Contributor

Preview Deployed

Landing: https://pr-126.preview.caplets.dev
Docs: https://docs.pr-126.preview.caplets.dev

Built from commit 1a164d1

@ian-pascoe ian-pascoe changed the title from "[codex] Add Google Discovery API backends" to "feat: add google discovery api backends" on Jun 16, 2026
@ian-pascoe ian-pascoe marked this pull request as ready for review June 16, 2026 19:05

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🧹 Nitpick comments (3)
packages/core/src/media/input.ts (1)

50-61: 💤 Low value

Artifact file is read twice: once for hashing, once for bytes.

resolveMediaArtifact reads the entire file to compute the SHA-256 hash, then discards the bytes. Immediately after, readMediaFile reads the same file again to return the content. For large artifacts this doubles I/O.

Consider having resolveMediaArtifact optionally return the bytes it already read, or deferring the hash computation to callers that need it.
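One way to picture the single-read approach suggested here is sketched below. The real `resolveMediaArtifact` signature and return type are not shown in this PR page, so `ResolvedArtifact`, `describeBytes`, and `resolveArtifactOnce` are hypothetical names used only to illustrate hashing and returning the same buffer.

```typescript
import { createHash } from "node:crypto";
import { readFile } from "node:fs/promises";

// Hypothetical result shape; the PR's actual artifact metadata may differ.
interface ResolvedArtifact {
  path: string;
  sha256: string;
  byteLength: number;
  bytes?: Uint8Array; // present only when the caller asked for the content
}

// Pure helper: hash bytes that are already in memory and optionally hand them back.
function describeBytes(path: string, bytes: Uint8Array, includeBytes: boolean): ResolvedArtifact {
  const sha256 = createHash("sha256").update(bytes).digest("hex");
  const result: ResolvedArtifact = { path, sha256, byteLength: bytes.byteLength };
  if (includeBytes) result.bytes = bytes;
  return result;
}

// Read the file exactly once, then reuse the buffer for both hash and content.
async function resolveArtifactOnce(path: string, includeBytes = false): Promise<ResolvedArtifact> {
  const bytes = await readFile(path);
  return describeBytes(path, bytes, includeBytes);
}
```

Callers that only need metadata would leave `includeBytes` false, so the buffer can be garbage-collected as soon as the hash is computed.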

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/media/input.ts` around lines 50 - 61, The code reads the
artifact file twice: once inside resolveMediaArtifact for hashing, and again via
readMediaFile for the bytes. Modify resolveMediaArtifact to optionally return
the file bytes it already read during hash computation along with the artifact
metadata, then update the code in this section to use those returned bytes
instead of calling readMediaFile separately. This eliminates the duplicate file
I/O and improves performance for large artifacts.
docs/plans/2026-06-16-google-discovery-api-backend-implementation.md (1)

1-2152: ⚡ Quick win

Please confirm this implementation plan is intended to be committed long-term.

This looks like a short-lived execution checklist; if it’s not explicitly required as a durable artifact, it should be removed after implementation (or repurposed into durable docs) to avoid stale guidance.

As per coding guidelines, “Avoid committing short-lived implementation plans or design specs unless explicitly requested; if needed, put them in docs/plans/ or docs/specs/ and delete or repurpose once superseded.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/plans/2026-06-16-google-discovery-api-backend-implementation.md` around
lines 1 - 2152, The comment asks whether this implementation plan document
should be committed as permanent reference material or removed after
implementation completion. This is a documentation lifecycle decision, not a
code defect. Decide whether to keep the plan at
`docs/plans/2026-06-16-google-discovery-api-backend-implementation.md` as a
durable architecture/implementation reference for future maintainers, or delete
it after implementation is complete and migrate essential long-term guidance
into `docs/architecture.md` and the public reference documentation in
`apps/docs/`. If retaining the plan, add a note at the top clarifying its
purpose (for example, "This document serves as both execution guidance and
permanent architectural record for the Google Discovery API backend design
decisions and implementation approach"). If treating it as temporary, file a
cleanup task after implementation to remove the file and ensure all enduring
patterns are documented in the appropriate permanent reference materials.

Source: Coding guidelines

packages/core/src/caplet-source/parse.ts (1)

138-138: 💤 Low value

Unnecessary fallback ?? {} is inconsistent with other backends.

The CapletsConfig type declares googleDiscoveryApis as a required property (not optional), so the ?? {} fallback is unnecessary and inconsistent with how the other backend records on lines 136-137 and 139-143 are accessed.

♻️ Suggested fix for consistency
-    ...Object.values(config.googleDiscoveryApis ?? {}),
+    ...Object.values(config.googleDiscoveryApis),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/caplet-source/parse.ts` at line 138, The line accessing
config.googleDiscoveryApis contains an unnecessary `?? {}` fallback that is
inconsistent with other backend record accesses. Since the CapletsConfig type
declares googleDiscoveryApis as a required property (not optional), remove the
`?? {}` fallback and change the expression to directly use
Object.values(config.googleDiscoveryApis) to match the pattern used for the
other backend records accessed nearby.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/core/src/http/response.ts`:
- Around line 89-95: The rejectOversizedContentLength function throws an error
when content length exceeds the limit but does not cancel the response body
stream first, leaving it unconsumed and degrading HTTP connection reuse. Before
throwing the responseExceededLimit error in rejectOversizedContentLength, call
response.body?.cancel() to properly clean up the stream. This aligns with the
established pattern already used in readInlineCandidate and readBoundedBytes
functions when they reject oversized content during streaming.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 915083f3-b449-454f-b566-160a178b68a2

📥 Commits

Reviewing files that changed from the base of the PR and between 4301c4f and 10ee981.

📒 Files selected for processing (66)
  • .changeset/google-discovery-media.md
  • CONTEXT.md
  • apps/docs/src/content/docs/capabilities.mdx
  • apps/docs/src/content/docs/reference/caplet-files.mdx
  • apps/docs/src/content/docs/reference/config.mdx
  • apps/docs/src/content/docs/troubleshooting.mdx
  • apps/landing/public/caplet-frontmatter.schema.json
  • apps/landing/public/config.schema.json
  • docs/adr/0002-media-artifacts-for-non-inline-results.md
  • docs/architecture.md
  • docs/plans/2026-06-16-google-discovery-api-backend-implementation.md
  • docs/specs/2026-06-16-google-discovery-api-backend.md
  • packages/core/src/auth.ts
  • packages/core/src/caplet-files-bundle.ts
  • packages/core/src/caplet-sets.ts
  • packages/core/src/caplet-source/parse.ts
  • packages/core/src/cli.ts
  • packages/core/src/cli/add.ts
  • packages/core/src/cli/auth.ts
  • packages/core/src/cli/commands.ts
  • packages/core/src/cli/completion-discovery.ts
  • packages/core/src/cli/doctor.ts
  • packages/core/src/cli/inspection.ts
  • packages/core/src/cli/setup-caplet.ts
  • packages/core/src/cli/setup.ts
  • packages/core/src/cloud/runtime-adapter.ts
  • packages/core/src/config-runtime.ts
  • packages/core/src/config.ts
  • packages/core/src/config/paths.ts
  • packages/core/src/engine.ts
  • packages/core/src/google-discovery/index.ts
  • packages/core/src/google-discovery/manager.ts
  • packages/core/src/google-discovery/operations.ts
  • packages/core/src/google-discovery/request.ts
  • packages/core/src/google-discovery/schema.ts
  • packages/core/src/google-discovery/types.ts
  • packages/core/src/http-actions.ts
  • packages/core/src/http/response.ts
  • packages/core/src/media/artifacts.ts
  • packages/core/src/media/index.ts
  • packages/core/src/media/input.ts
  • packages/core/src/native/service.ts
  • packages/core/src/openapi.ts
  • packages/core/src/registry.ts
  • packages/core/src/remote-control/dispatch.ts
  • packages/core/src/runtime-plan/planner.ts
  • packages/core/src/runtime.ts
  • packages/core/src/tools.ts
  • packages/core/test/auth.test.ts
  • packages/core/test/caplet-sets.test.ts
  • packages/core/test/caplet-source.test.ts
  • packages/core/test/cli-completion.test.ts
  • packages/core/test/cli-remote.test.ts
  • packages/core/test/cli.test.ts
  • packages/core/test/config.test.ts
  • packages/core/test/engine.test.ts
  • packages/core/test/exposure-discovery.test.ts
  • packages/core/test/fixtures/google-discovery/drive.discovery.json
  • packages/core/test/google-discovery.test.ts
  • packages/core/test/http-actions.test.ts
  • packages/core/test/media-artifacts.test.ts
  • packages/core/test/native.test.ts
  • packages/core/test/openapi.test.ts
  • packages/core/test/tools.test.ts
  • schemas/caplet.schema.json
  • schemas/caplets-config.schema.json

Comment thread packages/core/src/http/response.ts Outdated
Comment on lines +89 to +95
function rejectOversizedContentLength(response: Response, maxBytes: number): void {
  const contentLength = response.headers.get("content-length");
  if (!contentLength) return;
  const byteLength = Number.parseInt(contentLength, 10);
  if (Number.isFinite(byteLength) && byteLength > maxBytes) {
    throw responseExceededLimit(maxBytes);
  }

Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify the preflight branch and whether it cancels the response body before throwing.
nl -ba packages/core/src/http/response.ts | sed -n '84,105p'
rg -n "rejectOversizedContentLength|body\\?\\.cancel\\(" packages/core/src/http/response.ts

Repository: spiritledsoftware/caplets

Length of output: 112


🏁 Script executed:

#!/bin/bash
# View the relevant section with a different approach
cat -n packages/core/src/http/response.ts | sed -n '84,105p'

Repository: spiritledsoftware/caplets

Length of output: 839


🏁 Script executed:

# Search for the function and body cancellation patterns
rg -n "rejectOversizedContentLength" packages/core/src/http/response.ts
rg -n "body.*cancel|cancel.*body" packages/core/src/http/response.ts

Repository: spiritledsoftware/caplets

Length of output: 211


🏁 Script executed:

# Check the full file size to understand context better
wc -l packages/core/src/http/response.ts
head -n 100 packages/core/src/http/response.ts | tail -n 30

Repository: spiritledsoftware/caplets

Length of output: 1000


🏁 Script executed:

# Check where rejectOversizedContentLength is called (line 23)
sed -n '15,35p' packages/core/src/http/response.ts

Repository: spiritledsoftware/caplets

Length of output: 1064


🏁 Script executed:

# Get the full context of the readHttpLikeResponse function
sed -n '15,50p' packages/core/src/http/response.ts

Repository: spiritledsoftware/caplets

Length of output: 1585


🏁 Script executed:

# Check if response.body is ever accessed after rejectOversizedContentLength throws
rg -n "readHttpLikeResponse|rejectOversizedContentLength" -A 5 packages/core/src/http/response.ts | head -40

Repository: spiritledsoftware/caplets

Length of output: 1028


🏁 Script executed:

# Check the readBoundedBytes function to confirm it has reader.cancel() logic
sed -n '63,87p' packages/core/src/http/response.ts

Repository: spiritledsoftware/caplets

Length of output: 693


🏁 Script executed:

# Verify if there are any tests or other handlers for unconsumed response bodies
rg -n "cancel|unconsumed|body" packages/core/src/http/response.ts | head -20

Repository: spiritledsoftware/caplets

Length of output: 509


🏁 Script executed:

# Check if this pattern affects other files matching the guidelines
fd '\.ts$' packages/core/src --type f | head -5

Repository: spiritledsoftware/caplets

Length of output: 230


Cancel the response body before throwing on oversized Content-Length.

At line 23, rejectOversizedContentLength(response, maxBytes) is called before any body reading, and throws without canceling the response body stream. This leaves the response unconsumed, which can degrade HTTP connection reuse under load. The pattern of canceling before throwing is already established in readInlineCandidate (line 57) and readBoundedBytes (line 80) when hitting size limits during streaming, so this preflight check should be consistent.

Suggested fix
   if (Number.isFinite(byteLength) && byteLength > maxBytes) {
+    response.body?.cancel().catch(() => {});
     throw responseExceededLimit(maxBytes);
   }

Alternatively, make it async:

-function rejectOversizedContentLength(response: Response, maxBytes: number): void {
+async function rejectOversizedContentLength(response: Response, maxBytes: number): Promise<void> {
   const contentLength = response.headers.get("content-length");
   if (!contentLength) return;
   const byteLength = Number.parseInt(contentLength, 10);
   if (Number.isFinite(byteLength) && byteLength > maxBytes) {
+    await response.body?.cancel().catch(() => {});
     throw responseExceededLimit(maxBytes);
   }
 }

And update the call at line 23 to await rejectOversizedContentLength(response, maxBytes);
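For reference, a self-contained sketch of the synchronous variant is shown below, with the cancel placed on the rejection path only. Cancelling before the size check would also discard bodies that are within the limit. `responseExceededLimit` here is a hypothetical stand-in for the PR's helper, whose real implementation is not shown.

```typescript
// Hypothetical stand-in for the PR's responseExceededLimit helper.
function responseExceededLimit(maxBytes: number): Error {
  return new Error(`Response exceeded the ${maxBytes}-byte limit`);
}

function rejectOversizedContentLength(response: Response, maxBytes: number): void {
  const contentLength = response.headers.get("content-length");
  if (!contentLength) return;
  const byteLength = Number.parseInt(contentLength, 10);
  if (Number.isFinite(byteLength) && byteLength > maxBytes) {
    // Release the unread stream only when rejecting. The cancel is
    // fire-and-forget, and any cancellation error is deliberately ignored.
    void response.body?.cancel().catch(() => {});
    throw responseExceededLimit(maxBytes);
  }
}
```

A response that passes the check keeps its body untouched, so the caller can still stream or buffer it afterwards.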

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 function rejectOversizedContentLength(response: Response, maxBytes: number): void {
   const contentLength = response.headers.get("content-length");
   if (!contentLength) return;
   const byteLength = Number.parseInt(contentLength, 10);
   if (Number.isFinite(byteLength) && byteLength > maxBytes) {
+    response.body?.cancel().catch(() => {});
     throw responseExceededLimit(maxBytes);
   }
 }

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

const config = loadConfigWithSources(context.configPath, context.projectConfigPath).config;
const target = findAuthTarget(serverId, config);

P2: Resolve Discovery scopes for remote OAuth login

Remote-control auth login still starts from findAuthTarget, so Google Discovery targets without explicitly configured auth.scopes never load the Discovery document and never request the operation scopes that normal calls later require. In remote/cloud auth flows this can complete successfully but store credentials that immediately fail tokenBundleMissingScopes on the first Google API call, forcing users into a login loop; it should use the same resolved target path as local loginAuth.
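The failure mode described in this comment can be illustrated with a small scope-resolution sketch. The helper names below are hypothetical, and the rule that inferred scopes are the union of every exposed operation's scopes is an assumption; the PR's actual resolution logic is not shown here.

```typescript
// Hypothetical: prefer explicitly configured scopes, otherwise fall back to
// the union of the scopes listed on each Discovery operation.
function resolveScopes(configured: string[] | undefined, operationScopes: string[][]): string[] {
  const source = configured && configured.length > 0
    ? configured
    : operationScopes.reduce<string[]>((all, scopes) => all.concat(scopes), []);
  return Array.from(new Set(source)).sort();
}

// Hypothetical: scopes a stored token bundle lacks for a given call.
function missingScopes(granted: string[], required: string[]): string[] {
  const have = new Set(granted);
  return required.filter((scope) => !have.has(scope));
}
```

If a login flow requests only the configured scopes (none, in the case described) while later calls require the resolved set, `missingScopes` is non-empty on the first call, which matches the login loop the comment warns about.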


Comment thread packages/core/src/cli/auth.ts Outdated
Comment thread packages/core/src/google-discovery/manager.ts Outdated
Comment thread packages/core/src/google-discovery/request.ts Outdated
Comment thread packages/core/src/google-discovery/manager.ts Outdated
Comment thread packages/core/src/google-discovery/manager.ts Outdated
Comment thread packages/core/src/caplet-sets.ts Outdated
Comment thread packages/core/src/media/artifacts.ts Outdated
@greptile-apps

greptile-apps Bot commented Jun 16, 2026

Contributor

Greptile Summary

This PR adds Google Discovery API as a first-class Caplet backend, including Discovery document parsing, operation/schema conversion, OAuth scope inference, media upload/download, and shared media-artifact handling that is also retrofitted onto the HTTP and OpenAPI backends.

  • New google-discovery/ module: manager.ts orchestrates document fetching, caching, and tool dispatch; operations.ts converts Discovery methods into MCP tools with JSON Schema input/output; request.ts builds and validates URLs with path-traversal guards; schema.ts converts Discovery types to JSON Schema.
  • Shared http/response.ts and media/: a unified response reader inlines small JSON/text and writes large or binary responses as filesystem-backed media artifacts; the same path is wired into HTTP actions, OpenAPI, and Google Discovery.
  • Config, engine, CLI, and auth: googleDiscoveryApis config key added throughout the dispatch chain; OAuth scope handling extended to inject per-operation Discovery scopes at auth time.
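The inline-versus-artifact split described for the shared response reader might be decided along these lines. This is a sketch under assumptions: `chooseBodyHandling`, the `BodyHandling` labels, and the exact media-type rules are hypothetical and are not taken from the PR's `readHttpLikeResponse`.

```typescript
type BodyHandling = "inline-json" | "inline-text" | "artifact";

// Decide how to surface a response body given its Content-Type and size.
function chooseBodyHandling(
  contentType: string | null,
  byteLength: number,
  maxInlineBytes: number,
): BodyHandling {
  // Strip parameters such as "; charset=utf-8" before comparing.
  const mediaType = ((contentType ?? "").split(";")[0] ?? "").trim().toLowerCase();
  if (byteLength > maxInlineBytes) return "artifact"; // too large to inline
  if (mediaType === "application/json" || mediaType.endsWith("+json")) return "inline-json";
  if (mediaType.startsWith("text/")) return "inline-text";
  return "artifact"; // binary or unknown types go to the artifact store
}
```

Treating unknown media types as artifacts is the conservative default in this sketch, since it avoids placing arbitrary bytes into a tool result.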

Confidence Score: 3/5

The new Google Discovery backend is functionally correct for common cases but has two confirmed bugs in the new code: schema generation produces over-restrictive types for any-typed Discovery fields, and multiple error paths in the request layer leave response bodies unconsumed, leaking TCP connections.

Mapping the Discovery type any to object in schema.ts will silently emit wrong input and output schemas for every Google API that uses unconstrained value types, such as Firestore and BigQuery, causing LLM agents to construct invalid tool calls at runtime. Independently, the 3xx, 401, and 403 error paths in both callTool and the fetchGoogleRequest helper throw without cancelling the response body, which will exhaust the undici connection pool under repeated auth failures or upload redirects.

packages/core/src/google-discovery/schema.ts for the type mapping and packages/core/src/google-discovery/manager.ts for body leaks in callTool and fetchGoogleRequest.
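A minimal sketch of the type mapping at issue follows, assuming a simplified Discovery schema node. `DiscoveryNode` and `toJsonSchema` are hypothetical names and cover only a subset of what the PR's schema.ts handles; the point is that `any` maps to the empty schema, which accepts every JSON value, instead of `{ type: "object" }`.

```typescript
type JsonSchema = Record<string, unknown>;

// Simplified Discovery schema node (hypothetical subset).
interface DiscoveryNode {
  type?: string;
  items?: DiscoveryNode;
  properties?: Record<string, DiscoveryNode>;
}

function toJsonSchema(node: DiscoveryNode): JsonSchema {
  switch (node.type) {
    case "any":
      return {}; // empty schema: strings, numbers, arrays, objects all validate
    case "array":
      return { type: "array", items: toJsonSchema(node.items ?? { type: "any" }) };
    case "object": {
      const properties: Record<string, JsonSchema> = {};
      for (const [name, child] of Object.entries(node.properties ?? {})) {
        properties[name] = toJsonSchema(child);
      }
      return { type: "object", properties };
    }
    case "string":
    case "integer":
    case "number":
    case "boolean":
      return { type: node.type };
    default:
      return {}; // unknown or missing type: stay permissive rather than wrong
  }
}
```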

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/core/src/google-discovery/manager.ts | Core Google Discovery manager. Two body-leak findings: response bodies are not consumed before throwing on 3xx/401/403 in both callTool and the shared fetchGoogleRequest helper, risking connection-pool exhaustion under repeated auth errors or upload redirects. |
| packages/core/src/google-discovery/schema.ts | Discovery-to-JSON-Schema converter. Type any is incorrectly mapped to object, producing over-restrictive schemas for any Google API that uses an unconstrained value type, such as Firestore or BigQuery. |
| packages/core/src/google-discovery/operations.ts | Operation extraction and input schema building. Path parameters not explicitly marked required: true in the Discovery document are missing from the inner schema's required list, causing a runtime error instead of a clean validation failure. |
| packages/core/src/google-discovery/request.ts | URL building and request initialization. URL validation, path-traversal prevention, reserved-expansion encoding, and query serialization all look correct. |
| packages/core/src/http/response.ts | New shared HTTP response reader. Correctly streams to a size limit, inlines JSON and text, and writes oversized or binary responses as media artifacts; body is always consumed via the reader. |
| packages/core/src/media/artifacts.ts | New media artifact writer. Thorough path-containment, symlink-rejection, and permission hardening with 0o600 files. URI encoding and decoding validation is well-guarded. |
| packages/core/src/config.ts | Config validation extended with googleDiscoveryApis. Duplicate-ID checks are now propagated through all backend chains, the mutual exclusivity of discoveryPath and discoveryUrl is enforced, and header and URL constraints match the other backends. |
| packages/core/src/auth.ts | OAuth scope handling extended to accept resolvedScopes from Discovery operations. Scope injection into authorization URLs, token refresh, and bundle-mismatch detection are all updated consistently. |
| packages/core/src/engine.ts | Engine wired up with GoogleDiscoveryManager. Dispatch chain, registry updates, and cache invalidation are all correctly extended; selectHttpLikeOptions helper avoids duplication. |
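The operations.ts finding about path parameters can be sketched as follows. Every path parameter has to be supplied to build the URL, so this sketch treats it as required even when the Discovery document omits `required: true`. `DiscoveryParameter` and `requiredParameterNames` are hypothetical names, not the PR's code.

```typescript
// Simplified Discovery parameter (hypothetical subset of the real shape).
interface DiscoveryParameter {
  location: "path" | "query";
  required?: boolean;
}

// Names that belong in the input schema's "required" list.
function requiredParameterNames(parameters: Record<string, DiscoveryParameter>): string[] {
  return Object.entries(parameters)
    .filter(([, parameter]) => parameter.location === "path" || parameter.required === true)
    .map(([name]) => name)
    .sort();
}
```

With the path parameters listed as required, a missing value fails schema validation up front instead of surfacing later as a URL-construction error.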

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant MCP as MCP Client
    participant Engine as CapletsEngine
    participant GDM as GoogleDiscoveryManager
    participant Cache as In-Memory Cache
    participant Auth as genericOAuthHeaders
    participant Google as Google API

    MCP->>Engine: callTool(serverId, toolName, args)
    Engine->>GDM: callTool(api, toolName, args)
    GDM->>GDM: getOperation(api, toolName)
    GDM->>Cache: get(api.server)
    alt cache miss or expired
        GDM->>Google: fetch(discoveryUrl)
        Google-->>GDM: Discovery document JSON
        GDM->>GDM: discoveryOperations(document)
        GDM->>Cache: set(api.server, operations)
    end
    Cache-->>GDM: GoogleDiscoveryOperation
    alt supportsMediaUpload and args.media present
        GDM->>Auth: authHeaders(api, scopes)
        Auth-->>GDM: Authorization header
        GDM->>Google: single or multipart or resumable upload
        Google-->>GDM: Response
    else normal call
        GDM->>Auth: authHeaders(api, scopes)
        Auth-->>GDM: Authorization header
        GDM->>Google: fetch(operationUrl, init)
        Google-->>GDM: Response
    end
    GDM->>GDM: readHttpLikeResponse(response)
    alt small JSON or text
        GDM-->>Engine: inline body in result
    else large or binary
        GDM->>GDM: writeMediaArtifact(bytes)
        GDM-->>Engine: artifact URI in result
    end
    Engine-->>MCP: CompatibilityCallToolResult
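
The final branch of the diagram (inline body versus artifact URI) can be sketched as a small decision function. This is only an illustration under stated assumptions: `INLINE_LIMIT_BYTES`, `isTextLike`, and `chooseDisposition` are hypothetical names, the 64 KiB limit is invented for the example, and the real reader streams the body up to a limit instead of receiving a known length up front.

```typescript
type Disposition = "inline" | "artifact";

// Hypothetical limit; the PR's real threshold is not shown in this review.
const INLINE_LIMIT_BYTES = 64 * 1024;

// Treat text/*, application/json, and +json media types as inlinable.
function isTextLike(contentType: string | null): boolean {
  if (!contentType) return false;
  const mediaType = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  return (
    mediaType.startsWith("text/") ||
    mediaType === "application/json" ||
    mediaType.endsWith("+json")
  );
}

// Small JSON or text is returned inline; large or binary bodies become artifacts.
function chooseDisposition(
  contentType: string | null,
  byteLength: number,
  limit: number = INLINE_LIMIT_BYTES,
): Disposition {
  return isTextLike(contentType) && byteLength <= limit ? "inline" : "artifact";
}
```

A caller would write the bytes to a media artifact and return its URI whenever the result is "artifact".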

Reviews (4): Last reviewed commit: "fix: finish google discovery review foll..." | Re-trigger Greptile

Comment thread packages/core/src/google-discovery/manager.ts
Comment on lines +591 to +597
function shouldSendDiscoveryAuth(api: GoogleDiscoveryApiConfig): boolean {
  return Boolean(
    api.discoveryUrl &&
      api.baseUrl &&
      new URL(api.discoveryUrl).origin === new URL(api.baseUrl).origin,
  );
}

P2 Silent no-auth for remote private discovery documents

shouldSendDiscoveryAuth requires api.baseUrl to be explicitly set before it will attach auth headers to the discovery document fetch. If a user configures discoveryUrl with oauth2/oidc auth but does not set baseUrl (relying on the auto-derived value from rootUrl + servicePath), no auth header is sent and the discovery fetch will receive a 401/403, failing with a confusing "Google Discovery document request failed" error rather than an auth-related message.

Consider either (a) documenting this requirement clearly, or (b) sending auth headers whenever the auth type is not none and the discoveryUrl is set, regardless of baseUrl.
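
Option (b) might look roughly like the sketch below. It uses a trimmed-down, hypothetical config shape (`DiscoveryApiConfigSketch`) and a hypothetical function name rather than the PR's `GoogleDiscoveryApiConfig`, and it only encodes the rule stated above: send auth when `discoveryUrl` is set and the auth type is not `none`.

```typescript
// Hypothetical, trimmed-down config shape; the PR's real type has more fields.
interface DiscoveryApiConfigSketch {
  discoveryUrl?: string;
  baseUrl?: string;
  auth?: { type: "none" | "headers" | "oauth2" | "oidc" };
}

// Option (b): attach auth to the discovery fetch whenever a remote discovery
// URL is configured and some auth is configured, regardless of baseUrl.
function shouldSendDiscoveryAuthSketch(api: DiscoveryApiConfigSketch): boolean {
  const authType = api.auth?.type ?? "none";
  return Boolean(api.discoveryUrl) && authType !== "none";
}
```

One trade-off to weigh: the existing same-origin comparison appears intended to avoid sending credentials to a discovery host that differs from the API host, and this variant gives that protection up, so option (a) or a check against the derived base URL may be preferable.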

Comment on lines 1281 to +1340
  }
}

for (const [api, rawValue] of Object.entries(config.googleDiscoveryApis)) {
  const raw = rawValue as ConfigSchemaGoogleDiscoveryApiValue;
  if (config.mcpServers[api]) {
    ctx.addIssue({
      code: "custom",
      path: ["googleDiscoveryApis", api],
      message: `Caplet ID ${api} is already used by mcpServers`,
    });
  }
  if (config.openapiEndpoints[api]) {
    ctx.addIssue({
      code: "custom",
      path: ["googleDiscoveryApis", api],
      message: `Caplet ID ${api} is already used by openapiEndpoints`,
    });
  }
  if (!SERVER_ID_PATTERN.test(api)) {
    ctx.addIssue({
      code: "custom",
      path: ["googleDiscoveryApis", api],
      message: "Google Discovery API ID must match ^[a-zA-Z0-9_-]{1,64}$",
    });
  }
  if (Boolean(raw.discoveryPath) === Boolean(raw.discoveryUrl)) {
    ctx.addIssue({
      code: "custom",
      path: ["googleDiscoveryApis", api],
      message:
        "Google Discovery API must define exactly one discovery source: discoveryPath or discoveryUrl",
    });
  }
  if (raw.discoveryUrl && !isAllowedRemoteUrl(raw.discoveryUrl)) {
    ctx.addIssue({
      code: "custom",
      path: ["googleDiscoveryApis", api, "discoveryUrl"],
      message:
        "Google Discovery API discoveryUrl must use https except loopback development urls",
    });
  }
  if (raw.baseUrl && !isAllowedHttpBaseUrl(raw.baseUrl)) {
    ctx.addIssue({
      code: "custom",
      path: ["googleDiscoveryApis", api, "baseUrl"],
      message:
        "Google Discovery API baseUrl must use https except loopback development urls and must not include credentials, query, or fragment",
    });
  }
  if (raw.auth?.type === "headers") {
    for (const headerName of Object.keys(raw.auth.headers)) {
      const normalized = headerName.toLowerCase();
      if (!HEADER_NAME_PATTERN.test(headerName) || FORBIDDEN_HEADERS.has(normalized)) {
        ctx.addIssue({
          code: "custom",
          path: ["googleDiscoveryApis", api, "auth", "headers", headerName],
          message: `header ${headerName} is not allowed`,
        });
      }

P2 Incomplete cross-backend duplicate Caplet ID checks

The new googleDiscoveryApis validation guards against ID collisions with mcpServers and openapiEndpoints, but not against graphqlEndpoints, httpApis, cliTools, or capletSets. A duplicate ID shared between googleDiscoveryApis and, for example, httpApis would silently produce two entries with the same server ID, breaking any lookup by ID. The existing backends appear to have the same gap, but since this is new code it's a good opportunity to be complete.
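
A complete check could walk every backend map once instead of adding pairwise comparisons. The sketch below is an assumption-laden illustration, not the PR's validator: `ConfigSketch`, `DuplicateIssue`, and `findDuplicateCapletIds` are hypothetical names, the backend keys are the ones named in this comment, and the real code would report through `ctx.addIssue` instead of returning an array.

```typescript
// Backend map names as listed in the review comment.
const BACKEND_KEYS = [
  "mcpServers",
  "openapiEndpoints",
  "graphqlEndpoints",
  "httpApis",
  "cliTools",
  "capletSets",
  "googleDiscoveryApis",
] as const;

type BackendKey = (typeof BACKEND_KEYS)[number];

// Hypothetical config shape: each backend maps Caplet IDs to some value.
type ConfigSketch = Partial<Record<BackendKey, Record<string, unknown>>>;

interface DuplicateIssue {
  path: [BackendKey, string];
  message: string;
}

// Record the first backend that claims each ID and flag every later claim.
function findDuplicateCapletIds(config: ConfigSketch): DuplicateIssue[] {
  const firstOwner = new Map<string, BackendKey>();
  const issues: DuplicateIssue[] = [];
  for (const key of BACKEND_KEYS) {
    for (const id of Object.keys(config[key] ?? {})) {
      const owner = firstOwner.get(id);
      if (owner === undefined) {
        firstOwner.set(id, key);
      } else {
        issues.push({
          path: [key, id],
          message: `Caplet ID ${id} is already used by ${owner}`,
        });
      }
    }
  }
  return issues;
}
```

Because the loop covers all backends symmetrically, adding a future backend would only require extending `BACKEND_KEYS`.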

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b137ec3386

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/core/src/media/input.ts
Comment thread packages/core/src/cli/auth.ts Outdated
Comment thread packages/core/src/google-discovery/manager.ts Outdated
Comment thread packages/core/src/http/response.ts
Comment thread packages/core/src/google-discovery/manager.ts
Comment thread packages/core/src/openapi.ts
Comment thread packages/core/src/auth.ts Outdated
Comment on lines +253 to +278
const started = await fetchGoogleRequest(api, operation, startUrl, {
  method: operation.method.toUpperCase(),
  headers,
  body: JSON.stringify(args.body ?? {}),
  redirect: "manual",
});
const location = started.headers.get("location");
if (!location) {
  throw new CapletsError(
    "DOWNSTREAM_PROTOCOL_ERROR",
    "Google resumable upload missing Location",
  );
}
const uploadHeaders = new Headers();
uploadHeaders.set("content-type", media.mimeType ?? "application/octet-stream");
uploadHeaders.set("content-length", String(media.bytes.byteLength));
uploadHeaders.set(
  "content-range",
  `bytes 0-${media.bytes.byteLength - 1}/${media.bytes.byteLength}`,
);
return fetchGoogleRequest(api, operation, new URL(location), {
  method: "PUT",
  headers: uploadHeaders,
  body: media.bytes,
  redirect: "manual",
});

P1 Response body leak in resumable upload session start

The started response returned by fetchGoogleRequest has its body never consumed or cancelled. In Node.js's undici-backed fetch, an unconsumed body holds the underlying TCP connection open in-flight until the Response object is garbage-collected, preventing it from being returned to the connection pool. Every resumable upload call leaks one connection; under any meaningful upload load this will exhaust the pool and stall subsequent requests.

After reading the location header — whether or not it is present — the body must be explicitly discarded: started.body?.cancel().catch(() => {}) (fire-and-forget) or await started.body?.cancel() (eager). This applies to both the missing-location error path and the happy path.
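
A minimal sketch of that pattern follows. To stay self-contained it uses small structural interfaces (`BodyLike`, `ResponseLike`) in place of the platform `Response` type, and the helper name `takeLocationAndDiscardBody` is hypothetical. It reads the header first, cancels the body on both paths with the fire-and-forget form, and only then decides whether to throw.

```typescript
// Minimal structural stand-ins for the parts of a fetch Response used here.
interface BodyLike {
  cancel(): Promise<void>;
}

interface ResponseLike {
  headers: { get(name: string): string | null };
  body: BodyLike | null;
}

// Read the Location header, then discard the unread body on every path so the
// underlying connection is not held open by an unconsumed response.
function takeLocationAndDiscardBody(response: ResponseLike): string {
  const location = response.headers.get("location");
  // Fire-and-forget: a failed cancel should not mask the real outcome.
  response.body?.cancel().catch(() => {});
  if (!location) {
    throw new Error("Google resumable upload missing Location");
  }
  return location;
}
```

In the PR's code the thrown error would presumably remain the existing `CapletsError`; the plain `Error` here only keeps the example free of project imports.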

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8f39ac67f9

Comment thread packages/core/src/google-discovery/operations.ts Outdated
Comment thread packages/core/src/google-discovery/manager.ts
Comment thread packages/core/src/media/input.ts
Comment thread packages/core/src/google-discovery/manager.ts Outdated
Comment thread packages/core/src/google-discovery/manager.ts Outdated
Comment on lines +121 to +135
const response = await fetch(url, { ...init, signal: controller.signal });
if (response.status >= 300 && response.status < 400) {
  throw new CapletsError(
    "DOWNSTREAM_PROTOCOL_ERROR",
    "Google Discovery request returned a redirect",
    {
      server: requestApi.server,
      status: response.status,
      location: response.headers.get("location") ? "[REDACTED]" : undefined,
    },
  );
}
if (response.status === 401 || response.status === 403) {
  throw googleAuthError(requestApi, response);
}

P1 The 3xx and 401/403 error paths throw without consuming the response body. In Node.js's undici-backed fetch, an unconsumed body holds its TCP connection open until GC collects the Response object. Under any meaningful request volume (repeated auth failures, occasional redirects) this slowly exhausts the connection pool. Add a fire-and-forget cancel before throwing, consistent with the fix needed in callResumableUpload.

Suggested change
const response = await fetch(url, { ...init, signal: controller.signal });
if (response.status >= 300 && response.status < 400) {
  response.body?.cancel().catch(() => {});
  throw new CapletsError(
    "DOWNSTREAM_PROTOCOL_ERROR",
    "Google Discovery request returned a redirect",
    {
      server: requestApi.server,
      status: response.status,
      location: response.headers.get("location") ? "[REDACTED]" : undefined,
    },
  );
}
if (response.status === 401 || response.status === 403) {
  response.body?.cancel().catch(() => {});
  throw googleAuthError(requestApi, response);
}
