Commit 6961004

fix(webapp): restore Postgres fallback for non-ClickHouse OTLP spans (#3803)
## Problem

On environments where runs carry a Postgres-backed `taskEventStore` value (`taskEvent` or `taskEventPartitioned`), OTLP ingest endpoints (`POST /otel/v1/traces` and `/otel/v1/logs`) were returning HTTP 500.

**Root cause:** The org-scoped ClickHouse factory introduced in a recent PR routes all OTLP spans through `getEventRepositoryForOrganizationSync` → `buildEventRepository`. That function only handles `"clickhouse"` and `"clickhouse_v2"` store values and throws `Unknown ClickHouse event repository store: <value>` for anything else. The throw occurred inside the grouping loop of `#exportEvents`, unwinding the entire method and returning 500 for the whole batch. The OpenTelemetry collector's `otlphttp` exporter treats HTTP 500 as non-retryable and drops the batch, which causes real span loss.

**Fix:** Guard the `getEventRepositoryForOrganizationSync` call in `#exportEvents` so it is only invoked for `clickhouse` / `clickhouse_v2` store values. All other values are routed directly to the Postgres `eventRepository`, matching the guard pattern already present in `resolveEventRepositoryForStore` and `getEventRepositoryForStore` in `eventRepository/index.server.ts`. The ClickHouse factory call is also wrapped in a try/catch that falls back to Postgres so any unexpected store value in a future OTLP batch degrades gracefully instead of failing the whole request.

## Changes

- `apps/webapp/app/v3/otlpExporter.server.ts`: add Postgres routing guard and try/catch fallback in `#exportEvents`

## Testing

The `eventRepository/index.server.ts` module already has the same guard pattern thoroughly covered. The fix brings `#exportEvents` into alignment with that existing, tested pattern.

Manual verification: confirm OTLP batches containing Postgres-store spans return 200 and route to the correct repository.
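To illustrate why a 500 leads to span loss rather than a retry, here is a minimal sketch of how an OTLP/HTTP client might classify response codes. It assumes the retryable set listed in the OTLP/HTTP specification (429, 502, 503, 504); the helper names `isRetryableOtlpStatus` and `classifyOtlpResponse` are hypothetical and are not taken from the collector or from this repository.

```typescript
// Status codes the OTLP/HTTP specification lists as retryable.
const RETRYABLE_OTLP_STATUS = new Set<number>([429, 502, 503, 504]);

// Hypothetical helper: true when a client may retry the same batch.
function isRetryableOtlpStatus(status: number): boolean {
  return RETRYABLE_OTLP_STATUS.has(status);
}

type BatchOutcome = "accepted" | "retry" | "dropped";

// Hypothetical helper: what a spec-following exporter would do with a response.
function classifyOtlpResponse(status: number): BatchOutcome {
  if (status >= 200 && status < 300) {
    return "accepted";
  }
  // Anything outside the retryable set is treated as a permanent failure,
  // so the batch is discarded instead of being re-sent.
  return isRetryableOtlpStatus(status) ? "retry" : "dropped";
}
```

Under this classification a 500 falls into the "dropped" branch, which is consistent with the batch loss described in the root cause.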
1 parent 4b78d7e commit 6961004

2 files changed

Lines changed: 19 additions & 4 deletions

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+---
+area: webapp
+type: fix
+---
+
+Fixes OTLP ingest endpoints returning HTTP 500 for runs on environments that use a Postgres-backed task event store. This caused the OpenTelemetry collector to drop entire span batches as non-retryable, resulting in real span loss.

apps/webapp/app/v3/otlpExporter.server.ts

Lines changed: 13 additions & 4 deletions
@@ -24,6 +24,7 @@ import type { ClickhouseFactory } from "~/services/clickhouse/clickhouseFactory.
 import { clickhouseFactory } from "~/services/clickhouse/clickhouseFactoryInstance.server";
 
 import { generateSpanId } from "./eventRepository/common.server";
+import { eventRepository } from "./eventRepository/eventRepository.server";
 import type {
   CreatableEventKind,
   CreatableEventStatus,
@@ -120,10 +121,18 @@ class OTLPExporter {
       const routeKey = `${event.organizationId}\0${taskEventStore}`;
       let resolved = routeCache.get(routeKey);
       if (!resolved) {
-        resolved = this._clickhouseFactory.getEventRepositoryForOrganizationSync(
-          taskEventStore,
-          event.organizationId
-        );
+        // Non-ClickHouse stores (taskEvent / taskEventPartitioned) are Postgres-backed.
+        // The ClickHouse factory only handles clickhouse/clickhouse_v2 and throws otherwise.
+        if (taskEventStore !== "clickhouse" && taskEventStore !== "clickhouse_v2") {
+          // Non-ClickHouse stores (taskEvent / taskEventPartitioned) are Postgres-backed.
+          // The ClickHouse factory only handles clickhouse/clickhouse_v2 and throws otherwise.
+          resolved = { key: "postgres:default", repository: eventRepository };
+        } else {
+          resolved = this._clickhouseFactory.getEventRepositoryForOrganizationSync(
+            taskEventStore,
+            event.organizationId
+          );
+        }
         routeCache.set(routeKey, resolved);
       }
