Skip to content

fix: improve leader failover reconnection and error mapping#480

Merged
w1am merged 3 commits intomasterfrom
w1am/improve-error-mapping
Apr 13, 2026
Merged

fix: improve leader failover reconnection and error mapping#480
w1am merged 3 commits intomasterfrom
w1am/improve-error-mapping

Conversation

@w1am
Copy link
Copy Markdown
Collaborator

@w1am w1am commented Apr 13, 2026

Summary

  • Map gRPC INTERNAL and DATA_LOSS status codes to UnavailableError so the client retries on these transient failures during leader failover
  • Improve bridge error conversion to handle UnavailableError, DeadlineExceededError, and UnknownError, use string literals for error name matching, and default unrecognized errors to UnknownError
  • Bump @kurrent/bridge from 0.1.3 to 0.1.5

@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Improve error mapping for leader failover reconnection

🐞 Bug fix ✨ Enhancement

Grey Divider

Walkthroughs

Description
• Map gRPC INTERNAL and DATA_LOSS status codes to UnavailableError for transient failover failures
• Enhance bridge error conversion with UnavailableError, DeadlineExceededError, UnknownError
  handling
• Replace dynamic error name references with string literals for reliability
• Default unrecognized bridge errors to UnknownError instead of passthrough
• Upgrade @kurrent/bridge dependency from 0.1.3 to 0.1.5
Diagram
flowchart LR
  A["gRPC Status Codes"] -->|"INTERNAL, DATA_LOSS"| B["UnavailableError"]
  C["Bridge Errors"] -->|"String literal matching"| D["Error Conversion"]
  D -->|"UnavailableError, DeadlineExceededError"| E["Proper Error Types"]
  D -->|"Unrecognized errors"| F["UnknownError"]
  G["@kurrent/bridge"] -->|"0.1.3 → 0.1.5"| H["Enhanced Error Support"]
Loading

Grey Divider

File Changes

1. packages/db-client/src/utils/CommandError.ts 🐞 Bug fix +2/-0

Map gRPC status codes to UnavailableError

• Added StatusCode.INTERNAL and StatusCode.DATA_LOSS case mappings to UnavailableError
• Enables client retry behavior during leader failover scenarios
• Treats transient gRPC failures as unavailable service conditions

packages/db-client/src/utils/CommandError.ts


2. packages/db-client/src/utils/convertBridgeError.ts ✨ Enhancement +14/-25

Enhance bridge error conversion with string literals

• Removed stale commented-out code block with old error conversion logic
• Replaced dynamic error.name class references with string literals for reliability
• Added handling for UnavailableError, DeadlineExceededError, and UnknownError
• Changed default case to return UnknownError instead of passing through raw error

packages/db-client/src/utils/convertBridgeError.ts


3. packages/db-client/package.json Dependencies +1/-1

Upgrade @kurrent/bridge dependency

• Updated @kurrent/bridge dependency from 0.1.3 to 0.1.5
• Picks up improved error type support for leader failover scenarios

packages/db-client/package.json


View more (1)
4. packages/benchmark/package.json Dependencies +1/-1

Upgrade @kurrent/bridge dependency

• Updated @kurrent/bridge dependency from 0.1.3 to 0.1.5
• Aligns benchmark package with db-client dependency version

packages/benchmark/package.json


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review bot commented Apr 13, 2026

Code Review by Qodo

🐞 Bugs (1)   📘 Rule violations (0)   📎 Requirement gaps (0)   🖥 UI issues (0)   🎨 UX Issues (0)
🐞\ ☼ Reliability (1)

Grey Divider


Remediation recommended

1. Unknown errors get masked🐞
Description
convertBridgeError now converts every unrecognized error.name into UnknownError, which can hide
non-bridge exceptions thrown during readStream/readAll execution and drops the original error’s
type/stack. This makes debugging and correct upstream error handling harder because unrelated
runtime errors become indistinguishable from bridge/transport failures.
Code

packages/db-client/src/utils/convertBridgeError.ts[R31-32]

    default:
-      return error;
+      return new UnknownError(serviceError);
Evidence
readStream/readAll catch broadly and rethrow whatever convertBridgeError returns; with the new
default branch, any unexpected error is wrapped as UnknownError. CommandErrorBase constructs a new
Error (super(error?.message)) and does not preserve the original error stack/type, so the original
exception details are lost.

packages/db-client/src/utils/convertBridgeError.ts[12-34]
packages/db-client/src/streams/readStream.ts[107-125]
packages/db-client/src/streams/readAll.ts[77-96]
packages/db-client/src/utils/CommandError.ts[59-71]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`convertBridgeError` now wraps all unrecognized errors into `UnknownError`, which can mask non-bridge failures and drop the original error stack/type.

## Issue Context
`readStream` and `readAll` catch broadly and route exceptions through `convertBridgeError`, so this affects any error thrown during Rust stream creation/iteration and local conversion.

## Fix Focus Areas
- packages/db-client/src/utils/convertBridgeError.ts[12-34]
- packages/db-client/src/streams/readStream.ts[107-125]
- packages/db-client/src/streams/readAll.ts[77-96]

## Recommended fix
- In the `default` case, return the original `error` unchanged when it is not a bridge/grpc error you explicitly recognize.
- If you still want to normalize unknown bridge/grpc errors, only wrap when the input looks like a `ServiceError` (e.g., has a numeric `code` or `metadata`), and preserve the original error via `cause` (and/or copy `stack`) so debugging retains the real origin.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Reconnect on INTERNAL/DATA_LOSS 🐞
Description
convertToCommandError now maps gRPC INTERNAL and DATA_LOSS to UnavailableError, which makes
Client.shouldReconnect treat these responses as reconnection-triggering failures. This changes the
system-wide error classification (INTERNAL/DATA_LOSS now surface as UnavailableError) and will tear
down the channel/clients on those codes.
Code

packages/db-client/src/utils/CommandError.ts[R642-645]

    case StatusCode.UNAVAILABLE:
+    case StatusCode.INTERNAL:
+    case StatusCode.DATA_LOSS:
      return new UnavailableError(error);
Evidence
convertToCommandError constructs UnavailableError for StatusCode.INTERNAL and StatusCode.DATA_LOSS;
shouldReconnect reconnects on UnavailableError, and handleError closes the current channel and
clears cached clients, forcing rediscovery on subsequent calls.

packages/db-client/src/utils/CommandError.ts[580-646]
packages/db-client/src/Client/index.ts[513-564]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`StatusCode.INTERNAL` and `StatusCode.DATA_LOSS` are now reclassified as `UnavailableError`, which (a) changes what error type callers observe and (b) triggers channel teardown/rediscovery via `Client.shouldReconnect`.

## Issue Context
Reconnect decisions are currently driven by `instanceof UnavailableError` / `CancelledError` / `NotLeaderError`.

## Fix Focus Areas
- packages/db-client/src/utils/CommandError.ts[622-646]
- packages/db-client/src/Client/index.ts[513-564]

## Recommended fix
- If the goal is reconnection on INTERNAL/DATA_LOSS without changing the surfaced error category, keep these codes mapped to `UnknownError` (or a dedicated error type), and update `shouldReconnect` to also reconnect when the raw `ServiceError.code` is INTERNAL or DATA_LOSS.
- Alternatively, introduce a distinct transient error type for these cases so you can drive reconnection without conflating them with UNAVAILABLE semantics.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@w1am w1am force-pushed the w1am/improve-error-mapping branch from 11479da to 6520276 Compare April 13, 2026 10:28
@w1am w1am added cherry-pick:release/v1.1 Cherry picks PR into v1.1 release branch cherry-pick:release/v1.2 labels Apr 13, 2026
@w1am w1am force-pushed the w1am/improve-error-mapping branch from 66e6f70 to bd01e2a Compare April 13, 2026 10:55
@w1am w1am merged commit fa63d10 into master Apr 13, 2026
58 of 59 checks passed
@w1am w1am deleted the w1am/improve-error-mapping branch April 13, 2026 11:15
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@w1am 👉 Created pull request targeting release/v1.1: #481

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@w1am 👉 Created pull request targeting release/v1.2: #482

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cherry-pick:release/v1.1 Cherry picks PR into v1.1 release branch cherry-pick:release/v1.2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant