[m3db-client] Circuit breaker: break only on timeout errors #4393

Merged
arnav-chakraborty merged 1 commit into master from prateek-sachan_UBER/circuit-breaker-timeout-filter on May 20, 2026
@prateek2211
Contributor

What this PR does / why we need it:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:


Does this PR require updating code package or user-facing documentation?:


Summary:
  Context:
    - Circuit breaker in M3DB client currently trips on all errors
    - Per the ERD, it should only trip on timeout/transport errors (slow nodes)
    - Application errors (bad request, partial batch) should not trip the breaker

  Changes:
    - Add ErrorFilter func(error) bool to middleware Params for error classification
    - Modify withBreaker to only report filtered errors as CB failures
    - Only store lastError for CB-relevant errors (so rejection message shows timeout)
    - Add filteredFailures metric to distinguish CB failures from total failures
    - Wire IsTimeoutError as the error filter in host_queue.go
    - Add unit and integration tests for error filtering behavior

Test Plan:
  go test ./src/dbnode/client/circuitbreaker/middleware/... -v
  go test ./src/dbnode/client/ -run TestHostQueueCircuitBreaker -v

Revert Plan:
  Revert the diff if it is not working as expected.
@arnav-chakraborty merged commit 2e397f7 into master on May 20, 2026
3 of 4 checks passed
@arnav-chakraborty deleted the prateek-sachan_UBER/circuit-breaker-timeout-filter branch on May 20, 2026 05:20