
feat(router): add router chain snapshot cache#3305

Open
Aetherance wants to merge 17 commits into apache:develop from Aetherance:feat/router/chain-cache

Conversation

@Aetherance
Contributor

@Aetherance Aetherance commented Apr 26, 2026

Description

Every RPC call executes the full router chain. TagRouter performs O(n) invoker traversal for every call, even though invoker lists and routing rules change orders of magnitude less frequently than RPC calls. This PR adds an invoker-snapshot cache on RouterChain so that Poolable routers can pre-compute address pools once per SetInvokers and reuse them across subsequent Route calls.

Fixes #3166

Approach

  • New routerCache on RouterChain, rebuilt in full on each SetInvokers. Implements the existing router.Cache interface.

  • CacheAccessor interface lets Poolable routers receive the cache reference from RouterChain.

  • TagRouter implements Poolable + CacheAccessor. Pool() builds a three-dimensional bitmap index (tag, address, port) stored in a single AddrPool.

  • Static tag matching, dynamic address matching, and indexed param matching all resolve via O(1) bitmap lookup instead of O(n) traversal.

  • When a ParamMatch references a key not indexed, the bitmap path gracefully falls back to the original filterInvokers-based logic.
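
The indexing idea in the list above can be sketched as follows. This is a minimal illustration, not the PR's code: the `bitset` type, the `invoker` struct, the `buildPool` helper, and the `tag:`/`addr:` key prefixes are all hypothetical, and a plain `[]uint64` stands in for whatever bitmap implementation the PR actually uses.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitset is a minimal fixed-size bitmap (hypothetical stand-in for the PR's bitmap type).
type bitset []uint64

func newBitset(n int) bitset { return make(bitset, (n+63)/64) }

// set marks position i of the invoker snapshot.
func (b bitset) set(i int) { b[i/64] |= uint64(1) << uint(i%64) }

// and returns the intersection of two bitsets of equal length.
func (b bitset) and(o bitset) bitset {
	out := make(bitset, len(b))
	for i := range b {
		out[i] = b[i] & o[i]
	}
	return out
}

// indices expands the set bits back into positions in the invoker snapshot.
func (b bitset) indices() []int {
	var out []int
	for w, word := range b {
		for word != 0 {
			out = append(out, w*64+bits.TrailingZeros64(word))
			word &= word - 1 // clear the lowest set bit
		}
	}
	return out
}

// invoker is a hypothetical stand-in for a provider instance.
type invoker struct {
	addr string
	tag  string
}

// buildPool indexes the snapshot once: one bitmap per tag and one per address.
func buildPool(invokers []invoker) map[string]bitset {
	pool := make(map[string]bitset)
	add := func(key string, i int) {
		if pool[key] == nil {
			pool[key] = newBitset(len(invokers))
		}
		pool[key].set(i)
	}
	for i, inv := range invokers {
		add("tag:"+inv.tag, i)
		add("addr:"+inv.addr, i)
	}
	return pool
}

func main() {
	invokers := []invoker{
		{addr: "10.0.0.1:20000", tag: "gray"},
		{addr: "10.0.0.2:20000", tag: "blue"},
		{addr: "10.0.0.3:20000", tag: "gray"},
	}
	pool := buildPool(invokers)
	// Looking up a tag is a map access; only the matching positions are expanded.
	fmt.Println("gray:", pool["tag:gray"].indices())
	// Combining two dimensions is a word-wise AND.
	fmt.Println("gray at .3:", pool["tag:gray"].and(pool["addr:10.0.0.3:20000"]).indices())
}
```

Note that the lookup itself is constant-time in this sketch, while expanding the result still costs time proportional to the number of matching invokers.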

Checklist

  • I confirm the target branch is develop
  • I have added tests that prove my fix is effective or that my feature works
  • Code has passed local testing

@CAICAIIs
Contributor

@CAICAIIs

@Aetherance Aetherance force-pushed the feat/router/chain-cache branch from beb8384 to cba446e Compare April 26, 2026 12:40
@codecov-commenter

codecov-commenter commented Apr 26, 2026

Codecov Report

❌ Patch coverage is 77.64706% with 38 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.66%. Comparing base (60d1c2a) to head (93be1e2).
⚠️ Report is 816 commits behind head on develop.

Files with missing lines         Patch %   Lines
cluster/router/tag/cache.go      76.31%    22 Missing and 5 partials ⚠️
cluster/router/chain/cache.go    67.74%    9 Missing and 1 partial ⚠️
cluster/router/tag/match.go      0.00%     1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3305      +/-   ##
===========================================
+ Coverage    46.76%   52.66%   +5.90%     
===========================================
  Files          295      494     +199     
  Lines        17172    38037   +20865     
===========================================
+ Hits          8031    20034   +12003     
- Misses        8287    16393    +8106     
- Partials       854     1610     +756     


Comment thread cluster/router/tag/router.go Outdated
@Alanxtl Alanxtl added ✏️ Feature 3.3.2 version 3.3.2 labels Apr 27, 2026
@Alanxtl Alanxtl linked an issue Apr 27, 2026 that may be closed by this pull request
2 tasks
@Alanxtl
Contributor

Alanxtl commented Apr 27, 2026

Please add a benchmark test to show the improvement.

Comment thread cluster/router/chain/cache.go
@Aetherance
Contributor Author

Benchmark Results

Environment: Linux amd64 / Intel i7-14650HX / Go 1.25.0

N is the number of invokers.

Static Tag Routing

N       no cache    cached
10      691 ns      307 ns
100     6218 ns     820 ns
1000    71788 ns    3544 ns

Dynamic Tag Address Routing

N       no cache     cached
10      991 ns       768 ns
100     8189 ns      803 ns
1000    118391 ns    689 ns

Cache Hit

tagged    no cache    cached
10–90     ~8 µs       ~0.8 µs

To run the benchmarks:

go test ./cluster/router/tag/ -bench=.
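
For readers who want to reproduce the shape of this comparison outside the repository, here is a rough standalone sketch. It is not the PR's benchmark file: `makeTags`, `scan`, and `buildIndex` are hypothetical helpers, and a plain map of positions stands in for the bitmap pool. It uses `testing.Benchmark` so it can run as an ordinary program, and calls `b.ReportAllocs()` so allocations are reported alongside timings.

```go
package main

import (
	"fmt"
	"testing"
)

// makeTags builds n hypothetical invoker tags; every 10th one is "gray".
func makeTags(n int) []string {
	tags := make([]string, n)
	for i := range tags {
		if i%10 == 0 {
			tags[i] = "gray"
		} else {
			tags[i] = "base"
		}
	}
	return tags
}

// scan models the uncached path: walk every invoker on every call.
func scan(tags []string, want string) []int {
	var out []int
	for i, t := range tags {
		if t == want {
			out = append(out, i)
		}
	}
	return out
}

// buildIndex models the one-time cost of the cached path: tag -> positions.
func buildIndex(tags []string) map[string][]int {
	idx := make(map[string][]int)
	for i, t := range tags {
		idx[t] = append(idx[t], i)
	}
	return idx
}

var sink []int // keeps results alive so the compiler cannot drop the work

func main() {
	tags := makeTags(1000)
	idx := buildIndex(tags)

	noCache := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = scan(tags, "gray")
		}
	})
	cached := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = idx["gray"]
		}
	})

	fmt.Println("no cache:", noCache.String(), noCache.MemString())
	fmt.Println("cached:  ", cached.String(), cached.MemString())
}
```

The absolute numbers from a sketch like this are not comparable to the tables above, since it measures a toy lookup rather than TagRouter.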

Comment thread cluster/router/tag/router.go Outdated
Comment thread cluster/router/tag/router.go Outdated
Comment thread cluster/router/tag/router.go Outdated
Contributor

@AlexStocks AlexStocks left a comment

Code Review — dubbo-go #3305

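The core optimization idea (bitmap index + snapshot cache) is headed in the right direction, but there are serious semantic-correctness problems:

P0 × 4

  1. The cached path uses the full invoker list instead of the already-filtered invokers, which breaks router-chain semantics
  2. Concurrent reads and writes of the TagRouter.cache field have no synchronization
  3. The return value of AddrMetadata is silently discarded
  4. RouterCacheDisable is not set in certain scenarios

P1 × 3: regex compilation on the hot path, panic risk on an empty slice, double allocation when expanding bitmaps
P2 × 3: benchmarks lack ReportAllocs, bitmap keys have no constant definitions, FindAddrPool copies the slice on every call

Before merging, please prioritize fixing P0-1 (the cached path is inconsistent with router-chain semantics) and P0-2 (concurrency safety), and verify equivalence more rigorously.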
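
As an illustration of the P0-2 concern only, one common way to make a shared cache field safe for concurrent readers and a single writer is to publish it through `atomic.Pointer` (Go 1.19+). This is a sketch under assumptions: the `snapshot` and `tagRouter` types and the `setCache`/`route` methods are hypothetical, and the thread does not show which synchronization the author actually chose.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// snapshot is a hypothetical immutable view of the cache at one generation.
type snapshot struct {
	generation uint64
	addrs      []string
}

// tagRouter holds the cache behind an atomic pointer, so Route-style readers
// and SetCache-style writers never touch a plain field concurrently.
type tagRouter struct {
	cache atomic.Pointer[snapshot]
}

// setCache publishes a new snapshot; readers see either the old or the new one.
func (r *tagRouter) setCache(s *snapshot) { r.cache.Store(s) }

// route loads the snapshot once and uses only that copy for the whole call.
func (r *tagRouter) route() (generation uint64, n int) {
	s := r.cache.Load()
	if s == nil {
		return 0, 0 // no cache yet: a real router would fall back to scanning
	}
	return s.generation, len(s.addrs)
}

func main() {
	r := &tagRouter{}
	var wg sync.WaitGroup

	wg.Add(1)
	go func() { // writer: each generation g carries exactly g addresses
		defer wg.Done()
		for g := uint64(1); g <= 100; g++ {
			r.setCache(&snapshot{generation: g, addrs: make([]string, int(g))})
		}
	}()

	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() { // readers: generation and length must always agree
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				if gen, n := r.route(); uint64(n) != gen {
					panic("torn snapshot")
				}
			}
		}()
	}

	wg.Wait()
	gen, _ := r.route()
	fmt.Println("final generation:", gen)
}
```

Running a program like this with `go run -race` is a quick way to check that the read and write paths are free of data races.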

Comment thread cluster/router/chain/cache.go
Comment thread cluster/router/tag/router.go
Comment thread cluster/router/tag/router.go Outdated
Comment thread cluster/router/chain/chain.go
Comment thread cluster/router/tag/router.go Outdated
Comment thread cluster/router/tag/router.go
Comment thread cluster/router/tag/router.go Outdated
Comment thread cluster/router/tag/cache_benchmarks_test.go
Comment thread cluster/router/tag/router.go Outdated
Comment thread cluster/router/chain/cache.go
@Aetherance Aetherance force-pushed the feat/router/chain-cache branch from 6784e99 to 2e50749 Compare May 23, 2026 14:39
Comment thread cluster/router/tag/router.go
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces an invoker-snapshot cache at the RouterChain layer to avoid re-walking the full invoker list on every RPC, enabling Poolable routers (notably TagRouter) to precompute and reuse routing indices across Route calls. It adds a generation guard so cached bitmap indices are only used when they’re aligned with the chain’s per-call snapshot.

Changes:

  • Add RouterChain snapshot generation tracking + a routerCache rebuilt on each SetInvokers, and publish the generation onto the invocation for fast-path safety.
  • Implement TagRouter as Poolable + CacheAccessor, building roaring bitmap indices (tag/address/port) for O(1)-style lookups, with fallback to existing traversal logic.
  • Add unit tests, concurrency/race coverage, and micro-benchmarks; also fix a TagRouter failover filtering bug in requestTag.
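
The generation guard described above can be sketched roughly like this. All names here (`cache`, `routeGray`, `isGray`, the `-gray` suffix convention) are hypothetical and exist only for illustration; the point is that cached positions are used only when the cache generation equals the generation published for the current call.

```go
package main

import (
	"fmt"
	"strings"
)

// cache is a hypothetical snapshot cache: grayIdx holds positions that are
// only meaningful for the invokers slice of the same generation.
type cache struct {
	generation uint64
	invokers   []string
	grayIdx    []int
}

// isGray is the slow-path predicate used when the cache cannot be trusted.
func isGray(addr string) bool { return strings.HasSuffix(addr, "-gray") }

// routeGray takes the fast path only when the cache generation matches the
// generation published for this call; otherwise it scans the call's invokers.
func routeGray(c *cache, callGen uint64, callInvokers []string) (out []string, fast bool) {
	if c != nil && c.generation == callGen {
		for _, i := range c.grayIdx {
			out = append(out, c.invokers[i])
		}
		return out, true
	}
	for _, inv := range callInvokers {
		if isGray(inv) {
			out = append(out, inv)
		}
	}
	return out, false
}

func main() {
	snap := []string{"a-gray", "b", "c-gray"}
	c := &cache{generation: 3, invokers: snap, grayIdx: []int{0, 2}}

	// Same generation: indices are aligned with the snapshot, fast path is safe.
	out, fast := routeGray(c, 3, snap)
	fmt.Println(out, fast)

	// The call was snapshotted at another generation: ignore the cache and scan.
	out, fast = routeGray(c, 4, []string{"b", "d-gray"})
	fmt.Println(out, fast)
}
```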

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

File                                          Description
common/constant/key.go                        Adds invocation attribute keys for cache control/generation + bitmap key prefixes/constants.
cluster/router/router.go                      Extends the router.Cache interface and adds CacheAccessor for cache injection into routers.
cluster/router/chain/chain.go                 Introduces chain-level cache + generation publishing and cache-disable behavior during routing.
cluster/router/chain/cache.go                 New routerCache implementation storing pools + invoker snapshot + generation under a lock.
cluster/router/chain/chain_test.go            Adds tests for generation publish/increment and a concurrency race test for cache generation skew.
cluster/router/tag/router.go                  Adds cache-aware fast path in TagRouter guarded by cache generation alignment.
cluster/router/tag/cache.go                   Implements TagRouter pooling (bitmap indices) and bitmap-based routing path with fallback.
cluster/router/tag/match.go                   Fixes failover address filtering to operate on the already-filtered result set.
cluster/router/tag/router_test.go             Adds extensive tests covering bitmap path, fallbacks, and generation guard behavior.
cluster/router/tag/cache_benchmarks_test.go   Adds benchmarks comparing cached vs non-cached routing for representative scenarios.


Comment on lines +120 to +128
if c.cache == nil {
	c.cache = newRouterCache()
	for _, r := range c.routers {
		if accessor, ok := r.(router.CacheAccessor); ok {
			accessor.SetCache(c.cache)
		}
	}
}
c.cache.rebuild(c.generation, invokers, c.routers)
Contributor Author

Fixed.

Comment on lines +125 to +127
if len(match) != 0 {
	return nil // not bitmap-cached; fall back to requestTag
}
Contributor Author

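Caching ParamMatch would introduce too much complexity. The code comments and the PR description now state explicitly that a non-empty ParamMatch falls back directly to the non-cached path.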

Comment thread cluster/router/router.go
Comment on lines +98 to +103
// FindAddrPool returns the address pool, the invoker snapshot, and the generation of that
// snapshot in a single locked read. The generation lets callers verify the pool/invokers
// belong to the same generation the chain snapshotted for the current route, so the bitmap
// indices stay aligned with the invoker slice and the route is not served from a snapshot
// produced by a concurrent SetInvokers.
FindAddrPool(Poolable) (AddrPool, []base.Invoker, uint64)
Contributor Author

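This is indeed a breaking change, but the bitmap index in AddrPool depends on the invoker snapshot of the same generation, so returning AddrPool alone cannot express safe usage semantics. Returning the pool, the invokers, and the generation together from a single FindAddrPool call guarantees that the caller reads one consistent snapshot, and it also prevents later misuse of the old API.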
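
The consistency argument can be illustrated with a small sketch. It assumes a cache guarded by a `sync.RWMutex`; the `routerCache` fields, the `find` method, and the use of plain strings and index slices in place of `base.Invoker` and `AddrPool` are all simplifications made for this example, not the PR's actual types.

```go
package main

import (
	"fmt"
	"sync"
)

// routerCache is a simplified, hypothetical cache: pool maps a key to
// positions inside invokers, so both must come from the same generation.
type routerCache struct {
	mu         sync.RWMutex
	generation uint64
	invokers   []string
	pool       map[string][]int
}

// rebuild swaps all three fields under one write lock.
func (c *routerCache) rebuild(gen uint64, invokers []string, pool map[string][]int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.generation, c.invokers, c.pool = gen, invokers, pool
}

// find returns pool, invokers, and generation from a single locked read, so a
// caller can never pair a new pool with an old invoker slice.
func (c *routerCache) find() (map[string][]int, []string, uint64) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.pool, c.invokers, c.generation
}

func main() {
	c := &routerCache{}
	c.rebuild(1, []string{"a", "b"}, map[string][]int{"gray": {1}})
	c.rebuild(2, []string{"c"}, map[string][]int{"gray": {0}})

	pool, invokers, gen := c.find()
	for _, i := range pool["gray"] {
		// Indexing is safe because pool and invokers belong to the same generation.
		fmt.Println("generation", gen, "gray invoker:", invokers[i])
	}
}
```

If the pool and the invoker slice were read in two separate calls, a rebuild between them could leave the caller indexing a shorter slice with stale positions, which is the misuse a combined return value rules out.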

@sonarqubecloud

sonarqubecloud Bot commented Jun 6, 2026

@Aetherance Aetherance requested a review from AlexStocks June 6, 2026 10:25

Development

Successfully merging this pull request may close these issues.

[FEATURE] Introduce Router Cache to Enhance Routing Performance

6 participants