pad: add custom comparator for GridGraphEdge set to prevent non-determinism#10632
pad: add custom comparator for GridGraphEdge set to prevent non-determinism#10632eder-matheus wants to merge 2 commits into
Conversation
…minism Signed-off-by: Eder Monteiro <emrmonteiro@precisioninno.com>
There was a problem hiding this comment.
Code Review
This pull request introduces deterministic edge ordering in RDLRouter by replacing std::set<GridGraphEdge> with a custom GridGraphEdgeSet that orders edges by their vertex indices rather than heap pointers. The review feedback suggests further improvements to ensure complete determinism and optimize performance: adding a custom comparator for std::set<RDLRoutePtr> to prevent non-deterministic pointer comparison, using std::minmax for a cleaner implementation of GridGraphEdgeLess, and eliminating the getVertexEdges helper to avoid the overhead of temporary container allocations by calling boost::out_edges directly.
|
|
||
| // Determine access points | ||
| std::unordered_map<odb::Point, std::set<GridGraphEdge>> remove_edges; | ||
| std::unordered_map<odb::Point, GridGraphEdgeSet> remove_edges; |
There was a problem hiding this comment.
While fixing the non-determinism of GridGraphEdge sets is a great step, there is another source of non-determinism in RDLRouter::route at line 652:
std::set<RDLRoutePtr> ripup;Since RDLRoutePtr is a std::shared_ptr<RDLRoute>, std::set will default to comparing the memory addresses of the shared pointers. This makes the iteration order of ripup (and consequently the order in which uncommitRoute and addGraphEdge are called) dependent on the heap layout, which varies across builds and runs.
To ensure complete determinism, ripup should use a custom comparator that orders the routes deterministically, for example by their terminal ID:
struct RDLRoutePtrLess
{
bool operator()(const RDLRoutePtr& lhs, const RDLRoutePtr& rhs) const
{
return lhs->getTerminal()->getId() < rhs->getTerminal()->getId();
}
};
std::set<RDLRoutePtr, RDLRoutePtrLess> ripup;References
- Iteration over containers with pointer keys is non-deterministic unless a custom comparator is defined for the pointer type to ensure a stable, deterministic order.
Signed-off-by: Eder Monteiro <emrmonteiro@precisioninno.com>
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep it up! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Summary
RDL routing was nondeterministic across builds, causing
pad.rdl_route_failed.tclto fail intermittently in CI (only the merge pipeline) while passing locally and
being stable within any single binary.
The router ordered graph data by heap addresses in two places:
std::set<GridGraphEdge>— a boostadjacency_listedge descriptor's defaultoperator<compares its internal edge-property pointer (a heap address).std::set<RDLRoutePtr>(the ripup set) —std::shared_ptrcompares by pointervalue.
Both orders drive the sequence in which edges are removed and re-added to the
graph (
commitRoute/uncommitRoute/access-point teardown). Boost A* walks avertex's incident edges in adjacency-list (insertion) order, so the heap layout
— which varies by build (compiler, allocator, LTO, CI node) — changed A*
equal-cost tie-breaks and therefore the final set of unroutable nets. Within one
binary ASLR shifts all addresses equally, preserving relative order, which is
why it was stable locally but flaky across CI builds.
The fix orders both by stable integer keys:
(source, target)vertex indices (GridGraphEdgeLess/GridGraphEdgeSet), andRDLRoutePtrLess).It also drops a redundant
in_edgestraversal ingetVertexEdges(for anundirected graph
out_edgesalready yields every incident edge).Type of Change
Impact
This is a result-changing deterministic policy, not a no-op cleanup. It replaces
an arbitrary, build-dependent (heap-address) ordering with an arbitrary but
stable one, so routing converges on a single canonical result instead of a
layout-dependent one. As a consequence two congested-case goldens change
(
rdl_route_failed: 17 -> 13 reported unroutable nets;rdl_route_max_iterations:one fewer). The chosen ordering is deterministic but is not selected for routing
quality; the QoR shift is incidental. All geometry (
defok) goldens areunaffected.
Verification
./etc/Build.sh).padsuite on gcc; bothchanged tests stable across repeated runs).
Related Issues
N/A