This repository was archived by the owner on Apr 30, 2026. It is now read-only.

Commit b7d9b84

fix(query): route GUID group-by to eval path, unblocks (select {by: guid})
(select {from: t by: OrderId}) on 10M rows / 1M distinct GUIDs hung for minutes (effectively forever) and could not be Ctrl-C'd. Two compounding bugs in the DAG no-agg no-nonagg branch at query.c:1700-1708:

* The first-occurrence scan is O(N × n_groups): 10M × 1M = 10 trillion comparisons, and there is no ray_interrupted() check inside the inner loop.
* It reads the key column through ray_read_sym, which truncates RAY_GUID to the first 8 bytes, so the comparison is not even semantically correct for wide keys.

Route RAY_GUID group-by to the eval-level path unconditionally (previously gated on any_nonagg). ray_group_fn already has a proper open-addressed 16-byte hash table plus the interrupt / progress hooks added in the previous commits, so the query now finishes in ~1.3 s (down from "hang") on the 10M row benchmark.

The O(N × n_groups) integer path is still latent for non-GUID wide-ish workloads and should get the same hash-table rewrite, but that is out of scope here; GUID was the reported hang.
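For readers unfamiliar with the technique the message refers to, the following is a minimal sketch of first-occurrence grouping over 16-byte keys using an open-addressed (linear probing) hash table with a periodic interrupt check. It is not the code of ray_group_fn; `guid16_t`, `guid_hash`, `group_guids`, and the `interrupted` callback are hypothetical names chosen for illustration. Each row costs an expected O(1) probe sequence, so the whole pass is roughly O(N) rather than O(N × n_groups).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct { uint8_t b[16]; } guid16_t;   /* hypothetical 16-byte key */

/* FNV-1a over all 16 bytes, so no part of the key is ignored. */
static uint64_t guid_hash(const guid16_t *g) {
    uint64_t h = 1469598103934665603ULL;
    for (int i = 0; i < 16; i++) { h ^= g->b[i]; h *= 1099511628211ULL; }
    return h;
}

/* Writes a dense group id for each row into out_gid[0..n) and returns the
 * number of groups, or -1 on allocation failure or interruption.
 * `interrupted` is a hypothetical polling hook and may be NULL. */
static int64_t group_guids(const guid16_t *keys, int64_t n, int64_t *out_gid,
                           int (*interrupted)(void)) {
    assert(n >= 0);
    uint64_t cap = 16;
    while (cap < (uint64_t)n * 2) cap <<= 1;         /* load factor <= 0.5 */
    int64_t *slot_row = malloc(cap * sizeof *slot_row); /* first row, or -1 */
    int64_t *slot_gid = malloc(cap * sizeof *slot_gid); /* group id of slot */
    if (!slot_row || !slot_gid) { free(slot_row); free(slot_gid); return -1; }
    for (uint64_t i = 0; i < cap; i++) slot_row[i] = -1;

    int64_t n_groups = 0;
    for (int64_t r = 0; r < n; r++) {
        /* Poll every 65536 rows so a long scan stays cancellable. */
        if ((r & 0xFFFF) == 0 && interrupted && interrupted()) {
            free(slot_row); free(slot_gid);
            return -1;
        }
        uint64_t i = guid_hash(&keys[r]) & (cap - 1);
        for (;;) {
            if (slot_row[i] < 0) {                   /* empty: new group */
                slot_row[i] = r;
                slot_gid[i] = n_groups;
                out_gid[r] = n_groups++;
                break;
            }
            if (memcmp(&keys[slot_row[i]], &keys[r], 16) == 0) {
                out_gid[r] = slot_gid[i];            /* seen before */
                break;
            }
            i = (i + 1) & (cap - 1);                 /* linear probe */
        }
    }
    free(slot_row); free(slot_gid);
    return n_groups;
}
```

The table is sized to at least twice the row count so that probing always terminates at an empty slot; a production implementation would more likely grow the table on demand.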
1 parent b30e0ad commit b7d9b84

1 file changed

Lines changed: 10 additions & 1 deletion

src/ops/query.c

@@ -1028,7 +1028,16 @@ ray_t* ray_select_fn(ray_t** args, int64_t n) {
             if (RAY_IS_PARTED(kct)) kct = (int8_t)RAY_PARTED_BASETYPE(kct);
             if (kct == RAY_LIST || kct == RAY_STR)
                 use_eval_group = 1;
-            else if (kct == RAY_GUID && any_nonagg)
+            else if (kct == RAY_GUID)
+                /* RAY_GUID keys always route to the eval-level
+                 * ray_group_fn path: (a) the DAG first-occurrence
+                 * scanner below truncates wide keys to 8 bytes
+                 * via ray_read_sym and is buggy for GUIDs, and
+                 * (b) its O(N × n_groups) nested fallback was
+                 * hanging for minutes on 10M-row tables with
+                 * ~1M distinct groups. ray_group_fn has a proper
+                 * open-addressed 16-byte HT plus interrupt
+                 * checkpoints. */
                 use_eval_group = 1;
         }
     }
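
Point (a) in the comment can be illustrated with a small standalone sketch. This is not ray_read_sym itself; `read_first8`, `eq_truncated`, `eq_full`, and `truncation_conflates` are hypothetical helpers that only model an 8-byte read of a 16-byte key. Two keys that share their first 8 bytes compare equal under the truncated read even though the full keys differ, so they would be merged into one group.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of reading only the first 8 bytes of a 16-byte key. */
static uint64_t read_first8(const uint8_t key[16]) {
    uint64_t v;
    memcpy(&v, key, sizeof v);
    return v;
}

/* Equality as seen by a scanner that only looks at the truncated value. */
static int eq_truncated(const uint8_t a[16], const uint8_t b[16]) {
    return read_first8(a) == read_first8(b);
}

/* Equality over the full 16-byte key. */
static int eq_full(const uint8_t a[16], const uint8_t b[16]) {
    return memcmp(a, b, 16) == 0;
}

/* Returns 1 when two distinct keys are wrongly treated as equal. */
static int truncation_conflates(void) {
    uint8_t a[16] = {0xDE, 0xAD, 0xBE, 0xEF, 1, 2, 3, 4};
    uint8_t b[16] = {0xDE, 0xAD, 0xBE, 0xEF, 1, 2, 3, 4};
    b[15] = 0x7F;                 /* differ only in the second half */
    assert(!eq_full(a, b));       /* the keys really are different */
    return eq_truncated(a, b);
}
```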
