Bug report
Bug description:
tomllib.loads re-traverses the whole table header path from the document root
once for every key/value pair under that header. A document with one table
header of depth D followed by M sibling keys is parsed in O(M * D) time.
key_value_rule (Lib/tomllib/_parser.py) computes
abs_key_parent = header + key_parent for each pair and then calls both
out.flags.is_(abs_key_parent, Flags.FROZEN) and
out.data.get_or_create_nest(abs_key_parent). Flags.is_ and
get_or_create_nest each walk the full abs_key_parent path from their
respective roots, so the shared header prefix is re-walked for every one of
the M keys. Nothing anchors the per-key work on the current table section.
Reproducer (single deep header, many single-part keys, all spec-valid):
import tomllib, time
D, M = 998, 80000
doc = "[" + ".".join(f"h{i}" for i in range(D)) + "]\n" + "".join(f"k{i}=1\n" for i in range(M))
t = time.perf_counter(); tomllib.loads(doc); print(len(doc), "bytes", time.perf_counter() - t, "s")
Measured (scaling D = M = L, so bytes are O(L) but time is O(L^2)):
L(=D=M) bytes time
1000 11.5 KB 0.14 s
2000 25.2 KB 0.57 s (4.2x for 2x)
4000 52.5 KB 2.19 s (3.9x for 2x)
A single 52 KB file blocks the thread for over 2 seconds.
How this differs from gh-149231
gh-149231 fixed an O(N^2) blowup from a single key with N dotted parts, by
adding MAX_KEY_PARTS = sys.getrecursionlimit() in parse_key. This is a
different mechanism: the cost here comes from re-walking a shared header for
every one of M keys, and each key in the reproducer has exactly one part, so
the parse_key cap never fires on the keys. MAX_KEY_PARTS bounds the header
depth D to about 1000 but does not remove the per-key re-walk, so a residual
O(1000 * M) amplification survives on builds that carry the cap:
D=998, M=80000 -> 714 KB -> ~11.7 s (legal even with the MAX_KEY_PARTS cap)
Suggested direction
Resolve the header's flags and nest node once per table section and have
key_value_rule walk only the relative key_parent (length 0 for the common
single-part key), turning each section from O(M * D) into O(M + D).
CPython versions tested on:
3.12, 3.13, 3.14, 3.15 (main). The full O(input^2) reproduces on 3.12 and on
3.14.5 as released (no MAX_KEY_PARTS yet); the O(1000 * M) residual reproduces
on 3.13, current 3.14, and main (which carry the gh-149231 cap).
Operating systems tested on:
macOS (also platform independent, pure Python parser)
Linked PRs
Bug report
Bug description:
tomllib.loadsre-traverses the whole table header path from the document rootonce for every key/value pair under that header. A document with one table
header of depth
Dfollowed byMsibling keys is parsed inO(M * D)time.key_value_rule(Lib/tomllib/_parser.py) computesabs_key_parent = header + key_parentfor each pair and then calls bothout.flags.is_(abs_key_parent, Flags.FROZEN)andout.data.get_or_create_nest(abs_key_parent).Flags.is_andget_or_create_nesteach walk the fullabs_key_parentpath from theirrespective roots, so the shared
headerprefix is re-walked for every one ofthe
Mkeys. Nothing anchors the per-key work on the current table section.Reproducer (single deep header, many single-part keys, all spec-valid):
Measured (scaling
D = M = L, so bytes areO(L)but time isO(L^2)):A single 52 KB file blocks the thread for over 2 seconds.
How this differs from gh-149231
gh-149231 fixed an
O(N^2)blowup from a single key withNdotted parts, byadding
MAX_KEY_PARTS = sys.getrecursionlimit()inparse_key. This is adifferent mechanism: the cost here comes from re-walking a shared header for
every one of
Mkeys, and each key in the reproducer has exactly one part, sothe
parse_keycap never fires on the keys.MAX_KEY_PARTSbounds the headerdepth
Dto about 1000 but does not remove the per-key re-walk, so a residualO(1000 * M)amplification survives on builds that carry the cap:Suggested direction
Resolve the header's flags and nest node once per table section and have
key_value_rulewalk only the relativekey_parent(length 0 for the commonsingle-part key), turning each section from
O(M * D)intoO(M + D).CPython versions tested on:
3.12, 3.13, 3.14, 3.15 (main). The full
O(input^2)reproduces on 3.12 and on3.14.5 as released (no MAX_KEY_PARTS yet); the
O(1000 * M)residual reproduceson 3.13, current 3.14, and main (which carry the gh-149231 cap).
Operating systems tested on:
macOS (also platform independent, pure Python parser)
Linked PRs