v0.3.0 candidate: memory safety hardening + C ABI v4#116

Open
samtalki wants to merge 14 commits into main from v0.3.0-candidate
Conversation

@samtalki

Member

Supersedes #112, which the head branch rename closed; its comments hold the lockstep Julia PR pointer (eigenergy/PowerIO.jl#25) and the benchmark verification.

Merges two malformed-input fixes, generalizes both bug classes, and revises the C ABI to v4. The header preamble (powerio-capi/include/powerio.h) is the normative statement of the v4 conventions.

Memory safety

  • MATPOWER gencost: a crafted NCOST (e.g. 1e20) overflowed the row-width arithmetic and panicked on every build profile. The arithmetic now saturates and the row is rejected as a parse error. (claude/keen-feynman-vv3049)
  • Message truncation: clipping a message at a raw byte count can split a multibyte UTF-8 character, leaving invalid text in the caller's buffer. Truncation now lands on a character boundary. (claude/amazing-edison-4bjitk)
  • .pwd reader: the byte-read helpers indexed the buffer directly and relied on per-call-site bounds checks. They now return Option, so an out-of-range offset from a corrupt file rejects the record instead of panicking; the record scan also retains decoded coordinates rather than re-reading them. The differential oracle tests (decoded coordinates checked against same-vintage .aux files across the save corpus) pass unchanged.
  • Fuzz harnesses (fuzz/, workspace-excluded; nightly + cargo-fuzz): matpower, psse, and powerio-json via parse_str; the .pwb/.pwd decoders on raw bytes. Invariant: any input yields Ok or a structured Err, never a panic. All five targets pass seeded smoke runs.
  • Audited, no change needed: .pwb cursor reads are bounds-checked; the psse/egret/pandapower numeric casts never feed indexing; every entry point already catches panics at the boundary.
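The Option-returning read helpers described for the .pwd reader can be sketched as follows. This is a minimal illustration, not the crate's code: the helper names (`read_u32_le`, `read_f64_le`, `decode_record`) and the record layout (a u32 id followed by two f64 coordinates) are assumptions made for the example.

```rust
use std::convert::TryFrom;

/// Read a little-endian u32 at `offset`, or None when the four bytes
/// do not all lie inside `buf`. Hypothetical helper, not the crate's API.
fn read_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    // checked_add guards against an offset near usize::MAX.
    let end = offset.checked_add(4)?;
    // `get` returns None for an out-of-range slice instead of panicking.
    let bytes = <[u8; 4]>::try_from(buf.get(offset..end)?).ok()?;
    Some(u32::from_le_bytes(bytes))
}

/// Read a little-endian f64 at `offset`, with the same bounds behavior.
fn read_f64_le(buf: &[u8], offset: usize) -> Option<f64> {
    let end = offset.checked_add(8)?;
    let bytes = <[u8; 8]>::try_from(buf.get(offset..end)?).ok()?;
    Some(f64::from_le_bytes(bytes))
}

/// Decode one record of an assumed layout: u32 id, then x and y as f64.
/// Any out-of-range read rejects the whole record by returning None.
fn decode_record(buf: &[u8], offset: usize) -> Option<(u32, f64, f64)> {
    let id = read_u32_le(buf, offset)?;
    let x = read_f64_le(buf, offset.checked_add(4)?)?;
    let y = read_f64_le(buf, offset.checked_add(12)?)?;
    Some((id, x, y))
}
```

Because every read goes through `?`, a corrupt offset short-circuits the record decode without any per-call-site bounds check, and the decoded coordinates come back in the return value, so the caller has no need to read them again.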

C ABI v4 (PIO_ABI_VERSION 3 → 4)

v3 used three different verbs for serialization, named a handle-returning transform like the string serializers, and let extractors write past a miscounted buffer. All known consumers are this repo and PowerIO.jl, and pio_abi_version rejects mismatched libraries at load, so the break is cheap now. The conventions are designed so it is the last one.

| v3 | v4 |
| --- | --- |
| pio_to_matpower, pio_to_json, pio_from_json | removed; matpower and the new powerio-json snapshot are format strings into pio_to_format / pio_parse_str |
| pio_to_normalized | pio_normalize (a value transform returning a handle; the to_ family re-encodes unchanged data) |
| pio_export_arrow | pio_to_arrow |
| pio_write_pypsa_csv_folder | pio_write_dir(net, to, dir, ...) |
| pio_read_gridfm, pio_gridfm_scenario_ids | pio_read_dir(dir, from, scenario, ...), pio_scenario_ids(dir, from, ...) |
| pio_parse_warnings | pio_warnings |
| pio_reference_bus (isize), pio_reference_buses, pio_n_reference_buses | pio_ref_bus_index (i64), pio_ref_bus_indices(net, out, cap); a dense index, not a bus id, and named so |
| pio_n_components, pio_nodal_demand, pio_nodal_shunt | pio_n_islands, pio_bus_demand, pio_bus_shunt |
| pio_convert_file(path, to, from) | pio_convert_file(path, from, to); new pio_convert_str(text, from, to) |

No format names remain in the symbol table; adding a format leaves the ABI unchanged.

Conventions:

  • Extractors (cap/count): each takes the caller's buffer capacity and returns the total available. Writes never exceed cap, so a miscounted buffer reads short instead of overflowing, and (NULL, 0) is a count query — the snprintf pattern. v3 wrote exactly pio_n_* elements on trust.
  • Warnings: pio_warnings returns the byte length of the joined text, so a buffer can be sized exactly; v3 returned a warning count, which cannot size a buffer. Warnings attach to the handle from any constructor; only functions returning no handle (pio_to_format, pio_convert_*, pio_write_dir) take a warnbuf.
  • Errors: caller-provided errbuf/errlen (the libpcap/curl idiom) — no library-allocated strings to free, no thread-local state.
  • Vocabulary: a bus is a named connection point; a node is one conductor's point at a bus (the OpenDSS sense), reserved for the multiconductor surface; a branch is any two-terminal series element, lines and transformers alike. Hence bus_demand and n_islands.
  • Freeze: existing signatures never change again. New data is new symbols; rich or multiconductor data rides the Arrow C Data Interface and the powerio-json snapshot, whose schemas evolve without touching a C signature.
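The cap/count extractor convention can be sketched on the Rust side roughly like this. `DemoNet` and `demo_bus_demand` are hypothetical stand-ins, and the `usize` return type is an assumption; the real signatures are the ones stated in powerio.h.

```rust
use std::ptr;

/// Hypothetical stand-in for a network handle.
pub struct DemoNet {
    demand: Vec<f64>,
}

/// Copy at most `cap` values into `out` and return the total available.
/// Passing (NULL, 0) is the count query.
///
/// Safety: `net` must be null or point to a live DemoNet; a non-null
/// `out` must be valid for `cap` writes of f64.
pub unsafe extern "C" fn demo_bus_demand(
    net: *const DemoNet,
    out: *mut f64,
    cap: usize,
) -> usize {
    let net = match unsafe { net.as_ref() } {
        Some(n) => n,
        None => return 0,
    };
    let total = net.demand.len();
    if !out.is_null() {
        // Never write more than the caller said the buffer holds.
        let n = total.min(cap);
        unsafe { ptr::copy_nonoverlapping(net.demand.as_ptr(), out, n) };
    }
    total
}
```

A caller would typically call once with (NULL, 0) to learn the count, allocate, then call again. If the second call returns more than the capacity passed in, the buffer was too small and holds a prefix of the data instead of having been overrun.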

Supporting changes: TargetFormat::PowerioJson makes the snapshot an ordinary format, reachable from the CLI and the converters; write_as/to_format become fallible because the snapshot rejects non-finite values rather than writing null (foreign JSON targets are unchanged); powerio::write_dir and powerio_matrix::read_dataset_dir/dataset_scenario_ids are the directory-format dispatch points; examples/smoke.c exercises the full v4 surface and is compiled and run in CI.

Verification

Verified by the workspace test suite, the 26 capi unit tests, header parity, the compiled C smoke binary run end to end, PowerIO.jl's 180 tests against this branch's library, the PowerModels/Exa oracle matrix, and the fuzz smoke runs. Benchmarks against a main baseline show no regressions; the two reworked readers measure flat (matpower) and 1.3% faster (.pwd). Full table: benchmark comment. Continuous tracking: #115.

Numbering and pairing

No version number changes in this diff; the branch keeps its working name. Recommended release number: 0.3.0 — pre-1.0 convention puts breaking changes in the minor, and the ABI handshake remains the actual compatibility gate. Merge order: this PR, then eigenergy/PowerIO.jl#25, with binaries cut from the same commit (tandem CI inactive, #64). Follow-ups: #113 (dist surface adopts these conventions), #114 (PowerDiff field renames), #115 (benchmark tracking).

🤖 Generated with Claude Code

claude and others added 13 commits June 12, 2026 03:44
`gencost_row` reads NCOST from the file as an f64 truncated to usize, so a
huge or non-finite value saturates near usize::MAX. The width requirement was
then computed as `start + want` (with `want = 2*ncost` for piecewise costs),
which overflows: an add-overflow panic under debug overflow-checks, and in
release a wraparound that makes the `require` length check pass and then panics
on the reversed `row[start..start + want]` slice range.

A crafted MATPOWER `mpc.gencost` row (e.g. NCOST = 1e20) therefore panics on
every build profile. Through the C ABI / Python / Julia the panic is caught at
the FFI boundary and degraded to a generic "panic while parsing", but the pure
Rust API and the CLI take an uncaught panic — a denial of service on untrusted
input. It is not a memory-safety issue: the release wraparound lands on a
bounds-checked slice, so it panics rather than reading out of bounds.

Size the requirement with saturating arithmetic so an implausible NCOST is
rejected by the existing length check as a loud `ShortRow` error, the parser's
normal malformed-input signal, on every profile and through every binding.

Found by malformed-input fuzzing of the parser surface.
https://claude.ai/code/session_013KSDeKD9C3YsGaR67RDKhr
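A minimal sketch of the saturating width check described in this commit message, under stated assumptions: `cost_coeffs` and this `ShortRow` struct are illustrative names, not the parser's actual `gencost_row` or error type.

```rust
/// Hypothetical error standing in for the parser's ShortRow signal.
#[derive(Debug, PartialEq)]
pub struct ShortRow {
    pub need: usize,
    pub have: usize,
}

/// Return the cost-coefficient slice of a gencost-like row.
/// `start` is the index of the first coefficient; `ncost` is the NCOST
/// field as read from the file.
pub fn cost_coeffs(
    row: &[f64],
    start: usize,
    ncost: f64,
    piecewise: bool,
) -> Result<&[f64], ShortRow> {
    // An `as` cast from f64 to usize saturates: 1e20 and +inf become
    // usize::MAX, while NaN and negative values become 0.
    let n = ncost as usize;
    // Saturating arithmetic keeps the requirement at usize::MAX instead
    // of overflowing (debug panic) or wrapping around (release).
    let want = if piecewise { n.saturating_mul(2) } else { n };
    let end = start.saturating_add(want);
    if end > row.len() {
        return Err(ShortRow { need: end, have: row.len() });
    }
    Ok(&row[start..end])
}
```

With this shape an implausible NCOST fails the ordinary length check on every build profile. Note that a NaN NCOST casts to 0 here and yields an empty slice; whether that should also be rejected is a separate policy decision the sketch does not make.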
copy_to_buf clipped error/warning messages at a raw byte count, which
could split a multi-byte UTF-8 codepoint and hand consumers an invalid
UTF-8 string. Back the truncation point up to a character boundary so a
clipped message is always valid UTF-8, and pin the behavior with a test.

https://claude.ai/code/session_01KxR1fuH4L8XHHZXtNYgrG8
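Backing a truncation point up to a character boundary can be sketched with the standard library alone. `truncate_on_char_boundary` is a hypothetical name for illustration; the actual `copy_to_buf` also has to handle the caller's buffer, which this sketch leaves out.

```rust
/// Return the longest prefix of `msg` that is at most `max_bytes` long
/// and ends on a UTF-8 character boundary.
pub fn truncate_on_char_boundary(msg: &str, max_bytes: usize) -> &str {
    if msg.len() <= max_bytes {
        return msg;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !msg.is_char_boundary(end) {
        end -= 1;
    }
    &msg[..end]
}
```

For example, clipping "héllo" to 2 bytes yields "h" because byte 2 falls inside the two-byte "é", and clipping "€" to 2 bytes yields the empty string because the whole character needs 3 bytes.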
…mat strings everywhere

PIO_ABI_VERSION 3 -> 4. One verb per job, one meaning per word, and no
format names in symbols, so the surface evolves additively from here:

- pio_to_normalized -> pio_normalize (a value transform returns a handle;
  the to_ family re-encodes unchanged data, per the strtol/htons lineage)
- pio_to_matpower / pio_to_json / pio_from_json cut: matpower and the new
  validated powerio-json snapshot flow through pio_to_format/pio_parse_str
  as format strings (TargetFormat::PowerioJson; write_as is now fallible
  because JSON has no Inf/NaN and the snapshot must round-trip exactly)
- pio_export_arrow -> pio_to_arrow; the Arrow schema is the evolution valve
- pio_write_pypsa_csv_folder -> pio_write_dir(net, to, dir); pio_read_gridfm
  -> pio_read_dir(dir, from, scenario); pio_gridfm_scenario_ids ->
  pio_scenario_ids(dir, from, ...): directory formats are strings too
- pio_convert_str joins pio_convert_file (both now (input, from, to, ...))
- every array extractor takes a cap and returns the total count; NULL out is
  the count query, so a caller buffer can never silently overflow
  (pio_n_reference_buses folds into pio_ref_bus_indices)
- pio_parse_warnings -> pio_warnings: warnings attach to the handle from any
  constructor (pio_read_dir drops its warnbuf), and the return is the byte
  length needed, so callers can size exactly
- pio_reference_bus -> pio_ref_bus_index (i64): it returns a dense index
  while pio_branches from/to carry bus ids; the unit is now in the name
- pio_n_components -> pio_n_islands; pio_nodal_demand/pio_nodal_shunt ->
  pio_bus_demand/pio_bus_shunt: bus/node/branch vocabulary fixed in the
  header preamble (bus = connection point, node = conductor point at a bus,
  reserved for the multiconductor surface; branch = any two-terminal series
  element)

The header preamble now states the grammar, the conventions (errbuf per
libpcap/curl, cap/count per snprintf, UTF-8 boundary truncation, handle
immutability), and the freeze-and-evolve policy.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
matpower, psse, and powerio-json through parse_str; the PowerWorld .pwb
and .pwd binary decoders on raw bytes. The invariant is the parser trust
model: Ok or a structured Err on any input, never a panic. Excluded from
the workspace (needs nightly + cargo-fuzz); see fuzz/README.md. The
gencost NCOST overflow was found by exactly this harness shape.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
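The harness invariant (Ok or a structured Err, never a panic) can be expressed as a std-only check. This is a sketch of the property, not the harness in fuzz/: a cargo-fuzz target would wrap the parser call in libfuzzer-sys's `fuzz_target!` macro instead. `parse_count` and `parse_first_byte` are toy parsers invented for the example.

```rust
use std::panic;

/// Toy well-behaved parser: malformed input becomes a structured Err.
fn parse_count(text: &str) -> Result<u32, String> {
    text.trim()
        .parse::<u32>()
        .map_err(|e| format!("bad count: {}", e))
}

/// Toy buggy parser: indexing panics on empty input.
fn parse_first_byte(text: &str) -> Result<u8, String> {
    Ok(text.as_bytes()[0])
}

/// True when `parse` returns (Ok or Err) on `input` without panicking.
fn upholds_invariant<T, E>(parse: fn(&str) -> Result<T, E>, input: &[u8]) -> bool {
    // Invalid UTF-8 is replaced lossily so any byte string reaches the parser.
    let text = String::from_utf8_lossy(input);
    panic::catch_unwind(panic::AssertUnwindSafe(|| {
        let _ = parse(&*text);
    }))
    .is_ok()
}
```

Here `upholds_invariant(parse_count, b"\xff\xfe")` holds because the bad input turns into an Err, while `upholds_invariant(parse_first_byte, b"")` fails because the parser panics. The check relies on the default unwinding panic strategy; under an abort build the process would terminate instead.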
… model and panic notes

The capi README gains the ABI v4 history row, the cap/count contract, the
parser trust model (malformed input errors, never UB; memory scales with
input and is uncapped), and the panic strategy note (guards need the
default unwind; an abort build aborts cleanly). smoke.c now exercises the
v4 surface: count queries, powerio-json snapshot round trip, convert_str,
write_dir, pio_warnings sizing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ble write_as

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- dataset format dispatch moves from powerio-capi into powerio-matrix's io
  hub (read_dataset_dir / dataset_scenario_ids), next to where the gridfm
  reader lives; the single-variant DatasetFormat enum at the C boundary is
  gone and the C ABI is a thin wrapper, like every other format dispatch
- the three identical catch_unwind tails of pio_to_format / pio_convert_file
  / pio_convert_str fold into finish_conversion, mirroring finish_network
- write_as: the PowerioJson early return becomes a match arm, dropping the
  unreachable!() (the snapshot still skips the warning passes deliberately:
  warn_normalized_tap would be false for a format that preserves the labels)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…C block comment early

Found by compiling smoke.c against the regenerated header: the pio_read_dir
doc's directory-glob example ended the comment mid-sentence and broke every
build including powerio.h.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ecoder bench

The validate job's Julia shim ccalled pio_nodal_demand/pio_nodal_shunt and
the un-capped extractor signatures — the one consumer the docs sweep missed
(it greps .jl now). parse.rs gains parse_pwd_activsg200: the one reader
whose hot loop runs per byte, regression coverage for the total byte
accessors.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- to_json refuses non-finite values, naming the field: serde_json would
  degrade Inf/NaN to null, which the snapshot's own reader then rejects;
  the documented write-side error now actually fires
- sniff_json learns the snapshot's top-level buses key, so a .json
  snapshot parses without a format hint; powerio-cli gains the
  powerio-json arm (aliases powerio/json)
- pio_network_free / pio_string_free run under the panic guard the
  boundary contract documents
- capi README calls out the silent pio_convert_file argument reorder,
  the one v4 break invisible at link time
- new powerworld_aux fuzz target: the .aux tokenizer was the one
  hand-written parse_str reader no harness fed
- README examples compile again (.network + ?); languages.md drops a
  stale "(PR open)" label

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The strict to_json guard broke the bindings' materialization path: readers
legitimately produce Inf limits (the pandapower fixture carries an infinite
pmax) and Python/Julia build every Network view through the snapshot, so
refusing the write refused the parse. Keep the write total, surface the
degradation as a write_as fidelity warning naming the field, and pin the
no-read-back consequence in the test (the validating reader still rejects
the null).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
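The "keep the write total, warn on degradation" behavior can be illustrated with a small std-only sketch. It assumes a flat object of named f64 fields whose names need no JSON escaping; `to_json_with_warnings` is a hypothetical function, not the crate's `to_json` or `write_as`.

```rust
/// Render named numeric fields as a flat JSON object. Non-finite values
/// are written as null, and each one adds a warning naming the field.
pub fn to_json_with_warnings(fields: &[(&str, f64)]) -> (String, Vec<String>) {
    let mut parts = Vec::new();
    let mut warnings = Vec::new();
    for &(name, value) in fields {
        if value.is_finite() {
            parts.push(format!("\"{}\":{}", name, value));
        } else {
            // JSON has no Inf/NaN literal, so the value degrades to null.
            parts.push(format!("\"{}\":null", name));
            warnings.push(format!(
                "field `{}` is non-finite ({}); written as null",
                name, value
            ));
        }
    }
    (format!("{{{}}}", parts.join(",")), warnings)
}
```

The write always succeeds, so a reader that legitimately produced an infinite limit can still be serialized, and the caller learns from the warning which field will not survive a validating read-back.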
# Conflicts:
#	powerio-capi/include/powerio.h
#	powerio-capi/src/lib.rs
@samtalki

Member Author

@frederikgeth

Collaborator

Looks good to me! Thanks for explaining what this is :)

3 participants