
perf(prover): hold R1CS and w2 layers resident; remove unused compression #438

Merged
Bisht13 merged 1 commit into v1 from perf/remove-prover-state-compression on May 12, 2026

@shreyas-londhe shreyas-londhe commented May 12, 2026

Summary

Removes prover-state compression (postcard serialize/deserialize of the R1CS and w2 layers during the commit phase). It was introduced in 9291f2b to shrink the commit-phase peak by ~270 MB; subsequent memory work made the sumcheck phase dominate the global peak, so that saving no longer lowers it. Today the compression is pure overhead.
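For readers unfamiliar with the pattern, here is a minimal sketch of the two approaches, not the code from this PR. `Layer`, `compress`, `decompress`, and `hold_resident` are hypothetical names, and a hand-rolled fixed-width encoding stands in for postcard so the sketch needs only the standard library. Unlike postcard's compact format, this stand-in does not shrink the data; it only shows the shape of the round trip that was removed.

```rust
// Hypothetical stand-in for one prover-state layer; the real R1CS and w2
// layer types are not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
pub struct Layer {
    pub values: Vec<u64>,
}

/// Removed pattern, first half: encode the layer into a byte buffer and drop
/// the original. Fixed-width little-endian bytes stand in for postcard here.
pub fn compress(layer: Layer) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(layer.values.len() * 8);
    for v in &layer.values {
        bytes.extend_from_slice(&v.to_le_bytes());
    }
    bytes // `layer` goes out of scope here, freeing its allocation
}

/// Removed pattern, second half: rebuild the layer when it is needed again.
/// This allocates a fresh Vec, which is where the extra allocations come from.
pub fn decompress(bytes: &[u8]) -> Layer {
    let values = bytes
        .chunks_exact(8)
        .map(|chunk| {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(chunk);
            u64::from_le_bytes(buf)
        })
        .collect();
    Layer { values }
}

/// New pattern: keep the layer resident and hand out a borrow. No encoding,
/// no extra allocation, at the cost of the layer staying in memory.
pub fn hold_resident(layer: &Layer) -> &Layer {
    layer
}
```

Calling `decompress(&compress(layer))` yields a layer equal to the original, so both patterns give the later phases the same data; they differ only in memory held during the commit phase and in allocation count.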

Measured impact

Median-of-5, interleaved v1 vs opt. Peak memory identical on every circuit; allocations drop deterministically on circuits using challenge-based witness builders.

| Circuit | v1 allocs | opt allocs | Δ |
| --- | --- | --- | --- |
| basic-4 | 5190 | 5170 | 0% |
| poseidon2 | 9120 | 9100 | 0% |
| noir_sha256 | 50.5k | 35.8k | -29% |
| poseidon-rounds | 9.04M | 9.04M | 0% |
| t_attest | 414k | 246k | -41% |
| t_add_integrity_commit | 421k | 361k | -14% |
| t_add_id_data_1850 | 1.0M | 857k | -14% |
| t_add_dsc_1850 | 3.81M | 3.22M | -15% |
| complete_age_check | 4.25M | 3.56M | -16% |

Wall time trends -3% to -6% on big circuits but noisy; allocation count is the deterministic gain.
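The Δ column can be reproduced from the rounded counts in the table. The sketch below is only a sanity check on those figures, assuming Δ is the relative change from v1 to opt rounded to the nearest percent; `percent_delta` and `ALLOCS` are illustrative names, not part of the codebase.

```rust
/// Relative change from `before` to `after`, in percent, rounded to the
/// nearest integer.
pub fn percent_delta(before: f64, after: f64) -> i64 {
    ((after - before) / before * 100.0).round() as i64
}

/// Allocation counts copied from the table above: (circuit, v1, opt).
pub const ALLOCS: [(&str, f64, f64); 9] = [
    ("basic-4", 5190.0, 5170.0),
    ("poseidon2", 9120.0, 9100.0),
    ("noir_sha256", 50.5e3, 35.8e3),
    ("poseidon-rounds", 9.04e6, 9.04e6),
    ("t_attest", 414e3, 246e3),
    ("t_add_integrity_commit", 421e3, 361e3),
    ("t_add_id_data_1850", 1.0e6, 857e3),
    ("t_add_dsc_1850", 3.81e6, 3.22e6),
    ("complete_age_check", 4.25e6, 3.56e6),
];

/// Percent change for every circuit, in table order.
pub fn all_deltas() -> Vec<(&'static str, i64)> {
    ALLOCS
        .iter()
        .map(|&(name, v1, opt)| (name, percent_delta(v1, opt)))
        .collect()
}
```

The small drops on basic-4 and poseidon2 (20 allocations each) are real but round to 0%.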

Rollback

If a future circuit makes commit-phase peak exceed sumcheck-phase peak, restore compression from 9291f2b.
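The rollback condition can be stated as a small model: the global peak is the larger of the two phase peaks, so a commit-phase saving only matters when the commit phase is the larger one. This is a sketch of that reasoning, not a measurement tool; `PhasePeaks` and its methods are hypothetical, and any per-phase numbers fed into it would have to come from profiling.

```rust
/// Peak memory of the two prover phases, in MB (hypothetical model type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PhasePeaks {
    pub commit_mb: u64,
    pub sumcheck_mb: u64,
}

impl PhasePeaks {
    /// The process-wide peak is whichever phase peaks higher.
    pub fn global_peak_mb(&self) -> u64 {
        self.commit_mb.max(self.sumcheck_mb)
    }

    /// Phase peaks if compression removed `saving_mb` from the commit phase.
    pub fn with_compression(&self, saving_mb: u64) -> PhasePeaks {
        PhasePeaks {
            commit_mb: self.commit_mb.saturating_sub(saving_mb),
            sumcheck_mb: self.sumcheck_mb,
        }
    }

    /// Rollback condition from the text: restore compression only when the
    /// commit-phase peak exceeds the sumcheck-phase peak.
    pub fn should_restore_compression(&self) -> bool {
        self.commit_mb > self.sumcheck_mb
    }
}
```

For example, with an assumed commit-phase peak of 600 MB under a sumcheck-phase peak of 880 MB, a 270 MB commit-phase saving leaves the global peak at 880 MB, which matches the "peak memory identical" observation above. The 600 MB figure is made up for illustration.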

Test plan

  • cargo test --release --workspace --exclude provekit-bench — 62 passed
  • Cross-binary verification (v1 prove ↔ opt verify) passes
  • CI green

…sion

Removes the CompressedR1CS / CompressedLayers postcard-serialization
dance from prove_with_witness. The serialization was added in 9291f2b
to shrink peak memory during the commit phase (~270 MB saving on
complete_age_check at the time). Subsequent memory-reduction work made
the sumcheck phase the dominant memory consumer; on every measured
circuit today the commit-phase saving never reaches the global peak,
making the (de)serialization pure overhead.

Measured on 9 circuits spanning ~60 ms to 2.7 s, peak memory 9 MB to
880 MB (basic-4, poseidon2, noir_sha256, poseidon-rounds, t_attest,
t_add_integrity_commit, t_add_id_data_1850, t_add_dsc_1850,
complete_age_check):

  - peak memory unchanged on every circuit
  - allocations -14% to -41% on circuits using LogUp inverse builders
    (e.g. complete_age_check 4.25M -> 3.56M)
  - wall time on prove >=200ms: -3% to -9%
  - all proofs still verify, all 62 workspace tests pass

If a future circuit shape ever makes commit-phase peak exceed
sumcheck-phase peak, reintroduce a similar (de)serialization step then;
git history (9291f2b) preserves the prior implementation.
@Bisht13 Bisht13 merged commit db8ff0f into v1 May 12, 2026
2 of 3 checks passed
@Bisht13 Bisht13 deleted the perf/remove-prover-state-compression branch May 12, 2026 14:15