perf(keccak): trim c_aux and skip zero-rotation split witness #1309

Open
hero78119 wants to merge 2 commits into master from feat/keccak_improvement

@hero78119
Collaborator

Problem

lookup_keccakf in ceno_zkvm/src/precompiles/lookup_keccakf.rs had redundant witness materialization in a prover hot path:

  1. c_aux stored the j=0 prefix stage, which is a direct copy of input state bytes.
  2. The RhoPi lane with rotation 0 still used split-rotation witness/range checks.

This increased committed columns and per-round range lookup load without adding new information.
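
To see why the j=0 stage carries no information, here is a minimal sketch in plain Rust (not the circuit code). It assumes a byte layout `state8[j][i][k]` (row j, column i, byte k) and a hypothetical `c_aux` shape; the real `KeccakWitCols` layout may differ. The full table copies the input into stage 0, while the trimmed table reads that predecessor straight from `state8[0][i]`.

```rust
// Hypothetical layout: state8[j][i][k] is byte k of the lane in row j, column i.
type State8 = [[[u8; 8]; 5]; 5];

/// Full prefix-XOR table with all five stages, including the redundant j = 0 copy.
/// Indexed as c_aux[i][j][k].
fn prefix_full(state8: &State8) -> [[[u8; 8]; 5]; 5] {
    let mut c_aux = [[[0u8; 8]; 5]; 5];
    for i in 0..5 {
        c_aux[i][0] = state8[0][i]; // stage 0 is just the input bytes
        for j in 1..5 {
            for k in 0..8 {
                c_aux[i][j][k] = c_aux[i][j - 1][k] ^ state8[j][i][k];
            }
        }
    }
    c_aux
}

/// Trimmed table: only stages j = 1..4 are stored, at index j - 1.
/// Stage 1 takes its predecessor directly from state8[0][i].
fn prefix_trimmed(state8: &State8) -> [[[u8; 8]; 4]; 5] {
    let mut c_aux = [[[0u8; 8]; 4]; 5];
    for i in 0..5 {
        for j in 1..5 {
            for k in 0..8 {
                let prev = if j == 1 {
                    state8[0][i][k]
                } else {
                    c_aux[i][j - 2][k]
                };
                c_aux[i][j - 1][k] = prev ^ state8[j][i][k];
            }
        }
    }
    c_aux
}
```

Stages 1..4 of both tables agree byte for byte, so the final stage (the Theta column parity) is unchanged when the j=0 copy is dropped.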

Design Rationale

This change trims only redundant witness encoding while preserving Keccak semantics and the existing lookup model (u8 x u8 -> u8):

  • Keep the same protocol flow and lookup primitives.
  • Remove only derivable witness data:
    • store c_aux only for j=1..4,
    • special-case rot == 0 lane with direct byte equality instead of split witness decomposition.
  • Update lookup accounting constants to match the reduced witness/range-check footprint.

Trade-off: slightly more branching in builder/witness generation for reduced witness width and range lookups.
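
The rot == 0 special case can be illustrated with a small byte-level rotation sketch. This is an assumption-laden model, not the circuit: `rotl_bytes` is a hypothetical helper that rotates a little-endian 8-byte lane by splitting each byte into a high and a low part, which stands in for the split witnesses. With a rotation of 0 there is nothing to split, so the output bytes equal the input bytes and the early return plays the role of the direct equality constraint.

```rust
/// Rotate a little-endian 8-byte lane left by `rot` bits (0..64).
/// Hypothetical helper: the per-byte (hi, lo) split mimics split-rotation
/// witnesses; the real circuit's split sizes may be organized differently.
fn rotl_bytes(lane: [u8; 8], rot: u32) -> [u8; 8] {
    if rot == 0 {
        // Zero rotation: output bytes equal input bytes, no split needed.
        return lane;
    }
    let byte_shift = (rot / 8) as usize;
    let bit_shift = rot % 8;
    let mut out = [0u8; 8];
    for k in 0..8 {
        // Split byte k into its top `bit_shift` bits (hi) and the rest (lo).
        let (hi, lo) = if bit_shift == 0 {
            (0u8, lane[k])
        } else {
            (lane[k] >> (8 - bit_shift), lane[k] & (0xffu8 >> bit_shift))
        };
        // lo lands in the upper bits of the target byte, hi spills into the next one.
        out[(k + byte_shift) % 8] |= lo << bit_shift;
        out[(k + byte_shift + 1) % 8] |= hi;
    }
    out
}
```

In the general path every byte contributes two pieces that would each need a range check; the zero-rotation path contributes none, which is the saving the PR targets.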

Change Highlights

  • ceno_zkvm
    • KeccakWitCols::c_aux reduced from 200 to 160 (drop stored j=0 prefix copy).
    • Theta prefix-XOR constraints updated to derive predecessor from state8[0][i] for j=1.
    • Witness generation updated to mirror new c_aux layout.
    • rotation_witness reduced from 196 to 192.
    • RhoPi rot == 0 lane constrained by direct byte equality (skip split witness/range checks for that lane).
    • RANGE_LOOKUPS_PER_ROUND reduced from 290 to 286.
    • Keccak lookup multiplicity recording kept consistent with the new layout.
    • Commit: 0280726b.
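
The width changes above can be checked with simple arithmetic. The sketch below assumes `c_aux` is laid out as 5 columns x stored stages x 8 bytes per lane, which matches the stated totals of 200 and 160; the constant and function names are illustrative and not taken from the codebase. The 196/192 and 290/286 figures are copied from the list above without further decomposition.

```rust
// Illustrative names only; the breakdown of c_aux is an assumption.
const LANE_BYTES: usize = 8;
const COLUMNS: usize = 5;

/// Width of c_aux for a given number of stored prefix stages.
const fn c_aux_width(stages: usize) -> usize {
    COLUMNS * stages * LANE_BYTES
}

const C_AUX_BEFORE: usize = c_aux_width(5); // stages j = 0..4
const C_AUX_AFTER: usize = c_aux_width(4); // stages j = 1..4

// Values quoted from the change list.
const ROTATION_WITNESS_BEFORE: usize = 196;
const ROTATION_WITNESS_AFTER: usize = 192;
const RANGE_LOOKUPS_BEFORE: usize = 290;
const RANGE_LOOKUPS_AFTER: usize = 286;

/// Columns saved by dropping the j = 0 stage.
const fn c_aux_saved() -> usize {
    C_AUX_BEFORE - C_AUX_AFTER
}

/// Rotation witnesses and range lookups saved per round by the rot == 0 special case.
const fn rot0_saved() -> (usize, usize) {
    (
        ROTATION_WITNESS_BEFORE - ROTATION_WITNESS_AFTER,
        RANGE_LOOKUPS_BEFORE - RANGE_LOOKUPS_AFTER,
    )
}
```

Under these assumptions, dropping one stage saves 40 columns, and the zero-rotation special case removes the same number of rotation witnesses as range lookups.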

Benchmark / Performance Impact

This PR is performance-oriented (column/lookup reduction). End-to-end benchmark numbers are not included yet in this PR description.

Testing

cargo check -p ceno_zkvm --lib
RUST_MIN_STACK=67108864 cargo test -p ceno_zkvm lookup_keccakf::tests::test_keccakf -- --nocapture

Copilot Reviewer Directive (keep this section)

When Copilot reviews this PR, apply .github/copilot-instructions.md strictly.

@hero78119 hero78119 requested a review from Copilot April 15, 2026 09:59
Reduce Keccak witness width and range lookups without changing semantics.

- Trim c_aux storage from 5 prefix stages to 4 (drop j=0 copy stage),
  reducing committed columns by 40 and removing redundant init-equality constraints.
- Special-case RHOPI lanes with rotation=0 to direct byte equality, skipping
  split witnesses/range checks for that lane.
- Reduce rotation_witness width from 196 to 192 and update
  RANGE_LOOKUPS_PER_ROUND from 290 to 286 accordingly.
- Keep lookup multiplicity accounting and Keccak tests consistent with the
  new layout.

Copilot AI left a comment

Pull request overview

Performance-focused cleanup of the Keccak-f lookup-based precompile circuit, reducing redundant witness materialization and a small amount of per-round range-check/rotation-witness work while keeping the same lookup-based constraint model.

Changes:

  • Shrinks c_aux witness materialization by dropping the j=0 prefix stage and updating Theta prefix-XOR constraints/witness generation accordingly.
  • Skips split-rotation witnesses/range checks for the single rot == 0 RhoPi lane by constraining direct byte equality instead.
  • Updates per-round accounting constants and refactors Chi byte constraints / multiplicity recording via small helper functions.

Comment thread ceno_zkvm/src/precompiles/lookup_keccakf.rs
3 participants