Background
Currently, the BN128Addition, BN128Multiplication, and BN128Pairing precompiled contracts suffer from performance bottlenecks, leading to transaction timeouts and limiting the application of zero-knowledge proofs on the TRON network, as shown in #4311 and #5492.
The zkBob team has proposed two optimization approaches:
- The first one, described in #5507, wraps the arkworks cryptographic library via JNI. It offers significantly higher performance, achieving a 3x to 30x speedup for BN128 precompiled contracts.
- The second one, described in #5611, uses Montgomery optimizations. This results in a 17% performance degradation for BN128Addition, but offers around 30% improvement for BN128Multiplication and BN128Pairing.
To optimize BN128 precompiled contract performance, Besu supports two implementations:
- A pure Java implementation (besu-java);
- A gnark-crypto wrapper using JNA (besu-gnark).
We conducted a comprehensive benchmark of these implementations:
Test results on macOS (2.6 GHz 6-Core Intel Core i7, 16GB RAM), in microseconds (μs):
| | BN128Addition | BN128Multiplication | BN128Pairing (2 pairs) |
|---|---|---|---|
| java-tron | 32.718 | 2584.739 | 62751.503 |
| arkworks | 12.971 | 97.659 | 1792.001 |
| Montgomery | 41.512 | 1731.617 | 41900.001 |
| besu-java | 6.5 | 14163.3 | 1295543.0 |
| besu-gnark | 4.1 | 55.5 | 654.2 |
On Ubuntu 22.04.5 LTS, AMD x86_64, 16 cores, 32GB RAM, the benchmark results (μs):
| | BN128Addition | BN128Multiplication | BN128Pairing (2 pairs) |
|---|---|---|---|
| java-tron | 23.654 | 1896.336 | 46889.459 |
| arkworks | 5.154 | 83.243 | 1646.457 |
| Montgomery | 29.903 | 1543.336 | 35772.447 |
| besu-java | 2.5 | 10413.6 | 796687.1 |
| besu-gnark | 2.4 | 53.5 | 655.9 |
Taking the Ubuntu results as an example, we can see that integrating arkworks significantly improves the performance of BN128 precompiled contracts on java-tron:
- BN128Addition: ~4.6× speedup
- BN128Multiplication: ~23× speedup
- BN128Pairing: ~28× speedup
Moreover, java-tron’s performance is far behind the besu-gnark implementation:
- BN128Addition: ~10× slower
- BN128Multiplication: ~35× slower
- BN128Pairing: ~71× slower
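The per-operation ratios can be recomputed directly from the Ubuntu table. The snippet below is a minimal sketch for doing so; the dictionary layout and the `speedup_over` helper are illustrative names, not part of java-tron or Besu, and the numbers are copied from the table above.

```python
# Mean execution times in microseconds, copied from the Ubuntu table above.
OPS = ("BN128Addition", "BN128Multiplication", "BN128Pairing")
UBUNTU_US = {
    "java-tron":  dict(zip(OPS, (23.654, 1896.336, 46889.459))),
    "arkworks":   dict(zip(OPS, (5.154, 83.243, 1646.457))),
    "Montgomery": dict(zip(OPS, (29.903, 1543.336, 35772.447))),
    "besu-java":  dict(zip(OPS, (2.5, 10413.6, 796687.1))),
    "besu-gnark": dict(zip(OPS, (2.4, 53.5, 655.9))),
}


def speedup_over(baseline, candidate, table=UBUNTU_US):
    """Return, per operation, how many times faster `candidate` is than `baseline`.

    A value below 1.0 means the candidate is slower than the baseline.
    """
    return {op: table[baseline][op] / table[candidate][op] for op in OPS}


if __name__ == "__main__":
    for impl in ("arkworks", "Montgomery", "besu-java", "besu-gnark"):
        ratios = speedup_over("java-tron", impl)
        print(impl, {op: round(r, 2) for op, r in ratios.items()})
```

Values below 1.0 in the output flag regressions, which makes the trade-offs of the Montgomery and besu-java variants visible at a glance.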
Taking this Groth16 contract example, which involves 6 BN128Addition and 5 BN128Multiplication operations plus a BN128Pairing check over 4 pairs (estimated below as 2× the 2-pair benchmark), the Ubuntu figures give:
- java-tron execution time: 6×23.654 + 5×1896.336 + 2×46889.459 ≈ 103402.52 μs
- arkworks execution time: 6×5.154 + 5×83.243 + 2×1646.457 ≈ 3740.05 μs (27.6× faster)
- besu-gnark execution time: 6×2.4 + 5×53.5 + 2×655.9 ≈ 1593.70 μs (65× faster)
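The estimate above is a weighted sum of per-call benchmark times. As a rough sketch, assuming the Ubuntu figures and the same operation counts (6 additions, 5 multiplications, 2 units of the 2-pair pairing benchmark), it can be reproduced as follows; `estimate_us` and the constant names are hypothetical helpers introduced only for this illustration.

```python
# Per-call mean times in microseconds from the Ubuntu table:
# (BN128Addition, BN128Multiplication, BN128Pairing with 2 pairs).
TIMES_US = {
    "java-tron":  (23.654, 1896.336, 46889.459),
    "arkworks":   (5.154, 83.243, 1646.457),
    "besu-gnark": (2.4, 53.5, 655.9),
}

# Operation counts used in the estimate: 6 additions, 5 multiplications,
# and 2 units of the 2-pair pairing benchmark.
COUNTS = (6, 5, 2)


def estimate_us(impl):
    """Estimated precompile time for the example verifier, in microseconds."""
    return sum(count * t for count, t in zip(COUNTS, TIMES_US[impl]))


if __name__ == "__main__":
    baseline = estimate_us("java-tron")
    for impl in TIMES_US:
        total = estimate_us(impl)
        print(f"{impl}: {total:.2f} us ({baseline / total:.1f}x vs java-tron)")
```

This only sums precompile costs; it ignores call overhead and the rest of the contract's execution, so it is best read as a lower bound on the time spent per verification.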
In addition to BN128 precompiled contracts, Besu has also optimized EcRecover and Secp256k1 signature verification. The besu-java implementation uses the BouncyCastle library, while besu-native wraps bitcoin-core/secp256k1 via JNA. We benchmarked and compared their performance as well:
Test results on macOS (2.6 GHz 6-Core Intel Core i7, 16GB RAM), in μs:
| | EcRecover | Secp256k1 | ModExp |
|---|---|---|---|
| java-tron | 1956.975 | 1218.343 | 58.418 |
| besu-java | 959.4 | 891.7 | 21.222 |
| besu-native | 50.6 | 45.3 | 34.585 |
Test results on Ubuntu 22.04.5 LTS, AMD x86_64, 16 cores, 32GB RAM, in μs:
| | EcRecover | Secp256k1 | ModExp |
|---|---|---|---|
| java-tron | 916.184 | 875.647 | 20.168 |
| besu-java | 679.2 | 670.2 | 16.428 |
| besu-native | 50.4 | 47.3 | 34.165 |
From the Ubuntu benchmark results, we can see that besu-native achieves ~18× speedup over java-tron in both EcRecover and Secp256k1 signature verification. For ModExp, besu-native shows no gain and is in fact slower than both java-tron and besu-java (34.165 μs vs. 20.168 μs and 16.428 μs), possibly due to limited test coverage.
From the above benchmarks, it is clear that optimizing the performance of the BN128 precompiled contracts and other cryptographic algorithms is necessary. The besu-native implementation serves as a strong reference, as it also includes optimized support for Blake2bf and BLS12-381, which can help improve the scalability of the TRON network.
Rationale
Why should this feature exist?
- Optimize the performance of BN128 precompiled contracts to avoid transaction timeouts and promote ZK-based applications on the TRON network;
- Accelerate EcRecover and Secp256k1 signature verification to improve TRON’s scalability;
- Optimize additional cryptographic primitives such as ModExp, Blake2bf, and BLS12-381.
What are the use-cases?
- ZK application developers;
- Reducing signature verification time for all TRON transactions;
Specification
- Replace the current java-tron BN128 precompiled contract implementation with the optimized version;
- Replace the current java-tron implementation of EcRecover and Secp256k1 with optimized versions;
- Apply similar performance improvements to other cryptographic primitives.
Test Specification
- Benchmark execution speed of the optimized BN128 precompiled contract and compare it to the current implementation;
- Benchmark EcRecover and Secp256k1 verification performance before and after optimization;
- Measure execution time of ZK-related transactions before and after optimization;
- Measure execution time of regular transactions before and after optimization.
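The before/after measurements listed above share one shape: warm up, time many calls, and report the mean in μs for both implementations on identical input. The sketch below illustrates that shape only. It is written in Python for brevity, whereas a real java-tron benchmark would presumably run on the JVM. `mean_time_us`, `compare`, and `modexp_naive` are hypothetical names, and the modular-exponentiation pair is just a stand-in for a "current" and an "optimized" implementation.

```python
import statistics
import time


def mean_time_us(fn, *args, warmup=50, iterations=500):
    """Return the mean wall-clock time of fn(*args) in microseconds."""
    for _ in range(warmup):              # discard the first, unrepresentative runs
        fn(*args)
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn(*args)
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return statistics.mean(samples)


def compare(baseline, candidate, *args, **timing):
    """Time both implementations on the same input and report the speedup."""
    if baseline(*args) != candidate(*args):
        raise ValueError("implementations disagree; timing would be meaningless")
    before = mean_time_us(baseline, *args, **timing)
    after = mean_time_us(candidate, *args, **timing)
    speedup = before / after if after else float("inf")
    return {"before_us": before, "after_us": after, "speedup": speedup}


def modexp_naive(base, exp, mod):
    """Stand-in 'current' implementation: plain square-and-multiply."""
    result, base = 1 % mod, base % mod
    while exp:
        if exp & 1:
            result = result * base % mod
        base = base * base % mod
        exp >>= 1
    return result


if __name__ == "__main__":
    # Stand-in 'optimized' implementation: Python's built-in three-argument pow.
    print(compare(modexp_naive, pow, 0xC0FFEE, 2**255 - 19, 2**256 - 189))
```

Checking that both implementations return the same result before timing them keeps a fast but wrong candidate from looking like an improvement.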
Scope Of Impact
- BN128 and EcRecover precompiled contracts;
- Reduced transaction processing time by accelerating signature verification;
- Potential gas cost adjustments for affected precompiled contracts;
- Performance improvements must preserve consensus correctness to avoid network forks, requiring proposal-level changes;
- As signature verification is a major performance bottleneck in block production, improvements should be evaluated in the context of block time scheduling.
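One way to check that a faster implementation preserves consensus correctness is differential testing: feed identical inputs, including edge cases, to the reference and the candidate and flag any difference in output. The sketch below illustrates the idea on BN128 point addition with two independent pure-Python implementations (affine and Jacobian formulas). All function names are hypothetical, inputs are assumed to be reduced points on the curve, and a real check would instead drive the current java-tron code and the native library with the same byte inputs.

```python
# alt_bn128 (BN128) base field modulus; the curve is y^2 = x^3 + 3 (EIP-196).
P = 21888242871839275222246405745257275088696311157297823662689037894645226208583
INF = (0, 0)        # EIP-196 encodes the point at infinity as (0, 0)
G = (1, 2)          # generator
NEG_G = (1, P - 2)  # its inverse


def is_on_curve(pt):
    x, y = pt
    return pt == INF or (y * y - x * x * x - 3) % P == 0


def add_affine(p1, p2):
    """Reference: textbook affine addition (pow(x, -1, P) needs Python 3.8+)."""
    if p1 == INF:
        return p2
    if p2 == INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF                                   # P + (-P)
    if x1 == x2:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P   # tangent slope (doubling)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def add_jacobian(p1, p2):
    """Candidate: Jacobian formulas (inputs with Z = 1), one inversion at the end."""
    if p1 == INF:
        return p2
    if p2 == INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF
    if x1 == x2:                                     # doubling, curve parameter a = 0
        s, m = 4 * x1 * y1 * y1, 3 * x1 * x1
        X3 = (m * m - 2 * s) % P
        Y3 = (m * (s - X3) - 8 * pow(y1, 4, P)) % P
        Z3 = 2 * y1 % P
    else:                                            # addition of distinct points
        h, r = x2 - x1, 2 * (y2 - y1)
        i = 4 * h * h
        j, v = h * i, x1 * i
        X3 = (r * r - j - 2 * v) % P
        Y3 = (r * (v - X3) - 2 * y1 * j) % P
        Z3 = 2 * h % P
    zinv = pow(Z3, -1, P)
    zinv2 = zinv * zinv % P
    return (X3 * zinv2 % P, Y3 * zinv2 * zinv % P)   # back to affine (X/Z^2, Y/Z^3)


def find_mismatches(reference, candidate, cases):
    """Return every input pair on which the two implementations disagree."""
    return [case for case in cases if reference(*case) != candidate(*case)]


if __name__ == "__main__":
    two_g = add_affine(G, G)
    points = [INF, G, NEG_G, two_g, add_affine(two_g, G)]
    cases = [(a, b) for a in points for b in points]
    print("mismatches:", find_mismatches(add_affine, add_jacobian, cases))
```

The case list deliberately includes the point at infinity, doubling, and a point plus its inverse, since these special cases are where independent implementations are most likely to diverge.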
Implementation
Do you have ideas regarding the implementation of this feature?
Currently, the optimization of BN128 precompiled contracts mainly considers two implementations: arkworks and besu-gnark. The arkworks-based version has already been developed by the zkBob team (see #5507), but the benchmarks above show that besu-gnark performs better still (roughly 1.6× to 2.5× faster than arkworks on Ubuntu). Since besu-native relies on JDK 21, we also need to evaluate how to maintain compatibility with JDK 8.
A thorough discussion is still required to evaluate the advantages and potential risks of each solution before choosing the final optimization path. For EcRecover and Secp256k1, it should also be discussed within the community whether to adopt the optimization strategy used in besu-native.
Are you willing to implement this feature?
Y