Native Nova SHA256 bench
Context: See https://hackmd.io/0gVClQ9IQiSXHYAK0Up9hg?view= for previous Nova benchmarks, with more of a focus on recursion.
Hardware: Macbook Pro M1 Max (2021), 64GB memory.
Native Nova SHA256 benchmark with varying preimage.
Code: https://github.com/srinathsetty/Nova/blob/main/benches/sha256.rs
Size |
Constraints |
Time |
64B |
55k |
53ms |
128B |
82k |
59ms |
256B |
135k |
77ms |
512B |
242k |
106ms |
1KB |
456k |
182ms |
2KB |
883k |
320ms |
4KB |
1.7m |
521ms |
8KB |
3.4m |
1s |
16KB |
6.8m |
1.9s |
32KB |
13.7m |
4.1s |
64KB |
27.4m |
7.7s |
- Using native Nova and Bellperson SHA256, not via Nova Scotia
- Size is preimage for sha256, not proof size
- Constraints is per step (primary circuit), secondarty circuit is constant ~10k constraints
- A single fold, no recursion
Comparing with Celer Network Benchmarks:
Comparing with Nova Scotia SHA256:
- See https://hackmd.io/0gVClQ9IQiSXHYAK0Up9hg?view=
- Multiple SHA256 hashes (p=100), also single fold (k=1)
- Written in Circom
- 2.9m constraints
- 635ms for prove step (base case)
- If done recursively (e.g. 10 or 100 folds) then each recursive proof step is ~2s, x5 base case
- appears to expand, from 1.3s to 3.1s - leak?
Comparing Halo2 SHA256 benchmarks:
- In varying preimage size case 8KB takes ~100s
- 8KB SHA256 takes ~3m constraints in R1CS
- For recursive hashing case ~3m constraints gives you 100 hashes in one fold
- With 100 hashes using lookup tables Halo2 takes ~1.6s
- I.e. a ~100x difference, which explains disparity
Tldr (tentative):
- Can reproduce Nova native benchmarks x100 faster than Plonky/Halo2 and on par with Starky
- For same number of constraints and no recursion prove step same prover speed for Nova Scotia and Nova
- Recursion overhead ~x5 more than base case in each step
- Lookup tables seems to have made Halo2 KZG x100 diff
- Using Circom toolchain takes long time to get many constraints (>3m) into circuit
- Doing 100 hashes in a circuit vs varying preimage size different problems
Things to follow up / answer:
- Why and when would we not stick as much as possible into a single fold?
- Reproduce preimage hash bench with Nova Scotia and e.g. rapidsnark toolchain
- What would a better recursive benchmark look like that can't easily be solved with e.g. lookups?