# Native Nova SHA256 bench

*Context: See https://hackmd.io/0gVClQ9IQiSXHYAK0Up9hg?view= for previous Nova benchmarks, with more of a focus on recursion.*

Hardware: MacBook Pro M1 Max (2021), 64GB memory.

Native Nova SHA256 benchmark with varying preimage size. Code: https://github.com/srinathsetty/Nova/blob/main/benches/sha256.rs

| Size | Constraints | Time  |
| ---- | ----------- | ----- |
| 64B  | 55k         | 53ms  |
| 128B | 82k         | 59ms  |
| 256B | 135k        | 77ms  |
| 512B | 242k        | 106ms |
| 1KB  | 456k        | 182ms |
| 2KB  | 883k        | 320ms |
| 4KB  | 1.7m        | 521ms |
| 8KB  | 3.4m        | 1s    |
| 16KB | 6.8m        | 1.9s  |
| 32KB | 13.7m       | 4.1s  |
| 64KB | 27.4m       | 7.7s  |

## Comment

- Uses native Nova and Bellperson SHA256, not via Nova Scotia
- Size is the SHA256 preimage size, not the proof size
- Constraints are per step (primary circuit); the secondary circuit is a constant ~10k constraints
- A single fold, no recursion

**Comparing with Celer Network Benchmarks:**

- https://github.com/celer-network/zk-benchmark
- Hardware: also a MacBook Pro, but M1 (16GB memory) vs M1 Max
- Starky also takes ~8s for a 64KB preimage (same as Nova)
- Halo2 and Plonky take ~100s for an ~8KB preimage (Nova ~100x faster)

**Comparing with Nova Scotia SHA256:**

- See https://hackmd.io/0gVClQ9IQiSXHYAK0Up9hg?view=
- Multiple SHA256 hashes (p=100), also a single fold (k=1)
- Written in Circom
- 2.9m constraints
- 635ms for the prove step (base case)
- If done recursively (e.g. 10 or 100 folds), each recursive proof step takes ~2s, ~5x the base case; step time also appears to grow from 1.3s to 3.1s over the run - leak?

**Comparing Halo2 SHA256 benchmarks:**

- In the varying-preimage-size case, 8KB takes ~100s
- An 8KB SHA256 preimage takes ~3m R1CS constraints
- In the recursive hashing case, ~3m constraints gives you 100 hashes in one fold
- With 100 hashes using lookup tables, Halo2 takes ~1.6s
- I.e. a ~100x difference, which explains the disparity

**Tldr (tentative):**

- Can reproduce the native Nova benchmarks: ~100x faster than Plonky/Halo2 and on par with Starky
- For the same number of constraints and no recursion, the prove step has the same prover speed for Nova Scotia and native Nova
- Recursion overhead is ~5x the base case for each step
- Lookup tables seem to account for the ~100x difference for Halo2 KZG
- The Circom toolchain takes a long time to get many constraints (>3m) into a circuit
- Doing 100 hashes in a circuit and hashing a single preimage of varying size are different problems

**Things to follow up / answer:**

1) Why and when would we _not_ stick as much as possible into a single fold?
2) Reproduce the preimage-hash bench with Nova Scotia and e.g. the rapidsnark toolchain
3) What would a better recursive benchmark look like, one that can't easily be solved with e.g. lookups?
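
**Appendix: scaling sanity check (sketch):**

A minimal standalone Rust snippet using only the numbers copied from the table above; it is not part of the Nova bench and computes nothing new, just the derived ratios (constraints per preimage byte, prover time per million constraints, and hashing throughput):

```rust
fn main() {
    // (preimage size in bytes, primary-circuit constraints, prover time in ms),
    // copied from the table above.
    let rows: &[(u64, f64, f64)] = &[
        (64, 55e3, 53.0),
        (128, 82e3, 59.0),
        (256, 135e3, 77.0),
        (512, 242e3, 106.0),
        (1 << 10, 456e3, 182.0),
        (1 << 11, 883e3, 320.0),
        (1 << 12, 1.7e6, 521.0),
        (1 << 13, 3.4e6, 1_000.0),
        (1 << 14, 6.8e6, 1_900.0),
        (1 << 15, 13.7e6, 4_100.0),
        (1 << 16, 27.4e6, 7_700.0),
    ];
    println!(
        "{:>8} {:>15} {:>15} {:>8}",
        "size(B)", "constraints/B", "ms per 1M", "KB/s"
    );
    for &(size, constraints, ms) in rows {
        let cpb = constraints / size as f64;          // constraints per preimage byte
        let ms_per_m = ms / (constraints / 1e6);      // prover ms per million constraints
        let kb_per_s = (size as f64 / 1024.0) / (ms / 1000.0); // hashing throughput
        println!("{:>8} {:>15.0} {:>15.0} {:>8.2}", size, cpb, ms_per_m, kb_per_s);
    }
}
```

Derived from the table: constraints per byte drop from ~860 at 64B to ~420 at 64KB, and prover throughput rises from ~1.2KB/s to ~8KB/s, so the per-step fixed cost matters mostly at the small sizes.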