SUREEL zkVM weekly syncup notes
2024-12-10
PSE
- debug tool
- (WIP) remove hardcoded position 0 lk multiplicity check PR 649
- prover
- (WIP) skip PCS commitment/opening for structural columns, issue 654
- building on the issue 654 work, we can add a new task to skip commitment of the range table
- optimize opcode circuit with 32-bit range check proposal in #702
- refactor circuit to use 32 bit range check
- Spartan SPARK is implemented in draft PR 713
- currently on hold; the main obstacle is figuring out a way to do the sparse fraction sum
- sumcheck optimisation
- ideas: optimize virtualpoly and extract identity polynomial
Miscs
- Scroll set a timeline to publish the Ceno research paper/blog on 2025/01/07, including timeline and benchmarks.
- Meetup with Axiom on Thursday, 3pm UTC
2024-12-03
PSE
Ceno
- circuit optimisation
- (Reviewing) unify MUL opcode to field arithmetics PR 660
- debug tool
- (WIP) remove hardcoded position 0 lk multiplicity check PR 649
- (Reviewing) merge slt/sltu PR 659
- proving system
- (WIP) enhancement: skip structural witin commitment & PCS PR 654
- (WIP) mpcs opening optimisation: refactor to share the same sumcheck implementation & code cleanup PR 653
- will extract to smaller PR for review
- found another issue: performance on main proof regressed after recent PR 671
- benchmark super issue
- precompile DSL approach: frontend & backend design WIP document
- Ming will work on opcode circuit approach (next priority after circuit optimisation)
- parallel proving design via shardings
- GPU proving
Miscs
- Connecting with cores team on tg as a follow-up to "Starks - aggregating hash based signatures"
- asking what the hash-based signature candidates are
- Axiom is connecting with us to look for potential collaboration.
- Precon retro, happened at 0am UTC+8, Dec 4
2024-11-26
takeaways & idea & suggestion after pre-con
Ming
- updated roadmap accordingly, focus on EL & CL benchmark
- a few new ideas, e.g. zkVM-based precompiles, Lasso succinct structural table evaluation in more places.
Roadmap summary (from Ming):
we probably focus on
- zkVM (prover) performance
idea justification: no zkVM vendor has caught up to the performance needed for Ethereum EL/CL usage yet. Ceno (SOTA) is around 250 kHz; our target is > 10 MHz (zkVM-proved opcodes per second), around 100x boost
- L1 targeted benchmarks
focus on the EL/CL tasks, and optimize around these two tasks
- precompile/co-processor for zkVM
as cryptographic primitive acceleration
With proven ideas from other zkVM, e.g.
L1 benchmarks
- EL: revm e2e
- CL: hash based signature scheme on zkVM
build a connection with Kev's team from CL, and learn the hash-based signature candidates
precompiles
- build precompile framework
=> preliminary idea: to build precompile from zkVM opcode. Discussing idea with Scroll.
- take EL precompiles and CL hash based signature as first priority target
zkVM researches & exploration
- sumcheck algorithm optimisation
- hardware optimisation: AVX/Cuda
- binary field domain knowhow
- audit & fv
- Scroll roadmap on Ceno
- prepare for announcement, working on more benchmarks
- private I/O integration test
- MPCS: benchmark WHIR as a replacement/enhancement of basefold
- recursive verifier SNARK
Miscs
Ceno task WIP
- optimise circuit with less witness (Kimi)
- mock-prover support padding check (Soham)
- discussed pre-compile and proving system optimisation (Ming)
With Scroll Community Calls
- Every Tuesday: Strategy meeting
- Every Thursday: Weekly progress meeting including all developers (CET 10am)
- Scroll Slack: PSE are invited as guests
.. 2024-11-05/12/19 skipped due to PRECON/DEVCON
2024-10-29
PSE
- (completed) opcode development: slt/slti/srai
- (wip) sltiu
- (under review) modular memory/public i/o design PR 457
- experiment: optimise sumcheck protocol following the PolygonZero publication
- (completed) load/store word load/store
- (Doing) ELF program load into memory & e2e test
2024-10-22
2024-10-15
Milestone 1
- We might expand the milestone benchmark scope to cover more than Fibonacci
- because with pure Fibonacci people might think it's a kind of cheat, and that we may only outperform on this specific workload.
- RV32IM is around 38 opcodes in full; with that we can cover more benchmarks like
- rsa
- regex
- is-prime
- ssz-withdrawal
- tendermint light client
PSE
opcodes
- (reviewing) logical I-type opcodes are under review
- (Done) mock-prover error dedup
- (Done) soundness: public input fix
- (Ongoing) few more opcode tasks: SLLI, SRAI, SLTIU, SLTI, MULH
Protocol
2024-10-08
Milestone 1
- opcodes 15/24, 6 under review
- benchmark:
- on cpu, 2^20 instance e2e 2.10s, vs SP1 11s (until 2024/07)
we can further improve after resolving this issue via single limb Issue 285
many optimisations are planned as follow-ups to milestone 1
PSE
- opcodes development:
- unittest enhancement
- (Done) assertion on register rd value PR 301
- (Done) customized debug expression probing PR 306
- Doing Divu/SLI soundness fixing
- (Done) ecall-halt PR 258
- also support public input
…
Misc
share benchmark result with sumcheck-gpu team as a GPU optimisation anchor
2024-10-01 (skip meeting, combining with next week)
PSE
- (Done) improve CI testing
2024-09-24
PSE
Misc
- (WIP, low priority) refactor uint for better expression conversion PR 264
- performance: 2^20 add instance: 3.6s -> 2.6s
- sumcheck protocol improvement, e.g polygon zero PIOP paper stuff
- cpu optimization: avx2/avx512 on Goldilocks, e.g. > 4x improvement
- binary field arithmetics/PCS
- implementation xxx
2024-09-17
PSE
Ceno
- (Done) ci integration pipeline PR 209
- (Done) mock prover PR 206
- (Done) mul opcode generalization generalized MUL OP
- (Doing) SRLI PR
- (Doing) memory/cpu consistent check Issue 126
enhance sumcheck to align and run on different num_variables
- (Doing) mul opcode PR 98
- MPCS: 2.1s (goal 1.5s)
- (Done) program table & opcode lookup
- Integrate MPCS to proving flow
Misc
2024-09-10
PSE
Ceno
- (Done) Mock Prover error print PR 182
- (Done) add CI target as metrics PR 195
- (Done) witness assignment interface PR 187
- (Reviewing) Lt Util PR 183
- (Reviewing) Lock-free thread-safe logup multiplicity witness counting PR 198
- (Doing) generalized MUL OP
- (TODO) opcode implementation (mul, addi, srli)
- (TODO) MockProver cache table data and load once
- it might be more urgent as now per run, we load > 5 tables, and each with size 2^16. It slows down CI
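
For readers unfamiliar with the "lock-free thread-safe logup multiplicity witness counting" item above, here is a minimal sketch of the general idea, not the code from PR 198. It assumes one atomic counter per lookup-table row; the names `Multiplicity`, `record`, and `count_parallel` are hypothetical and chosen only for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical multiplicity table: one atomic counter per lookup-table row.
pub struct Multiplicity {
    counts: Vec<AtomicU64>,
}

impl Multiplicity {
    pub fn new(table_size: usize) -> Self {
        Self {
            counts: (0..table_size).map(|_| AtomicU64::new(0)).collect(),
        }
    }

    /// Record one lookup of `row`. Takes `&self`, so many threads can
    /// share the table without a mutex.
    pub fn record(&self, row: usize) {
        self.counts[row].fetch_add(1, Ordering::Relaxed);
    }

    /// Consume the table and return plain counts for witness assignment.
    pub fn into_vec(self) -> Vec<u64> {
        self.counts.into_iter().map(AtomicU64::into_inner).collect()
    }
}

/// Count lookups from several witness chunks, one thread per chunk.
pub fn count_parallel(table_size: usize, chunks: &[Vec<usize>]) -> Vec<u64> {
    let m = Multiplicity::new(table_size);
    thread::scope(|s| {
        for chunk in chunks {
            let m = &m;
            s.spawn(move || {
                for &row in chunk {
                    m.record(row);
                }
            });
        }
    });
    // All threads have joined when the scope returns, so the counts are final.
    m.into_vec()
}
```

Relaxed ordering is enough in this sketch because only the final totals matter, and they are read after every thread has joined.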
Miscs
- lagging behind on M1 progress https://hackmd.io/@ceno-zkvm/ryDWX5_5R
- because we are still consolidating the overall proving system.
- improve reviewing speed & quality
- speed up opcode development
- Ethereum granted project for FV on zk(E)VM https://verified-zkevm.org/
- Ali will reach out to them and seek collaboration
- GPU sumcheck collaboration: Sowoon + Dohoon + Scroll => tg group
- Benchmark result: 5x faster than SP1 on the Fibonacci task.
- MPCS: 60 poly commitments => 2 s
- create proof of Add opcode 2^20 => 1s
- Add opcode 16 poly commits
- Cost: MPCS proof 10X, 8Mb
- opt1: optimise codebase/ mechanism
- opt2: recursion/aggregation, so we can compress the proof to a smaller size
2024-09-03
Ceno
PSE
- (Done) PR fix sumcheck degree & monomial form dedup issue => bug captured PR 169
- [PR] overflow handling
- discussion thread
- due to usage of wrapping_add/sub/mul/div in revm
- in summary:
- by default, disable the compiler's overflow check on all instructions
- support overflow as external assignment.
- rely on the respective assembly of Rust's wrapping_XXX to deal with the overflow check
- (Done) UInt refactor
- (Done) Mock Prover
- (Doing) LT/GT Gadget https://github.com/scroll-tech/ceno/issues/167
TODOs & Doing
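
As a reminder of what the overflow-handling discussion above refers to, the sketch below shows the standard Rust integer methods involved. The function names are hypothetical, and `add_relation_holds` is only an assumed illustration of how an externally assigned overflow bit could be tied back to the wrapped result; it is not taken from the PR.

```rust
/// 32-bit add that wraps modulo 2^32, as `wrapping_add` does in Rust.
pub fn add_wrapping(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}

/// Same add, also returning the carry-out as an explicit overflow flag.
pub fn add_with_overflow(a: u32, b: u32) -> (u32, bool) {
    a.overflowing_add(b)
}

/// Assumed illustration: with the overflow bit supplied externally,
/// the relation a + b = result + overflow * 2^32 holds over the integers.
pub fn add_relation_holds(a: u32, b: u32, result: u32, overflow: bool) -> bool {
    a as u64 + b as u64 == result as u64 + ((overflow as u64) << 32)
}
```

With wrapping semantics an overflowing add is a valid execution rather than an error, which is why the check can be off by default and the overflow bit treated as an optional extra value.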
2024-08-27
Ceno
PSE
- PR: fix verifier failure when lk_expression.len() > r/w_expression.len()
- will also eliminate a potential soundness issue by not trusting the prover's proof
- Change default to RV32 https://github.com/scroll-tech/ceno/pull/166
RV32 toolchain is more mature
aligns benchmark with SP1 and other zkVMs
experiment with RV64 later
- Mock Prover PR
- Ming has done the review; WIP on adding degree > 1 assert_zero_expression multiplication.
- follow-up tasks: modify addsub opcode unittest to use MockProver
- replace random generated witness with real data to pass the unittest
- keep prove/verify flow in benchmark/example
- UInt refactor PR
- review done by Ming
- suggestion: cherry-pick commits and exclude commits already in master branch
- range table circuit https://github.com/scroll-tech/ceno/pull/154
- Interpreter implementation
- PCS integration: Basefold + Plonky2-FRI optimisation
2 explorations
- FRI-Binius => benchmark result not good, pending new research / implementation polish
- Basefold + Plonky2
- From Snarkify: control flow opcode implementation
Misc
- a one-on-one scheduled around this Thur/Fri for sharing peer review/self evaluation result :)
2024-08-20
Ceno
- Performance: removing unnecessary to_vec() clones improved latency from 600ms -> 380ms => 2.7 MHz zkVM prover.
- Project Milestone [dashboard](https://github.com/orgs/scroll-tech/projects/15)
- Util lib development
- MockProver: lookup expression assertion checks are done, while others are WIP
- UInt utility: under review
- Super Issue and TODOs review
2024-08-13
Ceno
- Up-to-date result: 2^20 instance run from 1.04s -> 600ms on 16 phy-cores 64GB, achieving a 1.5 MHz prover
- Ongoing tasks
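
The prover-frequency figures quoted in these notes come from dividing the number of proved instances by wall-clock time. A minimal sketch of that arithmetic, with hypothetical helper names:

```rust
/// Prover throughput in Hz: proved instances per second of wall-clock time.
pub fn prover_hz(instances: u64, seconds: f64) -> f64 {
    instances as f64 / seconds
}

/// Speedup factor when the same workload drops from `old_seconds` to `new_seconds`.
pub fn speedup(old_seconds: f64, new_seconds: f64) -> f64 {
    old_seconds / new_seconds
}
```

For 2^20 instances this gives about 1.0 MHz at 1.04 s and about 1.75 MHz at 600 ms, so the figures recorded in the notes look like conservative roundings of the raw quotient.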
2024-08-06
Ceno
- zkVM v2 implementation
- framework almost done
- benchmark
- Up-to-date result: 2^20 instance run in 1.04s on 16 phy-cores 64GB, achieving a 1 MHz prover (should be 10x faster than SP1, > 12x faster than Jolt).
- raised a super issue to track sub-tasks https://github.com/scroll-tech/ceno/issues/95
- high priorities sub-tasks
- implement multi-opcode support => blocking other opcode implementation
- benchmark: devirgo sumcheck optimization
- (Kimi) Refactor UInt gadget and use expression system https://github.com/scroll-tech/ceno/issues/103
- add MockProver: improve opcode debugging ability
- Research
- [Scroll] plan to benchmark binius PCS + GKR in the following 2 weeks.
2024-07-30
Ceno
Interpreter:
- ongoing, with pending tasks on running with mainblock and getting statistics result.
2024-07-23
Ceno
- (Ming Ongoing) Designing and implementing a GKR + HyperPlonk variant to specifically address the zkVM use case.
- PoC shows 262 kHz for riscv add: great value for a potential fast zkVM prover.
- review riscv opcodes and see which ones can NOT be expressed by the new design (or only at high cost).
- high-level implementation idea on the computation graph: a DAG with various operation nodes, where each node might involve a sumcheck or simply an evaluation split/merge.
- Goal is to keep the existing ceno frontend design and change only the underlying implementation, to achieve high code reuse.
- layer -> vector, and no cellid.
- target to finish first version in the following weeks.
- (Kimi Ongoing) PR riscv add opcode under review
- unit test error, pending debugging
Interpreter
2024-07-16
Ceno
- Engineering
- (Done) PR: more refactoring to apply devirgo sumcheck. Around 20x boost on evm add benchmark
- (Ongoing) PR optimize prover run time/memory
- (Ongoing) PR riscv add opcode
- zkVM new design from Scroll
- PoC benchmark shows around 262k Hz (2^20 add in 3.x sec) to generate proof (without PCS)
> Jolt 90 kHz, which means around 3x faster than Jolt.
> SP1 1.7x faster than Jolt
Interpreter
- (Ongoing) based on SP1 emulator
- repo link (TBD)
- framework still work in progress
- currently bug fixing
Misc
- Ceno open source roadmap
- Aligned with Scroll: prioritize zkVM and build the framework based on the PoC to shift the project asap. Focus would be on zkVM Keccak instead of general keccak.