# Week 42 / Notes / Report
Working on zkWasm's halo2 GPU-specific implementation, built with the `gpu` feature (`--features gpu`).
Working on understanding:
```Rust!
ProveExpression::Op(l, r, op)
ProveExpression::Y(ys)
ProveExpression::Unit(u)
ProveExpression::Scale(l, ys)
```
and the GPU evaluation mechanics.
Looked briefly at `ecc-gpu` branch `halo2-opt-v2` to understand kernel/cl
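The four variants listed above can be read as a small expression tree. The following is a minimal sketch of that idea, not the real halo2 definition: `Expr`, `Bop`, `eval` and the toy modulus `P` are hypothetical names, values are plain `u64` instead of field elements, and the `ys` payload is modelled as an exponent-to-coefficient map because that is how the dump in the `ProveExpression::Op` section prints it (an assumption on my side).

```rust
use std::collections::BTreeMap;

/// Toy prime modulus standing in for the real scalar field (assumption).
const P: u64 = 65_537;

/// Binary operation carried by `Op`.
#[derive(Debug, Clone, Copy)]
enum Bop {
    Sum,
    Product,
}

/// Simplified stand-in for `ProveExpression`; NOT the real halo2 definition.
#[derive(Debug, Clone)]
enum Expr {
    /// A single column value, here an index into a slice of values.
    Unit(usize),
    /// Two sub-expressions combined by a sum or a product.
    Op(Box<Expr>, Box<Expr>, Bop),
    /// A polynomial in the challenge y, stored as exponent -> coefficient.
    Y(BTreeMap<u32, u64>),
    /// A sub-expression multiplied by a polynomial in y.
    Scale(Box<Expr>, BTreeMap<u32, u64>),
}

/// Square-and-multiply exponentiation modulo P.
fn pow_mod(mut base: u64, mut exp: u32) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Evaluate sum(coeff * y^exponent) modulo P.
fn eval_ys(ys: &BTreeMap<u32, u64>, y: u64) -> u64 {
    ys.iter()
        .fold(0, |acc, (&k, &c)| (acc + (c % P) * pow_mod(y, k)) % P)
}

/// Recursively evaluate the tree for given unit values and challenge y.
fn eval(e: &Expr, units: &[u64], y: u64) -> u64 {
    match e {
        Expr::Unit(i) => units[*i] % P,
        Expr::Y(ys) => eval_ys(ys, y),
        Expr::Scale(l, ys) => eval(l, units, y) * eval_ys(ys, y) % P,
        Expr::Op(l, r, Bop::Sum) => (eval(l, units, y) + eval(r, units, y)) % P,
        Expr::Op(l, r, Bop::Product) => eval(l, units, y) * eval(r, units, y) % P,
    }
}
```

In this toy model, `Scale(Unit(0), {2: 3})` with unit value 5 and `y = 2` evaluates to `5 * 3 * 2^2`; the real code works on whole polynomials on the GPU rather than on single values.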
Test environment:
```
Ubuntu 22.04
5.15.0-86-generic
NVIDIA-SMI 535.104.12
Driver Version: 535.104.12
CUDA Version: 12.2
RTX 4080
```
Repo: `halo2-gpu-specific`, working in branch `xgao-gpu-expriments`.
### `gen pkey:`
Start: Assign
End: Assign ....................................................................444.266ms
Start: generate pkey
··Start: prepare ev
gpus number is 1
total vs group is 36
····Start: group exprs
total vs group is 36
elements:
cells are <>
```
elements:
cells are <a0f8>
...
elements:
cells are <a1f8>
...
```
etc
····End: group exprs ...........................................................153.795ms
depth is 58
depth is 45
`...<print depths down to 0>...`
depth is 2
depth is 0
--------- expr part 0 ---------
Dump the complexity is ComplexityProfiler
··End: prepare ev ..............................................................461.505ms
End: generate pkey .............................................................1.104s
### `create proof:`
Start: create proof
··Start: instance
k is 18
··End: instance ................................................................12.749ms
··Start: advice
-->
··End: advice ..................................................................282.877ms
··Start: lookups 30
··End: lookups 30 ..............................................................541.343ms
··Start: lookups commit product
··End: lookups commit product ..................................................423.172ms
··Start: lookups add blinding value
··End: lookups add blinding value ..............................................833.204µs
··Start: lookups msm and fft
··End: lookups msm and fft .....................................................836.289ms
··Start: permutation commit
··End: permutation commit ......................................................93.359ms
··Start: vanishing commit
··End: vanishing commit ........................................................31.464ms
### `h_poly:`
··Start: h_poly
····Start: lagrange_to_coeff_st
····End: lagrange_to_coeff_st ..................................................239.386ms
····Start: expressions gpu eval
-->
····End: expressions gpu eval ..................................................1.590s
····Start: permutations
--> `evaluation.rs` L877
····End: permutations ..........................................................182.230ms
····Start: eval_h_lookups
-->
····End: eval_h_lookups ........................................................567.423ms
··End: h_poly ..................................................................2.579s
### `vanishing arg / challenger:`
··Start: vanishing construct
··End: vanishing construct .....................................................95.446ms
### `poly eval (compute instance, advice, fixed):`
··Start: eval poly
··End: eval poly ...............................................................306.040ms
··Start: eval poly vanishing
··End: eval poly vanishing .....................................................6.605ms
··Start: eval poly permutation
··End: eval poly permutation ...................................................38.088ms
··Start: eval poly lookups
··End: eval poly lookups .......................................................163.562ms
### `open:`
··Start: multi open
··End: multi open ..............................................................534.920ms
End: create proof ..............................................................6.115s
write transcript to "/home/x/Workspace/zkWasm/output2/zkwasm.0.transcript.data"
--------------------------
## ev setup & elements/cells dump & complexity dump
In `evaluation.rs`, constructing a new `Evaluator` dumps the elements and their cells for the `ProveExpression`:
```Rust!
let mut e = ProveExpression::new();
...
println!("elements:");
for s in &e {
    println!("cells are {}", ProveExpression::<C::Scalar>::string_of_bundle(&s.0))
}
```
This happens in the `prepare ev` --> `group exprs` phase.
Then dump complexity:
```Rust!
for (i, e) in es.iter().enumerate() {
    let complexity = e.get_complexity();
    ev.unit_ref_count = complexity.ref_cnt.into_iter().collect();
    ev.unit_ref_count.sort_by(|(_, l), (_, r)| u32::cmp(l, r));
    ev.unit_ref_count.reverse();
    println!("--------- expr part {} ---------", i);
    println!("complexity is {:?}", e.get_complexity());
    println!("sorted ref cnt is {:?}", ev.unit_ref_count);
    println!("r deep is {}", e.get_r_deep());
}
```
Finally an `Evaluator` is returned.
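The sort-then-reverse step in the complexity dump orders units from most referenced to least referenced. A minimal sketch of the same ordering in one pass is given here; `sorted_ref_counts` is a hypothetical helper, and the `BTreeMap<usize, u32>` input type is an assumption, since the notes only show `ref_cnt.into_iter().collect()`.

```rust
use std::cmp::Reverse;
use std::collections::BTreeMap;

/// Turn a unit -> reference-count map into a list ordered from the most
/// referenced unit to the least referenced one.
/// The map type is an assumption; the notes only show `.into_iter().collect()`.
fn sorted_ref_counts(ref_cnt: BTreeMap<usize, u32>) -> Vec<(usize, u32)> {
    let mut counts: Vec<(usize, u32)> = ref_cnt.into_iter().collect();
    // Stable sort, so units with equal counts keep ascending unit order.
    counts.sort_by_key(|&(_, cnt)| Reverse(cnt));
    counts
}
```

One difference from `sort_by` followed by `reverse()`: reversing also flips the relative order of units with equal counts, while sorting by `Reverse(cnt)` keeps it.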
### evaluate_h
`prover.rs` (the prover) calls `evaluate_h` (in `evaluation.rs`, the GPU-featured version) via `pk.ev.evaluate_h(...)`,
i.e. `ProvingKey.Evaluator.evaluate_h(...)`.
It creates two new `BTreeMap`s:
"units", which is `BTreeMap<usize, ProveExpression>`, and
"unit_stat", which is `BTreeMap<usize, usize>`.
For each entry of `gpu_gates_expr` (`pk.ev.gpu_gates_expr`, of type `ProveExpression`) it prefetches `units` and `unit_stat`.
Get the values (type `Polynomial`).
Run permutations.
Run lookups.
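The notes do not yet say what the prefetch pass records in `unit_stat`. As a sketch only, under the assumption that it counts how often each unit occurs in an expression tree, the walk could look like the following; `Node` and `count_units` are hypothetical names and the tree is reduced to leaves and binary nodes.

```rust
use std::collections::BTreeMap;

/// Minimal expression tree for illustration only (hypothetical type).
enum Node {
    /// Leaf identified by a unit id.
    Unit(usize),
    /// Any binary combination of two sub-trees.
    Op(Box<Node>, Box<Node>),
}

/// Walk the tree and count how many times each unit id appears.
/// Assumption: this is one plausible meaning of `unit_stat`, not confirmed.
fn count_units(node: &Node, unit_stat: &mut BTreeMap<usize, usize>) {
    match node {
        Node::Unit(id) => *unit_stat.entry(*id).or_insert(0) += 1,
        Node::Op(l, r) => {
            count_units(l, unit_stat);
            count_units(r, unit_stat);
        }
    }
}
```

A count like this would let an evaluator decide which intermediate buffers are worth keeping around, which fits the "memory cache" description used for `unit_stat` in the GPU eval call, but that link is a guess.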
------------
Evaluator data struct
```Rust!
#[derive(Default, Debug)]
pub struct Evaluator<C: CurveAffine> {
    /// Constants
    pub constants: Vec<C::ScalarExt>,
    /// Rotations
    pub rotations: Vec<i32>,
    /// Calculations
    pub calculations: Vec<CalculationInfo>,
    /// Value parts
    pub value_parts: Vec<ValueSource>,
    /// Lookup results
    pub lookup_results: Vec<Calculation>,
    /// GPU
    pub gpu_gates_expr: Vec<ProveExpression<C::ScalarExt>>,
    pub gpu_lookup_expr: Vec<LookupProveExpression<C::ScalarExt>>,
    pub unit_ref_count: Vec<(usize, u32)>,
}
```
------------
### expressions gpu eval:
Run the GPU evaluation on each entry of the `gpu_gates_expr` field of `pk.ev` in parallel (Rayon). The call is:
```Rust!
x.eval_gpu(group_idx, pk, &unit_stat, &advice_poly[0], &instance_poly[0], y)
```
i.e., `eval_gpu` (in `evaluation_gpu.rs`) is called with the group index, proving key, memory cache, advice poly, instance poly and challenge y.
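The per-group fan-out described above can be sketched with the standard library alone. This is not the Rayon code from the repo: `std::thread::scope` stands in for Rayon so the example has no dependencies, `eval_group` and `eval_all_groups` are hypothetical names, and the per-group work is a placeholder fold over `u64` values rather than a real GPU evaluation.

```rust
use std::thread;

/// Placeholder for the per-group work: combines the values with powers of y.
/// The real `eval_gpu` runs a GPU program; this is only a stand-in.
fn eval_group(group_idx: usize, values: &[u64], y: u64) -> (usize, u64) {
    let acc = values
        .iter()
        .fold(0u64, |acc, v| acc.wrapping_mul(y).wrapping_add(*v));
    (group_idx, acc)
}

/// Evaluate every group on its own scoped thread and collect the results
/// in group order.
fn eval_all_groups(groups: &[Vec<u64>], y: u64) -> Vec<(usize, u64)> {
    thread::scope(|s| {
        // Spawn all workers first so they can run concurrently.
        let handles: Vec<_> = groups
            .iter()
            .enumerate()
            .map(|(idx, g)| s.spawn(move || eval_group(idx, g, y)))
            .collect();
        // Join in spawn order, which keeps results aligned with group index.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

Each worker only borrows its group and copies the challenge, mirroring how the call above passes the shared inputs by reference and `y` by value.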
In `evaluation_gpu.rs`:
```Rust!
fn eval_gpu<C: CurveAffine<ScalarExt = F>>(
```
This generates the extended FFT from the proving key with the GPU program.
Then `self._eval_gpu()` is called.
**more to come WIP.**
------------
### ProveExpression::Y
`y` is the challenge with the extended scalar field: `y: ChallengeScalar<C, Y>`.
`ys` is a Vec with two elements: the scalar field value of 1 and the challenge value (from above).
**working on this WIP.**
------------
### ProveExpression::Op
`op` is the operation (`Sum` or `Product`) applied to `l` and `r`.
```Rust
ProveExpression::Op(l, r, op) => {...}
```
e.g.,
```
l = Op(Scale(Unit(Advice { column_index: 36, rotation: Rotation(0) }), {407: 0x0000000000000000000000000000000000000000000000000000000000000001}), Scale(Unit(Advice { column_index: 36, rotation: Rotation(1) }), {1: 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000000}), Sum)
r = Unit(Fixed { column_index: 15, rotation: Rotation(0) })
op = Product
-----------------------------------------------
--> ProveExpression::Op
l = Scale(Unit(Advice { column_index: 36, rotation: Rotation(0) }), {407: 0x0000000000000000000000000000000000000000000000000000000000000001})
r = Scale(Unit(Advice { column_index: 36, rotation: Rotation(1) }), {1: 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000000})
op = Sum
```
**working on this WIP.**
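One detail in the dump above is worth checking: the coefficient `0x3064...f0000000` differs from the BN254 scalar field modulus only in its last digit. A minimal sketch that verifies this with plain 256-bit limb arithmetic follows; it assumes the circuit's scalar field is BN254's, which these notes do not state, and `parse_hex_256`, `add_one` and `is_minus_one` are hypothetical helpers.

```rust
/// BN254 scalar field modulus r, written as 64 hex digits.
const BN254_R: &str =
    "0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001";

/// Parse exactly 64 hex digits (optional `0x` prefix) into four
/// little-endian u64 limbs.
fn parse_hex_256(s: &str) -> Option<[u64; 4]> {
    let digits = s.strip_prefix("0x").unwrap_or(s);
    if digits.len() != 64 || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut limbs = [0u64; 4];
    // The string starts with the most significant digits, so fill the
    // limbs from the highest index down.
    for (i, limb) in limbs.iter_mut().rev().enumerate() {
        *limb = u64::from_str_radix(&digits[16 * i..16 * (i + 1)], 16).ok()?;
    }
    Some(limbs)
}

/// Add 1 to a 256-bit value, propagating the carry across limbs.
fn add_one(mut x: [u64; 4]) -> [u64; 4] {
    for limb in x.iter_mut() {
        let (sum, carry) = limb.overflowing_add(1);
        *limb = sum;
        if !carry {
            break;
        }
    }
    x
}

/// True if `coeff_hex + 1 == r`, i.e. the coefficient is -1 in the field.
fn is_minus_one(coeff_hex: &str) -> bool {
    match (parse_hex_256(coeff_hex), parse_hex_256(BN254_R)) {
        (Some(c), Some(r)) => add_one(c) == r,
        _ => false,
    }
}
```

If the map keys in the dump are exponents of `y` (also an assumption), the inner `Sum` would read as `advice36(rot 0) * y^407 - advice36(rot 1) * y`, which the outer `Product` then multiplies by `fixed15(rot 0)`.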