## Week 2
Working on understanding the patch for `halo2_proofs`.
Doing some benchmarking and working out what is going on with the cache mechanics.
For `ProveExpressionUnit` (in `evaluation.rs`) the patch goes,
From:
```rust
let es = es.into_iter().map(|e| {
    ProveExpression::reconstruct(e.as_slice())
}).collect::<Vec<_>>();
```
To:
```rust
let mut es = es.into_iter().map(|e| {
    // Debug output: dump the cells of every bundle before reconstructing.
    println!("elements:");
    for s in &e {
        println!("cells are {}", ProveExpression::<C::Scalar>::string_of_bundle(&s.0));
    }
    let a = ProveExpression::reconstruct(e.as_slice());
    a
}).collect::<Vec<_>>();
```
Looking into `impl ProveExpression<F>` in `evaluation_gpu.rs`, checking out the modifications to the cache policy generator.
From:
```rust
pub(crate) fn gen_cache_policy(&self, unit_cache: &mut Cache<Buffer<F>>) {
    match self {
        ProveExpression::Unit(u) => unit_cache.access(u.get_group()),
        ProveExpression::Op(l, r, _) => {
            l.gen_cache_policy(unit_cache);
            r.gen_cache_policy(unit_cache);
        }
        ProveExpression::Y(_) => {}
        ProveExpression::Scale(l, _) => {
            l.gen_cache_policy(unit_cache);
        }
    }
}
```
which, in the case of `ProveExpression::Unit`, performs a cache access.
If the expression is an `::Op` or `::Scale`, it recursively calls `gen_cache_policy` on the operands (left and right for `::Op`, the scaled expression for `::Scale`).
To:
```rust
pub(crate) fn gen_cache_policy(&self, unit_cache: &mut Cache<Buffer<F>>) {
    let handle_flat = if let Some((uid, exprs)) = self.flat_unique_unit_scale() {
        if exprs.len() > 1 {
            uid.gen_cache_policy(unit_cache);
            Some(())
        } else {
            None
        }
    } else {
        None
    };
    if handle_flat.is_none() {
        match self {
            ProveExpression::Unit(u) => unit_cache.access(u.get_group()),
            ProveExpression::Op(l, r, _) => {
                l.gen_cache_policy(unit_cache);
                r.gen_cache_policy(unit_cache);
            }
            ProveExpression::Y(_) => {}
            ProveExpression::Scale(l, _) => {
                l.gen_cache_policy(unit_cache);
            }
        }
    }
}
```
First checks whether `self` returns `Some` from `flat_unique_unit_scale()`. If so, and the returned expression list has more than one entry, it generates the cache policy for `uid` and `handle_flat` is `Some(())`; otherwise `handle_flat` is `None`.
If `handle_flat` is `None`, then match on the expression as in the original version above.
Breaking down:
```rust
pub fn flat_unique_unit_scale(&self) -> Option<(Self, Vec<Self>)> {
...
}
```
Basically matches an `::Op`, checking if it's a `Bop::Sum`: if the operands can be flattened into unique unit scales (that is, `flat_unique_unit_scale` returns `Some`) and the unit groups are the same, it concatenates their expressions and returns `Some` with the flattened expression and the associated `Vec`. If it matches on `::Scale` and the scaled expression is a `Unit`, it returns `Some` with the original expression and a `Vec<ProveExpression>`. If it's not a `Unit` expression it returns `None`.
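To get a feel for the two pieces together, here is a toy sketch, not the real halo2 code. The `Expr` enum, the `scale` and `sum` helpers, and the `Vec<u32>` access log are all hypothetical stand-ins, and it assumes the first element of the returned tuple is the shared unit. It only illustrates that a sum of scales over one unit group ends up as a single cache access.

```rust
// Toy stand-ins for the halo2 types; every name here is hypothetical.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Unit(u32),                 // a unit, identified only by its group id
    Scale(Box<Expr>, i64),     // an expression times a constant
    Sum(Box<Expr>, Box<Expr>), // stands in for Op(l, r, Bop::Sum)
}

impl Expr {
    /// Returns the shared unit and the list of scaled terms when `self`
    /// is a sum of scales that all apply to the same unit.
    fn flat_unique_unit_scale(&self) -> Option<(Expr, Vec<Expr>)> {
        match self {
            Expr::Sum(l, r) => {
                let (lu, mut ls) = l.flat_unique_unit_scale()?;
                let (ru, rs) = r.flat_unique_unit_scale()?;
                if lu == ru {
                    ls.extend(rs); // concatenate both term lists
                    Some((lu, ls))
                } else {
                    None
                }
            }
            Expr::Scale(inner, _) => match inner.as_ref() {
                Expr::Unit(_) => Some((inner.as_ref().clone(), vec![self.clone()])),
                _ => None,
            },
            _ => None,
        }
    }

    /// Logs one access per unit group touched, following the patched policy.
    fn gen_cache_policy(&self, accesses: &mut Vec<u32>) {
        if let Some((unit, terms)) = self.flat_unique_unit_scale() {
            if terms.len() > 1 {
                // Flattened case: touch the shared unit once and stop.
                unit.gen_cache_policy(accesses);
                return;
            }
        }
        match self {
            Expr::Unit(group) => accesses.push(*group),
            Expr::Scale(inner, _) => inner.gen_cache_policy(accesses),
            Expr::Sum(l, r) => {
                l.gen_cache_policy(accesses);
                r.gen_cache_policy(accesses);
            }
        }
    }
}

fn scale(group: u32, k: i64) -> Expr {
    Expr::Scale(Box::new(Expr::Unit(group)), k)
}

fn sum(l: Expr, r: Expr) -> Expr {
    Expr::Sum(Box::new(l), Box::new(r))
}

fn main() {
    // Three scales of the same unit: flattened, so only one access is logged.
    let same = sum(sum(scale(3, 2), scale(3, 5)), scale(3, 7));
    let mut accesses = Vec::new();
    same.gen_cache_policy(&mut accesses);
    println!("same unit: {:?}", accesses);

    // Different units: not flattened, each unit is accessed separately.
    let mixed = sum(scale(3, 2), scale(4, 7));
    let mut accesses = Vec::new();
    mixed.gen_cache_policy(&mut accesses);
    println!("mixed units: {:?}", accesses);
}
```

With the pre-patch policy the first expression would log three accesses to the same group, which is presumably what the patch is trying to avoid.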
Little bit of `pub(crate) fn do_fft_core<F: FieldExt>()`:
If `log_n` is 25, change the degree to 7 (3 * 7 = 21), and for the next round use degree 4 (21 + 4 = 25). This gives a 70% improvement of the FFT with k = 27.
There is a performance bug when degree = 1.
Still a WIP, working out what's going on with the performance bug.
```
-------------------------------------------------------
do_fft_core()
log_n = 22
n = 4194304
max_log2_radix = 8
max_log2_local_work_size = 6
-------------------------------------------------------
```
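A rough sketch of how the round degrees could be chosen so that no round ends up with degree 1. This is my own guess at the idea, not the logic in `do_fft_core`: the function name `plan_round_degrees` is hypothetical, and the rule (lower the per-round degree while the leftover round would be 1) is an assumption that happens to reproduce the 3 * 7 + 4 = 25 split from the note, with `max_log2_radix = 8` taken from the log above.

```rust
/// Splits `log_n` butterfly levels into rounds of at most `max_deg` levels.
/// Hypothetical rule: while the last round would have degree 1, lower the
/// per-round degree by one and try again.
fn plan_round_degrees(log_n: u32, max_deg: u32) -> Vec<u32> {
    let mut deg = max_deg.min(log_n).max(1);
    while deg > 2 && log_n % deg == 1 {
        deg -= 1;
    }
    let mut rounds = Vec::new();
    let mut done = 0;
    while done < log_n {
        // The last round takes whatever is left over.
        let d = deg.min(log_n - done);
        rounds.push(d);
        done += d;
    }
    rounds
}

fn main() {
    // A plain greedy split of 25 with max degree 8 would be 8 + 8 + 8 + 1.
    println!("log_n = 25: {:?}", plan_round_degrees(25, 8));
    // log_n = 22 has no degree-1 leftover, so the max degree is kept.
    println!("log_n = 22: {:?}", plan_round_degrees(22, 8));
}
```

For `log_n = 25` this yields three rounds of degree 7 and one of degree 4. For `log_n = 22` nothing changes, because 22 leaves a remainder of 6 rather than 1.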
TODO: pull apart:
```opencl
uint lid = GET_LOCAL_ID();
uint lsize = GET_LOCAL_SIZE();
uint index = GET_GROUP_ID();
uint t = n >> deg;
uint p = 1 << lgp;
uint k = index & (p - 1);
x += index;
y += ((index - k) << deg) + k;
uint count = 1 << deg; // 2^deg
uint counth = count >> 1; // Half of count
uint counts = count / lsize * lid;
uint counte = counts + count / lsize;
```
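As a first step in pulling this apart, the same arithmetic can be mirrored on the host and poked at with small numbers. This is only a direct transcription of the lines above into Rust. The `RoundIndices` struct and `round_indices` function are hypothetical names, and the reading of `counts..counte` as the slice handled by one work item is my assumption.

```rust
/// Host-side mirror of the index arithmetic at the top of the kernel body.
#[derive(Debug, PartialEq)]
struct RoundIndices {
    t: u32,        // n >> deg
    k: u32,        // index & (p - 1)
    x_offset: u32, // amount added to x
    y_offset: u32, // amount added to y
    count: u32,    // 2^deg
    counth: u32,   // half of count
    counts: u32,   // start of the slice for this local id (assumed meaning)
    counte: u32,   // end of the slice for this local id (assumed meaning)
}

fn round_indices(n: u32, deg: u32, lgp: u32, index: u32, lid: u32, lsize: u32) -> RoundIndices {
    let p = 1u32 << lgp;
    let k = index & (p - 1);
    let count = 1u32 << deg;
    let counts = count / lsize * lid;
    RoundIndices {
        t: n >> deg,
        k,
        x_offset: index,
        y_offset: ((index - k) << deg) + k,
        count,
        counth: count >> 1,
        counts,
        counte: counts + count / lsize,
    }
}

fn main() {
    // Small made-up inputs: n = 64, deg = 3, lgp = 2, group 5, local id 1 of 4.
    let r = round_indices(64, 3, 2, 5, 1, 4);
    println!("{:?}", r);
}
```

With these inputs `p = 4`, so `k = 5 & 3 = 1`, and the `y` offset is `((5 - 1) << 3) + 1 = 33`. Each of the 4 local ids gets `8 / 4 = 2` of the 8 elements, so local id 1 covers `2..4`.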
```opencl
KERNEL void FIELD_eval_batch_scale(
    GLOBAL FIELD* res,
    GLOBAL FIELD* l,
    GLOBAL int* l_rot,
    uint nb_scale,
    uint size,
    GLOBAL FIELD* c
) {
    uint gid = GET_GLOBAL_ID();
    uint idx = gid;

    uint lidx = (idx + size + l_rot[0]) & (size - 1);
    res[idx] = FIELD_mul(l[lidx], c[0]);

    for (uint i = 1; i < nb_scale; i++) {
        uint lidx = (idx + size + l_rot[i]) & (size - 1);
        res[idx] = FIELD_add(res[idx], FIELD_mul(l[lidx], c[i]));
    }
}
```
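A CPU reference for what this kernel computes, as far as I can read it: for each index, a sum over the rotations of `l[rotated index] * c[i]`. This is a sketch only. The function `eval_batch_scale` and the toy modulus `P` are hypothetical, with `u64` arithmetic mod a small prime standing in for `FIELD_add` and `FIELD_mul`, and `size` assumed to be a power of two as the `& (size - 1)` mask implies.

```rust
const P: u64 = 65_537; // toy prime modulus, a stand-in for the real field

/// res[idx] = sum over i of l[(idx + size + l_rot[i]) & (size - 1)] * c[i]
fn eval_batch_scale(l: &[u64], l_rot: &[i32], c: &[u64]) -> Vec<u64> {
    let size = l.len();
    assert!(size.is_power_of_two(), "the mask trick needs a power-of-two size");
    assert_eq!(l_rot.len(), c.len(), "one coefficient per rotation");
    (0..size)
        .map(|idx| {
            l_rot.iter().zip(c).fold(0u64, |acc, (&rot, &coeff)| {
                // Same wrap-around indexing as the kernel.
                let lidx = ((idx as i64 + size as i64 + rot as i64) as usize) & (size - 1);
                (acc + l[lidx] * coeff % P) % P
            })
        })
        .collect()
}

fn main() {
    // res[idx] = 1 * l[idx] + 10 * l[idx + 1] + 100 * l[idx - 1], indices wrapping.
    let res = eval_batch_scale(&[1, 2, 3, 4], &[0, 1, -1], &[1, 10, 100]);
    println!("{:?}", res);
}
```

For index 0 this gives `1 + 2 * 10 + 4 * 100 = 421`, where the `-1` rotation wraps around to the last element.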