## optimize LogUp challenge in IVC
### background
In pull-request [read-write lookup offline-memory-checking logUp-based implementation in (Super)Nova R1CS](https://github.com/lurk-lab/arecibo/pull/48) discussion, one of remains question is to optimise the cost of random oracle circuit in IVC. This post try to address the issue and propose a design.
### Design rationale
One of the nice property to derive challenge via random oracle in folding IVC is we can decouple random oracle into function composition, and relax one or more of function to loose property, e.g. one of them not nessesary to satisify indifferentiable from a random oracle, while still keep overall composition result indifferentiable from random oracle. With relaxation, we implement relaxed function in IVC step circuit, while in the final SNARK pass accumuation result to another random oracle function to get the challenge which indifferentiable from random oracle.
### Random Oracle
Terminology cited from [Proving the correct execution of concurrent services in zero-knowledge](https://eprint.iacr.org/2018/907.pdf)
Giving
$\mathfrak{H}=\phi(R(x))$,
with below
- $x\in \mathbb{F}^{n}$
- $R: \mathbb{F}^{n} \rightarrow \mathbb{F}$ is collision resistent encoding
> R even can be non-uniform, and not indifferentiable from a random oracle
- $\phi: \mathbb{F} \rightarrow \mathbb{F}$ is random oracle
Then $\mathfrak{H}$ is random oracle
> reference proof https://hackmd.io/@2DQ_BR_sTUOyIfPms3lGVw/r1TNUj8q6
### collision resistent encoding function R design
In each step circuit, we know how many read/write it will have. Per read/write we need to encoding $(a, v_{prev}, v, t_{prev}, t)$ 5 field elements and accumulated with prev acc value to obtain new acc and pass to next round.
To design a collision resistent hash, and with idea from pedersen hash, we can design as below: in setup phase we can pre-generate random ordered-set field values $[f_0, f_1, ... f_n]$, for $f_0, f_2, ... f_n \in \mathbb{F}$.
> In zkVM setting, one step circuit in SuperNova might represent one opcode. Each opcode read/write are bounded. Giving max read/write num is $k$, $n=5 \times k$.
> Question: $f_i$ need to be prime?
Giving $(a, v_{prev}, v, t_{prev}, t)$, we decompose each field value into bit decomposition and concat all as
$bits$ = bit($a$) \|\| bit($v_{prev}$) \|\| bit($v$) \|\| bit($t_{prev}$) \|\| bit($t$)
Then we can encode batched field values $(a, v_{prev}, v, t_{prev}, t)$ into an order matter single field value, assume bit concated value lead to `j` bits
$$acc = acc_{prev} + R(bits, [f_{i}])
= acc_{prev} + \Sigma_{i=0}^j b_{i} \times f_{i}$$ where $b_{i}$ is the bit value of $bits$ in index $i$
In R1CS, This formula can be represented in just 1 R1CS constraint.
#### collision resistent encoding on batched read/write in a single step
To encode batched of read/write in step circuit, it's naturally viewing it as encoding vector $(a_{0}, v_{0,prev}, v_{0}, t_{0, prev}, t_{0}, a_{1}, v_{1,prev}, v_{1}, t_{1, prev}, t_{1}, ...)$ from its bits representation
$bits$ = bit($a_{0}$) \|\| bit($v_{0,prev}$) \|\| bit($v_{0}$) \|\| bit($t_{0,prev}$) \|\| bit($t_{0}$) \|\| bit($a_{1}$) \|\| bit($v_{1,prev}$) \|\| bit($v_{1}$) \|\| bit($t_{1,prev}$) \|\| bit($t_{1}$) \|\| ...
> Notice with these design, it retain the order matters feature, such that if we exchange any 2 field value from vector it lead to different encoding field value.
So the length of pre-generated constant random field set $[f_{i}]$ need to cover max bits length in step circuit.
#### collision resistent encoding order matters across steps
For encoded acc value we can passed it to next step to aggregate to next acc value. However we need to account for one order matters scenario.
Giving vector folding step m
$(v_{m, 0}, v_{m, 1},...)$
and step n = m+1
$(v_{n, 0}, v_{n, 1},...)$
Then
$$acc = acc_{prev} + R(bits_{m}, [f_{i}]) + R(bits_{n}, [f_{i}]) $$
When a unhonest prover exchange 2 field value in different step yet same position, e.g. $v_{m,1}$ with $v_{n,1}$, then the new acc value will remain unchanged, imply function R is not collision resistent with order matter.
To address this issue, we introduce step variable into accumulation formula as
$$acc = acc_{prev} \times c + R(bits, [f_{i}]) $$
where $c \in F$ represent current $c_{th}$ folding step in Nova, and $c$ starting from 1.
> Another alternative design is $acc_{prev}^{c}$, to raise $acc_{prev}$ to power $c$
#### R collision resistent analysis
Below we will infer the probability for such collision exist
define
- prime field $\mathbb{F}$ with $p$ bits prime
- $n!= 1 \times 2 \times 3 ... \times n$
- m < n, $(m,n)!= m \times (m+1) \times (m+2) ... \times n$
- $l$ is max bit length of read/write absortion
- $[f_i]$ is the pseudorandom generated fields set, $f_i \in \mathbb{F}, i \in l$
- $N$ is total number of folding steps
#### base case $l$=2
To prove R is collision resistent, firstly we analyze $l$ = 2 case, and $[f_1, f_2]$ field set
Then it's equivalent to find $k_i \in \mathbb{F}, k_i > 0$, such that $k_1 \times f_1 + k_2 \times f_2 = 0 \rightarrow k_1 = neg(k_2 \times f_2) \times f_1^{-1}$
Giving Encoding function R
$$acc = acc_{prev} \times c + R(bits, [f_{i}]), c \in [1..N]$$
And define set $S = \{(m, N)! \space|\space m \in [1... N]\}$, subset $S^{\prime} \subset S$
Let
$$ k^{\prime} = \sum_{a_i \in S', S' \subset S} a_i$$
Subset $S^{\prime}$ got $2^N$ choices, each can derived respective $k^{\prime}$. So there are $2^N$ possible $k^{\prime}$.
It implies for $k_2$ we have $2^N$ choice, after $k_2$ settle down, $k_1$ so got $2^N$ choice.
The probability for $k_1, k_2$ overlapping (collision) is
$$ \frac{2^N + 2^N}{2^p} $$
For example, giving folding steps $N = 2^{10}$. $p$ is 255 bit. Then the probability for collision exist $2^{11-255}$
#### General case $l$ > 2
For $l$ > 2, it is equivalent to find $\exists k > 0$, such that
$$ k_1 \times f_1 + k_2 \times f_2 + k_3 \times f_3 + ... + k_l \times f_l = 0$$
We can rewrite as
$$ k_1 = neg(k_2 \times f_2 + k_3 \times f_3 + ... + k_l \times f_l) \times f_1^{-1} $$
Combination of $k_2 , k_3, k_4,...,k_l$ lead to
$$2^{(l-1) \times N}$$
choice.
Similarly, $k_1$ also got $2^{N}$ choice
And the probability for collision exist is
$$ \frac{2^N + 2^{(l-1) \times N}}{2^p} $$
#### other notes
Speical notice above analysis just prove the probability of collision exist. We still need to search solition space and verify to find it. To verify a candidate, we need to do efficient **factorial decomposition**, means given $k$, find $b_1, b_2, b_3,... b_N \in [0,1]$ such that
$$ k = \sum_{i \in N} b_i * i! $$
> naive bruce-force search take $O(2^N)$ time complexity, not sure is there any efficient other algorithm .
### Cost analysis for R function in step circuit
Giving example
- address $a$ max 32 bits, accounting for 4G address
- value $v$ max 64 bits.
- max read-write ts across folding is 32 bits.
- max folding step is 64 bits
Then in R1CS
| name | Decomposition into Bits | collision resistent R |
|-------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------|
| #R1CS constraints | 32 + 1 from address a <br> 64 + 1 from value v <br> 32 + 1 from ts <br> 33 + 65 x 2 + 33 x 2 = 229. <br> giving k reads/writes <br> overall cost 229 x k | 1 |
| #R1CS vars | 32 from address a bit <br> 64 from value v bits <br> 32 from ts bits <br> 32 + 64 x 2 + 32 x 2 = 224 <br> Giving k reads/writes, overall cost 224 x k | 1 |
So the overall challenge cost are growing linear with number of read/write in step circuit, while still remain agnostic to table size |T|
### random oracle in final SNARK
We can choose any random oracle candidate $\phi$ e.g. poseidon hash in final SNARK.
The final challenge (r, $\gamma$)
$$r = \phi(final\_acc)= $$
and
$$\gamma = \phi(r)$$