# SHPlonK explainer

There are a lot of low-level details in PlonK which typically get little attention (or are completely swept under the rug) in many popular expositions. Today, I want to talk about an opening scheme different from the one in the canonical PlonK paper [GWC19](https://eprint.iacr.org/2019/953.pdf), suggested in the subsequent work [BDFG21](https://eprint.iacr.org/2020/081.pdf). I will not discuss PlonK itself (though a bit of familiarity will help a lot in understanding this text). Notably, this approach is used in practice in many currently existing PlonK systems, yet explanations tend to follow GWC19 - so it makes sense to do some sort of writeup on this. This explainer is first and foremost for me (the best way to understand something is to try to explain it!), though I hope it will be useful for the general audience.

## Recall: PlonK

### Plonkish protocols

Plonkish protocols generally look like this:

1. Prover sends commitments to some polynomials.
2. Verifier responds with some challenges.
3. (They repeat this for some number of rounds.)
4. Prover opens these polynomials in some points, and verifier checks some equations on these openings.

In fact, typically these equations are supposed to hold on the polynomials *themselves* and their *rotations*, where $\text{rot}_k(P)(x) = P(\mu^k x)$, with $\mu$ being a primitive root of unity of order $n$. Therefore, in general, to check these equations, in the last step the verifier sends a challenge point $x$ and the prover opens the polynomials $P_i$ in the points $\mu^k x$ for $k \in S_i$, where $S_i$ is the set of all rotations of $P_i$ present in the constraints. I did not talk about the quotient polynomial $H$ here, but it is naturally included in this framework, so this is a non-issue.
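As a quick sanity check on the rotation definition (my own toy illustration, not from any paper): over $\mathbb{F}_p$ with an $n$-th root of unity $\mu$, $\text{rot}_k(P)$ is itself a polynomial whose $i$-th coefficient is $p_i \mu^{ki}$. The field size, $\mu$, and the sample polynomial below are arbitrary choices.

```python
P_MOD = 17          # small prime field F_17 (toy choice)
MU = 4              # 4 has multiplicative order 4 mod 17, so n = 4

def poly_eval(coeffs, x, p=P_MOD):
    """Evaluate sum coeffs[i] * x^i (little-endian) via Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def rotate(coeffs, k, mu=MU, p=P_MOD):
    """Coefficients of rot_k(P): p_i -> p_i * mu^(k*i)."""
    return [(c * pow(mu, k * i, p)) % p for i, c in enumerate(coeffs)]

P = [3, 1, 4, 1]    # arbitrary degree-3 polynomial
for k in range(4):
    for x in range(1, P_MOD):
        # rot_k(P)(x) == P(mu^k * x) for every point
        assert poly_eval(rotate(P, k), x) == poly_eval(P, pow(MU, k, P_MOD) * x % P_MOD)
```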
### KZG commitment scheme

There are actually a few closely related things called the KZG commitment scheme; we will talk about the one used in PlonK.

SRS: $(G, \tau G, \tau^2 G, ..., \tau^{n-1} G) \in \text{Vec}\langle \mathcal{G}_1 \rangle; (H, \tau H) \in \text{Vec}\langle \mathcal{G}_2 \rangle$. Here, $\tau$ is a toxic waste element. Optionally, we could also have more powers of tau in the second group, but this is not necessary for what we are going to do (though it is useful in some other protocols, like proving that a polynomial has some particular degree). We will also denote $G_i = \tau^i G$, $H_i = \tau^i H$.

For a polynomial $P(t)$, its *KZG commitment* is defined as:

$[P(t)]_1 = P(\tau)G = \underset{i=0}{\overset{\deg P}\sum} p_i G_i$

Now, we would like to describe how to "open" a commitment - i.e. prove that a committed polynomial satisfies $P(x) = y$ for some field elements $x, y$. This is done as follows: the prover computes $Q(t) = \frac{P(t) - y}{t - x}$ and sends a commitment to it (called the "opening proof"). The verifier then checks:

$\langle [P]_1 - [y]_1, [1]_2 \rangle = \langle [Q]_1, [t-x]_2 \rangle$

Let's rewrite it in a bit more convenient way:

$\langle [P]_1 - yG_0, H_0 \rangle = \langle [Q]_1, H_1 - xH_0 \rangle$

And further, using bilinearity, open up the rhs and reorganize the check a bit:

$\langle [P]_1 - yG_0, H_0 \rangle = \langle [Q]_1, H_1\rangle - \langle[Q]_1, xH_0 \rangle$

$\langle [P]_1 - yG_0, H_0 \rangle = \langle [Q]_1, H_1\rangle - \langle x[Q]_1, H_0 \rangle$

$\langle [P]_1 - yG_0 + x[Q]_1, H_0 \rangle = \langle [Q]_1, H_1\rangle$

This is a very convenient way of representing the check, because you can notice that in these equations the right term in the pairing is always constant ($H_0$ and $H_1$ respectively), which means that we can, potentially, batch these equations together. I.e., if we had multiple such equations, for multiple polynomials, we could add them up with random coefficients and only then check.
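To make the opening identity concrete, here is a field-only toy simulation (my own sketch, no elliptic curves or pairings): with the toxic waste $\tau$ in the clear, the pairing check reduces to the scalar identity $P(\tau) - y = Q(\tau)(\tau - x)$, where $Q$ comes from synthetic division. All the concrete numbers are arbitrary.

```python
P_MOD = 101  # small prime standing in for the scalar field (toy choice)

def poly_eval(coeffs, x, p=P_MOD):
    """Evaluate a little-endian coefficient list via Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def divide_by_linear(coeffs, x, p=P_MOD):
    """Synthetic division: quotient of (P(t) - P(x)) / (t - x)."""
    q = [0] * (len(coeffs) - 1)
    carry = 0
    for i in range(len(coeffs) - 1, 0, -1):
        carry = (coeffs[i] + carry * x) % p
        q[i - 1] = carry
    return q

P = [7, 0, 2, 5]              # prover's polynomial 5t^3 + 2t^2 + 7
tau = 13                      # toxic waste (only visible in this toy)
x = 9
y = poly_eval(P, x)           # claimed opening P(x) = y
Q = divide_by_linear(P, x)    # opening proof polynomial Q = (P - y)/(t - x)

# the pairing equation, stripped of the group structure:
assert (poly_eval(P, tau) - y) % P_MOD == poly_eval(Q, tau) * (tau - x) % P_MOD
```

In the real scheme, of course, nobody knows $\tau$; both sides are only computable "in the exponent", which is exactly what the pairing check accomplishes.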
Moreover, this addition would involve only cheap $\mathcal{G}_1$ operations. *Notably, any homomorphic polynomial commitment scheme admits batching for openings in the same point - but here we can do it even for different points.* Therefore, the final PlonK verifier costs 2 pairings and an MSM (random linear combination) of size equal to the *total amount of points opened*. Here, I will talk about SHPlonK, which is a way of reducing the size of this MSM using multi-opening arguments.

## SHPlonK

Before we proceed, let's discuss how to open a polynomial in multiple points simultaneously. Let's denote by $Y(t)$ a polynomial which satisfies $Y(x_i) = y_i$. We would like to prove that $P(t) - Y(t)$ is divisible by $Z_S(t)$, where $Z_S = \underset{i}\prod(t-x_i)$ is the vanishing polynomial of the set $S = \{x_1, ..., x_k\}$. This could be achieved, for example, if we had a more extended SRS (having more powers of tau in the second group) - we would just write something like:

$\langle[P]_1 - [Y]_1, [1]_2\rangle = \langle [Q(t)]_1, [Z_S(t)]_2 \rangle$

There are tricks in the SHPlonK paper explaining how to batch these equations (but they are still inconvenient, because they involve a lot of $\mathcal{G}_2$ operations). But there is also an alternative idea (which I will first try to explain as I understand it, and then check against the paper).

So, we have commitments to polynomials $[P]_1, [Q]_1$. In order to check the required equation, let's just ask the verifier for a challenge point $z$, open $[P]_1, [Q]_1$ there, and check that $P(z) - Y(z) = Q(z) Z_S(z)$! This, of course, still requires computing $Y(z), Z_S(z)$, but it is much faster than doing an MSM (because it is just a computation over a field). The fact that we need to *both* normally open $P$ and $Q$ is a bit inconvenient, i.e. this only gives an advantage if our set $S$ has $3$ points or more. Now, armed with our "first impression" argument, let's take a look at what the authors of SHPlonK actually do in their paper.
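The naive multi-point check above can be sketched at the level of polynomial arithmetic (again a field-only toy of my own; the helper names and concrete numbers are arbitrary): the prover forms $Y$ by interpolation and $Q = (P - Y)/Z_S$, and the verifier checks the scalar equation at a random $z$.

```python
P_MOD = 101

def pe(c, x):                      # evaluate little-endian coeff list at x
    acc = 0
    for a in reversed(c):
        acc = (acc * x + a) % P_MOD
    return acc

def pmul(a, b):                    # polynomial product mod P_MOD
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P_MOD
    return out

def padd(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [(x + y) % P_MOD for x, y in zip(a, b)]

def psub(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [(x - y) % P_MOD for x, y in zip(a, b)]

def pdiv(a, b):                    # long division, returns (quotient, remainder)
    a = a[:]
    q = [0] * (len(a) - len(b) + 1)
    inv = pow(b[-1], P_MOD - 2, P_MOD)
    for i in range(len(q) - 1, -1, -1):
        q[i] = a[i + len(b) - 1] * inv % P_MOD
        for j, bj in enumerate(b):
            a[i + j] = (a[i + j] - q[i] * bj) % P_MOD
    return q, a[:len(b) - 1]

def vanishing(xs):                 # Z_S(t) = prod (t - x_i)
    z = [1]
    for x in xs:
        z = pmul(z, [(-x) % P_MOD, 1])
    return z

def interpolate(xs, ys):           # Lagrange interpolation: Y(x_i) = y_i
    Y = [0]
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = [1], 1
        for j, xj in enumerate(xs):
            if i != j:
                num = pmul(num, [(-xj) % P_MOD, 1])
                den = den * (xi - xj) % P_MOD
        c = yi * pow(den, P_MOD - 2, P_MOD) % P_MOD
        Y = padd(Y, [a * c % P_MOD for a in num])
    return Y

P = [3, 1, 4, 1, 5]                # prover's polynomial
S = [2, 5, 11]                     # opening set
ys = [pe(P, x) for x in S]
Y, Z = interpolate(S, ys), vanishing(S)
Q, rem = pdiv(psub(P, Y), Z)       # (P - Y) is divisible by Z_S ...
assert all(r == 0 for r in rem)    # ... since they agree on all of S

z = 77                             # verifier's random challenge
assert (pe(P, z) - pe(Y, z)) % P_MOD == pe(Q, z) * pe(Z, z) % P_MOD
```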
Let's denote the union of the $S_i$'s as $T$.

1. Prover: send commitments $[P_1]_1, ..., [P_n]_1$ and openings $Y_i(t)$ in the sets $S_i$.
2. Verifier: send a random challenge $\gamma$ for the random linear combination.
3. Prover: compute $P(t) = \underset{i}\sum \gamma^i Z_{T\setminus S_i}(t)(P_i(t) - Y_i(t))$. // notice that we have multiplied the differences by $Z_{T \setminus S_i}$, which makes the whole thing divisible by $Z_T(t)$.<br> Denote $H(t) = \frac{P(t)}{Z_T(t)}$, and send the commitment $W = [H]_1$.
4. Verifier: send a random challenge $z$.
5. Prover: compute $P_z(t) = \underset{i}\sum \gamma^i Z_{T\setminus S_i}(z) (P_i(t) - Y_i(z))$ (we are aiming to do the same thing I've described before, but for now we have evaluated it in $z$ only partially). <br> Define $L(t) = P_z(t) - Z_T(z)H(t)$. It clearly vanishes in $t=z$, so let's send the commitment $W' = [\frac{L(t)}{t-z}]_1$.
6. Verifier: compute $F = \underset{i}\sum \gamma^i Z_{T\setminus S_i}(z) ([P_i]_1 - [Y_i(z)]_1) - Z_T(z) W$, and check: <br> $\langle F, [1]_2\rangle = \langle W', [t-z]_2 \rangle$.

Let's take a deep breath; first of all, to see how this is related to what I've said before, let's imagine we only had a single polynomial and a single set $S$. Then $H$ has the role of $Q$ in our previous naive scheme. Nonetheless, already in this case SHPlonK is a bit different. Let's spell it out:

1. Prover: send commitments to $P$ and $Q = \frac{P(t)-Y(t)}{Z_S(t)}$.
2. Verifier: send $z$.
3. Prover: prepare the proof $W' = [\frac{P(t)-Y(z)-Z_S(z)Q(t)}{t-z}]_1$.
4. Verifier: naturally, check $\langle W' , [t-z]_2 \rangle = \langle [P]_1 - [Y(z)]_1 - Z_S(z)[Q]_1,[1]_2\rangle$.

That's similar to the naive protocol I've described, but we do not open $P$ and $Q$ in $z$ independently. Let's abstract a bit: imagine we had some (committed) polynomials $P_1, ..., P_n$, and low-degree public polynomials $f_1, ..., f_n$, and we wanted to prove that $\underset{i}\sum f_i(t) P_i(t) = 0$.
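The single-polynomial special case can be traced through in the same field-only toy style (my own sketch; with the toxic waste $\tau$ in the clear, the final pairing check becomes one scalar identity). To avoid interpolation I build $P$ backwards from a chosen $Q$ and $Y$, so that $(P - Y)/Z_S = Q$ holds by construction; all numbers are arbitrary.

```python
P_MOD = 101

def pe(c, x):                       # evaluate little-endian coeff list at x
    acc = 0
    for a in reversed(c):
        acc = (acc * x + a) % P_MOD
    return acc

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P_MOD
    return out

def padd(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [(x + y) % P_MOD for x, y in zip(a, b)]

def psub(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [(x - y) % P_MOD for x, y in zip(a, b)]

def pdiv_linear(c, r):              # synthetic division of c(t) - c(r) by (t - r)
    q, carry = [0] * (len(c) - 1), 0
    for i in range(len(c) - 1, 0, -1):
        carry = (c[i] + carry * r) % P_MOD
        q[i - 1] = carry
    return q

S = [3, 7]                          # opening set
Z = pmul([(-3) % P_MOD, 1], [(-7) % P_MOD, 1])   # Z_S(t) = (t-3)(t-7)
Q = [2, 0, 1]                       # pick Q and Y first, then set
Y = [5, 9]                          # P = Z_S*Q + Y, so (P - Y)/Z_S = Q
P = padd(pmul(Z, Q), Y)

z, tau = 19, 42                     # verifier challenge; toxic waste (toy only)

# Step 3: L(t) = P(t) - Y(z) - Z_S(z)*Q(t) vanishes at t = z ...
L = psub(P, padd([pe(Y, z)], [pe(Z, z) * q % P_MOD for q in Q]))
assert pe(L, z) == 0
Wp = pdiv_linear(L, z)              # ... so W' = L(t) / (t - z) is a polynomial

# Step 4, with the group structure stripped away:
lhs = pe(Wp, tau) * (tau - z) % P_MOD
rhs = (pe(P, tau) - pe(Y, z) - pe(Z, z) * pe(Q, tau)) % P_MOD
assert lhs == rhs
```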
Then, we want to do something which (in a different context) is known as Mary Maller's optimization: after the verifier sends us $z$, we just need to prove that $\underset{i}\sum f_i(z) P_i(t)$ vanishes in the point $z$. Using the fact that our $f_i$'s are public and the commitments are homomorphic, we now only need to open $\underset{i}\sum f_i(z) [P_i]_1$, which roughly doubles our efficiency. // Mary Maller's argument generally also deals with the situation where both $f$ and $P$ are private, where it is less overpowered but still reduces the amount of openings we need to do. A good explanation of her argument can be found [here](https://www.cryptologie.net/article/526/maller-optimization-to-reduce-proof-size/).

This is exactly what happens in the general argument, too. We need to check that $\underset{i}\sum \gamma^i Z_{T\setminus S_i}(t) (P_i(t) - Y_i(t)) - H(t)Z_T(t) = 0$. In order to do it, we get a variable $z$ from the verifier and substitute it into all the public polynomials ($Z_T, Z_{T \setminus S_i}, Y_i$), which, conveniently, gives us a linear combination of committed polynomials (which we can just add up using the fact that the commitment is homomorphic) - and that combination can then be opened in $z$ in one go.

The equations provided above should also be optimized in the same fashion as we did for a single opening proof - i.e. rearranged in the form $\langle A, H_0 \rangle = \langle B, H_1 \rangle$ - which is done trivially using the same technique as before. This allows us to batch the multipoint opening arguments in the same fashion as normal opening arguments.

## Consequences

This approach is typically advantageous. Notably, verification is much cheaper provided we have a lot of different rotations of the same polynomial (a "narrow" config). We also do not need a separate column for public inputs, like in normal PlonK - which is a notable difference.
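The core of Maller's trick in the public-$f_i$ case can be shown in a few lines of the same toy field arithmetic (my own illustration; the polynomials are arbitrary, chosen so that $\sum_i f_i(t) P_i(t) = 0$ holds identically): substituting $z$ into the public $f_i$'s collapses the claim into a single opening of one homomorphic combination.

```python
P_MOD = 101

def pe(c, x):                       # evaluate little-endian coeff list at x
    acc = 0
    for a in reversed(c):
        acc = (acc * x + a) % P_MOD
    return acc

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P_MOD
    return out

def padd(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [(x + y) % P_MOD for x, y in zip(a, b)]

f1, f2 = [2, 1], [5, 0, 3]          # public low-degree polynomials
C = [1, 4, 1]                       # helper used to build a valid witness
P1 = pmul(f2, C)                    # committed polynomials chosen so that
P2 = [(-a) % P_MOD for a in pmul(f1, C)]   # f1*P1 + f2*P2 == 0 identically

assert all(a == 0 for a in padd(pmul(f1, P1), pmul(f2, P2)))

z = 33                              # verifier's challenge
# The prover now only opens the single combined polynomial
# G(t) = f1(z)*P1(t) + f2(z)*P2(t) at t = z, showing G(z) = 0:
G = padd([pe(f1, z) * a % P_MOD for a in P1],
         [pe(f2, z) * a % P_MOD for a in P2])
assert pe(G, z) == 0
```

Soundness rests on Schwartz-Zippel: if $\sum_i f_i(t) P_i(t)$ were a nonzero polynomial, it would vanish at a random $z$ only with negligible probability.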
Instead, we can just use the multipoint opening argument to ensure that the values of some polynomials in some cells are constrained to be the public inputs. As far as I understand, [FFlonK](https://eprint.iacr.org/2021/1167) is also an extreme version of a similar idea, though I do not understand it well enough to judge yet.