# BLS Aggregation Multiplicity Problem
Notes on BLS Aggregation Multiplicity Problem discussed at SBC Ethereum/ZK Workshop with Mary Maller, Prof. Dan Boneh, Andrew He, Scott Wu, George (asn), Carl, Dankrad, Pop and many others.
## BLS - Usual Scheme
First, a quick note on BLS signature scheme. BLS is a elegant pairing based signature scheme:
### Keys
Generate a secret key $sk$ and a corresponding public key $pk = [sk]G_1$ where $G_1$ is the generator. Note $sk$ is secured by $DLOG$ hardness.
### Signature
To sign a message $M$, compute $sig = [sk]H(M)$, where $H$ is a hash function.
### Verification
To verify a $sig$, simply check $e(H(m), pk) = e(sig, G_1)$.
### Aggregation
Given two signatures and keys, $pk_1, sk_1, sig_1$ and $pk_2, sk_2, sig_2$ and the same message $M$, notice that
$e(H(M), pk_1 + pk_2) = e(sig_1 + sig_2, G_1)$. So, let $sig_{agg} = sig_1 + sig_2$ and $pk_{agg} = pk_1 + pk_2$. Then, verifying an aggregation is the same as verifying a signature for $pk_{agg}, sig_{agg}$: $e(H(m), pk_{agg}) = e(sig_{agg}, G_1)$. Correctness follows from the standard pairing properties.
Notice this can be extended to aggregating more than 2 signatures directly.
## Setup
Note: N = 4096, M = 16, t = 4 seconds right now?
Currently, there are $N$ validators that BLS sign a block and submit their signature to $M$ selected aggregators. Then, these $M$ parties aggregate all the signatures they’ve received into one aggregate BLS signature, and then, repeat aggregation of all these aggregate signatures with each others’ aggregations and post the summed aggregation on-chain attesting a block’s validity. Ideally, each of the $N$ validator wants their signature to go on the block (for security and potentially even financial rewards in future). Therefore the $N$ validators will usually submit their signature to *many* of the $M$ aggregators, in the hope that at least one of them will honestly include it in the full aggregation.
Additionally, we would like to do this in a distributed network setting within two rounds of t seconds each, the first round for validators to send signatures to the aggregators, and the second to generate a full aggregation. Introducing more rounds or networking may be possible, but is considered out of scope for this document.
<img style="center" height="300" src="https://i.imgur.com/IIQm1hD.jpg" />
For the first layer of aggregation, where a single aggregator simply listens for signatures from many validators, the aggregation setup described in the [section above](#Aggregation) works nicely. The aggregator can allow others to verify the aggregation by submitting an $N$ bit string where the $i$th bit is ON if the signature from the validator at index $i$ is included in the aggregation, and this bit string can be used to assemble a matching $pk_{agg}$ by the verifier. Communicating this aggregation is therefore very cheap, only requiring communication $N$ bits and one $sig_{agg}$ (which consists of $\lceil\log_2(|$field size$|)\rceil$ bits).
However, this communication cost increases when we try to aggregate signatures in the next layer. Since all the aggregators would have assembled aggregations independently, there would be overlapping signatures amongst their aggregations, so simply communicating a bitstring is not enough.
As a particular example, for $N=3$ and $M=2$, and one of the aggregators sees a signature from the first and third validator while the other sees signatures from the first and second, so the bit strings corresponding to their individual aggregations are $(1, 0, 1)$ and $(1, 1, 0)$. Simply following the [aggregation scheme](#Aggregation) described previously, we would end up with $pk_{agg} = 2pk_1 + pk_2 + pk_3$ and $sig_{agg} = 2sig_1 + sig_2 + sig_3$. Notice that the contribution of each individual $sig$/$pk$ is no longer bounded by $1$, so we would need to communicate the entirety of the coefficients for verification (which naively requires $\lceil\log_2(|$field size$|)\rceil \cdot N$ bits for communication).
In short, we are looking to create a signature scheme that supports aggregations such that a same keypair may contribute signatures many times but can be verified in minimum performance/networking overhead.
So, we suggest a secondary scheme that’s simple & fast and can go alongside BLS signatures to support such structure. We additionally show that said scheme can be made succinct by wrapping in a bulletproof.
## Suggested Simple Scheme
Notice that signature from each keypair can be repeated at most $M$ times in the final aggregation (since each of the aggregators may contribute it at most once). For $pk_{agg} = v_1pk_1 + v_2pk_2 + \dots = <v, pk>$, $v_i \in [0, M] \forall i$. Therefore, the vector $v$ can be communicated in $N\cdot\lceil\log_2(M)\rceil$ bits, which is already much better than previous approaches.
## Bulletproofing
First, we construct a bulletproof for a very specific problem: Given a vector of BLS public keys $\vec{pk}$, a bitstring $\vec{b}$ and an aggregate public key $pk_{agg}$, show that there exists some witness vector $\vec{v}$ such that $<\vec{v}, \vec{pk}> = pk_{agg}$ and if $b_i = 1$, then $v_i \neq 0$. Later, we will show that this bulletproof is sufficient for our use case.
### Construction
Inputs: Aggregate public key $pk_{agg}$, bitstring $\vec{b}$, vector of BLS public keys $\vec{pk}$
Witness: The vector of coefficients $\vec{v}$ of the public keys $\vec{pk}$.
Show:
- $<\vec{v}, \vec{pk}> = pk_{agg}$: This is an inner product argument and directly shown doable in $\log_2{N}$ rounds = $2\log_2{N}$ field element communication in the [original bulletproof paper](https://eprint.iacr.org/2017/1066.pdf), where $N$ = size of $\vec{v}$ = number of validators.
- $\vec{v} \cdot \vec{v}^{-1} = \vec{b} \implies \vec{v} \cdot \vec{v}^{p-2} = \vec{b}$ where $p$ is the order of the field (prime). This constrains that if $b_i = 1$, then $v_i \neq 0$.
By including this proof on-chain alongside its inputs, we are able to show the validity of $pk_{agg}$ and $\vec{b}$ without including $\vec{v}$ itself, and this can then be used to verify a corresponding BLS signature from $pk_{agg}$. This bulletproof is quite succinct with only $O(log N)$ field elements necessary for communication and $O(N)$ verification time (which in this case should be fast enough for the size of N). Therefore, with just $O(log N)$ communication/block size overhead, we have solved our original problem. Some napkin math suggested this becomes more efficient to communicate over the [simple scheme above](#Suggested-Simple-Scheme) at $N = 1000$ or so. (TODO: do exact math on bits transmitted and when this is better than the simpler solution mentioned above)
Notice, however, we have not constrained the other direction on $\vec{b}$, if $v_i \neq 0$, then $b_i = 1$ (i.e. the aggregator could have set $b_i = 0$ even if they are actually including a signature from the corresponding validator in $pk_{agg}$). This is _probably_ not necessary since it seems it does not create _new_ attack surface. In particular, the aggregator creating this proof could have simply created an alternate aggregation $sig_{agg}/pk_{agg}$ excluding the validator anyway, so this is probably fine to ignore? (TODO confirm?)
## Dankrad's Problem Extension
On proposing this solution, Dankrad noted an extended version of this problem where it may be useful to have a recursively verifiable version of this scheme. The details of this problem are not fully clear to me (TODO, ask him to detail?) but it seemed to suggest something like: at a later time you want to be able to prove that a certain validator signed (or did not sign) a particular historical block, and having a single commitment in the current block that they can just open to demonstrate this multiproof. Essentially, we were previously solving the problem for a two layer tree and solving the problem at the second layer, but Dankrad wanted a solution for a three layer tree (for both level 2 and 3). If this extended problem is a priority, one fix for this is to use the [simple scheme](#Suggested-Simple-Scheme) at the layer of aggregation into blocks (i.e. level 2), and then use the bulletproof construction at the layer of proving slashing (level 3).
An obvious solution is to verify bulletproofs inside a SNARK, or use another efficient SNARK recursion proving system like [Plonky 2](https://github.com/mir-protocol/plonky2/blob/main/plonky2/plonky2.pdf) or [Nova](https://eprint.iacr.org/2021/370), but this seems generally inefficient and the security proof would need more work.
On further thought, it’s not clear to me why we can’t maintain some sort of sliding window merkle tree of these (signatures, bitstrings) in the blocks and then open these with merkle proofs + BLS verifications. (I’m probably missing something about the extension?)
asn noted another related problem/extension by Vitalik: https://ethresear.ch/t/request-for-cryptographic-primitive-vector-commitment-for-elliptic-curve-points-with-algebraic-properties/9080