# HoneyBadgerMPC and AsynchroMix: Practical Asynchronous MPC and its Application to Anonymous Communication - 2019
###### tags: `Tag(HashCloak - DC Net Readings)`
Author(s): Donghang Lu, Thomas Yurek, Samarth Kulshreshtha, Rahul Govind, Rahul Mahadev, Aniket Kate, Andrew Miller
Paper: [https://eprint.iacr.org/2019/883.pdf](https://eprint.iacr.org/2019/883.pdf)
### Table of Contents
[toc]
:::info
>Abstract: Multiparty computation as a service (MPSaaS) is a promising approach for building privacy-preserving communication systems. However, in this paper, we argue that existing MPC implementations are inadequate for this application as they do not address fairness, let alone robustness. Even a single malicious server can cause the protocol to abort while seeing the output for itself, which in the context of an anonymous communication service would create a vulnerability to censorship and de-anonymization attacks.
> To remedy this we propose a new MPC implementation, HoneyBadgerMPC, that combines a robust online phase with an optimistic offline phase that is efficient enough to run continuously alongside the online phase. We use HoneyBadgerMPC to develop an application case study, called AsynchroMix, that provides an anonymous broadcast functionality.
> AsynchroMix features a novel MPC program that trades off between computation and communication, allowing for low-latency message mixing in varying settings. In a cloud-based distributed benchmark with 100 nodes, we demonstrate mixing a batch of 512 messages in around 20 seconds and up to 4096 messages in around two minutes.
:::
## 2 PRELIMINARIES: MPC BASED ON SHAMIR SECRET SHARING
> Our standard MPC setting involves $n$ parties ${\mathcal{P}_1, . . . , \mathcal{P}_n }$, where up to $t < n/3$ of those can be compromised by a Byzantine adversary.
HoneyBadgerMPC relies on many standard components for MPC \[11, 29, 32, 42\] based on Shamir secret-sharing. Here, we detail the most relevant techniques and notation.
### 2.1 Shamir Secret Sharing and Reconstruction
**Notation:** For prime $p$ and a secret ![](https://i.imgur.com/JYkTd5e.png) denotes Shamir secret sharing [71] (SSS) with threshold $t$ (i.e., a $t$-sharing).
* Specifically, a degree-$t$ polynomial $\phi : \mathbb{F}_p \rightarrow \mathbb{F}_p$ is sampled such that $\phi(0) = s$. The share ![](https://i.imgur.com/VYLjXFw.png) is the evaluation $\phi(i)$. The superscript and/or subscript of a share may be omitted when clear from context.
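The $t$-sharing above can be sketched concretely. The following is a minimal illustration only (the demo prime and the `share`/`reconstruct` names are ours, not from the HoneyBadgerMPC codebase; real deployments use a much larger field):

```python
import random

P = 2**31 - 1  # small Mersenne prime for the demo field F_p

def share(secret, t, n):
    """Sample a random degree-t polynomial phi with phi(0) = secret; share i is phi(i)."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return [sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def reconstruct(points):
    """Lagrange-interpolate phi(0) from t+1 (index, share) pairs."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * -xj % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

t, n = 2, 7
shares = share(42, t, n)
# Any t+1 of the n shares suffice to recover the secret.
assert reconstruct(list(enumerate(shares, 1))[:t + 1]) == 42
```

Any subset of $t$ shares is uniformly random and reveals nothing about the secret, which is what the protocols below rely on.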
**Robust interpolation of polynomials:** Reconstructing a secret $s$ from ![](https://i.imgur.com/Tu0UDuD.png) requires interpolating the polynomial $\phi$ from shares received from other parties.
* Since we want to achieve security against an active Byzantine attacker, up to $t$ of the shares may be erroneous.
* Furthermore, in an asynchronous network, we cannot distinguish a crash fault from an intentional withholding of data and can consequently only expect to receive shares from $n - t$ parties in the worst case.
Figure 1 outlines the standard approach [11, 29, 31, 32] for robust decoding in this setting:
![](https://i.imgur.com/tZZ8PB9.png)
* We optimistically attempt to interpolate a degree-$t$ polynomial $\phi$ from the first $t + 1$ shares received.
* If the resulting $\phi$ coincides with the first $2t$ + 1 shares received, then we know it is correct.
* If the optimistic case fails, we wait for more shares to arrive and attempt to correct errors as they come in.
* In the worst case, we receive $t$ incorrect shares and need to wait for $3t$ + 1 total shares before we can correct $t$ errors and find a degree-$t$ polynomial that coincides with all $2t$ + 1 honest shares.
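The optimistic path above can be sketched as follows. This is an illustration under our own naming (not the paper's Figure 1 code); it only detects inconsistency and falls back, since full error correction requires Reed-Solomon decoding (e.g. Berlekamp-Welch), which is omitted here:

```python
import random

P = 2**31 - 1  # demo prime

def lagrange_eval(points, x):
    """Evaluate the unique degree-(len(points)-1) polynomial through points at x."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def optimistic_decode(received, t):
    """received: list of (index, share) pairs in arrival order.
    Interpolate from the first t+1 shares; accept only if the result
    also matches the first 2t+1 shares received."""
    base = received[:t + 1]
    if all(lagrange_eval(base, xi) == yi for xi, yi in received[:2 * t + 1]):
        return lagrange_eval(base, 0)
    return None  # fall back: wait for more shares and run error correction

t = 2
coeffs = [99] + [random.randrange(P) for _ in range(t)]
shares = [(i, sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P)
          for i in range(1, 8)]
assert optimistic_decode(shares, t) == 99
# A single corrupted share among the first 2t+1 is always detected:
bad = [shares[0], (shares[1][0], (shares[1][1] + 1) % P)] + shares[2:]
assert optimistic_decode(bad, t) is None
```

Detection is deterministic here: two distinct degree-$t$ polynomials agree on at most $t$ points, so a polynomial fitted through a corrupted share cannot also match all $2t + 1$ first arrivals.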
**Batch reconstruction:** We recall an algorithm for the amortized batch public reconstruction (*BatchRecPub*) of $t$-sharings for the $t < n/3$ setting by Damgård and Nielsen [42] in Figure 2.
* The idea is to apply a Vandermonde matrix $M$ to expand the shared secrets ![](https://i.imgur.com/f7mTvd7.png) into a set of sharings ![](https://i.imgur.com/b3NYDy5.png).
* In the first round, each server $\mathcal{P}_j$ locally computes their shares of each ![](https://i.imgur.com/4GZWGGA.png) and sends it to $\mathcal{P}_i$.
* Each $\mathcal{P}_j$ then uses *Robust-Interpolate* to reconstruct a different value $y_j$, and the parties again use *Robust-Interpolate* on the $y_j$ to recover $x_1, ..., x_{t+1}$.
> When defining an MPC program, we use the notation $x_i \leftarrow$ ![](https://i.imgur.com/8LRYEPW.png) for reconstructing an individual share, implicitly making amortized use of the *BatchRecPub* protocol.
![](https://i.imgur.com/CZ25yUg.png)
### 2.2 SSS-Based MPC
Linear combinations of SSS-shared secrets can be computed locally, preserving the degree of secret sharing without any necessary interaction between parties. However, in order to be able to realize an arbitrary arithmetic circuit using MPC, we need a way to multiply secrets together.
* In this work, we use Beaver’s trick to multiply two $t$-sharings ![](https://i.imgur.com/XzFthz4.png) by consuming a preprocessed Beaver triple.
* Beaver triples are correlated $t$-sharings of the form ![](https://i.imgur.com/T2NzIgp.png) for random $a,b \in \mathbb{F}_p$ which can be used to find ![](https://i.imgur.com/5fW2p57.png) by using the following identity:
* ![](https://i.imgur.com/uS6V4VW.png)
* If $a$ and $b$ are random and independent of $x$ and $y$, then ![](https://i.imgur.com/kcvx8HV.png) and ![](https://i.imgur.com/sHsg0Ht.png) do not reveal any information about $x$ or $y$.
* Each multiplication then requires the public opening of $(a − x)$ and $(b − y)$ and the spending of a Beaver triple.
We follow the standard online/offline MPC paradigm, where the online phase assumes it can make use of a buffer of preprocessed values that were created during the offline phase. By utilizing precomputed triples and using *BatchRecPub* to open $(a−x)$ and $(b−y)$ for many multiplication gates at once, we can process many gates at the same circuit depth simultaneously.
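Beaver's trick can be exercised on real shares with a short sketch. Everything below is illustrative (demo field, our own helper names); the identity used, $xy = c - (a-x)b - (b-y)a + (a-x)(b-y)$ with $c = ab$, is the standard one and matches the openings of $(a-x)$ and $(b-y)$ described above:

```python
import random

P = 2**31 - 1
t, n = 2, 7

def share(secret):
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return [sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def reconstruct(shares, k):
    pts = list(enumerate(shares, 1))[:k]
    total = 0
    for xi, yi in pts:
        num, den = 1, 1
        for xj, _ in pts:
            if xj != xi:
                num, den = num * -xj % P, den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

x, y = 1234, 5678
a, b = random.randrange(P), random.randrange(P)
sx, sy = share(x), share(y)
sa, sb, sc = share(a), share(b), share(a * b % P)  # preprocessed Beaver triple

# Publicly open d = a - x and e = b - y (robust/batch reconstructions in the real protocol).
d = reconstruct([(ai - xi) % P for ai, xi in zip(sa, sx)], t + 1)
e = reconstruct([(bi - yi) % P for bi, yi in zip(sb, sy)], t + 1)

# Each party locally computes its share of xy = c - d*b - e*a + d*e.
sxy = [(ci - d * bi - e * ai + d * e) % P for ci, bi, ai in zip(sc, sb, sa)]
assert reconstruct(sxy, t + 1) == x * y % P
```

Note that only the two openings require communication; the final combination is purely local, which is why openings for many gates can be batched via *BatchRecPub*.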
**Offline phase:** In order to fulfill the computational needs of our online phase, we need to generate a steady supply of Beaver Triples offline (prior to when inputs for an MPC circuit are given).
* As the offline phase can be run for an indefinite amount of time, we relax the robustness requirements and focus on more efficient protocols.
* In this way, the offline phase can proceed with less work while still gradually building up a buffer and allowing for guaranteed output in the online phase.
The first step of the offline phase is randomness extraction [11], where secret-shared random values are produced from the contributions of different servers.
* To produce $t$-sharings of random elements of $\mathbb{F}_p$, we apply an $(n,n)$ hyperinvertible matrix $M$ (concretely, a Vandermonde matrix) and compute:
* ![](https://i.imgur.com/zXbO6QB.png)
* Where each ![](https://i.imgur.com/KpL0XHN.png) is contributed by a distinct server $\mathcal{P}_i$, and we output ![](https://i.imgur.com/JCv8rQ2.png)
* The choice of $M$ ensures the ![](https://i.imgur.com/fzoKXTl.png) are random and unknown, despite the influence of up to $t$ corrupt parties.
* To check that the secret sharings are of the correct degree, $2t + 1$ of the servers attempt to reconstruct one column each of ![](https://i.imgur.com/xoEJ0E4.png)
* The hyperinvertibility property of $M$ ensures that if all of the inputs are of the correct degree, then so are all of ![](https://i.imgur.com/uQWO9pE.png)
> Since all $n$ parties must be online to provide input for this process, this cannot guarantee output if any parties crash.
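The key point of randomness extraction is that applying $M$ is a purely local, linear operation on shares. A sketch under our own naming (demo field; the degree-check step with $2t + 1$ reconstructing servers is omitted):

```python
import random

P = 2**31 - 1
t, n = 2, 7

def share(secret):
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return [sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def reconstruct(shares):
    pts = list(enumerate(shares, 1))[:t + 1]
    total = 0
    for xi, yi in pts:
        num, den = 1, 1
        for xj, _ in pts:
            if xj != xi:
                num, den = num * -xj % P, den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

# Each server P_i contributes a t-sharing of its own random value r_i.
r = [random.randrange(P) for _ in range(n)]
contrib = [share(ri) for ri in r]  # contrib[i][j] = P_j's share of r_i

# Vandermonde matrix M (hyperinvertible for distinct evaluation points).
M = [[pow(j + 1, i, P) for j in range(n)] for i in range(n)]

# Applying M row by row is local on shares: no interaction needed.
out_shares = [[sum(M[i][k] * contrib[k][j] for k in range(n)) % P
               for j in range(n)] for i in range(n)]

# Each output row is a valid t-sharing of the matching combination of the r_i.
for i in range(n):
    expected = sum(M[i][k] * r[k] for k in range(n)) % P
    assert reconstruct(out_shares[i]) == expected
```

Because every honest contribution is uniformly random, hyperinvertibility guarantees the outputs attributed to honest randomness stay uniform even when $t$ of the inputs are adversarial.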
To generate Beaver triples, we make use of random double sharings, which are $t$- and $2t$-sharings of the same random value ![](https://i.imgur.com/nDLuAtT.png) and ![](https://i.imgur.com/ZhkkicF.png)
* For this we use *RanDouSha* [11, 42], wherein each server contributes a pair of shares, ![](https://i.imgur.com/8Z9gI4C.png)
* The first $t + 1$ pairs ![](https://i.imgur.com/kWKOhK7.png) after applying $M$ are taken as output, and the remaining $2t + 1$ pairs are reconstructed as a checksum (by one server each).
* All together, this protocol is given in Figure 3.
![](https://i.imgur.com/OEEUJe0.png)
Given the double sharing, we generate a Beaver triple by generating random shares ![](https://i.imgur.com/V8eh88n.png) calculating ![](https://i.imgur.com/5jZNDlc.png) and performing degree reduction:
![](https://i.imgur.com/8M6gzfF.png)
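The degree-reduction step can be sketched end to end. This is an illustration with our own helper names (demo field, $n = 3t + 1$); the double sharing $([\![r]\!]_t, [\![r]\!]_{2t})$ stands in for the output of *RanDouSha*:

```python
import random

P = 2**31 - 1
t, n = 2, 7  # n = 3t + 1

def share(secret, deg):
    coeffs = [secret] + [random.randrange(P) for _ in range(deg)]
    return [sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def reconstruct(shares, deg):
    pts = list(enumerate(shares, 1))[:deg + 1]
    total = 0
    for xi, yi in pts:
        num, den = 1, 1
        for xj, _ in pts:
            if xj != xi:
                num, den = num * -xj % P, den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

a, b, r = (random.randrange(P) for _ in range(3))
sa, sb = share(a, t), share(b, t)
r_t, r_2t = share(r, t), share(r, 2 * t)  # random double sharing (RanDouSha output)

# Local products of t-shares give a degree-2t sharing of ab; mask with [r]_2t and open.
ab_2t = [ai * bi % P for ai, bi in zip(sa, sb)]
D = reconstruct([(pi + ri) % P for pi, ri in zip(ab_2t, r_2t)], 2 * t)  # D = ab + r

# Each party locally derives its t-share of c = ab by subtracting its [r]_t share.
sc = [(D - ri) % P for ri in r_t]
assert reconstruct(sc, t) == a * b % P
```

Opening the masked degree-$2t$ value needs $2t + 1$ correct shares, which is why this offline step makes progress only when all $n$ nodes respond in the asynchronous setting.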
Besides random field elements and multiplication triples, the offline phase is also used to prepare random bits, and $k$ powers of random elements using standard techniques [37].
* In general, we can implement any necessary preprocessing task by combining the above two ingredients.
* The overall cost of the offline phase is summarized by the number of batch reconstructions and the number of random shares needed.
> We summarize the offline costs for our two mixing approaches in Section 5.
### 2.3 Asynchronous Reliable Broadcast and Common Subset
We employ an asynchronous reliable broadcast primitive in order to receive client inputs.
A reliable broadcast (RBC) protocol satisfies the following properties:
* (*Validity*) If the sender (i.e., the client in our case) is correct and inputs *v*, then all correct nodes deliver *v*.
* (*Agreement*) If any two correct servers deliver $v$ and $v′$ , then $v = v′$.
* (*Totality*) If any correct node delivers $v$, then all correct nodes deliver $v$.
In order to reach an agreement on which instances of RBC have terminated, and to initiate each mixing epoch, we rely on an asynchronous common subset protocol [13, 24, 67]. In *CommonSubset*, each server begins with an input $b_i$ (in our application each $b_i$ is a $κ$-bit vector). The protocol outputs an agreed-upon vector of $n$ values that includes the inputs of at least $n −2t$ correct parties, as well as up to $t$ default values.
*CommonSubset* satisfies the following properties:
* (*Validity*) If a correct server outputs a vector $b′$, then $b^′_i = b_i$ for at least $n − 2t$ correct servers.
* (*Agreement*) If a correct server outputs $b′$ , then every server outputs $b′$.
* (*Totality*) All correct servers eventually produce output.
To stick to purely asynchronous primitives, we concretely instantiate *CommonSubset* with the protocol from HoneyBadgerBFT [13, 67];
* BEAT0 \[45\] is similar but offers more efficient cryptographic primitives.
For small messages, the overhead for either protocol grows with $n^2$, although for very large messages they achieve linear overhead.
* If asynchronous liveness is not needed, then any partially synchronous consensus protocol, such as PBFT [26], would suffice here as well.
## 3 ROBUSTNESS IN MPC PROTOCOLS AND IMPLEMENTATIONS
In practice, distributed computing protocols should successfully protect against not just benign failures like system churn, but also network partitions and denial of service attacks.
![](https://i.imgur.com/S6HLg3I.png)
In this section we evaluate the robustness of existing MPC implementations and protocols (summarized in Table 1), and use this evaluation to inform the design of HoneyBadgerMPC and AsynchroMix.
* For the sake of comparison, in this section we assume $n = k$ (where $n$ is the number of servers and $k$ is the number of clients).
**Fairness and Guaranteed Output:** *Fairness* is widely studied in MPC. Roughly speaking, it means that either all parties receive their output, or else none of them do [50]. Unfair protocols allow the adversary to peek at the output of the computation, while the honest parties observe the protocol fail.
*Guaranteed output delivery* is usually considered synonymous with robustness in MPC. It is a stronger notion than fairness that further requires that corrupt parties cannot prevent honest parties from receiving output.
> Unlike fairness, guaranteed output is primarily a concern for liveness rather than safety. A fair protocol that aborts can in principle be restarted with a new set of parties. In any case, the protocols we evaluate satisfy both or neither.
**Asynchronous Safety and Liveness:** MPC protocols that guarantee output typically fall into one of two camps.
1. The first camp is based on (bounded) synchronous broadcast primitives and involves restarting the computation after detecting and eliminating one or more faulty parties.
* Such protocols can be unconditionally secure when $t < n/3$ [7, 11, 12, 42] and using cryptography can reach $t < n/2$ [42, 53].
* Dispute resolution is also used by virtualized protocols that boost a low-resilience outer protocol (i.e., $t < n/8$) to $t < n/2 − ϵ$ [39, 40].
* However, we observe that these protocols rely on the ability to time out nodes that appear to be unresponsive, restarting the computation with the remaining parties.
* If $t$ honest nodes are temporarily partitioned from the network, then any failures among the remaining parties could compromise the safety properties, including confidentiality.
* Using this approach to guarantee output, therefore, leads to an inherent trade-off between the liveness and safety properties—the more faults tolerated for liveness, the fewer tolerated for safety.
* Furthermore, the preference for performance would be to set the timeout parameter low enough to tolerate benign crashes, though this means even shorter duration network partitions weaken the security threshold among the remaining nodes.
> Personal Note: The ADC-net paper seems to solve for this by having other honest nodes fill the spots of the nodes that would be kicked out.
We say a protocol has asynchronous safety if its safety properties hold even when the network is asynchronous and up to $t$ parties are corrupt.
**Communication Overhead:** Communication overhead is a critical factor in how well the network size $n$ can scale. We mainly focus on amortized overhead over suitably large batches of operations.
* An MPC protocol has linear communication overhead if, for a given task, the total communication cost grows with $O(n)$ as a function of the network size $n$.
* In particular, this means that as additional nodes are added, the bandwidth required by each node remains constant.
> Besides communication overhead, we also discuss computation overhead in Section 6.1.
## 4 OVERVIEW OF ASYNCHROMIX AND HONEYBADGERMPC
AsynchroMix is an application of the MPC-System-as-a-Service (MPSaaS) [9] approach to the problem of anonymous broadcast communication.
* We consider a typical client-server setting for anonymous communication networks [44, 59, 77], where clients send their confidential messages to server nodes and the servers mix clients’ messages before making them public.
* As our primary focus is robustness, we model an asynchronous communication network, meaning we make no use of timeouts and do not rely on time-bound parameters being correctly configured.
* The communication network is assumed to be under the adversary’s control such that the adversary may arbitrarily:
* Delay messages.
* Duplicate them.
* Or deliver them out of order.
* For system liveness, we assume that the adversary cannot drop messages between two honest parties.
**System Model:** Assume a set of clients $\mathcal{C} = \{C_j\}_{{j=1}...k_{pop}}$ with input messages $m_j$ , who communicate to a set of $n$ servers, $\{\mathcal{P}_i \}_{{i=1}...n}$.
* We assume that at most $t < n/3$ of the servers are Byzantine corrupted by a global adversary, and similarly, a number of clients are corrupted as well.
* All of the clients and servers are connected over asynchronous channels.
* The messages themselves are a fixed size of $|m|$ bits (or a fixed number of field elements, depending on context).
AsynchroMix proceeds in sequential mixing epochs, where in each epoch we mix input messages provided by $k \leq k_{pop}$ clients.
Fig. 4 offers a high-level overview of the process:
![](https://i.imgur.com/OP1tsyf.png)
The protocol satisfies the following security properties:
* **Anonymity (Safety):** During every mixing epoch, even when all but two of the $k$ selected clients are compromised, the adversary cannot link an included message $m_j$ to its honest client $C_j$ except with probability negligibly better than 1/2.
* Specifically, for input vector $m_1, ...,m_k$ from $k$ clients, the output is a permutation $π(m_1, ...,m_k)$ such that the output permutation is at least almost independent of the input permutation.
* **Availability (Liveness):** Every honest client’s input is eventually included in a mixing epoch, and every mixing epoch eventually terminates.
AsynchroMix is built upon a new MPC prototype, called HoneyBadgerMPC, which realizes secure computation through the use of asynchronous and maliciously-secure primitives.
* In particular, HoneyBadgerMPC makes use of asynchronous reliable broadcast to receive secret shared inputs from untrusted clients, and asynchronous common subset to reach agreement on the subset of clients whose inputs are ready and should be mixed in the next epoch.
* Each mixing epoch involves a standard robust MPC online phase based on Beaver triples and batched public reconstruction [11].
* The offline phase [9, 11] runs continuously to replenish a buffer of preprocessing elements used by the online phase.
* The offline phase is optimistic in the sense that all server nodes must be online and functioning to replenish the buffer.
### 4.1 Receiving Client Inputs using Preprocessing and Asynchronous Broadcast
Since clients are untrusted, we need a way to receive secret shared inputs while guaranteeing that the inputs are valid, consistent, and available at every server node. We make use of a preprocessing approach due to Choudhury et al. [30].
* The idea is that for each input $m$ from client $C$, we consume a preprocessed random share ![](https://i.imgur.com/FtkCeCD.png) which was generated in the offline phase and privately reconstructed to $C$ (i.e., each server node sends their share of ![](https://i.imgur.com/CjkE6LJ.png) to C, who robustly interpolates $r$).
* The client then blinds its message $\overline{m} := m + r$ and broadcasts the blinded message $\overline{m}$
> ((1) in Figure 4).
* The servers then each locally compute their share ![](https://i.imgur.com/v4QXTJ5.png) without leaking any information about $m$.
* To broadcast the blinded message $\overline{m}$, we make use of the asynchronous broadcast protocol *ReliableBroadcast*, which guarantees, roughly, that if any server receives $\overline{m}$, then every correct server also receives it.
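The blinding step in Section 4.1 is simple enough to sketch directly. This illustration (demo field, our own helper names) shows why the servers learn nothing about $m$ while still ending up with valid $t$-shares of it:

```python
import random

P = 2**31 - 1
t, n = 2, 7

def share(secret):
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return [sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def reconstruct(shares):
    pts = list(enumerate(shares, 1))[:t + 1]
    total = 0
    for xi, yi in pts:
        num, den = 1, 1
        for xj, _ in pts:
            if xj != xi:
                num, den = num * -xj % P, den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

m = 31337                   # the client's message
r = random.randrange(P)
r_shares = share(r)         # preprocessed [r], privately opened to the client

# The client robustly interpolates r, blinds its message, and reliably broadcasts it.
m_bar = (m + r) % P

# Each server derives its share of m locally from the public m_bar: [m] = m_bar - [r].
m_shares = [(m_bar - ri) % P for ri in r_shares]
assert reconstruct(m_shares) == m
```

Since $r$ is uniform and unknown to the servers, the broadcast value $\overline{m} = m + r$ is itself uniform and leaks nothing about $m$.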
### 4.2 Asynchronous Mixing Epochs
Each mixing epoch begins when servers have received inputs from enough clients. Servers must reach agreement on a subset of $k$ client inputs [2, 45, 67] which are deemed to be available for processing.
* Every epoch, this agreement is made using the asynchronous broadcast primitive *CommonSubset* [13].
* At the beginning of *CommonSubset*, each server inputs its view of which client inputs are available for mixing.
* For honest servers, this will be the set of inputs for which a value has been received via *ReliableBroadcast*.
* The output of *CommonSubset* will be a set of $k$ available inputs that will be used in the next mixing epoch.
### 4.3 Robust Online Phase
Once the inputs to a mixing epoch are determined, the mixing proceeds as an online phase of MPC, running one of two programs, *power-mix* or *iterated-butterfly*.
* The online phase itself is standard, based on Beaver triples [10], and only requires batch reconstruction of $t$-sharings, which in the $t < n/3$ setting we can achieve through Reed-Solomon decoding [11, 42].
### 4.4 Continuously Running Offline Phase
Since AsynchroMix is a continuously running service, the offline phase could be run concurrently to replenish a buffer of preprocessing values.
* Here latency is not critical, although it should ideally be efficient enough to keep up with the demand from the online phase.
* Our offline phase is an implementation of BH08 [11], the same as used in HyperMPC.
* It is based on decoding $2t$-sharings and therefore makes progress only when all $n$ nodes are responsive.
* As mentioned in Section 3, we consider it reasonable to use a non-robust protocol for the offline phase which runs ahead of time in order to provide a reserve buffer of preprocessed values.
* If one or more nodes fail, eventually the reserve will be depleted and clients will have to move to a new instance of the service.
![](https://i.imgur.com/524jutb.png)
### 4.5 Security Analysis of AsynchroMix
THEOREM 4.1: Assuming that sufficient preprocessing elements are available from a previously-completed offline phase, then the AsynchroMix protocol defined in Figure 5 satisfies the anonymity and availability properties defined earlier.
PROOF: For anonymity, it is clear that each mixing epoch only proceeds with $k$ inputs from different clients.
* The use of preprocessed random sharings ensures that the secret shared inputs depend only on broadcast values from clients, and hence are valid sharings.
* The PowerMix program, thanks to perfect symmetry in its equation format, outputs the $k$ values in a canonical ordering that depends only on their values, *not* their input permutation order.
* The SwitchingNetwork induces a random permutation, which is sampled from a nearly uniform distribution.
For availability, we need to show that:
1. Each honest client’s input is eventually included in a mixing epoch.
* Notice that once a broadcast $m_j$ from client $C_j$ is received by every honest server, then the corresponding bits $b_{i,j}$ in the next epoch will be set for every honest server.
* Therefore $m_j$ is guaranteed to be included in the next mixing epoch.
2. Each mixing epoch completes robustly.
* Notice that if at least $t + 1$ of the bits $b_{·,j}$ are set for $C_j$, then we know at least one honest server has received the client’s broadcast, and hence by the agreement property of *ReliableBroadcast* we can rely on this input to be available to every honest server.
### 4.6 Comparing AsynchroMix with Other Strong Anonymity Solutions
We observe that most anonymous communication systems do not focus on robustness and thus cannot achieve strong availability guarantees in the presence of faults.
Our approach to MPC mixing is closely related to MCMix [4], which implements an anonymous messaging system based on MPC. Instead of a switching network, they associate each message with a random tag and obliviously sort the tags using MPC comparison operations.
## 5 MPC PROGRAMS FOR MESSAGE MIXING
Once the inputs are selected, ![](https://i.imgur.com/QVKQUT8.png) each asynchronous mixing epoch consists of an online MPC phase, computing either the Iterated Switching Network or PowerMix MPC programs.
1. The first approach is based on an iterated butterfly switching network [35] which yields an almost-ideal random permutation of inputs.
* Each switch uses a secret-shared random bit from the offline phase and a single MPC multiplication.
* Overall this method requires $O(\log^2 k)$ asynchronous rounds.
* The communication and computation costs per server are both $O(n \log^2 k)$ per input.
2. As an alternative to the switching network, we present a constant-round protocol called PowerMix, based on Newton’s sums.
* To mix a batch of $k$ messages ![](https://i.imgur.com/U2DB05b.png) the servers first compute the powers ![](https://i.imgur.com/IjX2rn6.png) where $i$, $j$ range from 1 to $k$.
* We then locally compute the sums of each power, ![](https://i.imgur.com/cgCZVSY.png) and publicly reconstruct each $S_i$.
* Finally, we solve for the set of $m_i$ using Newton’s identities.
Ordinarily, computing ![](https://i.imgur.com/SE7zpfW.png) using Beaver multiplication would require at least $O(\log k)$ rounds of communication.
* However, in PowerMix we use a novel way to trade-off communication for computation, generating all the powers in a single round of communication by using some precomputed powers of the form ![](https://i.imgur.com/FR1e4vn.png)
* As a result, PowerMix only requires two rounds of communication to finish mixing.
### 5.1 Option I: Switching Network
Our first approach is to use an MPC program to randomly permute a set of $k$ secret shared values using a switching network.
* Switching networks are implemented in layers, where each layer applies a permutation to the inputs by conditionally swapping each pair.
* However, the resulting permutations are biased [1, 68].
* For example, while a random Benes network can express every possible permutation, some permutations are more likely than others.
* Czumaj and Vöcking showed that $O(\log k)$ iterations of random butterfly networks (each of which consists of $O(\log k)$ layers) provide adequate shuffling [35] in the sense that the combined permutation is nearly uniform.
* The round complexity of the switching network is $O(\log^2 k)$, and the overall communication cost is $O(kn \log^2 k)$, considering there are $O(\log^2 k)$ layers in total and $O(k)$ multiplications are needed in each layer.
* Computation cost is likewise $O(kn \log^2 k)$, since $O(k \log^2 k)$ multiplications are needed in total.
* (See Figure 6 for a secure switching network instantiation with standard MPC operations.)
![](https://i.imgur.com/8VvYH02.png)
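The basic building block of the network, a single switch, can be written with one multiplication. Sketched here in the clear (function name ours); in the MPC program $b$, $x$, $y$ are $t$-shared and $b \cdot (y - x)$ costs one Beaver triple:

```python
P = 2**31 - 1

def switch(x, y, b):
    """Conditional swap with a single multiplication:
    b = 0 keeps (x, y), b = 1 outputs (y, x)."""
    u = b * (y - x) % P
    return (x + u) % P, (y - u) % P

assert switch(10, 20, 0) == (10, 20)
assert switch(10, 20, 1) == (20, 10)
```

Because the selector bit $b$ is a preprocessed secret-shared random bit, neither the servers nor an observer learns which branch was taken at any layer.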
### 5.2 Option II: PowerMix
To contrast with the switching network, we propose a novel protocol PowerMix, which results in reduced communication at the cost of computation. Our approach follows two steps:
1. We compute the $k$ powers of each shared secret,![](https://i.imgur.com/Swl2BVD.png) from just ![](https://i.imgur.com/A7eCF90.png)
* Surprisingly, we show how to achieve this using only $O(1)$ communication per shared secret; our protocol for computing powers may be of independent interest.
2. Use Newton’s Identities [63] to solve a system of equations of the form $S_i =m^i_1 + ... + m^i_k$.
* The servers can obtain $S_i$ by computing locally ![](https://i.imgur.com/txb4QBP.png) and publicly reconstructing.
* Then we solve the system of equations to obtain $\{m^′_i \}$ in canonical ordering.
**Computing powers with constant communication:**
For each secret share ![](https://i.imgur.com/hwHaydS.png) sent by clients, we need to compute ![](https://i.imgur.com/IQBFbkR.png)
* The naïve way is to directly use Beaver triples $k − 1$ times.
* If we cared only about round complexity, we could also use the constant-round unbounded fan-in multiplication [36], though it adds a 3x factor of additional work.
* In either case, we’d need to reconstruct $O(k)$ elements in total.
* We instead make use of a preprocessing step to compute all of ![](https://i.imgur.com/GmgCCxA.png) by publicly reconstructing only a single element.
* Our approach makes use of precomputed powers of a random element, ![](https://i.imgur.com/dTzYan9.png) obtained from the preprocessing phase.
* We start with the standard factoring rule:
* ![](https://i.imgur.com/GZYUURV.png)
* Taking $C = (m − r)$, and annotating with secret share brackets, we can obtain an expression for any term ![](https://i.imgur.com/YuV6You.png)
as a sum of monomials of smaller degree:
* ![](https://i.imgur.com/U2MoSc1.png)
![](https://i.imgur.com/PyokAYb.png)
Based on (1), in Figure 7 we give pseudocode for an efficient algorithm to output all the powers ![](https://i.imgur.com/YO1CUFZ.png) by memoizing the terms ![](https://i.imgur.com/bBg09TH.png)
* The algorithm requires a total of $k^2/2$ multiplications and $k^2$ additions in the field.
* The memory requirement for the table can be reduced to $O(k)$ by noticing that when we compute ![](https://i.imgur.com/47vr2HB.png)
* We only need monomials of degree $i + j − 1$, so we can forget the terms of lower degree.
* Table 2 summarizes the asymptotic communication and computation costs of each approach.
![](https://i.imgur.com/I6eMhzd.png)
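The paper's algorithm (Figure 7) memoizes monomials of the factoring rule. An equivalent way to see why a single public opening suffices is the binomial expansion $m^j = ((m - r) + r)^j$: once $c = m - r$ is public, every $m^j$ is a *linear* combination of the precomputed powers $r, r^2, \ldots$, so each server can apply the same formula locally to its shares. A sketch over the underlying values (our own formulation, not the paper's Figure 7 code):

```python
import random
from math import comb

P = 2**31 - 1
k = 8

m = random.randrange(P)
r = random.randrange(P)
r_pows = [pow(r, i, P) for i in range(k + 1)]  # preprocessed [r^i] from the offline phase
c = (m - r) % P                                # the single publicly opened value

# m^j = sum_{i=0}^{j} C(j, i) * c^(j-i) * r^i: linear in the r^i, hence share-local.
m_pows = [sum(comb(j, i) * pow(c, j - i, P) * r_pows[i] for i in range(j + 1)) % P
          for j in range(k + 1)]

assert all(m_pows[j] == pow(m, j, P) for j in range(k + 1))
```

Figure 7's memoized recurrence computes the same quantities with $k^2/2$ field multiplications while avoiding explicit binomial coefficients.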
**Solving Newton’s Identity:** We now discuss how to reconstruct the shuffled values from the power sums.
* We have $S_j = \sum^k_{i=1} m^j_i$ where $m_i$ is the message provided by client $C_i$.
* So we require an algorithm to extract the message $m_i$ from $S_i$.
* Assuming that our goal is to mix $k$ messages $m_1,m_2,m_3, . . . ,m_k$, the servers first run the algorithm of Figure 7 to compute the appropriate powers.
* Then all servers calculate ![](https://i.imgur.com/ADLAjS5.png) locally and then publicly reconstruct each $S_j$.
* Let $f (x) = a_kx^k +a_{k−1}x^{k−1}+. . .+a_1x +a_0$ be a polynomial such that $f (x) = 0$ has roots $m_1,m_2,m_3, . . . ,m_k$.
* And we have $a_k = 1$ given that it is the coefficient of $x^k$ resulting from the product of $(x −m_1)(x −m_2) . . . (x −m_k)$.
* According to Newton’s identities [70], we can calculate all coefficients of $f (x)$ by:
* ![](https://i.imgur.com/QDvbmuw.png)
* Knowing $S_i$ we can recover all $a_i$ by solving these equations one by one.
* Once we know the coefficients of $f (x)$ we can then find $k$ roots of $f (x) = 0$ with $O(k^2)$ computation complexity in our implementation [20].
* Then we recover all $m_i$.
* Our final mixing set consists of these $k$ messages.
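The coefficient-recovery step can be sketched directly from the recurrence above (variable names ours; the final root-finding step is omitted, only verified):

```python
P = 2**31 - 1
msgs = [12, 99, 5, 1234]   # hypothetical client messages
k = len(msgs)

# Publicly reconstructed power sums S_j = sum_i m_i^j for j = 1..k.
S = [sum(pow(m, j, P) for m in msgs) % P for j in range(1, k + 1)]

# Newton's identities: i * e_i = sum_{j=1}^{i} (-1)^(j-1) * e_{i-j} * S_j,
# where e_i is the i-th elementary symmetric polynomial of the messages.
e = [1] + [0] * k
for i in range(1, k + 1):
    acc = sum((-1) ** (j - 1) * e[i - j] * S[j - 1] for j in range(1, i + 1))
    e[i] = acc * pow(i, P - 2, P) % P

# f(x) = prod (x - m_i) = sum_{i=0}^{k} (-1)^i * e_i * x^(k-i); every message is a root.
def f(x):
    return sum((-1) ** i * e[i] * pow(x, k - i, P) for i in range(k + 1)) % P

assert all(f(m) == 0 for m in msgs)
```

Solving the equations one by one in this way is $O(k^2)$ field operations; the roots of $f$ then yield the messages in a canonical order independent of the input permutation.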
To conclude, Figure 8 shows the overall PowerMix protocol.
![](https://i.imgur.com/pHCgWFY.png)
### 5.3 AsynchroMix Offline Phase Requirements
The offline phase supporting AsynchroMix needs to be able to generate the requisite preprocessing elements for both converting client inputs into secret sharings and for realizing either mixing program.
* Of these, handling client inputs is the most straightforward as it only requires generating a $t$-shared random value for each input.
* For simplicity, we note that the randomness extraction protocol is just *RanDouSha*, but with only one matrix operation performed and with half the number of inputs and outputs.
* We therefore write randomness extraction as simply half of a call to *RanDouSha*.
Running our mixing programs requires additional preprocessing inputs.
* The Switching-Network program requires the generation of random selector bits as well as the Beaver triples needed to use them.
* Meanwhile, our PowerMix program needs $k$ secret-shared powers of the same random value.
* These preprocessing costs are given in terms of invocations of *RanDouSha* and *BatchRecPub* in Table 3.
![](https://i.imgur.com/HFGF56u.png)
### 5.4 Supporting Larger Messages
We have so far assumed that each client message consists of a single 32-byte field element, but AsynchroMix can easily be adapted to support larger (fixed-size) messages of multiple field elements each.
* Since the switching network choices depend only on the preprocessed selection bits, we can simply apply the same selection bits to each portion of input (i.e., the 1st element of clients’ messages are permuted in the same way as the 2nd element, and so on).
* For PowerMix, we could reserve a portion of each message element (e.g., $κ$ = 40 bits) to use as a tag which would be used to link parts of a message together.
* Since no information about mixing inputs is leaked until the mix is opened, tags will not collide except with probability $2^{−κ}$.
## 6 IMPLEMENTATION AND EVALUATION
Our prototype is written primarily in Python 3, with computation modules written in C++ using NTL [72] ([https://github.com/initc3/HoneyBadgerMPC](https://github.com/initc3/HoneyBadgerMPC)).
* For batch computations on secret sharings, both the FFT-based and matrix-based algorithms are implemented in C++ using the NTL library.
* We carried out a distributed benchmarking experiment with several aims:
* To validate our analysis.
* To demonstrate the practicality of our approach.
* And to identify bottlenecks to guide future improvement.
* We are mainly concerned with two performance characteristics: cost and latency.
* Latency is the user-facing cost: the time from when the user initiates a message to when the message is published.
* Computation and bandwidth costs are a complementary metric since we can improve latency by adding more resources, up to the extent that sequential computations and communication round trips are unavoidable.
We are mainly interested in configurations that vary the mix size $k$ as well as the number of servers $n$ (assuming $n ≈ 3t + 1$). We evaluated not only the *online phase* of the MPC protocols but also the *offline phase*, which generates precomputed Beaver triples, powers, and bits.
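As a reminder of why the offline phase matters, a minimal toy sketch (ours, with illustrative parameters and no fault tolerance) of how a preprocessed Beaver triple $([a],[b],[c])$ with $c = ab$ reduces multiplication of Shamir sharings to linear share operations plus two public openings:

```python
# Hedged toy sketch of Beaver-triple multiplication over Shamir sharings.
# Parameters are illustrative only; the real protocol tolerates t faults.
import random

P = 2**31 - 1   # field modulus (illustrative choice, not the paper's field)
N, T = 7, 2     # n = 7 parties, threshold t = 2 (n >= 3t + 1)

def share(secret):
    """Shamir-share `secret` via a random degree-T polynomial f with f(0) = secret."""
    coeffs = [secret] + [random.randrange(P) for _ in range(T)]
    return [sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P
            for i in range(1, N + 1)]

def reconstruct(shares):
    """Lagrange-interpolate f(0) from the first T+1 shares (fault-free case)."""
    pts = list(enumerate(shares[:T + 1], start=1))
    total = 0
    for xi, yi in pts:
        num = den = 1
        for xj, _ in pts:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def beaver_mul(xs, ys, triple):
    """Sharing of x*y from sharings xs, ys and a preprocessed triple ([a],[b],[ab])."""
    sa, sb, sc = triple
    d = reconstruct([(x - a) % P for x, a in zip(xs, sa)])  # open x - a
    e = reconstruct([(y - b) % P for y, b in zip(ys, sb)])  # open y - b
    # [xy] = [c] + d*[b] + e*[a] + d*e  (the public constant is added to every share)
    return [(c + d * b + e * a + d * e) % P for a, b, c in zip(sa, sb, sc)]

a, b = random.randrange(P), random.randrange(P)
triple = (share(a), share(b), share(a * b % P))
print(reconstruct(beaver_mul(share(12), share(34), triple)))  # 408
```

The identity behind the last line is $xy = c + (x-a)b + (y-b)a + (x-a)(y-b)$, which is why only the two openings $d$ and $e$ leak nothing about $x$ or $y$ (they are one-time-padded by $a$ and $b$).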
### 6.1 Microbenchmarks for Robust Reconstruction
**Evaluating FFT-based and matrix-based decoding:** For the switching network, the main cost in the online phase is batch reconstruction. We implemented two variations of the batch reconstruction operation:
1. One based on matrix multiplication (superlinear time),
    * as in HyperMPC [9] and others.
2. An alternative based on the FFT (quasilinear time).
    * A function $f(n)$ is quasilinear if $f = O(n \log^c n)$ for some constant $c$.
The use of FFT-based methods has been suggested by Damgård et al. [42], but to our knowledge it has not been included in any MPC implementation.
* Clearly for some large enough value of $n$, FFT-based methods would lead to a performance improvement, but we want to determine if it could provide benefits for the network sizes in our experiments.
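To illustrate the asymptotic gap (with a toy example of ours, unrelated to the paper's NTL-based implementation), a radix-2 number-theoretic transform, the finite-field analogue of the FFT, evaluates a degree-$<n$ polynomial at all $n$ powers of a root of unity in $O(n \log n)$ field operations, versus $O(n^2)$ for the naive matrix-vector product:

```python
def ntt(coeffs, omega, p):
    """Radix-2 Cooley-Tukey NTT: evaluate the polynomial with `coeffs` at
    1, omega, omega^2, ..., omega^(n-1) mod p in O(n log n) operations."""
    n = len(coeffs)
    if n == 1:
        return coeffs[:]
    even = ntt(coeffs[0::2], omega * omega % p, p)
    odd = ntt(coeffs[1::2], omega * omega % p, p)
    out = [0] * n
    w = 1
    for i in range(n // 2):
        t = w * odd[i] % p
        out[i] = (even[i] + t) % p            # f(w^i) = E(w^{2i}) + w^i * O(w^{2i})
        out[i + n // 2] = (even[i] - t) % p   # uses omega^(n/2) = -1 mod p
        w = w * omega % p
    return out

def naive_eval(coeffs, omega, p):
    """O(n^2) matrix-style evaluation at the same points."""
    n = len(coeffs)
    return [sum(c * pow(omega, i * j, p) for j, c in enumerate(coeffs)) % p
            for i in range(n)]

# toy field: p = 17, omega = 9 is a primitive 8th root of unity (9^4 = -1 mod 17)
coeffs = [3, 1, 4, 1, 5, 9, 2, 6]
assert ntt(coeffs, 9, 17) == naive_eval(coeffs, 9, 17)
```

The constant factors differ greatly in practice, which is exactly why the crossover point in Figure 9 matters for realistic network sizes.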
![](https://i.imgur.com/az64RE2.png)
Figure 9 shows the results of microbenchmarks for a single-core C++ implementation of the reconstruction algorithms, using a single *t2.medium* node for a series of 144 batch reconstructions of 4096 shares each, corresponding to a run of the switching network program for mixing $k$ = 4096 client messages. The primary crossover point is at around $n$ = 2048.
***For network sizes of $n$ = 2048 and larger, FFT-based methods offer a significant (greater than 2×) improvement.***
### 6.2 Distributed Experiment for AsynchroMix
To evaluate the performance of AsynchroMix and identify the tradeoffs and bottlenecks involved in our two mixing approaches, we deployed our prototype on clusters of AWS *t2.medium* instances (2 cores and 4GB RAM) in 10 regions across 5 continents.
* We ran three trials for each configuration of $n$ and $k$, and recorded the bandwidth and running times.
**Online Phase for PowerMix:** Figure 10 (solid lines) shows the running time for PowerMix to mix and open from k = 64 to k = 1024 messages on up to n = 100 server nodes.
* It takes around 5 seconds to mix $k$ = 256 messages on $n$ = 100 servers
* And around 130 seconds to mix $k$ = 1024 messages.
* We can see that PowerMix is largely insensitive to $n$, since the bottleneck is the computational cost, which depends mostly on $k$.
![](https://i.imgur.com/VWi9LLS.png)
Figure 11 shows the communication cost of PowerMix, measured as outgoing bytes sent by each server, amortized per each client input.
* Since PowerMix requires two batch reconstructions of $k$ shares each, and BatchRecPub has a linear asymptotic communication overhead to open a linear number of shares, we expect the per-server per-share cost to reach a constant for large enough $n$ and $k$.
![](https://i.imgur.com/8RLDnRZ.png)
**Online Phase for Switching Network:** Figure 10 (dashed lines) shows the running time for Switching Network to mix from $k$ = 64 to 4096 messages.
* We can shuffle $k$ = 4096 messages on $n$ = 100 servers in around 2 minutes.
* Since the number of batch reconstruction rounds grows with $\log^2 k$, the sensitivity to $n$ also increases as $k$ increases.
Based on the microbenchmarks (Figure 9), at $k$ = 4096 and $n$ = 100, the inherent computation time should account for only about 3 seconds out of the total 120 seconds observed.
* The rest is due to a combination of serialization and Python overhead as well as communication.
Figure 12 shows the overall communication cost of the Switching Network.
* For $k$ = 4096 client inputs with $n$ = 100 servers, each input requires each server to transmit nearly 30 kilobytes.
* The dashed line here is $y = 32 \cdot 6 \cdot \log^2 k$, where 6 is the reconstruction overhead and $\log^2 k$ corresponds to the total number of rounds.
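Plugging $k$ = 4096 into that dashed-line formula reproduces the per-server figure quoted above (a quick sanity check we add here, not from the paper's code):

```python
import math

k = 4096
rounds = math.log2(k) ** 2    # log^2 k rounds of batch reconstruction = 144
per_input = 32 * 6 * rounds   # 32-byte field elements x 6x reconstruction overhead
print(per_input)              # 27648.0 bytes, i.e. ~27 KB per server per input
```

This matches the "nearly 30 kilobytes" per input per server reported for $k$ = 4096 and $n$ = 100.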
![](https://i.imgur.com/CHAUiUY.png)
At this setting, computation and communication contribute about equally (neither is the sole bottleneck), although there appears to be considerable room to eliminate overhead due to serialization and Python function calls in our implementation.
**Tradeoffs between PowerMix and Switching Network:**
In the online phase, PowerMix requires considerably more computation but less communication than Switching Network.
* Given the resources available to our *t2.medium* instances, PowerMix yields more than a 2× reduction in overall latency at $n$ = 100 for up to $k$ = 512 clients, but for larger values of $k$, Switching Network is preferable.
* PowerMix would naturally be useful for larger values of $k$ in more bandwidth-constrained or computationally-powerful networks.
**Overall cost for AsynchroMix:** Figures 13 and 14 show the estimated overall cost, per server and per client input, combining both computation (\$0.05 per core hour for an EC2 node) and bandwidth (\$0.02 per gigabyte transferred out) costs based on AWS prices.
![](https://i.imgur.com/DvKgKKO.png)
![](https://i.imgur.com/F0luuIJ.png)
The stacked bar charts show the costs broken down by phases (offline, online, and client input).
* The offline phase contributions are based on a distributed experiment for the *RanDouSha* algorithm, multiplied out by the necessary number of preprocessing ingredients of each type (see Table 3).
* The offline cost of PowerMix is always more expensive than that of Switching Network at the same setting, and the difference grows with the number of clients ($k$ versus $\log^2 k$).
**Using Switching Network, at $n$ = 100 and $k$ = 4096, the overall cost (including all 100 servers) is 0.08 cents per message using geographically distributed *t2.medium* instances**
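A back-of-envelope check we add here (ours, using only the per-server figures quoted in this section) shows the online phase alone accounts for a fraction of that 0.08 cents; the assumed inputs are marked in the comments:

```python
# Rough online-phase cost per message at n = 100, k = 4096 (Switching Network).
SERVERS = 100
K = 4096                  # messages per batch
RUNTIME_S = 120           # observed online running time (Figure 10)
CORES = 2                 # t2.medium has 2 cores (assumed fully billed)
BYTES_PER_INPUT = 30_000  # ~30 KB sent per server per input (Figure 12)

CORE_HOUR_USD = 0.05      # EC2 compute price quoted above
GB_OUT_USD = 0.02         # AWS egress price quoted above

compute = SERVERS * CORES * (RUNTIME_S / 3600) / K * CORE_HOUR_USD  # USD/message
bandwidth = SERVERS * BYTES_PER_INPUT / 1e9 * GB_OUT_USD            # USD/message
print(f"online cost ~ {100 * (compute + bandwidth):.4f} cents per message")
# -> online cost ~ 0.0141 cents per message
```

The remaining share of the reported 0.08 cents is attributable to the offline and client-input phases shown in the stacked bar charts.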
## 7 CONCLUDING REMARKS
Emerging Internet-scale applications such as blockchains and cryptocurrencies demand a robust anonymous communication service offering strong security guarantees.
* Along the way towards building a robust anonymous communication service on top of MPC, we have highlighted robustness as a first-class concern for practical MPC implementations.
* We have shown through our AsynchroMix application case study that robust MPC can be practical.
AsynchroMix features a novel MPC program for anonymous broadcast that trades off local computation for reduced communication latency, allowing for low-latency message mixing in varying settings.
* Through an extensive experimental evaluation, we demonstrate that our approach not only leverages the computation and communication infrastructure available for MPC but also offers directions towards further reducing the latency overhead.
* In the future, our effort should motivate other MPC implementations to consider robustness as well as a computation vs communication trade-off.
> Personal Notes: I suppose the questions would be:
> * If the staking rewards for a PoS network would be worth those costs?
> * If the system would be low-latency enough for PoS networks in production now (Maybe the idea of computation vs communication is enough)?
> * Message size seems to always be an issue; what will the final message size be that ETH2.0 Validators will have to validate? It seems like this plays a large part in which of these systems is viable.