--- tags: spec --- This document is a technical description of Web3Auth's MPC infrastructure and its interactions with applications. Its a high-level document, Web3Auth reserves the right to make changes to this document as needed. It is part of a set of documents that describes Web3Auth, others that maybe helpful include: - [Web3Auth User Key Architecture](https://hackmd.io/@torus/Hyv8HjO8i) - [MPC Infra Architecture Questionaire](https://hackmd.io/@torus/S1ZaIETic) ## Introduction With the growth of cryptographic applications, key management has become increasingly important. Mismanagement of private keys often results in considerable financial loss, and as usage of private keys become increasingly mainstream key loss and misuse will become even more prevalent and costly. Many existing non-custodial key management solutions present users with trade-offs between security, convenience, and redundancy. For example, backing up your key increases redundancy during key loss, but reduces security if that backup is stolen. Multisig wallets provide a balance of all three, but are more costly to deploy, are less private, and are not cross-chain. The Web3Auth is non-custodial key management infrastructure powered by MPC that acts as an multi-factor account that solves these problems. In order to achieve its goals of serving the global Web3 market, the network needs to meet the following design goals: - Seamless non-custodial key management user experiences - Compatible with existing authentication methods and blockchain ecosystems - Performant globally, and scalable as necessary ## Definitions of utilized schemes and concepts We provide definitions of concepts and schemes which we use later on. ### Shamir Secret Sharing (SSS) and Verifiable Secret Sharing (VSS) Shamir's secret sharing scheme is a polynomial threshold $(t,n)$ secret sharing scheme where a dealer divides a secret into $n$ multiple shares and each participant is given a share by evaluating a polynomial of order $t$. To reconstruct the secret, $t+1$ shares are required. Verifiable secret sharing refers to Shamir secret sharing schemes where even in the presence of malicious dealers, there is a well-defined secret that the participants can later reconstruct. SSS is a base fundemental in a lot of MPC cryptography and in Web3Auth's infrastucture. More can be read [here](https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing). ### Distributed Key Generation (DKG) Distributed key generation was introduced by Pedersen, and the key idea involves using $n$ parallel runs of SSS or VSS to ensure that there is no dealer who knows the secret. We generally use a varient of [Async Verifiable Secret Sharing](https://eprint.iacr.org/2002/134.pdf) in our DKG. ### Proactive Secret Sharing (PSS) Proactive secret sharing allows participants to "refresh" shares, so that all participants receive new shares, but the secret remains unchanged. This allows the secret sharing to be secure against mobile adversaries who may be able to compromise all participants over the lifetime of the secret (eg. adversary hacks a random participant's server every month). We refer the user to a [proactive secret sharing scheme](https://eprint.iacr.org/2002/134.pdf) that supports dynamic sets of participants, which we use for share refresh. ### Threshold Signature Schemes (TSS) Threshold signature schemes refer to signing schemes that allow a qualified set of parties involved in a secret sharing scheme to generate a signature on a message without reconstructing the private key. The most well-known signature scheme is [GG19](https://eprint.iacr.org/2019/114.pdf) and an implementation can be found [here](https://github.com/ZenGo-X/multi-party-ecdsa). In particular Web3Auth supports and utilizes GG19, GG20, its EDDSA variants, and DKLS19. | :memo: More on cryptographic support and implementations can be found further [below](#Cryptographic-and-blockchain-support-\(compatibility-and-implementations\)) | |-----------------------------------------| ## User Personal Key Architecture Web3Auth manages its user keys in a distributed manner utilizing different factors(or shares) that users manage including user's devices, private inputs, backup locations and cloud service providers. As long as a user has access to 2 out of n (2/n) of these shares, they will be able to access their key. This distributedly secure key is called the $TSSKey$. ![](https://hackmd.io/_uploads/rkrV-n_8i.png) One of the lost functionalities of a distributedly secure key is the loss of efficient encryption/decryption capabilities. For that reason, we use an optimised dual key structure for managing keys: the $TSSkey$ that is used for signing transactions that is never reconstructed, and a $metadataKey$ for managing metadata (which is reconstructed). Using this setup allows us to get the security of threshold signatures when it's necessary for signing transactions, but still leverage the faster performance of key reconstruction for updating and reading private user metadata. | :memo: Readers whom like to read more can do so [here](https://hackmd.io/@torus/Hyv8HjO8i) | |-----------------------------------------| ### Metadata User metadata is strictly supplementary and only helps to facilitate and govern user flows. In particular, metadata does not leak information about the shares of the private key being used to sign transactions. Metadata uses an encrypted storage layer that serves as a persistent data store for storing encrypted information about the user's keys (eg. public key, preferences, device information, thresholds etc). This information is stored in a replicated fashion across the set of nodes that are involved in facilitating the user login. During operation, when the user has threshold shares, they can read and write to metadata. Writing to metadata requires encrypting the data and signing it with the shares / private key. ## Web3Auth's Infrastructure Highlevel Overview Web3Auth's infrastucture is designed to manage a factor of a key for hundreds of millions of users. It utilizes a distributed security model that is inspired from Apple's iCloud and other highly performant distributed systems. The design of it revolves around achieving high guarantees for: - A distributed multi-setup security model - Close to 100% uptime - Upscales and downscales to cater to spikes in load In order to meet these objectives, Web3Auth's Infra is designed to be a set of nodes. ![](https://hackmd.io/_uploads/rJg45Dmws.png) *Web3Auth Infrasturcture secures a factor of a users key* ![](https://hackmd.io/_uploads/Sk429vmvj.png) *Architecture of a Node* #### t of n Distributed Model for Security and Uptime These nodes operate in a t of n security model, a user's factor (share) is further split into sub-shares and secured individually by each of these nodes. Threshold(t) number of nodes are able to sign signatures for users. In order to compromise this setup, one must compromise a total t of n setups. Similarly in order for the services to go down, n-t setups must go down (i.e. 2 nodes in a 3/5 setup). This is on a technical level very unlikley. #### Regionally available services Signatures and login services are often expected to be low-latency sub 1.5s interactions from the user. To achieve that level of latency, each node operates clusters of instances in regions across the world. This includes operations in US-east, west, Singapore, South America, Africa, Europe east and west and ultimately is flexible. #### Horizonally scalable for billions In each node, in each regional cluster, there runs an orchestration layer that operates multiple services. Services are spun up and down in different regions to cater to the load necessary for global applications. These clusters is orchistracted via a master coordinator that communicates with different nodes to understand the load that they should be receiving and coordinate on a distributed level. #### Load expectation via Verifiers Verifier contracts define the relationship between infrastructure and applications. It contain information about authentication parameters, as well as supported methods for users. It acts as a long standing agreement between the application and nodes, and serves as a central point of reference for wallet front-ends that implement the user transaction functionality. It is also in charge of the submission and validation of user transactions (which is dependent on the authentication protocol parameters). ## Detailed User Interactions Below we describe the main interactions with the user: - **SessionRequest** | A user's intial interaction/login - **KeygenRequest** | Key generation/setup for a user - **ThresholdSignRequest** | Usage of the user's assigned key/s - **Reshare/Revoke** | Changing the parameters of a user's account setup, in a distributed manner ### SessionRequest Session request is the intial interaction that any user does with a node before interacting with further interactions. Its not unsimilar to standard authentication to backend services, but does include a commit vs reveal phase to prevent against front-running attacks from other nodes. The result of the session request is a token, that in combination with the session key is used to interact with the infra from there on out. ![](https://hackmd.io/_uploads/S1xNoDg0_9.png) SessionRequests are specific to different applciations and are agnostic to most implementations of authentication, including but not limited to OAuth2.0, OIDC, ECJWTs and most forms of FIDO. ### Key generation & importing keys Keys can be distributedly generated OR imported into the ecosystem. Below describes the former interactoin with users where after session authentication, key generation is conducted with a series of round interactions: ![](https://hackmd.io/_uploads/B1vIlMCO5.png) This is done differently for the import of a key. When importing a key, the nodes interact with one another to do a DKG seperate from the user, the user then retrives the public key for this distributed secret, and creates a second share with respect to it on the client side. ### Threshold Signature Scheme (TSS) The TSS key is made up of two sections: - shared information (eg. public key, share commitments, theshold, unique identifiers) - local information (eg. TSS key share). The shared information is stored on metadata and replicated, whereas the local information is kept locally on the user's device. This ensures that metadata for shared operations can be easily replicated and accessed without computationally expensive calls, while for local operations the TSS key shares never leave the local context. Constructing a threshold signature requires a session token which we can get via the session request. This then allows us to set up a threshold signature session. The threshold signature session consists of and offline signing phase and an online signing phase (GG20, GG19, Doerner19). The offline signing phase consists of 6 rounds of interaction between the device and nodes and can be pre-computed before the transaction signing request is received. The online signing phase requires the transaction to be present and is non-interactive. This means that although the threshold signature generation takes a substantial amount of time, most of it can be precomputed via a background process, before the user even needs to sign a transaction. When the user decides to sign a message in the online phase, only one round of noninteractive communication is required, which is very fast (<0.2 seconds). ![](https://hackmd.io/_uploads/B1NXVM0Oq.png) ### Resharing and revokability Users can reshare current secrets kept with the node set. This is done via Proactive Secret Sharing, in particular the variant used here is PSS via AVSS an async method of doing so. Below describes the interaction, between user and nodes in this particular transaction. You'll find its quite similar to key generation in its rounds, they are similar schemes: ![](https://hackmd.io/_uploads/SJhqCb0u9.png) Offline revokability is also something that is inherited from the inital version of tKey. How its implemented is similar as well, only difference being in the above interaction the key is never reconstructed on a front-end context. For reference, the snippet from the [tkey doc is reiterated here](https://docs.tor.us/key-infrastructure/technical-architecture): #### Using metadata for share resharing and revokability Utilizing the storage layer, we are able to generate new shares for all devices, regardless if they are online or offline. This allows us to remove devices from the sharing, allow the user to change their security questions and/or adjust their security threshold. The key concept here is utilizing published Share commitments as encryption keys to retrieve shares on the most updated SSS polynomial. This is illustrated from a 2/4 SSS sharing $f(z)$ with shares $s_a, s_b, s_c, s_d$ kept on 3 user devices and the service provider. Let $g$ be a generator of a multiplicative subgroup where the discrete log problem is assumed hard to solve. During initialization of the key we create share commitments $g^{s_a}, g^{s_b}, g^{s_c}, g^{s_d}$ to be stored on the storage layer. These share commitments are analgous to public keys derived from the share scalars. Given the user loses device D holding $s_d$, and wants to make that share redundant. He/she first reconstructs their key on device A. We utilize a public key based encryption scheme (eg. ECIES). The user generates a new 2/3 SSS polynomial $h(z)$ on the same $\sigma$ with shares $s_1, s_2, s_3$ and encrypts the newly generated shares for each respective share commitment, except for the lost $g^{s_d}$ commitment. $$\text{for }i = \{1,2,3\} \text{ and } v = \{a,b,c\} \\ encrypt(s_iu,g^{s_v}) \Rightarrow c_{nv} \\ \text{where } nv = \{1a,2b,2c\}$$ $c_{nv}$, the resulting ciphertext from the encryption, is then stored on the storage layer for retrieval on other devices. On login to device B, the user retrieves $c_{2b}$ is able to use $s_b$ to decrypt the new $s_2$ and reconstruct their key with $s_1$ derived from the service provider in a similar fashion. Using the $h(z)$ allows $s^d$ to also be deprecated as a share. Resharing allows us to share a key on a new polynomial with a different degree/threshold, allowing us to also increase a user's security/factor devices or inputs to reconstruct thier key as they require it. This can be incrementally done as needed. ## Cryptographic and blockchain support (compatibility and implementations) Web3Auth supports most popular blockchains & elliptic curves out there. In particular, out of the box the infrastucture supports all chains on: * `secp256k1` | Ethereum (EVM) chains, Bitcoin, Polygon & other L2s, etc... * `ed25519` | Solana, Polkadot, NEAR For other elliptic curve/chain support, feel free to [ask/request](https://web3auth.io/contact-us.html) as we may already support them. ### Distributed Key Generation & Pro-active Secret Sharing Scheme References There are many schemes and variants for DKGs and PSSs out there, we in particular use an asynchronous variant, [Kate12](https://eprint.iacr.org/2012/377.pdf), derived from Asynchronous Verifiable Secret Sharing (AVSS), [Cachin02](https://eprint.iacr.org/2002/134.pdf). ### TSS & Signature References TSS schemes often vary in their approach to creating shared cryptographic material in a distributed manner. We support the popular [GG19](https://eprint.iacr.org/2019/114.pdf), GG20, EDDSA, and [DKLS19](https://eprint.iacr.org/2019/523.pdf). In result supporting the `ecdsa` signature standard on both elliptic curves. Its worth noting that the TSS signing is largely decoupled from Web3Auth's infrastucture to allow us to be agnostic to TSS implementation. These include other signature schemes, which arguably are much more convienent. Notably, but non-exhaustively, Web3Auth supports: * [ECDSA](https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm) or its ElGamal variants * EDDSA or its [Schnorr variants](https://en.wikipedia.org/wiki/Schnorr_signature) * BLS, Stark (coming soon) | :memo: For other signature or elliptic curve or chain support, feel free to [ask or request](https://web3auth.io/contact-us.html) if we support them | |-----------------------------------------| ## Privacy, User Data and Compliance Web3Auth's takes a conservative approach to data collection. The only required stored data is a relationship of an anonymised identifier from the OAuth or JWT that is pegged up into the infrastucture. This is often the required `sub` field on the JWT RFC, which applications have the option of storing outright or storing a Hashed value of `sub`. This relationship is first created on initial key generation/assignment and later utilized to authenicate the specific public key and session token to the user's end device. ## Security Extensions By limiting the base transaction types and functionality we can limit the complexity of the system and thus provide better guarantees on safety and liveness. The aforementioned functionality is sufficient for generic threshold cryptography operations. However, we fully intend to extend the capabilities of our system to support other interesting use cases. We make provisions for these types of extended functionality using "extensions" that reuse the existing interfaces that have already been defined. ### Additional verification / checks Additional dynamic transaction verification / message validation can be done during the SessionRequest or SignRequest phase by each one of the nodes to reduce the impact of phishing attacks or scams. Smart contract wallet rules like spending limits can also be implemented in this way. ### User defined access structures Although the default nodeset is defined by the application, the user may want to control this access. One way we can provide this functionality is by implementing a proxy contract in front of the parameter lookups that happen at the start of SessionRequest, with this proxy contract being user-controlled. <!-- ## Performance optimisations This section describes some of the deeper over-arching performace optimizations that we include in order to achieve < 3 second user wait times: ### Geodistributed threshold signing servers As benchmarks on TSS show that the communication rounds are highly dependent on inter-server latency, node operators can opt to run read-replicas in major data centers across the world to decrease latency between the servers and the user. This additional feature can have an associated fee that is specified via the global node registry. #### TSS Benchmarks We conduct benchmarks on signing operations with increasing number of parties randomly selected across a restricted geographical availability region (e.g. East Asia). We take the average over 30 runs. | Threshold | Average Threshold Signing Time (ms) | Variance (ms) | | -------- | -------- | -------- | | 2 | 1544 | 86 | | 3 | 2531 | 102 | | 4 | 2953 | 133 | | 5 | 3833 | 166 | | 6 | 5132 | 252 | ### Key generation buffer Applications often onboard users in large spikes during launch events, which means that having buffer capacity is helpful in ensuring smooth operation during high traffic situations. Node operators can offer to pre-generate key generation material for a fee from the application, and subsequently during user signup the pre-generated key can then be assigned to the user. #### Benchmarks The current Web3Auth permissioned network already uses key generation buffers, and so we use benchmarks of our current permissioned network here. The resharing protocol has the same computational and communication complexity and can be expected to take the same amount of time. We take the average time with servers in the same geographical region. | Nodes | Average Keygen/Reshare Time (ms) | Variance (ms) | | -------- | -------- | -------- | | 5 | 644 | 113 | | 9 | 1231 | 145 | | 13 | 3828 | 182 | --> <!-- ### Hierarchical Threshold Secret Sharing Implementation This type of secret sharing has been studied extensively since the seminal work of [Tassa](https://link.springer.com/chapter/10.1007/978-3-540-24638-1_26), and since then many variants of hierarchical threhsold secret sharing have been introduced. In particular, we provide an adaptation of [Pedersen-VSS](https://link.springer.com/chapter/10.1007/3-540-46416-6_47) below for a $(t,n)$ secret sharing where one of the shares is further sub-shared under a $(t',n')$ secret sharing, such that the secret sharing is linear, and adaptable for use as a subprotocol in an existing [DKG](https://link.springer.com/article/10.1007/s00145-006-0347-3) and [TSS](https://eprint.iacr.org/2019/114.pdf) scheme. Wlog, we assume that the $n^{th}$ share in the $(t,n)$ sharing is sub-shared. #### $HTSS-(t,n,t',n')$ - Dealer $D$ performs a secret sharing of a secret $z$ - D chooses 2 random polynomials $A(z),A'(z)$ over $\mathbb{Z}_q$ of degree $t$: $A(z)=a_0+a_1z+...+a_tz^t,A'_i(z)=a'_0+a'_1z+...+a'_tz^t.$ such that $z=a_0=A(0)$ $D$ broadcasts $C_{k}=g^{a_k}h^{a'_k}\bmod{p}$ for $k=0,...,t$. $D$ computes the shares $s_i=A(i),s'_i=A'(i)\bmod{q}$ for $i=1,2,3,n-1$ and sends $s_i,s'_i$ to party $P_i$ - D also chooses 2 random polynomials $B(z),B'(z)$ over $\mathbb{Z}_q$ of degree $t'$: $B(z)=b_0+b_1z+...+b_tz^{t'},B'(z)=b'_0+b'_1z+...+b'_tz^{t'}$ such that $b_0=A(n)$ and $b'_0=A'(n)$. $D$ broadcasts $C'_{k'}=g^{b_{k'}}h^{b'_{k'}}\bmod{p}$ for $k'=0,...,t'$. $D$ computes the shares $u=B(i),u'_i=B'(i)\bmod{q}$ for $j=1,2,3,n'-1$ and sends $u_j,u'_j$ to party $P_j$ - Each party $P_i$ verifies the shares he received from D by checking if: $g^{s_i}h^{s'_i}=\displaystyle\prod^t_{k=0}(C_k)^{i^k}\bmod{p}$ - Each party $P_j$ verifies the shares he received from D by checking if: $g^{u_i}h^{u'_i}=\displaystyle\prod^t'_{k'=0}(C'_{k'})^{i^{k'}}\bmod{p}$ - All parties $P_i,P_j$ also verify that $b_0=A(n)$ and $b'_0=A'(n)$ by checking if: $C'_0=\displaystyle\prod^t_{k=0}(C_k)^{n^k}\bmod{p}$ - If a check fails for $P_i$ or $P_j$, they broadcast a *complaint* against D. - D then reveals the share for $P_i$ or $P_j$, and if it does not pass the check D is disqualified. - **Correctness** - For any set $\mathcal{R}$ of $t+1$ parties $P_i$, $z=\sum_{i\in\mathcal{R}}\gamma_i.s_i\bmod{q}$, where $\gamma_i$ is the appropriate Lagrangian coefficients for $\mathcal{R}$ - For any set $\mathcal{R'}$ of $t$ parties $P_i$ and $t'+1$ parties $P_j$, $z=\sum_{i\in\mathcal{R'}}\gamma_i.s_i+\gamma_n.\sum_{j\in\mathcal{R'}}\gamma'_j.u_j\bmod{q}$ where $\gamma_i$ and $\gamma'_j$ are appropriate Lagrangian coefficients for $\mathcal{R'}$ - Note that since the reconstruction of the secret from the shares is a linear mapping, the HTSS scheme is linear - **Secrecy** - No information on $z$ can be learned by an adversary who does not satisfy the specified $(t,n,t',n')$ access structure. We consider an adversary who corrupts the maximal non-qualified set $M$, which is $t$ parties in $P_i$ and $t$ parties in $P_j$. For any $z'\in\mathbb{Z}_q$, we can use $\{z',s_i\}, i\in M$ to interpolate $\bar{A}$, a polynomial of degree $t$, and evaluate $\bar{b}_0=\bar{A}(n)$. We can then use $\{\bar{b}_0,s_j\}, j\in M$ to interpolate $\bar{B}$, a polynomial of degree $t'$. So there always exists a set of two polynomials, $\bar{A},\bar{B}$ for any $z'\in \mathbb{Z}_q$ that include the shares in the maximal non-qualified set, so it is not possible for the adversary to find $z$. #### HTSS-DKG We construct a DKG from HTSS-(t,n,t',n'): - We denote $P_i$ as a participant who belongs to the $(t,n)$ level and the $P_j$ as a participant who belongs to the $(t',n') level. - Let $H=(n-1)+n'$, wlog we index participants such that $i\in \{1,...,n-1\}$ and $j\in \{n,...,H\}$. - Each party $P_h$ performs a HTSS-(t,n,t',n') of a random value $z_i$ as a dealer, dealing out shares based on the (t,n,t',n') access structure. - Each party $P_h$ then verifies the shares received from other parties $P_d$, if the checks fail, $P_h$ broadcasts a complaint against $P_d$. - Each party $P_d$, upon receiving a *complaint*, broadcast the shares for $P_h$ that satisfy checks. - Each party $P_h$ marks as *disqualified* any party $P_d$ that either - received more than $t$ complaints - answered a complaint with an shares that fail checks - Each party then builds a set of non-disqualified parties $QUAL$ - Each party sums up the shares received from all other parties that are in $QUAL$. $xh=\sum_{d\in QUAL}s_{dh}\bmod{q}$ and $x'h=\sum_{d\in QUAL}s'_{dh}\bmod{q}$. The distributed secret is $x=\sum{d\in QUAL}z_{d}\bmod{q}$. - To extract $y=g^x$, we have all parties $QUAL$ generate $NIZKPK(g^{x_{i_d}},g^{x_{i_\delta}}h^{x_i'},x_{i_d}=x_{i_\delta})$ and broadcast it. The parties can then interpolate in the exponent with appropriate Lagrange coefficients to retrieve $g^x$. A full description of this technique can be found [here](https://eprint.iacr.org/2012/377.pdf). For correctness, by the linear properties of HTSS, the resulting sum of shares $x_i$ for each party $P_i$ is also a HTSS. Full simulation proofs will be covered in a separate document, but informally for every adversary $\mathcal{A}$, there exists a simulator $\mathcal{S}$, such that on input an element $y$ in the subgroup of ${Z_p}^{\ast}$ generated by $g$, produces an output distribution which is polynomially indistinguishable from $\mathcal{A}$'s view of a run of the HTSS-DKG protocol that ends with $y$ as the output. This is done by having the simulator $\mathcal{S}$ who controls the honest particiapnts follow the HTSS-DKG protocol for all but one of the honest participant, and then choose values for the last honest participant so that the resulting public key is $y$. #### TSS HTSS admits a simple conversion protocol that satisfies the requirements for using [TSS](https://eprint.iacr.org/2019/114.pdf). Note that for any set of of $t+t'+1$ participants with shares generated by HTSS-(t,n,t',n'), they can locally map their shares into a $(\bar{t},\bar{t})$ secret sharing scheme by using appropriate Lagrangian coefficients, where $t\leq\bar{t}\leq t+t'$. In particular, for $\lambda_{i,S}$ in [GG19](https://eprint.iacr.org/2019/114.pdf), we replace it with $\lambda_{n,S}.\lambda_{j,S}$ for each $P_j$. The rest of the TSS protocol remains unchanged as this is now a $(\bar{t},\bar{t})$ sharing. --> <!-- - Stakeholders - Non-custodiality - Incentive compatibility - Scalability - Optimistic, payment channels + fallback - Compatibility with existing authentication methods - Adjusting for externalities - Extensible -->