# BTC Script and Replay Attack Vector Analysis for Sygma MPC Bitcoin Expansion

Replay attacks pose a security threat by enabling the repeated execution of the same transaction, which could result in unintended loss of funds or other security compromises. Such vulnerabilities could be exploited maliciously or arise inadvertently due to the complexities inherent in distributed systems, for instance when an RPC endpoint inadvertently broadcasts information about the same event multiple times.

This report focuses on the route of bridging from EVM networks to the Bitcoin network. The rationale for this focus is that executions on EVM networks already guard against replay attacks using a mapping of used deposit nonces (the check for a used nonce happens on execution).

First, we present an overview of Bitcoin scripting, particularly the usage of script opcodes in the context of Bitcoin expansion for Sygma. Afterwards, we discuss replay attack vectors and propose a unique transaction identifier strategy and possible implementation routes.

## Bitcoin Script

Bitcoin Script is the scripting language used by Bitcoin for executing transactions and a few other operations. Inspired by [Forth](https://en.wikipedia.org/wiki/Forth_(programming_language)), it includes a variety of opcodes that perform specific operations on the data within a transaction script.

Bitcoin Script is a stack-based, non-Turing-complete programming language used to define the conditions under which a transaction is valid in the Bitcoin blockchain. Operating on a First-In-Last-Out (FILO) principle, it facilitates the execution of transactions through two main components: input scripts (scriptSig) and output scripts (scriptPubKey). These are often referred to as unlocking and locking scripts, respectively.

### How Does Bitcoin Script Work?

A transaction in Bitcoin involves two scripts:

1. **Input Script (scriptSig):** Provided by the party spending the output (the recipient of the earlier transaction), this script usually contains a signature that proves ownership of the bitcoins being spent.
2. **Output Script (scriptPubKey):** Set up by the sender to "lock" the transaction output, defining the conditions under which the output can be spent.

The transaction is validated by running the input script and then the output script against the same stack (conceptually, concatenating the two and executing them). If this combined script runs without errors and leaves a true value on top of the stack, the transaction is considered valid.

### Script Components

- **Data Handling:** Data is pushed directly onto the stack. It is typically enclosed in angled brackets for clarity in documentation, though in an actual script it appears as raw bytes.
- **Operation Codes (OP_CODEs):** More complex than data, these commands manipulate stack data to produce results. They generally begin with the prefix "OP_".

Bitcoin Script is restricted to 256 possible operations, with only 79 currently active. Several operations are disabled due to past security vulnerabilities, such as `OP_LSHIFT`, and some are reserved for potential future use. A comprehensive list of opcodes with script examples can be explored [here](https://en.bitcoin.it/wiki/Script).

### Common Script Types in Bitcoin

Various script types have been developed to facilitate different kinds of transactions, offering different levels of security, efficiency, and functionality. With the introduction of Taproot, Bitcoin's capability for smart contracts and complex spending conditions has significantly expanded. Here’s an overview of the most common script types in Bitcoin, including those enabled by Taproot:

#### 1. Pay-to-Public-Key-Hash (P2PKH)

- **Description**: The most commonly used script type in Bitcoin. It requires the spender to provide a public key and a signature generated from the corresponding private key.
- **Script Format**: `<sig> <pubKey> OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG`

#### 2. Pay-to-Script-Hash (P2SH)

- **Description**: Allows the sender to lock funds to a script hash. The spender must provide a script (the "redeem script") that matches the hash, together with data satisfying that script.
- **Script Format**: `<sig(s)> <redeemScript> OP_HASH160 <scriptHash> OP_EQUAL`

#### 3. Pay-to-Witness-Public-Key-Hash (P2WPKH)

- **Description**: A SegWit (Segregated Witness) version of P2PKH which offers benefits like smaller transaction sizes and improved scalability. It separates the witness (signature and public key) from the transaction data.
- **Script Format**: The scriptPubKey is `0 <pubKeyHash>` and the witness field is `<signature> <pubKey>`.

#### 4. Pay-to-Witness-Script-Hash (P2WSH)

- **Description**: A SegWit version of P2SH, suitable for larger and more complex scripts, improving scalability by reducing transaction size.
- **Script Format**: The scriptPubKey is `0 <scriptHash>` and the witness field contains the stack elements needed to satisfy the redeem script.

#### 5. Multisignature (multisig)

- **Description**: Allows bitcoins to be spent only when a specified number of signatures from defined public keys are provided. Common formats are 2-of-3, 3-of-5, etc.
- **Script Format**: `<m> <pubKey1> <pubKey2> ... <pubKeyN> <n> OP_CHECKMULTISIG`

#### 6. Time-locked transactions

- **Description**: Restricts spending until a specified time or block height.
- **Script Formats**:
  - Absolute time lock: `OP_CHECKLOCKTIMEVERIFY (CLTV)`
  - Relative time lock: `OP_CHECKSEQUENCEVERIFY (CSV)`

#### 7. Taproot (Pay-to-Taproot, P2TR)

- **Introduced**: Activated in November 2021 with the Taproot upgrade.
- **Description**: Combines the advantages of P2PKH and P2SH with added privacy and efficiency. Taproot transactions can either be a simple key-path spend using Schnorr signatures or a more complex script-path spend using MAST (Merkelized Abstract Syntax Trees) to hide complex conditions.
- **Script Format**: The scriptPubKey format is `1 <x-only-pubkey>`, and spending can be direct via a key path with a single signature, or via the script path if more complex scripts are involved.

#### 8. OP_RETURN

- **Description**: Used to embed data in transactions. It makes the output provably unspendable and is used primarily for inserting metadata into the blockchain.
- **Script Format**: `OP_RETURN <data>`

## Replay Attack Vectors

In this section we describe potential vulnerabilities and suggest strategies to prevent the repeated execution of the same transaction in the Sygma bridge when executing on the Bitcoin network.

### Understanding Potential Attack Vectors

Since Bitcoin does not natively support the state-based nonce management seen in EVM networks, additional measures need to be implemented. Potential vulnerabilities we can think of include:

- **Duplicate Events**: An RPC endpoint or other parts of the system could inadvertently broadcast the same event multiple times, leading to replay attacks.
- **Nonce Reuse**: Transactions with the same nonce value could potentially be replayed if not properly tracked.
- **Weak Transaction Validation**: If transactions are not verified against previously seen transactions, they could be executed multiple times.

## Mitigation Strategies

To prevent replay attacks in the Sygma bridge when executing on the Bitcoin network, the main goal is to ensure that each transaction is unique and cannot be executed more than once. This can be achieved by using unique transaction identifiers to maintain transaction integrity, combined with monitoring by the relayers. Below is our proposed strategy to prevent replay attacks along with the associated technical specifications:

### Unique Transaction Identifiers

The most impactful strategy is to implement unique identifiers or metadata for each transaction that passes through the bridge.
For Bitcoin, where such mechanisms aren't native, the Sygma bridge will probably have to handle these metadata checks, although it may be possible to pass metadata directly as inputs to the BTC script itself.

- **EVM Side**: Continue using the current system where each deposit to the bridge has a unique nonce, which is tracked and marked as spent once used.
- **Bitcoin Side**: When a transaction is relayed to Bitcoin, append a unique identifier derived from the original transaction (such as the hash of the EVM transaction concatenated with the nonce) into an `OP_RETURN` output of the Bitcoin transaction. This output does not affect the transaction's ability to send BTC but serves as a data carrier of up to 80 bytes.

Even though `OP_RETURN` seems like the most reasonable way to implement a unique transaction identifier, we next analyse its benefits and limitations:

#### Benefits of using `OP_RETURN` for Unique Identifiers

- `OP_RETURN` allows for embedding up to 80 bytes of arbitrary data directly into a Bitcoin transaction. This data can serve as a unique identifier for BTC-side Sygma bridge transactions.
- Embedding data using `OP_RETURN` does not affect the spendability of other outputs in the transaction. Other outputs can still transfer bitcoins as intended, making `OP_RETURN` a non-disruptive option for adding extra information to transactions.

#### Limitations of `OP_RETURN`

- The 80-byte size limit may be too restrictive for embedding a concatenated transaction hash and nonce.
- Although the data is non-spendable, including `OP_RETURN` in a transaction still requires a transaction fee. The fee rate can range from 5 to 100 sat/vB.
- The data in `OP_RETURN` does not interact with the Bitcoin protocol in any functional way other than being recorded. It does not influence transaction validity beyond being accepted into the blockchain.

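The identifier construction and the size limit discussed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Sygma implementation: the helper names `make_identifier` and `op_return_script`, the 32-byte hash length, and the 8-byte big-endian nonce encoding are all hypothetical choices made for the example. The opcode bytes (`0x6a` for `OP_RETURN`, `0x4c` for `OP_PUSHDATA1`) are the standard Bitcoin Script values.

```python
import hashlib

OP_RETURN = 0x6A      # opcode byte for OP_RETURN
OP_PUSHDATA1 = 0x4C   # push opcode used for payloads of 76..255 bytes
MAX_PAYLOAD = 80      # data-carrier size quoted in this report


def make_identifier(evm_tx_hash: bytes, deposit_nonce: int) -> bytes:
    """Concatenate a 32-byte EVM tx hash with an 8-byte big-endian nonce.

    Both field sizes are assumptions made for this sketch.
    """
    if len(evm_tx_hash) != 32:
        raise ValueError("expected a 32-byte transaction hash")
    return evm_tx_hash + deposit_nonce.to_bytes(8, "big")


def op_return_script(payload: bytes) -> bytes:
    """Serialize `OP_RETURN <payload>` as raw scriptPubKey bytes."""
    n = len(payload)
    if n == 0 or n > MAX_PAYLOAD:
        raise ValueError(f"payload must be 1..{MAX_PAYLOAD} bytes, got {n}")
    # Payloads up to 75 bytes use a single length byte as the push opcode;
    # longer ones need OP_PUSHDATA1 followed by the length.
    push = bytes([n]) if n <= 75 else bytes([OP_PUSHDATA1, n])
    return bytes([OP_RETURN]) + push + payload


if __name__ == "__main__":
    # Stand-in for a real EVM transaction hash.
    tx_hash = hashlib.sha256(b"example deposit").digest()
    script = op_return_script(make_identifier(tx_hash, 42))
    print(len(script), script.hex())
```

Under these assumed field sizes the identifier is 40 bytes, so it fits within the limit. An identifier that carries additional fields (domain IDs, timestamps, longer hashes) would have to be checked against the 80-byte bound, which is what the length check in `op_return_script` is for.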
`OP_RETURN` is a straightforward, blockchain-native method to embed a unique identifier or metadata into a Bitcoin transaction, and it's particularly suited for applications that require immutable recording of small amounts of data without the need to spend those outputs.

An alternative could be to assign and track unique identifiers through Sygma relayers outside of Bitcoin, shifting trust from the BTC network to the relayers. Reliance on relayer monitoring will be needed either way, as the nonce/timestamp/hash will have to be communicated to the relayers outside of the BTC blockchain. Using `OP_RETURN` on top of this could be viewed as defence in depth at an extra cost.

### Implementing a Unique Identifier in Bitcoin Script

Here we describe how to implement a system where Bitcoin outputs are spent based on the presence of a unique identifier using `OP_RETURN` within the Sygma Bridge.

### Step 1: Embedding Data with `OP_RETURN`

Recall that in Bitcoin transactions, `OP_RETURN` is used to embed arbitrary data that does not affect the spendability of other outputs. This works well for storing a unique identifier.

**Transaction Creation**: When a user initiates a Bitcoin transaction to lock funds into the Sygma Bridge, the transaction can include an `OP_RETURN` output carrying the required unique identifier. The identifier can be a hash of several elements (like user ID, timestamp, nonce) that uniquely identifies the transaction, at the cost of an extra transaction fee.

```plaintext
Output 1: Regular BTC transfer to a locking address
Output 2: OP_RETURN <unique identifier>
```

### Step 2: Sygma MPC Relayers Monitoring

Sygma MPC relayers monitor the Bitcoin blockchain for transactions containing specific `OP_RETURN` outputs. Secure communication between relayers (which already exists) and storage of nonces and user IDs will be required. When they detect a transaction with a valid `OP_RETURN` payload, they perform the necessary checks to verify the unique identifier.
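A relayer-side format check of this kind could look roughly like the sketch below. It assumes the identifier layout is a 32-byte hash followed by an 8-byte big-endian nonce, and that the `OP_RETURN` output contains exactly one data push. Both are assumptions for illustration, and `extract_op_return_payload` and `parse_identifier` are hypothetical helper names rather than part of any existing Sygma or Bitcoin library.

```python
from typing import Optional, Tuple

OP_RETURN = 0x6A      # opcode byte for OP_RETURN
OP_PUSHDATA1 = 0x4C   # push opcode used for payloads of 76..255 bytes
EXPECTED_LEN = 40     # assumed layout: 32-byte hash + 8-byte nonce


def extract_op_return_payload(script_pubkey: bytes) -> Optional[bytes]:
    """Return the data of a single-push OP_RETURN script, or None."""
    if len(script_pubkey) < 2 or script_pubkey[0] != OP_RETURN:
        return None
    op = script_pubkey[1]
    if 1 <= op <= 75:
        start, length = 2, op                  # direct push of `op` bytes
    elif op == OP_PUSHDATA1 and len(script_pubkey) >= 3:
        start, length = 3, script_pubkey[2]    # explicit length byte
    else:
        return None
    # Reject truncated pushes and any trailing bytes after the payload.
    if start + length != len(script_pubkey):
        return None
    return script_pubkey[start:start + length]


def parse_identifier(script_pubkey: bytes) -> Optional[Tuple[bytes, int]]:
    """Split a well-formed payload into (evm_tx_hash, deposit_nonce)."""
    payload = extract_op_return_payload(script_pubkey)
    if payload is None or len(payload) != EXPECTED_LEN:
        return None
    return payload[:32], int.from_bytes(payload[32:], "big")
```

Returning `None` for anything that does not match the expected shape keeps malformed or unrelated `OP_RETURN` outputs out of the later bridge logic. A format check alone does not establish uniqueness; that still requires comparing the identifier against those already processed.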
Relayers may also wait for confirmations on the transaction to pass a certain threshold, and then send the transaction forward to the bridge smart contract.

**Relayer Actions:**

1. **Verify Unique Identifier**: The relayers check if the unique identifier from the `OP_RETURN` output matches the expected format and conditions set by the bridge protocol.
2. **Trigger Bridge Actions**: If the identifier is valid, the relayers facilitate the next steps in the bridging process, i.e. confirming asset locks and/or initiating corresponding transactions on another blockchain.

### Step 3: Enforcing Checks to Prevent Replay Attacks

To prevent replay attacks, where the same transaction or data is used to illicitly repeat operations (like double spending or double minting), relayers record every transaction with its unique identifier. Future transactions with the same identifier should be automatically rejected.

### Further Considerations

**Timestamp**: Consider combining a timestamp with the nonce within the unique identifier, so that each transaction is tied to a specific time and sequence. This makes it inherently unique and non-replicable at a different time or in a different context.

## Related Work

Here we compare and contrast a few other BTC bridge implementations and their security for BTC movement.

### [RSK](https://rootstock.io/) (RBTC) - a sidechain that works in conjunction with Bitcoin.

RSK uses a "[Powpeg](https://dev.rootstock.io/rsk/architecture/powpeg/#peg-inpeg-out-and-other-properties-of-rootstock-powpeg)" as its bridging protocol that connects the RSK sidechain to Bitcoin. The Powpeg consists of three main components: a pre-compiled Bridge contract in RSK, a set of federated entities called Pegnatories, and a set of security devices called the PowHSMs.
The Bridge contract provides most of the functionality to transfer bitcoins from and to the Bitcoin network, the PowHSMs protect the locked BTC (secured through side mining on BTC), and the Pegnatories oversee the correct operation of the Powpeg by acting as data relays between the other components of the bridge and the RSK and Bitcoin blockchains.

The part of their protocol related to this report is "pegging in", i.e. transferring BTC to RSK to obtain RBTC. It consists of the following steps:

1. A user sends N BTC to the Powpeg address, which is controlled collectively by the Pegnatories’ PowHSMs through a multisig scheme.
2. Each Pegnatory periodically monitors the Powpeg address and keeps track of peg-in transactions.
3. After a predefined number of confirmations, the Pegnatories or the user send the peg-in transaction to the Bridge contract in RSK.
4. The Bridge validates the peg-in transaction using a representation of the Bitcoin blockchain that is updated by the Pegnatories periodically (the user can also update this representation of the Bitcoin blockchain if the Pegnatories are offline or if they refuse to do it).
5. The Bridge contract transfers N RBTC to the user in RSK.

The BTC->EVM token conversion is not instant and relies on their relayers (Pegnatories) to monitor a multisig address and wait for confirmations. (We assume they also check for other security-related factors.) Only once a confirmation threshold is met do they complete the transfer by sending the transaction information forward to the EVM-side smart contract for release.

### [Interlay](https://spec.interlay.io/) (Polkadot)

Polkadot is a decentralized platform that enables interoperability between different blockchains. Any blockchain can connect to Polkadot as a parachain and interact with other parachains through an established communication protocol. Interlay is the parachain that provides a bridge to the Bitcoin blockchain.
The process of transferring BTC to the Interlay parachain is called issuing and requires the creation of a collateralized Vault. Any user with enough funds can create a Vault by locking collateral on Interlay (typically in the DOT token). Vaults can then mint interBTC after receiving BTC on Bitcoin.

The issuing of interBTC consists of the following steps:

1. Precondition: a Vault has locked collateral.
2. The user sends an issue request in Interlay that includes the amount of interBTC the user wants to issue, the selected Vault, and a small collateral reserve to prevent griefing.
3. The user sends the requested amount of BTC to the Vault on the Bitcoin blockchain.
   - The deposit address is derived from the Vault address and a unique identifier.
4. The user, or a Vault acting on behalf of the user, extracts a transaction inclusion proof of the locking transaction on the Bitcoin blockchain.
5. The user, or a Vault acting on behalf of the user, triggers the issuance of interBTC by providing the bridge with a transaction inclusion proof of the Bitcoin locking transaction.
   - This is possible because the Interlay parachain implements a Bitcoin SPV client that is updated with Bitcoin block headers by relayers or Vaults.
   - The bridge verifies the destination address and quantity of the deposit transaction.

Interlay does not fully guarantee consistency, as the protocol is vulnerable to [replay attacks](https://spec.interlay.io/security_performance/btcrelay-security.html#replay-attacks). The Bitcoin SPV client in the bridge validates transaction inclusion but does not verify whether the transaction has already been processed. This allows a dishonest Vault to present a BTC->EVM transaction twice and mint counterfeit interBTC. Interlay mitigates this risk by penalizing Vaults if they move the locked BTC. However, the protocol relies on relayers or other Vaults to expose misbehavior.

### tBTC - a tokenized version of BTC that serves as a trust-minimized bridge.
tBTC is a collateralized bridging protocol between Bitcoin and Ethereum, similar to Interlay, that uses the [Keep network](https://keep.network/). Their protocol requires a set of signers that act as a BTC vault and place a bond on the destination chain.

The BTC->EVM process is as follows:

1. The user sends a deposit request to a smart contract on Ethereum.
   - The request can only be for one of a set of fixed amounts.
2. The system selects a random set of signers to receive the BTC deposit.
   - The selection of signers is performed off-chain using the Keep network.
   - Signers must stake 150% of the transferred value in ETH (as collateral).
3. The selected signers produce a threshold signature address and send it to the smart contract in Ethereum.
   - This process is also carried out in the Keep network.
4. The smart contract gives the user an NFT representing the deposit.
   - This NFT gives the user the right to claim the tBTC.
5. The user sends BTC to the deposit address.
6. After 6 confirmations, the user sends an SPV (light client) proof of the deposit to another smart contract to redeem the NFT for newly minted tBTC.
7. The SPV client in the smart contract keeps track of a number of Bitcoin block headers and checks that the difficulty adjustment in these blocks is consistent with Bitcoin’s algorithm.

The BTC->EVM process creates a one-to-one relationship between staked collateral and a deposit. tBTC is also susceptible to replay attacks and uses penalties as deterrence.

## Conclusion

The proposed system leverages the immutability and transparency of the blockchain together with the `OP_RETURN` opcode to securely embed unique identifiers in Bitcoin transactions. We can combine this with the monitoring and processing capabilities of Sygma MPC relayers that adhere to a preset security protocol, preventing replay (and other) attacks. The exact specification of the conditions the relayers will have to monitor, along with other security considerations like the use of HSMs, could be an extension of this work.
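As a starting point for such an extension, the record-and-reject rule from Step 3 can be sketched as a small registry of identifiers that have already been processed. This is only an in-memory illustration: `ReplayGuard` and `try_accept` are hypothetical names, and a real relayer set would presumably need persistent storage shared or agreed upon across the MPC participants, which this sketch does not attempt.

```python
class ReplayGuard:
    """Tracks identifiers that have already been processed (in memory only)."""

    def __init__(self) -> None:
        self._seen = set()

    def try_accept(self, identifier: bytes) -> bool:
        """Record the identifier and return True the first time it is seen.

        Returns False on any repeat, signalling that the transaction
        should be rejected as a replay.
        """
        if identifier in self._seen:
            return False
        self._seen.add(identifier)
        return True


if __name__ == "__main__":
    guard = ReplayGuard()
    first = guard.try_accept(b"deposit-1")   # first sighting of this identifier
    second = guard.try_accept(b"deposit-1")  # same identifier delivered again
    print(first, second)
```

The same check covers the duplicate-event case described earlier: if an RPC endpoint delivers the same event twice, the second delivery carries the same identifier and is rejected.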