# Notes on Testing BOLD Disputes with Stylus

This post explores the BOLD repository's integration with Nitro as of Nitro commit [0cc28bb13269620d62632cba143b799f29342ea9](https://github.com/OffchainLabs/nitro/commit/0cc28bb13269620d62632cba143b799f29342ea9).

In order to create a dispute between two parties using BOLD with Nitro, we set up a system test called [bold_challenge_protocol_test.go](https://github.com/OffchainLabs/nitro/blob/sepolia-tooling-merge/system_tests/bold_challenge_protocol_test.go). The test sets up a full L1 node and a Nitro L2 node, along with deployed BOLD smart contracts on the L1 to start:

```go
_, l2nodeA, _, _, l1info, _, l1client, l1stack, assertionChain, stakeTokenAddr := createTestNodeOnL1ForBoldProtocol(t, ctx, true, nil, l2chainConfig, nil, l2info)
defer requireClose(t, l1stack)
defer l2nodeA.StopAndWait()
```

We then set up a second Nitro instance that uses the same L1 node and the same rollup contracts as the first. However, it uses a different sequencer inbox contract, which it will use to post its evil batches.

We have two entities that will be proposing assertions, called `Asserter` and `EvilAsserter`, and we fund their initial balances.

```go
TransferBalance(t, "Faucet", "Asserter", balance, l1info, l1client, ctx)
TransferBalance(t, "Faucet", "EvilAsserter", balance, l1info, l1client, ctx)
```

We then set up two separate validator instances, one for the honest and one for the evil validator, and a `stateprovider`, which is another dependency the BOLD challenge manager needs. The state provider is needed to get the data BOLD will use to post assertions onchain from the Nitro validator DB.

```go
statelessA, err := staker.NewStatelessBlockValidator(
	...
)
Require(t, err)
err = statelessA.Start(ctx)
Require(t, err)

statelessB, err := staker.NewStatelessBlockValidator(
	...
)
Require(t, err)
err = statelessB.Start(ctx)
Require(t, err)
```

Next, we start posting some batches.
We make the honest and evil parties post the same batches to their respective sequencer inboxes, meaning they will agree up to a point:

```go
numMessagesPerBatch := int64(5)
divergeAt := int64(-1) // -1 means they should not diverge
makeBoldBatch(
	t, l2nodeA, l2info, l1client, &sequencerTxOpts, honestSeqInboxBinding, honestSeqInbox, numMessagesPerBatch, divergeAt,
)
l2info.Accounts["Owner"].Nonce = 0

// Evil validator posts the same batch.
makeBoldBatch(
	t, l2nodeB, l2info, l1client, &sequencerTxOpts, evilSeqInboxBinding, evilSeqInbox, numMessagesPerBatch, divergeAt,
)
totalMessagesPosted += numMessagesPerBatch
```

Next, they post another batch where they diverge at the message at index 5 within this new batch:

```go
l2info.Accounts["Owner"].Nonce = 5
numMessagesPerBatch = int64(10)
makeBoldBatch(
	t, l2nodeA, l2info, l1client, &sequencerTxOpts, honestSeqInboxBinding, honestSeqInbox, numMessagesPerBatch, divergeAt)

l2info.Accounts["Owner"].Nonce = 5
divergeAt = int64(5)
makeBoldBatch(
	t, l2nodeB, l2info, l1client, &sequencerTxOpts, evilSeqInboxBinding, evilSeqInbox, numMessagesPerBatch, divergeAt)
totalMessagesPosted += numMessagesPerBatch
```

Next, we check if both Nitro nodes have the same genesis state, and then wait for their validator components to finish validating the batches that were just posted.

```go
for {
	_, err1 := l2nodeA.TxStreamer.ResultAtCount(totalMessageCount)
	nodeAHasValidated := err1 == nil
	_, err2 := l2nodeB.TxStreamer.ResultAtCount(totalMessageCount)
	nodeBHasValidated := err2 == nil
	if nodeAHasValidated && nodeBHasValidated {
		break
	}
}
```

Then, we initiate a BOLD challenge manager service for both honest and evil:

```go
manager, err := challengemanager.New(
	...
)
Require(t, err)

managerB, err := challengemanager.New(
	...
)
Require(t, err)

// Start the challenge manager goroutines.
manager.Start(ctx)
managerB.Start(ctx)
```

The test then ends once the honest validator confirms an edge from the challenge contracts by one step proof.
At that point, it is impossible for the evil party to win.

### What happens inside the BOLD challenge manager service

Whenever it is possible to post a new assertion to the rollup smart contract, the challenge manager will ask Nitro to give it data to post to the contract. Each assertion is a claim about **executing the messages in the Arbitrum inbox contract**. That is, each assertion contains information that looks like this, in plain English:

> "After executing 100 messages from the Arbitrum inbox on L1, the L2 block hash of Arbitrum is 0xabc123. At the time this assertion is posted, 20 new messages have arrived in the inbox. The next assertion should consume at least 120 messages".

The next assertion posted should then claim at least 120 messages from the inbox.

As the rollup contract manages these assertions, we can represent them as a chain structure where there can be forks. A fork is two or more assertions that agree on a parent, but have different L2 block hashes at a specific message number.

![Screenshot 2024-05-24 at 09.10.06](https://hackmd.io/_uploads/SJ9TkX0mR.png)

Both the honest and evil validators will post their assertions, and they will disagree about the message that we set up earlier for divergence in their batches. Upon detecting this "fork" in the assertion chain, each respective BOLD challenge manager will start a dispute by posting data called `history commitments` to the challenge manager contract, referring to the specific fork in the assertion chain they observed.

The flow of a dispute looks something like this, for an arbitrary example, inside the challenge manager contract. Each claim that is posted requires a Merkle proof that commits to a history of L2 block hashes.

Let's say there was a dispute over an assertion that claimed message 0 to message 4. The challenge manager will ask Nitro to provide it the list of L2 block hashes from message [0, 4].
These L2 block hashes come from the `validator` component of Nitro, specifically from the function `StatesInBatchRange`, found [here](https://github.com/OffchainLabs/nitro/blob/0cc28bb13269620d62632cba143b799f29342ea9/staker/state_provider.go#L216).

```mermaid
graph LR
    A(message 0)-->B(Honest = message 4, L2 hash 0x123)
    A-->C(Evil = message 4, L2 hash 0xabc)
```

The honest challenge manager will ask the Nitro validator "hey, give me all the validated L2 block hashes for message 0 to 4". Then, it will post a Merkle commitment to them onchain to the challenge manager contract on L1. The evil validator will do the same.

The challenge managers will then make "bisection" moves, where they perform a binary search to figure out the exact block they disagree on. Let's say that after making these moves, our challenge tree looks like this:

```mermaid
graph LR
    A(message 0)-->D(message 2)
    D-->E(message 3, L2 hash 0x762)
    E-->B(Honest = message 4, L2 hash 0x123)
    E-->C(Evil = message 4, L2 hash 0xabc)
```

It means the disagreement is in the execution of message `3`. The challenge manager service will then ask Nitro to load up the Arbitrator machine for message `3`.

> From Pepper: "To narrow down to exactly one opcode about which two validators disagree, we perform a two-phase search.
> In phase one, we step through 2^23 opcodes per iteration searching for a single iteration about which the challenger and defender disagree. (This is logically bounded at a maximum of 2^19 iterations.)
> In phase two, we step through 1 opcode per iteration searching for a single opcode about which the challenger and defender disagree."

In phase one, we step through 2^23 opcodes per iteration and produce 2^19 hashes. The same routine is followed as in the initial dispute: a Merkle commitment to these 2^19 hashes is posted, and a binary search is played to narrow down the disagreement. Once phase two is reached, the same game is played over 2^23 hashes until a single step is identified for a one step proof.
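
The bisection game described above can be sketched, in a much simplified form, as a binary search over two hash histories. This is only an illustration: in the real protocol each party posts Merkle commitments onchain and never sees the other party's full history, whereas this sketch compares both lists locally. The function name `bisect` and the string hashes are made up for the example and are not part of BOLD or Nitro.

```go
package main

import (
	"errors"
	"fmt"
)

// bisect is a hypothetical helper, not part of BOLD or Nitro.
// Given two hash histories of equal length that agree at index 0 and
// disagree at the last index, it returns an index i such that the
// histories agree at i-1 and disagree at i.
func bisect(honest, evil []string) (int, error) {
	if len(honest) != len(evil) || len(honest) < 2 {
		return 0, errors.New("histories must have equal length of at least 2")
	}
	lo, hi := 0, len(honest)-1
	if honest[lo] != evil[lo] {
		return 0, errors.New("histories must agree on the first state")
	}
	if honest[hi] == evil[hi] {
		return 0, errors.New("histories must disagree on the last state")
	}
	// Invariant: the histories agree at lo and disagree at hi.
	for hi-lo > 1 {
		mid := lo + (hi-lo)/2
		if honest[mid] == evil[mid] {
			lo = mid // still in agreement, the divergence is to the right
		} else {
			hi = mid // already diverged, the divergence is at mid or to the left
		}
	}
	return hi, nil
}

func main() {
	// Made-up hashes mirroring the example: agreement through message 3,
	// disagreement about the state after message 4.
	honest := []string{"0x000", "0x1a1", "0x2b2", "0x762", "0x123"}
	evil := []string{"0x000", "0x1a1", "0x2b2", "0x762", "0xabc"}

	idx, err := bisect(honest, evil)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("agree at index %d, disagree at index %d\n", idx-1, idx)
}
```

For the five-entry example this narrows the dispute to the step from index 3 to index 4, matching the challenge tree above. The same loop shape applies at the opcode levels, only with 2^19 or 2^23 hashes instead of five.
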
The test then ends once the honest party wins the one step proof.

### How the evil BOLD validator actually diverges from the honest validator

In this system test, the validators disagree about the execution of a message in a batch they posted. The actual message definition is within the `makeBoldBatch` function [here](https://github.com/OffchainLabs/nitro/blob/0cc28bb13269620d62632cba143b799f29342ea9/system_tests/bold_challenge_protocol_test.go#L800):

```go
func makeBoldBatch(
	t *testing.T,
	l2Node *arbnode.Node,
	l2Info *BlockchainTestInfo,
	backend *ethclient.Client,
	sequencer *bind.TransactOpts,
	seqInbox *bridgegen.SequencerInbox,
	seqInboxAddr common.Address,
	messagesPerBatch,
	divergeAtIndex int64,
) {
	ctx := context.Background()

	batchBuffer := bytes.NewBuffer([]byte{})
	for i := int64(0); i < messagesPerBatch; i++ {
		// Each tx transfers `value` wei; the divergent tx transfers one more.
		value := i
		if i == divergeAtIndex {
			value++
		}
		err := writeTxToBatchBold(batchBuffer, l2Info.PrepareTx("Owner", "Destination", 1000000, big.NewInt(value), []byte{}))
		Require(t, err)
	}
	// ... (rest of the function elided in this excerpt)
}
```

Each message in the batch is a simple transfer transaction from an address called "Owner" to another called "Destination" with a specified `value`. If we want to create a divergent batch, we can pass in the tx index that we want to diverge at, and the value of the tx at that index will be incremented by one.

### Creating more sophisticated tests

In order to test Stylus, for instance, we should change `makeBoldBatch` so that it writes a complex Stylus transaction to the batch via `writeTxToBatchBold`, and customize how the `divergeAtIndex` condition modifies this tx. For instance, it could change the inputs of the transaction by some value. If we can resolve a challenge over this complex dispute, we will have a lot more confidence in the Stylus + BOLD integration.
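
One way the `divergeAtIndex` condition could be generalized from "bump the transfer value" to "perturb the transaction input" is sketched below. This is a sketch under assumptions, not code from the Nitro repository: `mutateInput` is a hypothetical helper, and the example calldata bytes are placeholders rather than a real Stylus call.

```go
package main

import (
	"bytes"
	"fmt"
)

// mutateInput is a hypothetical helper, not part of Nitro.
// It returns a copy of input, and for the message at divergeAtIndex it
// flips the lowest bit of the last byte so the evil batch carries
// different calldata than the honest one.
func mutateInput(input []byte, index, divergeAtIndex int64) []byte {
	out := make([]byte, len(input))
	copy(out, input)
	if index != divergeAtIndex {
		return out
	}
	if len(out) == 0 {
		// Nothing to flip, so append a single byte instead.
		return []byte{0x01}
	}
	out[len(out)-1] ^= 0x01
	return out
}

func main() {
	// Placeholder calldata standing in for a Stylus contract call.
	calldata := []byte{0xde, 0xad, 0xbe, 0xef}
	divergeAt := int64(2) // a negative value would mean "never diverge"

	for i := int64(0); i < 4; i++ {
		got := mutateInput(calldata, i, divergeAt)
		fmt.Printf("message %d: input %x, diverged: %t\n", i, got, !bytes.Equal(got, calldata))
	}
}
```

In the test, the result of such a helper would take the place of the empty `[]byte{}` passed as transaction data, so the honest and evil batches differ only in the input of the chosen transaction.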