# Understanding Optimism via Confusion-Resolution Devnet Explorations

The goal of this article is to help people speed-run through the states of confusion that I went through while learning about Optimism and L2s more generally, and which I believe anyone learning about L2s will eventually go through.

## Confusion-Resolution Driven Learning

I personally find that one of the best ways to learn about a system is to start by watching some overview video, then skim the docs, and finally ask myself some dumb questions. Very often I won't be able to answer these seemingly dumb questions and will be left in a state of confusion. Fortunately confusion, although painful, [is necessary for learning](https://www.sciencedirect.com/science/article/pii/S0959475212000357), but only if it is eventually resolved. So one must find a way to resolve that confusion. Sometimes it's asking other people, sometimes it's reading the code, and other times the best way is to play with runnable example code.

The goal of this article is to get readers confused about Optimism by asking questions, and then to help them resolve that confusion, using two separate approaches. We will first [get confused about the protocol itself](#Getting-Confused-about-the-Protocol), and resolve this confusion using visualizations. I have found that the optimism spec generally lacks good visuals; they do [exist](https://github.com/ethereum-optimism/specs/tree/main/specs/static/assets) in the repo but unfortunately are not linked from the book. After understanding the protocol better, we will next [get confused about op-batcher](#Getting-Confused-about-OP-Batcher) and resolve that confusion by running a fork of the optimism devnet, playing with the batcher parameters, and analyzing the behavior using a grafana dashboard. We will leave confusing the reader about op-proposer, op-challenger, and fault proofs to a future article.

## Getting Confused about the Protocol

The most important confusion that anyone learning about L2s will have to face is: "are rollups just secured bridges?" This question leads to never-ending twitter debates. The very insightful counter-view is that rollups are not about the bridge at all, but should be thought of as just another chain that [couples its fork-choice rule to that of the L1](https://x.com/jon_charb/status/1750871060675694741) (but please don't ask me about validiums):

![image](https://hackmd.io/_uploads/ryekD6yjR.png)

The way Optimism couples its FCR to Ethereum is via the [derivation pipeline](https://specs.optimism.io/protocol/derivation.html). Kelvin Fichter gave a great sketch explanation of this concept in his [How Rollups *Actually* Work](https://www.youtube.com/watch?v=NKQz9jU0ftg) presentation. Jon Charbonneau wrote a detailed [written follow-up](https://dba.mirror.xyz/LYUb_Y2huJhNUw_z8ltqui2d6KY8Fc3t_cnSE9rDL_o). Joshua Gutow goes into more technical detail in his [Optimism Bedrock](https://www.youtube.com/watch?v=vXuRJgyISI0) presentation. We won't repeat their arguments here, but watching them as precursor or parallel material will definitely help.

### L2 block confusion

Despite having watched and read all of the above material, when I started reading and playing around with the optimism codebase, I still found myself incorrectly thinking that the op sequencer (the combination of op-node + op-geth that receives L2 user txs) only creates L2 blocks when it receives transactions, and furthermore that it needs to be running in order to create blocks.
When reading through the derivation pipeline [specs](https://specs.optimism.io/protocol/derivation.html), one stumbles upon

![image](https://hackmd.io/_uploads/rJ1BXpyiC.png)

Given the model that the op-sequencer creates L2 blocks, the natural follow-up question is: "what if the sequencer (and/or batcher) goes down? How can that equation hold if no blocks are created?" After a back-and-forth with Adrian, he mentioned something which made it click for me, somehow.

![image](https://hackmd.io/_uploads/ByNe7aJoA.png)

L2 "blocks" are in some sense more like ethereum PoS slots: there is an ethereum slot every 12s without exception, and similarly there is an optimism block every 2 seconds. L1 blocks can be skipped if the proposer misses their slot, but slots always progress, with a new one added every 12s. Similarly, op validators will optimistically listen for new blocks from the sequencer every 2 seconds. But if no block lands onchain after 12 hours (landing blocks is the job of the op-batcher, which pulls new blocks from the sequencer and sends them to ethereum), then they will retroactively create empty blocks to advance the chain.

Without getting into too much [frame](https://specs.optimism.io/protocol/derivation.html#frame-format), [channel](https://specs.optimism.io/protocol/derivation.html#channel-format), and [batch](https://specs.optimism.io/protocol/derivation.html#batch-format) jargon, which is needed to fully understand the op spec, one can simply think of op-batcher batches as being the sequencer's "block proposal". This proposal can be missed, but the block ("slot") will still be created! Unlike Ethereum however, the op-batcher has 12 hours to land its proposal on Ethereum, asynchronously.

|          | Slot Time | Proposal Window |
| -------- | --------- | --------------- |
| Ethereum | 12s | [4s](https://www.paradigm.xyz/2023/04/mev-boost-ethereum-consensus#slots--sub-slot-periods) (synchronous) |
| Optimism | [2s](https://github.com/ethereum-optimism/optimism/blob/develop/packages/contracts-bedrock/deploy-config/mainnet.json#L7) | [12h](https://docs.optimism.io/builders/chain-operators/configuration/rollup#sequencerwindowsize) (asynchronous) |

The main confusion at play here is the belief that L2 blocks are only created by the sequencer, and hence depend on it being up. The resolution is to think of L2 slots as running along at a 2 second pace, regardless of the sequencer being up or down. They can be filled with transactions, either from L1 [deposits](https://specs.optimism.io/protocol/deposits.html) (so-called force-included transactions) and/or from one batch of sequencer-received transactions. Thinking of an L2 as only a centralized sequencer is a wrong and deceptive mental model. An L2 should be thought of like any other blockchain, with a bunch of validator nodes independently deriving the L2 chain, which, unlike for the L1 chain, is done by reading L1 blocks. Blocks will be created every 2 seconds (though perhaps only retroactively, 12 hours later!), regardless of whether they contain single deposit txs, batches of sequencer transactions, or nothing at all.
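To make this "slots always advance" model concrete, here is a minimal Go sketch of a derivation-style loop (Go because that is the op-stack's language). All names here (`deriveSlot`, `batch`) are invented for illustration; the real derivation pipeline in op-node is far more involved.

```go
package main

import "fmt"

const l2BlockTime = 2 // seconds per L2 slot

// batch is an illustrative stand-in for a decoded sequencer batch; the real
// types live in op-node's derivation code.
type batch struct {
	txs []string
}

// deriveSlot sketches the mental model above: the L2 slot at slotTime always
// exists. If the sequencer's batch for that timestamp landed on L1 (within the
// sequencer window), it fills the slot; otherwise the slot becomes a
// deposits-only (possibly empty) block.
func deriveSlot(slotTime uint64, batchesFromL1 map[uint64]batch, deposits []string) []string {
	blockTxs := append([]string{}, deposits...) // force-included deposits always go in first
	if b, ok := batchesFromL1[slotTime]; ok {
		blockTxs = append(blockTxs, b.txs...) // the sequencer's "proposal" made it in time
	}
	return blockTxs
}

func main() {
	genesis := uint64(1_700_000_000)
	// Only the second slot has a batch; the other slots still produce (empty) blocks.
	batchesFromL1 := map[uint64]batch{genesis + 2: {txs: []string{"0xuserTx"}}}
	for slot := uint64(0); slot < 3; slot++ {
		ts := genesis + slot*l2BlockTime
		fmt.Printf("L2 slot at t=%d contains %d txs\n", ts, len(deriveSlot(ts, batchesFromL1, nil)))
	}
}
```

The point of the sketch is that `deriveSlot` runs for every 2-second timestamp no matter what; whether the sequencer was up only changes what the slot gets filled with.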
To push this view to its extreme, and leave the reader with another confusion-inducing question, the L2 analogy to the famous [philosophical thought experiment](https://en.wikipedia.org/wiki/If_a_tree_falls_in_a_forest_and_no_one_is_around_to_hear_it,_does_it_make_a_sound%3F) is: "if no op-node is deriving the L2 block at a given timestamp, does that block still exist?"

This all just reinforces the above quote that an L2 is *derived* from an L1, by coupling their FCRs. In some sense, the L1, not the sequencer, *is* the L2. If the Jon Charbonneau image above didn't make sense at first, go reread it, and hopefully it'll make more sense now.

### Protocol Parameters Confusion

The previous confusion was of a very high-level, abstract nature. If it is still confusing, diving into more low-level details should provide clarity. So let's dive into a lower-level confusion.

I personally find that I don't understand a system until I'm able to look at its config file and explain what every option does. Optimism has a lot of different [configs](https://docs.optimism.io/builders/chain-operators/configuration/overview) for its multiple components and actors, but the [Rollup Config](https://docs.optimism.io/builders/chain-operators/configuration/rollup) contains the main, most important parameters, which are protocol specific (as opposed to actor specific). These are the equivalent of Ethereum's [consensus spec configs](https://github.com/ethereum/consensus-specs/tree/dev/configs), and for op-mainnet they can be found in this [config file](https://github.com/ethereum-optimism/optimism/blob/develop/packages/contracts-bedrock/deploy-config/mainnet.json#L9).

The [L2 blocks related parameters](https://docs.optimism.io/builders/chain-operators/configuration/rollup#blocks) are particularly important, but they are also very confusing to newcomers. Out of them, the following two are the most interesting, and also the most confusing:

- [maxSequencerDrift](https://docs.optimism.io/builders/chain-operators/configuration/rollup#maxsequencerdrift) (default = 30 mins after Fjord)
    - If the sequencer's L1 connection breaks, this drift value determines how long it can keep producing blocks without violating the timestamp drift derivation rules.
    - Important for censorship resistance through forced inclusion of deposit transactions: the sequencer is forced to advance its blocks' L1 origin after some amount of time, and hence include the next deposit transactions.
    - [code](https://github.com/ethereum-optimism/optimism/blob/f243ad0d892dfa9dc022caeb0155c0c9ff5fef52/op-node/rollup/derive/batches.go#L128) where it's used in the derivation pipeline
- [sequencerWindowSize](https://docs.optimism.io/builders/chain-operators/configuration/rollup#sequencerwindowsize) (default = [3600](https://github.com/ethereum-optimism/optimism/blob/5a5dd8f44161e8e05093d92b32e102eb38fe78b6/packages/contracts-bedrock/deploy-config/mainnet.json#L10) L1 blocks)
    - After a block has been produced, how far into the future (past the block's L1.origin) the batcher is allowed to wait before submitting the block/batch to Ethereum.
    - Important for ensuring liveness (and finality) of the L2 chain: even if the sequencer/batcher is down and no batches are submitted to Ethereum for 12 hours, the L2 will still advance the chain by creating blocks that only contain deposit transactions.
    - [code](https://github.com/ethereum-optimism/optimism/blob/f243ad0d892dfa9dc022caeb0155c0c9ff5fef52/op-node/rollup/derive/batches.go#L87) where it's used in the derivation pipeline

The above jargon was most likely not very enlightening on a first reading. Hopefully this visualization will help:

![image](https://hackmd.io/_uploads/HygGt2CqR.png)

With Ethereum's 12s blocks and optimism's 2s blocks, in most situations each L2 epoch (the sequence of L2 blocks anchored at a specific L1 block) will contain 6 L2 blocks. However, if the sequencer loses its connection to the L1 for whatever reason, it is still allowed to produce L2 blocks anchored to the same L1 block for 30 minutes, after which it is forced to regain its connection to the L1 and include L1 deposit transactions.

### Concrete Example: max_sequencer_drift on a PoW L1

Let's take `max_sequencer_drift` as an example. From this [code comment](https://github.com/ethereum-optimism/optimism/blob/1e2b19cb6b01c259508b244c6b82a640ac725136/op-node/rollup/types.go#L72):

> Sequencer batches may not be more than MaxSequencerDrift seconds after the L1 timestamp of their L1 origin time.
>
> Note: When L1 has many 1 second consecutive blocks, and L2 grows at fixed 2 seconds, the L2 time may still grow beyond this difference.

We will focus on the note in this quote as a source of confusion. Does it make sense to you? It didn't make sense to me for a long while.

![IMG_0D4ED3C8EAD7-1](https://hackmd.io/_uploads/SkWV_Af3R.jpg)

The main idea is that L2 blocks cannot skip L1 blocks, because otherwise deposits might not get included. If optimism were still anchored on PoW ethereum, and (however unlikely) the L1 produced a bunch of blocks in rapid 1s succession, then the drift between the 2s-produced L2 blocks and the 1s-produced L1 blocks might eventually grow beyond `max_sequencer_drift`. Thus, a special [rule](https://specs.optimism.io/protocol/derivation.html#overview) was added to allow this behavior, in case some op-stack rollup were to anchor itself on bitcoin or some other PoW chain.

![image](https://hackmd.io/_uploads/ByuadCGhA.png)

The paragraph below this mentions:

> The second constraint ensures that an L2 block timestamp never precedes its L1 origin timestamp, and is never more than max_sequencer_drift ahead of it, except only in the unusual case where it might prohibit an L2 block from being produced every l2_block_time seconds. (Such cases might arise for example under a proof-of-work L1 that sees a period of rapid L1 block production.) In either case, the sequencer enforces len(batch.transactions) == 0 while max_sequencer_drift is exceeded. See Batch Queue for more details.

## Getting Confused about OP-Batcher

Very succinctly, the [op-batcher](https://docs.optimism.io/builders/chain-operators/architecture#op-batcher) periodically polls the sequencer's op-node for new L2 blocks, transforms them into batches, and adds those to its current channel, whose contents are compressed and eventually split into frames (1 frame == 1 blob in our case) which are sent to the L1.
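To tie that description together, here is a hedged Go sketch of what happens to a channel's worth of blocks. The types and function names are invented, there is no error handling, and this is not the real op-batcher code; only the shape of the flow is taken from the description above: serialize the blocks pulled from the sequencer, compress them, and split the result into blob-sized frames.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
)

// block and frame are illustrative stand-ins for the real op-batcher types.
type block struct{ payload []byte }
type frame struct{ data []byte }

const maxFrameSize = 131072 // roughly one 4844 blob

// framesForChannel sketches the batcher's job for a single channel: serialize
// the blocks it pulled from the sequencer (the real code encodes them as
// batches first), compress the channel (zlib, as seen in the devnet channel
// output later on), and split the compressed bytes into blob-sized frames.
func framesForChannel(channelBlocks []block) []frame {
	var raw bytes.Buffer
	for _, b := range channelBlocks {
		raw.Write(b.payload)
	}

	var compressed bytes.Buffer
	zw := zlib.NewWriter(&compressed)
	zw.Write(raw.Bytes())
	zw.Close()

	var frames []frame
	for data := compressed.Bytes(); len(data) > 0; {
		n := len(data)
		if n > maxFrameSize {
			n = maxFrameSize
		}
		frames = append(frames, frame{data: data[:n]}) // 1 frame == 1 blob in our setup
		data = data[n:]
	}
	return frames
}

func main() {
	blocks := []block{{payload: bytes.Repeat([]byte("some l2 tx data "), 1000)}}
	fmt.Printf("channel closed: %d block(s) -> %d frame(s)\n", len(blocks), len(framesForChannel(blocks)))
}
```

One assumption worth flagging: the real batcher also tracks channel duration, timeouts, and target/max frame counts (the parameters we will play with below); the sketch ignores all of that.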
The channel abstraction is needed to fill L1 blobs and save on gas costs:

![image](https://hackmd.io/_uploads/B172nglo0.png)

The batcher's parameters that we will play with are described [here](https://docs.optimism.io/builders/chain-operators/configuration/batcher), and can be modified [here](https://github.com/Layr-Labs/optimism-exp/blob/e3f3dfc09015ca8472c136c2b8ae2e0190ce5169/ops-bedrock/docker-compose.yml#L187). First, we need to set up the devnet in order to even run the op-batcher.

### Setting up the devnet

> NOTE: I unfortunately was made aware, after having written this blog post, that the docker compose devnet we are using here is going to get [deprecated](https://github.com/ethereum-optimism/optimism/issues/11562) in favor of the new [kurtosis based one](https://github.com/ethpandaops/optimism-package). Fortunately, the principles presented here still apply, even if the default tooling might be different by the time you are reading this.

We use a fork of optimism where we added monitoring to the devnet, so that we have better observability and can play around with grafana panels to ask meaningful questions.

First, clone the repo and check out the correct commit

```
git clone git@github.com:Layr-Labs/optimism-exp.git
cd optimism-exp
git checkout e51e887
```

Then start the devnet with

```
make devnet-up
```

You can always stop the devnet with one of

```
# this will just stop the containers, but leave the docker volumes intact
# so that we can restart the devnet whenever
make devnet-down

# this will remove all volumes, and delete all temporary config directories
# if you ever run into weird errors, try devnet-clean/devnet-up
make devnet-clean
```

### Devnet Configs

You will find the config files for the devnet in `.devnet/`.

<details>
<summary>Show devnetL1.json rollup config file</summary>
<br>
<pre>
{
  "l1ChainID": 900,
  "l2ChainID": 901,
  "l2BlockTime": 2,
  "maxSequencerDrift": 300,
  "sequencerWindowSize": 200,
  "channelTimeout": 120,
  "p2pSequencerAddress": "0x9965507D1a55bcC2695C58ba16FB37d819B0A4dc",
  "batchInboxAddress": "0xff00000000000000000000000000000000000901",
  "batchSenderAddress": "0x3C44CdDdB6a900fa2b585dd299e03d12FA4293BC",
  "l1StartingBlockTag": "earliest",
  "l2OutputOracleSubmissionInterval": 10,
  "l2OutputOracleStartingTimestamp": 0,
  "l2OutputOracleStartingBlockNumber": 0,
  "l2OutputOracleProposer": "0x70997970C51812dc3A010C7d01b50e0d17dc79C8",
  "l2OutputOracleChallenger": "0x15d34AAf54267DB7D7c367839AAf71A00a2C6A65",
  "l2GenesisBlockGasLimit": "0x1c9c380",
  "l1BlockTime": 6,
  "baseFeeVaultRecipient": "0x14dC79964da2C08b23698B3D3cc7Ca32193d9955",
  "l1FeeVaultRecipient": "0x23618e81E3f5cdF7f54C3d65f7FBc0aBf5B21E8f",
  "sequencerFeeVaultRecipient": "0xa0Ee7A142d267C1f36714E4a8F75612F20a79720",
  "baseFeeVaultMinimumWithdrawalAmount": "0x8ac7230489e80000",
  "l1FeeVaultMinimumWithdrawalAmount": "0x8ac7230489e80000",
  "sequencerFeeVaultMinimumWithdrawalAmount": "0x8ac7230489e80000",
  "baseFeeVaultWithdrawalNetwork": 0,
  "l1FeeVaultWithdrawalNetwork": 0,
  "sequencerFeeVaultWithdrawalNetwork": 0,
  "proxyAdminOwner": "0xa0Ee7A142d267C1f36714E4a8F75612F20a79720",
  "finalSystemOwner": "0xa0Ee7A142d267C1f36714E4a8F75612F20a79720",
  "superchainConfigGuardian": "0xa0Ee7A142d267C1f36714E4a8F75612F20a79720",
  "finalizationPeriodSeconds": 2,
  "fundDevAccounts": true,
  "l2GenesisBlockBaseFeePerGas": "0x1",
  "gasPriceOracleOverhead": 2100,
  "gasPriceOracleScalar": 1000000,
  "gasPriceOracleBaseFeeScalar": 1368,
  "gasPriceOracleBlobBaseFeeScalar": 810949,
  "enableGovernance": true,
  "governanceTokenSymbol": "OP",
"governanceTokenName": "Optimism", "governanceTokenOwner": "0xa0Ee7A142d267C1f36714E4a8F75612F20a79720", "eip1559Denominator": 50, "eip1559DenominatorCanyon": 250, "eip1559Elasticity": 6, "l1GenesisBlockTimestamp": "0x123", "l2GenesisRegolithTimeOffset": "0x0", "l2GenesisCanyonTimeOffset": "0x0", "l2GenesisDeltaTimeOffset": "0x0", "l2GenesisEcotoneTimeOffset": "0x40", "l1CancunTimeOffset": "0x30", "systemConfigStartBlock": 0, "requiredProtocolVersion": "0x0000000000000000000000000000000000000000000000000000000000000000", "recommendedProtocolVersion": "0x0000000000000000000000000000000000000000000000000000000000000000", "faultGameAbsolutePrestate": "0x03c7ae758795765c6664a5d39bf63841c71ff191e9189522bad8ebff5d4eca98", "faultGameMaxDepth": 50, "faultGameClockExtension": 0, "faultGameMaxClockDuration": 1200, "faultGameGenesisBlock": 0, "faultGameGenesisOutputRoot": "0xDEADBEEFDEADBEEFDEADBEEFDEADBEEFDEADBEEFDEADBEEFDEADBEEFDEADBEEF", "faultGameSplitDepth": 14, "faultGameWithdrawalDelay": 604800, "preimageOracleMinProposalSize": 10000, "preimageOracleChallengePeriod": 120, "proofMaturityDelaySeconds": 12, "disputeGameFinalityDelaySeconds": 6, "respectedGameType": 254, "useFaultProofs": true, "usePlasma": true, "daCommitmentType": "KeccakCommitment", "daChallengeWindow": 16, "daResolveWindow": 16, "daBondSize": 1000000, "daResolverRefundPercentage": 0 }Ï </pre> </details> For some reason the L1 config file is not copied into `.devnet/`, but can be found in [ops-bedrock/beacon-data/config.yaml](https://github.com/Layr-Labs/optimism-exp/blob/samlaf--alt-da/ops-bedrock/beacon-data/config.yaml). ### Playing with batch_decoder There's a cool batch decoding [cli tool](https://github.com/ethereum-optimism/optimism/tree/develop/op-node/cmd/batch_decoder) available, which we can use to analyze channels and frames and understand how they are being submitted to the devnet's L1. > **_NOTE:_** you will need to comment [this line](https://github.com/ethereum-optimism/optimism/blob/f243ad0d892dfa9dc022caeb0155c0c9ff5fef52/op-node/cmd/batch_decoder/main.go#L173) to make the reassemble command work on devnet. For some reason the code overrides the BatchInboxAddress that we pass as argument flag. I created an [issue](https://github.com/ethereum-optimism/optimism/issues/11506) for this. ```bash! cd op-node/cmd/batch_decoder # this command will extract batcher txs on the L1 between blocks 0 and 50, and write their frame details to the /tmp/batch_decoder/transactions_cache dir go run . fetch --start 0 --end 50 --inbox 0xff00000000000000000000000000000000000901 --sender 0x3c44cdddb6a900fa2b585dd299e03d12fa4293bc --l1 http://localhost:8545 # this command will read the batcher txs output from the previous command, reassemble the channels that it can, and write out decoded batches to the /tmp/batch_decoder/channel_cache dir go run . 
reassemble --inbox 0xff00000000000000000000000000000000000901 ``` <details> <summary>Example batcher tx output file</summary> <br> <pre> { "tx_index": 0, "inbox_address": "0xff00000000000000000000000000000000000901", "block_number": 31, "block_hash": "0x37dc6f25a5e8c49785f5ea977893ffdbd1d8af605c724b6b9dd32834509a905b", "block_time": 1723873130, "chain_id": 900, "sender": "0x3c44cdddb6a900fa2b585dd299e03d12fa4293bc", "valid_sender": true, "frames": [ { "id": "35ba946e6cec461a3bd2fb4caf428490", "frame_number": 0, "data": "eNra4cPww3OB+Udp9Z4z8aUrp59/vrvR4rlcl8dEOXaxyQs2nxEJjF3QJrng6cEakZL/O2advPvFtWOv6yKFD3yZLc9jNxwXaPj181Nvf0vaAXO/AxDThDzKrdUWdEaqP1U+4p8n23u7puRYcrMP885tmWm3m7+GSS2wv9+eky6xSPTp/Vk2h56YH/zvNPNj2fT2tz7O3x9LWu0+ATItAGpa3aWmWVbfezbo3lJe87jfn3NygCFb6rS5c3Nl1Dj8r+seJ860IKhpvHsE5ohbKBVo/r52TuNOjKlovFGi7MzbsrxpWWlT3lZvI860EKhpx+5cUUp8szAk716Tdvfcthcms1V+aceZLdbJWcGawSyWKL2A7c+uZOnAN8snL7bPPhl2lj3zxOOwnzsZzoim/NFVDP77FmRaGNS0A6ndyjPnhC779m+TmNuOdhvF3dyKfGfTL4svOtZwxDaSgzjTIg4AAgAA//8nN+ps", "is_last": true } ], "frame_parse_error": [ "" ], "valid_data": [ true ], "tx": { "type": "0x2", "chainId": "0x384", "nonce": "0xe", "to": "0xff00000000000000000000000000000000000901", "gas": "0x69cc", "gasPrice": null, "maxPriorityFeePerGas": "0x3b9aca00", "maxFeePerGas": "0xb2d05e00", "value": "0x0", "input": "0x0035ba946e6cec461a3bd2fb4caf42849000000000016b78dadae1c3f0c37381f94769f59e33f1a52ba79f7fbebbd1e2b95c97c7443976b1c90b369f11098c5dd026b9e0e9c11a9192ff3b669dbcfbc5b563afeb22850f7c992dcf63371c1768f8f5f3536f7f4bda0173bf0310d3843ccaadd5167446aa3f553ee29f27db7bbba6e45872b30ff3ce6d9969b79bbf86492db0bfdf9e932eb148f4e9fd5936879e981ffcef34f363d9f4f6b73ecedf1f4b5aed3e01322d006a5adda5a65956df7b36e8de525ef3b8df9f737280215beab4b9737365d438fcafeb1e27ceb420a869bc7b04e6885b281568febe764ee34e8ca968bc51a2ecccdbb2bc69596953de566f23ceb410a869c7ee5c514a7cb33024ef5e9376f7dcb61726b3557e69c7992dd6c959c19ac12c9628bd80edcfae64e9c037cb272fb6cf3e1976963df3c4e3b09f3b19ce88a6fcd1550cfefb16645a18d4b403a9ddca33e7842efbf66f9398db8e761bc5dddc8a7c67d32f8b2f3ad670c436928338d3220e00020000ffff2737ea6c01", "accessList": [], "v": "0x0", "r": "0x1e38f6d5232f064d5fb9994d40333c86b3959f1eef9784864bc487b5e482bb14", "s": "0x2f002e94e965f399bea543401cd972ee57fa00d5b65ea915ecb093fefa794664", "yParity": "0x0", "hash": "0x2defbe890a37a190f4b51d45a7b664b2f382b2718189481622e2853182ec7e83" } } </pre> </details> <details> <summary>Example channel output file</summary> <br> <pre> { "id": "70c8f9c9fc17bf378a5e6bd23f7a385c", "is_ready": true, "invalid_frames": false, "invalid_batches": false, "frames": [ { "transaction_hash": "0x72654b0fe0fe9fabd2dff6dd1e4766e469acdad61f6d58193ed754d19055d0ff", "inclusion_block": 27, "timestamp": 1723873106, "block_hash": "0x06fcba631b51eca793a33f6bc956cd0769c8e356f9b900cc1564fc2d2153fded", "frame": { "id": "70c8f9c9fc17bf378a5e6bd23f7a385c", "frame_number": 0, "data": "eNra4cPww3MB3/JexxWODmF9L3Pfn3rdOMF2q9qk+uN2ba8tF/1+w8UYJ7pgqZgb25UXj6vZjLepG2V+2H7M5rHlztKYDLUJe5bPinxzpiXtgLnZAYhpVpkpGbYqDqH/khK5rnDedXu+PTClTVrl93vNct8Ijb2tYgv+WiSx2DDFqJVeP1+b92Xze8ejhi+zxRbrz9/I8m5hddNjkGkWUNPa/7zf41d6NuRA/5ZLvg2+bht2vmLa1hfX/WDeljuqHM45xJlmBTUtdIXIl+y+7pfu6u8m9+WVW0V95rN5+Fdfq2x+/qNd6mcEiDPNBmraozUbt//NzNkl7xPznj2lPGEWR9mCubuZbM7W5DBZFlRGiC+Ycn6H7aIF6/pjW9VFJtioZO01YZkn7/3xadPLI4ZrjktNBplmBzXN4JIfd8dmyZizPtzWapVTLbU+XtF22Zn+WnBikqjgrI5C4kxzOAAIAAD//3aa4tA=", "is_last": true } } ], "batches": [ { "ParentHash": "0x0ea78d41a84140568ee96defcaeb81903db526927fc73e86eb39a2fbec0a015e", "EpochNum": 21, "EpochHash": 
"0xa5164606d4e8e37b0633b6273269f0b7c63ce339b9755c682690bca79a59eccc", "Timestamp": 1723873078, "Transactions": [] }, { "ParentHash": "0x3a6964683d244055fe62610ad409dd46e7b75164861b24fbef29774d5828bd85", "EpochNum": 22, "EpochHash": "0xfd3862043c025c2675d7cf7d6ef4b3ef41c531e96b16a32f9fb104eea17b82e3", "Timestamp": 1723873080, "Transactions": [] }, { "ParentHash": "0x87fcefbc4e75cd54c08fb4d24d804d46b0b9ea02b68e5e8be09eb4dc2508436c", "EpochNum": 22, "EpochHash": "0xfd3862043c025c2675d7cf7d6ef4b3ef41c531e96b16a32f9fb104eea17b82e3", "Timestamp": 1723873082, "Transactions": [] }, { "ParentHash": "0x55a814f46b8e8be94727ee938e6e773a5af30e3ce1fd2f2a769f6fe2ba27cc10", "EpochNum": 22, "EpochHash": "0xfd3862043c025c2675d7cf7d6ef4b3ef41c531e96b16a32f9fb104eea17b82e3", "Timestamp": 1723873084, "Transactions": [] }, { "ParentHash": "0xe2acb1b7fd696cba1f4c5cef076477609a0876a09dbb023ccd7c6c0239707958", "EpochNum": 23, "EpochHash": "0x94cfb83da2a0ae8f5d852714903c246abd34049e1f4bf1e582e9c431acc71a93", "Timestamp": 1723873086, "Transactions": [] }, { "ParentHash": "0x30d24e0b88b3195ccd4c0b3b267995392af1d42b44b967eb11916215119a8871", "EpochNum": 23, "EpochHash": "0x94cfb83da2a0ae8f5d852714903c246abd34049e1f4bf1e582e9c431acc71a93", "Timestamp": 1723873088, "Transactions": [] } ], "batch_types": [ 0, 0, 0, 0, 0, 0 ], "compr_algos": [ "zlib", "zlib", "zlib", "zlib", "zlib", "zlib" ] } </pre> </details> Note that the batches in the channel output file all have empty tx lists. This is normal, because no traffic is being sent to the sequencer! Before sending txs, let's first look at the grafana dashboard and make sure we can understand what we are seeing. ### Playing with Grafana The devnet starts a local [grafana instance](https://github.com/Layr-Labs/optimism-exp/blob/e3f3dfc09015ca8472c136c2b8ae2e0190ce5169/ops-bedrock/docker-compose.yml#L258), which you can access at <a href="http://localhost:3000" target="_blank">http://localhost:3000</a>, with user `admin` and password `admin`. At the very bottom, you will see ![image](https://hackmd.io/_uploads/H1AZbEgiR.png) <details> <summary>Explanation</summary> <br> Here are the definitions for unsafe, safe, and finalized: ![image](https://hackmd.io/_uploads/ryNkfNesA.png) Can you explain the different numbers? For eg, reading from the [devnet configs](#Devnet-Configs) section, one can find that the L1 runs at 6 seconds_per_slot, and the L2 at 2 second blocktime. Hence, there are 3 L2 blocks per L1 block, and thus 778 (L1 head) * 3 = 2334, which is ~= 2337 (the L2 had produced 3 more blocks that still hadn't been written to the L1) when this prometheus snapshot was taken. </details> We will now walk through the main batcher panels at the top of the grafana dashboard. Can you explain their behavior just by looking at the op-batcher [arguments](https://github.com/Layr-Labs/optimism-exp/blob/e3f3dfc09015ca8472c136c2b8ae2e0190ce5169/ops-bedrock/docker-compose.yml#L187)? ![image](https://hackmd.io/_uploads/ryV3H60qR.png) <details> <summary>pending blocks explanation</summary> <br> op-batcher polls op-node for new blocks <a href="https://github.com/Layr-Labs/optimism-exp/blob/e3f3dfc09015ca8472c136c2b8ae2e0190ce5169/ops-bedrock/docker-compose.yml#L193">every 2sec</a>, and the devnet L2 has 2 second blocks. 
</details>

<details>
<summary>blocks added to current channel explanation</summary>
<br>
op-batcher has a max_channel_duration of <a href="https://github.com/Layr-Labs/optimism-exp/blob/e3f3dfc09015ca8472c136c2b8ae2e0190ce5169/ops-bedrock/docker-compose.yml#L191">10</a> L1 blocks. With 3 L2 blocks per L1 block, this means 30 L2 blocks fit per channel. The panel actually shows 27, and I am not sure where the discrepancy comes from.
</details>

<details>
<summary>size of transactions explanation</summary>
<br>
This one seems odd. It shows that there are never any bytes in the pending block (which seems to align with the fact that no traffic is being sent). But the total number of bytes keeps increasing, which contradicts the previous observation. This is simply explained by the current gauge being <a href="https://github.com/Layr-Labs/optimism-exp/blob/e3f3dfc09015ca8472c136c2b8ae2e0190ce5169/op-batcher/metrics/metrics.go#L280">increased</a> and <a href="https://github.com/Layr-Labs/optimism-exp/blob/e3f3dfc09015ca8472c136c2b8ae2e0190ce5169/op-batcher/metrics/metrics.go#L280">decreased</a> faster than the 2-second prometheus scrape interval can pick up. The bytes are added to the channel state and very quickly processed. This probably just means that the current metric is not the right approach, and a histogram might be more useful here.
<br>
<br>
On the other hand, what explains that blocks are non-empty? This is because L2 blocks contain 3 things: L1 deposits (none here), L2 txs (none here), and <a href="https://specs.optimism.io/glossary.html#l1-attributes-deposited-transaction">L1 attributes deposited txs</a>, which are system transactions that are always present. This explains the bytes observed.
</details>

<details>
<summary>input/output bytes per channel</summary>
The input added curve has the same pattern as the blocks added to current channel panel above it, which makes sense.
<br><br>
The input closed curve showing more bytes than the output curve also makes sense, given that channels are compressed. We see that when closed, channels contain ~2.2KB, and are compressed down to ~1.5KB.
</details>

![image](https://hackmd.io/_uploads/SJ4wJSxiR.png)

<details>
<summary>number of frames per channel explanation</summary>
<br>
1 frame per channel is the <a href="https://github.com/Layr-Labs/optimism-exp/blob/36b0029075af964969664905c51f915fc81dd635/op-batcher/flags/flags.go#L82">default</a>, and that's what we see here. Nothing surprising.
</details>

<details>
<summary>tx confirmation latency explanation</summary>
<br>
Both bottom panels show the same information: each transaction takes ~12 seconds to get confirmed. That is because it gets included in the next L1 block (~6 seconds), and then waits <a href="https://github.com/Layr-Labs/optimism-exp/blob/36b0029075af964969664905c51f915fc81dd635/ops-bedrock/docker-compose.yml#L194">1</a> more block to get confirmed. Don't bother changing the config to use 0 block confirmations; that is not allowed, for some reason.
</details>

The batcher txs count just grows monotonically, so it is not very interesting.

![image](https://hackmd.io/_uploads/HJ__kreoR.png)

<details>
<summary>pending txs explanation</summary>
<br>
Remember that max_channel_duration is set to <a href="https://github.com/Layr-Labs/optimism-exp/blob/36b0029075af964969664905c51f915fc81dd635/ops-bedrock/docker-compose.yml#L191">10</a> L1 blocks = 60 seconds. We see one pending tx roughly every 60 seconds, which takes ~12 seconds to get confirmed.
That is, it gets included in the next L1 block (6 seconds), and then waits <a href="https://github.com/Layr-Labs/optimism-exp/blob/36b0029075af964969664905c51f915fc81dd635/ops-bedrock/docker-compose.yml#L194">1</a> more block to get confirmed, as explained in the tx confirmation latency panel above.
</details>

4844 blob txs are not used by default, so the histogram is empty. They can be turned on by uncommenting [these 2 lines](https://github.com/Layr-Labs/optimism-exp/blob/36b0029075af964969664905c51f915fc81dd635/ops-bedrock/docker-compose.yml#L207).

### Sending Traffic

We can send traffic with this simple evm txs generator:

```
git clone https://github.com/Layr-Labs/evm-tx-load-gen.git
cd evm-tx-load-gen
./run.sh
```

This will send txs to the optimism sequencer (http://localhost:9545). You can change the shape of the traffic by playing with the parameters in the `.env` file. The two main ones are:

```
# the defaults below send one small (600-byte pad) tx per second
# to send 1 MB/s, set interval to 125ms and pad_size to 130958
# with the tx overhead (signature, nonce, etc) the actual txs sent will have size exactly 128KB = 131072 bytes
# this will send 8 txs/s, each of size 128KB, such that 8 * 128 KB = 1 MB
TRAFFIC_GENERATOR_REQUEST_INTERVAL=1s
TRAFFIC_GENERATOR_PAD_SIZE=600
```

Try sending 1 MB/s traffic as suggested above. Make sure to set [OP_BATCHER_DATA_AVAILABILITY_TYPE](https://github.com/Layr-Labs/optimism-exp/blob/samlaf--alt-da/ops-bedrock/docker-compose.yml#L207) to blobs and set `OP_BATCHER_TARGET_NUM_FRAMES: 6` to send 6 blobs per transaction (the max that ethereum allows). Can you explain the change in behavior of the graphs? This level of traffic is too high for the devnet to sustain, so make sure to stop it after a while, once you start seeing `unable to publish transaction` errors.

![image](https://hackmd.io/_uploads/H1qnIBxiC.png)

The panels that I am seeing are:

![image](https://hackmd.io/_uploads/rJjRLSxiA.png)

We see from the "Total size of transactions" panel that the high-throughput traffic gen was stopped around 21:12:00. The total size of pending blocks then plateaued, and the input/output bytes per channel dropped back to 0. The first row is also easy to explain: the blocks were huge, so very few blocks were needed to fill a channel.

![image](https://hackmd.io/_uploads/HyK1wrgoC.png)

We see similar behavior here for the number of frames output per channel, which jumped from 1 to 15, despite [target_num_frames](https://github.com/Layr-Labs/optimism-exp/blob/e3f3dfc09015ca8472c136c2b8ae2e0190ce5169/op-batcher/batcher/channel_config.go#L36) being set to 6. A channel can only go past its target number of frames if the last added block was huge, which is what is happening here. The other panels are less interesting, because we sent too much traffic and caused the txmgr to behave poorly, with failed txs and high latency. The reader should try to find the sweet-spot traffic (or parameters) that prevents this behavior.

![image](https://hackmd.io/_uploads/H1GewSlo0.png)

The batcher pending txs resetting to 0 is interesting behavior. My initial guess was that [channelTimeout](https://docs.optimism.io/builders/chain-operators/configuration/rollup#channeltimeout) was getting hit without channels getting confirmed onchain, which forces a republishing of all channel txs. However, the devnet channelTimeout is set to 120 L1 blocks, which means 12 minutes, whereas this behavior happened much faster. This might be some intricate txmgr behavior; I am not sure.
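As a sanity check on the block-count arithmetic used throughout this section, here is a tiny Go helper that converts the devnet's parameters (values taken from the rollup config and batcher flags shown earlier) into wall-clock durations; the helper itself is just illustrative arithmetic, not anything from the codebase.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	l1BlockTime := 6 * time.Second // the devnet L1 runs with 6s slots

	// Devnet values referenced in this section (all measured in L1 blocks).
	maxChannelDuration := 10   // op-batcher flag
	channelTimeout := 120      // rollup config
	sequencerWindowSize := 200 // rollup config

	fmt.Println("max channel duration:", time.Duration(maxChannelDuration)*l1BlockTime)  // 1m0s  -> roughly one batcher tx per minute
	fmt.Println("channel timeout:     ", time.Duration(channelTimeout)*l1BlockTime)      // 12m0s -> too slow to explain the pending-tx resets
	fmt.Println("sequencer window:    ", time.Duration(sequencerWindowSize)*l1BlockTime) // 20m0s on the devnet (vs 12h on mainnet)
}
```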
The rightmost panel is very interesting. It shows a histogram of the sizes of the blobs sent to the L1. The confusing part is why it is bimodal, instead of all blobs having max size (the inf category just means max blob size = 131072). Remember that each channel is being broken up into 6 frames/blobs, and each 4844 tx sent contains 6 blobs. The bimodal behavior is explained by the last frame of each channel not being full: the channel can't fit one more block without growing beyond the max channel size, so it gets closed with its last frame only partially filled. Notice this [comment](https://github.com/ethereum-optimism/optimism/blob/32eea1b691d66925b028bb72c1564b22a4520d21/op-batcher/batcher/channel_manager.go#L23) in the channelManager implementation:

```
// For simplicity, it only creates a single pending channel at a time & waits for
// the channel to either successfully be submitted or timeout before creating a new
// channel.
```

This image from the [derivation pipeline spec](https://specs.optimism.io/protocol/derivation.html#batch-submission-wire-format) is insightful:

![image](https://hackmd.io/_uploads/rytIOUljC.png)

The blob containing A1|B0 is precisely what the current implementation of channelManager cannot produce. Instead, we see that the B0 space is wasted: the last frame of a channel is sent with empty space, and only then is another channel started.

## Conclusion

We've learned some fundamental properties of how rollups work, and gained some first-hand experience playing with optimism's devnet. I hope that these explorations were more useful than confusing.