Arbitrum Batch Poster Overview

The Arbitrum batch poster is responsible for aggregating and confirming L2 messages constructed by the sequencer. Many optimizations, encoding schemas, and security countermeasures are implemented to ensure that the component can run reliably, assuming a securely configured node.

Technical Overview

Batches are built using the following general sequence:

  1. Fetch the latest batch position from the SequencerInbox (or pre-provided metadata) and establish the current L1 bounds
  2. Start building a batch from the provided on-chain position if unconfirmed messages exist in the sequencer feed
  3. Sequence all messages from latest_confirmed to latest_unconfirmed into a buildingBatch until signaled to stop
  4. Encode the batch and submit it to the DA provider if necessary
  5. Submit the sequencer message (i.e., DA cert, 4844 blob references, or compressed calldata) to the SequencerInbox to confirm the batch on-chain
  6. End the routine and run again after a configured interval

System Diagram


  • Inbox Sequence State - The global state attesting to the progression of both inboxes
  • Parent Chain - Chain that Arbitrum is deployed on
  • Sequencer Message Feed - The sequence of all unconfirmed messages
  • Cooldown - Minimum amount of time that the batch poster must wait before it can submit another batch.
  • Submission Boundary - The L1 block bounds that an incoming L2 message must fulfill for batch inclusion.

Inbox Sequence State

The batch poster fetches the latest inbox sequence state from the SequencerInbox before proceeding to build and confirm the next batch:

type batchPosterPosition struct {
	MessageCount        arbutil.MessageIndex // number of L2 messages already confirmed on-chain
	DelayedMessageCount uint64               // number of delayed inbox messages already read
	NextSeqNum          uint64               // sequence number of the next batch to post
}

Batch

A batch is an encoded and compressed sequence of L2 messages, including:

  • L2 Blocks
  • Batch spending reports
  • L1 deposits

The L2 messages read from the unsafe sequencer feed are reduced to a subset of their fields before encoding.

Raw Segments

Each of these messages is encoded into a 2D raw segments matrix, using only a subset of the entire message when interpreted (l2_message > []segment).
Each raw segment is prepended with a prefix type byte; i.e.:

L2Message = 0 (L2 block)
L2MessageBrotli = 1 (unused)
DelayedMessages = 2 (delayed inbox message type)
AdvanceTimestamp = 3 (forward L1 context)
AdvanceL1BlockNumber = 4 (forward L1 context)

with the raw segments matrix looking like:

[
  [Advance Timestamp Segment],
  [Advance L1 Block Number Segment],
  [Message Segment (L2 Message or Delayed L1 Message)],

  [Advance Timestamp Segment],
  [Advance L1 Block Number Segment],
  [Message Segment (L2 Message or Delayed L1 Message)],

  ...

  [Add]
  [Message Segment (Delayed L1 Message)]
]

The diff segments (i.e., AdvanceTimestamp and AdvanceL1BlockNumber) are only prepended before a new message when it contains a timestamp or L1 block reference that hasn't yet been seen in the building batch. These raw segments are brotli-compressed to optimize batch sizes and reduce submission costs.

Compression

Dynamic level setting
Compression levels are set dynamically based on the existing backlog size (i.e., unsafe_head - safe_head).

Let:

  • B be the backlog or number of L2 messages waiting to be posted.
  • UC be the compression level set by the user via batch poster config.
  • CL be the compression level used for reducing the batch.
  • RL be the recompression level used for recompressing a batch again.
  • C(B, UC) be the (compression level, recompression level) pair used as a function of backlog B and user level UC.

The piecewise function C(B, UC) is defined as:

compression level =
\begin{cases} UC & \text{if } B \leq 20 \\ \min(6, UC) & \text{if } 20 < B \leq 60 \\ \min(4, UC) & \text{if } B > 60 \end{cases}

recompression level =
\begin{cases} UC & \text{if } B \leq 40 \\ \min(6, UC) & \text{if } B > 40 \end{cases}


The recompression level is only used for recompressing existing batch segments in the event that:

  • The batch has overflown
  • The batch hasn't been properly closed (i.e., close() was invoked before calling CloseAndGetBytes())

Completing the batch

The poster must understand when to stop building the message batch. There are a few key scenarios that cause the poster to disregard further messages in the backlog and submit what it currently has:

  • An overflow is detected when batch resourcing constraints are exceeded (e.g., # of messages > allowed messages per batch)
  • A max L1 block boundary is hit, causing the batch poster to disregard further messages
  • max-delay is triggered

Overflow detection

The batch poster uses local limits to determine when a batch has overflown and must be immediately completed.

Let’s define the following variables:

  • S: The set of all segments in the current batch
  • Lmax: The maximum allowable uncompressed size of the batch
  • Cmax: The maximum allowable compressed size of the batch
  • Nmax: The maximum number of message segments supported per batch
  • L(S): The total uncompressed size of the segments in (S)
  • C(S): The total compressed size of the segments in (S)
  • |S|: Cardinality of (S)
  • ΔL: The uncompressed size of the new segment
  • ΔC: The compressed size of the new segment

Overflow Conditions:

An overflow occurs if any of the following conditions holds:

  1. Uncompressed Size Overflow:

    \[ L(S) + \Delta L > L_{\text{max}} \]

  2. Compressed Size Overflow:

    \[ C(S) + \Delta C > C_{\text{max}} \]

  3. Segment Count Overflow:

    \[ |S| + 1 > N_{\text{max}} \]

L1 Block Boundaries

The batch poster also tracks the L1 block associated with the prior batch submission and, based on the L1BlockBound setting, waits for a certain number of L1 blocks before making the next submission.

Currently, the batch poster supports the following settings:

  • l1BlockBoundDefault: Safe if the L1 reader has finality data enabled, otherwise Latest
  • l1BlockBoundSafe: one consensus epoch (i.e., 32 blocks)
  • l1BlockBoundFinalized: two consensus epochs (i.e., 64 blocks)
  • l1BlockBoundLatest: most recent block to be validated
  • l1BlockBoundIgnore: no checks; don't reference L1 state before submitting the batch

Submission Boundary

The batch poster can only sequence an L2 message into the batch if the message's timestamp and BlockNumber respect the submission boundaries. The following must hold for message inclusion:

(minBlockTime, minBlockNumber) --> (msg.blockTime, msg.BlockNumber)
--> (maxBlockTime, maxBlockNumber)

where
minBlockTime <= msg.timestamp <= maxBlockTime

and
minBlockNumber <= msg.blockNumber <= maxBlockNumber

Encoding & Submitting Batches

Three different submission flows exist for publishing batches across different DA destinations:

  • Calldata - Batches are included within L1 SequencerInbox tx calldata
  • 4844 - Batches submitted to ETH beacon chain DA
  • DAP - Data availability provider; arbitrum anytrust or alt da provider forking nitro

All flows interact with the SequencerInbox contract to update the global state sequence using unique entrypoint logic and message structures.

Destination          | Inbox Entrypoint               | Message structure
4844                 | addSequencerL2BatchFromBlobs   | [0x10, blob_hash_0, ..., blob_hash_n]
calldata             | addSequencerL2BatchFromOrigin0 | [0x1, compressed_batch]
dapWriter (anytrust) | addSequencerL2BatchFromOrigin0 | [0x80, keyset_hash, signable_fields, signers_mask, BLS signatures]

Destinations

Calldata

Compressed batch is encoded into the tx calldata that's submitted as part of the SequencerInbox entrypoint transaction.

4844

Takes the L2 batch and encodes it across a span of blobs proportional to the batch size. Each blob is encoded into 32-byte field elements, with the first byte used to store the modulo overflow bits that remain after chunking the input data into 31-byte sections:

Field element encoding:

0            1                         32
|------------|-------------------------|
| spare bits |     blob data           | 


where spare bits = (blob_length % 31) * 8

Post-encoding, the 4844 blob identifier hashes are computed by hashing the BLS12-381 KZG commitment. These identifier hashes construct the sequencer message posted to the inbox and are used for secure blob lookups against the beacon chain.

ALT DA (i.e, AnyTrust)

If a dapWriter is configured, the batch is submitted to the alt-DA location, with the sequencerMsg or batch value overridden by the Data Availability Certificate, which is then submitted on-chain.

Tx Submission

The batch poster constructs the inbox transaction and simulates the execution costs to determine the gas_limit. From there, the tx is submitted via the data poster with the following key fields:

  • nonce - monotonic L1 account nonce
  • meta - expected metadata result after transaction
  • calldata - inbox state update instructions
  • gas_limit - initial gas limit for first tx submission attempt

Security Measures

Delay Proof

An optional delay proof feature is supported as of nitro-contracts v3.0.0 and nitro v3.3.2. The SequencerInbox can force the sequencer to submit an additional DelayProof when sequencing batches. New inbox entrypoints (i.e., addSequencerL2BatchFromBlobsDelayProof, addSequencerL2BatchFromOriginDelayProof) accept this proof alongside the batch data.

Inbox Tx Safety

A data poster object is maintained by the batch poster and is expected to ensure that the SequencerInbox global sequence state update tx always lands safely (even if reorged or rejected). The poster also ensures that tx submissions are respected in the order in which they were obtained. Critical batch poster halts due to reorgs would only happen when the data poster uses a NoopStorage, which doesn't provide fail-safe guarantees.

Distributed Posting

The Arbitrum batch poster can be run across multiple instances with unique private keys. A redis cache is used for distributed coordination across posters: a poster will attempt to build and submit a batch only if it can acquire a write lock; otherwise it waits and retries until a lock can be acquired.

Batch Simulation

An optional check-batch-correctness field exists which forces the batch poster to ensure that a sequencer inbox message can be successfully deserialized using a simulated inbox reader before submission. This is key for ensuring that bad batch commitments can be disregarded during posting rather than during derivation.

Appendix (Unstructured Notes)

Concurrency

Concurrency management for spawning goroutines, event loops, and routine limits is handled by a StopWaiter construction used across the nitro codebase. The batch poster is primarily operated via three concurrent routines:

  • A main event loop that handles batch construction, submission, and accrediting, all performed within the same function (i.e., maybePostSequencerBatch).
  • pollForL1PriceData, which subscribes to new L1 block headers and increments metric gauges for price observability.
  • pollForReverts, which subscribes to new L1 block header events. When streaming, the routine reads the latest range of unprocessed blocks and manually parses every data poster transaction to check whether it reverted, logging an error message if so. If the data poster is configured with noop storage, the batch poster halts, since the data poster can't reliably handle reorgs.

Code Diagram

Flowchart (not rendered in this export). Recoverable structure: the Batch Poster runs three routines against the L1Client, Data Poster, and Sequencer Inbox. CallIterativeRoutine waits for new batches, acquires the distributed lock, runs maybePostSequencerBatch to build a batch from the message queue ([msg0, ..., msgN]), submits it to the DA destination (if applicable), then submits the batch confirmation tx to the Sequencer Inbox and marks the batch as submitted, logging an error or retrying on failure. PollForL1PriceData monitors L1 for price data and updates metric counters using EIP-1559 / EIP-4844 gas values. PollForReverts monitors L1 for reverted submission txs: on a revert with noop data poster storage it stops further submissions; otherwise it logs a message and continues monitoring. The global sequence state is tracked as {globalMsgCount, totalBatches, delayedMsgCount}.

Types

Building Batch

The batch poster maintains a buildingBatch which contains necessary construction and positional metadata:

type buildingBatch struct {
	segments          *batchSegments
	startMsgCount     arbutil.MessageIndex
	msgCount          arbutil.MessageIndex
	haveUsefulMessage bool
	use4844           bool
	muxBackend        *simulatedMuxBackend
}
  • segments: internal batch metadata
  • startMsgCount: message index at which the current batch construction begins
  • msgCount: The total number of L2 messages that have been created by the sequencer
  • haveUsefulMessage: Set to true when the batch is full or when a message is processed that isn't a delayed inbox spending report.
  • use4844: Submit batches to Ethereum beacon chain DA
  • muxBackend: Used to simulate the message inbox derivation to ensure that the batch being built can be correctly read and processed by the system

Batch Segments

As L2 messages are sequenced into a batch, they're interpreted into raw byte segments.

type batchSegments struct {
	compressedBuffer      *bytes.Buffer
	compressedWriter      *brotli.Writer
	rawSegments           [][]byte
	timestamp             uint64
	blockNum              uint64
	delayedMsg            uint64
	sizeLimit             int
	recompressionLevel    int
	newUncompressedSize   int
	totalUncompressedSize int
	lastCompressedSize    int
	trailingHeaders       int // how many trailing segments are headers
	isDone                bool
}