
Optimism batch compression analysis

How to calculate a more accurate L1 fee.

Special thanks to Roberto Bayardo for his excellent compression analysis repo and ideas on linear regression.

Current state

The current post-Regolith, pre-Ecotone cost function is defined as follows:

(zeroes*4 + ones*16 + overhead) * scalar * l1BaseFee / 1e6

where:

  • zeroes: count of 0 bytes in the tx
  • ones: count of non-0 bytes in the tx
  • overhead: overhead per tx (suggested: 188)
  • scalar: scalar per tx (suggested: 684_000)
  • l1BaseFee: current L1 base fee

This is a basic assumption of EIP-2028 costs multiplied by a constant scalar that represents the compression ratio over time.
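
To make the formula concrete, here is a minimal sketch in Python for a hypothetical tx with 1,500 zero bytes and 500 non-zero bytes (the byte counts and L1 base fee are made up for the example; the overhead and scalar are the suggested values above):

zeroes, ones = 1_500, 500
overhead, scalar = 188, 684_000            # suggested values from above
l1_base_fee = 30 * 10**9                   # 30 gwei, assumed for the example

l1_gas = zeroes * 4 + ones * 16 + overhead          # 14,188 "L1 gas"
l1_fee = l1_gas * scalar * l1_base_fee // 10**6     # 291,137,760,000,000 wei (~0.00029 ETH)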

Ecotone will update it to:

(zeroes*4 + ones*16) * (16*l1BaseFeeScalar*l1BaseFee + l1BlobBaseFeeScalar*l1BlobBaseFee) / 16e6

where:

  • zeroes: count of 0 bytes in the tx
  • ones: count of non-0 bytes in the tx
  • l1BaseFeeScalar: scalar to use if submitting calldata
  • l1BaseFee: current L1 base fee
  • l1BlobBaseFeeScalar: scalar to use if submitting 4844 blobs
  • l1BlobBaseFee: current L1 blob fee
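
For comparison, the same hypothetical tx under the Ecotone formula (the scalar and fee values below are assumptions for illustration, not recommended settings):

zeroes, ones = 1_500, 500
l1_base_fee_scalar = 11_111                # example value
l1_blob_base_fee_scalar = 1_250_000        # example value
l1_base_fee = 30 * 10**9                   # 30 gwei (assumed)
l1_blob_base_fee = 1 * 10**9               # 1 gwei (assumed)

compressed_size = zeroes * 4 + ones * 16   # 14,000
weighted_fee = (16 * l1_base_fee_scalar * l1_base_fee
                + l1_blob_base_fee_scalar * l1_blob_base_fee)
l1_fee = compressed_size * weighted_fee // (16 * 10**6)   # 5,760,370,000,000 wei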

While the scalars in these formulas give some macro control over the economics of the chain, they don't take the compressibility of individual transactions into account. For example, a tx whose calldata contains 1000 repeated 1 bytes will be charged more than a tx containing 500 random bytes, even though the former is far more compressible than the latter and costs the chain operator less to roll up.

This starts to incentivize pre-compression of transactions submitted to L2, as we are already seeing with 4337 transactions. On L2s, execution gas is cheap and calldata is expensive, so folks are starting to use libraries like solady's LibZip to save on L1 fees. This behavior will eventually cause the chain operator to increase the scalar value to cover the cost of the remaining, less compressible transactions, essentially forcing other users to cover the DA cost increase. It will also increase L2 gas usage per block, reducing overall chain transaction throughput.

Better estimator

What if we could better estimate the actual cost of rolling up a transaction? A perfectly accurate estimate is tricky, because the L1 cost function runs with only the single transaction as input, while in reality the batcher compresses many transactions together in a batch, which yields better compression ratios.

However, we can do better than the naive EIP-2028 approach above. We can introduce an efficient compression estimator into the fee calculation that gives a more accurate estimate of the actual cost to roll up a transaction.

This compression estimator needs to be efficient, as we don't want to slow down the execution engine with compression tasks. It also needs to have an implementation available in Solidity so we can keep the GasPriceOracle calculation in sync with the execution engine implementation.

Enter FastLZ

FastLZ is a "Small & portable byte-aligned LZ77 compression" algorithm. The advantage of this algorithm is that it is simple and efficient, and has an audited Solidity implementation in the solady library.

We compared FastLZ against running individual transactions through zlib at max compression. While zlib achieves better compression ratios, it did not perform dramatically better for compressibility estimation; both FastLZ and zlib are variations of LZ77. Given that the batcher itself uses zlib, it's unlikely that some other compression algorithm would improve accuracy further, so FastLZ seems to be a good choice for this use case.

There's a reference implementation of FastLZ in Golang here, and a more efficient length-only version here (we only need an estimate of the compression length, not the actual compressed data).
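
To illustrate what a length-only estimator does, here is a toy, greedy LZ77-style sketch in Python. It is not the FastLZ reference algorithm (FastLZ uses a hash table and its own literal/match token encoding); it only shows the idea of counting output bytes without materializing any compressed data:

def lz_estimate_len(data: bytes, window: int = 8192, min_match: int = 3) -> int:
    # Toy greedy LZ77 length-only estimator (illustrative, NOT FastLZ).
    # Counts roughly 1 output byte per literal and 3 per back-reference,
    # approximating the size of a byte-aligned LZ stream.
    i, out = 0, 0
    last_seen = {}                      # 3-byte sequence -> most recent offset
    while i + min_match <= len(data):
        key = data[i:i + min_match]
        j = last_seen.get(key, -1)
        last_seen[key] = i
        if j >= 0 and i - j <= window:
            # Extend the match greedily past the first min_match bytes.
            length = min_match
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            out += 3                    # one offset/length token
            i += length
        else:
            out += 1                    # one literal byte
            i += 1
    return out + (len(data) - i)        # trailing bytes emitted as literals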

Back testing

We wanted to test the improved accuracy of the new L1 cost formula, so we took the following approach:

  1. Obtain a machine with the Geth data directory from an Optimism Mainnet archive node (either extracted from a snapshot or from a running node).
  2. For every single Bedrock transaction, dump the RLP-encoded tx size, the FastLZ compressed size, and a best estimate of the batcher compressed size (using this script).
  3. Perform a time-series comparison of the FastLZ method against the EIP-2028 method. Specifically, we look at the root-mean-squared error (RMSE) of both the FastLZ and EIP-2028 estimators, using the best estimate as the source of truth.

Note: we calculate the "best estimate" using a dual setup of zlib writers at the highest compression level. We aim to keep one of the writers always between 64kb and 128kb in size, simulating the compression achieved within a 128kb batch. Running all of the OP Mainnet transactions since Bedrock through this algorithm produced an estimate of 19,929MB. The actual amount of calldata written was 20,222MB, meaning the estimate is off by only 1.5%, validating the method's accuracy.
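
A sketch of that dual-writer setup in Python (the 128kb target, the 64kb stagger, and the use of Z_SYNC_FLUSH to measure each tx's marginal output are assumptions about the approach described above, not the exact script that produced the numbers):

import zlib

TARGET = 128 * 1024       # simulated channel size
STAGGER = TARGET // 2     # keep the two writers ~64kb apart

class BatchEstimator:
    # Two zlib streams at max compression, staggered by ~64kb, so one stream
    # is always between 64kb and 128kb "full" when a tx arrives.
    def __init__(self):
        self.writers = [zlib.compressobj(9), zlib.compressobj(9)]
        # Pretend the second writer already holds 64kb so the resets stagger
        # (until the first 64kb of txs have been fed, both streams are cold).
        self.fed = [0, STAGGER]

    def estimate(self, tx_rlp: bytes) -> int:
        # Marginal compressed size of tx_rlp as seen by the fuller stream.
        warm = 0 if self.fed[0] >= self.fed[1] else 1
        est = 0
        for i, writer in enumerate(self.writers):
            out = len(writer.compress(tx_rlp)) + len(writer.flush(zlib.Z_SYNC_FLUSH))
            self.fed[i] += len(tx_rlp)
            if i == warm:
                est = out
            if self.fed[i] >= TARGET:   # channel "full": start a fresh stream
                self.writers[i] = zlib.compressobj(9)
                self.fed[i] = 0
        return est

# usage: estimator = BatchEstimator(); sizes = [estimator.estimate(rlp) for rlp in txs]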

Why RMSE? RMSE penalizes outliers more heavily, so it captures things like adversaries potentially trying to exploit weaknesses. We also looked at mean-absolute-error (MAE) and found that the two measures were correlated: estimators that performed well using RMSE also performed well against MAE.
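
For reference, the two error measures as used here (a trivial numpy sketch, where truth is the dual-zlib best estimate):

import numpy as np

def rmse(estimate, truth):
    e, t = np.asarray(estimate, float), np.asarray(truth, float)
    return float(np.sqrt(np.mean((e - t) ** 2)))

def mae(estimate, truth):
    e, t = np.asarray(estimate, float), np.asarray(truth, float)
    return float(np.mean(np.abs(e - t)))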

Results

[Figure: RMSE over time of the FastLZ estimator vs. the EIP-2028 estimator]

You can see that the FastLZ estimator achieves a lower RMSE than the EIP-2028 version. However, we still see a lot of spikes in the chart. These spikes are spaced 14 days apart and are likely due to Worldcoin activity on OP Mainnet, whose transactions are inherently incompressible.

Linear regression

We decided to apply a linear regression to the data to attempt to find a more accurate estimator. This has two advantages:

  • it adds a constant term (the intercept) to the formula
  • it can accept multiple input variables (multivariate linear regression)

Using the "best estimate" above as the dependent variable, we chose to run three separate regressions over the data, using the following sets of independent variables:

  1. [zeroes, ones] (to check whether a linear regression over the EIP-2028 variables alone would improve performance without FastLZ)
  2. [fastlz_length]
  3. [fastlz_length, uncompressed_tx_size]

We performed the regression over a subset of the data: transactions from the month of October 2023, which includes some of the Worldcoin distribution shift. We then generated RMSE charts across the entire dataset, allowing us to validate that the models still hold through the larger changes in the tx data.
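
A sketch of how such a fit might be run (the CSV file, its column names, and the use of scikit-learn are assumptions for illustration; the actual regression script is linked at the end of this note):

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical per-tx dump with columns: zeroes, ones, fastlz_length,
# uncompressed_tx_size, best_estimate (the dual-zlib "source of truth").
data = np.genfromtxt("txs_oct_2023.csv", delimiter=",", names=True)
y = data["best_estimate"]

feature_sets = [
    ["zeroes", "ones"],
    ["fastlz_length"],
    ["fastlz_length", "uncompressed_tx_size"],
]

for cols in feature_sets:
    X = np.column_stack([data[c] for c in cols])
    model = LinearRegression().fit(X, y)
    print(cols, model.intercept_, model.coef_)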

[Figure: RMSE over time of the three regression estimators]

The [fastlz_length, uncompressed_tx_size] regression (in orange above) performed the best, almost eliminating the error spikes seen in the other charts.

The models calculated in these cases are as follows:

  1. v = 8.570750093473123 - 0.01778946*zeroes + 0.87830521*ones
  2. v = -35.60342890895569 + 0.90558897*fastlz_length
  3. v = -27.321890037208703 + 1.03146206*fastlz_length - 0.08866427*uncompressed_tx_size

Other techniques

We wanted to validate some other techniques that didn't involve FastLZ, in case there was a simpler implementation that would yield good results. A few variations were tested:

  • Variations of the EIP-2028 estimator above, where the zero byte was assigned different costs.
  • An estimator that only counted non-repeated and non-zero bytes.
  • An estimator that charged different fees for repeated bytes vs non-repeating bytes.

You can see a full list of the techniques we tried here. While some of these improved on the original algorithm, we didn't find any that performed better than the FastLZ algorithm.

Conclusion

We recommend switching to the following formula for the new L1 cost function:

l1BaseFeeScaled = l1BaseFeeScalar * l1BaseFee * 16
l1BlobFeeScaled = l1BlobFeeScalar * l1BlobBaseFee
l1FeeScaled = l1BaseFeeScaled + l1BlobFeeScaled
((intercept + fastlzCoef*fastlzLength + uncompressedTxCoef*uncompressedTxSize) * l1FeeScaled) / 1e12

where (most variables scaled by 1e6 for integer arithmetic purposes):

  • l1BaseFeeScalar = 11_111 (~= 7600/0.684)
  • l1BlobFeeScalar = 1_250_000 (~= 862000/0.684)
  • l1BaseFee: current L1 base fee
  • l1BlobBaseFee: current L1 blob fee
  • intercept = -27_321_890: intercept (a.k.a. constant) calculated by the linear regression
  • fastlzCoef = 1_031_462: coefficient of the fastlzLength term
  • fastlzLength: length of FastLZ-compressed RLP-encoded signed tx
  • uncompressedTxCoef = -88_664: coefficient of the uncompressedTxSize term
  • uncompressedTxSize: length of uncompressed RLP-encoded signed tx

The l1BaseFeeScalar/l1BlobFeeScalar will give the chain operator some ability to tweak the rollup costs should the tx traffic shape change. They are calculated from the L1 fee scalars doc here, divided by the assumed 0.684 compression ratio.
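
Putting the recommendation together as an integer-arithmetic sketch (it mirrors the constants above; the clamp to zero for tiny transactions is an added assumption, not an output of the regression):

# Regression constants, scaled by 1e6 (see above).
INTERCEPT = -27_321_890
FASTLZ_COEF = 1_031_462
UNCOMPRESSED_TX_COEF = -88_664

def l1_fee(fastlz_length: int, uncompressed_tx_size: int,
           l1_base_fee: int, l1_blob_base_fee: int,
           l1_base_fee_scalar: int = 11_111,
           l1_blob_fee_scalar: int = 1_250_000) -> int:
    # Estimated DA size in bytes, scaled by 1e6.
    estimated_size = (INTERCEPT
                      + FASTLZ_COEF * fastlz_length
                      + UNCOMPRESSED_TX_COEF * uncompressed_tx_size)
    estimated_size = max(estimated_size, 0)   # assumption: never charge a negative size
    l1_fee_scaled = (16 * l1_base_fee_scalar * l1_base_fee
                     + l1_blob_fee_scalar * l1_blob_base_fee)
    # The scalars and the regression constants each carry a 1e6 factor,
    # hence the final division by 1e12.
    return estimated_size * l1_fee_scaled // 10**12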

The linear regression calculation script can be found here (please excuse the lack of correct Python style).