# Optimism batch compression analysis
*How to calculate a more accurate L1 fee*.
*Special thanks to Roberto Bayardo for his excellent [compression analysis](https://github.com/roberto-bayardo/compression-analysis/) repo and ideas on linear regression.*
## Current state
The current post-Regolith, pre-Ecotone cost function is [defined](https://github.com/ethereum-optimism/specs/blob/f16a6252e632564817114bbb84caebb46129566b/specs/protocol/exec-engine.md#pre-ecotone) as follows:
```
(zeroes*4 + ones*16 + overhead) * scalar * l1BaseFee / 1e6
```
where:
- `zeroes`: count of 0 bytes in the tx
- `ones`: count of non-0 bytes in the tx
- `overhead`: overhead per tx (suggested: `188`)
- `scalar`: scalar per tx (suggested: `684_000`)
- `l1BaseFee`: current L1 base fee
This is a basic assumption of [EIP-2028](https://eips.ethereum.org/EIPS/eip-2028) costs multiplied by a constant scalar that represents the compression ratio over time.
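As a concrete illustration, the pre-Ecotone formula can be sketched in Python (variable names mirror the formula above; the default `l1_base_fee` is a placeholder, not a real value):

```python
def pre_ecotone_l1_fee(tx: bytes, overhead: int = 188, scalar: int = 684_000,
                       l1_base_fee: int = 30 * 10**9) -> int:
    """Pre-Ecotone L1 fee: EIP-2028 calldata gas, scaled by a constant scalar."""
    zeroes = tx.count(0)              # zero bytes cost 4 gas each
    ones = len(tx) - zeroes           # non-zero bytes cost 16 gas each
    return (zeroes * 4 + ones * 16 + overhead) * scalar * l1_base_fee // 10**6
```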
Ecotone will [update](https://github.com/ethereum-optimism/specs/blob/f16a6252e632564817114bbb84caebb46129566b/specs/protocol/exec-engine.md#ecotone-l1-cost-fee-changes-eip-4844-da) it to:
```
(zeroes*4 + ones*16) * (16*l1BaseFeeScalar*l1BaseFee + l1BlobBaseFeeScalar*l1BlobBaseFee) / 16e6
```
where:
- `zeroes`: count of 0 bytes in the tx
- `ones`: count of non-0 bytes in the tx
- `l1BaseFeeScalar`: scalar to use if submitting calldata
- `l1BaseFee`: current L1 base fee
- `l1BlobBaseFeeScalar`: scalar to use if submitting 4844 blobs
- `l1BlobBaseFee`: current L1 blob fee
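In Python, the Ecotone formula reads as follows (a sketch mirroring the formula above, with integer division standing in for on-chain arithmetic):

```python
def ecotone_l1_fee(tx: bytes, l1_base_fee_scalar: int, l1_base_fee: int,
                   l1_blob_base_fee_scalar: int, l1_blob_base_fee: int) -> int:
    """Ecotone L1 fee: EIP-2028 gas, priced across calldata and blob markets."""
    zeroes = tx.count(0)
    ones = len(tx) - zeroes
    calldata_gas = zeroes * 4 + ones * 16
    # Weighted DA price: calldata costs 16 gas/byte on L1, blobs 1 gas/byte.
    weighted_price = (16 * l1_base_fee_scalar * l1_base_fee
                      + l1_blob_base_fee_scalar * l1_blob_base_fee)
    return calldata_gas * weighted_price // 16_000_000
```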
While the scalars in these formulas give some macro control over the economics of the chain, they don't take into account the compressibility of transactions. For example, a tx with calldata that contains 1000 repeated `1` values will be charged more than a transaction that contains 500 random bytes, even though the former is more compressible than the latter, and costs the chain operator less to roll up.
This starts to incentivize pre-compression of transactions submitted to L2, as we are already seeing with 4337 transactions. On L2s, execution gas is cheap and calldata is expensive, so users are turning to libraries like solady's LibZip to save on L1 fees. This behavior will eventually force the chain operator to increase the scalar value to cover the cost of these less compressible transactions, essentially making other users cover the DA cost increase. It will also increase L2 gas usage per block, reducing overall chain transaction throughput.
## Better estimator
What if we could provide a better estimate of what a transaction actually costs to roll up? It is tricky to get a perfectly accurate estimate, because the L1 cost function is run with only the transaction as input, while in reality the batcher compresses many transactions together in a batch, which yields better compression ratios.
However, we can do better than the naive EIP-2028 approach above. We can introduce an efficient compression estimator into the fee calculation that gives a more accurate estimate of the actual cost to roll up a transaction.
This compression estimator needs to be efficient as we don't want to slow down the execution engine with compression tasks. It also needs to have an implementation available in Solidity so we can keep the [GasPriceOracle](https://github.com/ethereum-optimism/specs/blob/f16a6252e632564817114bbb84caebb46129566b/specs/protocol/predeploys.md#gaspriceoracle) calculation in sync with the execution engine implementation.
## Enter FastLZ
[FastLZ](https://github.com/ariya/FastLZ) is a "Small & portable byte-aligned LZ77 compression" algorithm. The advantage of this algorithm is that it is simple and efficient, and has an audited [Solidity implementation](https://github.com/Vectorized/solady/blob/8919f61d14a5e7b32f3d809c9f5fe3ea2ebcbc50/src/utils/LibZip.sol) in the `solady` library.
We compared FastLZ with running individual transactions through zlib at max compression, and while zlib results in better compression ratios, we found it did not perform dramatically better for compressibility estimation. Both FastLZ and zlib use a variation of LZ77. Given the batcher uses zlib, it's unlikely that some other compression algorithm will improve accuracy, so FastLZ seems to be a good choice for this use case.
There's a reference implementation of FastLZ in Golang [here](https://gist.github.com/mdehoog/1980b883bef5d67b9ce5d4095adbf872), and a more efficient length-only version [here](https://gist.github.com/mdehoog/d263949d65bc518fa0e18f157d5ca590) (we only need an estimate of the compression length, not the actual compressed data).
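For illustration, the length-only approach can be ported to Python fairly directly. The sketch below follows the structure of the Go reference (hash-table match finding over a byte-aligned LZ77 format); it has not been verified byte-for-byte against the reference, so treat it as an illustration of the algorithm rather than a drop-in replacement:

```python
def flz_compress_len(ib: bytes) -> int:
    """Estimate the FastLZ (level 1) compressed size of `ib` without
    materializing the compressed output."""
    n = 0                     # running compressed-size estimate
    ht = [0] * 8192           # hash table of recent 3-byte sequence positions

    def u24(i: int) -> int:   # read 3 bytes little-endian
        return ib[i] | (ib[i + 1] << 8) | (ib[i + 2] << 16)

    def hsh(v: int) -> int:   # multiplicative hash (mask emulates uint32 wrap)
        return (((2654435769 * v) & 0xFFFFFFFF) >> 19) & 0x1FFF

    def literals(r: int) -> int:
        # each run of up to 32 literals costs 1 control byte + the literals
        c = 0x21 * (r // 0x20)
        r %= 0x20
        return c + r + 1 if r else c

    def match_cost(l: int) -> int:
        # long matches are split into 262-byte chunks; short ones cost 2 bytes
        l -= 1
        return 3 * (l // 262) + (2 if l % 262 < 6 else 3)

    def run_len(p: int, q: int, e: int) -> int:
        # count matching bytes starting at p/q, bounded by e (exclusive)
        l = 0
        e -= q
        while l < e:
            if ib[p + l] != ib[q + l]:
                e = 0  # stop after this byte, mirroring the reference
            l += 1
        return l

    a = 0
    ip_limit = len(ib) - 13 if len(ib) >= 13 else 0
    ip = 2
    while ip < ip_limit:
        while True:
            s = u24(ip)
            h = hsh(s)
            r = ht[h]
            ht[h] = ip
            d = ip - r
            if ip >= ip_limit:
                break
            ip += 1
            if d <= 0x1FFF and s == u24(r):
                break  # found a match within the 8kb window
        if ip >= ip_limit:
            break
        ip -= 1
        if ip > a:
            n += literals(ip - a)
        l = run_len(r + 3, ip + 3, ip_limit + 9)
        n += match_cost(l)
        ht[hsh(u24(ip + l))] = ip + l          # re-seed the hash table
        ht[hsh(u24(ip + l + 1))] = ip + l + 1
        ip += l + 2
        a = ip
    n += literals(len(ib) - a)
    return n
```

Highly repetitive input should collapse to a handful of match tokens, while inputs shorter than the 13-byte minimum are emitted as pure literals.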
## Back testing
We wanted to test the improved accuracy of the new L1 cost formula, so we took the following approach:
1. Obtain a machine with the Geth data directory from an Optimism Mainnet archive node (either extracted from a snapshot or from a running node).
2. For every single Bedrock transaction, dump the RLP-encoded tx size, the FastLZ compressed size, and a best estimate of the batcher compressed size (using [this script](https://gist.github.com/mdehoog/0b1448223dbc67f0c6b0a0eebeb733fb)).
3. Perform a time-series comparison of the FastLZ method against the EIP-2028 method. Specifically, we look at the root-mean-squared error (RMSE) of both FastLZ and EIP-2028, using the best estimate as the source of truth.
Note: we calculate the "best estimate" using a dual setup of zlib writers at the highest compression level. We aim to always keep one of the writers between 64kb and 128kb in size, simulating the compression achieved within a 128kb batch. Running all of the transactions on OP Mainnet since Bedrock through this algorithm produced an estimate of 19,929MB. The actual amount of calldata written was 20,222MB, meaning the estimate is off by only 1.5%, validating the method's accuracy.
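The dual-writer idea can be sketched as follows. This is a hypothetical reconstruction for illustration (the linked script is authoritative); the staggering logic in particular is an assumption about how the two streams are kept ~64kb apart:

```python
import zlib

class BatchSizeEstimator:
    """Marginal batch-compressed size per tx, via two staggered zlib streams.

    Every tx is fed to both streams; a stream restarts once it has emitted
    ~128kb, and the streams are offset so the fuller one always holds between
    64kb and 128kb of history, like a partially filled 128kb batch.
    """

    LIMIT = 128 * 1024

    def __init__(self) -> None:
        self.streams = [zlib.compressobj(9), zlib.compressobj(9)]
        self.sizes = [0, 0]
        self.bootstrapped = False

    def estimate(self, tx: bytes) -> int:
        marginal = [0, 0]
        for i, c in enumerate(self.streams):
            # Z_SYNC_FLUSH forces the tx's bytes out without ending the
            # stream, so the output delta approximates this tx's share.
            out = len(c.compress(tx)) + len(c.flush(zlib.Z_SYNC_FLUSH))
            self.sizes[i] += out
            marginal[i] = out
        full = 0 if self.sizes[0] >= self.sizes[1] else 1
        result = marginal[full]
        # Bootstrap the stagger: first time a stream passes 64kb, restart
        # the other one so the two stay roughly half a batch apart.
        if not self.bootstrapped and self.sizes[full] > self.LIMIT // 2:
            self.streams[1 - full] = zlib.compressobj(9)
            self.sizes[1 - full] = 0
            self.bootstrapped = True
        # Restart any stream that has emitted a full 128kb batch.
        for i in (0, 1):
            if self.sizes[i] > self.LIMIT:
                self.streams[i] = zlib.compressobj(9)
                self.sizes[i] = 0
        return result
```

The key property is that a tx similar to recent batch contents gets a small marginal size, which per-tx compression can never capture.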
Why RMSE? RMSE penalizes outliers more heavily, so it captures things like adversaries potentially trying to exploit weaknesses in the estimator. We also looked at mean absolute error (MAE) and found that the two measures were correlated: estimators that performed well on RMSE also performed well on MAE.
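Both error measures are standard; for reference:

```python
import math

def rmse(estimates, truth):
    """Root-mean-squared error: large misses dominate the score."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth))
                     / len(truth))

def mae(estimates, truth):
    """Mean absolute error: every miss is weighted linearly."""
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)
```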
## Results

You can see the RMSE of the FastLZ algorithm performs better than the EIP-2028 version. However, we still see a lot of spikes in the chart. These spikes are spaced 14 days apart and are likely due to Worldcoin activity on OP Mainnet, whose transactions are inherently incompressible.
## Linear regression
We decided to apply a linear regression to the data to attempt to find a more accurate estimator. This has two advantages:
- adds a constant term (the intercept) to the formula
- can accept multiple variable inputs (multivariate linear regression)
Using the "best estimate" above as the dependent variable, we chose to run three separate regressions over the data, using the following sets of independent variables:
1. `[zeroes, ones]` (checking if using a linear regression using the EIP-2028 variables would improve performance without FastLZ)
2. `[fastlz_length]`
3. `[fastlz_length, uncompressed_tx_size]`
We performed the regression over a subset of the data: transactions from October 2023, which includes some of the Worldcoin distribution shift. We then generated RMSE charts across the entire dataset, which allows us to validate that the model still holds during the larger changes in the tx data.
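Each regression is an ordinary least-squares fit; with numpy it can be sketched as follows (the data below is made up for illustration, standing in for the real per-tx dataset):

```python
import numpy as np

# Toy stand-ins: columns are [fastlz_length, uncompressed_tx_size], and y is
# the per-tx "best estimate" batch-compressed size (the dependent variable).
X = np.array([[120.0, 150.0], [300.0, 420.0], [80.0, 90.0], [500.0, 800.0]])
y = np.array([95.0, 260.0, 60.0, 430.0])

A = np.hstack([np.ones((len(X), 1)), X])      # prepend an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # ordinary least squares
intercept, fastlz_coef, size_coef = coef
```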

The `[fastlz_length, uncompressed_tx_size]` regression (in orange above) performed the best, almost wiping out those spikes in errors seen in the other charts.
The models calculated in these cases are as follows:
1. `v = 8.570750093473123 - 0.01778946*zeroes + 0.87830521*ones`
2. `v = -35.60342890895569 + 0.90558897*fastlz_length`
3. `v = -27.321890037208703 + 1.03146206*fastlz_length - 0.08866427*uncompressed_tx_size`
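Model 3 can then be evaluated per tx, with the coefficients copied from above:

```python
def estimated_batch_size(fastlz_length: int, uncompressed_tx_size: int) -> float:
    """Model 3: predicted batch-compressed size of a tx, in bytes."""
    return (-27.321890037208703
            + 1.03146206 * fastlz_length
            - 0.08866427 * uncompressed_tx_size)
```

Note the negative coefficient on the uncompressed size: for a fixed FastLZ length, a larger raw tx implies more redundancy, which the batcher's cross-tx compression exploits.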
## Other techniques
We wanted to validate some other techniques that didn't involve FastLZ, in case there was a simpler implementation that would yield good results. A few variations were tested:
- Variations of the EIP-2028 estimator above, where the zero byte was assigned different costs.
- An estimator that only counted non-repeated and non-zero bytes.
- An estimator that charged different fees for repeated bytes vs non-repeating bytes.
You can see a full list of the techniques we tried [here](https://github.com/roberto-bayardo/compression-analysis/blob/e353ce6362c81fdf5ef6eed2129ef3847d040b70/main.go#L73-L86). While some of these improved on the original algorithm, we didn't find any that performed better than the FastLZ algorithm.
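For instance, the non-repeated/non-zero byte counter can be approximated as below. This is a hypothetical reconstruction for illustration, not the exact code from the linked repo:

```python
def cheap_estimator(tx: bytes) -> int:
    """Count bytes that are non-zero and differ from the previous byte,
    a crude proxy for incompressible content."""
    count = 0
    prev = None
    for b in tx:
        if b != 0 and b != prev:
            count += 1
        prev = b
    return count
```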
## Conclusion
We recommend switching to the following formula for the new L1 cost function:
```
l1BaseFeeScaled = l1BaseFeeScalar * l1BaseFee * 16
l1BlobFeeScaled = l1BlobFeeScalar * l1BlobBaseFee
l1FeeScaled = l1BaseFeeScaled + l1BlobFeeScaled
((intercept + fastlzCoef*fastlzLength + uncompressedTxCoef*uncompressedTxSize) * l1FeeScaled) / 1e12
```
where (most variables are scaled by `1e6` for integer arithmetic):
- `l1BaseFeeScalar = 11_111` (`~= 7600/0.684`)
- `l1BlobFeeScalar = 1_250_000` (`~= 862000/0.684`)
- `l1BaseFee`: current L1 base fee
- `l1BlobBaseFee`: current L1 blob fee
- `intercept = -27_321_890`: intercept (a.k.a. constant) calculated by the linear regression
- `fastlzCoef = 1_031_462`: coefficient of the `fastlzLength` term
- `fastlzLength`: length of FastLZ-compressed RLP-encoded signed tx
- `uncompressedTxCoef = -88_664`: coefficient of the `uncompressedTxSize` term
- `uncompressedTxSize`: length of uncompressed RLP-encoded signed tx
The `l1BaseFeeScalar`/`l1BlobFeeScalar` will give the chain operator some ability to tweak the rollup costs should the tx traffic shape change. They are calculated from the L1 fee scalars doc [here](https://www.notion.so/oplabs/4844-L1-fee-scalars-5461ee49075d4658bc165e89c44faed2), divided by the assumed `0.684` compression ratio.
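Putting it together, the recommended formula in integer arithmetic looks roughly like this (the zero floor on the estimated size is our addition for illustration; the formula above does not specify a clamp):

```python
def recommended_l1_fee(fastlz_length: int, uncompressed_tx_size: int,
                       l1_base_fee: int, l1_blob_base_fee: int,
                       l1_base_fee_scalar: int = 11_111,
                       l1_blob_fee_scalar: int = 1_250_000) -> int:
    # Scalars and regression coefficients are scaled by 1e6 each,
    # hence the final division by 1e12.
    intercept = -27_321_890
    fastlz_coef = 1_031_462
    uncompressed_tx_coef = -88_664

    l1_fee_scaled = (l1_base_fee_scalar * l1_base_fee * 16
                     + l1_blob_fee_scalar * l1_blob_base_fee)
    estimated_size = (intercept
                      + fastlz_coef * fastlz_length
                      + uncompressed_tx_coef * uncompressed_tx_size)
    # Floor at zero so a tiny tx can't produce a negative fee (an assumption
    # not specified by the formula above).
    estimated_size = max(estimated_size, 0)
    return estimated_size * l1_fee_scaled // 10**12
```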
## Links
The linear regression calculation script can be found [here](https://gist.github.com/mdehoog/0b1448223dbc67f0c6b0a0eebeb733fb#file-regression-py) (please excuse the lack of correct Python style).