4337 Compression in WAX

Note: This doc is pre-Dencun. Blobs have changed the economics dramatically, but compression will still matter for future scalability. Here's another doc all about that.

Based on the implemented and upcoming methods in the WAX project by PSE, supported by the Ethereum Foundation.

We're here to support the ecosystem - everything is freely available and modular. You can use exactly these methods, and you can also remix the parts you like together with your own ideas. Source code is in our repository and we're available to help via discord.

(WAX is also about the wonderful features enabled by 4337, but this doc focuses on fee optimization.)

Summary

Simple user ops can be compressed to about 18 bytes.

Using these methods, bundlers can potentially charge the right-most column in the table below while turning a profit:

ETH Transfer	EOA	4337	4337 Compressed
Mainnet	$ 1.4196	$ 6.5018	$12.1660
Arbitrum One	$ 0.1359	$ 0.4868	$ 0.0868
Optimism	$ 0.0935	$ 0.3131	$ 0.0302

ERC20 Transfer	EOA	4337	4337 Compressed
Mainnet	$ 2.7040	$ 7.1406	$13.1597
Arbitrum One	$ 0.2054	$ 0.5316	$ 0.0918
Optimism	$ 0.1155	$ 0.3412	$ 0.0312

See Current Fee Environment and Bringing it Together
We're not confident to 4 decimal places, but the usual 2 decimal places introduce excessive rounding error, and 3 decimal places for currency are prone to being misread.
The benefits of compression are greatly affected by:
- The price of blob data (4844) (above fees are pre-4844)
- The number of user ops bundled together (above assumes 10 ops/bundle)

Google sheet.
Interactive calculator.

Transaction Data

When using 4337, processing a bundle still starts with a regular ECDSA-signed Ethereum transaction, and the bundle travels inside that transaction's calldata. Using 1559, the format is:

0x02 || rlp([chain_id, nonce, max_priority_fee_per_gas, max_fee_per_gas, gas_limit, destination, amount, data, access_list, signature_y_parity, signature_r, signature_s])

https://eips.ethereum.org/EIPS/eip-1559

Without changes to the protocol (which we'll consider out-of-scope here), we can only compress the data field above. This leaves us with about 110 other bytes which ultimately need to be supported by the contained user operations. More user operations = less shared cost per user operation.
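As a rough illustration of how that fixed overhead is shared, the sketch below divides the approximately 110 non-data bytes across a bundle. The function name and constant are illustrative only and are not part of WAX.

```python
# Approximate size of the non-data fields of the outer EIP-1559 transaction.
# The figure comes from the text above and is an estimate, not an exact count.
OUTER_TX_OVERHEAD_BYTES = 110


def overhead_bytes_per_op(ops_in_bundle: int) -> float:
    """Share of the outer transaction's fixed bytes carried by each user op."""
    if ops_in_bundle < 1:
        raise ValueError("a bundle needs at least one user op")
    return OUTER_TX_OVERHEAD_BYTES / ops_in_bundle


for n in (1, 5, 10, 30):
    print(f"{n:>2} ops/bundle -> {overhead_bytes_per_op(n):.1f} overhead bytes per op")
```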

`EntryPoint` vs Account Compression

Compression can take place in two independent phases:

- EntryPoint: Applied to the EntryPoint's calldata (ie bundle and beneficiary), compressed by the bundler and decompressed by a contract which wraps the EntryPoint
- Account: Applied to userOp.callData, compressed by the wallet (the user's off-chain software) and decompressed by the user's account

`EntryPoint` Compression

This compression introduces a wrapper contract (eg) which calls the EntryPoint. The wrapper takes the compressed calldata, decompresses it, and calls the EntryPoint.

Relatively simple universal encodings can be used today to benefit bundlers without the need to get wallets involved or delve into the complexities of the calldata that gets passed to each wallet.

For example, sending a single user op requires sending bytes like this to EntryPoint:

1fad948c
    - handleOps(UserOperation[],address)

0000000000000000000000000000000000000000000000000000000000000040
    - ops array is located at 0x40 (= 2x 32 bytes)

000000000000000000000000f39fd6e51aad88f6f4ce6ab8827279cfffb92266
    - beneficiary address

0000000000000000000000000000000000000000000000000000000000000001
    - array length (ie there's 1 user op)

Array data:
0000000000000000000000000000000000000000000000000000000000000020
    - the user op is located 0x20 bytes after the array data

Start of the user op:
000000000000000000000000b734eb54c90c363d017b27641cc534caf7004fc4
    - sender address (aka the user's account address)

0000000000000000000000000000000000000000000000000000000000000001
    - nonce = 1

0000000000000000000000000000000000000000000000000000000000000160
    - initCode is located 0x160 bytes after the start of the
      user op

0000000000000000000000000000000000000000000000000000000000000180
    - callData is located 0x180 bytes after the start of the
      user op

000000000000000000000000000000000000000000000000000000000001228f
    - callGasLimit is 74,383 (=0x01228f)

00000000000000000000000000000000000000000000000000000000000186a0
    - verificationGasLimit is 100,000 (=0x186a0)

000000000000000000000000000000000000000000000000000000000000d494
    - preVerificationGas is 54,420 (=0xd494)

000000000000000000000000000000000000000000000000000000003e08feb0
    - maxFeePerGas is 1,040,776,880 (=0x3e08feb0, about 1.04
      gwei)

000000000000000000000000000000000000000000000000000000003b9aca00
    - maxPriorityFeePerGas is 1 gwei (=0x3b9aca00)

0000000000000000000000000000000000000000000000000000000000000220
    - paymasterAndData is located 0x220 bytes after the start of
      the user op

0000000000000000000000000000000000000000000000000000000000000240
    - signature is located 0x240 bytes after the start of the
      user op

0000000000000000000000000000000000000000000000000000000000000000
    - length of initCode is zero

0000000000000000000000000000000000000000000000000000000000000064
    - length of callData is 0x64 (= 3x 32 + 4 bytes)

2d1634c5
0000000000000000000000000000000000000000000000000000000000000020
0000000000000000000000000000000000000000000000000000000000000008
0103000001810000000000000000000000000000000000000000000000000000
    - callData

00000000000000000000000000000000000000000000000000000000
    - padding (since callData was not a multiple of 32)

0000000000000000000000000000000000000000000000000000000000000000
    - length of paymasterAndData is zero

0000000000000000000000000000000000000000000000000000000000000041
    - length of signature is 65 (=0x41)

6d0d86052da1995cb95f7d51fe68375c182e82822263a38a4976251e6d4a6918
162b605c71bea200f55e314cdea53112ab61a440fa1a0d260e119ebe417780be
1c
    - signature

00000000000000000000000000000000000000000000000000000000000000
    - padding (since signature was not a multiple of 32)

These bytes are required because that is the layout determined by the Solidity ABI. Instead of sending them in the top-level transaction, we can have another contract generate them. It can decode the bytes below and pass along the equivalent Solidity ABI encoding to the EntryPoint. Because the extra data is generated inside the transaction, it doesn't need to be posted to L1, so we avoid paying for it.

01
    - 1 user op (VLQ)

04
    - bit stack encoding (1)00 (0x04 == 0b100)
        - Reading least significant bit first:
        - 0: initCode is empty
        - 0: paymasterAndData is empty
        - 1: end of stack

c90c36
    - Sender (using a lookup table containing:
        c90c36 => b734eb54c90c363d017b27641cc534caf7004fc4)

01
    - nonce = 1

(no bytes)
    - initCode is empty, but we don't need any bytes for it
        because the bit stack already indicated it's empty

64
    - length of callData is 100 (VLQ)
2d1634c5
0000000000000000000000000000000000000000000000000000000000000020
0000000000000000000000000000000000000000000000000000000000000008
0103000001810000000000000000000000000000000000000000000000000000
    - callData

185d
    - callGasLimit is 74,400 (PseudoFloat)

3100
    - verificationGasLimit is 100,000 (PseudoFloat)

1944
    - preVerificationGas is 54,500 (PseudoFloat)

410d
    - maxFeePerGas is 1.05 gwei (PseudoFloat)

3900
    - maxPriorityFeePerGas is 1 gwei (PseudoFloat)

(no bytes)
    - paymasterAndData is empty, but we don't need any bytes for
        it because the bit stack already indicated it's empty

41
    - length of signature is 65 bytes (VLQ)
6d0d86052da1995cb95f7d51fe68375c182e82822263a38a4976251e6d4a6918
162b605c71bea200f55e314cdea53112ab61a440fa1a0d260e119ebe417780be
1c
    - signature

This reduces the effective bytes for the bundle from 319 down to 113.

Of those 113:

- 2 are intrinsic to the bundle
- 100 are used for userOp.callData and userOp.signature, which will be reduced elsewhere (see Account Compression and BLS Signature Aggregation)
- 11 are used for the remaining userOp fields

Details about the encoding above:

- The beneficiary address does not appear at all because it can be stored in the wrapper contract rather than providing the same value for each bundle
- A bit stack uses VLQ for a tight packing of a uint256 value, which is then interpreted using this Solidity library
- 3 bytes (c90c36) isn't enough for all addresses, but using RegIndex we can give 3-byte IDs to the first 8 million addresses, 4 bytes for the next billion addresses, and so on (for context, 250m addresses have been used on L1)
- PseudoFloat uses 2 bytes to represent most quantities with up to 3 significant decimal digits of precision, and expands as needed to support all uint256 values (lossless compression)
- The 5 fields that use PseudoFloat (eg callGasLimit) are tolerant to small errors and need to be rounded up to 3 significant figures to make the most of the format. This requires co-operation from the user's wallet software because the rounding changes the userOpHash and therefore changes the required signature. Without rounding, 5-10 additional bytes will be used.
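Two of the details above are easy to sketch off-chain: reading flags from a bit stack, and the rounding a wallet would apply before signing. This is a minimal Python sketch based only on the description in this section. The function names are hypothetical and it is not the WAX implementation.

```python
def pop_bits(stack: int, count: int) -> tuple[list[bool], int]:
    """Read `count` flags from a bit stack, least significant bit first.

    Returns the flags and the remaining stack. The highest set bit is the
    end-of-stack marker, so a fully consumed stack equals 1.
    """
    flags = []
    for _ in range(count):
        if stack <= 1:
            raise ValueError("bit stack exhausted")
        flags.append(bool(stack & 1))
        stack >>= 1
    return flags, stack


def round_up_sig_figs(value: int, sig_figs: int = 3) -> int:
    """Round a non-negative integer up so that only its leading `sig_figs`
    decimal digits can be non-zero."""
    digits = len(str(value))
    if digits <= sig_figs:
        return value
    factor = 10 ** (digits - sig_figs)
    return -(-value // factor) * factor  # ceiling division, then scale back


# 0x04 == 0b100: initCode flag, paymasterAndData flag, then the end marker.
(has_init_code, has_paymaster_data), rest = pop_bits(0x04, 2)
print(has_init_code, has_paymaster_data, rest)

# Gas and fee fields from the uncompressed example, rounded up before signing.
for raw in (74_383, 100_000, 54_420, 1_040_776_880, 1_000_000_000):
    print(raw, "->", round_up_sig_figs(raw))
```

Rounding up rather than to nearest keeps every limit at least as large as the value the wallet originally estimated.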

EntryPoint compression can also go further by compressing the calldata being passed to wallets, potentially eliminating the need for account compression. This could be challenging if trying to support a variety of evolving calldata formats for different competing smart accounts, but could be convenient if the bundler and smart account are provided by the same organization.

Account Compression

This compression applies to the userOp.callData field, which determines the bytes the EntryPoint sends to the account. Decompression occurs inside the account to determine the action(s) to perform.

This allows accounts to benefit from compression without relying on a dedicated bundler, and to freely switch bundlers without losing their compression features.

Using compression at this level reduces the implementation complexity of bundlers, but account compression cannot help with the many other fields of each user op, so some amount of EntryPoint compression is always recommended.

For example, an account without compression probably uses a method like this:

function execute(
    address dest,
    uint256 value,
    bytes calldata func
) external {
  // ...
}

(This method is from eth-infinitism's SimpleAccount example.)

To send an ERC20 token this way, our userOp.callData field will be encoded like so:

b61d27f6
    - execute(address,uint256,bytes)

000000000000000000000000c845d6b81d6d1f3b45f2353fec8c960085a9a42e
    - dest / ERC20 address

0000000000000000000000000000000000000000000000000000000000000000
    - value (zero because we're not sending ETH)

0000000000000000000000000000000000000000000000000000000000000060
    - location of func (0x60 = 3x 32 bytes)

0000000000000000000000000000000000000000000000000000000000000044
    - length of func (0x44 = 2x 32 + 4 bytes)

a9059cbb
    - 4 bytes for transfer(address,uint256)

000000000000000000000000e30a735c9b90549f8171f17dd698ab6048dde5ab
    - recipient address

0000000000000000000000000000000000000000000000000de0b6b3a7640000
    - one token (10 ^ 18)

00000000000000000000000000000000000000000000000000000000
    - padding so that if there was a next field of
      execute(address,uint256,bytes), it could be inserted next
      and be on a 32-byte alignment boundary

This is the encoding we get when we use the Solidity ABI. It was designed to be efficient to compute on the 256-bit EVM, and it is, but that design makes it very inefficient in terms of bytes.

In total, this uses 228 bytes. Zeros are cheaper though (about 75% cheaper depending on the L2), so it's more useful to think of it as 98 effective bytes (ie the equivalent number of non-zero bytes with the same cost).
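The effective-byte figure can be checked with a short script. The sketch below rebuilds the calldata from the fields listed above and counts each zero byte as a quarter of a non-zero byte. That weight is an approximation of the "about 75% cheaper" rule and varies by chain, and the helper names are illustrative.

```python
def effective_bytes(data: bytes, zero_weight: float = 0.25) -> float:
    """Non-zero bytes count fully; zero bytes count as `zero_weight` each."""
    nonzero = sum(1 for b in data if b != 0)
    return nonzero + zero_weight * (len(data) - nonzero)


def word(hex_value: str) -> bytes:
    """Left-pad a hex value to one 32-byte ABI word."""
    return bytes.fromhex(hex_value.rjust(64, "0"))


# Inner call: transfer(address,uint256)
func = (
    bytes.fromhex("a9059cbb")
    + word("e30a735c9b90549f8171f17dd698ab6048dde5ab")  # recipient
    + word("0de0b6b3a7640000")                          # 10^18
)

# Outer call: execute(address,uint256,bytes)
call_data = (
    bytes.fromhex("b61d27f6")
    + word("c845d6b81d6d1f3b45f2353fec8c960085a9a42e")  # dest / ERC20 address
    + word("00")                                        # value
    + word("60")                                        # offset of func
    + word(format(len(func), "x"))                      # length of func (68 bytes)
    + func
    + bytes(-len(func) % 32)                            # pad func to a 32-byte boundary
)

print(len(call_data), "bytes,", effective_bytes(call_data), "effective bytes")
```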

An account with compression can achieve the same thing by receiving this calldata:

02
    - Compression scheme to use (02 = ERC20 transfer)

6a
    - Token (using a lookup table containing:
        6a => c845d6b81d6d1f3b45f2353fec8c960085a9a42e)

473dee
    - Recipient (using a lookup table containing:
        473dee => e30a735c9b90549f8171f17dd698ab6048dde5ab)

9900
    - Amount = 10^18, encoded as a PseudoFloat

This reduces the effective bytes for userOp.callData from 98 to 6.

Details about the encoding above:

- While a single byte (6a) is not enough for all ERC20 tokens, using VLQ we can use single-byte IDs for the most popular tokens, two-byte IDs for the 16,384 next most popular tokens, etc
- Similarly, 3 bytes (473dee) isn't enough for all addresses, but using RegIndex we can give 3-byte IDs to the first 8 million addresses, 4 bytes for the next billion addresses, and so on (for context, 250m addresses have been used on L1)
- PseudoFloat uses 2 bytes to represent most quantities with up to 3 significant decimal digits of precision, and expands as needed to support all uint256 values (lossless compression)
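To make the ID sizes above concrete, here is a minimal VLQ sketch in Python. It assumes the conventional layout (7 data bits per byte, most significant group first, high bit set on every byte except the last); the byte order used by WAX may differ. The RegIndex capacity formula is also an assumption: a VLQ prefix followed by two fixed bytes, chosen because it reproduces the 8 million and 1 billion figures quoted above.

```python
def vlq_encode(value: int) -> bytes:
    """Encode a non-negative integer as a variable-length quantity."""
    if value < 0:
        raise ValueError("VLQ encodes non-negative integers only")
    groups = [value & 0x7F]            # final byte: continuation bit clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(groups))


def vlq_decode(stream: bytes) -> tuple[int, bytes]:
    """Decode one VLQ value and return it with the unread remainder."""
    value = 0
    for i, byte in enumerate(stream):
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80 == 0:
            return value, stream[i + 1:]
    raise ValueError("truncated VLQ")


def vlq_capacity(num_bytes: int) -> int:
    """Number of distinct values that fit in at most `num_bytes` VLQ bytes."""
    return 2 ** (7 * num_bytes)


def reg_index_capacity(num_bytes: int) -> int:
    """Assumed RegIndex layout: VLQ prefix plus two fixed bytes."""
    return 2 ** (7 * (num_bytes - 2) + 16)


print(vlq_encode(0x6A).hex(), vlq_encode(300).hex())
print(vlq_capacity(1), vlq_capacity(2))
print(reg_index_capacity(3), reg_index_capacity(4))
```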

BLS Signature Aggregation

An ECDSA signature uses 65 bytes. After switching to compact encodings for other fields, this usually becomes the vast majority of the bytes needed for the user op.

BLS signatures are marginally smaller at 64 bytes, and a single aggregated signature can verify an unlimited number of user ops from different parties. Effectively, the cost of a single BLS signature can be shared between all user ops.

BLS signatures also come with higher L2 gas costs:

	ECDSA	BLS
Intrinsic to bundle	0	90,000
Added by user op	3,000	36,000¹

Note: This is all based on BLS on the BN254 curve. BLS can also be done on the BLS12-381 curve (used in the beacon chain), but this needs to wait for new precompiles to make it viable in the EVM.

Current Fee Environment

Param	Value	About
ETH	$2,600	Latest (at time of writing)
ETH gas	26 gwei	Median of last month
Arbitrum gas	0.1 gwei	Median of last month
Optimism gas	0.0054 gwei	Median of last month

The exact details of how L2 chains charge fees vary between chains. For example, Arbitrum increases its gas values to account for the L1 gas it needs to pay, but Optimism charges for L1 gas separately (in addition to gasPrice * gasUsed).

For our purposes, L2 fees can be predicted with good accuracy by finding the right parameters for the following unified model:

l1GasUsed = fixedL1Gas + l1GasPerEffectiveDataByte * dataBytes
l2GasUsed = (ordinary gas defined by protocol / same on L1 and local dev)

fee = l1GasUsed * l1GasPrice + l2GasUsed * l2GasPrice

- fixedL1Gas is a minimum amount of L1 gas charged for all transactions.
- l1GasPerEffectiveDataByte is an amount of L1 gas charged per effective byte in the data field (ie the bytes sent to the destination address). Here 'effective' is slightly under-defined, but incompressible data, which is a good approximation of our use case (because we've already compressed it), should be 100% 'effective'. In other words, every byte we're putting in the data field counts for one 'effective' byte.

While it's possible to find these parameters theoretically by diving into the details of each L2, it's easier and less error prone to measure them directly with real transactions. This can be done with the help of the WAX Fee Measurer.

At the time of writing, this methodology yields the following:

Parameter	Arbitrum One	Optimism
`fixedL1Gas`	1816	1302
`l1GasPerEffectiveDataByte`	16.2	11.0
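The unified model and the measured parameters can be written down directly. The following is a sketch only: the class and function names are made up for illustration, the parameter values are the ones in the table above, and the example transaction (100 effective data bytes, 50,000 L2 gas) is hypothetical.

```python
from dataclasses import dataclass

GWEI = 10**9  # wei per gwei


@dataclass(frozen=True)
class L2FeeParams:
    fixed_l1_gas: float
    l1_gas_per_effective_data_byte: float


# Measured values from the table above (at the time of writing).
ARBITRUM_ONE = L2FeeParams(1816, 16.2)
OPTIMISM = L2FeeParams(1302, 11.0)


def l1_gas_used(params: L2FeeParams, effective_data_bytes: float) -> float:
    return (
        params.fixed_l1_gas
        + params.l1_gas_per_effective_data_byte * effective_data_bytes
    )


def fee_wei(
    params: L2FeeParams,
    effective_data_bytes: float,
    l2_gas_used: float,
    l1_gas_price_wei: float,
    l2_gas_price_wei: float,
) -> float:
    """fee = l1GasUsed * l1GasPrice + l2GasUsed * l2GasPrice"""
    return (
        l1_gas_used(params, effective_data_bytes) * l1_gas_price_wei
        + l2_gas_used * l2_gas_price_wei
    )


# Hypothetical transaction on Optimism at the gas prices listed above.
fee = fee_wei(OPTIMISM, 100, 50_000, 26 * GWEI, 0.0054 * GWEI)
print(f"{fee / 10**18:.8f} ETH")
```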

Note:

- L2 chains internally define similar parameters for their actual calculations, but they differ from the parameters used here (eg counting total tx bytes instead of bytes of the data field).
- l2GasUsed is particularly prone to interpretation error, because Arbitrum reports "gasUsed" as whatever number satisfies fee = gasUsed * gasPrice. That isn't a bad idea, but it is not what we mean in this model. Other L2s might have similar complications in their reported gas numbers. It is essential to measure gas independently in a dev environment when applying this model.
- When L2 chains charge for L1 gas, they use their internal view of the L1 gas price. This is closely correlated with the actual L1 gas price, but it's not quite the same and it's controlled by the L2.

Bringing it Together

In EntryPoint Compression, we saw 113 bytes to encode a bundle containing one ETH transfer. By replacing the 33 effective bytes in the userOp.callData field with the 6 effective bytes from the Account Compression example, this brings our bundle size down to 86 (and performs an ERC20 transfer instead).

Based on experiments with WAX prototypes, the ordinary gas (L2 gas) estimate for this bundle is 174,000.

It's important now to distinguish between costs that are intrinsic to the bundle and costs that are added by each user op. To be cost effective, a bundle should contain several user ops, and the more the merrier.

	Effective bytes	L2 Gas
Intrinsic to bundle	2	34,000
Added by user op	84	140,000

We can now adjust the above for BLS signatures. 66 of those 84 user op bytes are attributable to the ECDSA signature, and we can replace all of them with a single 64-byte BLS signature shared by the whole bundle. For gas, we can save 3,000 per user op by not doing ECDSA, but we need to pay 36,000 per user op¹ and 90,000 fixed gas to verify the BLS signature.

Adjusting with these values, we get:

	Effective bytes	L2 Gas
Intrinsic to bundle	66	124,000
Added by user op	18	173,000
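The adjustment is simple bookkeeping, sketched below in Python. The class and constant names are illustrative; the numbers are the ones used in this section, with the BLS signature taken as 64 bytes as stated in BLS Signature Aggregation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BundleCost:
    bundle_bytes: int  # effective bytes paid once per bundle
    op_bytes: int      # effective bytes added by each user op
    bundle_gas: int    # L2 gas paid once per bundle
    op_gas: int        # L2 gas added by each user op

    def totals(self, ops: int) -> tuple[int, int]:
        """(effective bytes, L2 gas) for a bundle of `ops` user ops."""
        return (
            self.bundle_bytes + self.op_bytes * ops,
            self.bundle_gas + self.op_gas * ops,
        )


ECDSA = BundleCost(bundle_bytes=2, op_bytes=84, bundle_gas=34_000, op_gas=140_000)

ECDSA_SIG_EFFECTIVE_BYTES = 66  # per user op, including its length prefix
BLS_SIG_BYTES = 64              # one aggregate signature per bundle
ECDSA_VERIFY_GAS = 3_000        # per user op
BLS_GAS_PER_OP = 36_000
BLS_FIXED_GAS = 90_000

BLS = BundleCost(
    bundle_bytes=ECDSA.bundle_bytes + BLS_SIG_BYTES,
    op_bytes=ECDSA.op_bytes - ECDSA_SIG_EFFECTIVE_BYTES,
    bundle_gas=ECDSA.bundle_gas + BLS_FIXED_GAS,
    op_gas=ECDSA.op_gas - ECDSA_VERIFY_GAS + BLS_GAS_PER_OP,
)

print(BLS)
print("10 ops, ECDSA:", ECDSA.totals(10))
print("10 ops, BLS:  ", BLS.totals(10))
```

With 10 user ops, BLS trades roughly 600 effective bytes for about 420,000 extra L2 gas, which is why the approach pays off only while L2 gas is cheap relative to posting data on L1.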

We can now combine these numbers with the fixedL1Gas and l1GasPerEffectiveDataByte to add the L1 gas values for bundles and user ops for both Arbitrum and Optimism. fixedL1Gas is intrinsic to the bundle because we only pay it once per bundle.

Arbitrum One	L1 Gas²	L2 Gas²
Intrinsic to bundle	2,885	124,000
Added by user op	292	173,000

Optimism	L1 Gas	L2 Gas
Intrinsic to bundle	2,028	124,000
Added by user op	198	173,000

At this point, let's pick 10 for the number of user ops in our hypothetical bundle. This is a balance between demonstrating low-cost potential and a size that isn't too far out of reach. This will result in L1 costs that are about 2x the theoretical minimum (infinite bundle size). If you could get to 30 ops/bundle, that would go down to about 1.3x.
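The 2x and 1.3x figures follow from dividing the per-bundle L1 gas across the user ops and comparing against the per-op floor. This is a small sketch using the byte counts and measured parameters from this section; the function names are illustrative.

```python
BUNDLE_BYTES = 66  # effective bytes intrinsic to the bundle (with BLS)
OP_BYTES = 18      # effective bytes added by each user op


def l1_gas_per_op(fixed_l1_gas: float, gas_per_byte: float, ops: int) -> float:
    """L1 gas attributable to one user op in a bundle of `ops` user ops."""
    intrinsic = fixed_l1_gas + gas_per_byte * BUNDLE_BYTES
    return intrinsic / ops + gas_per_byte * OP_BYTES


def overhead_ratio(fixed_l1_gas: float, gas_per_byte: float, ops: int) -> float:
    """Per-op L1 gas relative to the infinite-bundle minimum."""
    floor = gas_per_byte * OP_BYTES
    return l1_gas_per_op(fixed_l1_gas, gas_per_byte, ops) / floor


PARAMS = {"Arbitrum One": (1816, 16.2), "Optimism": (1302, 11.0)}

for chain, (fixed, per_byte) in PARAMS.items():
    for ops in (1, 10, 30, 100):
        ratio = overhead_ratio(fixed, per_byte, ops)
        print(f"{chain:<12} {ops:>3} ops/bundle: {ratio:.2f}x the minimum")
```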

L1 Gas vs Bundle Size

Bundle size is a tradeoff between fees and latency. The more you want a lower fee, the more you're willing to wait to share it with more users. There is a larger game-theoretic story to tell here, since users willing to pay higher fees are not negatively affected by users who want to pay the smallest possible share of the bundle overhead. This could lead to users either paying the whole bundle overhead or none of it, with users willing to pay the full overhead having no incentive then to use a third party bundler. It'll be interesting to see how this plays out.

Anyway, to get the total fees per user op, the bundler should also include a profit margin as its incentive to operate. 5% has been included below.

Fees for ERC20 transfer	L1 Gas²	L2 Gas²
Arbitrum One	609	194,670
Optimism	421	194,670

We can combine these with actual gas prices and the value of ETH from Current Fee Environment to get total fees in USD:

Fees for ERC20 transfer	L1-based	L2-based	Total
Arbitrum One	$0.0412	$0.0506	$0.0918
Optimism	$0.0284	$0.0027	$0.0312
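For readers who want to plug in their own prices, the whole calculation fits in a few lines. The sketch below uses the values from Current Fee Environment and the BLS-adjusted tables above, with illustrative names throughout, and should reproduce the table to within rounding.

```python
ETH_USD = 2600
L1_GAS_PRICE_GWEI = 26
OPS_PER_BUNDLE = 10
MARGIN = 1.05  # 5% bundler profit margin

BUNDLE_BYTES, OP_BYTES = 66, 18          # effective bytes
BUNDLE_L2_GAS, OP_L2_GAS = 124_000, 173_000

# chain -> (fixedL1Gas, l1GasPerEffectiveDataByte, L2 gas price in gwei)
CHAINS = {
    "Arbitrum One": (1816, 16.2, 0.1),
    "Optimism": (1302, 11.0, 0.0054),
}


def gas_to_usd(gas: float, gas_price_gwei: float) -> float:
    return gas * gas_price_gwei * 1e-9 * ETH_USD


def fee_per_op_usd(fixed_l1_gas, gas_per_byte, l2_gas_price_gwei):
    """Return (L1-based, L2-based, total) USD fee for one user op."""
    l1_gas = (fixed_l1_gas + gas_per_byte * BUNDLE_BYTES) / OPS_PER_BUNDLE
    l1_gas += gas_per_byte * OP_BYTES
    l2_gas = BUNDLE_L2_GAS / OPS_PER_BUNDLE + OP_L2_GAS
    l1_usd = gas_to_usd(l1_gas * MARGIN, L1_GAS_PRICE_GWEI)
    l2_usd = gas_to_usd(l2_gas * MARGIN, l2_gas_price_gwei)
    return l1_usd, l2_usd, l1_usd + l2_usd


for chain, params in CHAINS.items():
    l1_usd, l2_usd, total = fee_per_op_usd(*params)
    print(f"{chain:<12} L1 ${l1_usd:.4f}  L2 ${l2_usd:.4f}  total ${total:.4f}")
```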

Finally, blobs are coming. 4844 should significantly reduce the L1-based fees above. Exactly how much is very difficult to predict, but 20x is a low discount compared to numbers I've heard and greater discounts wouldn't affect the totals above too much. So, assuming 20x the new fees are:

Fees for ERC20 transfer (4844)	L1-based	L2-based	Total
Arbitrum One	$0.0021	$0.0506	$0.0527
Optimism	$0.0014	$0.0027	$0.0042

In this scenario where fees become dominated by L2 itself and not the costs of posting to L1, it will be important to re-evaluate our compression choices. Today, compression is easily profitable because all the compression work happens on L2 and the L2 costs are super low. When that flips, compression will probably need to change.
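One way to see the shift is to look at the share of each fee that comes from L1 before and after the assumed 20x discount. The sketch below takes the rounded USD values straight from the two tables above; the names are illustrative.

```python
# (L1-based, L2-based) USD fees per user op, copied from the tables above.
FEES = {
    "Arbitrum One": {"pre-4844": (0.0412, 0.0506), "with 4844": (0.0021, 0.0506)},
    "Optimism": {"pre-4844": (0.0284, 0.0027), "with 4844": (0.0014, 0.0027)},
}


def l1_share(l1_usd: float, l2_usd: float) -> float:
    """Fraction of the total fee that pays for L1 data."""
    return l1_usd / (l1_usd + l2_usd)


for chain, scenarios in FEES.items():
    for label, (l1_usd, l2_usd) in scenarios.items():
        print(f"{chain:<12} {label:<9} L1 share: {l1_share(l1_usd, l2_usd):.0%}")
```

Once the L1 share is small, each byte saved is worth less than the L2 gas spent decompressing it, which is the trade-off the paragraph above describes.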

Footnote 1

¹ It might seem strange that BLS has a per-user-op gas cost. This is because BLS verification involves processing not just the signature data but also the associated message data, and each user op adds message data.

Footnote 2

² Care needs to be taken when interpreting gas on Arbitrum. See Demystifying Arbitrum’s Fees.
