Ambire's validateUserOp, or how to enable the EntryPoint in your deploy transaction without changing the address

ERC-4337 presents an array of challenges, and which of them you face depends on your initial design decisions. For example:

Should we deploy the SA (smart account, the official term for a smart contract wallet) the moment the user's address is calculated, or later?
Should we sponsor the deployment or ask the user to pay for it?
How do we give permissions to the EntryPoint to execute transactions?

And the list goes on. As a developer, I feel nothing but excitement for the freedom a wallet has in making its own design decisions and tackling them along the way.

To understand the challenge and how we tackled it, we first need to go over the decisions we've made for our wallet. The main points are:

1) The SA is deployed along with the first authorized user transaction
2) The user pays the deployment fee
3) The address of the SA is calculated by CREATE2. In the bytecode we pass an array of all the keys (as addresses) that have privileges to execute transactions for this account
4) No upgradability, nor the popular (#infamous) Diamond Pattern. Once deployed, the code of the SA doesn't change.
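
To make the CREATE2 point concrete, here is a minimal sketch of the standard CREATE2 address formula. The library name Create2Preview and both function names are made up for illustration. Appending the ABI-encoded signers to the creation code is only a simplified model of how the initial keys end up in the bytecode, not necessarily how the Ambire factory lays it out.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical helper for illustration; not part of the Ambire contracts.
library Create2Preview {
    // Standard CREATE2 formula: the last 20 bytes of
    // keccak256(0xff ++ deployer ++ salt ++ keccak256(initCode)).
    function predict(address deployer, bytes32 salt, bytes memory initCode)
        internal
        pure
        returns (address)
    {
        bytes32 digest = keccak256(
            abi.encodePacked(bytes1(0xff), deployer, salt, keccak256(initCode))
        );
        return address(uint160(uint256(digest)));
    }

    // Assumption: the initial signers are modelled as ABI-encoded data
    // appended to the creation code, so they are part of initCode.
    function predictWithSigners(
        address deployer,
        bytes32 salt,
        bytes memory creationCode,
        address[] memory signers
    ) internal pure returns (address) {
        return predict(deployer, salt, abi.encodePacked(creationCode, abi.encode(signers)));
    }
}
```

Any change to the signer list changes keccak256(initCode) and therefore the predicted address, which is why the choice of initial signers matters so much.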

Probably the most important point here is 3). Anyone who's implemented ERC-4337 knows that the EntryPoint should have privileges to execute transactions on behalf of the SA. So the question is: can we just put the entry point address as an initial signer in the bytecode and be done with it?

We certainly can and that's a fair decision. But let's look at the disadvantages of this route:

1) A bug in the entry point

Let's discuss a low-probability but nevertheless possible situation: someday in the future a bug is found in the entry point, and users have to migrate to a different entry point address to avoid the exploit.

Such things happen in crypto, and the answer should be simple: remove the old entry point as a signer and forget about the issue. The problem is that the entry point is one of the initial signers. If you decide to deploy your account on another network and don't pass the entry point as an initial signer, you will get a different address, causing confusion. This doesn't change even if you're using an upgradability pattern: once calculated by CREATE2, the address stays the same only if the exact same code is deployed. Having the same address on each network is a very important UX decision that I believe all wallets agree upon.

2) A new version of the entry point is deployed

The timing of this post is great, as EntryPoint v0.7 was announced a couple of weeks ago and our implementation is for EntryPoint v0.6. Now you have a different problem from the one above: instead of an exploited entry point, you will deploy the account on a different network with the old one. At first glance, this may not seem like a big problem, but:

you keep the old entry point as a signer, keeping the risk of it being exploited one day
you need to prepare another transaction for your users to remove the old EntryPoint on each new network they decide to deploy on.

While the above issues are unpleasant, there are certainly solutions. For example, on the first user transaction, a wallet's UI could check for old entry point signers and remove them.

At Ambire, we decided to go another route by not including the entry point in the initial privileges.

The user operation flow

Not including the entry point in the initial privileges is not as simple as it sounds, though. It means that the entry point is not authorized to execute the first wallet transaction. From that point on, you have two options:
a) force the first wallet transaction to be a non-ERC-4337 one
b) think of another solution

Let's take a look at the assumption of why the first transaction cannot be a 4337 one if the entry point is not authorized.

User operation handling in the entry point consists of two main parts: validation and execution. Validation involves two functions we have control over: validateUserOp on the SA and validatePaymasterUserOp on the paymaster. Each wallet needs to implement validateUserOp and decide in it whether the incoming user operation is valid.

So the most basic rule one should include in it is something along these lines:

// addresses that can sign
mapping(address => bytes32) public privileges;

// this indicates that the signer is the entry point
bytes32 private constant ENTRY_POINT_PERM = 0x00...7171;

function validateUserOp(UserOperation calldata op, bytes32 userOpHash, uint256 missingAccountFunds)
    external payable returns (uint256)
{
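    // only an address marked as the entry point may call this
    require(privileges[msg.sender] == ENTRY_POINT_PERM, 'validateUserOp: not from entryPoint');

    // validate the user op signature
    address signer = SignatureValidator.recoverAddr(userOpHash, op.signature, true);
    if (privileges[signer] == bytes32(0)) return SIG_VALIDATION_FAILED;

    // the signer has privileges, so the signature is accepted
    return SIG_VALIDATION_SUCCESS;
}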

Two things to pick up from here:

Requests are accepted only from the entry point
The user op hash commitment is validated

So our problem is that the entry point does not have permissions. How do we allow the request to pass without losing the user operation hash commitment?

The answer lies in our paymaster.

Paymaster

We haven't mentioned the paymaster until now, so a few words on it. In ERC-4337, there are two ways to pay the on-chain fees of a user operation: the SA pays in the native currency, or a Paymaster pays. If the request includes a Paymaster, the Paymaster needs to verify that the request is valid. In other words, this is something we control and can bend in our favour.

Here's the code:

function validatePaymasterUserOp(UserOperation calldata userOp, bytes32, uint256)
        external
        view
        returns (bytes memory context, uint256 validationData)
{
    (uint48 validUntil, uint48 validAfter, bytes memory signature) = abi.decode(userOp.paymasterAndData[20:], (uint48, uint48, bytes));

    bytes memory callData = userOp.callData;
    bytes32 hash = keccak256(abi.encode(
        block.chainid,
        address(this),
        msg.sender,
        validUntil,
        validAfter,
        userOp.sender,
        callData.length >= 4 && bytes4(userOp.callData[0:4]) == IAmbireAccount.executeMultiple.selector ? 0 : userOp.nonce,
        userOp.initCode,
        callData,
        userOp.callGasLimit,
        userOp.verificationGasLimit,
        userOp.preVerificationGas,
        userOp.maxFeePerGas,
        userOp.maxPriorityFeePerGas
    ));
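    bool isValidSig = SignatureValidator.recoverAddr(hash, signature, true) == PAYMASTER_SIGNER;

    // no context needed; pack the result the way the EntryPoint expects it:
    // low 160 bits: 0 on success, 1 on signature failure; then validUntil and validAfter (48 bits each)
    return ("", uint256(isValidSig ? 0 : 1) | (uint256(validUntil) << 160) | (uint256(validAfter) << 208));
}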
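
For context, here is a rough sketch of how the paymasterAndData field decoded above could be put together and taken apart again. In practice the field would be built off-chain by whoever runs the paymaster signer; it is written in Solidity here only to keep to one language. The contract and function names are hypothetical. Only the layout (20 bytes of paymaster address followed by the ABI-encoded validUntil, validAfter and signature) is taken from the decode call in validatePaymasterUserOp.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical helper for illustration; not part of the Ambire contracts.
contract PaymasterDataSketch {
    // Builds paymasterAndData: the paymaster address first, the ABI-encoded payload after it.
    function build(address paymaster, uint48 validUntil, uint48 validAfter, bytes memory signature)
        external
        pure
        returns (bytes memory)
    {
        return abi.encodePacked(paymaster, abi.encode(validUntil, validAfter, signature));
    }

    // Reverses build(). Slicing with [:20] and [20:] only works on calldata bytes.
    function parse(bytes calldata paymasterAndData)
        external
        pure
        returns (address paymaster, uint48 validUntil, uint48 validAfter, bytes memory signature)
    {
        paymaster = address(bytes20(paymasterAndData[:20]));
        (validUntil, validAfter, signature) = abi.decode(
            paymasterAndData[20:],
            (uint48, uint48, bytes)
        );
    }
}
```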

This is what we do:

1) We force the first ERC-4337 transaction to be paid through a paymaster.
2) The paymaster signs the user operation off-chain, with some additional properties:

a) it includes msg.sender, which is the entry point address, reassuring us that the request comes from the entry point
b) it forces a one-time nonce whose key is a hash of the user operation properties other than the signature (this happens in validateUserOp, we'll show the code in a bit) and whose value is 0. The EntryPoint helps in protecting against malleability, as it would demand a nonce with a value of 1 on a second request. But just to be safe, that enforcement is in the paymaster as well.
c) it forces a callData with a sigHash of executeMultiple. executeMultiple is an Ambire method that expects transaction calls and a signature commitment for them. It's how Ambire handled requests before converting to 4337:

function executeMultiple(ExecuteArgs[] calldata toExec) external payable {
    for (uint256 i = 0; i != toExec.length; i++) execute(toExec[i].calls, toExec[i].signature);
}

function execute(Transaction[] calldata calls, bytes calldata signature) public payable {
   // validate the signature for the calls
   // execute the calls if signature valid
   // revert if not
}

Here's how the validation in validateUserOp changes:

function validateUserOp(UserOperation calldata op, bytes32 userOpHash, uint256 missingAccountFunds)
    external payable returns (uint256)
{
    // the first transaction that gets authorization from the paymaster
    if (op.callData.length >= 4 && bytes4(op.callData[0:4]) == this.executeMultiple.selector) {
        require(op.signature.length == 0, 'validateUserOp: empty signature required in execute() mode');
        require(
            op.paymasterAndData.length >= 20 && bytes20(op.paymasterAndData[:20]) != bytes20(0),
            'validateUserOp: paymaster required in execute() mode'
        );
        uint256 targetNonce = uint256(keccak256(
            abi.encode(op.initCode, op.callData, op.callGasLimit, op.verificationGasLimit, op.preVerificationGas, op.maxFeePerGas, op.maxPriorityFeePerGas, op.paymasterAndData)
        )) << 64;
        require(op.nonce == targetNonce, 'validateUserOp: execute(): one-time nonce is wrong');
        return SIG_VALIDATION_SUCCESS;
    }

    // normal validation from here
    require(privileges[msg.sender] == ENTRY_POINT_PERM, 'validateUserOp: not from entryPoint');
    address signer = SignatureValidator.recoverAddr(userOpHash, op.signature, true);
    if (privileges[signer] == bytes32(0)) return SIG_VALIDATION_FAILED;

    ...
}
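
The "<< 64" in the one-time nonce check is easier to follow once you recall that the EntryPoint treats the 256-bit nonce as a 192-bit key followed by a 64-bit sequence number. Here is a minimal sketch of that split. The library and function names are made up for illustration and are not part of the Ambire contracts.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical helper for illustration only.
library OneTimeNonceSketch {
    // Splits an ERC-4337 nonce into its 192-bit key and 64-bit sequence number.
    function split(uint256 nonce) internal pure returns (uint192 key, uint64 sequence) {
        key = uint192(nonce >> 64);
        sequence = uint64(nonce); // explicit conversion keeps the low 64 bits
    }

    // Same shape as targetNonce in validateUserOp: shifting left by 64 drops the
    // top 64 bits of the hash, uses the remaining 192 bits as the key and
    // leaves the sequence number at 0.
    function oneTimeNonce(bytes32 opFieldsHash) internal pure returns (uint256) {
        return uint256(opFieldsHash) << 64;
    }
}
```

Since every distinct set of user operation fields produces a different key, the sequence number for that key starts at 0 and moves to 1 after use, so the exact same operation cannot be submitted twice.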

And there you have it. The first user operation goes through this special paymaster authorization, and its callData carries a user-authorized call that gives permissions to the entry point. Once that is done, the user can use ERC-4337 natively. They can still use the paymaster afterwards to pay fees in tokens, but they no longer need to go through the executeMultiple selector, as the entry point now has permissions to execute requests.
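
What "giving permissions to the entry point" could look like on the account side is sketched below. This is an assumption-heavy illustration: the post does not show the setter, so setAddrPrivilege, the contract name and the placeholder marker value are all hypothetical here. The idea is that one of the calls signed by the user and run through execute() targets the account itself and writes the entry point marker into privileges.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical, stripped-down account for illustration only.
contract PrivilegeGrantSketch {
    // addresses that can sign, as in the snippets above
    mapping(address => bytes32) public privileges;

    // Placeholder value; the real marker constant is elided in the post.
    bytes32 private constant ENTRY_POINT_MARKER_SKETCH = bytes32(uint256(1));

    // Assumed setter: only the account itself may call it, so the call has to be
    // one of the user-signed calls that the account executes against itself.
    function setAddrPrivilege(address addr, bytes32 priv) external {
        require(msg.sender == address(this), 'setAddrPrivilege: only the account itself');
        privileges[addr] = priv;
    }

    // After the grant, this is the condition the normal validateUserOp path relies on.
    function isAuthorizedEntryPoint(address caller) external view returns (bool) {
        return privileges[caller] == ENTRY_POINT_MARKER_SKETCH;
    }
}
```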

Trade-offs

No design decision comes without flaws, so let's talk a bit about trade-offs.

Pros

the address of the SA does not depend on the EntryPoint address
when a new EntryPoint version comes out, we can change the callData call in the UI and give authorization to the new entry point for deployments on new networks
when a new EntryPoint version comes out, we can prepare a "de-activator" call in the UI that removes the old entry point as a privilege and enables the new one on the networks the user is active on.
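
The "de-activator" idea from the list above can be sketched as a pair of calls that the UI would prepare and the user would sign. Everything named here is an assumption made for illustration: the IPrivilegeSetter interface, the setAddrPrivilege function and the field layout of the Transaction struct are not shown in this post.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Assumed setter interface; hypothetical naming.
interface IPrivilegeSetter {
    function setAddrPrivilege(address addr, bytes32 priv) external;
}

// Hypothetical helper for illustration only.
library EntryPointRotationSketch {
    // Assumed layout of a single call executed by the account.
    struct Transaction {
        address to;
        uint256 value;
        bytes data;
    }

    // Builds two self-calls: revoke the old entry point, then enable the new one.
    function rotate(
        address account,
        address oldEntryPoint,
        address newEntryPoint,
        bytes32 entryPointMarker
    ) internal pure returns (Transaction[] memory calls) {
        calls = new Transaction[](2);
        calls[0] = Transaction(
            account,
            0,
            abi.encodeCall(IPrivilegeSetter.setAddrPrivilege, (oldEntryPoint, bytes32(0)))
        );
        calls[1] = Transaction(
            account,
            0,
            abi.encodeCall(IPrivilegeSetter.setAddrPrivilege, (newEntryPoint, entryPointMarker))
        );
    }
}
```

Because neither call touches the bytecode the account was deployed with, the address of the SA stays the same before and after the rotation.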

Cons

Normally, ERC-4337 requests have a clear separation of validation and execution. We mix parts of the validation with the execution for the first user transaction. This matters because a transaction reverting in the validation phase does not incur fee losses for the user/paymaster. If it reverts in the execution phase, though, fees are still charged. So the risk is shifted to our Paymaster, which needs to make sure off-chain that the execution will pass before authorizing the request.
Overall complexity. One should have extensive knowledge of the EntryPoint and how it works in order to be sure no issues arise. Audits are even more necessary as we diverge a bit from the beaten path to use this flow.

Additional benefits

This hack offers other benefits too. We'll only mention them briefly, as this post is getting a bit long:

Standard Key Recovery

During recovery, the user has lost his key and he cannot do standard signature verification.

For v2 of the Ambire contracts, we have a DKIM Recovery mechanism. Validation and execution are coupled in it for security reasons, so it uses storage outside of the SA. Per the specification, that is allowed in validateUserOp only with limitations, and those limitations don't work for us:

An address A is associated with:

    Slots of contract A address itself.
    Slot A on any other address.
    Slots of type keccak256(A || X) + n on any other address. (to cover mapping(address => value), which is usually used for balance in ERC-20 tokens). n is an offset value up to 128, to allow accessing fields in the format mapping(address => struct)

So we use our paymaster "edge case" to enable recovery without contradicting the ERC-4337 rules.
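
To get a feel for how narrow the quoted rule is, here is a minimal sketch of the "associated storage" check for a slot on another contract. The library and function names are hypothetical. The slot formula is the usual Solidity layout for mapping(address => T), and baseSlot stands for X in the rule above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical helper for illustration only.
library AssociatedStorageSketch {
    uint256 internal constant MAX_OFFSET = 128;

    // Slot of the entry for `account` in a mapping(address => T) declared at `baseSlot`.
    function mappingEntrySlot(address account, uint256 baseSlot) internal pure returns (uint256) {
        return uint256(keccak256(abi.encode(account, baseSlot)));
    }

    // True if `accessedSlot` on some other contract counts as associated with `account`:
    // either the slot equals the address itself, or it is keccak256(A || X) + n with n <= 128.
    function isAssociated(address account, uint256 baseSlot, uint256 accessedSlot)
        internal
        pure
        returns (bool)
    {
        if (accessedSlot == uint256(uint160(account))) return true;
        uint256 start = mappingEntrySlot(account, baseSlot);
        unchecked {
            // wraps to a huge number when accessedSlot is below start, so the check fails
            return accessedSlot - start <= MAX_OFFSET;
        }
    }
}
```

Storage that an external recovery contract keys by anything other than the account address falls outside this window, which is the kind of access validateUserOp is not allowed to make.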

Timelock Key Recovery

Who pays the fee when the user cannot authenticate the request (he's lost his key) and cannot set a new key immediately (as it's timelocked)? The paymaster does, through the edge case.

Other requests with signature validation in the execution

You keep your design flexible, as you can delegate the signature validation to an external contract without worrying about the storage slot rules in validateUserOp.