ERC-4337 presents an array of challenges, some of which apply or not depending on your initial design decisions. For example:
And the list goes on. As a developer, I feel nothing but excitement for the freedom a wallet has in making its own design decisions and tackling them along the way.
So, to understand what the challenge is and how we tackled it, the starting point is understanding the decisions we've made for our wallet. And we should start with a couple of main points:
Probably the most important point here is 3). Anyone who's implemented ERC-4337 knows that the EntryPoint needs privileges to execute transactions on behalf of the smart account (SA). So the question is: can we just put the entry point address as an initial signer in the bytecode and be done with it?
We certainly can and that's a fair decision. But let's look at the disadvantages of this route:
Let's discuss a low-probability but nevertheless possible situation - someday in the future a bug is found in the entry point and users have to migrate to a different entry point address to avoid the exploit.
Such things happen in crypto and the answer should be simple - remove the old entry point as a signer and forget about the issue. The problem is that the entry point is one of the initial signers. If you decide to deploy your account on another network and don't pass the entry point as an initial signer, you will get a different address, causing confusion. This doesn't change even if you're using an Upgradability Pattern - once calculated by create2(), the address stays the same only if the exact same code is deployed. Having the same address on each network is a very important UX decision that I believe all wallets agree upon.
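To make the address dependency concrete, here is a minimal Python sketch of the CREATE2 derivation shape from EIP-1014. It is an illustration under stated assumptions, not Ambire's deployment code: the factory address, salt, keys and the toy init code layout are all hypothetical, and hashlib's SHA3-256 is used as a stand-in for Ethereum's Keccak-256 (the two differ in padding), so the printed values are not real on-chain addresses.

```python
import hashlib
from typing import Sequence


def h(data: bytes) -> bytes:
    # Stand-in hash: NIST SHA3-256, NOT Ethereum's Keccak-256 (padding differs),
    # so results are illustrative only and are not real on-chain addresses.
    return hashlib.sha3_256(data).digest()


def create2_address(deployer: bytes, salt: bytes, init_code: bytes) -> str:
    # EIP-1014 shape: last 20 bytes of hash(0xff ++ deployer ++ salt ++ hash(init_code))
    assert len(deployer) == 20 and len(salt) == 32
    digest = h(b"\xff" + deployer + salt + h(init_code))
    return "0x" + digest[12:].hex()


def toy_init_code(initial_signers: Sequence[bytes]) -> bytes:
    # Hypothetical layout: fixed proxy bytecode followed by the initial signers.
    return b"proxy-bytecode" + b"".join(initial_signers)


# Hypothetical values for illustration only.
FACTORY = bytes.fromhex("11" * 20)
SALT = bytes(32)
USER_KEY = bytes.fromhex("aa" * 20)
ENTRY_POINT = bytes.fromhex("ee" * 20)

with_entry_point = create2_address(FACTORY, SALT, toy_init_code([USER_KEY, ENTRY_POINT]))
without_entry_point = create2_address(FACTORY, SALT, toy_init_code([USER_KEY]))

print(with_entry_point)
print(without_entry_point)
print(with_entry_point == without_entry_point)
```

Because the init code is hashed into the address, dropping the entry point from the initial signers changes the result even though the factory and salt stay the same.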
The timing of this post is great, as EntryPoint v0.7 was announced a couple of weeks ago and our implementation is for EntryPoint v0.6. Now you have a different problem than the one from above - instead of an exploited entry point, you will deploy the account on a different network with the old one. At first glance, this may not seem like a big problem but:
While the above issues are unpleasant, there are certainly solutions. Maybe a wallet can arrange in its UI, on the first user transaction, to check for old entry point signers and remove them.
At Ambire, we decided to go another route by not including the entry point in the initial privileges.
Not including the entry point in the initial privileges is not as simple as it sounds, though. It means that the entry point is not authorized to execute the first wallet transaction. From that point on, you have two options:
a) force the first wallet transaction to be a non-ERC-4337 one
b) think of another solution
Let's take a look at the assumption of why the first transaction cannot be a 4337 one if the entry point is not authorized.
User operation handling consists of two main parts in the entry point: validation and execution. Validation consists of two functions we as an SA have control over: validateUserOp and validatePaymasterUserOp. Each wallet needs to implement validateUserOp on its end and decide whether the incoming user operation is valid or not.
So the most basic rule one should include in it is something along these lines:
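As a rough model of such a rule, here is a Python sketch rather than contract code. It assumes a toy signature scheme (a signer name paired with the hash it signed) in place of ECDSA recovery, and the address strings are hypothetical. The return values follow the ERC-4337 convention of 0 for success and 1 for a failed signature check.

```python
from typing import Optional, Set, Tuple

SIG_VALIDATION_SUCCESS = 0
SIG_VALIDATION_FAILED = 1

# Hypothetical addresses, modeled as plain strings.
ENTRY_POINT = "entry-point"
USER_KEY = "user-key"

# Toy signature: (signer, hash that was signed). A stand-in for ECDSA.
Signature = Tuple[str, bytes]


def toy_sign(signer: str, digest: bytes) -> Signature:
    return (signer, digest)


def toy_recover(digest: bytes, signature: Signature) -> Optional[str]:
    signer, signed_digest = signature
    return signer if signed_digest == digest else None


class Account:
    def __init__(self, initial_signers: Set[str]) -> None:
        self.privileges = set(initial_signers)

    def validate_user_op(self, caller: str, user_op_hash: bytes, signature: Signature) -> int:
        # 1) The caller (the entry point) must itself hold privileges.
        if caller not in self.privileges:
            raise PermissionError("caller is not an authorized entry point")
        # 2) The signature must commit to userOpHash and come from a privileged key.
        signer = toy_recover(user_op_hash, signature)
        if signer is None or signer not in self.privileges:
            return SIG_VALIDATION_FAILED
        return SIG_VALIDATION_SUCCESS


op_hash = b"user-op-hash"
signature = toy_sign(USER_KEY, op_hash)

# Entry point included in the initial privileges: validation passes.
with_ep = Account({USER_KEY, ENTRY_POINT})
print(with_ep.validate_user_op(ENTRY_POINT, op_hash, signature))

# Entry point left out: the very first user operation is rejected.
without_ep = Account({USER_KEY})
try:
    without_ep.validate_user_op(ENTRY_POINT, op_hash, signature)
    rejected = False
except PermissionError:
    rejected = True
print(rejected)
```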
Two things to pick up from here:
So our problem is that the entry point does not have permissions. How do we allow the request to pass without losing the user operation hash commitment?
The answer lies in our paymaster.
We haven't mentioned the paymaster until now, so a few words about it. In ERC-4337, there are two ways to pay the on-chain fees for a user operation: the SA pays in the native currency, or a Paymaster pays. If the request includes a Paymaster, the Paymaster needs to verify that the request is valid. In other words, this is a step we control and can bend in our favour.
Here's the code:
This is what we do:
executeMultiple is an Ambire method that expects transaction calls and a signature commitment for them. It's how Ambire handled requests before converting to 4337:

Here's how the validation in validateUserOp changes:
And with this, you have it. The first user operation goes through this special paymaster authorization, which gives permissions to the entry point by having that call authorized by the user in callData. Once that is done, the user begins to use ERC-4337 natively. He can still use the paymaster afterwards to pay fees in tokens, but he no longer needs to go through the executeMultiple selector, as the entry point now has permissions to execute requests.
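The flow above can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions, not the Ambire contract: signatures are a toy (signer, hash) pair instead of ECDSA, calls are (action, address) tuples, addresses are hypothetical strings, and the only paymaster check modeled is that one is present.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Tuple

SIG_VALIDATION_SUCCESS = 0
SIG_VALIDATION_FAILED = 1
ENTRY_POINT = "entry-point"      # hypothetical addresses as plain strings
PAYMASTER = "ambire-paymaster"
USER_KEY = "user-key"

Call = Tuple[str, str]           # (action, address), e.g. ("grant", ENTRY_POINT)
Signature = Tuple[str, bytes]    # toy signature: (signer, signed hash)


def toy_sign(signer: str, digest: bytes) -> Signature:
    return (signer, digest)


def toy_recover(digest: bytes, sig: Optional[Signature]) -> Optional[str]:
    if sig is None:
        return None
    signer, signed = sig
    return signer if signed == digest else None


def hash_calls(calls: List[Call]) -> bytes:
    return hashlib.sha256(repr(calls).encode()).digest()


@dataclass
class UserOp:
    selector: str                          # function targeted by callData
    calls: List[Call]
    calls_signature: Optional[Signature]   # commitment carried inside callData
    paymaster: Optional[str]
    signature: Optional[Signature]         # user-operation-level signature


class Account:
    def __init__(self, initial_signers: List[str]) -> None:
        self.privileges = set(initial_signers)

    def validate_user_op(self, caller: str, op: UserOp, user_op_hash: bytes) -> int:
        if op.selector == "executeMultiple":
            # Edge case: the entry point needs no privileges yet, but a
            # paymaster must back the operation.
            if op.paymaster is None:
                raise PermissionError("paymaster required for executeMultiple path")
            return SIG_VALIDATION_SUCCESS
        # Native path: entry point must hold privileges, key must sign userOpHash.
        if caller not in self.privileges:
            raise PermissionError("caller is not an authorized entry point")
        signer = toy_recover(user_op_hash, op.signature)
        return SIG_VALIDATION_SUCCESS if signer in self.privileges else SIG_VALIDATION_FAILED

    def execute_multiple(self, calls: List[Call], sig: Optional[Signature]) -> None:
        # The user's authorization is checked here, against the calls themselves.
        if toy_recover(hash_calls(calls), sig) not in self.privileges:
            raise PermissionError("calls not authorized by a privileged key")
        for action, address in calls:
            if action == "grant":
                self.privileges.add(address)


account = Account([USER_KEY])    # entry point NOT among the initial signers
grant = [("grant", ENTRY_POINT)]
first_op = UserOp("executeMultiple", grant, toy_sign(USER_KEY, hash_calls(grant)), PAYMASTER, None)
first_result = account.validate_user_op(ENTRY_POINT, first_op, b"hash-1")
account.execute_multiple(first_op.calls, first_op.calls_signature)

second_op = UserOp("execute", [], None, None, toy_sign(USER_KEY, b"hash-2"))
second_result = account.validate_user_op(ENTRY_POINT, second_op, b"hash-2")
print(first_result, second_result, ENTRY_POINT in account.privileges)
```

In this model the first operation passes validation only because a paymaster backs it, and the user's signature over the calls is what actually grants the entry point its privileges. The second operation then takes the native path.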
No design decision comes without flaws, so let's talk a bit about trade-offs.
callData call and give authorization to the new entry point for deployments on new networks

This hack offers many other benefits, which we'll only mention quickly as this is getting a bit long:
During recovery, the user has lost his key and he cannot do standard signature verification.
For v2 of the Ambire contracts, we have a DKIM Recovery mechanism. The thing is, validation and execution are coupled in it for security reasons, so it uses storage outside of the SA. Per the specification, that is allowed only with limitations in validateUserOp, and those limitations don't work for us:
So we use our paymaster "edge case" to enable recovery without contradicting the ERC-4337 rules.
Who pays the fee when the user cannot authenticate the request (he's lost his key) and cannot set a new key immediately (as it's timelocked)? The paymaster does, through the edge case.
You keep your design flexible, as you can delegate the signature validation to an external contract without worrying about the storage slot rules in validateUserOp.