Run a local testnet using dapptools:
On a second terminal, source your testnet environment:
Wait, you don't have a testnet env file? Here it is:
Always source that file after running the testnet.
All right, let's do this!
That's it.
That opcode means stop. Not revert or anything. Just stop. So if you send a transaction, any transaction, to that smart contract, it will accept it. Isn't that cool?
Run the following:
Did you run it? Please run it now.
Congratulations! You did it! Or did you?
Actually, you didn't deploy shit. Take a look:
Did you really think it was gonna be that easy? C'mon. What did you think this was?
In order to actually deploy something, you need to include code that returns the code you actually want to deploy:
f3 means return. You're finishing the execution and returning something back. In this case, as you're sending a contract-creation transaction (one with an empty recipient), that means you're deploying a contract with whatever code you return.
So run this:
Nah, I'm just messing with you. That won't work either.
I promise that very soon you'll be able to deploy stuff. Trust me.
f3 actually returns stuff that's stored in memory. For instance, let's assume that in memory position number 0 you have value 0 and you want to deploy that. So you have to let f3 know about the position in memory and the length of what you're interested in. You do that by putting them on the stack:
Now we're talking. 60 means push. Push to the stack, that is. And whatever comes after 60 is the value you're pushing.
So in the code above, we're pushing a 01 and then a 00 into the stack, and then we're calling f3. What f3 does internally is take these two values from the stack and interpret them as the input it needs in order to return something from memory.
So it's like function parameters, but they're in the stack.
In this case, we're telling f3 to return 01 bytes of information from memory position 00. Why 01 though? Because we only wanna return a single byte of code, namely 00.
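To see the push/push/return structure spelled out, here is a minimal disassembler sketch in Python. It assumes the bytecode under discussion is 60 01 60 00 f3 (reconstructed from the description above), and it only knows the opcodes mentioned so far; everything else is printed as unknown.

```python
# Minimal disassembler sketch: only STOP, PUSHn and RETURN are named.
OPCODES = {0x00: "STOP", 0xF3: "RETURN"}


def disassemble(hex_code):
    """Turn a hex string of bytecode into a list of readable instructions."""
    code = bytes.fromhex(hex_code)
    out = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # 0x60 pushes 1 byte; the following opcodes push 2, 3, ... bytes.
            n = op - 0x5F
            immediate = code[pc + 1:pc + 1 + n]
            out.append("PUSH%d 0x%s" % (n, immediate.hex()))
            pc += 1 + n
        else:
            out.append(OPCODES.get(op, "UNKNOWN 0x%02x" % op))
            pc += 1
    return out


# Assumed bytecode: push the length (01), push the offset (00), then return.
listing = disassemble("60016000f3")
print(listing)
```

The last value pushed (the offset) ends up on top of the stack, which is why it is the first "argument" that f3 reads.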
Now that we know how to return something from memory, let's learn how to put something in memory that we want to return. In this case, some code.
And then, we will be able to deploy!
So the command to put things into memory is 52. You tell it what it is you wanna put in memory and where in memory you wanna put it, and then you call 52:
In our case, we just wanna put a 00 in memory position 00. That's all.
Makes sense, or what? Let's try it:
And there you have it. Your first smart contract written directly in bytecode. Take a look:
You see? You just deployed a zero to the blockchain. Power to you.
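If you want to replay this without a testnet, a toy interpreter is enough. The sketch below is not a real EVM: it only handles stop, push, 52 and f3, ignores gas, and assumes the deployment bytecode is 60 00 60 00 52 60 01 60 00 f3 (reconstructed from the steps above). Whatever it returns is what would end up stored as contract code.

```python
def run(hex_code):
    """Execute a tiny subset of EVM bytecode and return the RETURN data."""
    code = bytes.fromhex(hex_code)
    stack = []            # Python ints; the top of the stack is the end of the list
    memory = bytearray()  # grows on demand, filled with zeroes
    pc = 0

    def grow(end):
        # Memory expands in zero bytes whenever it is touched past its end.
        if len(memory) < end:
            memory.extend(bytes(end - len(memory)))

    while pc < len(code):
        op = code[pc]
        if op == 0x00:                       # STOP: halt, return nothing
            return b""
        elif 0x60 <= op <= 0x7F:             # PUSH1..PUSH32
            n = op - 0x5F
            stack.append(int.from_bytes(code[pc + 1:pc + 1 + n], "big"))
            pc += n
        elif op == 0x52:                     # MSTORE: offset on top, then value
            offset, value = stack.pop(), stack.pop()
            grow(offset + 32)
            memory[offset:offset + 32] = value.to_bytes(32, "big")
        elif op == 0xF3:                     # RETURN: offset on top, then length
            offset, size = stack.pop(), stack.pop()
            grow(offset + size)
            return bytes(memory[offset:offset + size])
        else:
            raise ValueError("opcode 0x%02x is not part of this sketch" % op)
        pc += 1
    return b""  # running off the end of the code behaves like STOP


stop_only = run("00")                     # the very first "contract"
deployed = run("600060005260016000f3")    # store 00 at position 00, return 1 byte
print(stop_only.hex(), deployed.hex())
```

The first run returns nothing, which is why nothing got deployed; the second returns the single byte 00.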
Let's try something else. Let's deploy some code that actually does something. How about returning a "hola mundo" string?
There you have it. We need to return that.
But also, let's retain compatibility with Solidity, because what can you do. In Solidity, strings are dynamically-sized arrays, so you first specify the string's location (its offset in the returned data), then its length, and then the actual string:
Concat that, removing the 0x prefixes in the middle, and there you have your string.
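The same three-slot layout can be built in a few lines of Python. This is a sketch of how a single string is ABI-encoded, with each field occupying one 32-byte slot; the variable names are just for illustration.

```python
text = "hola mundo".encode("ascii")

# Slot 1: offset of the string data, counted from the start of the return data.
offset = (0x20).to_bytes(32, "big")
# Slot 2: length of the string in bytes (10, i.e. 0x0a).
length = len(text).to_bytes(32, "big")
# Slot 3: the string itself, padded with zeroes on the right to fill the slot.
data = text.ljust(32, b"\x00")

encoded = offset + length + data
print("0x" + encoded.hex())
```

Note that the numbers are padded on the left while the string bytes are padded on the right.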
Let's work backwards. You wanna write some code that returns something. So let's start with that. What was the opcode that returns stuff? Do you remember? Hint: it starts with an f. Most things that are kinda related to finishing the execution start with an f.
It's f3. So let's start with that:
It takes two "arguments" as stack values. First, the memory position; then, the length of the stuff in memory you wanna return.
When I say first I mean the outermost element in the stack, and when I say second, I mean the innermost element, or the one that was pushed first. So first actually means last. But you know what I mean, right? Something like this:
That's the stack and it goes from left to right, so when you push something, you push it to the left of it. So f3 takes that, and then returns.
In this case, the length is gonna be 3 slots of 32 bytes, that is 96, or 0x60. And for the offset, let's put it in memory position 0:
So you first push the length, then the offset in memory, then return. Now let's put that string in memory.
Please tell me what was the opcode for putting stuff in memory. Did you put it in your memory?
It starts with a 5, as IO operations tend to do. It's 52. It takes two "arguments" from the stack, like so:
In this case, we want to put it in memory position 0. This goes against some convention, by the way. You're supposed to allocate some free memory space, but tbh I don't understand what's the use of that, so I'm not gonna do it.
Then we need to push the whole string to memory, but 52 only does it 32 bytes at a time. So we need 3 52 operations. Let's start with the first 32 bytes and work from there.
The first 32 bytes are just zeroes and then a 20 at the end. Check. The second 32 bytes are very similar:
Take into account that you always need to specify two characters per byte, even if the first is a zero, because the EVM has no other way to tell bytes apart. The spaces between bytes are just a human convention, and they will be gone in the final bytecode.
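Python's hex parser follows the same rule, which makes it handy for checking your bytecode before sending it. The sketch below assumes the second store is written as 60 0a 60 20 52 (length 0a into position 20); the broken variant simply drops the leading zero of 0a.

```python
# Spaces are only there for humans: both spellings give the same bytes.
spaced = bytes.fromhex("60 0a 60 20 52")
packed = bytes.fromhex("600a602052")
same = spaced == packed

# Dropping the leading zero of "0a" leaves an odd number of hex digits,
# so the byte boundaries can no longer be recovered.
try:
    bytes.fromhex("60a602052")
    odd_ok = True
except ValueError:
    odd_ok = False

print(same, odd_ok)
```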
Now, let's put the actual string in memory. This is gonna be a bit of a challenge.
60 only pushes one byte to a stack position. But that's a problem, because we need to push 10 bytes this time. If we use 60 ten times, we will end up with 10 bytes in 10 stack positions. What we want is 10 bytes in a single stack position.
Introducing the 6* family. Each member of the 6* family pushes an increasingly large number of bytes to the stack. Which is confusing, because 61 doesn't push 1 byte, but two. So in this case, we want the 69, which pushes 10 bytes.
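Since the off-by-one is easy to get wrong, here is a small Python sketch that picks the push opcode for a given payload. The helper names push_opcode and push are made up for this example.

```python
def push_opcode(n_bytes):
    """Opcode that pushes n_bytes bytes: 0x60 for 1 byte up to 0x7f for 32."""
    if not 1 <= n_bytes <= 32:
        raise ValueError("a push carries between 1 and 32 bytes")
    return 0x5F + n_bytes


def push(value):
    """Hex for 'push these bytes as a single stack item'."""
    return "%02x%s" % (push_opcode(len(value)), value.hex())


one = push_opcode(1)    # 0x60
two = push_opcode(2)    # 0x61
ten = push_opcode(10)   # 0x69
hola = push("hola mundo".encode("ascii"))
print(hex(one), hex(two), hex(ten), hola)
```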
So we run seth --from-ascii 'hola mundo' and put the result here:
Then we put it in the third memory slot (which, being a 32-byte slot, starts at 64, aka 0x40):
The above code returns "hola mundo". But in order to deploy that code, we need to return it in the transaction we make to the blockchain. So we need to write code that returns code that returns "hola mundo". Like, literally.
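To see what the 96 returned bytes look like, the three stores can be mimicked on a plain bytearray. This is a sketch under the assumption that the stores are exactly "20 at position 0", "0a at position 0x20" and "the pushed string at position 0x40"; the mstore helper is written for this example and is not a library function.

```python
def mstore(memory, offset, value):
    """Mimic opcode 52: write value as a 32-byte big-endian word at offset."""
    end = offset + 32
    if len(memory) < end:
        memory.extend(bytes(end - len(memory)))
    memory[offset:end] = value.to_bytes(32, "big")


memory = bytearray()
mstore(memory, 0x00, 0x20)                                  # location
mstore(memory, 0x20, 0x0A)                                  # length
mstore(memory, 0x40, int.from_bytes(b"hola mundo", "big"))  # pushed string

returned = bytes(memory[0x00:0x60])   # what "return 0x60 bytes from 0" yields
for start in range(0, len(returned), 32):
    print(returned[start:start + 32].hex())
```

One thing the printout makes visible: a pushed value is padded with zeroes on the left, so under these assumptions the ten string bytes sit at the end of the third slot. Converting the output to ASCII still shows the message, but a strict Solidity decoder expects the string bytes at the start of that slot.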
We kinda did that earlier, but at the time we only needed to deploy one byte. Now we need to deploy like 20, so putting it in the stack and then in memory would be pretty cumbersome. There must be something better. A way to copy code into memory.
That's called 39. Things that start with 3 tend to be related to the user input. In this case, we are the user, and we are inputting the code. So 39 copies code into memory, and we must tell it where in memory we want to copy it, and the offset and length of the code we wanna copy.
So we need to push the following things into the stack in order to use 39:
The first one (I mean, the last one we should push) is easy: we're just gonna put that code in memory position 0. The second one is tricky, so let's leave it for later, and the third one is the length of the code. Let's count how many bytes our hola mundo contract has.
To me, it looks like it has 29 bytes. That's 1d. So let's go with that:
Now. When we run the transaction, this is the first code that we're gonna put there, and then we're gonna put all the hola mundo code. So the code position is gonna be right after this block of code. So let's count how many bytes it has (counting also the ??, that is, the count itself) and replace ?? with that number.
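Counting bytes by eye gets old fast, so it's worth letting Python do it. The hex below is a reconstruction of the hola mundo contract from the steps described earlier (three stores and a return), so treat the exact bytes as an assumption.

```python
runtime_parts = [
    "6020600052",                    # store 20 at memory position 00
    "600a602052",                    # store 0a at memory position 20
    "69686f6c61206d756e646f604052",  # store "hola mundo" at memory position 40
    "60606000f3",                    # return 0x60 bytes from position 00
]
runtime = bytes.fromhex("".join(runtime_parts))

size = len(runtime)
print(size, "%02x" % size)
```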
It looks like it has seven bytes. So, given it starts with 0, the position of the hola mundo code would be 07:
Oh shit. We forgot that we also need to return the code. Here we're merely copying it to memory. So the code position is gonna be larger than 07. Sorry about that. And welcome to the world of constantly counting bytes.
The code for returning would be:
So putting it all together,
And now that we have it, we can count again the number of bytes we just wrote and replace ?? with it. It's 12, a.k.a. 0c:
Putting together the deployment (a.k.a. constructor) code with the actual contract code, we have the following:
Now we need to remove all the comments and put the bytecode together. Save this to a local file and do:
Now deploy the output with:
Great!
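If you'd rather double-check the offsets offline, the constructor's job can be mimicked with plain slicing. This sketch assumes the constructor is 60 1d 60 0c 60 00 39 followed by 60 1d 60 00 f3, and that the contract code is the 29-byte reconstruction used throughout; it doesn't execute anything, it only slices the same ranges that 39 and f3 would touch.

```python
# Assumed constructor: copy 0x1d bytes of code from position 0x0c to memory 0,
# then return 0x1d bytes from memory position 0.
init = bytes.fromhex("601d600c600039" + "601d6000f3")

# Assumed contract code (reconstructed from the hola mundo walkthrough).
runtime = bytes.fromhex(
    "6020600052"
    "600a602052"
    "69686f6c61206d756e646f604052"
    "60606000f3"
)

creation = init + runtime                 # what goes into the transaction

code_offset, length = 0x0C, 0x1D
memory = creation[code_offset:code_offset + length]   # what 39 copies
deployed = memory[0:length]                            # what f3 returns

print(len(init), deployed == runtime)
print(creation.hex())
```

If the constructor ever changes size, the 0c has to change with it, which is exactly the counting problem described above.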
Now let's transform that into ascii to finally see our long-awaited message:
There you have it.
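The hex-to-ASCII step can also be done in Python if seth isn't at hand. The return data below is an assumed 96-byte result (location, length, then a slot containing the string), not output captured from a real call.

```python
# Assumed return data: three 32-byte slots.
return_data_hex = (
    "00" * 31 + "20"                        # location
    + "00" * 31 + "0a"                      # length
    + "00" * 22 + "686f6c61206d756e646f"    # slot holding the string bytes
)
raw = bytes.fromhex(return_data_hex)

# Drop the zero padding. The location byte 0x20 happens to be a space and the
# length byte 0x0a a newline, so a plain strip() gets rid of them as well.
message = raw.decode("ascii").replace("\x00", "").strip()
print(message)
```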
Did you like bytecode programming? Check out my repo where I did a full ERC-20 implementation with bytecode. I also made some tools for easier processing of bytecode. Among other niceties, you don't have to count bytes anymore with these tools.
Also, check out https://www.ethervm.io/ for a full reference of all opcodes. https://github.com/crytic/evm-opcodes includes the gas cost of each, and so does https://github.com/wolflo/evm-opcodes. I'm not sure whether these pages are really up-to-date.
https://github.com/quilt/etk/ has a more robust framework for bytecode modification. It uses the opcode names instead of their actual numbers and allows for using multiple files.