EPF: Cohort 4 - Week 8

As mentioned in my last update, I spent the first two days of this week discussing the modifications needed for the ETK assembler with Sam Wilson and Lightclient.

The current problem occurs mainly when handling labels, which can be used before they are defined. Since the ETK backend processes the code linearly, when it encounters a label that it cannot yet resolve (because it is declared later in the code), it enters a "pending" state until it reaches the portion of code that defines the label.

Let's look at an example to clarify the idea:

push1 end // <- The assembler enters the 'pending' state
          //    because the 'end' label is not yet defined.
jump

push1 0x01
push1 0x01
add

end:     // <- The label is finally declared
    jumpdest
    pc
    stop

The "pending" state is nothing more than accumulating the bytecode produced from the source in a separate vector until the label's definition is found. This way, the assembler can quickly count how many bytes were emitted before the label was defined, and from that work out the value the label must take everywhere else in the code in order to jump to it.
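To make the mechanism concrete, here is a minimal sketch of a linear assembler with a pending vector. It is not ETK's actual implementation: the `Op` enum, the `LinearAssembler` struct, and its field names are hypothetical, every push is assumed to carry a one-byte immediate, and the program is assumed to be shorter than 256 bytes.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified instruction set (not ETK's real types).
#[derive(Clone, Copy)]
enum Op {
    /// `push1 <label>`: one opcode byte plus a one-byte immediate.
    Push1Label(&'static str),
    /// `push1 <value>`.
    Push1(u8),
    /// Any single-byte instruction, given by its opcode.
    Plain(u8),
    /// `<name>:` label definition; emits no bytes.
    Label(&'static str),
}

impl Op {
    /// Number of bytes the op occupies in the final bytecode.
    fn size(&self) -> usize {
        match self {
            Op::Push1Label(_) | Op::Push1(_) => 2,
            Op::Plain(_) => 1,
            Op::Label(_) => 0,
        }
    }
}

struct LinearAssembler {
    /// Bytes that are fully resolved.
    ready: Vec<u8>,
    /// Ops held back because an earlier op references an unknown label.
    pending: Vec<Op>,
    /// Byte offset of every label defined so far.
    labels: HashMap<&'static str, usize>,
    /// Total size of everything pushed so far (ready + pending).
    offset: usize,
}

impl LinearAssembler {
    fn new() -> Self {
        LinearAssembler {
            ready: Vec::new(),
            pending: Vec::new(),
            labels: HashMap::new(),
            offset: 0,
        }
    }

    fn push(&mut self, op: Op) {
        if let Op::Label(name) = op {
            // The label's value is the number of bytes seen before it.
            self.labels.insert(name, self.offset);
        }
        self.offset += op.size();
        self.pending.push(op);
        self.flush();
    }

    /// Moves ops from `pending` to `ready` until one needs an unknown label.
    fn flush(&mut self) {
        let mut done = 0;
        for op in &self.pending {
            match *op {
                Op::Push1Label(name) => match self.labels.get(name) {
                    // 0x60 is PUSH1; the cast assumes offsets below 256.
                    Some(&addr) => self.ready.extend_from_slice(&[0x60, addr as u8]),
                    None => break, // still pending
                },
                Op::Push1(value) => self.ready.extend_from_slice(&[0x60, value]),
                Op::Plain(opcode) => self.ready.push(opcode),
                Op::Label(_) => {}
            }
            done += 1;
        }
        self.pending.drain(..done);
    }
}
```

Feeding the example above into this sketch, `push1 end` and the four instructions after it stay in `pending` until `end:` is pushed, at which point `end` resolves to offset 8 and everything is flushed to `ready`.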

The problem arises when, while a label is still unresolved, the assembler also encounters a macro whose definition has not been found yet, because macros require the vector of pending opcodes to be empty. This causes an assert to fail and the assembler to panic.

%macro revert_if_neq() 
        push1 revert    
%end

%revert_if_neq() // <- The assembler enters the 'pending' state
                 //    because the 'revert' label is not yet defined.
%revert()        // <- Macro undefined with len(pending) > 0. Panic!

From this explanation, I think it is easy to see that one possible solution would be to modify the code so that all macros are "defined" before any code is processed. Our example would then return an "undefined macro" error: if the revert macro is not found once all definitions have been collected, we can infer that it does not exist. However, after discussing it with Sam Wilson, we concluded that this would be a good time to simplify the assembler code and get rid of the pending state altogether.

After analysing the code and understanding the whole situation, I decided that a possible solution would be to process all the code first, leaving a placeholder wherever a label is used. Then, once the code is completely unrolled (i.e. the final code with all macros expanded, etc.), a second pass calculates each label's position and replaces each placeholder with it.

While this may not be the most efficient solution, I think it is a good first step towards simplifying the code, which can then be optimised.
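The placeholder idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that will land in ETK: the `Item` enum and the `expand` and `assemble` functions are hypothetical names, macros take no arguments and are not recursive, and every label reference is a `push1` with a one-byte immediate.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified source items (not ETK's real types).
#[derive(Clone)]
enum Item {
    /// `push1 <label>`: opcode byte plus a one-byte immediate.
    PushLabel(String),
    /// Any other instruction, already encoded as raw bytes.
    Raw(Vec<u8>),
    /// `<name>:` label definition; emits no bytes.
    Label(String),
    /// `%<name>()` macro invocation (no arguments in this sketch).
    Invoke(String),
}

/// Step 1: unroll the code by replacing each invocation with the macro body.
/// All macro definitions are known up front, so a missing one is an error.
fn expand(items: &[Item], macros: &HashMap<String, Vec<Item>>) -> Result<Vec<Item>, String> {
    let mut out = Vec::new();
    for item in items {
        match item {
            Item::Invoke(name) => {
                let body = macros
                    .get(name)
                    .ok_or_else(|| format!("undefined macro `{}`", name))?;
                out.extend(expand(body, macros)?);
            }
            other => out.push(other.clone()),
        }
    }
    Ok(out)
}

/// Steps 2 and 3: emit bytes with a placeholder for every label reference,
/// then patch each placeholder once all label positions are known.
fn assemble(items: &[Item]) -> Result<Vec<u8>, String> {
    let mut code: Vec<u8> = Vec::new();
    let mut labels: HashMap<&str, usize> = HashMap::new();
    // (index of the placeholder byte, label it refers to)
    let mut fixups: Vec<(usize, &str)> = Vec::new();

    for item in items {
        match item {
            Item::PushLabel(name) => {
                code.push(0x60); // PUSH1
                fixups.push((code.len(), name.as_str()));
                code.push(0x00); // placeholder, patched below
            }
            Item::Raw(bytes) => code.extend_from_slice(bytes),
            Item::Label(name) => {
                labels.insert(name.as_str(), code.len());
            }
            Item::Invoke(name) => {
                return Err(format!("macro `{}` was not expanded", name));
            }
        }
    }

    for (at, name) in fixups {
        let target = *labels
            .get(name)
            .ok_or_else(|| format!("undefined label `{}`", name))?;
        if target > u8::MAX as usize {
            return Err(format!("label `{}` does not fit in push1", name));
        }
        code[at] = target as u8;
    }
    Ok(code)
}
```

With this shape there is no pending state at all: an undefined macro is reported during `expand`, and an undefined label is reported during the patching pass. The cost is an extra pass over the fixups, which is the inefficiency mentioned above.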

With this in mind, over the next week I intend to:

Finish testing ideas to simplify the assembler.
After the simplified version is tested, spend some time implementing possible optimisations.
