Solidity Gas Optimization Techniques: Loops

For loops in plain Solidity suck. Here's how to unsuck them.

This guide contains 6 loop optimization techniques. We'll illustrate them by going through 3 example for loops. As the guide progresses we will be turning up the craziness and turning down the readability, all in the noble goal of saving gas.

Vanilla Loop

First off, this is your normal for loop in Solidity. This one is purposefully simple: it just adds the numbers from 0 to 99 together and returns their sum.






// 37628 gas 
function loop() public returns (uint256 sum) {
    for(uint256 n = 0; n < 100; n++) {
        sum += n;
    }
}

Can we make this thing cheaper? Let's tinker with the function and observe how its gas consumption changes.

1. Don't use `>=` and `<=` in the condition

First, let's see what happens when we change the loop condition to n <= 99.






// 37975 gas (+347)
function loop_lte() public returns (uint256 sum) {
    for(uint256 n = 0; n <= 99; n++) {
        sum += n;
    }
}

To our surprise, using the less-than-or-equal operator makes our function consume more gas than previously. To understand this, recall that high-level Solidity code must be compiled to EVM bytecode in order to be executed on-chain. The EVM has the opcodes LT, GT and EQ for comparison, but there are no convenient LTE or GTE opcodes for the operation we are doing. Therefore each time the condition n <= 99 is checked, 3 instructions must be executed: LT n 99,EQ n 99 and OR to check if either one of those returned true. (Note: this is not how EVM bytecode actually looks or handles variables, it is just useful for making the point). Since every instruction duringb execution costs gas, our function gets more expensive.

2. Increment the variable in an unchecked block

Since version 0.8 Solidity implements safety checks for all integer arithmetic, including overflow and underflow guards. This is a blessing for the security of smart contracts, but a curse for the gas optimizor.

For the fairly innocuous n++ Solidity will insert extra code to handle the case if n would overflow after incrementing it. This is useless in our case since we know as a fact that the loop condition guaranteed that n will never be above 100, let alone 2^256. Thankfully for us, we can wrap it in an unchecked {} block to skip the overflow check.









// 32343 gas (-5285)
function loop_unchecked_plusplus() public returns (uint256 sum) {
    for(uint256 n = 0; n < 100;) {
        sum += n;
        unchecked {
            n++;
        }
    }
}

This saves us a whopping 5285 gas, more than 14% reduction compared to our vanilla implementation.

In fact you can just go ahead and put the whole for loop in the unchecked block. But only if you are sure the variables will never overflow. Think very critically about your logic before you make all your code unchecked, as your (our) gas-saving addiction can end in disaster!









// 26587 gas (-11041)
function loop_unchecked() public returns (uint256 sum) {
    unchecked {
        for(uint256 n = 0; n < 100;) {
            sum += n;
            n++;
        }
    }
}

3. Just write it in assembly

There's no escaping it: you can always rewrite anything cheaper in Yul assembly. If you're not familiar with Yul, read the docs here.











// 26450 gas (-11178)
function loop_assembly() public returns (uint256) {
    assembly {
        let sum := 0
        for {let n := 0} lt(n, 100) {n := add(n, 1)} {
            sum := add(sum, n)
        }
        mstore(0, sum)
        return(0, 32)
    }
}

We removed the declaration uint256 sum from the function header in order to escape Solidity's type system as much as possible. Yul has no types, which is actually pretty comfy as you can dispense with type casting and incompatibility bs. You can also define your own types (you will see later) which will drive you insane as you try to keep track of all the gimmicky perversions you have conjured across your codebase.

Array Loop

Sometimes we want to loop through an array and do something with its values. Here is the canonical way to do this (incorporating our lessons so far):









// 25402 gas
function loopArray(uint256[] calldata ns) public returns (uint256 sum) {
    for(uint256 i = 0; i < ns.length;) {
        sum += ns[i];
        unchecked {
            i++;
        }
    }
}

_{†We are measuring the gas usage by providing this array with 10 elements:
[1, 6, 13, 22, 24, 25, 30, 33, 94, 215].}

4. "Cache" the array's length for the loop condition










// 25182 gas (-230)
function loopArray_cached(uint256[] calldata ns) public returns (uint256 sum) {
    uint256 length = ns.length;
    for(uint256 i = 0; i < length;) {
        sum += ns[i];
        unchecked {
            i++;
        }
    }
}

Generally you should always aim to simplify the loop condition. Previously, every time before the condition was checked, the length of ns had to be fetched anew. We know the length won't change during execution and we can reduce the number of ns.length calls to just 1 for a modest reduction in gas.

Of course, assembly is even cheaper:

// 23986 gas (-1434)
function loopArray_assembly(uint256[] calldata ns) public returns (uint256 sum) {
    assembly {
        let guard := add(1, calldatasize())
        for {let offset := ns.offset} 
            lt(offset, guard) 
            {offset := add(offset, 32)} 

            {sum := add(sum, calldataload(offset))}
    }
}

5. Pack small iterables

We can employ some cleverness to reduce the code size. Notice that the input array [1, 6, 13, 22, 24, 25, 30, 33, 94, 215] contains small 8-byte integers. When we pass them to the function, however, each one of them uses up a whole 32 bytes. Even if we define our input variable as uint8[] calldata ns, they will still take up 32 bytes each as calldata in solidity is not packed. In fact, that would actually make it more expensive as the uint8 variables would need to be cast to uint256 before using them. You should really only the smaller types for storage packing.

What we can do is pack the variables beforehand in a single uint256. In our example this comes out to the number 1017045174132346211599873 or in hexadecimal, 0x00000000000000000000000000000000000000000000d75e211e1918160d0601.
Each 2 hex characters make up an 8-bit number. We can read our numbers from right to left: 01,06,0d,16,…,d7. We can isolate the rightmost number by AND-ing the packed number with a bitmask for the rightmost 8 bits. This is how the operation looks in hex:

AND
0x00000000000000000000000000000000000000000000d75e211e1918160d0601
0x00000000000000000000000000000000000000000000000000000000000000ff
=
0x0000000000000000000000000000000000000000000000000000000000000001

And this is how it looks in binary:

AND
..11010111010111100010000100011110000110010001100000010110000011010000011000000001
..00000000000000000000000000000000000000000000000000000000000000000000000011111111
=
..00000000000000000000000000000000000000000000000000000000000000000000000000000001

To get to the next number we can bitwise shift to the right by 8 bits. Here is how it looks in hex:

SHR
8
0x00000000000000000000000000000000000000000000d75e211e1918160d0601
=
0x0000000000000000000000000000000000000000000000d75e211e1918160d06

To know when to stop, we add a length argument in the input. Of course we can also pack the length together with the numbers themselves (as the first variable for example.)










// 23248 gas (-1934)
function loop_arr_magic(uint256 encoded_ns, uint256 length) public returns (uint256 sum) {
    for(uint256 i = 0; i < length;) {
        sum += encoded_ns & 0xff;
        encoded_ns >>= 8;
        unchecked {
            i++;
        }
    }
}

Note that this approach will become cheaper as the length of the number array increases. As customary, assembly will always be cheaper:













// 22931 gas (-2471)
function loopArr_magic_assembly(bytes calldata encoded_ns) public returns (uint256 sum) {
    assembly {
        let sum := 0
        let ns := calldataload(encoded_ns.offset)
        let length := add(1, and(ns, 0xff))
        ns := shr(8, ns)
        for {let i := 0} lt(i, length) {i := add(i, 1)} {
            sum := add(sum, and(ns, 0xff))
            ns := shr(8, ns)
        }
    }
}

Here, we assume the length is given in front of the packed variables (input is 0x000000000000000000000000000000000000000000d75e211e1918160d06010a). Furthermore, using bytes calldata allows for an arbitrary-length input.

This technique is more flexible than in this particular contrived example. You can pack booleans, addresses, arbitrary-bit-sized integers and others. For example, you can efficiently represent coordinates on a 2D board as a packed 32-byte value and define a bitmask over that to represent each player's legal moves.

Credit: I first saw this approach from @fiveoutofnine's tweet.

Multicall

One last scenario before we call it a day. Say you want to call an external contract multiple times with the arguments given as an array. One common application for this is sending an ERC20 / ERC721 from a contract to multiple addresses. A sane gas-aware person would implement a bulk transfer function as such:†












function loopMulticall(address token, uint256 amount, address[] calldata recipients) public {
    // 128505 gas
    if (msg.sender != owner) {
        revert();
    }
    for(uint256 i = 0; i < recipients.length;) {
        IERC20(token).safeTransfer(recipients[i], amount);
        unchecked {
            i++;
        }
    }
}

_{†Here the contract is using OpenZeppelin's SafeERC20 for IERC20. The gas in this and all following examples is measured by sending 100 DAI to 10 addresses with non-zero initial balance. This is an important detail as
SSTORE is more expensive when the storage slot contains 0.}

6. Reuse memory between similar calls

There is an inefficiency here, however. When calling the token contract, the same data for the function selector and amount argument is loaded into memory multiple times. We can optimize this further with some assembly:





























function loop_multitransfer_assembly(address token, uint256 amount, address[] calldata recipients) public {
    // 126813 gas (-1692)
    assembly {
        if or(
            iszero(eq(caller(), owner)), 
            iszero(extcodesize(token))
        ) {
            revert(0, 0)
        }
        let fmp := mload(0x40)
        // store the function selector for transfer
        mstore(fmp, 0xa9059cbb00000000000000000000000000000000000000000000000000000000)
        // add the "amount" argument (will not change)
        mstore(add(fmp, 36), amount)

        let guard := add(1, calldatasize())
        for {let offset := recipients.offset} 
            lt(offset, guard) 
            {offset := add(offset, 32)} 
            {
                // add the "to" argument
                calldatacopy(add(fmp, 4), offset, 32)
                if iszero(call(gas(), token, 0, fmp, 68, 0, 32)) {
                    revert(0, 0)
                }
            }

    }
}

Notice how we only store the function selector and the amount argument once, only replacing the to argument between calls. I discovered this trick when trying to optimize a MEV execution contract that did many UniswapV2Pair.swap() calls. The gas savings increase with the number of calls and the size of the reused data.

Conclusion

For loops are often big gas guzzlers in Solidity. I hope you found some of these techniques useful or at least cool.