# UTXO-based smart contracts

*Note: These are **my own** ideas. What IF is up to might be completely different.*

Smart contracts are becoming more and more important in the crypto space. They upgrade the ledger so that it can not only move the network token around, but also any kind of user-defined asset with arbitrary rules. This idea was at the core when Ethereum was born: a global computer, open to use by anyone willing to pay for gas. This led to the creation of ERC-20 tokens, ERC-721 NFTs, and the entire DeFi sector. Since then, many projects have jumped aboard the SC train to open up these use cases. IOTA should not miss out on this game.

As of now, we have Ethereum-based smart contracts running on Layer 2. This works by creating environments for the EVM with validators to handle the requests. However, as with all Layer 2 networks, it makes things more difficult and also less secure. So what if we also allow smart contract execution on Layer 1?

## Layer 1 parallelism

One of the core goals of IOTA has always been to process transactions in parallel for maximum scalability. Looking into the near future, this becomes a critical design feature, as the single-core performance of CPUs is stalling while each processor gets more cores. Most DLT projects like Bitcoin and Ethereum process transactions in a totally ordered manner in a single thread. While some side tasks (e.g. peering, database management) can be delegated to other cores, after a few cores you will gain little to no speed from the next one. Therefore, execution of L1 SCs must be multi-threaded to ensure maximum scalability.

Just like Bitcoin and unlike Ethereum, IOTA uses the UTXO model. For each transaction, we take existing outputs as inputs and add some user-defined data (usually signatures to prove that we own the tokens). If the transaction is valid, the inputs are deleted and new outputs are created as a result, usable for future transactions.
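To make the UTXO rule concrete, here is a minimal sketch in Python. It is not IOTA's actual data model: the names `Output`, `Transaction` and `apply_transaction` are made up for illustration, signature checks are left out entirely, and outputs only carry a plain token amount.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    output_id: str  # hypothetical identifier, e.g. "txHash||index"
    amount: int


@dataclass(frozen=True)
class Transaction:
    inputs: tuple   # ids of the outputs being consumed
    outputs: tuple  # Output objects being created


def apply_transaction(utxo_set: dict, tx: Transaction) -> dict:
    """Return a new UTXO set with tx applied, or raise ValueError."""
    if len(set(tx.inputs)) != len(tx.inputs):
        raise ValueError("input referenced twice")
    missing = [i for i in tx.inputs if i not in utxo_set]
    if missing:
        raise ValueError(f"unknown or already spent inputs: {missing}")
    spent = sum(utxo_set[i].amount for i in tx.inputs)
    created = sum(o.amount for o in tx.outputs)
    if spent != created:
        raise ValueError("tokens would be created or burned")
    # Inputs are deleted, new outputs become usable for future transactions.
    new_set = {k: v for k, v in utxo_set.items() if k not in tx.inputs}
    for out in tx.outputs:
        new_set[out.output_id] = out
    return new_set


utxos = {"a": Output("a", 100)}
tx = Transaction(inputs=("a",), outputs=(Output("b", 60), Output("c", 40)))
utxos_after = apply_transaction(utxos, tx)
```

Applying `tx` a second time to `utxos_after` raises a `ValueError`, because output `a` no longer exists. That is the double-spend check in its simplest form.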
In classic Bitcoin UTXO, an output only holds some tokens and the conditions to unlock them (usually requiring a signature for a certain address). However, we can put arbitrary data inside outputs, allowing us to keep track of additional information like native tokens, DIDs, sensor data, and *entire smart contracts*.

The result of this architecture is that all outputs are fully independent of each other. As long as no output is consumed twice (double spend) and we don't spend outputs before they are created, we can run as many transactions in parallel as we want, in any order.

## Deterministic execution

IOTA strives towards deterministic execution. In terms of user experience, this brings a big advantage: whatever you sign in your wallet is what is executed, with no unexpected running out of gas, no external entity shrinking your profit (MEV), and no other bad surprises like SC bugs. To achieve this, the wallet executes the transaction before sending it to the network and attaches a commitment to its results. Should the result computed by the network be different, the transaction fails.

## Contested States

<img style="float: right;" src="https://i.imgur.com/gcr1Nah.png">

In general, we can view an output as a single state of the network, with the list of unspent outputs at a given point being the global state of the network. Still, transactions that reference the same state cannot be executed in parallel, since they depend on each other. Let's say we have a simple smart contract functioning as a bank, recording user balances and allowing users to withdraw their money later. Currently, the total balance is 100Mi and both Alice and Bob want to withdraw 10Mi. If we executed both transactions in parallel, each would start from 100Mi and write back a remainder of 90Mi, instead of the expected remainder of 80Mi. Therefore, we need ordering for transactions that want to modify the same state. The classic solution is to have a leader of some sort decide who comes first.
In this example, the order does not matter, but in others the leader might abuse this power to make some extra gains with MEV. IOTA does not have any sort of leader. Furthermore, Alice's transaction has committed to a state where Bob still has his funds deposited, and vice versa. The logical result is that one of the transactions fails to execute. We call this problem a *contested state*. But can we allow Bob and Alice to withdraw at the same time?

## Split States

Indeed, Alice and Bob are modifying different sub-states of the smart contract. Aside from the balance of the smart contract, no writes are shared. With a simple change to the inner workings of the smart contract, we can allow them to execute in parallel!

![](https://i.imgur.com/MOVfPcN.png)

Now, our entry point is a bank controller, which manages our withdrawals. The controller does not have any state, so it can't cause issues. Alice's request would modify Bank #1 and Bob's would modify Bank #2. As long as Charlie or Dave don't want to join the show as well, there is no conflict.

We could go even further and give each user a separate state. In this case, all users could deposit and withdraw independently of each other, without running into any issues.

![](https://i.imgur.com/zlwUBMs.png)

However, there is a big downside to this. Each state has a certain overhead for storing it in the database, currently around 400 virtual bytes. This means that in our extreme example, where we only store a single 32-byte key/value pair, we would pay more than 5 times the storage cost compared to just adding it to an existing state! If we want to access multiple states, we also have to reference them all individually in our transaction inputs, causing further overhead in transaction size.

My take on this is that splitting will be used for contracts with huge states, where the overhead doesn't change a lot.
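The difference between a contested and a split state can be illustrated with a small Python sketch. It only models which outputs each transaction consumes. The function `find_conflicts` and the output names (`bank`, `bank_1`, `bank_2`) are hypothetical, and the stateless controller is left out because it is never written to.

```python
def find_conflicts(transactions: dict) -> set:
    """Return all pairs of transactions that consume a common output.

    `transactions` maps a transaction name to the set of output ids it spends.
    """
    names = sorted(transactions)
    conflicts = set()
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if transactions[first] & transactions[second]:
                conflicts.add((first, second))
    return conflicts


# One shared bank state: both withdrawals spend the same output.
single_bank = {"alice_withdraw": {"bank"}, "bob_withdraw": {"bank"}}

# Split states: Alice's balance lives in bank_1, Bob's in bank_2.
split_bank = {"alice_withdraw": {"bank_1"}, "bob_withdraw": {"bank_2"}}

# Charlie also lives in bank_1 and joins the show.
split_bank_busy = dict(split_bank, charlie_withdraw={"bank_1"})

contested = find_conflicts(single_bank)
parallel = find_conflicts(split_bank)
contested_again = find_conflicts(split_bank_busy)
```

Under these assumptions, `contested` holds the Alice/Bob pair, `parallel` is empty, and `contested_again` holds the Alice/Charlie pair, which matches the caveat about Charlie or Dave above.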
Also, in terms of network performance, splitting should be recommended, so that less data needs to be read from and written to the ledger. There are likely going to be some sort of limits on state sizes. Ideally, you'd have some sort of smart contract storage framework that manages this for you and splits your data over multiple states, should one become too big.

Single-user states might be cool for large data, for example, if you want users to have some sort of identity inside a smart contract (network), storing more data than just a few values. However, I don't expect this to be the norm, as it is indeed costly.

Returning to the topic, we have to admit that this still doesn't solve all issues. If Alice internally sent her tokens to Bob, his balance would still be contested. Furthermore, if our bank had some sort of central value like *total cashflow*, updated with each transaction, all our gains from this trick would be instantly void. I will come back to this issue later. For now, let's look at how transactions work:

# Execution model

## Output format

The on-chain output needed for this works similarly to current TIP-0018 NFT outputs, holding IOTA and other native tokens. However, instead of having feature blocks and unlock conditions, the conditions to spend the output are formulated in an immutable code field. The mutable metadata field becomes the storage of the smart contract. An ID (working identically to the Alias ID) will be used to reference the smart contract after its creation. Furthermore, there is an immutable field to store signers (more about this in a second).

Non-SC outputs will be interpreted as a smart contract with a spend() function checking that all unlock conditions are fulfilled. Alias outputs would also have a transition() and a governanceTransition() function.

## Environments

During execution, the outputs with their balance, code, and state are represented as *Environments*.
Each environment has a slot, numbered from 0 in the order in which the environments are initialized/created. There is one important rule: **An environment can only be modified by its own code.** Each environment also has a separate code scope. Interaction between environments is only possible via calls.

### Environment lifecycle

There are two ways to create an environment. The first one is to specify an input, which will cause the VM to be initialized with it before any code execution. The second one is to create it during execution. At any point during execution, a new environment can be created by specifying its initial balances, code, and state. Any created environment is instantly fully independent. There is no connection to the creating environment and it follows the "can only be modified by its own code" rule.

### Signers

To manage authentication, we use a system similar to Fuel's Signer system. If we have a signer for a given address (or Alias), we can be sure that the owner of said address has approved this transaction in some way. Signers are immutable and cannot be created at runtime. Each signature of the transaction generates a signer for the specific address. Also, each environment has a signer for its own ID available, allowing smart contracts to authenticate as themselves. Upon creation of an environment, additional signers can be passed that will also be available inside the environment.

### Read-only environments

When referencing an existing output, it can be specified as *read-only*, which will create a *read-only environment*, blocking any writes to it. As no changes are made, the output remains unchanged in the ledger and is not spent (only provided to the transaction). However, read-only environments can initiate state changes outside of themselves by either calling other environments or creating a new environment.

The cool thing about read-only environments is that they can't contribute to conflicts.
Oracles or other shared data like code libraries or proxy smart contracts can therefore be used without problems. To give owners of such contracts a chance to control who can create conflicts with their smart contract, the read-only flag **must** be set correctly. This means that even if write methods are available, an input whose state you don't change must be included as read-only.

### Consumption of environments

Any non-read-only environment that is not destroyed when the VM finishes execution will become an output that is passed back to the node. The node will assert that the ledger constraints are still met (i.e. no IOTA or native tokens have been created or burned). It will also check whether the provided output commitment matches. If a new smart contract output was created, the node will assign the smart contract ID. If all checks pass, the transaction has finished successfully, causing the inputs to be marked as spent and the new outputs to be written to disk.

An environment can be destroyed by a specific VM call within it (similar to Ethereum's self-destruct call). This causes the environment to be marked as destroyed. It can still receive calls, but it will no longer be persisted.

### Environment #0

Environment #0 is a special *read-only* environment created for executing the transaction script. Its scID, balance, nativeTokens, and state are zero/nil. The code is read from the transaction field. The signers from transaction signatures are also available in this scope.

## Examples

### Example #1: Simple Transaction

Let's suppose Alice wants to send 10 *MOON* tokens and 100i (remember, we still need a storage deposit!) from her address *a123* to Bob's address *b456* (in the real world, the ID and the addresses would be hex values).
In the transaction, Alice specifies a UTXO that holds exactly this balance, and a script that would look something like this:

```
env1.spend(<Signer for a123>)
createEnvironment(100,["MOON":10],<unlock condition for b456>)
```

The code inside the output she referenced defines a spend() function that Alice can call to access her funds.

```
func spend(Signer s){
    assert(addressOf(s) == "a123")
    selfdestruct()
}
```

First, the node loads the UTXO from disk and creates an environment from it. After this, the VM is launched, executing the passed code in environment #0. Since Alice signed her transaction, she has access to a signer for her address, which is passed into the spend function. This function in environment #1 is called first and destroys the environment. As a second step, a new environment with the desired amount and a similar code for Bob's address is created.

Finally, the VM finishes and passes control back to the node. Environment #1 was destroyed, but it still makes its way into the output commitment as a null value to explicitly record that it has been destroyed. Accordingly, outputID `txHash||0` will not exist in the UTXO set. Since the newly created environment does not have an SC-ID yet, the node sets it (in this case `txHash||1`).

The node now runs two checks. First, the output commitment of Alice is validated to ensure the desired result has been reached. Second, the node validates whether any tokens were created or burnt. Since both the input from Alice and the output to Bob have 100 IOTA and 10 MOON, this check passes. Subsequently, the new output is written to disk, UTXO #1 is marked as spent, and the transaction is completed.

![](https://i.imgur.com/243ZHZB.png)

In this case, there was no remainder. If Alice had a greater balance on her input, she would create a third environment with the remainder, specifying an unlock condition that she could fulfill.
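The two node-side checks from this example can be sketched in Python as follows. This is only an illustration under assumptions: the `Environment` class, the helper names, and the commitment format (SHA-256 over a JSON encoding) are invented here, since the proposal does not specify how the commitment is computed.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Environment:
    iota: int
    native_tokens: dict = field(default_factory=dict)
    code: str = ""
    destroyed: bool = False


def totals(envs):
    """Sum IOTA and native tokens over all environments that are not destroyed."""
    iota, tokens = 0, {}
    for env in envs:
        if env.destroyed:
            continue
        iota += env.iota
        for name, amount in env.native_tokens.items():
            tokens[name] = tokens.get(name, 0) + amount
    return iota, tokens


def commitment(envs):
    """Hash the resulting environments; destroyed ones are recorded as null."""
    encoded = [
        None if e.destroyed else [e.iota, sorted(e.native_tokens.items()), e.code]
        for e in envs
    ]
    return hashlib.sha256(json.dumps(encoded).encode()).hexdigest()


def node_accepts(inputs, results, expected_commitment):
    """Check 1: the commitment matches. Check 2: no tokens created or burned."""
    return (commitment(results) == expected_commitment
            and totals(inputs) == totals(results))


# Alice's UTXO as loaded by the node, before execution.
alice_input = Environment(100, {"MOON": 10}, "spend: a123")

# State after the VM ran: environment #1 destroyed, a new one created for Bob.
results = [
    Environment(100, {"MOON": 10}, "spend: a123", destroyed=True),
    Environment(100, {"MOON": 10}, "spend: b456"),
]

# Alice's wallet computed this before sending the transaction.
expected = commitment(results)
accepted = node_accepts([alice_input], results, expected)
```

With these values `accepted` is true. If Bob's environment held 200i instead, the token check would fail, and any other deviation from what the wallet computed would already fail the commitment check.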
### Example #2: A bit more complex

Same scenario as above, but the code of Alice's output is a little different:

```
func spend(Signer s){
    assert(addressOf(s) == "a123")
    createEnvironment(100,[],<unlock condition for c789>)
    selfdestruct()
}
```

Turns out Charlie wants his deposit back! Alice now needs to modify her script:

```
env1.spend(<Signer for a123>)
createEnvironment(100,["MOON":10],<unlock condition for b456>)
env2.spend(<Signer for a123>)
```

The first call unlocks the tokens like in the first example. But it also creates an environment that will ultimately transfer 100 IOTA to Charlie. Since we do not have Charlie's signer (unless he decided to co-sign the transaction), we have no way to modify this environment.

Now Alice sends the 10 MOON and 100i to Bob as in the first example. However, she could not end the transaction at this point, as this VM state would result in 200i on the output side but just 100i on the input side. **The VM will allow this state**, which is crucial as it gives us a chance to fix it. If this state were final, the node would recognize the result as invalid and drop the transaction.

To get the transaction through, Alice has no other choice than to grab another of her UTXOs and spend it as well. Once again it just happens to contain the required 100i, but she could also create a remainder output.

![](https://i.imgur.com/uFyGHie.png)

### Example #3: Smart contract shenanigans

Now Alice wants to buy some MOON tokens and calls a distributing smart contract. However, during a full moon, the price is cheaper, so we need to consult an Oracle for this. The code of the function looks like this:

```
func buy(uint256 amount){
    price := 20 //in iota per token
    moonCycle := oracleID.getMoonCycle();
    if(moonCycle == "Full Moon") {
        price = 18
    }
    //adjust smart contract balance
    this.balance += price*amount
    this.tokenBalance["MOON"] -= amount
}
```

The entire Oracle code consists of a simple getter and a setter that checks authentication.
d987 could be a multi-sig address ensuring a consensus in the Oracle committee.

```
func getMoonCycle(){
    //load the moon cycle from the state and return it
    return this.moonCycle
}

func setMoonCycle(Signer s,string newCycle){
    //setting the moon cycle requires a signer
    assert(addressOf(s) == "d987")
    this.moonCycle = newCycle
}
```

Now Alice needs to put 3 inputs into her transaction: first one of her own outputs, then the smart contract, and finally the oracle output, even though she never calls it directly from the transaction script. In this case, the oracle needs to be included as "read-only". Since the only way to call setMoonCycle() is by providing the signer for the multi-sig, the Oracle committee does not need to worry about their smart contract getting contested and can update it even when it is frequently used.

The transaction script is pretty straightforward again. Let's assume that Alice has 180,100i on her output (10,000 tokens at 18i each plus 100i for the storage deposit), and it is a full moon.

```
env1.spend(<Signer for a123>)
env2.buy(10000)
createEnvironment(100,["MOON":10000],<unlock condition for a123>)
```

In the final graph, we can see that UTXO 3 is not persisted again, as we just loaded it but didn't make any changes.

![](https://i.imgur.com/I1jTPLl.png)

## Consequences for conflict detection

The current protocol only knows the first two cases. Two more cases arise with this proposal:

* **An input is spent once**
  There is no conflict - the transaction can confirm!
* **An input is spent more than once**
  Only one of the spending transactions may confirm!
* **The input is used read-only multiple times**
  This one is straightforward. There haven't been any changes to the input, so the order of transactions does not change anything. All transactions can confirm!
* **The input is spent but also read from**
  This one is a little tricky. We could assume that all reading transactions happen before the spending one.
  This would solve the conflict, since the writing transaction would load the same output no matter how often it has been read before. However, this allows old, already spent outputs to be used in side tangles momentarily. For example, let's say we are referencing a smart contract that returns the fee rates for a DEX. If you increase them, you don't want users to use the old fee rates anymore. Therefore, there has to be a limit on how long you can read from already spent outputs.

# Sequenced Execution

Returning to the issue of contested states, we still don't have a good solution for it. The only way to solve this issue is to give up determinism and introduce a *sequencer*. The sequencer is a Layer 2 application that collects requests to a smart contract and orders them, so there won't be any conflicts. Only transactions co-signed by the sequencer can call the smart contract.

To achieve this, we take the transaction script and put it into a sequence() function inside a smart contract instead. We add a condition that only the sequencer can fulfill and make sure to destroy the contract afterward to prevent replay attacks. Now the sequencer can simply call the created SC and our code gets executed. Since our request carries no output commitment, its execution is non-deterministic. To counteract potential unwanted outcomes, additional assert statements can be added.

To provide the needed signature, we add our signer to the smart contract. Since we can't allow anybody to add any signers, this is only possible from the scope where the signer was initially provided, which ensures that storing the signer is directly authorized by this signature.

After an order has been chosen, the sequencer will try to execute the transactions in it. If a transaction fails or the result after the transaction would not pass node verification (e.g. because a transaction tries to send funds that it does not itself provide), it is dropped.
For these cases, and for when the sequencer does not do its job correctly, there is a recover() function that allows us to reclaim our funds.

It should be noted that the sequencer transaction IS deterministic. The sequencer can therefore be sure that its proposal will be executed as specified. This could even allow some sort of secondary validation. For example, the sequencer could propose a transaction, with a committee only signing if the sequencer played by the rules.

# Further thoughts/questions

* There should be a way to generate non-SC outputs from SC transactions. Basic outputs are ideal for a simple signature-locked store of tokens. It is favorable to encode commonly needed locking types into unlock conditions, as it saves space in the UTXO set.
* Do we need some sort of state index for SC outputs? Current Alias Outputs have it.
* Reentrancy scenarios should be considered. This is a problem on Ethereum and I am a big fan of security by default.
* Foundries are going to make validation more complex than shown here, since we'd have to check whether additional tokens were minted or melted. Ideally, you'd also want smart contracts that can mint or melt tokens.
* While token transfers are fully implicit in this proposal, we could also require explicit transfers between environments via calls (and possibly also in return statements).
* Obviously, there would be a per-instruction mana cost to prevent spam.
  * This goes for all three stages of the transaction (loading, executing, writing).
* The size of a single output needs to be limited to prevent attacks, as it needs to be fully read and written (assuming it's not read-only) to make a transaction.
  * SC developers should therefore make use of state splitting to reduce resource usage per transaction (Bonus: This lowers mana costs for users). There should probably be a native library for storage management.
  * An approach to this might be to make the storage deposit exponential after a certain size.
    For example, the first 10000 vBytes could cost 100 glow/vByte, the next 10 vBytes would cost 101 each, then 102, etc.
* Without any extra precaution, the sequencer could also face contested states. A solution to this might be to only allow calls from the sequencer or to rate-limit other users. For example, one could store the last call time and reject any transitions within 120 seconds - unless they come from the sequencer.
* The sequencer would also need some sort of incentive. After all, it has to pay the mana to run. In addition, it needs to validate the transactions it includes and might face spam attacks in this phase.
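As a rough illustration of the tiered storage deposit mentioned in the list above, the following Python sketch computes the total deposit for a given output size. The function name and parameters are hypothetical, and the defaults are simply the example numbers from the text (10000 vBytes at 100 glow, then the price rises by 1 glow for every further 10 vBytes). Note that these numbers describe a price that grows by a fixed step per tier, so the sketch follows them rather than a strictly exponential curve.

```python
def storage_deposit(vbytes: int, base_size: int = 10_000,
                    base_price: int = 100, step_size: int = 10) -> int:
    """Total deposit in glow for an output of `vbytes` virtual bytes."""
    if vbytes < 0:
        raise ValueError("size must not be negative")
    # Flat part: everything up to base_size costs base_price per vByte.
    flat = min(vbytes, base_size)
    total = flat * base_price
    # Tiered part: tier k (k = 1, 2, ...) covers step_size vBytes
    # at a price of base_price + k each.
    full_tiers, rest = divmod(vbytes - flat, step_size)
    total += step_size * (full_tiers * base_price
                          + full_tiers * (full_tiers + 1) // 2)
    # Partially filled last tier.
    total += rest * (base_price + full_tiers + 1)
    return total


small = storage_deposit(10_000)    # only the flat part
one_tier = storage_deposit(10_010)  # flat part plus 10 vBytes at 101 glow
partial = storage_deposit(10_015)   # ... plus 5 vBytes at 102 glow
```

Whether the price should grow per tier like this or multiply (a truly exponential curve) is an open design choice. Either way the point is the same: very large single states become disproportionately expensive, which nudges developers towards state splitting.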