Stardust formal verification

# Stardust formal verification The end goal of this work is to prove that our validation rules for transactions, inputs, outputs and unlock blocks are enough to keep the UTxO ledger in a "valid" state. __Goal__: the ledger stays "valid" as long as the validations rules are enforced __Steps__: - Define a formal model of IOTA ledger solely based on the RFCs - allows us to detect any issue or unclear content that may appear in the RFCs - Define what is a "valid" ledger - Prove that if the validation rules are enforced when adding new transactions then the ledger stays "valid" ## The formal model The formal model is being written with the Agda proof assistant. Agda was chosen because there is existing work in Agda formalizing the EUTxO of Cardano. Our model is build from this work and is accessible on Github: https://github.com/iotaledger/utxo-formal-specs ## Overview of the formal model A simple overview of the formal model, some parts may have been simplified or were omitted for simplicity. ### Output We present solely the SimpleOutput (or ExtendedOutput without feature blocks) for clarity purpose. Possesses an address that can spend the output and the amount of iota it holds. $$ Output: \left\{ \begin{array}{l} ~address: Address \\ ~iotas: \mathbb{N} \\ \end{array} \right. $$ ### Output Reference An output is referenced with the hash of its origin transaction and its index in the transaction. $$ OutputRef: \left\{ \begin{array}{l} ~transactionId: HashId \\ ~index: \mathbb{N} \\ \end{array} \right. $$ ### Input Input is just a reference to an unspent output $$ Input: OutputRef $$ ### Transaction We only concern ourselves with A transaction contains a list of inputs, outputs and unlock blocks $$ Transaction: \left\{ \begin{array}{l} id: HashId \\ inputs: List(Input) \\ outputs: List(Output) \\ unlockBlocks: List(UnlockBlock) \end{array} \right. $$ ### Ledger State The state of the ledger is a set of output references. These output must be unspent. $$LedgerState: Set(OutputRef)$$ ### Ledger A ledger contains an initial state, current state and a list of transactions. These transactions are the list of chronogically ordered transactions that have been accepted by the ledger. $$Ledger: \left\{ \begin{array}{l} initState: LedgerState \\ currentState: LedgerState \\ transactions: List(Transaction) \end{array} \right. $$ We have a function $step$ which computes the next ledger state from an initial state and a transaction. $$ step: LedgerState \rightarrow Transaction \rightarrow LedgerState$$ The ledger current state has to be derived from applying $step$ to the initial state using the list of transactions. $$ \forall (l: LedgerState),~~~~ l.currentState = fold(l.transactions, l.initState, step)$$ ### Note Current the model is sequential, it behaves as if we have a single node in the network. Therefore all behaviours related to data race, consensus... are abstracted away and cannot be reasoned with. Howerer our goal is to focus on validity of the ledger ensured by the validation rules. Nodes may have different views about the ledger but as long as their views are "valid" then our model is correct. Issues concerning the synchronization of these views is up to the consensus mechanism and is out of scope. ## What is a "valid" Ledger The goal is prove that our model respects the properties listed under. This is still work in progress, some properties may be false or the formulation may be wrong. Feel free to add/comment any properties that you think we would like to have for the ledger, or would be interested to check if that's true. ### Approach __Goal__: prove that our validation rules are enough to keep the ledger valid __Steps__: 1. Define what is a valid ledger 2. Define what are valid transactions, inputs, outputs 3. Prove that if we a valid ledger $l_1$ to which we apply a valid transaction $tx$ then the updated ledger $l_2$ is also valid: $$ \forall (l_1~l_2: Ledger), \left( \begin{array}{l} &Valid(l_1) \\ \implies &Valid(tx) \\ \implies &l_2 = \left\{ \begin{array}{l} initState = l_1.initState \\ currentState = step(l_1.currentState, tx) \\ transactions = append(l_1.transactions, tx) \end{array} \right\} \\ \implies &Valid(l_2) \end{array} \right) $$ 4. Prove that our validation rules are sufficient to only permit application of valid transactions ### Validity of the ledger Here we present a set of properties which defines what is a valid ledger. These properties come from my understanding of UTxO. Hence I may have ommited certain properties or you may have a different opinions (please share your thoughts) #### Constant amount of iotas The total amount of IOTA is constant in the ledger $$ \forall (l: Ledger),~~ total\_iotas(l.initState) = total\_iotas(l.currentState)$$ #### No Double Spend A ledger should not have 2 inputs referencing the same outputs. Any inputs $i_1$ and $i_2$ in the ledger which are not in the same transaction and at the same index cannot be equal (or cannot refer the same output). $$\forall (l: Ledger)(i_1~i_2: Input), \left( \begin{array}{l} &\left( \begin{array}{c} i_1 \in l.transactions.all\_inputs() \\ \land \\ i_2 \in l.transactions.all\_inputs() \end{array} \right) \\ \implies& input\_location(i_1, l) \neq input\_location(i_2, l) \\ \implies& i_1 \neq i_2 \end{array} \right) $$ With $input\_location$ a function giving us the $(TransactionId,~index)$ of the input. _Note_: My intuition tells me that "Constant amount of iotas" implies no double spend since double spend add to the total amount of iotas but not 100% sure, I will need to prove it #### Progress This property expresses that the ledger cannot get stuck. If a ledger initial state is not "locked" then there always exists a transaction which can "advance" the leger current state. $$\forall (l: Ledger), ~~ \neg Locked(l.initState) \implies \neg Locked(l.currentState)$$ _Definition of a locked state_: - a locked state is a state from which there exists no transactions that can be applied to it $$ \forall(s_1: LedgerState), \left( \begin{array}{c} Locked(s) \\ \iff \\ \nexists(s_2: LedgerState)(tx: Transaction),~~ step(s_1, tx) = s_2 \end{array} \right) $$ - for example a state where there are no unspent outputs - are there other examples? like a circular dependency between outputs #### Locality This property expresses the fact that the ledger can handle transactions "locally". For any transactions $tx_1$ and $tx_2$, if $tx_1$ and $tx_2$ are "independent" from each other in ledger $l$ then stepping with $tx_1$ then $tx_2$ results into the same ledger state as stepping with $tx_2$ then $tx_1$ $$\forall (l: Ledger) (tx_1~tx_2: Transaction),~~ Independent(tx_1, tx_2, l) \\ \implies \forall (s: LedgerState), ~~ step((step(s, tx_1), tx_2)) = step(step(s, tx_2), tx_1)$$ _Definition of independent transactions_ - none of one transaction inputs refer to any of the other transaction outputs $$ \forall(l: Ledger)(tx_1~tx_2: Transaction), \\ Independent(tx_1, tx_2, l) \\ \iff \\ \forall (i: Input) (o: Output), \left( \begin{array}{c} \left( \begin{array}{c} (i \in tx_1.inputs) \\ \land \\ (o \in tx_2.outputs) \end{array} \right) \implies (i \neq output\_location(o, l)) \\ \land \\ \left( \begin{array}{c} (i \in tx_2.inputs) \\ \land \\ (o \in tx_1.outputs) \end{array} \right) \implies (i \neq output\_location(o, l)) \end{array} \right) $$ _Note_: this property is inherent to UTxO and nice to have but may be not "usable". Indeed an UTxO in the form of a chain is "valid" but does not have any independent transactions, therefore we don't enjoy any benefits from the locality property. #### Security We want to prove that our specifications are not vulnerable to known attacks: - Sybil attack: on consensus, not in this model - Eclipse attack: on consensus, not in this model - Replay attack: a ledger is secure against replay attacks if for any transaction that has already been processed before then it cannot be processed again $$ \forall (l: Ledger) (tx_1~tx_2: Transaction),~~ \left( \begin{array}{c} tx_1 \in l.transactions \\ \land \\ tx_2 \in l.transactions \end{array} \right) \implies tx_1 \neq tx_2 $$ #### Correctness I believe that we can have a property which expresses that an address $a_1$ cannot steal funds from another address $a_2$, but I have not formalized any crypto yet: what is an address, a hash, ... This property will rely on assumptions on cryptographic primitives (for example: it's impossible to find a collision for a hash). ### Validity of transactions, inputs and output Properties of the transactions, inputs, outputs should be sufficient to prove the above properties of the ledger. To prove that the properties of transactions, inputs and outputs (below) are true we will implement the validation rules of the RFC and prove that they enforce the following properties. #### Transactions - __Constant amount of iotas__: iotas in inputs - iotas in outputs = 0 - __Locality__: validation rules can be executed only with local information - __Deterministic__: validation rules are deterministic - __Termination__: validation computation cannot be stuck - __Unlocked inputs__: all inputs are unlocked for a transaction to be accepted #### Inputs - __No double spend__: an input always refer to a unique unspent output ### Outputs - __Progress__: there exist at least one environment (set of conditions that do not contradict with one another) under which an output can be consumed ## Future work - Add simple version on crypto with assumptions (non-probabilistic model) - Add notion of time to handle outsputs with expiration time - Add notion of tokens/nft - Extract an executable of the validation rules (which will have been proven to be enough to guarantee the properties of the ledger) and try to run them with tests of Bee/Hornet