- We want to present two plans, a short-term one and a long-term one
- Short-term plan: build on MR!2439
- It works! Including CI and a little reporting app: https://benchmarks.marigold.dev
- Tests are easy to add but a bit tedious to write (they are written in an OCaml DSL; see the sketch at the end of this section). They can be transitioned to the TZS format relatively easily.
- Code is unlikely to get merged in, for multiple good reasons:
- Introduces new dependencies on core and core_bench
- The code is not up to the high standards of the protocol. Duplicates code, uses exceptions with abandon, etc.
- We suggest that Marigold maintain a separate fork of the codebase that applies the benchmark patch on top of every MR to tezos/tezos.
- This avoids the downsides above while providing very fast feedback (runtime of the benchmarks plus <2 min of overhead on most commits)
- There is an existing MR to pull Michelson into its own lib. Once that is merged we can consider making the code more robust and clean.
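To give a feel for why the tests are easy to add but tedious to write, here is a minimal sketch of a core_bench-style benchmark case in plain OCaml. This is not the actual DSL from MR!2439; `run_script`, the inline Michelson, and the other names are hypothetical stand-ins.

```ocaml
(* Minimal sketch of a core_bench-style benchmark case.
   [run_script] is a hypothetical stand-in for whatever entry point
   the benchmark DSL uses to type-check and run a Michelson script. *)
open Core_bench

(* Hypothetical: parse, type-check and interpret a script against a storage/parameter pair. *)
let run_script ~script ~storage ~parameter =
  ignore (script, storage, parameter)

let bench_add =
  Bench.Test.create ~name:"interp: simple ADD contract" (fun () ->
      run_script
        ~script:
          "{ parameter int ; storage int ; code { UNPAIR ; ADD ; NIL operation ; PAIR } }"
        ~storage:"0"
        ~parameter:"42")

let () = Bench.bench [ bench_add ]
```

In the TZS world, the script, storage, and parameter would come from a protocol-independent text file rather than being embedded in OCaml by hand.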
- Long-term plan: build a new framework based on a proposed TZS format
- Yann has noticed a duplication of effort around benchmarking, unit testing of Michelson, formal verification of the protocol, and replaying chain history.
- What's needed is a single, protocol-independent, text-based encoding of transaction sequences and expected outputs
- This could be used for many purposes:
- Constructing contexts and transactions for the benchmarking suite
- Crowdsourcing contracts/transactions that are important to the community via Agora, etc.
- Replaying the history of the chain with a modified protocol
- Replaying the chain is extremely difficult
- It takes 3 days to build an archive node, and it's 90GB of data
- You have to apply migration functions. So if you want to run Florence on Carthage transactions, you have to apply all the migration functions from Carthage to Florence.
- We could precompute this, but that would result in a quadratic blow-up in the size of the data: each historical segment would need a migrated copy for every later protocol, so storage grows with the square of the number of protocol versions.
- A separate TZS interpreter will greatly ease certain kinds of data analysis on the chain (like which instructions consume the most gas)
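As a rough illustration of the kind of analysis a standalone TZS interpreter would make easy, here is a sketch that totals gas consumption per instruction from an execution trace. The `trace_entry` type and the idea of the interpreter emitting such a trace are assumptions for illustration, not an existing API.

```ocaml
(* Sketch: given an execution trace of (instruction, gas consumed) pairs
   produced by a hypothetical TZS interpreter, total the gas per instruction.
   The trace representation here is an assumption for illustration only. *)
type trace_entry = { instr : string; gas : int }

let gas_per_instruction (trace : trace_entry list) : (string * int) list =
  let tbl = Hashtbl.create 64 in
  List.iter
    (fun { instr; gas } ->
      let current = try Hashtbl.find tbl instr with Not_found -> 0 in
      Hashtbl.replace tbl instr (current + gas))
    trace;
  (* Sort the heaviest consumers first. *)
  Hashtbl.fold (fun instr gas acc -> (instr, gas) :: acc) tbl []
  |> List.sort (fun (_, g1) (_, g2) -> compare g2 g1)

(* Example: gas_per_instruction
     [ { instr = "ADD"; gas = 10 }; { instr = "MUL"; gas = 25 }; { instr = "ADD"; gas = 12 } ]
   returns [ ("MUL", 25); ("ADD", 22) ]. *)
```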
- Unit testing Michelson (see Raphael's posts on TZT, of which we consider TZS a generalization)
- Currently we have the Python integration tests, my ad hoc regression-test framework in the benchmark suite, and a third testing framework by Arvid (at least).
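For context, a TZT case is a small text file pairing an instruction with an input stack and its expected output stack, roughly of the following shape (quoted from memory, so treat the exact syntax as approximate). TZS would generalize this from single-instruction stack tests to whole transaction sequences with expected outputs.

```
code { ADD } ;
input { Stack_elt int 5 ; Stack_elt int 7 } ;
output { Stack_elt int 12 }
```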
- Discovering, authoring, and enforcing invariants about the protocol
- Arthur and others have pointed out that even non-breaking changes can produce slightly different chain output.
- In the wild, outputs are huge - too large to validate by hand.
- Thus we need a way of relaxing the constraints from exact equality of outputs to weaker properties that can still be checked automatically.
- Constraints should be compositional: composing constraints should result in their logical conjunction.
- Possible solution: use Prolog-like rules and variable unification to express constraints (a minimal sketch appears at the end of these notes).
- This is quite convenient because it has the compositional properties we want while also matching the natural-deduction style the formal verification team is used to working in.
- This gives people a common language to write rules down, verify them against the chain's history, communicate them to protocol developers and users, and test new versions of the protocol against.
- There might be some magic we can do around automatic inference of invariants.
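To make the compositionality requirement concrete, below is a minimal sketch that assumes nothing about the eventual rule language: constraints are checks over a substitution of logic variables, unification extends the substitution, and composing two constraints is exactly their logical conjunction.

```ocaml
(* Minimal sketch of compositional, Prolog-style constraints.
   Everything here is illustrative; the real rule language is still to be designed. *)

(* Terms are either logic variables or concrete values (here, just ints and strings). *)
type term =
  | Var of string
  | Int of int
  | Str of string

(* A substitution maps variable names to terms. *)
type subst = (string * term) list

let rec walk (s : subst) (t : term) : term =
  match t with
  | Var v -> (match List.assoc_opt v s with Some t' -> walk s t' | None -> t)
  | _ -> t

(* Unify two terms, extending the substitution on success. *)
let unify (s : subst) (a : term) (b : term) : subst option =
  match (walk s a, walk s b) with
  | Var v, Var w when v = w -> Some s
  | Var v, t | t, Var v -> Some ((v, t) :: s)
  | Int x, Int y when x = y -> Some s
  | Str x, Str y when x = y -> Some s
  | _ -> None

(* A constraint takes a substitution and either fails or returns an extended one. *)
type constr = subst -> subst option

let equal a b : constr = fun s -> unify s a b

(* Composing constraints is logical conjunction: both must hold, sharing bindings. *)
let conj (c1 : constr) (c2 : constr) : constr =
 fun s -> match c1 s with Some s' -> c2 s' | None -> None

(* Example invariant: "the balance after the operation equals the balance before it,
   and the operation status is applied" -- variables are shared across constraints. *)
let invariant : constr =
  conj
    (equal (Var "balance_before") (Var "balance_after"))
    (equal (Var "status") (Str "applied"))

(* Check the invariant against one (hypothetical) observed result. *)
let () =
  let observed =
    [ ("balance_before", Int 100); ("balance_after", Int 100); ("status", Str "applied") ]
  in
  match invariant observed with
  | Some _ -> print_endline "invariant holds"
  | None -> print_endline "invariant violated"
```

The real system would of course need richer terms (operation results, balances, Micheline values) and an occurs check, but the shape is the point: rules that share variables and compose by conjunction.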