# Notes to "How to Write Formally Verifiable Specifications in Python"
Partial notes to [How to Write Formally Verifiable Specifications in Python](https://hackmd.io/@0g8QuqEeQBe45CC8toURGw/H1WXgeCg1l)
The talk [slides](https://docs.google.com/presentation/d/1xtKPqN9KnMnIZfbl-7A7C7L9-ynJKVlpmmtD4UfMVtQ/edit#slide=id.g3155c2bb5f7_0_0)
---
# Problem: Mutable objects
Due to [aliasing](https://en.wikipedia.org/wiki/Aliasing_(computing)), writing to a reference may implicitly modify values obtained via other references. In the F/V context, it invalidates what we know about affected memory locations. Tracking aliasing and re-establishing properties can significantly complicate reasoning.
## Solution
Use immutable objects (non-destructive updates). Can be facilitated by [pyrsistent](https://github.com/tobgu/pyrsistent/) library.
Concerns:
- non-destructive updates are less readable, especially in case of nested updates ([example](https://github.com/saltiniroberto/ssf/blob/ad3ba2c21bc1cd554a870a6e0e4d87040558e129/high_level/protocols/rlmd/3sf_high_level.py#L187))
- MyPy typechecking fails for `PRecord` objects (can be fixed with a custom [MyPy plugin](https://github.com/saltiniroberto/ssf/blob/ad3ba2c21bc1cd554a870a6e0e4d87040558e129/my_plugin.py))
## More involved solution
Allow mutatable objects, but only when it's "safe" (e.g. allow only exclusive mutable references, similar to Rust's [Mutable references](https://doc.rust-lang.org/book/ch04-02-references-and-borrowing.html#mutable-references)).
And then mechanically transform destructive updates to non-destructive ones (see [“Mutable Forest” Memory Model for blockchain specs](https://ethresear.ch/t/mutable-forest-memory-model-for-blockchain-specs/10882) for more details).
E.g.
```
node_state.buffer_blocks[block_hash(block)] = block
```
becomes
```
node_state = node_state.set(
buffer_blocks=pmap_set(
node_state.buffer_blocks,
block_hash(block),
block)
)
```
assuming that the `node_state` is an exclusive reference.
Concerns:
- non-trivial to implement
- semantics might be "surprising" to Python users
---
# Problem: Python semantics is quite complex and not formally defined
## Solution
Use a very limited subset of Python
- `if`
- `for`
- function calls (not method calls)
- `lambda`s
Comments from @ericsson49:
- Python closures are [late binding](https://docs.python-guide.org/writing/gotchas/#late-binding-closures), which can be a problem when translating lambdas (and inner functions) to F/V languages. One solution is to allow access to immutable outer variables only
- method calls can be allowed too, e.g. when it doesn't involve virtual method dispatch
- `pset_add(some_set, some_value)` can be replaced with `some_set.add(some_value)`
- `while` loops should be okay too. In practice, recursion is already present in the 3SF specs, so why exclusing `while` loops?
- [walrus operator](https://peps.python.org/pep-0572/) (`a := expr`) is a big problem for transpilation, though it's not used in specs currently
- an additional suggestion for lambdas is to use them only, when they are immediately inlineable (e.g. `map`, `filter`, `sorted`, `max/min` and similar)
- list/set comprehnsions should be okay too (can desugar them to `map/filter`, albeit it can affect semantics in edge cases (e.g. when using walrus operator))
---
# Problem: Significant parts of the specs are in natural language
## Solution (for "honest validator" part)
Specify the “honest validator” behaviour in code.
A comment from @ericsson49:
> in general, specs may have non-deterministic aspects. But it conflicts with executability of specs (which is a highly desireable property). This might be a reason for specifying parts of specs in prose.
>
> That suggests a new challenge: how to specify non-determinism in python, but keep specs "executable" (e.g. for test generation).
## LLMs and CNLs
Were discussing [Large Language Models](https://en.wikipedia.org/wiki/Large_language_model) and [Controlled Natural Language](https://en.wikipedia.org/wiki/Controlled_natural_language) as potential ways to mitigate the problem.
E.g. use LLM to translate prose to a formal language (can be challenging). Or re-write prose using a formally defined controlled language (challenging too).
Some links:
[SAD](http://nevidal.org/sad.en.html) and [ForTheL](http://nevidal.org/download/forthel.pdf)
[Naproche](https://naproche.github.io/)
---
# Problem: Mixing concerns for external behaviour with concerns for efficiency
## Solution
2 specs at 2 different level of abstractions: high level vs lower-level specs
A comment from @ericsson49
> The high level spec doesn't have to be deterministic, which allows to express more spec parts formally (opposed to natural language), e.g. network layer related specs. A non-deterministic spec can still be used to check traces (and perhaps even to sample test vectors).