Try   HackMD

Notes to "How to Write Formally Verifiable Specifications in Python"

Partial notes to How to Write Formally Verifiable Specifications in Python

The talk slides


Problem: Mutable objects

Due to aliasing, writing to a reference may implicitly modify values obtained via other references. In the F/V context, it invalidates what we know about affected memory locations. Tracking aliasing and re-establishing properties can significantly complicate reasoning.

Solution

Use immutable objects (non-destructive updates). Can be facilitated by pyrsistent library.

Concerns:

  • non-destructive updates are less readable, especially in case of nested updates (example)
  • MyPy typechecking fails for PRecord objects (can be fixed with a custom MyPy plugin)

More involved solution

Allow mutatable objects, but only when it's "safe" (e.g. allow only exclusive mutable references, similar to Rust's Mutable references).

And then mechanically transform destructive updates to non-destructive ones (see “Mutable Forest” Memory Model for blockchain specs for more details).

E.g.

node_state.buffer_blocks[block_hash(block)] = block

becomes

node_state = node_state.set(
        buffer_blocks=pmap_set(
            node_state.buffer_blocks,
            block_hash(block),
            block)
    )

assuming that the node_state is an exclusive reference.

Concerns:

  • non-trivial to implement
  • semantics might be "surprising" to Python users

Problem: Python semantics is quite complex and not formally defined

Solution

Use a very limited subset of Python

  • if
  • for
  • function calls (not method calls)
  • lambdas

Comments from @ericsson49:

  • Python closures are late binding, which can be a problem when translating lambdas (and inner functions) to F/V languages. One solution is to allow access to immutable outer variables only
  • method calls can be allowed too, e.g. when it doesn't involve virtual method dispatch
    • pset_add(some_set, some_value) can be replaced with some_set.add(some_value)
  • while loops should be okay too. In practice, recursion is already present in the 3SF specs, so why exclusing while loops?
  • walrus operator (a := expr) is a big problem for transpilation, though it's not used in specs currently
  • an additional suggestion for lambdas is to use them only, when they are immediately inlineable (e.g. map, filter, sorted, max/min and similar)
  • list/set comprehnsions should be okay too (can desugar them to map/filter, albeit it can affect semantics in edge cases (e.g. when using walrus operator))

Problem: Significant parts of the specs are in natural language

Solution (for "honest validator" part)

Specify the “honest validator” behaviour in code.

A comment from @ericsson49:

in general, specs may have non-deterministic aspects. But it conflicts with executability of specs (which is a highly desireable property). This might be a reason for specifying parts of specs in prose.

That suggests a new challenge: how to specify non-determinism in python, but keep specs "executable" (e.g. for test generation).

LLMs and CNLs

Were discussing Large Language Models and Controlled Natural Language as potential ways to mitigate the problem.

E.g. use LLM to translate prose to a formal language (can be challenging). Or re-write prose using a formally defined controlled language (challenging too).

Some links:
SAD and ForTheL
Naproche


Problem: Mixing concerns for external behaviour with concerns for efficiency

Solution

2 specs at 2 different level of abstractions: high level vs lower-level specs

A comment from @ericsson49

The high level spec doesn't have to be deterministic, which allows to express more spec parts formally (opposed to natural language), e.g. network layer related specs. A non-deterministic spec can still be used to check traces (and perhaps even to sample test vectors).