Notes to "How to Write Formally Verifiable Specifications in Python"

Partial notes to How to Write Formally Verifiable Specifications in Python

The talk slides

Problem: Mutable objects

Due to aliasing, writing to a reference may implicitly modify values obtained via other references. In the F/V context, it invalidates what we know about affected memory locations. Tracking aliasing and re-establishing properties can significantly complicate reasoning.

Solution

Use immutable objects (non-destructive updates). Can be facilitated by pyrsistent library.

Concerns:

non-destructive updates are less readable, especially in case of nested updates (example)
MyPy typechecking fails for PRecord objects (can be fixed with a custom MyPy plugin)

Problem: Python semantics is quite complex and not formally defined

Solution

Use a very limited subset of Python

if
for
function calls (not method calls)
lambdas

Comments from @ericsson49:

Python closures are late binding, which can be a problem when translating lambdas (and inner functions) to F/V languages. One solution is to allow access to immutable outer variables only
method calls can be allowed too, e.g. when it doesn't involve virtual method dispatch
- pset_add(some_set, some_value) can be replaced with some_set.add(some_value)
while loops should be okay too. In practice, recursion is already present in the 3SF specs, so why exclusing while loops?
walrus operator (a := expr) is a big problem for transpilation, though it's not used in specs currently
an additional suggestion for lambdas is to use them only, when they are immediately inlineable (e.g. map, filter, sorted, max/min and similar)
list/set comprehnsions should be okay too (can desugar them to map/filter, albeit it can affect semantics in edge cases (e.g. when using walrus operator))

Problem: Significant parts of the specs are in natural language

Solution (for "honest validator" part)

Specify the “honest validator” behaviour in code.

A comment from @ericsson49:

in general, specs may have non-deterministic aspects. But it conflicts with executability of specs (which is a highly desireable property). This might be a reason for specifying parts of specs in prose.

That suggests a new challenge: how to specify non-determinism in python, but keep specs "executable" (e.g. for test generation).

LLMs and CNLs

Were discussing Large Language Models and Controlled Natural Language as potential ways to mitigate the problem.

E.g. use LLM to translate prose to a formal language (can be challenging). Or re-write prose using a formally defined controlled language (challenging too).

Some links:
SAD and ForTheL
Naproche

Problem: Mixing concerns for external behaviour with concerns for efficiency

Solution

2 specs at 2 different level of abstractions: high level vs lower-level specs

A comment from @ericsson49

The high level spec doesn't have to be deterministic, which allows to express more spec parts formally (opposed to natural language), e.g. network layer related specs. A non-deterministic spec can still be used to check traces (and perhaps even to sample test vectors).

Notes to "How to Write Formally Verifiable Specifications in Python"

Problem: Mutable objects

Solution

More involved solution

Problem: Python semantics is quite complex and not formally defined

Solution

Problem: Significant parts of the specs are in natural language

Solution (for "honest validator" part)

LLMs and CNLs

Problem: Mixing concerns for external behaviour with concerns for efficiency

Solution

Read more

Fork Choice compliance testing (initial phase)

2024 Research topics

Fork Choice implementation testing framework

Pyspecs to Dafny transpiler