Try   HackMD

Fully non-deterministic Noir

Afterthought

After writing up this spec, I would also like to question/justify the idea as to whether we need this. My concerns are around file access for things like state management and if there are any cases where a user may, due to ignorance allow one to execute arbitrary code in a non-sandboxed way

Problem statement

Introduction

Noir is a domain specific language for writing circuits. Non-deterministic behaviour is useful as they allow you to prove statements in a more efficient way. For example, when doing an inverse, one can either deterministically use a inversion algorithm, or non-deterministically supply the inverse and verify that it is the inverse, since we know that the inverse of a number multiplied by that number equals 1, except 0. The same applies for other operations like square root.

Another form of non-determinism is state fetching.

In general we characterise non-determinism as a method to fetch some value without needing that value to be constrained to any other value in the circuit.

This is ofcourse dangerous because I (Prover) could fetch a random value and if I do not apply the correct constraints to said value to ensure that it indeed the honest value, then this value would be known as unconstrained or mis-constrained.

Noir

In Noir, we would like to introduce this concept of non-determinism by allowing users to manipulate values and or fetch random state values. This document will specify a strategy for solving this problem.

Structure

We will split this document into two sections:

Section 1: How we will modify ACIR/ACVM

Section 2: How we will modify Noir

ACVM/ACIR Modifications

Directives

Directives in ACVM allow us to do very specific defined non-deterministic behaviour.

Inversion Directive

Lets look at the inversion directive as an example.

Directive::Inverse{input_index : WitnessIndex, output_index : WitnessIndex}

Since ACVm holds a map from witness indices to field elements. Processing this directive would involve

  • Looking up the input_index in the map and getting the concrete value.
  • Computing an inversion, then placing the inversed value into the map under the output_index key.

Known Directives

We have a few other directives which ACVM apriori knows how to deal with, we call these known directives. Another example of a known directive, is our bit decomposition directive.

This document does not propose to change these, instead we will add a new directive type.

General directives

These are directives which generalise the operation being done. We could remove known directives and only use general directives however, this way allows you to quickly figure out if an ACIR uses only known directives.

The proposed ACVM format for a general directive is:

Directive::general(name: String, input_indices : Vec<WitnessIndex>, output_indices : Vec<WitnessIndex>)

It is easy to see how these generalise known directives as one could replace the inversion directive by passing name = "inversion" and then one element each to the input and output indices.

Implementation of General Directives

In the case of known directives, ACVM knew how to handle it. However, for general directives, the implementation of them is not known. So if I pass the name compute_foo as a name to the general directive, it is not clear where one should get the implementation details of compute_foo.

For this we suggest adding a new map to ACVM, namely one that maps strings to the location of the implementation of the directive.

ie. Map<String, LocationOnDisk>

Hence, we map function names to the location of their source code on disk. This map can be called the general directive binary map.

There is a somewhat open question here as to what that source code is written in.

We have initially chosen C code due to C having a stable ABI. After conversations with Mike, WASM is also a viable option. WASM is a good option because we can sandbox code and simply disallow operations like file access.

The C/WASM code

We expect the binary to expose one method, namely:

fn call_functions(name : String, inputs : Vec<Field>, outputs : Vec<Field>)

Notice the similarities with the definition of the general directives. Now, this call_functions method may not look exactly like this in C code, due to us needing to pass a pointer and the length of the vector, but it is close enough and that is an implementation detail.

Note that we have multiple call_functions methods. For each General::Directive we encounter, we need to go into the general directives map and check for the location of the corresponding binary.

Code Blocks

General directives are expressive, however they come with friction since their implementations need to be in some external file in C/WASM.

We therefore propose another strategy to inline code into blocks. A new directive is needed:

Directive::code(inputs : Vec<WitnessIndex>, outputs : Vec<WitnessIndex>, code_block: String)

The question is then, what is the contents of code_block. Theoretically, we can make this any language and have acvm interpret it. We need to devise a strategy to copy the inputs into the parts of the code that needs it. The naive approach is as follows:

  • Enforce that variable declarations in the code cannot start with an underscore.
  • Enforce that referencing any of the inputs should start with an underscore. Hence _0 would reference the first input.
  • Tokenise the code and replace all underscores with their repspective witness values
  • Run an interpreter on the code_block
  • Save necessary values (somehow) to their output indices we may need to enforce syntax in the code_block to allow us to detect this

Summary

We have now defined two ways that allow ACVM to execute code blocks which do not apply constraints.

C Code/WASM

In one strategy, the code block is an external C file, which is extremely dangerous because one could download code that does arbitrary code execution. Using a WASM file mitigates most of this since it is sandboxed. I would personally lobby for only using wasm or sandboxed environments, however this means that one cannot easily do database fetches.

Lets investigate the last claim.
We should also investigate in what context will one need these functions.

Inline Code blocks

The second strategy is not as powerful, however it is also less dangerous, because one can see the code that is being ran inside of the Noir function; there is no external dependency. Also depending on how we do it, we may also have a lot of control over what is allowed in those blocks.

Noir

Now we specify what changes are needed on the Noir level in order to allow developers to use the newly available ACVM directive types.

Known Directives

In order to access known directives, we use a builtin_directive attribute:

#[builtin_directive("inverse")]
fn inverse(input: Field) -> Field;

How it is compiled

The above is used when we want to use a directive that ACVM knows about. In the compiler, noir will simply create a Directive::Inverse directive. This is similar to what is already happening in Noir. The difference being that, previously developers were not able to access these builtin directives.

Extern Libraries

An external library will declare a set of non-deterministic functions in a noir header file, alongside a C library which will specify how acvm should handle these non-deterministic functions in Noir.

These external libraries will be what we pass to acvm when it tries to find the implementation for a general directive.

We enforce the folder structure to be the following:

As mentioned, the header.nr file will hold extern functions, which are declared using an extern keyword.

An example of this is:

extern fn foo(a : Field, b : [Field;10]) -> Field;

The extern keyword is used to denote that this functions implementation is external to the noir language and that it is code that does not create constraints.

Calling Extern functions

Nargo

// Nargo.toml
my_extern_lib = {git = "github.com/foo/extern", extern = true}

First we add this external library as a dependency in our nargo.toml, with a boolean flag to state that it is indeed an extern library. We will see that this flag will then allow us to collect metadata that we will need later on.

Module Code

In ones source file, you would write the following:

// main.nr
use my_extern_lib;

fn main(a : Field) {
    let c = unsafe {my_extern_lib::foo(a)};
}

Note the usage of the unsafe keyword, since calling this method does not apply any constraints, we want this to be a deliberate choice by the user.

How it is compiled

Directive::general{
    name: "my_extern_lib::foo"
    inputs : vec![a, b[0], b[1], b[2], .., b[9]]    
    outputs : vec![c]
}

In the above code, the directive name is standardised to be the package name concatenated with the method name. We will only allow a single header extern file, so its not possible to have {package_name}::{module_name}::{function_name}.

Linking external libraries to locations on disk

We now focus on the fact that we need to supply a map that links the names in general directives to locations on Disk.

The idea is as follows, in our nargo.toml, we have specified whether a library is either extern or not. Since nargo also knows where the packages are stored, we can create a map from the package name to the location on Disk.

The overall communication between nargo and acvm would look like the following:

nargo_meta = Map<String, LocationOnDisk>
location = nargo_meta[{package_name}]
call_library(field_type, lib_path, inputs, len_input,&mut outputs, len_output);

Notes

The rough draft that we used to play around with ideas is stored here for prosperity: https://hackmd.io/uaMDShRYRtCVQeOwOY-syg?both