# Fully non-deterministic Noir ## Afterthought After writing up this spec, I would also like to question/justify the idea as to whether we need this. My concerns are around file access for things like state management and if there are any cases where a user may, due to ignorance allow one to execute arbitrary code in a non-sandboxed way ## Problem statement **Introduction** Noir is a domain specific language for writing circuits. Non-deterministic behaviour is useful as they allow you to prove statements in a more efficient way. For example, when doing an inverse, one can either deterministically use a inversion algorithm, or non-deterministically supply the inverse and verify that it is the inverse, since we know that the inverse of a number multiplied by that number equals 1, except 0. The same applies for other operations like square root. Another form of non-determinism is state fetching. In general we characterise non-determinism as a method to fetch some value without _needing_ that value to be constrained to any other value in the circuit. This is ofcourse dangerous because I (Prover) could fetch a random value and if I do not apply the correct constraints to said value to ensure that it indeed the honest value, then this value would be known as unconstrained or mis-constrained. **Noir** In Noir, we would like to introduce this concept of non-determinism by allowing users to manipulate values and or fetch random state values. This document will specify a strategy for solving this problem. ## Structure We will split this document into two sections: **Section 1**: How we will modify ACIR/ACVM **Section 2**: How we will modify Noir # ACVM/ACIR Modifications ## Directives Directives in ACVM allow us to do very specific defined non-deterministic behaviour. ### Inversion Directive Lets look at the inversion directive as an example. ```rust Directive::Inverse{input_index : WitnessIndex, output_index : WitnessIndex} ``` Since ACVm holds a map from witness indices to field elements. Processing this directive would involve - Looking up the `input_index` in the map and getting the concrete value. - Computing an inversion, then placing the inversed value into the map under the `output_index` key. ## Known Directives We have a few other directives which ACVM apriori knows how to deal with, we call these known directives. Another example of a known directive, is our bit decomposition directive. This document does not propose to change these, instead we will add a new directive type. ## General directives These are directives which generalise the operation being done. We could remove known directives and only use general directives however, this way allows you to quickly figure out if an ACIR uses only known directives. The proposed ACVM format for a general directive is: ```rust Directive::general(name: String, input_indices : Vec<WitnessIndex>, output_indices : Vec<WitnessIndex>) ``` > It is easy to see how these generalise known directives as one could replace the inversion directive by passing name = "inversion" and then one element each to the input and output indices. ## Implementation of General Directives In the case of known directives, ACVM knew how to handle it. However, for general directives, the implementation of them is not known. So if I pass the name `compute_foo` as a name to the general directive, it is not clear where one should get the implementation details of `compute_foo`. For this we suggest adding a new map to ACVM, namely one that maps strings to the location of the implementation of the directive. ie. `Map<String, LocationOnDisk>` Hence, we map function names to the location of their source code on disk. This map can be called the **general directive binary map**. > There is a somewhat open question here as to what that source code is written in. We have initially chosen C code due to C having a stable ABI. After conversations with Mike, WASM is also a viable option. WASM is a good option because we can sandbox code and simply disallow operations like file access. #### The C/WASM code We expect the binary to expose one method, namely: ```rust fn call_functions(name : String, inputs : Vec<Field>, outputs : Vec<Field>) ``` Notice the similarities with the definition of the general directives. Now, this `call_functions` method may not look exactly like this in C code, due to us needing to pass a pointer and the length of the vector, but it is close enough and that is an implementation detail. > Note that we have multiple call_functions methods. For each General::Directive we encounter, we need to go into the general directives map and check for the location of the corresponding binary. ## Code Blocks General directives are expressive, however they come with friction since their implementations need to be in some external file in C/WASM. We therefore propose another strategy to inline code into blocks. A new directive is needed: ```rust Directive::code(inputs : Vec<WitnessIndex>, outputs : Vec<WitnessIndex>, code_block: String) ``` The question is then, what is the contents of code_block. Theoretically, we can make this any language and have acvm interpret it. We need to devise a strategy to copy the inputs into the parts of the code that needs it. The naive approach is as follows: - Enforce that variable declarations in the code cannot start with an underscore. - Enforce that referencing any of the inputs should start with an underscore. Hence `_0` would reference the first input. - Tokenise the code and replace all underscores with their repspective witness values - Run an interpreter on the code_block - Save necessary values (somehow) to their output indices -- we may need to enforce syntax in the code_block to allow us to detect this # Summary We have now defined two ways that allow ACVM to execute code blocks which do not apply constraints. **C Code/WASM** In one strategy, the code block is an external C file, which is extremely dangerous because one could download code that does arbitrary code execution. Using a WASM file mitigates most of this since it is sandboxed. I would personally lobby for only using wasm or sandboxed environments, however this means that one cannot easily do database fetches. > Lets investigate the last claim. > We should also investigate in what context will one need these functions. **Inline Code blocks** The second strategy is not as powerful, however it is also less dangerous, because one can see the code that is being ran inside of the Noir function; there is no external dependency. Also depending on how we do it, we may also have a lot of control over what is allowed in those blocks. # Noir Now we specify what changes are needed on the Noir level in order to allow developers to use the newly available ACVM directive types. ## Known Directives In order to access known directives, we use a `builtin_directive` attribute: ```rust #[builtin_directive("inverse")] fn inverse(input: Field) -> Field; ``` **How it is compiled** The above is used when we want to use a directive that ACVM knows about. In the compiler, noir will simply create a `Directive::Inverse` directive. This is similar to what is already happening in Noir. The difference being that, previously developers were not able to access these builtin directives. ## Extern Libraries An external library will declare a set of non-deterministic functions in a noir header file, alongside a C library which will specify how acvm should handle these non-deterministic functions in Noir. These external libraries will be what we pass to acvm when it tries to find the implementation for a general directive. We enforce the folder structure to be the following: - header.nr - C code As mentioned, the header.nr file will hold extern functions, which are declared using an extern keyword. An example of this is: ```rust extern fn foo(a : Field, b : [Field;10]) -> Field; ``` The extern keyword is used to denote that this functions implementation is external to the noir language and that it is code that does not create constraints. ### Calling Extern functions #### Nargo ```rust // Nargo.toml my_extern_lib = {git = "github.com/foo/extern", extern = true} ``` First we add this external library as a dependency in our nargo.toml, with a boolean flag to state that it is indeed an extern library. We will see that this flag will then allow us to collect metadata that we will need later on. #### Module Code In ones source file, you would write the following: ```rust // main.nr use my_extern_lib; fn main(a : Field) { let c = unsafe {my_extern_lib::foo(a)}; } ``` Note the usage of the **unsafe** keyword, since calling this method does not apply any constraints, we want this to be a deliberate choice by the user. **How it is compiled** ```rust Directive::general{ name: "my_extern_lib::foo" inputs : vec![a, b[0], b[1], b[2], .., b[9]] outputs : vec![c] } ``` In the above code, the directive name is standardised to be the package name concatenated with the method name. We will only allow a single header extern file, so its not possible to have ``{package_name}::{module_name}::{function_name}``. **Linking external libraries to locations on disk** We now focus on the fact that we need to supply a map that links the names in general directives to locations on Disk. The idea is as follows, in our nargo.toml, we have specified whether a library is either extern or not. Since nargo also knows where the packages are stored, we can create a map from the package name to the location on Disk. The overall communication between nargo and acvm would look like the following: ```rust nargo_meta = Map<String, LocationOnDisk> location = nargo_meta[{package_name}] call_library(field_type, lib_path, inputs, len_input,&mut outputs, len_output); ``` ## Notes The rough draft that we used to play around with ideas is stored here for prosperity: https://hackmd.io/uaMDShRYRtCVQeOwOY-syg?both