Logic VM Design

# Logic VM Design ## Wednesday January 8 - There is a large space of possible design choices that's opening up (do every opcode every step <> FPGA <> some sort of VM thing with lookups; intermediate PODs for intermediate steps) - How do we decide what to do at each each choice? What are the design principles? - Modularity - Minimality: smallest possible set of statements and deductions - Three abstraction levels - Language: Something prolog like; how does the user "read" POD - Security: Given that I have a cryptographically valid POD, is this POD something that I want? - Circuit implementation: do we do a VM vs. FPGA; do we do recursive steps for intermediate calculations ## Tuesday January 7 - Ahmad & Aard discuss POD 2 logic design - Agreed that we are dealing with a strict subset of Prolog - Reviewed current design of system (statements, operations, anchored keys, origins) - Thinking through a minimal spec by refining the data structures we have introduced and/or working backwards from a problem we wish to solve - Follow-up meeting on Thursday - Ahmad incorporates custom deductions into the Oracle POD system (see [branch](https://github.com/ax0/parcnet/tree/custom-statements)). - Not particularly ergonomic (yet). - Examples have been updated to use this. ## Monday January 6 - Checking in on progress on the extended OraclePOD prototype - Update on the experiment to implement ETHdos and the Great Boy POD in prolog - Additional involvement from a few others - Schedule time with Edu to discuss what a more VM-style architecture would look like Aiming to integrate our thinking with the proof systems contributors by Monday January 20 - Code - OraclePOD implementation, with documentation / diagram / summary - What is not specified by the Oracle implementation? - Parameters: max depths, max number of custom ops, max number of so-and-so module - Documentation - Describe the design of a POD2: for a POD2 at rest, explain the components of that data structure and how to interpret them and verify them - Description of circuit implementation for the POD2 executor - For three choices of parameters (high/medium/low), what is the cost in cryptographic operations - estimated number of hash functions - estimated number of signature verifications - estimated number of bit decompositions and bitwidth, etc. - what can be done with these parameters? - exact feature-complete replica of ZuKYC? - ETHdos - Great Boy POD - ZK Ticket PCD ## Thursday December 19 - [current code example](https://github.com/ax0/parcnet/tree/custom-statements) - is_double and ethdos examples for custom deductions - what's remaining? - passing along a collection of custom statements - have to pass the proof of the custom statement (you might want to have multiple Ops that produce intermediate statements that are not public output statements) - next goal: understand prolog and explore what it would look like to express ethdos in a "prolog-native" way - write ethdos "prolog-natively" (figure out what that means) ## Tuesday December 17 - MVP "version 1" - You can define a custom deduction rule and a custom predicate. the deduction rule has a unique ID that is the hash of its string name; same with the custom predicate - a custom deduction rule specifies that you can spit out a new statement with some custom predicate, if some input statements exist in the inputs - we can describe these input statements as tuples of field elements. there are at most NIS input statements for a custom deduction. how to serialize an input statement description for a custom deduction rule. it's a list of field elements: - predicate ID - operand #1 - origin - variable type (numbered variable or constant) - value of variable (either a wildcard ref or a constant) - keyname - variable type (numbered variable or constant) - value of variable (either a wildcard ref or a constant) - operand #2 - origin - variable type (numbered variable or constant) - value of variable (either a wildcard ref or a constant) - keyname - variable type (numbered variable or constant) - value of variable (either a wildcard ref or a constant) - operand #3 - origin - variable type (numbered variable or constant) - value of variable (either a wildcard ref or a constant) - keyname - variable type (numbered variable or constant) - value of variable (either a wildcard ref or a constant) - optional value - variable type (numbered variable or constant) - value of variable (either a wildcard ref or a constant) ## Friday December 13 Ahmad's notes on things to come: https://hackmd.io/@H6I0VscATLi4712DE3VT5w/ByXXvItEJg https://hackmd.io/@H6I0VscATLi4712DE3VT5w/ByEE6BVN1e ## Thursday December 5 Agenda - Goals and non-goals of the first spec - Review the proposed goals written above - Form factor of the first spec - Alex: - Who is the team that will be doing this? // holding in their head the system + its parts - Now it's Brian and Ahmad (with a few part time contributors / trialing contributors) - Which parts of the system have high costs to getting wrong? - Brian: - Upgradeability from day 1 is important (and you can look at the spec). Both the spec, and the code, can be upgraded. (this is because we just wrote the v0 that we were good to throw away) - Someone should be able to read the spec and know what's going on - Extension of what Brian/Ahmad are working on - Implementation / Interoperation - axis 1 - how precise is the spec (and earlier on you may be able to be less so); axis 2 - how flexible is this spec - Opinionatedness - Not trying to build a generic parser - We are opinionated both on what cryptographic operations, as well as how you should operate on this - logicalVM (logic approach vs ) - ahistorical thing vs historical thing - declarative (doeson't care about what program hashes / computation led to this) - what is the logicVM the result of, when you want to achieve X - properties - ahistorical -> history agnostic -> - Suppose two diff ways of arriving at the same pod. That pod does not look different. - Doesn't care about history? Does care? - Consumer of the POD doesn't care about the history - Verifier does not. It just needs to look at the valid proof. - But isn't valid proof of history? - the interpretation of the data doesn't require awareness of how the data was produced. - like sqlquery or datagram - closure (under some opinionated set of the operations) - deterministic - small-byte sized things <-- this is an implementation detail, but what really matters here is composable - Pure - So why not a stack machine? - Output at any point is not interpretable or verifiable - succinct but also enough to understand what this is (interpretable) - "all you get is what you need" - "throw away the stuff that the consumer doesn't need" - zkSQL???? - still needs to append all previous queries? - self describing? - grows with the thing we're trying to express NOT growing with the amount of computation !!!!!!! - "most complete efficient" - O(1) computation, O(n) expressed - "how it's produced is not the same as the instruction" - some stronger property than declarative here. - first order predicate logic (prolog??) -- strict upgrade to what we have - sometimes extension makes things simpler - recursive - for the prolog extension, can we write down x to test against system - ops executor part of this - first order logic + type theory - oracle gadget similarity -- do it with some extension stuff - hashing / recursive verification is what's going to dominate all business logic - spin something up in the prolog system - list of requirements for type theory folks - complete first order logic - operations need to be verifiable in modern cryptography - practical to perform inside a ZKP - ethdos hard - write a new circuit for every operation - have to track a bunch of program hashes (can't succinctly and composably describes) - if vm, don't need new circuit, but need hashes - in the .pl gist - invariance is not that important - cryptography is separate from the logic - but not quite that because you could have a python script with a for loop over pods - "stronger than declarative" - sql + optimizer - compressed previous computation + new one to lead to the most succinct new representation - functional composition s1 o s 2 = s3 o s4. len (opt ()) = len (opt (s3 o s4)) - compactness metric - best possible compression?? - throws out irrelevant things - so does first order predicate logic + optimizer, match this ^ - functional composition gets compressed in the most efficient possible way - logic gives you an angle to where the solution to the most efficient solution, - sequence of computation vs a program - composition of program*****' - same thing going on at PCD level - PCD <- zkSNARKS ## Wednesday December 4 Ahmad's takeaways from yesterday: https://hackmd.io/@H6I0VscATLi4712DE3VT5w/SJddsP0X1e What is the point of this spec? - Tells us - how do we build our first circuit? - Of everything that's being built into the first prototype circuit, what "interfaces" or "function signatures" do we expect not to change--even if we're refactoring code to make things cleaner / more efficient? - Where do we expect that the circuit is "incomplete" - i.e. the circuit implements some restricted subset of functionality of the abstract spec, and we're explicitly calling out those places. - Ensuring that the circuit is tied to a data format that is upgradeable in the ways that we want it to be. Upgradeability can mean breaking or non-breaking changes. - Given a POD2, what is necessary to understand how to verify it? - In the clear - In another circuit - Given a POD2, what is necessary to understand how to interpret it? - As a developer consuming a POD in the clear, I want to check that this POD is valid. I need to know: - How do I derive the appropriate predicate ID / statement type format from each serialized statement? - How do I break down values, strings, other datatypes, etc. into field elements - What sequence of accumulations / hashes do I need to perform on these field elements to end up with an array of public signals that can be verified against the ZKP - As a developer building another circuit that ingests PODs, I want a "recipe" for how to write the verification component of the circuit, and to be able to interpret input statements coming in from different types of POD2s in different ways. - As a developer consuming a POD in the clear, I want to understand how to interpret the POD. I need to know: - If something is serialized as an array of field elements, how do I construct the "intended" data from that - ^We probably don't want for that to be the case. We probably want the thing to just be interpretable - There is some notion of data types. - You can see the POD type, then you can look up for that POD type what the verification key and verification rules are. - Given a couple of different types of POD2s, as a developer you can interpret all of their payloads in the same way --- What is the first thing we're writing down here? Here is a POD2 as might be serialized by some library outputting JSON: ``` { "payload": [ { "predicate": "VALUEOF", "anchoredKeys": [ { "origin": { "originID": "_SELF", "originGadget": "Plonky" }, "keyName": "age" } ], "optionalValue": 26 // any number or string or array of numbers / arrays of strings etc. }, { "predicate": "VALUEOF", "anchoredKeys": [ { "origin": { "originID": "_SELF", "originGadget": "Plonky" }, "keyName": "names" } ], "optionalValue": ["john", "smith", "banana", "karen"] // in this case, we have a list of strings as the optionalValue }, { "predicate": "EQUALS", "anchoredKeys": [ { "origin": { "originID": "4", "originGadget": "Plonky" }, "keyName": "fruit" // can refer to a value that's either a scalar or a vector }, { "origin": { "originID": "3", "originGadget": "Plonky" }, "keyName": "banana" } ], "optionalValue": null }, ], "proof": ..., "gadgetType": { "gadgetID": "Plonky" } } ``` Here are the rules for turning the above into an array of public signals for verification and for interpreting them: - When you turn a statement into a vector of field elements, here's how you do it: - Whenever you see an `optionalValue` in a `VALUEOF` statement, if it's an array of numbers, you assume that each of those numbers is at most $p$ (Goldilocks) (and if one is not then you error), and then you hash them all together to get some hash value. - So that means that our particular circuit can't handle the *full* abstract spec, but only a restricted subset right now, but that's what we do because circuits have to have bounds on things like array size / bidwidth / compute / etc. - Whenever you see `EQUALS` as a predicate, replace that with the field element `2` in the tuple representing the statement. - Whenever you see the `EQUALS` predicate, you know that the corresponding values to the anchoredKeys can be *either* scalars or vectors, if they are vectors they are vectors of mod $p$ elements, ... - ... ## Tuesday December 3 Main points - computational vs constraint view - now need to output 2 statements potentially - 32 bit? - 32 bit or "just integers abstractly" - generic to number systems - user-defined deductions - minimal set needed to do all gpc ops? Preliminary predicate list - Goal: People should be able to use the POD2 spec in such a way where they can look at a particular POD2, and know how to verify it given the spec (or the spec tells them where to look in some sane way) What does a predicate specify? - How to deserialize the statement (which is a tuple of some integers) such that it can be passed into a ZK circuit - Or do you pass the hash along with the statements? predicates - valueOfScalar32 <- 1 - valueOfScalar32x2Poseidon <- 2 - valueOfPoseidonMerkelizedList32 <- 3 - valueOfPoseidonMerkelizedList64 <- 4 - valueOfPoseidonHashedList <- 5 - valueOfSHAHashedList <- 83 - valueOfList <- 129 - isEqual - isEqual32 - isEqual64 - in general, do we want the 32, 64, etc. versions? - isGT - isLT - compare - 0 if LT, 1 if EQ, 2 if GT - contains - merkle vs. list? - append - concat - indexof - sum - prod - max - CRYPTOGRAPHIC PRECOMPILES - isPoseidonHashOf - isSHA256HashOf - isECDSAPriv2Pub - isEdDSAPriv2Pub - isEd25519Priv2Pub Preliminary deduction rules list - Each of the above, from entries? - Anything we want to do for transitive stuff: GT -> NEQ, GT and GT -> another GT Can we do everything we want to do with this? - zukyc - ethdos - goodboy pod Other things to look at - Custom deduction rules - In-circuit recursion