# Anoma Architecture Scope and Draft
Working title:
_Anoma: scale-free consensus for scale-free cryptographic accounting_
## Abstract
This document describes a protocol for a distributed, privacy-preserving (DEFINE + Link) state machine with scale-free fractal consensus. Participating nodes can cheaply verify that their local view of global history is consistent and correct, but must make choices about which other nodes to rely upon for data storage and consensus provisioning (DEFINE + Link). The primary expected application of this protocol is the provisioning of a scale-free (DEFINE + Link) cryptographic credit (DEFINE + Link) system, which provides a local, quantifiable view of entanglement (DEFINE + Link), a measure of incentive-alignment as well as of gains from cooperation (or, correspondingly, losses to defection) (TODO: Quantify). This measure can be fed back into the local choices made by participating nodes, allowing the gestalt system to use high entanglement to reduce computational resource expenditure in interactions.
Our approach provides:
- local finality
- partition tolerance under application-specific conflict resolution rules
- resource usage that scales with the required level of validation
- privacy that scales with the inverse of trust
- robustness towards heterogeneous networking conditions
The credit layer provides:
- incentive compatibility (tbd on credit layer)
Considered relative to existing protocols, in this proposed architecture we unify several notions previously considered to be separate, such that distinctions in kind become instead distinctions in particular content, with a scale-free architectural topology. Specifically,
- We unify the concepts of "[message](#Message-Representation)" and "block (DEFINE)" - all messages are cryptographically hash-chained, and blocks are simply a particular type (content) of message
- We unify the privacy-preserving double-spend prevention technique of Zerocash/Zexe/Taiga - nullifiers, uniquely binding to note commitments, unlinkable and calculable with secret information - with the double-spend prevention required of distributed database systems / blockchains (preventing multiple valid conflicting descendants of the same block), recasting the mechanism as a distributed enforcement system for a linear resource logic.
- Relatedly, we unify the concepts of "note" (from Taiga), "UTXO" (from Bitcoin / parallel VMs), and "message"/"block" (as above).
- We unify various concepts of state-machine-level cryptographic identity -- public-key-based accounts, smart-contract accounts, BFT light clients, threshold keys -- and network-level cryptographic identity into a singular information-theoretic duality of _external identity_ and _internal identity_, the fundamental abstraction on top of which the protocol is built.
- We unify the concepts of _message_ and _state_, in that each message commits to its own history, and that there is no state other than can be computed from a set of messages at a particular point in partially-ordered logical time.
This unification requires a protocol construction with primitive operations which may seem expensive - e.g. sending a message requires creating a ZKP - but performance can be recovered (trading off privacy or scaling in particular cases) by using trivial implementations of the primitives (e.g. a non-ZK, non-succinct proof, which is just a regular message).
## Scope
This document describes:
- the state/message model
- properties and validation of state logs
- higher order ad-hoc consensus by merging state logs
- how scale-freeness works (no semantic breakage)
- privacy properties at different scales
- the graduated guarantees the protocol provides
Maybe:
- BGP(-ish) routing via consensus providers
This document specifies interfaces and properties for, but _does not_ describe in detail:
- data availability layer
- physically-addressed gossip network
- recursive succinct zero-knowledge proof scheme
- heterogeneous BFT consensus (can we use homogeneous?)
- anonymization of network traffic (e.g. mixnet)
# TODO
- Work out scaling properties of:
- (sparse) merkle trees
- sharding observation trees by identities
- sharding nullifier sets by time periods
- Import Diagrams
- Clean up Resource Logic Section
- Credit
- Privacy:
- Cryptographic
- Statistical (deferred)
- Predicate Representation Structure
- Commitment Storage Sharding
# Naming Suggestions / Glossary
- Credit: TODO; better term candidates: Kudos
- Topology: TODO, better term
- Message; candidates: Event, Transaction, Minimal Atomic State Transition
- Linearity Violation: TODO, is this term overloaded already?
## Abbreviations
IF = Ideal Functionality
LL = Linear Logic
CP = Consensus Provider
# Ideal Functionality
## Functions
Observers are assumed to have the ability to store data (privately via locality), perform compute, and send and receive messages over an arbitrary, asynchronous physical network (~ Turing machine with a network).
This protocol provides to an observer the ability to:
- Create cryptographically-content-addressed history
- e.g. issue messages from a private/public keypair that they create from nothing (LL: initial object)
- send messages to the terminal identity, after which they can never be used again (LL: terminal object)
- Validate cryptographically-content-addressed history that the observer learns about from elsewhere
- Validation covers correct instantiation (cryptographically-content-addressed), linear consumption from instantiation, and validity of predicates (state transition rules)
- Privacy provisions allow for any arbitrary part of this history known by another party to be revealed by that party to this observer while retaining all of these validity guarantees
- e.g. the party can reveal to the observer a message at the end of some chain of history and prove that the history is valid and terminated in that message - this message (if desired) can then be built upon by the observer who now knows it
- knowledge of a message is a necessary but not sufficient condition in order to consume / create history on top of it
- Merge two cryptographically-content-addressed histories and ensure that the linearity condition is not violated in the output
- Checks that no message was consumed to create different descendants in both histories (and, as descendants commit to their ancestors, also checks that no message was consumed alongside different other messages in both histories)
- Can create a message which witnesses many prior histories (but does not consume them), and restricts its descendants to never witness histories conflicting with those already witnessed.
- e.g. used for block production by consensus providers
These functions are scale-free, in that the compute costs do not depend on the complexity of the particular histories in question. Implementation choices for particular primitives (identity, proof systems) can be made in order to trade between different computational and communication costs and who bears them.
> Note: what is the minimal set of cryptographic assumptions needed to instantiate this protocol? I think it is just one-way hash functions ~ i.e. P /= NP.
## Constraints
### Scaling
1) Nodes which only want to participate as transaction parties should not have to process more than a constant amount of data with respect to the transactions they are a part of: e.g. they should be able to perform reads and writes without needing to process data (compute/bandwidth/memory) worse than linear in the size of the reads and writes.
2) Consensus providers should only need to perform computation linear in the number of messages (size of the histories) for which they want to provide history linearity validation or ordering, independent of message structure and size.
3) The "prune bad branches" strategy of merging after linearity violation should not be more expensive than linear in the length of histories.
### Topology
1) Processes should be locality-favoring unless explicitly specified otherwise: nodes in a subnetwork should not need to communicate outside of it by default.
### Privacy
1) Any node with write permission for a namespace suffix can produce a change in it without revealing the change to anyone outside the subnetwork, while still producing a ZK validity proof which anyone can verify.
2) Nodes need to be able to prove that they did not violate linearity, while preserving data and function privacy.
# Protocol
The protocol is defined as a function `update`, run locally, which takes the current local state & a received message, and returns an updated state and a possibly empty set of messages to send. We assume that nodes receive messages in a total order (parallelism in implementation for parallel receives is possible as an optimisation).
```haskell
update :: State -> Msg -> (State, {Msg})
```
where `State` is the local state, comprised of a set of previously seen and validated messages.
The protocol also defines an initial state $s_{0}$, which is the empty set.
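A minimal sketch of the local event loop this implies, assuming a placeholder `Msg` type and rendering `{Msg}` as `Set Msg` (the concrete message types are defined in later sections):
```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Placeholder message type; concrete message types are defined below.
data Msg = Msg deriving (Eq, Ord, Show)

-- Local state: the set of previously seen and validated messages.
type State = Set Msg

-- The initial state s_0 is the empty set.
s0 :: State
s0 = Set.empty

-- Fold `update` over a totally ordered stream of received messages,
-- collecting all messages produced for sending along the way.
run :: (State -> Msg -> (State, Set Msg)) -> [Msg] -> (State, [Msg])
run update = foldl step (s0, [])
  where
    step (s, out) m =
      let (s', newOut) = update s m
      in (s', out ++ Set.toList newOut)
```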
The protocol defines this `update` function on top of a layered set of abstractions, which we describe henceforth:
- _Identity_
- _Network_
- _Physical DAG_
- _Logical DAGs_
- _Validity predicates_
- _Linearity_
- _Privacy_
- _Consensus_
# Identity Layer
The base abstraction of the protocol is knowledge-based identity, where identity of observers is defined on the basis of whether or not they know some secret information (to derive the Internal Identity).
## Basic assumptions & definitions
- `hash` is a collision-resistant one-way function
- We assume a shared execution semantics, such that functions have a canonical serialisation. (We do not assume that equivalent functions have the same serialisation, only that functions have a serialisation which is understood and can be executed by participating observers)
## Identity
Identity in the protocol is defined by a cryptographic interface. Observers can use private information (likely randomness) to create an internal identity, from which they can derive an external identity to which it corresponds. The external identity is a name which can be shared with other parties. The observer who knows the internal identity can sign messages, which anyone who knows the external identity can verify, and anyone who knows the external identity can encrypt messages which the observer can decrypt. This identity interface is independent of the particular cryptographic mechanisms, which may vary.
### External Identity
Each External Identity has the canonical representation:
`hash(verify', encrypt')` with:
- `verify'(msg) -> bool` to verify the originator of a message
- `encrypt'(msg) -> cyphertext` to encrypt a message which any recipient who knows the internal identity can decrypt
For example, `key` can be a public key in a standard asymmetric public-key encryption scheme, where `verify'` and `encrypt'` are curried with the key as `verify(key)` and `encrypt(key)`.
> Note: consider requiring ZKP that someone knows internal identity s.t. for some (a random?) `m` they can sign and decrypt it. (TODO - cryptographers)
### Internal Identity
The Internal Identity is constituted by knowledge of the following functions:
- `sign'(data) -> signed data`
- `decrypt'(cyphertext) -> plaintext`
such that any `sign'`-ed message is accepted by `verify'` and any `encrypt'`-ed message is opened by `decrypt'`.
For example, these can be the signature generation and decryption functions in a standard asymmetric public-key encryption scheme, where `sign'` and `decrypt'` are curried with the secret as `sign(secret)` and `decrypt(secret)`.
### Special Identities
To illustrate the generality, we can construct the following special identities (~ LL: initial and terminal objects with respect to information).
#### "True / All"
Anyone can sign and decrypt (`verify'` returns true and `encrypt'` returns the plaintext). No secret knowledge is required, so all observers can take on this identity.
#### "False / None"
No one can sign or decrypt (`verify'` returns false and `encrypt'` returns empty string). No secret knowledge exists that fulfills these requirements, so no observer can take on this identity.
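A minimal sketch of these two special identities against the identity interface, with function records standing in for the abstract cryptographic primitives (names here are illustrative, not normative):
```haskell
import Data.ByteString (ByteString)

-- The identity interface as function records (canonical serialisation
-- and the hash-derived name are elided in this sketch).
data ExternalIdentity = ExternalIdentity
  { verify'  :: ByteString -> ByteString -> Bool  -- message, signature
  , encrypt' :: ByteString -> ByteString          -- message -> cyphertext
  }

data InternalIdentity = InternalIdentity
  { sign'    :: ByteString -> ByteString          -- message -> signature
  , decrypt' :: ByteString -> ByteString          -- cyphertext -> plaintext
  }

-- "True / All": anyone can sign and decrypt; no secret is required,
-- so every observer can take on this identity.
allExternal :: ExternalIdentity
allExternal = ExternalIdentity
  { verify'  = \_ _ -> True   -- every signature verifies
  , encrypt' = id             -- the "cyphertext" is just the plaintext
  }

allInternal :: InternalIdentity
allInternal = InternalIdentity
  { sign'    = const mempty   -- any signature is accepted, so the empty one works
  , decrypt' = id
  }

-- "False / None": no one can sign or decrypt; no secret knowledge
-- fulfilling these requirements exists.
noneExternal :: ExternalIdentity
noneExternal = ExternalIdentity
  { verify'  = \_ _ -> False  -- no signature ever verifies
  , encrypt' = const mempty   -- nothing is recoverable from the cyphertext
  }
```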
# Network
Observers can send messages to each other by `send(identity, msg)` where `identity` is an external identity, and they can handle received messages with some `onRecv(msg)` (to which messages addressed to them will be sent). We assume an asynchronous physical network in the general case, where liveness w.r.t. some message set and some observers will require eventual receipt of all messages in the set by the observers in question.
## Messages
A `Message` is the lowest layer type, sent around between nodes. `Message` subtypes include `Observation`s, which capture partial ordering information used to craft the physical DAG, `Storage` read/write requests and responses for the distributed content-addressed data storage layer, `Compute` requests and responses for the distributed content-addressed compute cache layer, and `Network` P2P metadata requests and responses for gossiping external identity / IP associations, routing information, uptime information, etc. (full details TBD).
## Storage
The distributed content-addressed data storage layer is responsible for providing a very simple read/write storage API: nodes, voluntarily, elect to store data blobs, content-addressed by hash. Nodes can ask other nodes to retrieve data with a `StorageRead` call, providing the hash, and nodes can ask other nodes to store data with a `StorageWrite` call, providing the data as a binary blob, where the address at which to store the data is calculated with the standard `hash`. Upon retrieving data with a `StorageRead` call, nodes can check that they received the correct data by checking that `hash(data)` is equal to the address they requested data for.
```haskell=
data StorageRead = StorageRead {
address :: Hash
}
```
```haskell=
data StorageWrite = StorageWrite {
datum :: ByteString
}
```
Storage read and write requests are optionally signed - the signature is not required for data integrity, but it may be useful for anti-DoS and proof-of-retrievability.
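As a minimal sketch of the integrity check described above (assuming SHA-256 from the `cryptonite` package as the standard `hash`; the concrete hash function is an implementation choice):
```haskell
import Crypto.Hash (Digest, SHA256 (..), hashWith)
import Data.ByteString (ByteString)

type Hash = Digest SHA256

-- The standard content address of a stored blob.
contentAddress :: ByteString -> Hash
contentAddress = hashWith SHA256

-- On a StorageRead response, check that the returned blob actually
-- hashes to the address that was requested.
verifyRead :: Hash -> ByteString -> Bool
verifyRead requested blob = contentAddress blob == requested
```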
For now, this layer does not deal with erasure coding, data storage incentives, or other sharding/scaling schemes. The entanglement graph information may be sufficient for highly entangled nodes to safely store a certain amount of data for each other for free (and this is very efficient as compared to more complex schemes). Future versions of the storage layer can support erasure coding schemes "under the hood" of this basic interface, allow nodes to automatically store data for untrusted nodes who pay them some sufficiently valuable credit, and perform more complex operations under the hood, but the basic interface from the perspective of other layers should be able to remain roughly consistent with this very simple model.
This data storage layer intentionally does not feature any sort of access control or privacy semantics - data which is expected to be kept private should be encrypted before it is stored on the storage layer, and we rely on the encryption to provide privacy to the appropriate parties. It may sometimes be helpful for data to be automatically deleted "by default" after a while - this can be achieved in the incentive case by ceasing payment, and in the high-entanglement case by internally tracking who requested data to be stored and deleting it at their request.
Storage is generally unordered, and observations of storage operations performed are mostly not expected to be included in the physical DAG, although they can occasionally be if certain nodes wish to track metadata.
## Compute
The distributed content-addressed compute layer is responsible for providing a simple API for delegating computation: nodes, voluntarily, elect to perform computations for other nodes, content-addressed by a hash of the data being computed over and a predicate relation which must be satisfied over the data and computational result.
We assume a verifiable computation scheme, e.g., subject to cryptographic _or trust_ assumptions, nodes have access to the following interface:
```haskell
type Proof
prove :: a -> b -> (a -> b -> Bool) -> Proof
verify :: a -> b -> (a -> b -> Bool) -> Proof -> Bool
```
s.t. `verify(a, b, predicate, prove(a, b, predicate)) = 1` iff. `predicate a b = 1`, and `verify(_, _, _, _) = 0` otherwise, except with negligible probability (no false positives).
This can be instantiated in different ways:
- Trivially, but without succinctness, as `verify(a, b, predicate, _) = predicate a b`.
- With an honesty assumption, with succinctness, by `verify(a, b, predicate, proof)` checking a signature by a known external identity over `(a, b, predicate)` in `proof` (essentially attesting that `predicate a b = 1`). As the computation is still verifiable, a signer of `(a, b, predicate)` where `predicate a b = 0` could be held accountable by anyone else who later performed the computation.
- With a non-interactive succinct proof scheme (e.g. SNARK/STARK), with both verifiability and succinctness subject to the scheme-specific cryptographic assumptions.
Global consensus on the verifiable computation scheme is not required, but nodes must agree on a particular scheme to use for a particular computation operation.
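A minimal sketch of the trivial instantiation from the list above: the proof carries no information and `verify` simply re-runs the predicate (verifiable, but not succinct, and with no cryptographic assumptions):
```haskell
-- Trivial, non-succinct instantiation of the verifiable computation
-- interface: verification re-executes the predicate directly.
data Proof = TrivialProof

prove :: a -> b -> (a -> b -> Bool) -> Proof
prove _ _ _ = TrivialProof

verify :: a -> b -> (a -> b -> Bool) -> Proof -> Bool
verify a b predicate _ = predicate a b
```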
```haskell=
data ComputeRead = ComputeRead {
address :: Hash,
predicate :: Hash -> ByteString -> Bool
}
```
```haskell=
data ComputeWrite = ComputeWrite {
result :: Hash,
proof :: ByteString
}
```
Compute read and write requests are optionally signed - the signature is not required for compute integrity (except as in the proof), but it may be useful for anti-DoS and proof-of-retrievability.
Upon retrieving a result with a `ComputeRead` call, a node can check that they received the correct result by checking that `verify(address, result, predicate, proof)` is true. If the underlying scheme is succinct, this check is constant-time in the complexity of the predicate - and if we further assume that the input and output data of the computation can be addressed by hash, then specific elements of the result can be retrieved & verified later with Merkle proofs or similar.
Computation can be cached in the storage layer at `hash(scheme, address, predicate)`, by storing `result` and `proof`, and nodes interested in the same computation verifiable with the same scheme can just fetch previously computed results and call `verify` as above. Many computations are expected to be incremental, in that a `proof` and `result` from a computation on a previous part of an append-only data structure can be reused to create a `proof` and `result` valid for the same computation performed on an extended version of this data structure. In general, the version of the compute read API exposed to higher level layers can be expected to check the cache first, and only then request computation from the network.
We expect that trusted delegation (with an honesty assumption, without verifiable compute) will be used only between highly-entangled nodes - but there it can be very useful (e.g. for automatic delegation of compute from an edge device to a trusted server). For now, this layer does not deal with incentives for performing delegated verifiable computation for untrusted nodes - this can be added by including credit payment conditional on provision of a correct computational result - but the basic interface from the perspective of other layers should be able to remain consistent with this simple model.
Computation is generally unordered, and observations of computations performed are mostly not expected to be included in the physical DAG, although they can occasionally be if certain nodes wish to track metadata.
## Observations & Physical DAG
The _physical DAG_ is the layer directly on top of the lowest-layer network, responsible for providing local partial ordering information. This DAG is singular, in that any message can reference any other message created at this layer. Different observers may see and perform computations on different sub-physical-DAGs (just in accordance with which messages they have actually received) - but any observer in possession of any particular message can check whether or not it has received all transitively referenced messages, guaranteeing that two observers treating the same message as the current state and performing the same computations on parts of the past physical DAG (w.r.t. that message) will end up with the same results. The physical DAG has no knowledge of linearity, consensus, or validity semantics - those are the responsibility of higher layers.
### Definitions
- An `Observation` is a type of message which attests to witnessing some data (possibly other messages), and provides a signature along with an external identity.
```haskell=
data Observation = Observation {
witnesses :: Set Hash,
identity :: ExternalIdentity,
signature :: ByteString
}
```
We also define `commitment(observation) = hash(witnesses, identity)`.
`Observation`s can be verified by running `identity.verify(commitment(observation), signature) => true | false`. Observers can generate valid signatures by running `sign(commitment(observation))` with their internal identity as above.
The witnesses in an observation can include any other observation `o` (just `commitment(o)`) or any arbitrary data `d` (`hash(d)`). Arbitrary data could include transactions, other network-layer messages, or anything else, but observers inspecting other observations will only care about (and be able to parse) certain formats of data.
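A minimal sketch of observation creation and verification under these definitions; the hash over the canonical serialisation of `(witnesses, identity)` is passed in as a parameter `h` to keep the sketch self-contained:
```haskell
import Data.ByteString (ByteString)
import Data.Set (Set)

type Hash = ByteString

newtype ExternalIdentity = ExternalIdentity
  { verify' :: ByteString -> ByteString -> Bool }

newtype InternalIdentity = InternalIdentity
  { sign' :: ByteString -> ByteString }

data Observation = Observation
  { witnesses :: Set Hash
  , identity  :: ExternalIdentity
  , signature :: ByteString
  }

-- commitment(observation) = hash(witnesses, identity)
commitment :: (Set Hash -> ExternalIdentity -> Hash) -> Observation -> Hash
commitment h o = h (witnesses o) (identity o)

-- Observers sign the commitment with their internal identity...
mkObservation :: (Set Hash -> ExternalIdentity -> Hash)
              -> InternalIdentity -> ExternalIdentity -> Set Hash -> Observation
mkObservation h int ext ws =
  let o = Observation ws ext mempty
  in o { signature = sign' int (commitment h o) }

-- ...and anyone who knows the external identity can verify it.
checkObservation :: (Set Hash -> ExternalIdentity -> Hash) -> Observation -> Bool
checkObservation h o = verify' (identity o) (commitment h o) (signature o)
```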
> Anti-DoS note: We do not want observers downstream of a particular observation to need to do any processing of witnessed events in the past history of that observation which they do not care about.
> Anti-DoS note 2: Although there is a singular physical DAG, nodes still choose who to receive messages from and whether or not to witness them, and will likely refuse to accept & include other messages carrying a high computational load unless they are specifically paid to process them, incentives are aligned a-priori, or similar.
### Properties
The physical DAG provides local partial ordering information, in that an observation by identity `I` of witness set `{w}` proves that `I` knew about all witnesses in `{w}` no later than the creation of this observation, which itself can then be referenced in later observations by that observer and by others (assuming non-invertibility of hash functions).
Nodes will have local rules around when to create & send an observation at all (creating an observation without sending it is equivalent to not creating an observation at all, since it could have been equivalently created later whenever a node sends something).
Examples of rules:
- Send an observation every second (or every time tick `t` based on an internal clock)
- Send an observation on a particular condition (e.g. some other messages from my fellow consensus provider nodes)
- Send an observation when requested by some other node
More frequent observations provide higher-granularity partial ordering information.
Honest nodes behave as follows:
- When sending an observation, include in the witness set all observations which they have seen since the last observation they sent.
- Thus, all data the observer has ever seen will be included in the transitive backreference graph of their observations
- This is unenforceable and undetectable, since we can never prove that a node received a message until they have witnessed it (by adding receipts of some sort, we could have a method of detection, but observations are effectively a receipt already, so this seems redundant).
- From the perspective of certain consensus algorithms, honest nodes also provide a total order in their witnesses: they issue observations in an order, and each observation includes the previous observation in its witness set (thus all future observations transitively reference all past observations, in a total order).
- This is unenforceable, but detectable: two observations from the same identity, neither of which includes the other in its transitive witness set, constitute a violation of total transmission order, and these violations can be processed in some fashion by higher-layer logical DAGs (e.g. slashing).
- Higher layers processing the physical DAG may or may not care about this, but the physical DAG layer should provide sufficient structure to detect such violations of total transmission order efficiently. This can be done by keeping a witness set in each message, which must be correctly updated to include all of the witnesses in the message's transitive witness history. Two observations from the same identity, neither of which includes the other in its witness set, constitute a violation of total transmission order (this can be checked in `O(log n)`; see the sketch below).
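A minimal sketch of this detection, assuming a helper `transWitnesses` computing the transitive witness set of an observation within a local view (with a maintained set structure, the membership checks are `O(log n)`):
```haskell
import Data.ByteString (ByteString)
import Data.Set (Set)
import qualified Data.Set as Set

type Hash = ByteString

-- Two observations from the same identity violate total transmission
-- order iff neither includes the other in its transitive witness set.
violatesTotalOrder
  :: (obs -> Set Hash)  -- transitive witness set of an observation
  -> (obs -> Hash)      -- commitment of an observation
  -> obs -> obs -> Bool
violatesTotalOrder transWitnesses commit a b =
     not (commit a `Set.member` transWitnesses b)
  && not (commit b `Set.member` transWitnesses a)
```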
> Note: There is no notion of linearity at this layer, only total transmission order.
# Logical DAGs
A _logical DAG_ is a DAG computed from a view (the partial information an observer has access to) corresponding to a particular message from the physical DAG according to a particular algorithm. To facilitate incremental verification and computation, logical DAGs are uniquely defined by their predicates. Different logical DAG predicates may guarantee additional structural properties (such as linearity, predicate validity, consensus w.r.t. an identity set, etc.) possibly subject to additional node behaviour assumptions. Logical DAG predicates are of type `PhysicalDAG -> t -> Bool` for some logical DAG type `t`, where a valid logical DAG for a particular physical DAG is any inhabitant of `t` such that the predicate holds. Generally, logical DAGs preserve partial ordering information from the physical DAG, in that if A occurred before B in the physical DAG, A must occur before B in any valid corresponding logical DAG.
Particular logical DAG algorithms, if their assumptions are met, generally guarantee that any two observers using that algorithm to compute or verify a logical DAG will not accept conflicting logical DAGs even if they have different partial information about the physical DAG (for definitions of "conflicting" which are specific to the logical DAG algorithm in question), and that these two observers will eventually compute isomorphic logical DAGs after receiving each other's physical DAG information (i.e. logical DAGs over the same subsets of a physical DAG are isomorphic w.r.t. reordering of the physical DAG).
## Transaction DAG
The most basic concept of a logical DAG is a `Transaction`, which is a blob of data atomically included or not included in a particular physical DAG (witnessed as `hash(tx bytes)`). From a view of the physical DAG which includes transactions, we can compute a logical DAG of those transactions (which are trivially atomically included or not included in the logical DAG):
```haskell
type TxDAG = DAG Bytes
validTxDAG :: PhysicalDAG -> TxDAG -> Bool
validTxDAG = subDAGBy isTransaction
```
The transaction DAG for a particular physical DAG is simply the DAG calculated by taking the partial ordering of all transactions referenced by the physical DAG and ignoring other messages (e.g. observations). The representation of the transaction graph as a logical DAG is fully distinct from the execution, and is only used to determine which actions are possible, e.g. which resources are available, which paths are invalid because of still unresolved conflicts etc.
There may be many transaction types which do not need to care about each other. Here we will discuss a transaction structure designed to provide a predicate-based linear resource logic.
## Numerical DAG
Transactions are non-negative integers, state is computed by adding, there are no conflicts.
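As a minimal sketch (eliding how the integers are witnessed in the physical DAG; `NumTx` pairs a unique id with a value so that set union is well-defined):
```haskell
import Data.Set (Set)
import qualified Data.Set as Set
import Numeric.Natural (Natural)

-- A numerical transaction: a unique id paired with a non-negative value.
type NumTx = (Integer, Natural)

-- State is the sum of all transaction values in the view.
numState :: Set NumTx -> Natural
numState = sum . map snd . Set.toList

-- Merging two views is plain set union: no conflicts are possible,
-- and the result is independent of merge order.
mergeViews :: Set NumTx -> Set NumTx -> Set NumTx
mergeViews = Set.union
```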
## Resource logic DAG
The resource logic structure is based on a concept of a _resource_, where particular resources have rules as to how they can be created & consumed along with logic to resolve any conflicts between duplicate consumptions (linearity violations) if applicable.
Resource types are defined by a particular `ResourceLogic`, which specifies under what conditions resources of that type can be created and consumed.
```haskell=
data ResourceLogic = ResourceLogic {
creationPredicate :: PhysicalDAG -> Transaction -> Bool,
consumptionPredicate :: PhysicalDAG -> Transaction -> Bool
}
```
- The `creationPredicate` describes under which conditions a resource can be created.
- e.g. for credit: message from issuing user, or consumption of equal existing credit
- The `consumptionPredicate` describes under which conditions a resource can be consumed.
- e.g. for credit: authorisation from owning user
A `Resource` is data owned by a particular resource logic. In order to allow for the rendering of a global content-addressed state, resources include a `prefix` and `suffix`, which can be combined to form a `key`, and an arbitrary `value`.
```haskell=
data Resource = Resource {
logic :: ByteString,
prefix :: ByteString,
suffix :: ByteString,
datum :: ByteString,
value :: Natural
}
```
```haskell=
key :: Resource -> ByteString
key r = logic r <> domainSeparator <> prefix r <> domainSeparator <> suffix r
```
- The `logic` is a hash commitment to the set of creation and consumption predicates (as above) defining under what conditions the resource can be created and under what conditions the resource can be consumed.
- The `prefix` is an application-determined name component.
- The `suffix` is a key suffix used to distinguish between distinct-but-equal resources (e.g. same logic, same prefix, same value, but distinct suffix). Semantics of the `suffix` are enforced by the resource logic (e.g. for application-internal prefixing by resource owners, with suffixes addressing individual resources).
- The `datum` is an arbitrary bytestring which can be interpreted by the predicates (it could itself contain other predicates).
- The `value` is a natural number (non-negative integer). This resource logic model builds in a notion of fungibility, i.e. two resources with logic (and prefix) `l`, different suffixes, and values `a` and `b` are treated as equivalent to one resource with logic (and prefix) `l` and value `c` iff. `c = a + b`.
Resources can be created in a content-addressed way with the appropriate logic whenever `creationPredicate` is satisfied.
```haskell=
data Transaction = Transaction {
created :: Set Resource,
consumed :: Set Resource,
extradata :: ByteString,
finalityPredicate :: PhysicalDAG -> Transaction -> Bool
}
```
- The set of `created` resources are resources which this transaction creates.
- The set of `consumed` resources are resources which this transaction consumes.
- The `extradata` field is for additional data which may be meaningful to predicates in resource logics (e.g. signatures).
- The `finalityPredicate` determines whether a transaction is considered finalized. After the finality predicate is satisfied, resources consumed by this transaction are marked as having been consumed, and resources created by this transaction can be consumed in future transactions (assuming that their consumption predicates are satisfied).
> Additional context on finality predicates: a separate predicate is used here to provide a notion of atomicity at the level of individual transactions. Finality predicates may, for example, wait for confirmation by a consensus quorum, or select between transactions which consume the same resources according to some selection criterion. Two consumptions of the same resource are considered to possibly conflict only if both of their finality predicates are satisfied.
> TODO: Figure out more clearly what data predicates in resource logics have access to. Instead of passing the whole physical DAG / transaction data, can we instead rely on a local view of "messages" / resources sent & received? Want to rely on local invariants - and e.g. even stuff like consensus should be expressible as a resource which was previously created (and is then witnessed, but not consumed).
A `Transaction` in a resource logic DAG simply consumes a (possibly empty) set of existing resources and creates a (possibly empty) set of new resources. Transactions are atomic, in that either the whole transaction is valid (and can be appended to / part of a valid resource logic DAG), or the transaction is not valid and cannot be appended to / included in the resource logic DAG. Transactions include an `extradata` field which can be read by the resource logic predicates and may contain signatures, proofs, etc.
Transactions are valid if and only if:
- all consumed resources were previously created by a transaction in the history, which was itself (recursively) valid, and whose `finalityPredicate` was satisfied
- the consumption predicates of all the resources consumed by the transaction are satisfied
- the creation predicates of all the resources created by the transaction are satisfied
- prefixes of any created resources are of the form `hash(logic)` for the resource in question
A resource logic DAG is valid if and only if:
- all transactions are valid by these conditions
- all transactions are included in the same order as their data in the physical DAG
From a particular resource logic DAG, an observer can calculate a _state_ as a key-value mapping by taking `key(resource)` and `value(resource)` for all resources created but not yet consumed in the history of that DAG.
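A minimal sketch of this state rendering, with a compact restatement of the types above (predicates omitted, `Ord` instances assumed so resources can live in sets, and "/" standing in for the domain separator):
```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Set (Set)
import qualified Data.Set as Set
import Numeric.Natural (Natural)

-- Compact restatement of the types above (predicates omitted).
data Resource = Resource
  { logic  :: ByteString
  , prefix :: ByteString
  , suffix :: ByteString
  , datum  :: ByteString
  , value  :: Natural
  } deriving (Eq, Ord)

data Transaction = Transaction
  { created  :: Set Resource
  , consumed :: Set Resource
  }

key :: Resource -> ByteString
key r = logic r <> "/" <> prefix r <> "/" <> suffix r

-- State: key/value pairs for all resources created but not yet
-- consumed (transactions here assumed valid and final).
stateOf :: [Transaction] -> Map ByteString Natural
stateOf txs = Map.fromList [ (key r, value r) | r <- Set.toList live ]
  where
    live = Set.unions (map created txs)
             `Set.difference` Set.unions (map consumed txs)
```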
> Note: For a multihop atomic swap, the transaction needs to refer to a consensus provider for finality, or to multiple consensus providers, at which point conflict resolution might need to be done post-hoc. If my pre-transaction (a transaction before finality criteria are met, roughly equivalent to an intent) specifies that resources need to be considered final with respect to the CP I chose, we get atomicity without risking the need for post-hoc conflict resolution.
> TODO: Work out how the Taiga representation is a special case of this. Ask Joe if it maps to Taiga's "creating virtual resources" approach.
### Linearity
Many kinds of resources may like to encode a notion of _linearity_, in that resources of that kind should only be able to be consumed exactly once, and any instances of consumption more than once encountered in a history should be detectable and addressable with an associated notion of _conflict resolution_. For example, contemporary blockchain protocols typically require linearity for representing scarce fungible and non-fungible assets, and resolve conflicts by the decision of a particular consensus algorithm.
- model enforces balance check in a transaction (w/encoded fungibility)
- model enforces recording of conflicts
- virtual resources created/destroyed in a single tx need not be recorded
- conflict resolution predicate encoded in consumption predicate - in order for resources created by a tx to be later consumed, conflicts in the history of that tx's resource graph must be resolved
#### Modeling non-linearity: virtual resources
- virtual resources are created & destroyed in a single tx, need not be recorded
### Handling conflicts
A conflict occurs when a linear resource created at one point `t_0` (logical time) in a resource logic DAG is consumed at two points `t_1` and `t_1'`, both downstream of `t_0` but unordered with respect to each other, in valid transactions whose finality predicates are otherwise satisfied. Later on, when `t_1` and `t_1'` are both seen by a node computing the current state frontier of a logical DAG, the conflict is detected, and before further transactions which reference both `t_1` and `t_1'` are appended to the logical DAG, some action must be taken to resolve the conflict.
Conflicts are detected by computing a unique nullifier for each resource, which is published when the resource is consumed. If the same nullifier is published twice in a history, this constitutes a conflict, and consumption predicates of resources in this application namespace can require that all conflicts in their namespace be resolved before those resources can be spent. Resolving conflicts might mean that some resources created on one or both sides of the conflict are rendered unspendable (meaning that their consumption predicates will never be satisfied).
For example, consider a non-fungible token with a conflict resolution rule depending on the local order of a particular consensus provider. The consumption predicate of that non-fungible token will require that all spends in its history be on the locally-first side of the ordering by the referenced consensus provider, so if a conflict is detected, the version of the NFT on the locally-second side will be unable to be spent in the future.
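A minimal sketch of nullifier-based conflict detection over a history (any nullifier published more than once constitutes a conflict to be resolved):
```haskell
import Data.ByteString (ByteString)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

type Nullifier = ByteString

-- Count nullifier publications across a history; any nullifier seen
-- more than once marks a double consumption of the same resource.
detectConflicts :: [Nullifier] -> [Nullifier]
detectConflicts published = Map.keys (Map.filter (> 1) counts)
  where
    counts :: Map Nullifier Int
    counts = Map.fromListWith (+) [ (n, 1) | n <- published ]
```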
Conflicts by resource type:
- conflict detection w/nullifiers
- somehow figure out the resource type
- keep a mapping of resource types to conflict y/n
- consumption predicates of resource types require conflict resolution of conflicts of their type
- conflict resolution can render some of the resources of that type unspendable
**CR rules determine linearizability of an application and should be checked for that by the compiler, if people write their own.**
Cases of CR:
- Resolved by some consensus provider (or ordered list of consensus providers)
- Specified at the application level (which prefixes the namespace for resources)
- Credit (fungible) - resource-level
- Let both sides of the double spend live, but broadcast the double spend / violation of announced credit issuance policy, and track reputation / slash other credit assets the originator holds
- when a double spend is detected at the resource level, require a transaction before any transaction from a particular identity is considered valid, and potentially slash the resource type on one side or both sides of the double spend
- NFT - resource-level
- vote on which side gets the real thing
- one side of the double spend is marked unspendable (as if it had been spent) at CR time
- Consumption predicate could enforce specific finality predicate, tying to the application, to make CR CP and finality CP be the same
Consumption predicates for linear resources will enforce that a detected conflict (double-spend) creates conflict tracking state, which must then be resolved before resources created downstream of either side of the conflict can be consumed.
(explain how conflict resolution predicate can be included in the consumption predicate)
(> TODO: which properties does a conflict resolution rule need to have and how do we verify/enforce them?
- deterministic
How do CR rules compose? Which types are composable, if they can contain different finality conditions, resolution approaches, etc.? First approach: require in an app that the underlying app's CR rule is satisfied, in a specified hierarchy.
TODO <)
- The `conflictResolutionPredicate` describes what to do when the resource in question is spent more than once in a particular history. After a particular resource has been spent more than once, the `conflictResolutionPredicate` must be satisfied by any history which builds on any of those spends (i.e. any transaction with transitive inputs including the "double-spend"). Once satisfied in a particular transaction, the conflict resolution rule does not need to be satisfied again for any of the histories built on the double-spent resource for this particular double-spend, but if another spend (of the same resource) is later detected, it will need to be satisfied again.
It is important to note that the abstraction of an `Application` is "virtual" - applications are not "deployed" or tracked in any sort of registry - they are content-addressed by particular resources. An `Application` can be said to exist by virtue of the existence of resources referencing it, and if those resources are completely consumed it no longer exists (except in history).
Applications can have "sub-applications", i.e. allow for their state to be consumed to create state with a different creation/consumption/conflict resolution predicate, without restriction.
> TODO QUESTION: in this framework, is `Consensus` an `Application`?
(TODO ordering here)
### Two Sided Transactions
Example: Payment request.
I send a payment request to a specific recipient, in which I specify e.g. accepted denominations, consensus providers I like, etc.
The counterparty can then send out a transaction using resources and predicates which fulfil the finality predicate of my intent, or make a counteroffer with different properties, a process which can be repeated up to e.g. a maximum number of rounds specified in the finality predicate.
This intent does not need to be solved by a third party since I am directly communicating with the counterparty I intend to transact with.
The conflict resolution rule specifies what should happen in the case of a linearity violation. The conflict resolution rule only comes into play if a linearity violation is detected, and it will require that all future history built on top of either side of the conflict satisfies this rule. For example, the conflict resolution rule may require redenomination of credit on the other side of history, a vote to resolve ownership of a scarce asset, etc.
More specifically, the above guarantees lead to the following desirable properties:
- Total ordering in case of equivocations inside the observed history
- Partial ordering for all other transactions
- Finality for the observed history
- in that local ordering is never "reverted"
- parties who accepted scarce assets which are later double-spent will have their semantics resolved by the conflict resolution rule
- Non-equivocation of "good" branches, as well as checking that conflicts have been appropriately resolved according to their CR rules
In a later step, conflict resolution rules can be applied, taking into account the total ordering.
(these are the actual consensus algorithms)
*logical DAG* guarantees:
- partial order on double spends
- messages have conflict resolution rules for parallel double spends: e.g. median timestamp, voting
- non-equivocation
- finality
---
IGNORE BELOW ME
---
# Messages / State
## TODO
TODO^: Explain how messages propagate locally through gossip
## Message Representation
For a new message in the chain to be generated, the previous message has to be consumed (i.e. it cannot be reused to send a new message in the chain).
Any observer can create a message. Messages consist of the following fields:
- key ([]bytes)
- value (bytes)
- predicate (function)
- ancestors (hash set) (possibly empty set)
- witnesses (hash set) (possibly empty set)
- nullifier secret commitment
From these fields a unique commitment and unique but unlinkable nullifier can be derived, as:
- commit(msg) = hash({all fields})
- nullifier(msg) = hash(secret, hash({all fields except nullifier}))
By way of brief explanation,
- `key` contains the message namespace
- `value` contains arbitrary data
- `predicate` defines the conditions under which a message can be consumed, which may include arbitrary data (no data/code separation)
- `ancestors` includes the messages consumed to create this message
- `witnesses` includes the messages (histories) witnessed by this message (choice of message creator)
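A minimal sketch of these derivations, with `hash` and the canonical field serialisations passed as parameters to keep the sketch self-contained (the `(secret, hash)` pair is modelled as concatenation here):
```haskell
import Data.ByteString (ByteString)

type Hash = ByteString
type Secret = ByteString

-- commit(msg) = hash({all fields})
commit
  :: (ByteString -> Hash)  -- the protocol's standard hash
  -> ByteString            -- canonical serialisation of all fields
  -> Hash
commit hash allFields = hash allFields

-- nullifier(msg) = hash(secret, hash({all fields except nullifier}))
nullifier
  :: (ByteString -> Hash)
  -> Secret                -- the nullifier secret
  -> ByteString            -- serialisation of all fields except the nullifier
  -> Hash
nullifier hash secret rest = hash (secret <> hash rest)
```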
Each node locally keeps a totally ordered log of messages it sends, as well as messages from other nodes which it derived messages from.
The content of an initial message must be created at a key derived from a PRF of the value (`{external_identity}/prf(value, predicate)`), such that two initial messages created at the same key necessarily have the same value & predicate (and thus are just the same message, with same commitment/nullifier/etc.). Initial messages must provide a signature valid for the external identity in the prefix of the key.
> Option: There must be only one initial message with a fixed value (nonce) of 0, which all subsequent messages must witness. This ~is the external identity and can optionally constrain future messages to be ordered or not. (useful for: consensus providers). Identity = Message = State, QED.
> Note: this is just like a default first predicate which checks the PRF, should be possible to explain more clearly.
> TODO_: In which cases do we want to permit initial messages to be created from nothing, or require to branch from bootstrapping state? (related to possible history-compressing optimisations)
Given a valid set of messages on the frontier (locally current "time"), we define the _state_ with respect to said messages as the mapping from keys to values, defined by this set (where the keys are guaranteed by the linearity validation to be unique).
A _namespace_ is a partitioning of the keyspace by external identity as defined above, such that only an observer who knows an internal identity can create messages in the namespace owned by that external identity. Messages without any ancestors must be created in the root namespace for a particular external identity, and knowledge of the corresponding internal identity is required to create such messages. Messages not in the root namespace for a particular external identity must be created by consuming other messages, whose predicates are checked as part of the validity check.
Messages can be kept locally or sent over the network. When sending messages over the network, they are encrypted to the recipient using `encrypt'`. Local state transitions are modelled as messages from an identity to itself.
Each participant keeps a (partially ordered) log of messages they created or interacted with. This way they can produce proofs to attest to their view of the state of the global namespace.
Considering the definition of state above, messages can be understood to encode `(key, value)` pairs, setting `key` to `value`, with the validity of the particular transition checked by the particular predicate.
Message validity rules ensure that the namespace fulfills the following properties:
- External Identities form the prefix set for the namespace.
- Ownership of suffixes is transferred by updating the predicate associated to them, allowing another identity to consume the message and "write" to the suffix.
- Complete identity ownership transfers can be performed by a special kind of message (requires consent of recipient by signature), which for anyone who is aware of it causes them to update their local validity rules such that the old identity and new identity prefixes are collapsed to a single one (identified based on the old identity, or possibly based on locally unique enumeration - future work)
- Disjoint suffixes (e.g. via protocol-specified PRFs) are used to enable asynchronous, distinguishable writes, where necessary. These can be enforced by predicates (e.g. one can mint a message with a predicate which allows another to consume it and write to their namespace, but only at a deterministically random and non-conflicting location). The input of the PRF should contain the address of the recipient / next transaction, so that in case of a merge, double spends get different suffixes and keep them through merging.
## Linear Resources
Since we are interested in tracking ownership and uniqueness of (some) state transitions, we model their behavior similar to a (simplified) linear logic. Write operations (implication) are assumed to be potentially non-local, while read operations are local and "infinite/free". Message validity checks uniqueness of nullifier spends in order to ensure linear consumption of resources in write implications.
Some state transitions, e.g. `write` operations, consume their inputs (LL: they output an additive conjunction), while others, like `read` do not (LL: they output a multiplicative conjunction), which is expressed in their type (TODO: do we want to call this type?).
Resources can be created from nothing in this cryptographically content-addressed way, and destroyed into nothing by sending them to the False Identity (terminal object).
### Borrow Checker / R/W Permissions
If any node gives permission to other nodes to operate on some state in its namespace, the following must hold:
- There can be any number of `read permissions`
- There can only be one outstanding `write permission` for any piece of state at a time.
Permissions can be given bilaterally between identities, or be application-specific; e.g. every identity running an application can write in the namespace of everyone else who runs it, under their own prefix, preventing collisions.
## Validity
To validate messages with respect to some known set of valid messages the following need to be checked (in order of importance):
1) Are all the predicates in the message chain valid?
2) Does the chain of backreferences point back to valid initial messages?
3) Are there any conflicts between the nullifier set of this history and the nullifier set of my known valid messages?
(Are linearity requirements, e.g. state consumed by additive operators is not reused, fulfilled?)
(Also checks that there are no double spends in this message's history)
(deals with disjoint histories only, so no nullifiers can be spent twice)
Check that either:
1) histories are disjoint, there are no nullifiers in common (spent twice)
2) message includes proof that if nullifiers were spent in both histories, they created the same messages.
In order to compose this proof, the sender of the message must know a recent witness set of mine so that they can create the proof (e.g. a block produced by a consensus provider), and they must have DA on the nullifier set for that witness set.
If the validity check passes, this message is added to my set of known valid messages. If the validity check fails, this message is rejected.
> Note: TODO+: double spend announcement / (local) conflict resolution
The state transition from the message is then `applied` by appending it to the local chain.
> LangSec Note: Parsers for predicates should be determined per application and be ones of the least computational power possible. Make predicates content addressed for permissioning.
> **Progress Note: Second pass has happened up to here.**
## Validity proofs
When a node creates a message, it produces proofs for validity of the predicates and recursive validity of the histories (PCD) for all of the ancestor messages consumed in this message, and includes proofs for the validity of the histories for all of the witnessed messages in this message. The node must know the immediate ancestor messages consumed in order to create proofs that their predicates are satisfied, but all of the other proofs (for their ancestors and for any witnessed messages) have already been created and included/verified in those messages and merely need to be recursively referenced (PoK of a valid proof for a history ending in this hash).
Here we use PCD (proof-carrying-data), where each valid message includes a proof of historical consistency (as defined above) for itself. Once a message is so created and the proof available, that proof can be verified by anyone who knows the commitment or nullifier for the message, so we can preserve privacy to subsequent nodes after the message is first consumed.
Messages contain a commitment to the complete spent nullifier set of their histories, which is appended to when the message is consumed to create a new message. A "merge proof" can simply be created by witnessing two histories in a subsequent message and proving that they do not conflict (then merging their nullifier sets in this subsequent message) (this is what consensus providers will roughly be doing). Messages also contain a commitment to the commitment set of their histories (this is the same commitment set as in the history of the message itself, just as a set instead of a DAG, so the structure doesn't need to be known for inclusion proofs)
> TODO: Clever tree structure which allows for this nullifier set to be appended to without computing over all of the past nullifiers.
When merging and replacing part of history, merge requests must provide a proof that the commitments which immediately descended from the double-spent nullifier in the branch of history they want to replace are added to the commitment blocklist, and subsequent messages must prove that their commitment sets do not have any intersection with the commitment blocklist (disallowing anything descended from those commitments to ever be appended to history).
### Succinctness
## Privacy
Make the validity proofs zero-knowledge
### Nullifiers
At minting, a commitment is created; at spending, a nullifier is created.
See IF: Privacy
TODOs:
- Figure out nullifier DA assumptions
- Figure out nullifier non-intersection proving synchrony assumptions
- Give up consistency when enough of the participants in a network decide that they want to give up consistency?
## Merging
### Merging State Histories
In this architecture, ordering is performed only when necessary, i.e. any two non-conflicting histories are either completely independent or already partially ordered with respect to one another, and can be merged without additional ordering.
In the case of a double spend (linearity violation), we must pick which branch of history to take. For this purpose, messages specify how nodes which become aware of conflicting histories involving them are to pick which history to accept, which must be one of the following options:
1. Reject both conflicting histories and any descendants (~ slashing for double-spends)
2. Pick according to some other arbitrary but verifiable rule
- Example 1: best price between conflicting usages submitted before some other message (~ in a block batch, interesting for competing solvers)
- Example 2: pick according to the results of a specified consensus quorum
- Including quorums to appeal to in the case that this quorum itself violates linearity
- Example 3: select randomly according to some other verifiable random beacon
Messages in an atomic operation must agree on which predicate should be used to pick between conflicting histories involving this operation (this is enforced by a system component of the message consumption predicate). As long as the predicate remains unsatisfied, nodes will refuse to merge conflicting histories (preserving their local order). In order to obtain eventual consistency, a deterministic, verifiable predicate must be provided - this ensures that if all participants eventually receive all messages, they will come to the same conclusion.
> LangSec Note: Provide a constrained DSL for describing conflict resolution?
> Note: idea: do PRF iterations on each branch of history in order to rank conflicting ones (completely randomly but deterministically) (proof-of-work in expectation)
> Or: Find a way of (locally, with respect to the history in question) robust peer sampling to do random ad-hoc consensus ~ slow Avalanche?
> Or: Figure out how to do "canonical" randomness beacons
#### ZKP Implications
- Need DA for nullifiers in merge proof
- Every message carries the nullifier set for its consumed inputs
- If any party triggers a merge, it needs to prove that it cut all the branches of its history which contain any of the "bad nullifiers".
TODO+: Merge with section below or remove
TODO^: plaintext merging trivial
> Quick note: ZKP merging - after a merge resolution, some nullifiers become blacklisted (pruned history), subsequent merges must prove that those nullifiers have no intersections (as if the other branch was spent)
### Invalid State Transitions
Predicate validity is checked by validity proofs, so messages violating predicates of prior messages can be trivially rejected. The only non-trivial case is that of linearity violations, which is handled by history merging.
### Consensus
Consensus algorithms can be incorporated into merge instructions as a specific predicate which checks e.g. a light client verification algorithm, a threshold signature, a simple multi-signature, etc. which forms the basis of an external identity.
## Finality
In this architecture, finality is separate from history merging, as for history merging we have eventual consistency, whereas decisions about finality must be made on the basis of partial information about the state of the system (observers can never know that messages they have not yet received do not exist). Finality is only necessary for making "out-of-system" decisions (such as whether or not to hand over a cup of coffee), since the system itself can resolve history conflicts as they arise.
### Consensus providers
Users can pick one or many consensus providers (where many can be re-composed into one external identity) whose quora they consider sufficient criteria for finality (on a per-message basis). Consensus providers can run any standard BFT consensus algorithm (in which they can reuse the existing message graph if desired, a la Bullshark).
### Example: Lightweight One-Way Finality Check
If no consensus is needed we can… (TODO+: complete this example)
## Conflict Resolution
There are several reasons why conflicts can arise:
1. Linearity Violations (e.g. double spends, post-hoc equivocation of ancestral state)
2. Predicate non-fulfillment
For both of these types, conflict resolution rules can be shipped via predicates.
Conflict resolution rules are:
- Application specific
- Each application (indicated by prefix) only uses one CR rule. This can be enforced by requiring the proof to record that a newly minted note carries the same predicate as its predecessors. Since state resources are also separated by application prefix, no inconsistencies can happen between CR rule application and resource consumption.
The default CR rule is local order first.
### Example CR rule: Credit
Assume an equivocation on ancestral state by E (the Equivocator), who transacted with A first and now issues transactions with B, re-using the same resources. (E, A and B could be accounts or consensus providers; the semantics don't change, only the merge mechanism.)
A and B (or the subnetworks they provide consensus for) now each produce internal history which is inconsistent with the history in the other subnet.
```mermaid
stateDiagram-v2
[*] --> E
state fork <<fork>>
E --> fork : equivocation
fork --> A0: first
fork --> B0: second
B0 --> B1
B1 --> B2
A0 --> A1
A1 --> A2
state join <<join>>
A2 --> join
B2 --> join
join --> Merged : slash
Merged --> [*]
```
One naive CR rule would be to slash all "bad state", in which case we would need to prune all subtrees which have EQV state as an ancestor.
If we prefer to keep the histories, we can burn and remint credit stemming from the EQV (assume we are on the side of AX and BX is merged into our history).
After the merge, in addition to E credit, E' credit exists. A variety of rules on how to do pricing can be imagined, for different tradeoffs:
- 1:1 for all assets --> inflates E credit
- 0.5:1 for E' : E --> keeps value post merge equal, punishes E credit holders
Further punishment, beyond slashing, can be applied to E and should be voted upon by affected parties.
After AX and BX have each had the other side merged into their history, E, E'A and E'B will exist, all of which are mutually non-fungible.
For every transaction of which not all downstream transactions are merged, we need to inform the recipients. There could e.g. be forced swaps before spending tainted assets, exchanging E proportionally for any E' (depending on the height of branching?) to resolve. This could happen during a finality period as well.
> Note: For granular finality periods: Use an application specific consensus provider.
After finality, a message containing the conflicting subtrees, as well as the resolution is published, s.t. eventual consistency is achieved.
Users should be able to specify intentional "double-spends" in finality conditions, in case of submission to multiple solvers etc.
### Example CR rule: NFT
In the case of non-fungible assets, a vote needs to happen during finality to decide which transaction is the true one going forward. For high value transactions, one might want to wait for finality, and possibly do an atomic swap during a finality period or use another escrow mechanism.
We care about modeling material scarcity of the underlying assets correctly.
### Example CR rule: Deterministic PRF
If the network wants shared randomness which is deterministic given a seed, including replay of numbers, one way would be to set the CR rule to "None", enabling state resources to be repeatedly consumed.
Here R is a transaction containing the PRF in its predicate; since the content of the transaction depends only on the previous state it consumes, RX == RX' == RX'' and merges are free.
```mermaid
stateDiagram-v2
[*] --> R0
R0 --> R1
R1 --> R2
R1 --> R2'
R2' --> R3'
R2' --> R3''
R3'' --> R4''
R4'' --> R5
R3' --> R4
R2 --> R3
R3 --> R4
R4 --> R5
R5 --> [*]
```
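A minimal sketch of this replay property, assuming a shared seed agreed out of band (the HMAC construction and names are illustrative):

```python
import hashlib
import hmac

def prf_step(seed: bytes, prev_state: bytes) -> bytes:
    # Each iteration depends only on the consumed previous state, so
    # replaying either branch of the diagram above yields identical
    # values (R2 == R2' == R2'') and merges are conflict-free.
    return hmac.new(seed, prev_state, hashlib.sha256).digest()

seed = b"shared-seed"  # hypothetical, agreed out of band
r1 = prf_step(seed, b"R0")
assert prf_step(seed, r1) == prf_step(seed, r1)  # both branches agree
```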
## Consensus Providers
Consensus providers provide liveness, merge histories, and resolve conflicts on joins of logical DAGs.
---
old
- Create the message DAG
- Consensus algorithms are validity checks on the message DAG
- Validity checks can include a total order at least like
### Punishment
E.g. slashing
TODO: elaborate in consensus / merge section
---
# Credit
One of the primary motivations for building this system is to use it as a basis for a scale-free credit system, but the credit system also provides us with useful features to organize network topology and compute proxies for trust between peers.
The credit system is one of the core `applications` being run, and credit operations are encoded as message types.
The goal here is to enable every pubkey to mint credit in its name and exchange it freely with other parties, who can decide to redeem it with the originator at a later point in time, store it, or barter with it.
## Transfer
If `alice` wants to transfer credit of some type (which she holds and knows `charly` is willing to accept), she can send a message updating the balance of that type to `charly`.
If she does not hold credit `charly` is willing to accept, there are multiple alternatives:
- She sends an intent message, stating she wants to acquire charly credit, to the gossip layer.
- Nodes who are willing to perform swaps `X to charly` report back to the previous hop (TODO0: and maybe her?), stating `(amount, price)`. If a full swap chain back to alice can be established she can decide to execute it.
- _WIP_: She sends a message stating she is willing to acquire charly credit up to some price. Should a path be found via gossip, it can be resolved in place. Prices will be suboptimal and it will only be considered settled once higher level consensus is established (TODO0: the canonical for that provider needs to be specified).
To be consistent with one write permission per "state at some suffix" / "resource" (TODO+: find a better name) and still enable asynchronous transfers, all credit state updates from some `minter` that get sent to `receiver` (with `minter` and `receiver` being `pubkeys`) get updated under the namespace `receiver.minter`, with each piece of credit having its own suffix.
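As a small illustration of the namespacing (the key encoding is hypothetical):

```python
def credit_key(receiver: str, minter: str, suffix: str) -> str:
    # Every piece of credit minted by `minter` and held by `receiver`
    # lives at its own suffix under `receiver.minter`, so `receiver`
    # holds the single write permission for that state.
    return f"{receiver}.{minter}.{suffix}"

# e.g. credit minted by charly, held by alice:
# credit_key("alice", "charly", "0") == "alice.charly.0"
```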
TODO+: How to prevent observability of message chains? Can we have (sufficiently) non-colliding PRFs? Make PRF space the size of nullifier space, s.t. collision resistance does not rise.
### Double Spending
Double spending would constitute a linearity violation in our model. Upon merging two non-disjoint histories, a double spend across histories can be detected.
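A sketch of this check, representing each history by the set of nullifiers it has consumed (illustrative names):

```python
def double_spends(nullifiers_a: set[bytes],
                  nullifiers_b: set[bytes]) -> set[bytes]:
    # A nullifier present in both histories means the same resource was
    # consumed on both branches: a linearity violation.
    return nullifiers_a & nullifiers_b

def can_merge_trivially(nullifiers_a: set[bytes],
                        nullifiers_b: set[bytes]) -> bool:
    # Disjoint histories can be merged without conflict resolution.
    return not double_spends(nullifiers_a, nullifiers_b)
```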
## Incentive Alignment
We assume that nodes who exchange a lot of credit (or do so often; TODO+: define metrics/proxies for trust here) are economically closer to each other.
They also progressively align incentives: if `A` holds a lot of credit of `B`, `A` will be interested in `B` doing well. Thus, a node which is closer in the credit metric is more trustworthy, because the cost of it defecting is higher.
# Scalefreeness and Scaling
The goal of this protocol is to provide unified semantics for interactions between public keys, whether individuals or group entities are behind them, as well as a unified way to scale up the size of the subnetwork which validates state.
The tradeoffs inherent to the system described here are:
With increasing distance, the efficiency of transactions goes down if reliability is to stay equal.
With increasing distance, privacy goes down if efficiency is to stay equal, but some efficiency can be traded for more privacy.
Since our model has `trust` or `entanglement` as a central notion, we get natural scaling effects: the higher the `entanglement`, the better the tradeoffs. Since less `entangled` interaction happens less often, costly interactions are rarer than cheap ones.
## Consensus
TODO^: Update to "nodes pay for consensus provision" model
TODO+: For what are consensus providers paid?
- Including transactions in blocks
- ...
To cover the spectrum reaching from local/trusted transactions to long-range/untrusted transactions, we want methods to increase the trustworthiness of message logs progressively.
One phase transition in trustworthiness happens when we involve consensus providers, instead of relying only on messages received from the gossip network.
Since we want permissionless consensus hierarchies, we introduce the notion of ad-hoc consensus:
A consensus provider `CP` consists of some peers which observe message traffic on the gossip layer and run (heterogeneous) BFT consensus on the messages of one interval to decide on a total order for all intersecting histories. Non-intersecting histories are preserved as a partial order per interval.
`CP` produces a partial order with timestamps (block height) on each message in it.
Double spend protection is linear in the cost of messages, since only emptiness of the intersection of nullifier sets, or fulfilment of linear resource constraints, needs to be checked. (See IF: Scaling)
### Ad-Hoc Consensus Sketch
If nodes `N` and `M` who don't have existing trust relations / paths in the gossip network want to transact, they can do so with the help of consensus providers.
Each consensus provider `C` consists of a set of acceptor nodes `A`, which include transactions of nodes visible to them. The `pubkeys` of `A` are revealed. `N` and `C` together can produce a proof that `C` has a view of `N`.
`N` can then reveal to `M` a (sub)set of consensus providers for itself, from which `M` can choose the one they trust the most, e.g. where the `A` of `C` contains pubkeys trusted by `M`. Other criteria for selection might be the density of transactions by `N`, freshness of data compared to other providers, etc.
Using this model, consensus providers can be ad-hoc views of the gossip layer, without requiring separate or distinct gossip networks, which would require some sort of governance or permissioning to delineate the boundaries.
`CP` also performs validity checks on the merged histories, to detect e.g. double spends in the credit application, or more generally, violations of the resource constraints.
Nodes need to pay consensus providers to have their transaction included. If `CPs` want to give free transactions to certain nodes, they can mint and distribute credit for that.
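A sketch of one possible selection rule for `M`, using only the overlap between revealed acceptor keys and `M`'s trusted keys (the maximization rule and names are assumptions; density and freshness criteria could be weighed in similarly):

```python
def choose_provider(acceptor_sets: dict[str, set[bytes]],
                    trusted: set[bytes]) -> str:
    # M picks the consensus provider whose revealed acceptor pubkeys
    # overlap most with the keys M already trusts.
    return max(acceptor_sets, key=lambda cp: len(acceptor_sets[cp] & trusted))
```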
#### Zero Knowledge Consensus Provision
Every `CP` holds a snapshot of the current messages seen by its constituent nodes.
If any of these messages contain illegal state transitions, they are dropped and the punishment procedure is started locally.
For each valid message, the following happens every round:
1) The state transition is included in the validity proof.
2) The nullifier is included in the nullifier set corresponding to the validity proof.
After each round, the `CP` sends out two messages:
1) Contains the current validity proof and a commitment to the nullifier set.
2) Contains a block ID and the nullifier set.
The nullifiers then get recorded by some storage layer, or by the `CP` itself.
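A minimal sketch of the two end-of-round messages; the hash-of-sorted-set commitment is a stand-in (a real `CP` would more plausibly use a vector or Merkle commitment), and all names are illustrative:

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class RoundMessages:
    validity_proof: bytes         # message 1: proof of the round's transitions
    nullifier_commitment: bytes   # message 1: commitment to the nullifier set
    block_id: bytes               # message 2: block ID
    nullifiers: frozenset[bytes]  # message 2: the nullifier set itself

def end_of_round(proof: bytes, block_id: bytes,
                 nullifiers: set[bytes]) -> RoundMessages:
    commitment = hashlib.sha256(b"".join(sorted(nullifiers))).digest()
    return RoundMessages(proof, commitment, block_id, frozenset(nullifiers))
```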
#### Merge between ZK Consensus Providers
If two CPs `P` and `Q` want to merge histories, `P` computes the intersection of their nullifier sets. If it is empty, it can just compute the validity proof for the merged history, resulting in a third CP with a shared view `PQ`. (They check each other's proofs, but don't need access to messages.)
If the intersection is non-empty the following happens:
1) `P` looks up the earliest block ID with a conflict
2) `P` replays the `(backreference, nullifier)` pairs and performs linearity checks on the merged tree, to detect conflicts
3) `P` drops the branches of its tree which contain conflicts
4) `P` recomputes a chain of validity proofs for the valid (and resolved) branches (and the merge), using `(br, nullifier)` pairs.
5) `P` sends the `end of round` messages for the merged tree to `Q`, which can decide to reject, or accept and merge (as well as store the nullifiers).
Since the `CP` that initiates the merge performs the computation, and the one that receives it benefits from a larger subset of consensus validity, incentives are aligned here (TODO+).
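A sketch of steps 2-3, summarizing each branch of `P`'s tree by its nullifier set and eliding proof recomputation (step 4); names are illustrative:

```python
def prune_conflicting_branches(p_branches: dict[str, set[bytes]],
                               q_nullifiers: set[bytes]) -> dict[str, set[bytes]]:
    # Keep only the branches of P's tree whose nullifier sets do not
    # intersect Q's ("cut all branches containing bad nullifiers").
    return {name: ns for name, ns in p_branches.items()
            if not (ns & q_nullifiers)}
```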
### Higher Order Consensus
To scale trust up, we can also run consensus between consensus providers and merge their histories, to extend the "no double spend/no illegal state transitions" guarantee to hold for the set of all the nodes these provide consensus for.
Lower trust transactions can be executed this way, with finalization time and compute complexity rising with the number of consensus layers which need to be queried.
We assume that nodes of consensus providers have bandwidth, storage and compute capabilities, proportional to their layer height.
# Data Availability Layer
## Interface
The DA Layer provides the following interface:
`publish(value)` to publish content to the DA Layer.
`retrieve(key)` to retrieve values with a given `key`.
`test(key)` to receive a proof of data availability.
with `key = hash(value)`.
Erasure coding can be used, s.t. out of a set of `n` nodes, any `k` can reconstruct a given blob.
Encryption happens out of band, that is nodes encrypt values before publishing.
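An in-memory sketch of this interface; `test` is reduced to a boolean here, whereas a real layer would return a proof of data availability and erasure-code blobs across nodes:

```python
import hashlib

class DALayer:
    def __init__(self):
        self._store: dict[bytes, bytes] = {}

    def publish(self, value: bytes) -> bytes:
        key = hashlib.sha256(value).digest()  # key = hash(value)
        self._store[key] = value
        return key

    def retrieve(self, key: bytes) -> bytes:
        return self._store[key]

    def test(self, key: bytes) -> bool:
        # Placeholder availability check; a real test would produce a
        # proof of data availability rather than a boolean.
        return key in self._store
```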
## Constraints
Optimize for retrieval latency (by running a query and measuring response time).
## Payment for Storage
Options:
- Store up to N MB for friends for free forever.
- Take M credit of denomination X per MB/hour.
### Continuous Payment
A revocable credit note which enables the holder to perform continuous withdrawal. It gives permission to mint a certain amount of issuer credit per interval, until it is revoked.
DA tests are performed at regular intervals and, should they fail, the note is revoked.
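A sketch of such a note as a data structure (field names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class ContinuousPaymentNote:
    # Revocable note: lets the holder mint `rate` issuer credit per
    # interval until the issuer revokes it (e.g. after a failed DA test).
    issuer: str
    holder: str
    rate: int            # credit mintable per interval
    revoked: bool = False

    def withdraw(self, intervals: int) -> int:
        return 0 if self.revoked else self.rate * intervals
```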
## Choosing DA Nodes
Pick DA layers / node sets according to the best approximation of incentive alignment regarding cost in bandwidth, compute and storage vs. the benefit of the outcome, per function / application, e.g.:
- `CP` needs to do `DA` for nullifier sets to enable nodes, or other consensus providers who want to merge histories, to perform linearity checks.
- Choose `DA` depending on who is expected to retrieve a given message.
TODO^:
- better anonymity set if publishing to bigger DA layer
- blind TTL
- DA layer has k of n erasure coding
- interactive DA test
- DA shouldn't require consensus
- sender and recipient should have nodes in an overlap
# Solvers
To match intents, e.g. for credit swaps or for state transitions which require resource swaps, some kind of computation which can be verified after the fact needs to be performed.
## Privacy Efficiency Tradeoff
Computation (for now) needs to happen in plaintext, which can have a privacy cost, e.g. for rare predicates or preferences, even if the originators of the predicates are unknown.
When aggregating over larger sets of intents, better prices for credit swaps and higher matching rates for other state pairs can be achieved, while incurring a higher system-wide privacy loss from central visibility. In which cases do nodes benefit from wider broadcasts?
## Local Swaps (Credit Case)
If Alice wants to swap `A` for `E`, she broadcasts this intent, in the form of a limit order (or with a price curve) and a commitment to the network. Nodes who accept `A` forward this to all nodes who accept any credit they are willing to swap to. If any node is willing to swap any of the incoming credit to `E`, it announces path termination to the previous hop (and so forth), with the nodes closest to Alice reporting prices for the paths behind them back to her. She then chooses the best price path.
If route selection is competitive, and nodes have an incentive to be included in the routes (to claim a spread/reward), pricing for swaps should be roughly incentive compatible. (TODO+: Strengthen argument.)
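A sketch of the price selection step, assuming each intermediary quotes an exchange rate (including its spread) for its hop and complete paths are reported back to Alice (names are illustrative):

```python
from math import prod

def path_price(hop_rates: list[float]) -> float:
    # The effective A -> E price is the product of the per-hop rates.
    return prod(hop_rates)

def best_path(candidates: dict[tuple[str, ...], list[float]]) -> tuple[str, ...]:
    # Alice picks the cheapest of the terminated paths reported back.
    return min(candidates, key=lambda hops: path_price(candidates[hops]))
```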
## With Consensus
Swap intents include consensus providers we use to decide finality of the swaps.
Alice wants to swap `A` to `B` credit.
Bob proves (pointing to `CP`) to Alice that he holds enough `B` (limit order or swap curve).
Alice's intent is binding, and if Bob submits it to `CP` it is deemed final upon block inclusion.
If Bob wants to offer swaps for more than one path, he can broadcast new messages, with swaps pointing to the same credit. The consensus provider then picks the first message in a path that settles to execute and drops the other paths.
Bob splits up the total amount of credit he wants to use for liquidity provision per block.
Also: Any node can stake LPs with a CP. The CP can participate in distributed solving like a (set of) regular node(s), or it can do local solving if it holds compatible LPs for the swap chain. Nodes who staked the LPs can override swaps with the same consensus provider by updating LPs after they do swaps themselves.
The CP can also do local solving and produce whole subpaths for the swap chain.
Swap requests with whitelists of nodes could be sent to trusted nodes, if swaps should be solved without revealing them to external parties.
## Computational Power of Intents
TODO+: Should intents be turing complete? Are application specific sublanguages a good approach to bounding solve time?
## Intent Privacy
Out of Scope for now, maybe possible via MPC or FHE at cost of implementation complexity.
# Intent Solving
We exchange privacy for matching set size and price.
Very specific intents can leak information.
Support users in approximating their preferences/price sensitivity/matching size preference, with a process similar to iterated fair (combinatorial) division.
Request solver behaviour:
- (Don't) forward this intent
- Sync with (these) other solvers
- Entanglement boundaries for counterparties
- Match my intent locally for privacy (assuming solver is honest)
Human Trustees:
- Have specified roles
- Enable out of band verification e.g.
# Further Research
- The concurrent state model we describe seems to have a natural formulation in the join-calculus, by expressing merges as a join primitive.
- Linear Logic: Investigate if proofnet representations can help to determine useful equivalences.
- By performing merges + proofs we could save on nullifier set size, but only for disjoint subgraphs. This could be useful for cheap fast forward consensus updates with weaker guarantees.
# Notation
TODO suffixes in order of descending urgency:
TODO^
TODO+
TODO0
TODO-
TODO_