Zero Trust Computing Architecture for Data Management: How to support Secure Async Data Flow Routing in KERI enabled Applications

See also the discussion of field normalization here:
https://hackmd.io/XfdKjT3ZQDi1M6Iv3iYhbg

Data Management in KERI

KERI is about managing verifiable data structures. When data is part of a verifiable data structure we can make strong security guarantees about that data as derived from the verifiability guarantees of the data structure itself. The principal verifiable data structure in KERI is a KEL or KERL.

Data may be directly embedded in a KEL or it may be anchored to a KEL using a cryptographic digest or SAID (self-addressing identifier). A SAID is a self-referential digest used as an identifier. Given that the cryptographic strength is sufficient, any digest-anchored data has the same verifiable security guarantees as the embedded data from which it was derived.
A SAD (Self-Addressed Data) item is a serialization of a data item that includes its SAID. A commitment to the SAID of a SAD is cryptographically equivalent to a commitment to the SAD itself.
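
The relationship between a SAD and its SAID can be illustrated with a minimal sketch. It is a simplification rather than the normative derivation: it assumes compact JSON serialization, a BLAKE2b-256 digest from the Python standard library, a 44-character `#` placeholder, and a one-character code `F` prepended to the Base64 text. The function names `saidify` and `verify_said` are hypothetical.

```python
import base64
import hashlib
import json

PLACEHOLDER = "#" * 44  # assumed dummy value, same length as the final SAID
CODE = "F"              # assumed one-character derivation code for this sketch


def _digest(sad: dict) -> str:
    """Digest the compact JSON serialization and encode it as 44 characters."""
    raw = json.dumps(sad, separators=(",", ":")).encode("utf-8")
    dig = hashlib.blake2b(raw, digest_size=32).digest()
    # A zero lead byte pads 32 bytes to 33 so Base64 needs no '=' padding.
    # The first Base64 character is then replaced by the code.
    b64 = base64.urlsafe_b64encode(b"\x00" + dig).decode("ascii")
    return CODE + b64[1:]


def saidify(data: dict, label: str = "d") -> dict:
    """Return a copy of data whose label field holds its own SAID."""
    sad = dict(data)
    sad[label] = PLACEHOLDER
    sad[label] = _digest(sad)
    return sad


def verify_said(sad: dict, label: str = "d") -> bool:
    """Recompute the SAID over the placeholder form and compare."""
    dummy = dict(sad)
    dummy[label] = PLACEHOLDER
    return sad.get(label) == _digest(dummy)
```

Because the digest covers every field of the SAD, a signature over the SAID alone also commits the signer to the full SAD, which is the equivalence described above.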

The KERI protocol employs several types of cryptographic commitments to serialized data. Typically a cryptographic commitment is a non-repudiable digital signature on that serialized data. These are labeled commitment Types 1-5 in the following list:

1. Commitment by an event in a KEL: This is either event data in a KEL or a data seal anchored in a KEL. This is the strongest type of commitment as it must be made by a controller at a given key state. Both the key state and the commitment (anchor) are verifiable as part of the KEL. The data or an anchor (seal) are equivalent from a security perspective. An anchor has an additional availability requirement besides that of the anchoring KEL, whereas data embedded in the KEL has the same availability as the KEL itself. Ordering of updates to Type-1 data is determined by the order of appearance of the data or its anchor in the KEL.
2. Commitment by an identifier that is committed to by a KEL, where the committed-to data is not embedded or anchored in the KEL. Type-2 data is essentially a form of authorized data that may be authenticated to the committed-to identifier. This is a weaker form of commitment because it only commits to an authorized identifier in the KEL. An example would be a signature on a message body or a SAID or SAD or some other type of message where that signature is provided by some entity that is authorized by the KEL via a commitment in that KEL. Besides the controlling identifier prefix of a given KEL, other identifiers committed to in that KEL such as a witness, backer, or registrar identifier are examples of authorized entities that may make Type-2 commitments. Ordering of updates to Type-2 data is determined by a monotonic date-time of the updating entity.
3. Commitment to data by an identifier that is NOT committed to by the KEL, where the committed data is also not anchored in the KEL. This requires trust in the identifier or some other process to authorize or designate this identifier as trustworthy. A signature on a SAID or SAD or some other type of message where that signature is provided by some entity that is not committed to (i.e. authorized) by a KEL forms a commitment at this level. A commitment to data made by some entity in the wild is an example of a Type-3 commitment. Ordering of updates to Type-3 data is determined by a monotonic date-time of the updating entity.
4. Commitment to a generic data envelope via a signature on the envelope, not on the embedded data independently of the envelope. This type of commitment may be made by a signer that is authorized by a KEL. The unique characteristic of this type of commitment is that the data serialization is ephemeral. It is meant to be thrown away after its use. The data may be used or kept in another form but the serialization is not meant to be kept. This is to support truly ephemeral data communication such as a query for other data. Signing the serialization is a way to authenticate the request but once processed the request is obsolete. Likewise responses that use generic envelopes indicate that the serialization of the envelope is meant to be discarded and any data is either of temporary use or is combined or stored in another form besides the serialized envelope. Although each envelope may be uniquely identified by a digest of the whole envelope, the envelope itself is meant as a generic temporary carrier of data. Otherwise, a SAID, or SAD, or Event should be defined to indicate data serializations that are not meant to be ephemeral, that is, not meant to be thrown away. A generic data envelope could convey a SAD (with SAID) as a payload of the envelope. In this case a commitment to the SAD or its SAID should be used instead of a commitment to the serialized envelope. Ordering of updates to Type-4 data is determined by a monotonic date-time of the updating entity.
5. Commitment to a generic data envelope via a signature on the envelope, not on the embedded data independently of the envelope. This type of commitment may be made by a signer that is NOT authorized by a KEL. This is identical to the previous classification except that the signer must be trusted or must be authorized by some process other than a KEL. Ordering of updates to Type-5 data is determined by a monotonic date-time of the updating entity.
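
The five commitment types can be summarized as a small decision procedure. The sketch below is illustrative only: the enum member names and the three boolean inputs are hypothetical labels for the distinctions drawn in the list, not identifiers from any KERI library.

```python
from enum import IntEnum


class CommitmentType(IntEnum):
    KEL_EVENT = 1              # embedded in or anchored (sealed) to a KEL
    KEL_AUTHORIZED_ID = 2      # signed by an identifier the KEL commits to
    UNAUTHORIZED_ID = 3        # signed by an identifier no KEL commits to
    ENVELOPE_AUTHORIZED = 4    # ephemeral envelope, signer authorized by a KEL
    ENVELOPE_UNAUTHORIZED = 5  # ephemeral envelope, signer not authorized


def classify(in_kel: bool, signer_authorized: bool,
             ephemeral_envelope: bool) -> CommitmentType:
    """Map the three distinctions from the list onto Types 1-5."""
    if in_kel:
        return CommitmentType.KEL_EVENT
    if ephemeral_envelope:
        return (CommitmentType.ENVELOPE_AUTHORIZED if signer_authorized
                else CommitmentType.ENVELOPE_UNAUTHORIZED)
    return (CommitmentType.KEL_AUTHORIZED_ID if signer_authorized
            else CommitmentType.UNAUTHORIZED_ID)


def ordered_by_kel(ctype: CommitmentType) -> bool:
    """Only Type-1 data is ordered by position in the KEL.
    All other types rely on a monotonic date-time of the updating entity."""
    return ctype is CommitmentType.KEL_EVENT
```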

Current Facilities for communicating Data

Non-Enveloped Data Specific Typed Messages

icp, rot, ixn, ksn, etc. are fixed format (field composition) messages of various types.

Interactive Exchange `exn` Messages

exn messages are part of an interactive exchange. An exn may be used to either solicit action or be a response to a solicitation to action by another exn. The exn was created to enable multi-step interactions.

Currently exn message envelopes use q for the payload. We suggest changing this to a for attributes (data payload) so as to not confuse it with the q query modifiers block in a qry message. This would also allow an exn to have an optional q in addition to the a, where the q provides some additional modifiers that are better adapted to a ReST endpoint model. This would better support a multi-protocol implementation that includes other protocols like TCP and UDP in addition to HTTP ReST, but where the ReST model is mimicked by the other protocols.

Non-interactive Query Message Envelope

qry messages solicit action. Currently the only way to reply to a qry message is with a typed message. Previously there was no generic envelope for communicating data in reply to a qry besides creating a new message type for each type of reply to a qry.
One option would be to allow an exn to be sent as a reply to a qry message. This seems out of place with the intent of an exn as a multi-step interaction. Having two types of workflows, one that uses both qry and exn and another that uses only exn, would be confusing.

Suggested new components

Non Interactive Reply Message Envelope

rpy is a reply message envelope that serves either as a solicited reply to a qry query message or as an unsolicited message to transfer data. This would enable any type of data to be exchanged without requiring a dedicated message type for each type of data. The reply rpy provides a generic envelope independent of the data payload. The rpy message may be used in both solicited and unsolicited mode. In the latter there is not a one-to-one correspondence between a rpy and some qry. In other words, a rpy may be triggered independently of a qry. This allows push communications or other asynchronous communication of enveloped data payloads.

An open question arises around the cases wherein a rpy message could be a valid mechanism in the context of an exchange using exn messages. An exn could trigger a rpy but should only do so as a side effect. The rpy so triggered should not be part of the interaction protocol. Any messages that are directly part of the exchange protocol should use exn messages, not rpy. A reply rpy could be triggered as a side effect of an exn but the rpy should not be an explicit step in a defined exchange protocol.

Message Data Conveyance Lifecycle Issues

Any new data block, that is not part of an interactive protocol, starts life as a data payload of a rpy envelope. At some point if the data block is important enough or common enough that it would be more optimal to have a typed message for that data payload, then a new message type may be created that is dedicated to that data. Another reason to create a dedicated message type instead of using a rpy envelope is if the commitment to the data in the envelope is meant to persist and be forwarded to some other entity or reused outside of the context of the envelope.

Route Field

In general the r field of a rpy, qry, exn or ksn message acts to route the data payload of the associated envelope or message. This allows message instance specific handling of the enveloped data. The route field value is a namespace so the routes are not practically limited. As a result, the route field is a more general and more extensible mechanism than using data specific message types. Moreover, although a ksn is a typed message, because there are multiple reasons or actions that may trigger a ksn message, a route r field in a ksn provides a way to differentiate the reason for a given ksn and thereby direct it to the correct handler.

Reply Route Field

The rr field is the reply route field (route of reply), which lets solicited replies be routed to a data flow destination on the initiator's (solicitor's) side. This is a new required field in a qry message but may be empty. When the rr field in a qry is not empty then the associated rpy will have its r field set to the rr of the triggering qry. This enables the querier to indicate how to route the resultant associated but asynchronous reply rpy message within its computing infrastructure.

In a qry message one could view the r as the destination handler of the query and the rr as the intended destination handler of the reply back to source of the qry.

Likewise exn messages could have optional rr fields that may be useful in more complex interaction protocols where the logic may branch and an explicit next route for the next exn must be provided by the preceding exn.

An r field will be added to the ksn message so that a ksn may be the routed response to a qry message with a non-empty rr field.
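
The interplay of r and rr can be sketched as follows. This is a simplified illustration, not a reference implementation: the `Router` class and `make_reply` helper are hypothetical, and the version string, SAID and signature attachments of real messages are omitted.

```python
from typing import Callable, Dict

Handler = Callable[[dict], None]


class Router:
    """Maps route strings (the r field) to data flow handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, route: str, handler: Handler) -> None:
        self._handlers[route] = handler

    def dispatch(self, msg: dict) -> bool:
        """Deliver msg to the handler for its r field, if one is registered."""
        handler = self._handlers.get(msg.get("r", ""))
        if handler is None:
            return False
        handler(msg)
        return True


def make_reply(qry: dict, data: dict, dt: str) -> dict:
    """Build a rpy whose r field is the rr of the triggering qry."""
    return {"t": "rpy", "dt": dt, "r": qry.get("rr", ""), "a": data}
```

On the querier's side, registering a handler for the same string that was sent as rr is what lets the asynchronous reply find its way to the right data flow, independent of the transport that carried it.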

New Granular Commitments with New Attachments

The signed envelopes exn and rpy are meant to provide authentication with stale or replay attack protection for the envelope and its embedded data. When the data is ephemeral the envelope and attached signature may be thrown away once the message is authenticated. When the data is persistent, its authentication must be able to be re-established or re-proven. When this proof must be conveyed forward to an external party it may require embedding it in another envelope, thus resulting in nested envelopes. This may be verbose, cumbersome, or confusing.

One alternative would be to define a unique message type for each embedded configuration of data. This may be problematic because it may require too frequent versioning of the protocol to add the new message types, or lead to an explosion in message types.

In general, in order to comply with Zero Trust Computing principles any data that is durable should be re-authenticatable, i.e. the information needed to authenticate it (such as signatures and signed serializations) should also be durable so that the authenticity may be re-established. Thus some more convenient mechanism besides re-enveloping or unique message types may be desirable. A granular construction of embedded data with correspondingly granular commitments would enable more granular data management without nested envelopes or a multiplicity of new message types.

The proposed solution provides granular commitments to an embedded SAID or SAD in an envelope (exn or rpy) using a new attachment type. There are two types of attachments based on the signer prefix type and each commitment may be to one of either a SAID or SAD.

SAID Commitment. A SAID is a Self-Addressing IDentifier. Essentially a self-referential content addressable identifier.
SAD Commitment. A SAD is Self-Addressed Data. A SAD must contain a SAID for that SAD. The SAID has as its target its associated SAD.

Given that the data payload of an envelope, rpy or exn is a SAD with an embedded SAID, or is merely a SAID then an attachment to the envelope could convey an independent granular commitment to that SAD or SAID.

Non-Transferable Identifier Granular Commitment

The attachment for a non-transferable identifier based commitment to a SAD or SAID has the following primitives in CESR format.

SAID, Prefix, Signature

The SAID is that of the associated SAD. The SAD with embedded SAID may be included in the envelope or the SAID may be included in the envelope and the SAD provided elsewhere (such as in an attachment).
The Prefix is the non-transferable identifier prefix of the committer (signer).
The Signature is the signature by the private key of the Prefix on the enveloped SAD or SAID as appropriate.

Transferable Identifier Granular Commitment

The attachment for a transferable identifier based commitment to a SAD or SAID has the following primitives in CESR format:

SAID, Prefix, SN, Digest, Indexed Signature(s) Group

The SAID is that of the associated SAD. The SAD with embedded SAID may be included in the envelope or the SAID may be included in the envelope and the SAD provided elsewhere (such as in an attachment).
The Prefix is the transferable identifier prefix of the committer (signer).
The SN and digest are the sequence number and digest in the KEL of the Prefix of the event that establishes the authoritative signing keys used to create the signatures in the Indexed Signatures Group. These signatures are made with those authoritative keys.
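
The two attachment shapes can be pictured as plain records. The sketch below only models the primitives listed above; the class and field names are hypothetical, values are held as strings rather than CESR-encoded primitives, and no signature verification is performed.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class NonTransferableCommitment:
    said: str       # SAID of the committed SAD
    prefix: str     # non-transferable identifier prefix of the signer
    signature: str  # signature by the key of that prefix


@dataclass(frozen=True)
class TransferableCommitment:
    said: str                    # SAID of the committed SAD
    prefix: str                  # transferable identifier prefix of the signer
    sn: int                      # sequence number of the establishment event
    digest: str                  # digest of that establishment event
    signatures: Tuple[str, ...]  # indexed signatures group


Commitment = Union[NonTransferableCommitment, TransferableCommitment]


def commits_to(att: Commitment, payload: Union[dict, str]) -> bool:
    """True when the attachment's SAID matches the payload, which is
    either a SAD (dict with a d field) or a bare SAID string."""
    target = payload.get("d") if isinstance(payload, dict) else payload
    return att.said == target
```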

ReST API Convenience

ReST APIs assume a synchronous, connection-based request/response architecture. Each request has a corresponding response on a given connection. As described above, qry and rpy messages provide an asynchronous mechanism for requests (queries) and corresponding responses (replies) as generic data envelopes. They provide an envelope mechanism for conveying generic data. Furthermore the asynchronous query/reply qry and rpy message formats are designed to support asynchronous communication over non-HTTP transports like TCP and UDP. But it would be advantageous if they could be mapped to the synchronous request/response model of HTTP ReST APIs. Therefore the packet design is informed by HTTP ReST APIs but adapted to non-HTTP packet formats.

When one looks at how web frameworks work, the heavy lifting of URL composition, encoding, and decoding is done by the framework. Any given endpoint just receives the end result of that parsing in the form of dicts that contain the parsed and decoded parameters. These typically include a path parameters dict, a query string dict, a headers dict, etc. Each dict has fields with labeled values. The values in each of these dicts have already been URL decoded. Therefore instead of using URL strings with path, query, and fragment strings in the qry and rpy messages, the URL itself with path and query strings may be exploded into labeled fields in a mapping block. To clarify, the resource path string, any path parameter elements, and the query string parameters used to compose the URL may be provided instead by labeled fields in a mapping block. The advantage of exploding the URL path and query string into a map or dict is that there is no need for URL parsing to process a qry or rpy, thus making the envelope protocol agnostic. Nonetheless, a URL can be constructed from the exploded components when needed to use an HTTP ReST endpoint transport or equivalent.
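
As a rough illustration of exploding a URL into labeled fields and recomposing it on demand, the sketch below uses only `urllib.parse` from the Python standard library. The helper names `explode` and `compose`, the base URL, and the choice to put the path in r and the query parameters in q are assumptions made for this example.

```python
from urllib.parse import parse_qsl, quote, unquote, urlencode, urlsplit


def explode(url: str) -> dict:
    """Split a URL into a route string and a decoded query mapping."""
    parts = urlsplit(url)
    return {
        "r": unquote(parts.path).strip("/"),
        "q": dict(parse_qsl(parts.query)),  # values are URL decoded
    }


def compose(base: str, r: str, q: dict) -> str:
    """Rebuild a URL for an HTTP ReST transport from route and query map."""
    query = urlencode(q)  # percent-encodes values as needed
    return f"{base}/{quote(r)}" + (f"?{query}" if query else "")
```

A handler that receives the exploded form never has to parse or decode a URL; only the HTTP transport adapter needs `compose`.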

This approach generalizes the ReST concept of a resource or endpoint into a data flow route. Unlike ReST, where only servers have resource endpoints and clients identify each resource endpoint using a URL (Uniform Resource Locator), the route and reply route fields enable both clients and servers to use data flow routes. These enable the identification of endpoints on both sides of any interaction, data exchange, or pub/sub. By having routes on both sides, generic peer-to-peer protocols like UDP and TCP may be supported, not merely client-server protocols like HTTP.

Recall that the route field is denoted with the compact label r. The route field may map one-to-one to a URL path fragment so it can be conveniently mapped to an HTTP ReST API. In a flow based or data flow based programming architecture, the route maps to a data buffer. Some behavior is responsible for processing that buffer.

In addition the reply route rr field enables a solicitor or subscriber to specify a reply route or return route that the receiving party may publish to. The reply route field is denoted with the rr compact label. The details of each message, namely, qry and rpy are found below.

qry Message

The qry message as described above provides a way to solicit a reply or subscribe to a push stream. The r field contains the route path string. The path elements are delimited with the / character. For example route/path/to/a/resource. It serves to namespace routes, resources, or endpoints. So instead of multiplying message types, one for each unique composition of data fields, a namespace identifies unique data field compositions via a route path tree. This allows the qry message to be generic. Each r field path string value may address or route to a unique data resource. The qry message also contains a reply route field denoted with the rr compact label. This allows the solicitor or subscriber or querier to specify the return route of the corresponding reply.

The qry message also contains a version string field as its first field. The version field is denoted with compact label v. Because of the version string any compatible serialization may be used such as JSON, CBOR, MGPK, or CESR.
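
A receiver can use the version string to pick a deserializer before touching the rest of the message. The sketch below parses values shaped like the `KERI10JSON00011c_` string used in the examples in this document. The field layout (four-character protocol, one hex digit each for major and minor version, four-character serialization kind, six hex digits of size, `_` terminator) is inferred from that example value and should be treated as an assumption.

```python
import re
from typing import NamedTuple

# Assumed layout: PPPP M m KKKK SSSSSS _
VERSION_RE = re.compile(
    r"(?P<proto>[A-Z]{4})(?P<major>[0-9a-f])(?P<minor>[0-9a-f])"
    r"(?P<kind>[A-Z]{4})(?P<size>[0-9a-f]{6})_"
)


class Version(NamedTuple):
    proto: str  # protocol identifier, e.g. KERI
    major: int  # major version number
    minor: int  # minor version number
    kind: str   # serialization kind, e.g. JSON, CBOR, MGPK
    size: int   # serialized size in bytes, decoded from hex


def parse_version(vs: str) -> Version:
    """Parse a version string or raise ValueError if it is malformed."""
    m = VERSION_RE.fullmatch(vs)
    if m is None:
        raise ValueError(f"invalid version string: {vs!r}")
    return Version(m["proto"], int(m["major"], 16), int(m["minor"], 16),
                   m["kind"], int(m["size"], 16))
```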

Another notable field in a qry message is the query field denoted by the compact label q. Its value is a map or dict whose labels and values are the path and query parameters that further define the query.

The major drawback of the route path string plus exploded path parameter and query string dict approach is that a JSON object is more verbose than a URL string that includes the full path and query string, because JSON block delimiters add a few characters over the URL ? = & separators. But if the URL includes % escaped encodings in the query parameters then the compactness advantage may flip to the JSON exploded query mapping. Moreover if the serialization is not JSON but a binary equivalent such as CBOR or MGPK or CESR then the resulting binary query map may always be more compact than the encoded URL.

As per the ReST design documentation on Web API Design best practices for ReSTful interface design, each resource has two base URLs: a collection URL and a single element URL. The collection URL depends on query parameters for operations on the collection. A special case, however, of any collection is a single element. This makes the single element base URL somewhat redundant. One benefit of this redundancy is the clarity that single element URLs provide about intent. However, under the hood one can do everything with the collective base URL and a singular query string equivalent. In order to regain some of the simplicity of the non-collective (singular) query, we take advantage of the fact that in KERI we enforce ordering of the appearance of fields in a mapping (ordered mapping). This means that the query q block could be interpreted as a non-collection request if all the fields in the query q block are singular (non-collections) and the ordering and presence or absence of fields mimics a singular (non-collection) traversal of the resource path.

Authentication Support

Solicitations benefit from some form of access restriction or access control. This requires both authentication and replay attack prevention. Unlike KERI event messages, which are cryptographic commitments or disclosures originally initiated by some controller and verified according to the protocol against signatures by the controller of the associated identifier, or receipt messages, which merely convey signatures and other cryptographic material used to verify signatures on events, a qry query is asking some host to disclose information. As a result it makes sense to include a bare bones authentication mechanism in the qry to enable authorization of that disclosure by a layer above KERI. For security reasons it's best if the authentication happens at the ingestion of the qry. For better scalability and asynchronicity, a non-interactive mechanism is preferred. The heavy lifting needed to support authentication is already done by KERI. That heavy lifting is provided by the current key state for a given identifier. This means that a query may have attached to it the identifier of the querier and a signature of the query message. This essentially authenticates the querier.

As mentioned above, attached to the serialized query message body is an attachment with the prefix of the querier and signature(s). Authentication of signature(s) always assumes the latest key state for the prefix as well as the service endpoint address for the querier. Key state is not relevant for non-transferable prefixes but is relevant for transferable ones. The signatures are verified by the host server (querient) against the latest available key state known by that host server for the querier. This means that information about which establishment event was used for the signatures does not need to be attached to indicate which set of keys were used. It may be assumed to always be the latest key state. If not then the query is not timely anyway and may be safely dropped. The latest querier service endpoint data may also be attached to the query or assumed available to the server (querient).

Signing a query message, however, does not protect against a replay attack of that signed query. Queries need to be timely. The simplest non-interactive mechanism to protect against replay attacks is a date-time stamp in the query message. The qry message must therefore have a dt field. The querient (server, recipient of the query) then refuses any signed queries whose datetime stamps, dt, are not within a narrow time window around the server's current datetime. An attacker has to replay any signed queries within that window. Thus stale queries outside that window may be refused. Consequently an attacker can't request the information outside of that window unless the attacker is able to compromise private keys. Key compromise is hard. The server can increase the degree of protection by enforcing a policy that all queries must have monotonically increasing datetime stamps. This can be done by keeping a cache of the latest query. Monotonicity also makes any replay attacks detectable by the querier. The server will only respond to the first query at a given datetime, so a successful replay attack requires a man-in-the-middle. Unless that man-in-the-middle is part of the routing infrastructure, it may be detected because the reply must be specifically routed by the server to the querier's service endpoint, not the man-in-the-middle's service endpoint. Moreover, a signed query that indicates the return service endpoint address may not be changed by the man-in-the-middle without compromising keys. Mixed routing or multi-perspective routing infrastructure would foil a non-key-compromise man-in-the-middle replay attack on such a query.
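
The time window and monotonicity checks might look like the following on the querient's side. This is a minimal sketch under stated assumptions: the `ReplayGuard` class is hypothetical, the 30 second window is an arbitrary choice, dt values are timezone-aware ISO 8601 strings, and signature verification is assumed to have already succeeded before `accept` is called.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


class ReplayGuard:
    """Refuses stale or replayed queries based on their dt field."""

    def __init__(self, window: timedelta = timedelta(seconds=30)) -> None:
        self.window = window
        self._latest: Dict[str, datetime] = {}  # querier prefix -> last dt

    def accept(self, querier: str, dt: str,
               now: Optional[datetime] = None) -> bool:
        now = now or datetime.now(timezone.utc)
        stamp = datetime.fromisoformat(dt)
        # Refuse anything outside the window around the server's clock.
        if abs(now - stamp) > self.window:
            return False
        # Enforce strictly increasing date-times per querier.
        last = self._latest.get(querier)
        if last is not None and stamp <= last:
            return False
        self._latest[querier] = stamp
        return True
```

Caching only the latest date-time per querier keeps the state small; a replayed copy of an accepted query fails the monotonicity check even while it is still inside the window.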

Example qry

{
  "v" : "KERI10JSON00011c_",
  "t" : "qry",
  "d": "EZ-i0d8JZAoTNZH3ULaU6JR2nmwyvYAfSVPzhzS6b5CM",
  "dt": "2020-08-22T17:50:12.988921+00:00",
  "r" : "logs",
  "rr": "log/processor",
  "q" :
  {
       "i":  "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
       "sn": "5",
       "dt": "2020-08-01T12:20:05.123456+00:00"
  }
}

The dt field at the top level of the query body is the datetime of the request, used for replay attack prevention.
The field r is the query route.
The field rr is the reply route (return).
The q block separates the query body from the query envelope and avoids confusion without having to define unique field labels.
The q field value holds the equivalent query parameters exploded into a mapping.
Should a dt field be provided in the q block then that dt is for the query target log entry not for replay attack protection.

qry messages may be signed with an associated attachment that provides the signer (querier) as well as the signature. This serves to authenticate that qry. A given recipient (querient) could drop any qry messages that were not signed by identifiers it recognizes or accepts as authorized. If the querient does not have the current key state for the querier, the querient may escrow the query.

{
  "v" : "KERI10JSON00011c_",
  "t" : "qry",
  "d": "EZ-i0d8JZAoTNZH3ULaU6JR2nmwyvYAfSVPzhzS6b5CM",
  "dt": "2020-08-22T17:50:12.988921+00:00",
  "r" : "logs",
  "rr": "log/processor",
  "q" :
  {
       "d":  "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
       "i": "EAoTNZH3ULvYAfSVPzhzS6baU6JR2nmwyZ-i0d8JZ5CM",
       "sn": "5",
       "dt": "2020-08-01T12:20:05.123456+00:00"
  }
}

Reply Message, `rpy`

The reply, rpy, message as described above provides a way to respond to a solicitation or publish to a push stream.

The attribute data payload block for the rpy message is denoted with the compact label a. Its value is a map or dict whose labels and values constitute the body of data for the reply. The a block may be a SAD (Self-Addressed Data) item with embedded SAID (Self-Addressing IDentifier). The attribute block a may contain nested SADs as appropriate.
The rpy message also contains its own self-referential SAID field denoted with the compact label d (for content addressable digest). This makes the total reply message as an envelope a SAD (Self-Addressed Data) item in its own right. This enables the rpy message envelope to be persisted with a known content addressable identifier, i.e. the d field value as the database key. This provides support for reasoning about the rpy as the authentication mechanism (with attached signature) for re-establishing the authentication of its embedded data when that embedded data is not a SAD. The SAID in d is generated from the contents of the rpy using the SAID derivation algorithm.
The rpy message uses the BADA (Best Available Data Acceptance) model for its security. This means that there must be an attributable originator of the rpy. The BADA security model provides a degree of replay attack protection. The attributable originator (issuer, author, source) is provided by an attached signature couple or quadruple. A single reply could have multiple originators. When used as an authorization the reply attributes may include the identifier of the authorizer and the logic for processing the associated route may require a matching attachment.
The r field contains the route path string. The path elements are delimited with the / character. For example route/path/to/a/resource. It serves to namespace routes, resources, or endpoints. So instead of multiplying message types, one for each unique composition of data fields, the r field namespaces data field compositions via a route path tree. This allows the rpy message to be generic. Each r field path string value may address or route to a unique data resource. The value of the r field is the value of the rr field of the corresponding qry message that motivated a given rpy message. If the rr field in the qry is empty then the r field in the rpy may be an empty string.
The rpy message also contains a version string field as its first field. The version field is denoted with compact label v. Because of the version string any compatible serialization may be used such as JSON, CBOR, MGPK, or CESR. The version field is authoritative for any nested SADs in the reply. This enables the reply and any nested SADs to use any of the supported serializations without including a version field in each nested SAD. In other words, the nested SADs share the reply message envelope's version. The reply envelope's version field must be stored alongside any nested SAD storage. The security posture is that the nested SAD is signed so the SAD itself is tamper proof (evident) and the serializations are incompatible, so in the worst case, if an attacker corrupts the separately stored version field, deserialization will fail. Nonetheless, given a limited set of serializations it would be straightforward to rediscover the original serialization. This is viewed as a reasonable tradeoff, that is, an unsigned version field from the envelope that is inherited by any nested SADs versus a dedicated redundant but signed version field in each nested SAD. Should this not be secure enough, the original envelope with its signed version field and signature could be stored alongside any nested SADs, not merely the unsigned version field. This makes the version field tamper evident at the cost of redundant storage of the reply message but still keeps the nested SADs clear of having to include a per SAD version field. The nested SADs still share the reply message envelope's version but more securely.

Example rpy

{
  "v" : "KERI10JSON00011c_",
  "t" : "rpy",
  "d": "EZ-i0d8JZAoTNZH3ULaU6JR2nmwyvYAfSVPzhzS6b5CM",
  "dt": "2020-08-22T17:50:12.988921+00:00",
  "r" : "logs/processor",
  "a" :
  {
       "i": "EAoTNZH3ULvYAfSVPzhzS6baU6JR2nmwyZ-i0d8JZ5CM",
       "name": "John Jones",
       "role": "Founder"
  }
}

Out-of-order or Stale Reply Data

When the rpy message is used as an envelope for Type-2 or higher data, the recipient needs some mechanism for detecting stale or out-of-order transmission of updates to that data via a reply message. Stale re-transmission may be innocuous or part of a replay or DDoS attack. When the data is Type-1 the ordering may be determined by the anchor location in the associated KEL. But for Type-2 or higher data there is no anchor so some other mechanism is needed to order updates to the data. The simplest non-interactive mechanism for ordering data updates is a monotonic date-time. The date-time in the rpy is relative to the rpy sender's (replier's) clock. The monotonicity of the date-time enables the recipient (replient) of the rpy to detect out-of-order updates by first storing the date-time of the most recently received rpy for that data and then comparing it to the date-time of any newly received rpy. Therefore the rpy message includes a dt field. Any recipient of a rpy may refuse it if its date-time stamp, dt, field is not later than the date-time it already has on hand for that data item.
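
A recipient-side check for stale or out-of-order replies could be sketched as below. The `ReplyStore` class is hypothetical, and keying each data item by the pair of source identifier and route is an assumption made for this example; a real implementation would choose whatever key identifies "that data item" for a given route. Signature verification is assumed to happen before `update` is called.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple


class ReplyStore:
    """Keeps the most recent accepted data per (source, route)."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[str, str], Tuple[datetime, dict]] = {}

    def update(self, source: str, rpy: dict) -> bool:
        """Accept rpy only if its dt is later than the one on hand."""
        key = (source, rpy["r"])
        stamp = datetime.fromisoformat(rpy["dt"])
        held = self._items.get(key)
        if held is not None and stamp <= held[0]:
            return False  # stale, duplicate, or out-of-order update
        self._items[key] = (stamp, rpy["a"])
        return True

    def get(self, source: str, route: str) -> Optional[dict]:
        held = self._items.get((source, route))
        return held[1] if held else None
```

Note that the comparison is against the replier's own earlier date-times, so the recipient's clock never enters into the ordering decision.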

rpy Authentication and Authorization

rpy messages may be signed with an associated attachment that provides the identifier of a signer as well as the signature. This authenticates the rpy with respect to its sender (signer). A given recipient could drop any rpy messages that were not signed by identifiers it recognizes or accepts as authorized. If the recipient does not have the current key state for the sender (signer) then it may escrow the rpy message and attachments until such time as it has the current key state.

Query/Reply Tracking and Matching

In the case where the same data item provided by a rpy may come from multiple sources, the recipient may choose to track sources independently of each other by matching each rpy to a prior originating qry. If there is no match then the rpy may be dropped. Tracking could be based on some combination of different items of information, each providing a tracking mechanism. One tracking mechanism is to match the source signer identifier in a signature attachment of the rpy to some destination identifier in the prior matching qry. For example, if a qry asks for information about a KEL from the controller or a witness to that KEL, both of whose identifiers are associated with that KEL, then a corresponding rpy signed by either the controller or a witness could be matched to the originating query for information about that KEL. Another tracking mechanism is to match the route, r, field of the rpy with the reply route, rr, field of the originating qry. In addition, a transaction identifier or cryptographic token could be included in the rr in the originating qry to more specifically match a given rpy to that qry.
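
The matching mechanisms described above can be combined in a small tracker. The sketch is illustrative: the `QueryTracker` class is hypothetical, and appending a random token as the last path element of rr is just one assumed way of including a token in the reply route.

```python
import secrets
from typing import Dict, Set


class QueryTracker:
    """Matches incoming rpy messages to previously sent qry messages."""

    def __init__(self) -> None:
        self._pending: Dict[str, Set[str]] = {}  # rr -> acceptable signers

    def track(self, base_route: str, expected_signers: Set[str]) -> str:
        """Return the rr value to place in the outgoing qry."""
        token = secrets.token_hex(8)  # random per-query token
        rr = f"{base_route}/{token}"
        self._pending[rr] = set(expected_signers)
        return rr

    def match(self, rpy: dict, signer: str) -> bool:
        """True when the rpy route equals a tracked rr and the signer
        is one of the identifiers expected for that query."""
        expected = self._pending.get(rpy.get("r", ""))
        return expected is not None and signer in expected
```

For a query about a KEL, `expected_signers` would hold the controller and witness identifiers associated with that KEL, so a reply signed by any other party is dropped.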

Reply Security Posture

Ephemeral Reply Data

In this case the reply's payload data is meant to be ephemeral. A signature on the message establishes the authenticity of the message envelope, but the intent is that once processed the envelope is discarded. The data payload is not intended to be the target of a stored cryptographic commitment (signature), nor is it intended that the authenticity of the data ever needs to be re-proven externally or internally. Thus the signature on the envelope merely serves as an ephemeral authentication. In accord with Zero Trust Computing (ZTC) principles, ephemeral data is stored in memory and is not meant to be persisted to durable storage. In general, data stored in protected process memory is not accessible by other processes unless access is specifically granted. When that process exits, all data in its memory is lost. Thus the authentication and authorization of the startup of the process itself provides a degree of protection to the data stored in its memory. When data is placed into durable storage, however, that protection is removed. The chain of custody or control over a given storage device (hard drive, flash drive, etc.) may have been broken during any time that the running process is stopped. Consequently, the next time the process runs and loads the data from durable storage, there is no guarantee that the data is still authentic. As a result, ZTC best practice is to re-verify or re-establish the authenticity of any data in durable storage whenever there is any doubt as to the chain of custody of that durable data. Encrypting the data merely displaces that authentication chain of custody to the chain of custody of the decryption key. One may make a time-performance trade-off between re-verification of signatures at startup and continuous encryption/decryption while running.

Persistent Reply Data

When Type-2 data conveyed by a rpy is persisted to durable storage there must be a mechanism for re-establishing the authenticity of that data. Persistence indicates a need to re-prove the authenticity of the data either externally or internally. This usually means storing the latest signed version of that rpy with the attached signature and then re-verifying the signature on that rpy before refreshing the stored data. If the signature does not verify, or the verified rpy data does not match the persisted data, then the persisted data is no longer authenticatable and should not be trusted until it is re-authenticated (i.e. authenticity is re-established).
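The reload step can be sketched as below. The record layout ({"rpy", "sig", "data"}), the compact JSON serialization, and the function names are assumptions for illustration. The verify callable stands in for real signature verification against the signer's key state; the HMAC-based verifier is only a symmetric stand-in so the example runs with the standard library, and it is not a digital signature.

```python
import hashlib
import hmac
import json
from typing import Callable, Optional


def serialize(rpy: dict) -> bytes:
    # Assumption: the signature covers this compact JSON serialization.
    return json.dumps(rpy, separators=(",", ":")).encode("utf-8")


def reload_persisted(record: dict, verify: Callable[[bytes, bytes], bool]) -> Optional[dict]:
    """Return the persisted data only if the stored signed rpy still vouches for it."""
    if not verify(serialize(record["rpy"]), record["sig"]):
        return None  # signature no longer verifies
    if record["rpy"]["a"] != record["data"]:
        return None  # persisted data diverged from the signed rpy
    return record["data"]


def make_hmac_verifier(key: bytes) -> Callable[[bytes, bytes], bool]:
    """Stand-in verifier for the example only (HMAC-SHA256, not a signature)."""
    def verify(raw: bytes, sig: bytes) -> bool:
        expected = hmac.new(key, raw, hashlib.sha256).digest()
        return hmac.compare_digest(expected, sig)
    return verify
```

A None result means the data should be treated as untrusted until a fresh signed rpy re-establishes its authenticity.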

Reply with SAD (Self-Addressed Data)

In order to better reason about the embedded attribute data payload, it may be desirable to make the a, attribute block into a SAD (Self-Addressed Data) block with an embedded SAID (Self-Addressing IDentifier). The SAID is provided by the d field in the a block, i.e. the contents of the a block is a SAD item. A SAID is a specially derived self-referential cryptographic digest of the data block in which it resides. This makes it a self-referential content-addressable identifier, or self-addressing identifier for short.

{
  "v": "KERI10JSON00011c_",
  "t": "rpy",
  "d": "EZ-i0d8JZAoTNZH3ULaU6JR2nmwyvYAfSVPzhzS6b5CM",
  "dt": "2020-08-22T17:50:12.988921+00:00",
  "r": "logs/processor",
  "a":
  {
    "d": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
    "i": "EAoTNZH3ULvYAfSVPzhzS6baU6JR2nmwyZ-i0d8JZ5CM",
    "name": "John Jones",
    "role": "Founder"
  }
}
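To make the self-referential derivation concrete, here is a simplified sketch: fill the d field with a fixed-length placeholder, serialize, digest, and write the encoded digest back into d. It is an illustration only and is not guaranteed to interoperate with a KERI library. The function names, the compact JSON serialization, the use of BLAKE2b-256 from the standard library, and the leading code character "F" are all assumptions made for the example.

```python
import base64
import hashlib
import json

DUMMY = "#" * 44  # placeholder with the same length as the final identifier


def saidify(sad: dict, label: str = "d") -> dict:
    """Return a copy of sad with its label field set to a digest of itself."""
    filled = dict(sad)
    filled[label] = DUMMY  # keeps the field's position if it already exists
    raw = json.dumps(filled, separators=(",", ":")).encode("utf-8")
    digest = hashlib.blake2b(raw, digest_size=32).digest()
    # One zero pad byte makes 33 bytes, which Base64 encodes to exactly 44 chars.
    b64 = base64.urlsafe_b64encode(b"\x00" + digest).decode("ascii")
    filled[label] = "F" + b64[1:]  # assumed code character replaces the pad char
    return filled


def verify_said(sad: dict, label: str = "d") -> bool:
    """Recompute the digest and compare it with the embedded identifier."""
    return saidify(sad, label)[label] == sad.get(label)
```

Because the placeholder has the same length as the result, the serialized size does not change when the real identifier is written in, and any later change to another field makes verify_said return False.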

Expose Message

Exposure messages are used for disclosure of sealed data associated with anchored seals in a KEL. A reference to the anchoring
seal is provided as an attachment to the exposure message.
An exposure, exp, message is a SAD item with an associated derived SAID in its d field.

{ 
  "v": "KERI10JSON00011c_",
  "t": "exp",
  "d": "EZ-i0d8JZAoTNZH3ULaU6JR2nmwyvYAfSVPzhzS6b5CM",
  "r": "sealed/processor",
  "a":
  {
    "d": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
    "i": "EAoTNZH3ULvYAfSVPzhzS6baU6JR2nmwyZ-i0d8JZ5CM",
    "dt": "2020-08-22T17:50:12.988921+00:00",
    "name": "John Jones",
    "role": "Founder"
  }
}

Independently Persisted SAD

An alternative to persisting the whole rpy with attached signature in order to provide a mechanism for re-establishing the authenticity of the included data is to use a SAD format for the a, attributes block (which therefore includes an embedded SAID) and also attach a signature on just the serialized attributes block (SAD). This attached signature would be in addition to the attached signature on the whole rpy envelope. In this case the whole reply with attached signature does not need to be stored to re-establish the authenticity of the data attributes, merely the SAIDed (SAD) attributes block and the attached signature on that block. This approach makes the rpy envelope bigger but may make the persisted storage smaller. It allows the rpy to be used in an ephemeral manner to filter out stale rpy messages, to match the rpy to a qry, or to cue the embedded data via the route, r, field, while persisting only the embedded a block and its attached signatures.

Type-1 Data

When the SAD is Type-1 data, the order of appearance of an anchor in the corresponding KEL or TEL (given by the attached anchor seal) determines whether or not the SAD is stale. Consequently Type-1 SADs do not need an embedded date-time for stale detection (they may need an embedded date-time for some other purpose).
Note the d field is the SAID of the contents of the a block, i.e. the contents of the a block is a SAD item.

{
  "v" : "KERI10JSON00011c_",  
  "t" : "exp",  
  "d": "EZ-i0d8JZAoTNZH3ULaU6JR2nmwyvYAfSVPzhzS6b5CM",
  "r" : "logs/processor",
  "a" : 
  {
    "d":  "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
    "i": "EAoTNZH3ULvYAfSVPzhzS6baU6JR2nmwyZ-i0d8JZ5CM",
    "name": "John Jones",
    "role": "Founder"
  }
}

{ 
  "v": "KERI10JSON00011c_",
  "t": "exp",
  "d": "EZ-i0d8JZAoTNZH3ULaU6JR2nmwyvYAfSVPzhzS6b5CM",
  "r": "sealed/processor",
  "a":
  {
    "d": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
    "i": "EAoTNZH3ULvYAfSVPzhzS6baU6JR2nmwyZ-i0d8JZ5CM",
    "dt": "2020-08-22T17:50:12.988921+00:00",
    "name": "John Jones",
    "role": "Founder"
  }
}

Type-2 Data

When the data is Type-2 then a date-time field must be included in the attributes block to identify stale or out-of-order updates.
Note the d field is the SAID of the contents of the a block, i.e. the contents of the a block is a SAD item.

{
  "v" : "KERI10JSON00011c_",  
  "t" : "rpy",  
  "d": "EZ-i0d8JZAoTNZH3ULaU6JR2nmwyvYAfSVPzhzS6b5CM",
  "dt": "2020-08-22T17:50:12.988921+00:00",
  "r" : "logs/processor",
  "a" : 
  {
       "d":  "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
       "i": "EAoTNZH3ULvYAfSVPzhzS6baU6JR2nmwyZ-i0d8JZ5CM",
       "dt":  "2020-08-22T17:50:12.988921+00:00",
       "name": "John Jones",
       "role": "Founder"
  }
}

Compact Reply with SAID only

An alternative is to include only the SAID in the attributes block of the signed rpy to provide a cryptographic commitment to the associated SAD, and then provide the SAD itself in a cache or in an attachment. This allows for reduced bandwidth requirements when the same SAD may be transmitted or shared redundantly. The compact form can be used as a notice of a change or update that triggers a request for the actual data. When used as a notice, the dt field in the rpy is used to determine if it is a stale notice that may be ignored. Whether or not the SAD contains a date-time field depends on whether the SAD has an anchor in a corresponding KEL or TEL.

{
  "v" : "KERI10JSON00011c_",  
  "t" : "rpy",  
  "d": "EZ-i0d8JZAoTNZH3ULaU6JR2nmwyvYAfSVPzhzS6b5CM",
  "dt": "2020-08-22T17:50:12.988921+00:00",
  "r" : "logs/processor",
  "a" : 
  {
       "d":  "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"
  }
}

The d field is the SAID of the contents of the externally provided a block, i.e. the contents of the a block is a SAD item.
The actual SAD for Type-2 data is as follows and is provided elsewhere:

{
  "d": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
  "i": "EAoTNZH3ULvYAfSVPzhzS6baU6JR2nmwyZ-i0d8JZ5CM",
  "dt": "2020-08-22T17:50:12.988921+00:00",
  "name": "John Jones",
  "role": "Founder"
}

When the SAD is Type-1 data, the order of appearance of the anchor in the corresponding KEL or TEL (given by the attached anchor seal) determines whether or not the SAD is stale. Consequently Type-1 SADs do not need an embedded date-time for stale detection (they may need an embedded date-time for some other purpose).

{
  "d": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
  "i": "EAoTNZH3ULvYAfSVPzhzS6baU6JR2nmwyZ-i0d8JZ5CM",
  "name": "John Jones",
  "role": "Founder"
}
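A sketch of how a recipient might resolve a compact rpy against a local cache of SADs. The function names and the cache-as-dict layout are assumptions. The said_of callable stands for whatever SAID derivation is in use; toy_said below is only a stand-in (SHA-256 over every field except d) so the example is self-contained, and it is not the real SAID algorithm.

```python
import hashlib
import json
from typing import Callable, Optional


def toy_said(sad: dict) -> str:
    """Stand-in digest for the example: SHA-256 over all fields except d."""
    body = {k: v for k, v in sad.items() if k != "d"}
    raw = json.dumps(body, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def resolve_compact(rpy: dict, cache: dict, said_of: Callable[[dict], str]) -> Optional[dict]:
    """Return the SAD committed to by a compact rpy, or None if it must be fetched."""
    said = rpy["a"]["d"]
    sad = cache.get(said)
    if sad is None:
        return None  # not cached: the notice should trigger a request for the SAD
    if said_of(sad) != said:
        del cache[said]  # cached copy does not match the commitment; discard it
        return None
    return sad
```

Stale-notice filtering on the rpy dt field would normally run before this lookup.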

ksn Message (Key State Notice)

As described above, instead of using the generic rpy message envelope, some data is important enough and used enough to justify a dedicated message type. One such message with associated data is the ksn message type. Notable here is a revised ksn that adds a route, r, field. A ksn represents Type-1 data, i.e. data included in a KEL.
An example ksn is provided below:

Example ksn

    {
        "v": "KERI10JSON00011c_",
        "i": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
        "s": "2",
        "t": "ksn",
        "p": "EYAfSVPzhzZ-i0d8JZS6b5CMAoTNZH3ULvaU6JR2nmwy",
        "d": "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM",
        "f": "3",
        "dt": "2020-08-22T20:35:06.687702+00:00",
        "et": "rot",
        "kt": "1",
        "k": ["DaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"],
        "n": "EZ-i0d8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CM",
        "bt": "1",
        "b": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"],
        "c": ["eo"],
        "ee":
          {
            "s": "1",
            "d": "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM",
            "br": ["Dd8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CMZ-i0"],
            "ba": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"]
          },
        "di": "EYAfSVPzhzS6b5CMaU6JR2nmwyZ-i0d8JZAoTNZH3ULv",
        "r": "route/to/buffer"
    }

Note that the ksn includes a route, r, field. This may be an empty string. The other fields of the ksn have been defined elsewhere.

tsn Message (Transaction State Notice)

As described above, instead of using the generic rpy message envelope, some data is important enough and used enough to justify a dedicated message type. One such message with associated data is the tsn message type. Notable here is a revised tsn that adds a route, r, field. A tsn represents Type-1 data, i.e. data included in a KEL.
An example tsn is provided below:

exn Message (exchange)

{
  "v": "KERI10JSON00011c_",
  "t": "exn",
  "i": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM", // recipient
  "dt": "2020-08-22T17:50:12.988921+00:00",  // replay attack prevention
  "r": "route/of/exchange",  //route
  "rr": "replyroute/of/subsequent/exchange",  //reply route
  "a": 
    {
        "name": "John Jones"
    }  // data payload
}

Best Available Data Acceptance (BADA) Policy

The BADA (Best Available Data Acceptance) model applies to each reply message.
It is a latest-seen-signed pairwise comparison of a new update reply against the
old, already accepted reply from the same source for the same route (same data).
Accept the new reply (update) if the new reply is later than the old reply, where:
1) Later means the sn (sequence number) of the last (if forked) Est evt (establishment event), if any, that
provides the keys for the signature(s) of the new reply is greater than or equal to
the sn of the last Est evt that provides the keys for the signature(s) of the old reply.
2) If the key state is the same, or the identifier is non-transferable, then later means the date-time stamp of the new reply is greater than that of the old reply.

    If nontrans and last Est Evt is not yet accepted then escrow.
    If nontrans and partially signed then escrow.

    Escrow process logic is route dependent and is dispatched by route,
    i.e. route is address of buffer with route specific handler of escrow.
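One possible reading of the pairwise comparison above, written as a sketch: a later establishment event wins outright, an earlier one loses, and when the key state is the same (or there is no establishment sequence number to compare) the date-time stamp decides. The names Accepted, est_sn, and bada_accept are hypothetical, and escrow handling is not shown.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Accepted:
    """What is remembered about a reply (hypothetical record)."""
    est_sn: Optional[int]  # sn of last Est evt providing the signing keys; None if not applicable
    dt: datetime           # date-time stamp of the reply


def bada_accept(new: Accepted, old: Optional[Accepted]) -> bool:
    """Return True if new should replace old for the same source and route."""
    if old is None:
        return True  # no prior record
    if new.est_sn is not None and old.est_sn is not None:
        if new.est_sn > old.est_sn:
            return True   # signed under a later key state
        if new.est_sn < old.est_sn:
            return False  # signed under an earlier key state
    # Same key state, or no key state to compare: the date-time stamp decides.
    return new.dt > old.dt
```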

Read Update Nullify (RUN) Model

Relative to Client-Server or Peer-to-Peer interaction:
Create, Read, Update, Delete (CRUD)
Read, Update, Nullify (RUN)
Decentralized control means the server never creates data, only the client does. The Client (Peer) always updates the server (other Peer) for data sourced by the Client (Peer). So there is no Create. Non-interactive monotonicity means we can never delete. So there is no Delete. We must Nullify instead. Nullify is a special type of Update.

Ways to Nullify:

null value
flag indicating nullified

Rules for Update :

(anchored to key state in KEL)

Accept if no prior record.
Accept if anchor is later than prior record.
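A minimal sketch of a RUN-style store for anchored data, using a flag to nullify. The class name RunStore and the use of an integer anchor_sn to stand for the anchor's position in the KEL are assumptions for the example. There is intentionally no create or delete method: the first accepted update establishes a record, and nullify is just another update.

```python
class RunStore:
    """Hypothetical Read/Update/Nullify store keyed by data item."""

    def __init__(self):
        self._records = {}  # key -> (anchor_sn, value, nullified flag)

    def update(self, key, value, anchor_sn: int, nullify: bool = False) -> bool:
        prior = self._records.get(key)
        if prior is not None and anchor_sn <= prior[0]:
            return False  # anchor is not later than the prior record
        self._records[key] = (anchor_sn, None if nullify else value, nullify)
        return True

    def read(self, key):
        record = self._records.get(key)
        if record is None or record[2]:
            return None  # unknown or nullified
        return record[1]
```

Keeping the nullified record, rather than removing it, preserves the ordering information so that an older update replayed later is still rejected.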

Rules for Update:

(signed by keys given by key state in KEL, ephemeral identifiers have constant key state)

Accept if no prior record.
Accept if key state is later than prior record.
Accept if key state is the same and date-time stamp is later than prior record.

ReSTful APIs

A useful set of design guidelines for ReSTful APIs may be found here:

Web API Design
A related but more dated book.
API Design

The basic design consists of two base URLs per resource: a collection URL and a URL for a specific element in the collection.

'/dogs?all=true' (collection with query parameters to operate on the collection)

'/dogs/1234' (specific element with path to specify element)

The base URLs are operated on with the HTTP verbs, POST, GET, PUT, PATCH, and DELETE corresponding to the CRUD (create, read, update, delete) methods on a database. Unlike PUT, PATCH allows updating only part of a resource.

Resource	POST create	GET read	PUT/PATCH update	DELETE delete
/dogs	Create a new dog	List dogs	Bulk update dogs	Delete all dogs
/dogs/1234	Error	Show dog 1234 (if exists)	Update dog 1234 (if exists)	Delete dog 1234 (if exists)

Sweep complexity under the '?'
Use limit and offset for pagination.
/dogs?limit=25&offset=50

Suggested Resources

Clone Replay of KELs in first seen order.

Logs are stored with key of identifier prefix plus monotonic date time.

/logs

/logs/{pre}

/logs/{pre}/{datetime}

{pre} is template for identifier prefix

{datetime} is template for url encoded ISO8601 datetime. (Alternatively the datetime could be encoded as a Unix compatible datetime floating point number, but that is not a format that is universal to all operating systems)

/logs/EABDELSEKTH replays first seen log for identifier prefix 'EABDELSEKTH' (prefix clone)

/logs?all=true replays first seen log for all identifier prefixes in database (full database clone)

/logs/EABDELSEKTH/%272020-08-22T17%3A50%3A09.988921%2B00%3A00%27 (get event of prefix at datetime)

/logs/EABDELSEKTH?after={datetime}&limit=1
Returns next event for 'EABDELSEKTH' after {datetime} where date time is ISO8601 URL encoded.

/logs/EABDELSEKTH?before={datetime}
Returns all events for 'EABDELSEKTH' before {datetime} where date time is ISO8601 URL encoded.

/logs/EABDELSEKTH?after=%272020-08-22T17%3A50%3A09.988921%2B00%3A00%27

Returns all events for 'EABDELSEKTH' after '2020-08-22T17:50:09.988921+00:00' (url encoded ISO-8601)

/logs/EABDELSEKTH?before=%272020-08-22T17%3A50%3A09.988921%2B00%3A00%27

/logs?pre=EABDELSEKTH&after=%272020-08-22T17%3A50%3A09.988921%2B00%3A00%27
Equivalent query on the collective base URL

/logs?pre=EABDELSEKTH,E123ABDELSE,EzyyABDELSE
Get logs for three prefixes
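The URL encoding of the date-time can be produced with the standard library, as in the sketch below. The helper names are hypothetical, forward slashes are assumed as the path separator, and the helpers encode the bare date-time; the examples above additionally wrap the value in single quotes, which encode as %27.

```python
from urllib.parse import quote, urlencode


def log_event_path(pre: str, dt: str) -> str:
    """Path for the event of prefix pre at ISO 8601 date-time dt."""
    return f"/logs/{quote(pre, safe='')}/{quote(dt, safe='')}"


def logs_query(pre: str, **params) -> str:
    """Element URL with query parameters such as after, before, limit."""
    return f"/logs/{quote(pre, safe='')}?{urlencode(params)}"


path = log_event_path("EABDELSEKTH", "2020-08-22T17:50:09.988921+00:00")
# path == "/logs/EABDELSEKTH/2020-08-22T17%3A50%3A09.988921%2B00%3A00"
```

Encoding matters here because an unescaped + in a query string is commonly decoded as a space, which would corrupt the UTC offset.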

Replay of KELs by Sequence Number

Given recovery forks, the KEL indexed by sn will not be the same as the first seen KEL. The key state will be the same but the exact sequence of events in a replay will not. So this is for verifying key state, not for cloning the append-only event log. But for any verification, the KEL by sn is more appropriate because one can query the key state at any sn, and it allows a verifier to find a given authoritative event in the log by its location seal.

/events

/events/{pre}

/events/{pre}/{sn}

{pre} is template for identifier prefix
{sn} is template for sequence number

/events?all=true (All KELs in database in order by sn)

/events/{pre} (KEL for identifier prefix {pre})

/events/{pre}/{sn} (event at {sn} where {sn} is template for sequence number)

/events/{pre}?offset={sn}&limit=1000 (next 1000 events starting at sn = {sn})

/events?pre={pre}&sn={sn}
Using collective to get event at sn of pre

Fetch Keys for Event at given sequence number

/keys

/keys/{pre}

/keys/{pre}/{sn}

Fetch latest Key State for KEL at prefix

/states

/states/{pre}
