# Deciding on a proof format

## TL;DR:

We stated in [this hackMD](https://hackmd.io/@je-UlHlgS6SReG8Ji1vg4Q/r19C4kpwi) that we are abandoning `JcsEd25519Signature2020` in favour of `di-eddsa-2020` because it has already been adapted to the new `DataIntegrityProof` type and is considered (more) stable and certainly better maintained. In light of [this point from the VC implementers guide](https://w3c.github.io/vc-imp-guide/#pf9a), which is demonstrated very clearly in [this example](https://github.com/setl/rdf-urdna/tree/master/jsonld-warnings), we might want to reconsider this choice.

The ways forward are, broadly speaking, one of the following:

- Stick to the previous decision of embracing `di-eddsa-2020`.
- Update the `JcsEd25519Signature2020` suite to be correct and use the `DataIntegrityProof` type/format.
- Embrace [VC-JWT](https://w3c.github.io/vc-jwt/).

Feel free to skip to the [comparison table](#Comparison-table) to get an overview of how these approaches compare.

# Introduction

We give an overview of the two proof formats for securing verifiable data suggested by the VC WG and go into a bit of detail on what supporting each format means in terms of cryptographic agility and code. The final section provides a summary table of how the approaches compare to each other.

# The DataIntegrityProof format

## Examples

```
{
  "@context": ["https://w3id.org/security/data-integrity/v1"],
  "type": "DataIntegrityProof",
  "cryptosuite": "ecdsa-2022",
  "created": "2022-11-29T20:35:38Z",
  "verificationMethod": "did:example:123456789abcdefghi#keys-1",
  "proofPurpose": "assertionMethod",
  "proofValue": "z2rb7doJxczUFBTdV5F5pehtbUXPDUgKVugZZ99jniVXCUpojJ9PqLYV evMeB1gCyJ4HqpnTyQwaoRPWaD3afEZboXCBTdV5F5pehtbUXPDUgKVugUpoj"
}
```

The old (but still rather similar) style we are more familiar with looks like this:

```
{
  "type": "JcsEd25519Signature2020",
  "created": "2020-02-21T22:37:48Z",
  "verificationMethod": "did:work:6sYe1y3zXhmyrBkgHgAgaq#key-1",
  "nonce": "524a497a-871c-4b89-a325-205f78fc5fcd",
  "proofValue": "2NuYkfkFy8o4R8iRCaLzTYc1Uss8s22rvC7jiY4BA49sFmWbwbEkQ6BHQPULEp7mPfQQvJdKSjKRLJF8sp62GKvo"
}
```

The differences between the new and old style are explained in more detail in the next section.

## A spec in the process of being refactored

When securing data (such as verifiable credentials) with the IOTA Identity framework we (currently) use a proof format defined in the [verifiable credential data integrity specification](https://w3c.github.io/vc-data-integrity/), a spec formerly known as *Linked Data Proofs* which is now in the process of being refactored. As part of the refactor a new value for the `type` field has been introduced called `DataIntegrityProof`. The idea is that rather than having one ```(Proof.type, VerificationMethod.type)``` pair for every cryptosuite, e.g.

```
(Ed25519Signature2020, Ed25519VerificationKey2020),
(EcdsaSecp256k1Signature2019, EcdsaSecp256k1VerificationKey2019),
(JcsEd25519Signature2020, JcsEd25519Key2020)
etc.
```

there is instead a `cryptosuite` field in the proof specifying the name of the cryptosuite, and similarly there is a new verification method `type` called `Multikey` which is capable of representing many popular public key types.

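To make the difference between the two styles concrete, here is a minimal sketch of how a verifier might work out which suite a proof belongs to. The `ProofHeader` struct and `suite_identifier` function are hypothetical names made up for this illustration; they are not part of our library or of any spec.

```rust
/// Minimal view of a proof object: only the fields relevant for selecting a suite.
/// (Hypothetical type, for illustration only.)
#[derive(Debug)]
pub struct ProofHeader {
    /// Value of the proof's `type` field.
    pub proof_type: String,
    /// Value of the optional `cryptosuite` field (new style only).
    pub cryptosuite: Option<String>,
}

/// Returns the identifier a verifier would use to pick a suite implementation.
/// - New style: `type` is `DataIntegrityProof` and `cryptosuite` names the suite.
/// - Old style: the `type` value itself names the suite.
pub fn suite_identifier(proof: &ProofHeader) -> Result<&str, String> {
    match (proof.proof_type.as_str(), proof.cryptosuite.as_deref()) {
        ("DataIntegrityProof", Some(suite)) => Ok(suite),
        ("DataIntegrityProof", None) => {
            Err("DataIntegrityProof without a `cryptosuite` field".to_owned())
        }
        (legacy_type, _) => Ok(legacy_type),
    }
}
```

With the first example above this would yield `ecdsa-2022`, and with the second one `JcsEd25519Signature2020`.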
The new `DataIntegrityProof` system is more cryptographically agile because one can now use the same verification method when verifying signatures from multiple suites (something that is reasonable to expect if the suites only differ in the canonicalization and/or hashing algorithm). It also helps limit the number of contexts that are necessary for those interested in working with linked data.

In my understanding many of the existing linked data cryptosuites from the [LD-cryptosuite registry](https://w3c-ccg.github.io/ld-cryptosuite-registry/) are in the process of being adapted to the `DataIntegrityProof` type (see for example [di-eddsa-2020](https://w3c-ccg.github.io/di-eddsa-2020/)). I am however unsure whether all of these suites will deprecate/remove the old type value or if both will be supported in the years to come.

## Linked data proofs

As the previous name of the VC data integrity spec suggests, most cryptosuites utilising this proof format are based on linked data. Pretty much all of these suites use [URDNA2015](https://w3c-ccg.github.io/rdf-dataset-canonicalization/spec/index.html#dfn-urdna2015), which canonicalizes the RDF dataset obtained from the JSON-LD document into RDF quads. In simplified terms this means that one signs the semantic meaning of the data rather than the data itself.

For developers who are not familiar with linked data technologies, proofs based on signing the canonical RDF quads representation can lead to confusion and at worst security issues. This is perhaps best illustrated in [these examples](https://github.com/setl/rdf-urdna/tree/master/jsonld-warnings). I believe this is an issue even when following the [semantic interoperability guidelines from the VC data model 2.0 spec](https://www.w3.org/TR/vc-data-model-2.0/#semantic-interoperability), but I have not been able to verify this yet (the json-ld playground has been unavailable recently).

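One pitfall of this kind that I am aware of (my own toy illustration, not taken from the linked examples) is that JSON-LD processing in its default mode silently drops properties that the context does not define, so they never end up in the signed quads. The sketch below models only that dropping behaviour. It is not URDNA2015, and `toy_semantic_bytes`, `toy_syntactic_bytes` and the flat `Document` type are hypothetical names invented for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A flat, JSON-like document: property name -> string value.
pub type Document = BTreeMap<String, String>;

/// Toy stand-in for "expand, then canonicalize": properties whose names are not
/// defined by the context are silently dropped, the rest are emitted in sorted
/// order. This only mimics the dropping behaviour, it is NOT URDNA2015.
pub fn toy_semantic_bytes(doc: &Document, context_terms: &BTreeSet<&str>) -> Vec<u8> {
    doc.iter()
        .filter(|(key, _)| context_terms.contains(key.as_str()))
        .map(|(key, value)| format!("{}={}\n", key, value))
        .collect::<String>()
        .into_bytes()
}

/// What a syntax-level scheme would sign: every property, defined or not.
pub fn toy_syntactic_bytes(doc: &Document) -> Vec<u8> {
    doc.iter()
        .map(|(key, value)| format!("{}={}\n", key, value))
        .collect::<String>()
        .into_bytes()
}
```

If the context only defines `name`, then two documents that differ in an undefined `degree` property produce the same bytes from `toy_semantic_bytes`, so a signature over those bytes would not notice the change, while `toy_syntactic_bytes` produces different bytes for the two documents.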
## JcsEd25519Signature2020

This is the one linked data proof suite I am aware of that does not use URDNA2015, but rather JCS (the JSON Canonicalization Scheme) to serialize the JSON-LD document. This means that one signs the data essentially up to whitespace and key/field ordering. We believe this is much simpler for developers to reason about, but it comes at the cost of not being able to re-organize the document into a semantically equivalent representation without breaking the signature.

From what I can tell there are no plans to update the spec to use the new `DataIntegrityProof` type, and we are aware of open issues with the specification such as [#22](https://github.com/decentralized-identity/JcsEd25519Signature2020/issues/22) and [#26](https://github.com/decentralized-identity/JcsEd25519Signature2020/issues/26). Furthermore there are also a few open issues on the repository for the `serde_jcs` crate we use, but note that there is now also a perhaps better maintained crate called [json-syntax](https://crates.io/crates/json-syntax) that has JCS functionality.

Fixing the spec (and possibly our implementation) is a possible way forward, but we need to think carefully about the following questions before doing so:

- Are we willing to accept the responsibility of maintaining our own spec?
- Are there (enough) other parties interested in this suite, or does reviving the spec accomplish little more than introducing an additional burden for implementers who want to support as many cryptosuites as possible?

### DataIntegrity based Cryptosuites in code

If we are to go for the `DataIntegrityProof` format as our default, we will probably want to enable cryptographic agility by allowing users to plug in a [Cryptosuite](https://w3c.github.io/vc-data-integrity/#cryptographic-suites) of their choice when securing verifiable data.

We have considered a trait for this (a first PoC can be found in [this branch](https://github.com/iotaledger/identity.rs/blob/feat/suite-api/identity_key_storage/src/secured_methods/cryptosuite.rs#L15)), but it is not clear whether this will be sufficient. Something like the [BBS+ cryptosuite](https://w3c-ccg.github.io/ldp-bbs2020/) might require additional parameters (which one would otherwise need to set in a stateful way).

Our branch also contains a more conservative and low-level [mechanism](https://github.com/iotaledger/identity.rs/blob/feat/suite-api/identity_key_storage/src/secured_methods/document_ext.rs#L93) as a complementary approach to unlock cryptographic agility. The method provides a way to extract a `RemoteKey` (or perhaps it should rather be named `SecuredKey`?) from a document, which can then be called at any point to sign data with the given algorithm (if it is supported by the storage and the extracted key).

It is also important to mention that it is not clear to us at this point what is needed to make the `PresentationValidator` capable of validating proofs from arbitrary cryptosuites. There would probably be a need for a mapping of cryptosuite handlers, but it is not clear how much of the validation each handler does and what can be taken care of more centrally. It is also hard to abstract over different canonicalization methods, especially if cryptosuite-dependent data transformations are required before any form of validation can take place.

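To give the "mapping of cryptosuite handlers" idea some shape, here is a rough sketch of one possible split between central and per-suite work. None of this is taken from the PoC branch: `CryptosuiteHandler`, `CryptosuiteRegistry`, `ValidationError` and `ToySuite` are hypothetical names, and the toy suite performs no real cryptography. In this variant the handler owns canonicalization and signature checking, while the lookup and error reporting are done centrally.

```rust
use std::collections::HashMap;

/// Hypothetical per-suite handler: everything that depends on the cryptosuite.
pub trait CryptosuiteHandler {
    /// Suite-specific transformation and canonicalization of the document.
    fn canonicalize(&self, document: &str) -> Vec<u8>;
    /// Suite-specific check of `proof_value` over the canonical bytes.
    fn verify(&self, canonical: &[u8], proof_value: &[u8], public_key: &[u8]) -> bool;
}

#[derive(Debug, PartialEq)]
pub enum ValidationError {
    UnsupportedCryptosuite(String),
    InvalidProof,
}

/// Central mapping from the proof's `cryptosuite` value to its handler.
#[derive(Default)]
pub struct CryptosuiteRegistry {
    handlers: HashMap<String, Box<dyn CryptosuiteHandler>>,
}

impl CryptosuiteRegistry {
    pub fn register(&mut self, cryptosuite: &str, handler: Box<dyn CryptosuiteHandler>) {
        self.handlers.insert(cryptosuite.to_owned(), handler);
    }

    /// Looks up the handler centrally, then delegates the suite-specific steps.
    pub fn validate(
        &self,
        cryptosuite: &str,
        document: &str,
        proof_value: &[u8],
        public_key: &[u8],
    ) -> Result<(), ValidationError> {
        let handler = self
            .handlers
            .get(cryptosuite)
            .ok_or_else(|| ValidationError::UnsupportedCryptosuite(cryptosuite.to_owned()))?;
        let canonical = handler.canonicalize(document);
        if handler.verify(&canonical, proof_value, public_key) {
            Ok(())
        } else {
            Err(ValidationError::InvalidProof)
        }
    }
}

/// Toy suite used only to exercise the dispatch. NOT cryptography.
pub struct ToySuite;

impl CryptosuiteHandler for ToySuite {
    /// Strips all whitespace (even inside strings, which a real scheme must not do).
    fn canonicalize(&self, document: &str) -> Vec<u8> {
        document
            .chars()
            .filter(|c| !c.is_whitespace())
            .collect::<String>()
            .into_bytes()
    }

    /// Accepts the proof iff it equals the key bytes followed by the canonical bytes.
    fn verify(&self, canonical: &[u8], proof_value: &[u8], public_key: &[u8]) -> bool {
        let expected: Vec<u8> = public_key.iter().chain(canonical.iter()).copied().collect();
        proof_value == expected.as_slice()
    }
}
```

A suite like BBS+ that needs extra parameters would not fit this two-method interface as it stands, which is exactly the open question raised above.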
# JWT as an alternative proof format

## Example(s)

JWT header of a JWT based verifiable presentation (non-normative)

```
{
  "alg": "RS256",
  "typ": "JWT",
  "kid": "did:example:ebfeb1f712ebc6f1c276e12ec21#keys-1"
}
```

JWT payload of a JWT based verifiable presentation (non-normative)

```
{
  "iss": "did:example:ebfeb1f712ebc6f1c276e12ec21",
  "jti": "urn:uuid:3978344f-8596-4c3a-a978-8fcaba3903c5",
  "aud": "did:example:4a57546973436f6f6c4a4a57573",
  "nbf": 1541493724,
  "iat": 1541493724,
  "exp": 1573029723,
  "nonce": "343s$FSFDa-",
  "vp": {
    "@context": [
      "https://www.w3.org/2018/credentials/v1",
      "https://www.w3.org/2018/credentials/examples/v1"
    ],
    "type": ["VerifiablePresentation"],
    // Array of Base64 encoded strings
    "verifiableCredential": [" "]
  }
}
```

Verifiable presentation using JWT compact serialization (non-normative)

eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6ImRpZDpleGFtcGxlOjB4YWJjI2tleTEifQ.e
yJpc3MiOiJkaWQ6ZXhhbXBsZTplYmZlYjFmNzEyZWJjNmYxYzI3NmUxMmVjMjEiLCJqdGkiOiJ1cm46d
XVpZDozOTc4MzQ0Zi04NTk2LTRjM2EtYTk3OC04ZmNhYmEzOTAzYzUiLCJhdWQiOiJkaWQ6ZXhhbXBsZ
To0YTU3NTQ2OTczNDM2ZjZmNmM0YTRhNTc1NzMiLCJuYmYiOjE1NDE0OTM3MjQsImlhdCI6MTU0MTQ5M
zcyNCwiZXhwIjoxNTczMDI5NzIzLCJub25jZSI6IjM0M3MkRlNGRGEtIiwidnAiOnsiQGNvbnRleHQiO
lsiaHR0cHM6Ly93d3cudzMub3JnLzIwMTgvY3JlZGVudGlhbHMvdjEiLCJodHRwczovL3d3dy53My5vc
mcvMjAxOC9jcmVkZW50aWFscy9leGFtcGxlcy92MSJdLCJ0eXBlIjpbIlZlcmlmaWFibGVQcmVzZW50Y
XRpb24iLCJDcmVkZW50aWFsTWFuYWdlclByZXNlbnRhdGlvbiJdLCJ2ZXJpZmlhYmxlQ3JlZGVudGlhb
CI6WyJleUpoYkdjaU9pSlNVekkxTmlJc0luUjVjQ0k2SWtwWFZDSXNJbXRwWkNJNkltUnBaRHBsZUdGd
GNHeGxPbUZpWm1VeE0yWTNNVEl4TWpBME16RmpNamMyWlRFeVpXTmhZaU5yWlhsekxURWlmUS5leUp6Z
FdJaU9pSmthV1E2WlhoaGJYQnNaVHBsWW1abFlqRm1OekV5WldKak5tWXhZekkzTm1VeE1tVmpNakVpT
ENKcWRHa2lPaUpvZEhSd09pOHZaWGhoYlhCc1pTNWxaSFV2WTNKbFpHVnVkR2xoYkhNdk16Y3pNaUlzS
W1semN5STZJbWgwZEhCek9pOHZaWGhoYlhCc1pTNWpiMjB2YTJWNWN5OW1iMjh1YW5kcklpd2libUptS
WpveE5UUXhORGt6TnpJMExDSnBZWFFpT2pFMU5ERTBPVE0zTWpRc0ltVjRjQ0k2TVRVM016QXlPVGN5T
Xl3aWJtOXVZMlVpT2lJMk5qQWhOak0wTlVaVFpYSWlMQ0oyWXlJNmV5SkFZMjl1ZEdWNGRDSTZXeUpvZ
EhSd2N6b3ZMM2QzZHk1M015NXZjbWN2TWpBeE9DOWpjbVZrWlc1MGFXRnNjeTkyTVNJc0ltaDBkSEJ6T
2k4dmQzZDNMbmN6TG05eVp5OHlNREU0TDJOeVpXUmxiblJwWVd4ekwyVjRZVzF3YkdWekwzWXhJbDBzS
W5SNWNHVWlPbHNpVm1WeWFXWnBZV0pzWlVOeVpXUmxiblJwWVd3aUxDSlZibWwyWlhKemFYUjVSR1ZuY
21WbFEzSmxaR1Z1ZEdsaGJDSmRMQ0pqY21Wa1pXNTBhV0ZzVTNWaWFtVmpkQ0k2ZXlKa1pXZHlaV1VpT
25zaWRIbHdaU0k2SWtKaFkyaGxiRzl5UkdWbmNtVmxJaXdpYm1GdFpTSTZJanh6Y0dGdUlHeGhibWM5S
jJaeUxVTkJKejVDWVdOallXeGhkWExEcVdGMElHVnVJRzExYzJseGRXVnpJRzUxYmNPcGNtbHhkV1Z6U
EM5emNHRnVQaUo5ZlgxOS5LTEpvNUdBeUJORDNMRFRuOUg3RlFva0VzVUVpOGpLd1hoR3ZvTjNKdFJhN
TF4ck5EZ1hEYjBjcTFVVFlCLXJLNEZ0OVlWbVIxTklfWk9GOG9HY183d0FwOFBIYkYySGFXb2RRSW9PQ
nh4VC00V05xQXhmdDdFVDZsa0gtNFM2VXgzclNHQW1jek1vaEVFZjhlQ2VOLWpDOFdla2RQbDZ6S1pRa
jBZUEIxcng2WDAteGxGQnM3Y2w2V3Q4cmZCUF90WjlZZ1ZXclFtVVd5cFNpb2MwTVV5aXBobXlFYkxaY
WdUeVBsVXlmbEdsRWRxclpBdjZlU2U2UnR4Snk2TTEtbEQ3YTVIVHphbllUV0JQQVVIRFpHeUdLWGRKd
y1XX3gwSVdDaEJ6STh0M2twRzI1M2ZnNlYzdFBnSGVLWEU5NGZ6X1FwWWZnLS03a0xzeUJBZlFHYmciX
X19.ft_Eq4IniBrr7gtzRfrYj8Vy1aPXuFZU-6_ai0wvaKcsrzI4JkQEKTvbJwdvIeuGuTqy7ipO-EYi
7V4TvonPuTRdpB7ZHOlYlbZ4wA9WJ6mSVSqDACvYRiFvrOFmie8rgm6GacWatgO4m4NqiFKFko3r58Lu
eFfGw47NK9RcfOkVQeHCq4btaDqksDKeoTrNysF4YS89INa-prWomrLRAhnwLOo1Etp3E4ESAxg73CR2
kA5AoMbf5KtFueWnMcSbQkMRdWcGC1VssC0tB0JffVjq7ZV6OTyV4kl1-UVgiPLXUTpupFfLRhf9QpqM
BjYgP62KvhIvW8BbkGUelYMetA

The three dot-separated segments of this string correspond to the encoded header, payload and JWS signature respectively.

## VC-JWT

As the example above suggests, verifiable credentials and presentations are not meant to be secured by a JWS directly, but should first be transformed into a JWT payload using certain pre-registered claim names. This transformation process is deterministic and defined in the [VC-JWT spec](https://w3c.github.io/vc-jwt/).

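As a rough sketch of that transformation for a credential, the mapping to registered claims as I understand it from the VC data model's JWT encoding rules is `issuer` -> `iss`, `issuanceDate` -> `nbf`, `expirationDate` -> `exp`, `id` -> `jti` and `credentialSubject.id` -> `sub`. The `SimpleCredential` struct and `registered_claims` function are hypothetical and heavily simplified: dates are assumed to be already converted to Unix seconds, all claim values are kept as strings, and the `vc` claim holding the rest of the credential is left out.

```rust
use std::collections::BTreeMap;

/// Reduced credential with only the properties that map to registered claims.
/// (Hypothetical type; timestamps are assumed to already be Unix seconds.)
pub struct SimpleCredential {
    pub id: Option<String>,
    pub issuer: String,
    pub subject_id: Option<String>,
    pub issuance_date: i64,
    pub expiration_date: Option<i64>,
}

/// Produces the registered claims of the JWT payload. Values are kept as strings
/// for simplicity; a real payload would encode the timestamps as JSON numbers.
pub fn registered_claims(credential: &SimpleCredential) -> BTreeMap<&'static str, String> {
    let mut claims = BTreeMap::new();
    claims.insert("iss", credential.issuer.clone());
    claims.insert("nbf", credential.issuance_date.to_string());
    if let Some(id) = &credential.id {
        claims.insert("jti", id.clone());
    }
    if let Some(subject_id) = &credential.subject_id {
        claims.insert("sub", subject_id.clone());
    }
    if let Some(expiration) = credential.expiration_date {
        claims.insert("exp", expiration.to_string());
    }
    claims
}
```

Since the function has no inputs other than the credential itself, the same credential always yields the same claims, which is the determinism referred to above.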
The main benefits of this approach are that it is an already established proof format with existing infrastructure and that it is in fact required [by EBSI](https://ec.europa.eu/digital-building-blocks/wikis/display/EBSIDOC/E-signing+and+e-sealing+Verifiable+Credentials+and+Verifiable+Presentations) (thanks Eike). When compared to `DataIntegrityProof` with either URDNA2015 or JCS, the JWT format does however come with the major drawback that one cannot directly transform a JWT representation into the data model without destroying the proof. The full JWT always needs to be kept around, not just the JWS signature. This is because canonicalization is not applied and string/bytes representations of JSON are non-deterministic.

### What supporting JWT might look like in code

If we want this to be our (for now) supported way of securing data, then we should at least cover all of the following user stories (as a user I want to):

1. Build a Credential in a way that is intuitive as long as I have a somewhat decent grasp of the VC data model.
2. (Similarly) Build a Presentation in a way that is intuitive as long as I have a somewhat decent grasp of the VC data model.
3. Secure a Credential in accordance with VC-JWT.
4. Secure a Presentation in accordance with VC-JWT.
5. Validate a Credential represented as a JWT.
6. Validate a Presentation represented as a JWT.
7. Encounter as few surprises as possible when working with the IOTA Identity library.

#### Building a Credential

After discussing with Philipp I think we probably want to keep the `Credential` and `CredentialBuilder` more or less as they are now, so that you can build up your credential using the existing API(s).

#### Securing a Credential

I expect this to look something like the following

```
// This is a method on the CoreDocument
/// Generates a JWT protecting the Verifiable Credential in accordance with the
/// VC-JWT spec.
CoreDocumentExt::create_vc_jwt(
  &self,
  storage: Storage,
  data: &Credential,
  alg: &str,
  fragment: &str,
) -> Result<JWT, VCJwtGenerationError>;
```

where the return type `JWT` essentially just wraps a String representation of the JWS (JWT with JWS signature). The described method produces the JWT payload (following the prescribed transformations: issuer -> iss, etc.), generates the JWT header in accordance with `alg` and `fragment` (we might want some more options for advanced use, in order to be fully compatible with the EBSI requirements) and then uses the `Storage` to sign the JWT, which our code then assembles into a JWS.

A thing to note here is that we don't need to supply a `Cryptosuite` (an object that produces a `DataIntegrityProof` according to some specification). This is due to the more restrictive nature of the registered JWS digital signature algorithms, (some of) which the `Storage` should be able to handle directly. In other words we have traded the expressiveness of linked data proofs for simplicity.

#### Building a Presentation

Again we keep `Presentation` and `PresentationBuilder` as before. If one wants to build a JWT-style presentation, it must only include credentials that are secured by JWTs. In other words one would do something like this:

```
// `proof` is the `JWT` previously generated for this credential
credential.attach_jwt_proof(proof);
presentation_builder.credential(credential);
let presentation = presentation_builder.build()?;
```

This will have implications when attempting to produce a JWT for the presentation, which we explain in the next section. After discussing with Philipp we would prefer to avoid leaking implementation details, so there is no way to directly print a `Presentation`.

#### Securing a Presentation

The API is similar to the credential case

```
// This is a method on the CoreDocument
/// Generates a JWT protecting the Verifiable Presentation in accordance with the
/// VC-JWT spec.
///
/// This can only succeed in the case where all credentials contained in the
/// presentation have an attached JWT (see `Credential::attach_jwt_proof`).
CoreDocumentExt::create_vp_jwt(
  &self,
  storage: Storage,
  data: &Presentation,
  alg: &str,
  fragment: &str,
  options: &ProofOptions,
) -> Result<JWT, VPJwtGenerationError>;
```

however, as mentioned, this can only succeed if every `Credential` contained in the `Presentation` (internally) contains a corresponding JWT. Unfortunately this goes against point 7 (the principle of least astonishment), but I don't see how we can improve much on this without a proliferation of additional types. Perhaps having a dedicated `JWTPresentationBuilder` where users supply `JWT` representations of Credentials would be an improvement?

#### Validating a Credential

We introduce a new type

```
pub struct JWSVerifier {
  alg_handlers: HashMap<
    String,
    fn(&JWT, decoded_header: Header, public_key: &PublicKeyJwk) -> Result<(), JWTVerificationError>,
  >,
}
```

which is attached to the `CredentialValidator`. The validation API would be something like

```
CredentialValidator::validate<DOC: AsRef<CoreDocument>>(
  &self,
  credential_jwt: JWT,
  issuer_documents: &[DOC],
) -> Result<Credential, CredentialValidationError>;
```

If the validation succeeds you will get the decoded `Credential` returned (so you can do your own additional validations if necessary).

#### Validating a Presentation

This is similar to validating a Credential and is omitted.

#### More Considerations

In order to support `VC-JWT` we will also need to support verification material of type [`PublicKeyJwk`](https://w3c.github.io/vc-data-integrity/#dfn-publickeyjwk).

# Comparison table

Let us try to summarize how the proof formats and approaches compare in a table. We use the abbreviation `DI` for `DataIntegrityProof`.

| Property                  | DI (default = `di-eddsa-2020`) | `JcsEd25519Signature2020` updated | VC-JWT          |
|---------------------------|--------------------------------|-----------------------------------|-----------------|
| Required by EBSI          | No?                            | No                                | Yes             |
| Cryptographic agility     | Advanced                       | None as of now                    | Basic           |
| Stability                 | Spec being refactored          | Needs to be ensured by us         | Established\*   |
| Proof secures             | Semantic meaning               | JSON object                       | Payload         |
| Cognitive complexity      | High                           | Low                               | Low-Medium      |
| Popularity                | Strong in the VC WG            | Unknown                           | Pre-established |
| Open world data modelling | Yes                            | Partial\*\*                       | Partial\*\*     |
| Code complexity           | High                           | Low                               | Medium          |

\*Note that although the JWT format is established, the VC-JWT spec could still change (to require different transformations for example). There are in fact some members of the VC WG at the W3C that are not happy with the spec as it is now.

\*\*By *partial* support for open world data modelling we mean that it is possible for issuers to include contexts in the credential to be signed, but our `JcsEd25519Signature2020` or `VC-JWT` implementations as envisioned would not enforce correct handling of said contexts.

See also the two comparison tables in the [VC implementation guidelines](https://w3c.github.io/vc-imp-guide/#proof-formats) for further comparisons.

## Personal opinion(s)

- Oliver: I would suggest to primarily support `VC-JWT` in the 0.7 release due to it being required by EBSI and the fact that it delivers some amount of cryptographic agility without introducing much complexity. Prioritising this now does not necessarily mean never supporting data integrity proof cryptosuites. Indeed we might very well want to implement BBS+ at some point.
- Philipp: I would go along with Oliver's opinion. I'm primarily worried about the (apparent) instability of both DI proofs and `VC-JWT`, again going back to the point that we want to deliver a _stable_ release of IOTA Identity for Stardust.
  But that isn't reasonably possible with the flux we see here at the spec level and the flux at the IOTA ledger level. As for the proof specs: since none of them are completely accepted specs, this might also free us in a way, since it's out of our control and we can't, and perhaps should not, worry about it too much.