---
tags: chrisundjan
---
# Structured Key Schedule Design Using NKDFs
## Abstract
HMAC, specified in the 1997 RFC by Krawczyk, Bellare and Canetti, is
nowadays the de facto default pseudorandom function used in key
exchange protocols. One of its most popular uses is specified
in the HKDF RFC by Krawczyk from 2010---indeed, HKDF is a cornerstone
of the key schedules of TLS, MLS and the Noise protocol family.
However, unlike in Krawczyk's carefully outlined extract-once approach, TLS, MLS and Noise extract multiple times from the same key material. In addition, since they combine multiple
keys, current key schedules consist of complex chainings of HKDFs. The choices for which context to hash are the result of extended working group discussions, and the rationales are not
immediate from the design.
In this paper, we take a structured approach to key schedule
design. We first introduce and construct NKDF, namely key derivation
functions which gracefully combine multiple keys, and we provide
instructions for which context to include. We then show that the
NKDF building blocks can be securely combined in flexible ways using
simple domain-separation principles.
## Introduction (2020-12-27)
Cryptographic schemes rely on carefully designed key schedules to derive keys for use in their subcomponents. While key schedules have a long history in the context of symmetric-key primitives such as AES, key schedules for key exchange protocols have only recently become complex design objects.
A central component in the key schedules of cryptographic key exchange protocols such as TLS, MLS and the Noise family is the HKDF key derivation function designed by Krawczyk. Krawczyk carefully designed HKDF to first _extract_ a key from input key material and then _expand_ the extracted key into multiple output keys. KDFs facilitate key schedule design since they allow deriving multiple keys from a single input key and including protocol context in the derivation. Yet, key schedule designs are faced with combining _several_ input keys which become available at different points in time. HKDF was not designed to handle such complex interfaces, and new design patterns have since emerged to provide such functionality. Perhaps surprisingly, the security of the new design patterns relies on non-standard ad hoc assumptions and, e.g., requires extracting multiple times from the same key material---a no-go in extractor design and something which HKDF specifically aimed to avoid.
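For reference, HKDF's two phases can be sketched directly from HMAC. The following is a minimal sketch of the RFC 5869 construction, instantiated with SHA-256:

```python
import hashlib
import hmac

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # Extract: concentrate the entropy of the input key material (ikm)
    # into a single pseudorandom key (PRK), keyed by the salt.
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # Expand: stretch the PRK into `length` bytes of output key material,
    # bound to the context string `info` (RFC 5869 counter construction).
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]
```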
It is an open question how to construct a key schedule whose security is based on standard assumptions. The functionality of the key schedule needs to support
- multiple input keys which become available at different points in time
- an a priori unknown number of input keys
- deletion of (some) input keys
- derivation of key material from any subset of the input keys
- inclusion of context
In turn, the design needs to
- avoid extracting multiple times from the same value
- provide guidance for which protocol context to include
In this work, we split these problems into two parts. We start with a _static_ building block called NKDF which has a _fixed_ number of input keys. We then turn to constructing key schedules out of NKDFs. Key schedules allow for an _arbitrary_ number of input keys and support deletion of earlier input keys. For both NKDFs and key schedules, we provide security definitions, constructions and reductions to a collision-resistant pseudorandom function (PRF). The de facto default choice for a collision-resistant PRF is Krawczyk's HMAC function.
NKDFs provide pseudorandom keys as long as one of the input keys has sufficient min-entropy. Our construction of NKDFs consists of the three phases _extract once_, _domain separation_ and _statistical single-use combiner_. The idea is to first extract a single key from each input key material, then apply domain separation to obtain single-use key values and then finally apply a single-use (statistically secure) combiner/extractor which yields a uniformly random key as long as one of its inputs is uniformly random. We now elaborate on each of the phases in turn.
**Extract once**: Following Krawczyk's HKDF design, we suggest to extract a key once from each input key material and then use the extracted key to key a PRF for further derivation. Importantly, we suggest to use a fixed salt value (which is drawn uniformly at random at protocol standardization time) for extraction. This is useful in the protocol setting since a _protocol-chosen_ salt is susceptible to modification by active attackers, which might lead the protocol participants to use _different_ salts with the _same_ input key material, leading to double extraction, contrary to the extract-then-expand paradigm outlined by Krawczyk.
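A minimal sketch of this phase (the salt value below is a placeholder stand-in, not a standardized constant):

```python
import hashlib
import hmac

# Placeholder fixed salt; in the suggested design it would be drawn
# uniformly at random once, at protocol standardization time.
FIXED_SALT = bytes(32)

def extract_once(ikm: bytes) -> bytes:
    # One extraction per input key material, always under the same
    # fixed salt, so the same ikm is never extracted under two salts.
    return hmac.new(FIXED_SALT, ikm, hashlib.sha256).digest()
```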
**Domain separation**: An established and useful design principle in key derivation is _domain separation_. Namely, it is common to include a _label_ value in the key derivation which is different for each KDF employed in the key schedule of a protocol. We suggest to also apply careful domain separation between the following cases:
1. the position in which a key is used. E.g., there should be no confusion/correlation between the case where a key is used as the first or as the second input.
2. the overall number of keys. E.g., there should be no confusion/correlation between a case where a key is used in a 2-key NKDF and a 3-key NKDF.
3. the identity of all keys which are input to the derivation. E.g., if one of the input keys is different, then the other keys resulting from this phase should be different, too.
For item (3), we need a notion of _identity_ or _handle_ of a key. Some keys come with natural, public handles. E.g., for the Diffie-Hellman key g^xy^, a natural identity is the lexicographically sorted pair (g^x^, g^y^). For other keys, we suggest to derive and include such a handle.
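For Diffie-Hellman keys, such a handle can be sketched as follows (the function name is ours, and we assume fixed-length encodings of the shares):

```python
def dh_handle(share_a: bytes, share_b: bytes) -> bytes:
    # Public handle for the Diffie-Hellman key g^xy: the
    # lexicographically sorted pair of public shares (g^x, g^y).
    lo, hi = sorted((share_a, share_b))
    return lo + hi
```

Both participants compute the same handle regardless of the order in which they learned the shares.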
**Statistical Combiners**: The domain-separation phase now results in key values which are only used _once_. Thus, we can rely on _statistical combiners_, i.e., combiners whose security holds information-theoretically, such as the xor combiner for pseudorandomness.
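Putting the three phases together, a 2-key NKDF can be sketched as follows. This is a simplified illustration under our own naming and label encoding, not the paper's exact construction:

```python
import hashlib
import hmac

FIXED_SALT = bytes(32)  # stand-in for a salt fixed at standardization time

def prf(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def nkdf2(ikm1: bytes, ikm2: bytes, handles: list, context: bytes) -> bytes:
    # Phase 1 -- extract once: one extraction per input key material,
    # always under the same fixed salt.
    k1 = prf(FIXED_SALT, ikm1)
    k2 = prf(FIXED_SALT, ikm2)
    # Phase 2 -- domain separation: the label encodes the position of
    # each key, the overall number of keys, the handles of *all* input
    # keys, and the protocol context.
    common = b"n=2|" + b"|".join(handles) + b"|" + context
    v1 = prf(k1, b"pos=1|" + common)
    v2 = prf(k2, b"pos=2|" + common)
    # Phase 3 -- statistical single-use combiner: xor of values that are
    # used only once; the result is uniform if either value is uniform.
    return bytes(a ^ b for a, b in zip(v1, v2))
```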
A key schedule might chain several NKDFs in order to
- be able to delete earlier keys. Namely, for an NKDF evaluation, all inputs need to be known whereas for chaining, only the chaining key needs to be stored. Key deletion strengthens forward secrecy.
- be able to combine an a priori unknown number of keys. In particular, the length of a chain might be unbounded when it corresponds to the number of session resumptions in TLS.
- gain efficiency when only a subset of keys is known. Namely, in an NKDF, unknown keys can be replaced by a zero string---while this is efficient when combining few keys, the approach quickly becomes inefficient as the key count grows, e.g., due to potential session resumption.
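The chaining pattern above can be sketched as follows (hypothetical names; a bare HMAC call stands in for a real 2-key NKDF):

```python
import hashlib
import hmac

def nkdf2(ck: bytes, k: bytes, label: bytes) -> bytes:
    # Placeholder 2-key derivation; stands in for a real NKDF.
    return hmac.new(ck, label + k, hashlib.sha256).digest()

def chain(keys):
    # Only the chaining key ck must be stored between steps; earlier
    # input keys can be deleted, strengthening forward secrecy.
    ck = bytes(32)  # all-zero initial chaining key (placeholder)
    for i, k in enumerate(keys):
        # distinct label per chain step => clean domain separation
        ck = nkdf2(ck, k, b"chain-step-%d|" % i)
    return ck
```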
We prove that this chaining approach is sound as long as the different NKDFs are cleanly separated by the use of different labels. The proof is a clean hybrid argument which proceeds in the topological order of the NKDFs.
The current protocol versions of TLS and MLS merge key material using the extract function of HKDF and thus rely on the security of multiple extraction from the same input key material using different salts. Thus, the current analyses of TLS need to rely on the salted oracle Diffie-Hellman assumption, which combines a Diffie-Hellman assumption with multiple extraction from the same Diffie-Hellman secret. In turn, the analysis of MLS assumes that the extract function is a PRF when keyed with the input key material---however, it has only been shown that the extract function of HKDF is a PRF when keyed with the salt. Indeed, there is no proof yet that the extract function of HKDF is a PRF when keyed with the input key material.
We construct key schedules for TLS and MLS that are straightforward applications of our NKDF-based approach and thus, at comparable efficiency, require neither multi-extraction nor extraction-as-PRF. For this, we introduce multiple variants of NKDF that operate on different classes of input keys (Diffie-Hellman, high-entropy and pseudorandom keys) and provide different guarantees (pseudorandom output keys, or pseudorandom and unique output keys).
The proofs presented in this paper follow the State Separating Proofs (SSP) methodology, outlined by... . SSPs turned out to be instrumental in defining clean abstractions and making general arguments over classes of key schedule constructions. In particular, since reductions can be specified by cuts in graphs, we can provide arguments merely by observing that the shape of graphs is well-formed.
- further sections (old)
- definitions
- assumptions
- ... NKDF
- KS ???
- all as SSP
- constructions
- ... NKDF
- generic KS
- reductions
- ... NKDF to assumptions
- generic KS to NKDF
---
## Introduction
#### Setup
- What is a KDF?
- principles
-
-
-
- concepts
-
-
-
- In this paper, NKDFs
- Why?
- Protocols construct them ad-hoc all the time
- Practical NKDF (attempts)
- list ad-hoc designs made by protocols
- maybe not all of them, but rather show the common pattern
- origin of the common pattern, OPTLS?
- limitations
- weak key uniqueness for internal keys
- strong assumptions
- We initiate the study of this kind of primitive
- Definitions
- and start by finding a definitional framework
- Constructions
- construction framework ?
- extract-then-expand
- multi-key-extract
- prf as expand
- DH x
- Sym x
- cr or not x
- tools for DomSep
- public handles for keys
- include them, their position, their number
- include other context as well
- Krawczyk labels for each key
- Key schedule framework: How to use an NKDF in a key schedule
- Proofs
- Proofs for (cr)(DH)NKDF & (cr)NPRF
- Can we have generic proofs with chaining and resumption?
- poly-bounded length
- specify interface: k1 with k2, then later k3...
- Approach:
- There is a graph (sketch, probably flawed)
- nodes
- ???
- "out_i" -> given to attacker
- out key packages
- "in_i" -> symbolic input variables
- DH keys?
- basically input key packages
- (DH)NKDF node
- node label: `([out.label | for all out s.t. out = (*, node)], salt)`
- edges
- properties:
- in/A -> NKDF:
- in -> NKDF: used as input
- * -> out: given to attacker
- edge label: `f(edge) = ([node.label | Edge(node, _) = edge], derivLabel)`
- label := edge label, n, value label 1, value label 2,..., value label n
- [a, b, c], ctx -> NKDF: edge label: "NKDF(ctx, [a,b,c])" (or similar)
- unique edge label ?
- NKDF(
(NKDF(a,b,c),d),
(NKDF(a,b,c),e),
(NKDF(a,b,c),f)
)
- statement:
- NKDF(n, a, b, c, NKDF(d, e, f) ...) honest iff a honest or b honest or c honest or d honest or e honest or f honest or k...
- Design Principles
- public handles
- can be achieved by mirroring the operations on keys
- leaves open handles of base keys
- DH: pubkey
- sym: XPD(key)???
- DomSep
- Applications
- TLS
- MLS
- Noise
------------------------------------------------
:-1:
The design of key exchange protocols
The design of classical key exchange protocols successively led up to
Lowe pointed
- Needham-Schroeder --> hash stuff into the key
Classical two-party key exchange protocols such as Needham-Schroeder --> domain separation between identity pairs.
- PCS/FS... --> multi-key (paragraph below)
- protocols combine many keys because of FS, PCS(!)
- Noise, TLS, MLS, Signal
- show diagrams of the combiners from the protocols
Recently developed protocols such as Transport Layer Security (TLS), Messaging Layer Security (MLS), Noise and Signal aim to achieve forward secrecy (FS) and/or post-compromise security (PCS). FS ensures security of past communication sessions even when a long-term secret gets corrupted later. PCS provides guarantees even _after_ a key was compromised by introducing new fresh key material. As a result, all aforementioned protocols need to combine key material into a single key. The respective constructions are sketched in Figure 1.
- this raises the question of where domain separation is needed, i.e., what should be hashed into which derivation, why, and what should not.
- jumping ahead: one should use as much context as is available, as early as possible. Not doing so may not immediately break the protocol, but it requires stronger assumptions and makes the analysis tedious.
- systematic studies, way of combining stuff.
Unfortunately, TLS, MLS, Noise and Signal
do not apply proper domain separation
=> stronger, unproven assumptions (which might well be okay, but one can also simply do it systematically)
MLS:
- TreeKEM is the worst of the worst:
- no group context in TreeKEM derivations
- no previous key material in TreeKEM derivations
- Jan has some decentralized paper that also does this badly
- https://eprint.iacr.org/2020/1281
- domain separation in literature
- domain separation for signatures:
- https://eprint.iacr.org/2016/711
- hash primitive domain separation:
- https://eprint.iacr.org/2013/772
- https://eprint.iacr.org/2017/003
- want FS/PCS
- what is FS/PCS and why is it important
- achieve this through key combiners
#### Structure
- view this primitive as a design and study it systematically
- provide a systematic way of studying such designs
- study existing designs (point our their weird assumptions and open questions, note that this recovers existing stuff)
- clean design not obvious (at least not based on established assumptions)
- make a new proposal based on established assumptions
#### view this primitive as a design and study it systematically
We set out to study such key-combining primitives (KCPs) systematically. We start by distinguishing four types of KCPs: NPRFs, NKDFs, DHNKDFs and MixNKDFs.
An NPRF takes several keys and combines them into a single pseudorandom key---as long as one of the input keys was uniformly random and secret. An NKDF achieves the same result as long as one of the inputs has high entropy and is secret. A DHNKDF takes as input pairs of a Diffie-Hellman share and an exponent and provides a pseudorandom key as long as one share-exponent pair was honest. Here, we call a share-exponent pair honest if both the exponent itself, as well as the exponent of the share, are secret. A MixNKDF mixes asymmetric DHNKDF inputs and symmetric NKDF inputs. In key exchange, each of these primitives is useful to provide key indistinguishability of the derived session keys.
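As a minimal illustration of the NPRF interface, consider the textbook xor-of-PRFs combiner (this is not the domain-separated construction from this paper, just the simplest instance of the primitive):

```python
import hashlib
import hmac
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def nprf(keys, x: bytes) -> bytes:
    # Xor combiner over per-key PRF evaluations: the output is
    # pseudorandom as long as at least one key is uniform and secret.
    return reduce(xor, (hmac.new(k, x, hashlib.sha256).digest()
                        for k in keys))
```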
We then introduce their collision-resistant variants crNPRF, crNKDF, crDHNKDF and crMixNKDF. These collision-resistant variants additionally provide unique output keys even when all key material is corrupted. In key exchange, collision-resistant key derivation primitives provide agreement properties, even when key material is corrupted. I.e., if two participants disagree on an important protocol variable, they will derive different keys.
Returning to MLS, TLS, and Noise, we study which security each of these designs aims for and establish the underlying assumptions, recovering the results of ... in a systematic way. In particular, Noise and TLS 1.3 rely on a crMixNKDF and require the salted oracle Diffie-Hellman assumption (eprint/2019/436).
> insert conclusion of study here + discuss the assumptions (and why they might be non-standard)
KCP constructions tend to be re-used: OPTLS was the first protocol to rely on HKDF as a key combiner, which is now used in TLS and MLS. In turn, Signal and Noise use HMAC as a key combiner.
The assumptions underlying these key combiners are plausible, yet non-standard. While there is no indication of practical attacks in the real world, it is considered good practice in protocol design to prefer constructions based on standard assumptions. This is especially the case for a cryptographic primitive which seems to be central to protocol design and is frequently re-used.
We thus suggest to base KCPs on _standard_ assumptions. Studying them in isolation additionally allows us to optimize efficiency. We construct (cr)NPRFs and (cr)NKDFs based on standard PRFs and KDFs, respectively (as well as collision-resistance). Their overall efficiency is similar to existing constructions, their parallel efficiency is significantly higher, see Table... . Our (cr)DHNKDF and (cr)MixNKDF constructions, in turn, additionally rely on the oracle Diffie-Hellman assumption which is non-standard, yet less exotic than ... , ... , ... and ..., respectively, which were used in the analysis of ... and ... .
#### Further research directions
- entropy is boring
- Is HMAC/HKDF a dual PRF?
- can we construct key combiners for DH shares based on standard CDH/DDH (without RO, obviously :-p)
All these constructions share one property: they all use HMAC or HKDF (which itself is based on HMAC) to construct the combiner, albeit in slightly different variations and under slightly different assumptions.
Are these
Bellare
#### provide a systematic way of studying such designs
#### study existing designs (point out their weird assumptions and open questions, note that this recovers existing stuff)
#### clean design not obvious (at least not based on established assumptions)
#### a new proposal based on established assumptions
#### entropy is not exciting