Design: FTL Identity

# Design: Identity in FTL

**Author**: @aszlavik @gak @aat

## Description (what)

FTL's internal identity scheme assigns unique identities that can be bound to resources within FTL. FTL identities will be assignable to FTL components (Runner, Controller, In/Egress etc.) as well as internal resources (such as databases, pubsub etc.). Leveraging internal identity constructs will pave the way for fine-grained access control and rich audit mechanisms.

## Motivation (why, optional)

Authentication and authorization are cornerstone primitives of any distributed system, underpinning access control capabilities. Today, FTL's only form of identity is the "Model Keys", largely used by the FTL controller to manage the cluster. However, these keys only encode infrastructure information and provide no protection against extraction of identity information or impersonation.

## Goals

- Identities are unforgeable by system observers.
- Establish trust in newly provisioned resources.
- Identities are bindable to FTL modules, components and resources.
- Identities cannot be extracted from their containing security domain.
- Ensure identities are unique across an FTL cluster.
- Logically separate FTL identity and the underlying infrastructure identity of a workload or resource.

### Non-Goals (optional)

- Define access control mechanisms within FTL (see capability model).

## Design (how)

### Overview

FTL's identity design follows a simple PKI approach. A component within the controller (currently termed the Provisioner Authority [PA]) acts as the root of trust. All identities within the cluster are derived from this authority.

For this initial version, an FTL identity is externally represented through the public key portion of an asymmetric keypair. We are choosing EdDSA as our signature scheme and Curve25519 as our elliptic curve. An FTL identifier will therefore be 32 bytes in size.

Identity management follows these principles:

- Identities are only valid if validated by the PA.
- Identity credentials are bound to the resource they are assigned to.
- Identity credentials should not be extractable from their containing security boundary.
- Identities are cheap to create and follow the lifecycle of the resource they are attached to.
- Identities assigned to compute resources should be differentiable between instances of the resource (i.e. different deployments).

These principles aid in defining security domains for identities as well as system auditability.

### Protocol

There are two discrete flows where identity information is exchanged within FTL today: establishment of identity and subsequent authentication with the cluster.

#### Establishing Identity

Establishing identity within an FTL deployment requires interaction between the component / resource and the Provisioner Authority. Our initial use case is focused on establishing identity for FTL Runners, but this approach can be extended to any resource so long as it can communicate with the PA.

Steps:

1. The provisioner instantiates a new runner and dispatches a workload to it.
   - 1a. The provisioner injects a nonce into the environment of the new workload (32 bits of randomness).
   - 1b. The provisioner records the nonce, alongside other metadata pertaining to the newly provisioned instance.
2. The runner runtime generates an identity key pair.
3. The runner sends its public key and nonce to the PA for signing.
4. The PA attests to the legitimacy of the runner, signs the public key and returns it to the runner.

The random nonce of the runner

#### Using FTL Identity for Authentication

To use the identity for authentication of outbound requests, a runner (bearer) presents its signed public key in addition to a signature over the request made. The signature is generated using the identity key.

Steps:

5. The runner crafts a request, signs it with its identity key and sends it to the upstream resource along with its signed public key.
6.
   The resource node verifies the authenticity of the public key against the public key of the PA. The resource node then verifies the authenticity of the request against the public key of the origin.
7. At this stage, the resource node trusts that the request was issued by the public key holder and may apply authorization controls.

**A note on practicality:** This proposal has focused only on the critical authentication components. In practice, a runner would have its public key and additional metadata signed by the PA, for example a human-readable identifier against which system administrators may craft authorization policies.

This approach requires asymmetric crypto operations on every request that we wish to authenticate. If needed, future performance improvements could include:

- Symmetric key exchange and caching between peers. Peers could exchange symmetric keys on first contact (similar to a TLS handshake) and use this key for future authenticated message exchange.
- Key caching. Resources could cache identifying information about a requesting counterparty. With this approach we do not explicitly authenticate every request, but bind the authentication to an open socket, underlying mTLS certificate or similar datum.
- Transition to full mTLS or SPIFFE identities at the application layer.

![image](https://hackmd.io/_uploads/HkislgcTC.png)

### Required changes (how)

FTL Controller:

- Generate a root keypair (PA) for the cluster on initial setup of FTL.
- Add a key signing service for runners / resources.

FTL Runner:

- Generate local identity key pairs.
- Send public keys to the PA for signing.
- Retain the signed identity for presentation.
- Sign arbitrary requests to prove possession of the identity key.

## Rejected Alternatives (optional)

- We are not pursuing a full x509 certificate based identity today, to avoid the complexity of managing these certificates. We may still adopt x509 (or even SPIFFE identities).
- We are not using a "central authorization server" approach à la [Kerberos](https://en.wikipedia.org/wiki/Kerberos_(protocol)).
  - Largely rejected to avoid initial complexity.
  - This comes at the expense of performance: a symmetric key ticketing system is theoretically more performant, with fewer hops, than an asymmetric cryptography scheme. The performance concerns can be addressed in the future through a [hybrid](https://en.wikipedia.org/wiki/Hybrid_cryptosystem) keying approach.

## Security model

FTL's identity design focuses on mitigating unauthorized access to internal resources by application code. There are a number of key assumptions underpinning the security of this identity scheme:

Like any PKI system, FTL's initial internal identity approach assumes that the PA (CA) is trustworthy. Compromise of the PA's root key inherently collapses all trust relationships of the system.

The model also depends on the integrity of the underlying container infrastructure. We assert that reasonable steps have been taken to limit access to a runner's container (and pod) by both human and non-human principals. Since all identity information is resident in the user space of the runner's runtime, care should be taken not to snapshot or clone a running image.
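
### Appendix: protocol sketch

To make the two flows in the Protocol section concrete (establishing identity, steps 1 to 4, and authenticating a request, steps 5 to 7), here is a minimal Go sketch using the standard library's `crypto/ed25519`. It is an illustration under stated assumptions, not FTL's implementation: the names `Authority`, `Runner`, `SignedRequest`, `IssueNonce`, `Enroll` and `Verify` are hypothetical, the PA is modelled in-process rather than over the network, and treating the nonce as single-use is an assumption about how the PA attests to a runner.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"errors"
	"fmt"
)

// Authority is a hypothetical in-process stand-in for the Provisioner Authority.
type Authority struct {
	priv    ed25519.PrivateKey
	Public  ed25519.PublicKey
	pending map[string]bool // nonces recorded at provisioning time (step 1b)
}

func NewAuthority() (*Authority, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	return &Authority{priv: priv, Public: pub, pending: map[string]bool{}}, nil
}

// IssueNonce models steps 1a and 1b: create a 32-bit nonce and record it.
func (a *Authority) IssueNonce() ([]byte, error) {
	nonce := make([]byte, 4)
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	a.pending[string(nonce)] = true
	return nonce, nil
}

// Sign models step 4. Assumption: a nonce is accepted exactly once.
func (a *Authority) Sign(pub ed25519.PublicKey, nonce []byte) ([]byte, error) {
	if !a.pending[string(nonce)] {
		return nil, errors.New("unknown or already used nonce")
	}
	delete(a.pending, string(nonce))
	return ed25519.Sign(a.priv, pub), nil
}

// Runner holds an identity key pair and the PA's signature over its public key.
type Runner struct {
	priv   ed25519.PrivateKey
	Public ed25519.PublicKey
	Cert   []byte
}

// Enroll models steps 2 and 3: generate a key pair, send only the public key.
func Enroll(a *Authority, nonce []byte) (*Runner, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	cert, err := a.Sign(pub, nonce)
	if err != nil {
		return nil, err
	}
	return &Runner{priv: priv, Public: pub, Cert: cert}, nil
}

// SignedRequest is what the runner sends upstream in step 5.
type SignedRequest struct {
	Body   []byte
	Public ed25519.PublicKey
	Cert   []byte
	Sig    []byte
}

func (r *Runner) SignRequest(body []byte) SignedRequest {
	return SignedRequest{Body: body, Public: r.Public, Cert: r.Cert, Sig: ed25519.Sign(r.priv, body)}
}

// Verify models step 6: check the PA's signature over the runner's public key,
// then the runner's signature over the request body.
func Verify(paPub ed25519.PublicKey, req SignedRequest) bool {
	if len(req.Public) != ed25519.PublicKeySize {
		return false // ed25519.Verify panics on malformed public keys
	}
	return ed25519.Verify(paPub, req.Public, req.Cert) &&
		ed25519.Verify(req.Public, req.Body, req.Sig)
}

func main() {
	pa, err := NewAuthority()
	if err != nil {
		panic(err)
	}
	nonce, err := pa.IssueNonce()
	if err != nil {
		panic(err)
	}
	runner, err := Enroll(pa, nonce)
	if err != nil {
		panic(err)
	}
	req := runner.SignRequest([]byte("GET /resource"))
	fmt.Println("valid request accepted:", Verify(pa.Public, req))
	req.Body = []byte("GET /other") // tamper with the body after signing
	fmt.Println("tampered request accepted:", Verify(pa.Public, req))
}
```

In this sketch the private key never leaves the `Runner`, which matches the goal that identities cannot be extracted from their containing security domain. A real deployment would likely have the PA sign the public key together with metadata, as described in the note on practicality, rather than the bare key.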