
# Attestation Basics

###### tags: `Notes`

:::danger
**This document contains raw notes and is a precursor of the [Attestation Flow](https://hackmd.io/KHyHqYHFTSukTOvMjW9N2Q) document, which supersedes it.**
:::

## Introduction

TBD

## Authentication & Authorization

Authentication is a process that attempts to identify the person or organization associated with an operation. It often looks something like the following scenario, in which Jane wants to find out how much money is in her bank account:

```mermaid
sequenceDiagram
    participant customer
    participant bank
    customer ->> bank: "What is the balance of account 123456789?"
    bank ->> customer: "Who are you?"
    customer ->> bank: "I am Jane Doe. Here's my ID."
    bank ->> bank: Is Jane's ID valid? Yes.
    bank ->> bank: Does Jane own account 123456789? Yes.
    bank ->> customer: "Hi Jane! Account 123456789 has $102.59."
```

In the above example, if the bank gave information about the account to anyone who asked, then no one would have any privacy. Instead, the bank performs two checks: authentication, then authorization. In the first check, the bank **authenticates** Jane by confirming that she is who she says she is. In the second check, the bank **authorizes** Jane by confirming that she is allowed to have information about the account in question.

However, imagine that Jane has an identical twin sister named Jody. If Jody can steal Jane's ID, then Jody can access the account information *even though she is not Jane*. This is because Jody was able to gain access to Jane's **credentials**. In fact, this very scenario is a common cause of data breaches. In such breaches, the authentication and authorization system works as designed. The failure is simply that someone other than Jane was able to pretend to be Jane.

## Replacing Authentication with Attestation

Imagine, using the prior scenario, that we had a machine that could perfectly detect what the recipient would do with the data that we gave them.
Then the exchange could look like this:

```mermaid
sequenceDiagram
    participant code
    participant bank
    code ->> bank: "What is the balance of account 123456789?"
    bank ->> code: "What are you going to do with that information?"
    code ->> bank: "I will balance Jane's check book."
    bank ->> bank: Is the code lying? No.
    bank ->> bank: Is balancing Jane's check book allowed? Yes.
    bank ->> code: "Account 123456789 has $102.59."
```

This process of determining what application code will do with information is called **attestation**. Attestation is an alternative to authentication that **identifies the code** receiving the data rather than a person or organization. Notice that under both authentication and attestation schemes the authorization step is still performed.

## Runtime Requirements for Attestation

In order for attestation to work, we need to be able to guarantee the following properties of the remote code:

1. **integrity of application code & data** - without this guarantee, an attacker could change what the code will do with the data after attestation has completed.
2. **confidentiality of application data** - without this guarantee, an attacker could simply take the data and use it for purposes other than those of the attested code.

These properties are the minimum requirements in the definition of Confidential Computing given by the [Confidential Computing Consortium](https://confidentialcomputing.io). A system which provides these properties is called a **Trusted Execution Environment (TEE)**.

> 📝 A TEE containing the Enarx runtime is called a **Keep**.

## Attestation Primitives

### Code Measurement

Attestation technologies, to be discussed in detail below, today provide a mechanism for communicating what code will do with the data it is given: **measurement**. Before executing the first instruction in a TEE, the hardware uses a cryptographic hash to measure the initial state of the TEE. This uniquely identifies the code that will run in this TEE.
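
To make the idea of a measurement concrete, here is a minimal Python sketch of a running hash that is extended once for every chunk of code loaded into a TEE. It is an illustration only: the names `extend` and `measure`, the choice of SHA-384, and the all-zero starting value are assumptions made for this example, and real TEE hardware defines its own measurement format.

```python
import hashlib
from typing import Iterable


def extend(measurement: bytes, chunk: bytes) -> bytes:
    """Fold one loaded chunk of code into the running measurement.

    Hypothetical scheme: new = SHA-384(old || SHA-384(chunk)).
    """
    chunk_digest = hashlib.sha384(chunk).digest()
    return hashlib.sha384(measurement + chunk_digest).digest()


def measure(chunks: Iterable[bytes]) -> bytes:
    """Compute the measurement of a TEE's initial state.

    Starts from an all-zero value (an assumption of this sketch) and
    extends it with each chunk in load order.
    """
    measurement = bytes(hashlib.sha384().digest_size)
    for chunk in chunks:
        measurement = extend(measurement, chunk)
    return measurement
```

Because each step hashes the previous value together with the new chunk, changing any chunk, or the order in which chunks are loaded, changes the final measurement.
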
Once the TEE is running, it can request a (signed) copy of the initial measurement. This can be passed to a remote party for validation. For example:

```mermaid
sequenceDiagram
    participant enarx
    participant firmware
    participant TEE
    participant remote
    enarx ->> firmware: create empty TEE
    firmware ->> enarx: OK
    loop
        enarx ->> firmware: add code
        firmware ->> firmware: update measurement
        firmware ->> enarx: OK
    end
    enarx ->> firmware: start TEE
    firmware ->> TEE: start TEE
    TEE ->> firmware: get measurement
    firmware ->> TEE: measurement (signed)
    TEE ->> remote: measurement (signed)
```

But how can the remote party (sometimes called a **relying party**) trust that the measurement it received is valid? For that we need a hardware root of trust.

### Hardware Root of Trust

For a relying party to trust any given measurement, it needs assurance that the measurement was produced by the hardware that instantiated the TEE and not by some fraudulent software process or vulnerable hardware. For this reason, the relying party does not receive a measurement by itself. Rather, the relying party receives a complete **attestation report** which contains an application measurement along with information about the hardware and firmware which created the TEE.

This attestation report is then signed with the private key of a key pair whose private half is known only to the CPU which produced the TEE. In order for the relying party to trust this key pair, the public key is certified by a hierarchical certificate chain whose root certificate is provided out of band by the hardware manufacturer. The relying party can validate this certificate chain in order to establish trust that the attestation report was generated by an actual CPU.
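
The chain validation just described can be sketched in Python as a walk from the manufacturer's root key down to the report. This is a structural sketch under stated assumptions, not any vendor's protocol: the `Signed` type, the `validate_report` function, and the caller-supplied `verify` callback are all hypothetical names, and the actual signature check is left to whatever cryptographic library the relying party uses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical signature check: (public_key, message, signature) -> bool.
# A real relying party would back this with a cryptographic library.
Verify = Callable[[bytes, bytes, bytes], bool]


@dataclass(frozen=True)
class Signed:
    """A signed blob: either a certificate or an attestation report."""
    body: bytes              # content covered by the signature
    signature: bytes         # signature made with the parent's private key
    public_key: bytes = b""  # key certified by this blob (empty for a report)


def validate_report(root_key: bytes, chain: Sequence[Signed],
                    report: Signed, verify: Verify) -> bool:
    """Check root -> intermediate cert(s) -> CPU key -> report.

    `chain` is ordered from the first intermediate certificate down to
    the CPU key certificate; `root_key` was obtained out of band.
    """
    if not chain:
        return False  # at least the CPU key must be certified
    signer_key = root_key
    for cert in chain:
        # Each certificate must be signed by the key one level above it.
        if not verify(signer_key, cert.body + cert.public_key, cert.signature):
            return False
        signer_key = cert.public_key
    # The last key in the chain is the CPU key, which signs the report.
    return verify(signer_key, report.body, report.signature)
```

Passing the verifier in as a parameter keeps the sketch independent of any particular signature algorithm, which differs between hardware platforms.
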
The resulting chain of trust looks like this:

```mermaid
flowchart TD
    manufacturer[Manufacturer Root Cert]
    intermediate["Intermediate Cert(s)"]
    report[Report]
    cpu[CPU Key]

    manufacturer --signs--> intermediate
    intermediate --signs--> cpu
    cpu --signs--> report

    subgraph Hardware Attestation Service
        manufacturer
        intermediate
    end

    subgraph Host
        subgraph Firmware
            report
            cpu
        end
    end
```

### Attestation Data

A signed attestation report is, however, still insufficient on its own. An attacker could, for example, copy an attestation report from another process and simply reuse it. Therefore, there needs to be some way to bind an attestation report to some particular data (such as a message or a cryptographic key). This assures the relying party that the data was produced inside of a TEE.

To facilitate this, the method a TEE uses to request an attestation report allows the caller to specify some attestation data to be included in the report. This opaque data field will then be signed by the CPU key along with the other data in the report. Often this opaque data field will contain a `nonce` or the hash of a `public key` or other unique material.

Putting this entire process together in a single diagram looks like this:

```mermaid
flowchart TD
    manufacturer[Manufacturer Root Cert]
    intermediate["Intermediate Cert(s)"]
    attestation[Attestation]
    rp[Relying Party]
    report[Report]
    cpu[CPU Key]
    data[Data]

    manufacturer --signs--> intermediate
    intermediate --signs--> cpu
    data --1. requests--> report
    cpu --2. signs--> report
    report --3. returns--> attestation
    attestation --4. provides--> rp

    subgraph Hardware Attestation Service
        manufacturer
        intermediate
    end

    subgraph Host
        subgraph Firmware
            report
            cpu
        end
        subgraph Keep
            attestation
            data
        end
    end
```

## Remaining Problems

Although attestation has given us an important tool for increased security, some problems still remain:

1. **functional equivalence** - Since each TEE hardware implementation creates the TEE in a distinct way and since each attestation protocol is proprietary, the same code compiled for multiple TEEs produces differing measurements even though the algorithms in those workloads are the same. This makes management very difficult. How can we determine that the workloads running on differing platforms are the same regardless of irrelevant platform differences?
2. **auditability** - An attestation policy which contains a bundle of permitted measurements is very hard to audit. How do we know which measurements correspond to which application names and versions? How can we express attestation policy in a way that is human readable?
3. **discrete validation** - Validating an attestation report requires validating both platform state and workload state. However, teaching every service to validate the platform state for every separate hardware platform, as well as the workload state, is a gargantuan task. How can we separate concerns so that platform state can be validated independently from workload state?
4. **protocol modification** - Since authentication is typically performed at the protocol level, do we have to teach every protocol how to also do attestation? Is it possible to construct attestation so that it can be used with existing protocols?

Solving these problems will be the topic of the next article.