---
title: "Efficient Distributed Secure Memory with Migratable Merkle Tree "
tags: paper-notes
---
<style>
.markdown-body{
& .title-tags-preview{
display: none;
}
& h1, & h2, & h3{
font-family:"Noto serif series=all", serif;
}
/* Paper title */
& h1:first-of-type::first-line{
font-size: 24pt;
}
& h1:first-of-type{
position: relative;
width: fit-content;
padding: 0 2.5em 0;
margin: auto;
margin-bottom: 6px;
border: none;
font-size: 18pt;
text-align: center;
counter-reset: section-count 0;
}
& h1:first-of-type::first-line{
font-size: 24pt;
}
& h1:first-of-type::before,
& h1:first-of-type::after{
content: "";
position: absolute;
height: calc(10px + 90%);
width: 15px;
border-top: 0.15em double #F19A3E;
border-bottom: 0.15em double #F19A3E;
}
& h1:first-of-type::before{
inset: 0 auto 0 0;
border-left: 0.15em double #F19A3E;
}
& h1:first-of-type::after{
inset: 0 0 0 auto;
border-right: 0.15em double #F19A3E;
}
/* Conference Name */
& h1 + h6{
margin: 0 auto 14px;
text-align: center;
font-size: 8pt;
color: #C8C8C8;
}
& h1 + h6::before{
content: "";
display: block;
width: 5em;
border-top: 1px solid #C8C8C8;
padding-top: 5px;
margin: 0 auto 0;
}
& h1 + h6 + blockquote{
border: none;
padding: 10px 0px 14px 0px;
color: #808080;
}
/* Headers */
& h1:not(h1:first-of-type){
position: relative;
padding: 7px 70px 3px;
border-bottom: 1px solid #C8C8C8;
font-size: 24pt;
text-align: center;
counter-increment: section-count;
}
& h1:not(h1:first-of-type)::before{
content: counter(section-count, decimal-leading-zero);
position: absolute;
inset: 50% auto auto 0;
font-family: "Roboto", "DejaVu Sans Mono", sans-serif;
font-size: 32pt;
color: #F19A3E;
transform: translateY(-50%);
}
& h2{
position: relative;
/* border-bottom: 1px solid #C8C8C8; */
border-bottom: none;
padding: 1px 0 3px 16px;
font-size: 16pt;
}
& h2::before{
content: "";
position: absolute;
inset: 50% auto 0 0;
transform: translateY(-50%);
height: 75%;
width: 0.2em;
background-color: #F19A3E;
}
& h3{
/* border-bottom: 0.7px solid #F19A3E; */
border-bottom: none;
}
/* alert blocks */
& .alert-success,
& .alert-info{
position: relative;
color: #F19A3E;
background-color: #F19A3E33;
}
& .alert-success{
border: 2px solid #F19A3E;
}
& .alert-info{
border: none;
}
& .alert-success::after,
& .alert-info::after{
display: block;
position: absolute;
font-size: 40pt;
font-style: bold;
color: #F19A3E;
-webkit-text-stroke: 0.5px #FFFFFF;
font-family: 'Noto Sans script=all';
}
& .alert-success::after{
content: "✔";
inset: -40px -5px auto auto;
}
& .alert-info::after{
content: "🛈";
inset: -40px 0px auto auto;
}
& .alert-warning,
& .alert-danger{
position: relative;
color: #FFFFFF;
background-color: #F19A3E;
}
& .alert-danger{
text-align: center;
padding-left: 50px;
padding-right: 50px;
}
& .alert-danger::before,
& .alert-danger::after{
content: "⚠";
display: block;
position: absolute;
transform: translateY(-47%);
font-size: 40pt;
color: #FFE0BF;
font-style: bold;
font-family: 'Noto Sans script=all';
}
& .alert-danger::after{
inset: 50% 5px auto auto;
}
& .alert-danger::before{
inset: 50% auto auto 5px;
}
/* Lists */
& ul{
& li:not(li:first-child){
margin-top: 5px;
}
& li::marker{
content: '► ';
color: #F19A3E;
}
& li:first-child{
margin-top: 6px;
}
& li li::marker{
content: '▻ ';
}
& li li li::marker{
content: '▸ ';
}
& li li li li::marker{
content: '▹ ';
}
}
& ol{
list-style-type: decimal-leading-zero;
& li::marker{
font-family: "Roboto", "DejaVu Sans Mono", sans-serif;
font-weight: 600;
color: #F19A3E;
}
& ol{
counter-reset: sub-nl-index;
list-style-type: none;
}
& ol li{
counter-increment: sub-nl-index;
}
& ol li::marker{
content: counter(sub-nl-index) " — ";
font-family: "Roboto", "DejaVu Sans Mono", sans-serif;
font-weight: 600;
color: #F19A3E;
}
}
/* Horizontal rule */
& hr{
width: 80%;
height: 0.15em;
margin-left: auto;
margin-right: auto;
background-color: #FFE0BF;
}
/* Hyperlink */
& a{
color: #F19A3E;
text-decoration: underline;
}
& a span::before{
content: "⧉ ";
font-family: "Noto sans", serif;
font-size: 10pt;
}
/* Text styles */
& mark{
background: linear-gradient(to bottom,
#FFFFFF 17%, #FFE0BF 17%, #FFE0BF 83%, #FFFFFF 83%);
}
/* Table */
& table{
display: table;
width: 100%;
border-collapse: seperate;
& thead, & th, & tr, & td{
border: 2px #FFFFFF solid;
}
& tbody tr:nth-child(2n + 1){
background-color: #EFEFEF;
}
& thead tr{
color: #FFFFFF;
background-color: #F19A3E;
}
}
/* Q&A using spoilers */
& h1, h2, h3, h4, h5, h6{
counter-reset: qa-index;
}
& details[open]{
& summary{
font-weight: bold;
}
& summary::before{
content: "Q" counter(qa-index, decimal-leading-zero)". ";
counter-increment: qa-index;
color: #F19A3E;
}
& > summary{
list-style-type: "";
}
}
}
</style>
# Efficient Distributed Secure Memory with Migratable Merkle Tree
###### IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2023
# Introduction
Many hardware assisted enclaves exists, such as:
+ Intel SGX
+ AMD SEV
+ ARM CCA
+ RISC-V Penglai
## Problems
+ Existing enclaves are designed to protect memory on a ==single node==
+ Inter-node network is considerd ==untrusted==, thus software secure channel (e.g., TLS/SSL) is required for data tranfer
+ Crypto-based approaches incur significant overhead
+ Netword connection bandwith — 400-800 Gbps
+ AES/HS3 algorithm w/ AES/GCM accelerator throughput — 2-40 Gbps
:::success
**Proposed Solution:** Scale current single-node memory protection engine todeal with multiple nodes
:::
## Challenges
+ Root of integrity / one-time pad is sealed in CPU and connot be trasferred by untrusted network
+ Middle-man attacks are easy, such as luanching replays, re-order attacks
# Background Knowkedge
## Hardware-based Memory Protection
1. **Counter-Based integrity Tree:**
A hash tree also includes a counter value when the tree is modified. It ==guarantees confidentiality== of secure memory, and is resistant to replay attacks.
2. **One-Time-Pad Encryption:**
The code address and counter value are used to generated a **OTP** (one time pad) to perform XOR with some Plaintext generating a cyphertext.
## Demand of Distributed Confidential Computing
1. **Security gurantee for both code and data:**
Hardware enclaves CAN protect confidentiality of code and data, but CANNOT guarantee the integrity, requiring users to verify integrity of input and check the mesurements of the executive code. Despite the better performance of hardware enclaves, security checks brings high overhead
2. **Large-Scale Data Transfer:**
Communication overhead between nodes is one of the key metrics for distributed computation.
## Inefficiency of Secure Channel over Untrusted Interconnection
| Method | Throughput | Connection |
| :------: | :------: | :------: |
| PCI-E 5.0 | 32 GT/s | CPU-Device |
| UCI-E | 32 GT/s | Chiplets |
| RDMA | 400 Gb/s | Remote Memory |
| NVLINK | 900 GB/s | GPU |
1. **Limited throughput over secure channel mechanism:**
Interconnection latencies are low (above table), but also has no protection (untrusted). Secure channel can resist tanpering during transmission but has low throughput and reduces performance by orders of mangnitude.
2. **Establish the secure channel between enclaves:**
Enclave design uses a untrusted buffer to copu I/O messages out of enclave memory, requiring itself to encrypt the message. Due two the ==2 extra memory copies==, it is considered time-comsuming.
# Design Overview
## Design Goals
1. The system should eliminate re-encryption overhead when transferring secure memory, and *saturate* the maximum throughput of the interconnection.
2. The system should achieve same level of security gurasntee as secure channel mechanisms, with concern of the following properties:
+ Confidentiality
+ Integerity
+ Freshness of transferred data
3. The system should hide hardware details for userapplications and inherit programming paradigm of the distributed confidential computation.
## Threat Model
The **TCB** (Trusted Computing Base) only contains the CPU and the most previleged software.
+ **Physical Attacks:**
Off-Chip hardwares plugged in are possible to perform attacks such as:
- Spying
- Slicing
- Replay Attacks
+ **Privileged Software Attacks:**
- Despite attacks triggering software attacks from REE, the firmware running most-privileged mode CANNOT be comprised.
- TEEOS is trusted but optional
- Side-Channel attacks and DoS attacks are not considered.
## Challenges
To send secure memory to others directly, the following problems must be solved:
:::spoiler {state=open} How to decouple the memory protection from the CPU-bound secret ina single node?
Unique physical address is used to generate the OTP for memory encryption, but the key is sealed within the CPU.
:::
:::spoiler {state=open} How to securely transfer the secure memory to the remote node without re-encryption?
If the secure memory is sent, the receiver wouldn't be able to decrypt and authenticate the transferred memory. However, we must defend against attacks during transfer.
:::
:::spoiler {state=open} How to hide the hardware details and provide the suitable primitive for distributed conputation?
The new hardware extension cannot complicate the implementation of user applications or break the paradigm in distributed confidential computation.
:::
## Important Terms
+ **Integrity Forest —** Spans among multiple nodes. To join, each node must be verified via a *global attestation mechanism*.
+ **MMT Closure —** A transfer unit, integrating both data and metadata. A remote node would be able to *decrypt* and *authenticate* the transfer memory according to it.
+ **MMT Closure Delegation &edash;** Communication protocol which allows transferring MMT closures.
+ **MMT Monitor &edash;** A module in the most privileged mode, works to:
- Origanize secure hardware
- Establish connection between remote nodes
# Detailed Design of the MMT (Migratable Merkle Tree)
## Multiple-nodes Integrity Tree
Two steps are used to set up a integrity forest:
1. A global authority node will verify each participant node joining the distributed computing. Three phases are included for a participant ==attest itself== to the global authority node:
1. The attested node generate a key agreement message to attestation server, discuss a session key.
2. The attested node sends a manufacturer certificate sealed with machine key to attestaation server.
The attestation server verify the certificate wity manufacturer's public kay and respond a CA report.
3. The attested node sends node-related message (e.g., software mesurement, node meta-info).
The attestation server checks the info and responds a ==global-unique node== id to node.
2. A integrity forest is constructed to protect the memory among nodes.
```graphviz
graph G{
rankdir = TB
ranksep = 0.3
nodesep = 0.4
node[
shape = circle,
label = ""
fontname = sans
]
IF[
shape = record;
width = 4;
label = "Integrity Forest";
]
splines = false;
IF -- A_1 [style = dotted]
IF -- if11, if12 [color = invis]
IF:s -- B_1 [style = dotted]
IF -- if21, if22 [color = invis]
IF -- C_1 [style = dotted]
A_1 -- A_2, A_3
A_2 -- A_4, A_5
A_3 -- A_5, A_6
B_1 -- B_2, B_3
B_2 -- B_4, B_5
B_3 -- B_5, B_6
C_1 -- C_2, C_3
C_2 -- C_4, C_5
C_3 -- C_5, C_6
if11, if12, if21, if22, if31, if32[
width = 0.52;
color = invis
]
{ rank = same; A_1 -- if11 -- if12 -- B_1 -- if21 -- if22 -- C_1 [color = invis] }
{ rank = same; A_2 -- A_3 -- if31 -- B_2 -- B_3 -- if32 -- C_2 -- C_3 [color = invis] }
A_1[ label = 1 ]
B_1[ label = 2 ]
C_1[ label = 3 ]
node[
shape = record;
width = 2.3
label = " Local Memory "
]LM1, LM2, LM3
A_5 -- LM1[color = transparent, weight = 100]
B_5 -- LM2[color = transparent, weight = 100]
C_5 -- LM3[color = transparent, weight = 100]
N1[
label = "Node 1"
width = 5;
]
N2[
label = "Node 2"
width = 2.3
]
LM1:sw -- N1:nw [color = invis]
LM2:se -- N1:ne [color = invis]
LM1, LM2 -- N1 [style = dotted, weight = 50, constraint = false]
LM3 -- N2 [style = dotted]
}
```
In traditional hardware-supported memory protection, and OTP can only be used once, and is generated using a counter and the memory address. <u>However, in multi-node scenario, there will exist same addresses</u>.
To solve the problem, the system assigns a **global memory address**, ==used only in integrity checks==, comprising two parts:
1. **Global unique node id —** generated from the attestation server
2. **Monotonic number —** generated from hardware to map into physical memory
## MMT Scheme
+ An MMT is ==a subtree== in the integrity forest.
+ (Compared with a normal merkle tree,) The root is expand with fout kinds of metadata:
- **MMT state**
An MMT could be in one of four states &edash;
- `valid`: The MMT is active and performs security check for memory access
- `invalid`: MMT is unallocated / reclaimed, memory is non-secure memory
- `sending`: MMT is transferring memory, read-only
- `waiting`: Works with MMT in sending state, waits for remote transferred memory
- **MMT key**
A user defined key for memory encryption / authentication. Processess holding the same key can modify the same secure memory
- **MMT counter**
- **MMT global-unique address**
Initialized after global attestation, re-assigned if state is changed to `valid`