# Oyster Persistent Storage Design

AWS Nitro Enclaves do not support persistent storage natively. All data stored within an enclave is ephemeral and lost upon enclave reboot or termination. This limitation prevents stateful applications from running reliably in enclaves without an external persistence mechanism.

## Goals

1. **Data Persistence**: Enable enclave applications to store data that survives enclave restarts and host machine failures.
2. **Data Confidentiality**: Only code within the enclave should be able to decrypt the data stored on persistent storage.
3. **Data Integrity**: The enclave must be able to verify that the data was indeed generated by the enclave itself and has not been tampered with or forged by any external entity.
4. **Data Consistency**: The data should remain exactly as the enclave stored it, ensuring no modifications or corruption during storage or retrieval.
5. **Migration Support**: Allow seamless migration of persistent data when moving between operators or host machines.
6. **Performance**: Minimize latency for read/write operations to support high-throughput applications.
7. **Simplicity**: Provide developers with familiar filesystem semantics (POSIX) or well-documented APIs for data access.

## Assumptions

1. The enclave can communicate with the host system via VSOCK.
2. Network connectivity is available for remote storage solutions.
3. Enclave IP addresses can be used for access control mechanisms where applicable.
4. Storage solutions must protect against unauthorized access even if the host operator is compromised.

## Solutions

### Solution 1: Network-Based Persistent Storage (NFS)

**Goals Addressed**: Data Persistence, Data Confidentiality, Data Integrity, Data Consistency, and Migration Support.

#### Option 1A: Direct NFS Mount Inside Enclave

##### Architecture

```
Docker volumes → NFS mount in the blue image → VSOCK → NFS server
```

##### Implementation Steps

1. Set up an NFS server on a dedicated Linux system.
2. Include the nfs-utils Nix package in the Nix build and set up the NFS client mount in the blue image.
3. The user provides the NFS server IP and directory, similar to other init-params.
4. Set up client-side encryption (gocryptfs/TLS/Kerberos) using the KMS key.
5. Use Docker volumes in docker-compose to mount the remote NFS directory inside the container.
6. Store the encrypted files on the remote NFS server.

##### How It Achieves Goals

- **Data Persistence**: The NFS server provides centralized persistent storage.
- **Data Confidentiality**: Encryption before writing ensures confidentiality.
- **Migration Support**: NFS access can be regranted to new enclave IPs after authentication.
- **Simplicity**: POSIX filesystem semantics familiar to developers.

##### Benefits

1. Simple setup leveraging established NFS standards and tooling.
2. Low redundancy - single source of truth for data.
3. POSIX filesystem semantics - developers use standard file operations.
4. Easy migration - grant NFS access to the new enclave.

##### Implementation Solution

The NFS mount challenge has been resolved using the following approach:

1. **Oyster Blue Image Modifications**: Updated the base image to include the necessary NFS client packages and configuration.
2. **Docker Volume Mounts**: Used Docker volume mounts in docker-compose applications to expose NFS mount points to the enclave container.
3. **NFS Configuration**:
   - Enabled the `noresvport` option on the client side (allows connections from non-privileged ports).
   - Enabled the `insecure` option on the server side (accepts connections from ports above 1024).

This approach bypasses the need for privileged Docker containers while maintaining NFS functionality.

#### Security Considerations: NFS Encryption

**Critical**: NFS does not support encryption by default. All data transmitted between the enclave and the NFS server is sent in plaintext over the network. This creates potential security vulnerabilities even if data at rest is encrypted.

#### Encryption Solutions

**1. Kerberos Authentication & Encryption**

- Provides strong authentication and optional encryption (using `sec=krb5p`)
- **Requirements**:
  - Separate Key Distribution Center (KDC) server setup
  - Complex configuration and key management
  - All clients and servers must be in the same Kerberos realm
- **Pros**: Industry-standard security, integrated authentication
- **Cons**: High operational complexity, additional infrastructure dependency

**2. TLS/Stunnel Encryption**

- Wraps NFS traffic in a TLS tunnel
- **Requirements**:
  - Certificate management infrastructure (CA, certificate generation, rotation)
  - Stunnel
  - Certificate distribution to all enclaves
- **Pros**: Well-understood encryption mechanism, flexible certificate policies
- **Cons**: Certificate lifecycle management overhead

**3. Filesystem-Level Encryption (gocryptfs/eCryptfs)**

- Encrypts files at the filesystem layer before NFS transmission
- **Requirements**:
  - Specific kernel flags must be enabled (FUSE support, encryption modules)
  - Kernel modules may not be available in standard Nitro Enclave environments
  - Currently **not working out-of-the-box** in oyster-cvm
- **Pros**: Transparent encryption, no network protocol changes
- **Cons**: Kernel dependency issues, limited support in enclave environments, requires investigation of kernel configuration

**4. REST Encrypt Server**

* REST-based file server running inside a secure enclave for encrypted NFS storage
* **Requirements**:
  * Enclave environment with secp256k1 & AES-GCM
  * NFS-mounted persistent storage
* **Pros**: Strong confidentiality, integrity, and authentication through AES-GCM; no kernel dependencies; simple setup.
* **Cons**: Not POSIX-compliant; users cannot perform direct filesystem operations. All interactions must go through the `/upload` and `/download` REST endpoints.
* **Initial setup**: https://github.com/marlinprotocol/oyster-nfs-encrypt
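To illustrate how an enclave application would interact with the REST encrypt server, here is a minimal client sketch in Python using `requests`. Only the `/upload` and `/download` endpoint names come from the description above; the server address, multipart field name, and query parameter are assumptions for illustration and may differ from the actual oyster-nfs-encrypt API.

```python
# Minimal sketch of a client for the REST encrypt server (option 4 above).
# Assumptions: the server listens on 127.0.0.1:8080 inside the enclave,
# /upload accepts a multipart file upload, and /download takes the file
# name as a query parameter. These details are illustrative only.
import requests

SERVER = "http://127.0.0.1:8080"  # assumed address of the REST encrypt server


def upload(local_path: str, remote_name: str) -> None:
    """Send a plaintext file to the server, which encrypts it with AES-GCM
    and writes the ciphertext to the NFS-mounted directory."""
    with open(local_path, "rb") as f:
        resp = requests.post(
            f"{SERVER}/upload",
            files={"file": (remote_name, f)},  # assumed multipart field name
            timeout=30,
        )
    resp.raise_for_status()


def download(remote_name: str, local_path: str) -> None:
    """Fetch a file; the server decrypts it and verifies the AES-GCM tag,
    rejecting the request if the ciphertext was tampered with."""
    resp = requests.get(
        f"{SERVER}/download",
        params={"name": remote_name},  # assumed query parameter
        timeout=30,
    )
    resp.raise_for_status()
    with open(local_path, "wb") as f:
        f.write(resp.content)


if __name__ == "__main__":
    upload("state.db", "state.db")
    download("state.db", "state.restored.db")
```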
#### **Comparison of All the Solutions**

#### **1. Data Confidentiality**

| System | Guarantee | Notes |
| --- | --- | --- |
| **gocryptfs** | Strong | Uses AES-GCM (auto-generated key) with per-file random nonces. |
| **eCryptfs** | Strong | Kernel-based encryption with a per-file FEK (file encryption key) protected by a master key. |
| **EncFS** | Moderate | Provides encryption but depends on userspace key management. |
| **NFS + Kerberos** | Strong | Kerberos ensures authenticated and encrypted NFS traffic, preventing unauthorized access during transmission, but does not encrypt data at rest. |
| **REST encrypt server** | Strong | Uses AES-GCM (key derived from secp256k1) with per-file random nonces. |

All of these solutions can maintain confidentiality during operation if encryption keys and credentials are securely held inside the enclave.

#### **2. Data Integrity**

| System | Guarantee | Notes |
| --- | --- | --- |
| **gocryptfs** | Yes | Uses AES-GCM (authenticated encryption) with per-block authentication tags. Detects tampering or bit corruption. |
| **eCryptfs** | No | Does not provide integrity authentication; corrupted ciphertext may go undetected. |
| **EncFS** | Partial | Optional HMAC per file in newer versions, but weaker than AEAD-based integrity. |
| **NFS + Kerberos** | Partial | Ensures message integrity during transmission via RPCSEC_GSS but does not protect data integrity once stored on disk. |
| **REST encrypt server** | Yes | Each encrypted file includes an authentication tag generated by AES-GCM. On retrieval, the enclave verifies the AES-GCM tag during decryption; if the tag does not match, the file is rejected. |

eCryptfs does not provide **cryptographic integrity protection for stored data**, while NFS + Kerberos only ensures **in-transit integrity**.

#### **3. Data Consistency**

| System | Guarantee | Notes |
| --- | --- | --- |
| **gocryptfs** | Partial | Detects corruption via AES-GCM tags, but relies on the underlying filesystem (for example, NFS) for write ordering and atomicity. |
| **eCryptfs** | Partial | Depends on the lower filesystem; kernel writes are synchronous but not end-to-end verified. |
| **EncFS** | Partial | No journaling or version tracking; relies entirely on underlying filesystem consistency (NFS locking). |
| **NFS + Kerberos** | Partial | Kerberos adds secure authentication and replay protection, but consistency still depends on NFS semantics and mount options. |
| **REST encrypt server** | Partial | No journaling or version tracking; relies entirely on underlying filesystem consistency (NFS locking). |

#### Overall Summary

| Property | gocryptfs | eCryptfs | EncFS | NFS + Kerberos | REST encrypt server |
| --- | --- | --- | --- | --- | --- |
| **Data confidentiality** | Yes | Yes | Partial | In transit only | Yes |
| **Data integrity** | Yes | No | Partial | Partial, in transit only | Yes |
| **Data consistency** | Partial | Partial | Partial (NFS locks) | Partial | Partial (NFS locks) |

#### **Conclusion**

* **gocryptfs** provides the strongest end-to-end protection, ensuring both confidentiality and integrity of data at rest; however, loss of the gocryptfs configuration file (which holds the master key) can cause issues, unless a way to back up the master key is found.
* **eCryptfs** offers kernel-level encryption but lacks integrity protection.
* **EncFS** is weaker overall for the following reasons: weak key derivation, no forward secrecy, and deleting the configuration file causes issues, unless a way to back up the master key is found.
* **NFS + Kerberos** secures data **in transit** (authentication, encryption, and message integrity) but not at **rest**; it should be combined with one of the above encryption layers for full protection.
* **REST encrypt server** is not POSIX-compliant; users cannot perform direct filesystem operations. All interactions must go through the `/upload` and `/download` REST endpoints.

Ultimately, the choice of solution depends on specific requirements.
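To make the confidentiality and integrity properties compared above concrete, here is a minimal sketch of application-layer AES-GCM encryption before writing to an NFS-mounted path, using Python's `cryptography` package. The key handling is an assumption (a 256-bit key held only inside the enclave, for example derived from the KMS key), and the mount path and file layout are illustrative.

```python
# Minimal sketch: encrypt-then-write and read-then-verify on an NFS mount.
# Assumptions: `key` is a 256-bit secret available only inside the enclave
# (e.g. derived from the KMS key); /mnt/nfs is the NFS-mounted directory.
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NFS_DIR = "/mnt/nfs"  # assumed mount point


def write_encrypted(key: bytes, name: str, plaintext: bytes) -> None:
    aesgcm = AESGCM(key)
    nonce = os.urandom(12)  # fresh random nonce per file
    # Bind the file name as associated data so ciphertexts cannot be swapped.
    ciphertext = aesgcm.encrypt(nonce, plaintext, name.encode())
    with open(os.path.join(NFS_DIR, name), "wb") as f:
        f.write(nonce + ciphertext)  # the GCM tag is appended to the ciphertext


def read_verified(key: bytes, name: str) -> bytes:
    aesgcm = AESGCM(key)
    with open(os.path.join(NFS_DIR, name), "rb") as f:
        blob = f.read()
    nonce, ciphertext = blob[:12], blob[12:]
    try:
        return aesgcm.decrypt(nonce, ciphertext, name.encode())
    except InvalidTag:
        # Tampered or corrupted data is rejected instead of silently accepted.
        raise RuntimeError(f"integrity check failed for {name}")


if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)  # placeholder; use the enclave/KMS key
    write_encrypted(key, "app-state.bin", b"hello enclave")
    assert read_verified(key, "app-state.bin") == b"hello enclave"
```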
#### Challenges

1. **Network Latency**: Higher latency compared to local storage.
2. **Access Control**: Needs a robust mechanism to authenticate enclaves and manage IP whitelisting.
3. **Single Point of Failure**: NFS server downtime directly impacts availability.
4. **Encryption Complexity**: Native NFS encryption requires significant additional infrastructure (Kerberos KDC or TLS certificate management).
5. **Network Security**: Without encryption solutions, NFS traffic is vulnerable to network-level attacks (eavesdropping, MITM).

#### Option 1B: NFS Mount on Host System

##### Architecture

```
Enclave NFS client mount → VSOCK → NFS server on host/operator
```

##### Implementation Steps

1. Set up an NFS server on the parent instance (host/operator).
2. The other steps are similar to the direct NFS mount.

##### How It Achieves Goals

- **Data Confidentiality**: Encryption protects data from the host.
- **Simplicity**: Leverages existing NFS infrastructure.

##### Benefits

1. Simpler enclave setup - no privileged operations needed inside the enclave.
2. Established NFS standards and tooling.
3. POSIX filesystem semantics on the host side.

##### Challenges

1. **Host Dependency**: Data availability depends on host machine uptime and reliability.
2. **Operator Trust**: Relies on the operator to maintain the NFS server and connectivity.
3. **Backups**: Data stored on the host server must be backed up in order to prevent data loss.

### Solution 2: Redundant Storage System

**Goals Addressed**: Data Persistence, Data Confidentiality, Migration Support, Performance

#### Architecture

```
Enclave → [Encrypt with KMS] → VSOCK → Host Filesystem → Periodic Sync → S3/EBS/NFS
```

#### Implementation Steps

1. The enclave application encrypts files using a KMS key accessible only to the enclave.
2. Encrypted data is transmitted to the host via VSOCK.
3. The host stores encrypted files locally on its filesystem.
4. The host periodically syncs local files to remote storage (S3, EBS, or NFS); a host-side sync sketch is shown below, after the Challenges list.
5. On read operations, the enclave retrieves encrypted data from the host, verifies integrity, then decrypts.
6. The enclave never directly accesses remote storage services.

#### How It Achieves Goals

- **Data Persistence**: Dual storage (host + remote) ensures data survives host failures.
- **Data Confidentiality**: The host only stores encrypted data and cannot read its contents.
- **Migration Support**: Encrypted data can be transferred to new operators (since it is tied to the KMS key); access to remote storage can be revoked/regranted.
- **Performance**: Local host storage provides low-latency access for frequently accessed data.

#### Benefits

1. The host cannot read data since it is encrypted with an enclave-specific KMS key.
2. Operator flexibility - the host can choose any storage backend (S3, EBS, NFS).
3. Low latency for enclave operations (reads/writes to the host are fast).
4. Protects against single points of failure (host crashes, disk failures).
5. Easy migration - encrypted data can be moved to new operators without security concerns.

#### Challenges

1. **Double Storage Cost**: Requires storage space on both the host and the remote backend.
2. **Synchronization Complexity**: Must implement robust sync mechanisms to prevent data loss or conflicts.
3. **Access Management**: Remote storage access control needs careful implementation:
   - IP-based whitelisting for the enclave/host
   - Revocation mechanisms when migrating operators
   - Since data is encrypted, unauthorized access to the storage service is less critical but still undesirable
4. **Sync Timing**: Determining the optimal sync frequency balances data-loss risk against network overhead.
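As referenced in step 4 of the implementation steps above, the following is a minimal sketch of the host-side periodic sync loop, using `boto3` against S3. The local directory, bucket name, and sync interval are assumed values; a production version would also need conflict handling and deletion propagation, as noted under Synchronization Complexity.

```python
# Minimal sketch of the host-side periodic sync (Solution 2, step 4).
# The host never sees plaintext: it only mirrors ciphertext blobs that the
# enclave wrote over VSOCK into LOCAL_DIR up to an S3 bucket.
# Assumptions: LOCAL_DIR, BUCKET and SYNC_INTERVAL are illustrative values.
import os
import time

import boto3

LOCAL_DIR = "/var/lib/oyster/enclave-data"  # assumed host-side storage path
BUCKET = "oyster-enclave-backups"           # assumed bucket name
SYNC_INTERVAL = 60                          # seconds between sync passes

s3 = boto3.client("s3")
last_synced = {}  # path -> mtime at last upload


def sync_once() -> None:
    """Upload every file whose mtime changed since the previous pass."""
    for root, _dirs, files in os.walk(LOCAL_DIR):
        for name in files:
            path = os.path.join(root, name)
            mtime = os.path.getmtime(path)
            if last_synced.get(path) == mtime:
                continue  # unchanged since last pass
            key = os.path.relpath(path, LOCAL_DIR)
            s3.upload_file(path, BUCKET, key)
            last_synced[path] = mtime


if __name__ == "__main__":
    while True:
        sync_once()
        time.sleep(SYNC_INTERVAL)
```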
### Solution 3: Object Storage (MinIO/S3)

**Goals Addressed**: Data Persistence, Data Confidentiality, Migration Support

#### Architecture

```
Enclave (MinIO/S3 SDK) → Network → MinIO Server / S3 Buckets
```

#### Implementation Steps

1. Set up a [MinIO](https://github.com/minio/minio) server or provision user-managed S3 buckets.
2. Application developers integrate [MinIO SDKs](https://docs.min.io/community/minio-object-store/developers/minio-drivers.html) into enclave applications.
3. The enclave encrypts files with the KMS key before upload.
4. Access tokens/credentials are provided via init-params during enclave startup.
5. The application uses object storage APIs for read/write operations (a minimal client sketch appears after the Comparison Matrix below).

#### How It Achieves Goals

- **Data Persistence**: Object storage provides durable, persistent storage.
- **Data Confidentiality**: Encryption protects data from the storage provider.
- **Migration Support**: Access tokens can be rotated; data is accessible from any enclave with credentials.
- **Simplicity**: Well-documented SDKs and APIs are available in multiple languages.

#### Benefits

1. **Simple Access Control**: Users provide access tokens in init-params; no complex IP whitelisting needed.
2. **No Redundancy Overhead**: Single storage backend without duplication.
3. **Scalability**: Object storage scales to handle large datasets.
4. **Standard Protocols**: The S3 API is widely supported and understood.
5. **Cloud-Native**: Integrates well with modern cloud architectures.
6. **Built-in Features**: Versioning, lifecycle policies, and replication available.

#### Challenges

1. **Higher Latency**: Network round-trips increase latency compared to local storage; not optimal for high-throughput, low-latency workloads.
2. **API Learning Curve**: Developers must learn object storage APIs/SDKs instead of using familiar POSIX filesystem operations.
3. **Application Changes Required**: Existing applications using filesystem semantics need refactoring.
4. **Object Storage Semantics**: No true random-access file modifications; entire objects must be uploaded.
5. **SDK Dependencies**: Applications must include MinIO/S3 SDK libraries.

## Comparison Matrix

| Criteria | Redundant Storage | NFS (Direct) | NFS (Host) | Object Storage |
|----------|------------------|--------------|------------|----------------|
| **Latency** | Low | Medium | Medium | Medium-High |
| **Implementation Complexity** | High | Medium | High | Medium |
| **Storage Cost** | High (2x) | Low | Low | Low |
| **Migration Difficulty** | Medium | Low | Medium | Medium |
| **Host Dependency** | High | None (except already present networking setup) | High | None (except already present networking setup) |
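As referenced in Solution 3's implementation steps, here is a minimal sketch of the enclave-side object storage client using the MinIO Python SDK, uploading data that the application has already encrypted inside the enclave (for example with the AES-GCM helper sketched earlier). The endpoint, credentials, and bucket name are placeholders that would normally be supplied through init-params.

```python
# Minimal sketch of Solution 3: storing enclave-encrypted blobs in MinIO/S3
# via the MinIO Python SDK. Endpoint, credentials and bucket are placeholders
# that would be supplied through init-params in a real deployment.
import io

from minio import Minio

client = Minio(
    "minio.example.com:9000",   # assumed endpoint
    access_key="ACCESS_KEY",    # provided via init-params
    secret_key="SECRET_KEY",    # provided via init-params
    secure=True,
)

BUCKET = "enclave-data"         # assumed bucket name


def put_blob(name: str, ciphertext: bytes) -> None:
    """Upload a blob that was already encrypted inside the enclave."""
    if not client.bucket_exists(BUCKET):
        client.make_bucket(BUCKET)
    client.put_object(BUCKET, name, io.BytesIO(ciphertext), length=len(ciphertext))


def get_blob(name: str) -> bytes:
    """Download the ciphertext; decryption and tag verification happen in the enclave."""
    resp = client.get_object(BUCKET, name)
    try:
        return resp.read()
    finally:
        resp.close()
        resp.release_conn()


if __name__ == "__main__":
    put_blob("state.bin.enc", b"\x00ciphertext bytes\x00")
    assert get_blob("state.bin.enc") == b"\x00ciphertext bytes\x00"
```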