# Enclave Forking

## High Level Overview

* Parent enclave
    * SM will mark the parent's EPM as read-only. (this will be replaced by the snapshot)
    * SM will allocate a new read-only
* Child Enclave
    * SM will create an EPM region with the same number of EPM pages as the parent and initialize the enclave with initial registers ($satp, etc)
        * All pointers will still point to the parent process
    * SM allocates metadata to mark the child enclave as a child of the parent enclave
    * Each child will receive a read-only PMP entry for the parent upon enclave switch
* Upon write access to the parent's EPM pages, SM will get an access fault
    * Store/AMO access fault (exception code 7)
    * SM will copy the page
    * Note: SM may have to copy multiple pages (multiple forks) to other enclaves
        * E.g. the parent enclave forks multiple child enclaves and then edits a page: the SM must copy the page to each child it spawned
    * Code can be handled in: `void machine_page_fault(uintptr_t *regs, uintptr_t dummy, uintptr_t mepc)`

## Code Modifications

```
/* enclave metadata */
struct enclave {
  ...
  enclave_id eid;          // enclave id
  ...
  enclave_id parent_eid;   // parent eid
  struct list children;    // list of children eids
};
```

* Each enclave will now have a parent eid, `parent_eid`
    * Default value will be `-1`
    * Any enclave that is forked will have its parent eid set.
* Each enclave will now have a list of its children enclaves, `children`
    * This isn't strictly necessary
    * Might be useful to keep track of children eids if we wanted to implement `wait`
* Upon `context_switch_to_enclave`, we also want to flip the PMP regions of the parent enclave
    * The PMP regions will grant `read` access to the child enclave, but NOT write.
    * Ensures that the host still cannot access this read-only image.
    * PMP region will be marked by `enclave[eid].regions`
    * The SM is responsible for ensuring fork is correct, so the EID of the parent enclave is guaranteed to be correct.
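
The parent/child bookkeeping and the "copy the page to each child" step described above can be sketched in plain C. This is a minimal user-space simulation under stated assumptions, not SM code: the EPM is a small byte array, `children` is a fixed-size array standing in for `struct list children`, and the names `enclave_init`, `link_child`, `copy_page_to_children`, `EPM_PAGES`, `MAX_CHILDREN` and `NO_PARENT` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PG_SIZE      4096u
#define EPM_PAGES    4u                      /* hypothetical: tiny EPM for the sketch */
#define EPM_SIZE     (EPM_PAGES * PG_SIZE)
#define MAX_ENCLAVES 8                       /* hypothetical limits */
#define MAX_CHILDREN 4
#define NO_PARENT    (-1)                    /* default parent_eid, as in the notes */

typedef int enclave_id;

struct enclave {
    enclave_id eid;
    enclave_id parent_eid;                   /* NO_PARENT unless this enclave was forked */
    enclave_id children[MAX_CHILDREN];       /* stand-in for `struct list children` */
    int        n_children;
    uint8_t    epm[EPM_SIZE];                /* simulated EPM contents */
};

struct enclave enclaves[MAX_ENCLAVES];

/* Reset an enclave slot: zeroed EPM, no parent, no children. */
void enclave_init(enclave_id eid) {
    assert(eid >= 0 && eid < MAX_ENCLAVES);
    memset(&enclaves[eid], 0, sizeof enclaves[eid]);
    enclaves[eid].eid = eid;
    enclaves[eid].parent_eid = NO_PARENT;
}

/* Record `child` as a fork of `parent`. Returns 0, or -1 if the list is full. */
int link_child(enclave_id parent, enclave_id child) {
    assert(parent >= 0 && parent < MAX_ENCLAVES);
    assert(child >= 0 && child < MAX_ENCLAVES && child != parent);
    struct enclave *p = &enclaves[parent];
    if (p->n_children == MAX_CHILDREN)
        return -1;
    p->children[p->n_children++] = child;
    enclaves[child].parent_eid = parent;
    return 0;
}

/* On a write fault at byte `offset` inside the parent's EPM, copy the
 * containing page into every child's EPM at the same offset.
 * Returns the number of children updated, or -1 if `offset` is out of range. */
int copy_page_to_children(enclave_id parent, size_t offset) {
    assert(parent >= 0 && parent < MAX_ENCLAVES);
    if (offset >= EPM_SIZE)
        return -1;
    size_t page = offset & ~((size_t)PG_SIZE - 1);   /* round down to page base */
    const struct enclave *p = &enclaves[parent];
    for (int i = 0; i < p->n_children; i++)
        memcpy(enclaves[p->children[i]].epm + page, p->epm + page, PG_SIZE);
    return p->n_children;
}
```

In this sketch `copy_page_to_children` would be called from the fault path before the parent's write is allowed to proceed, so each child keeps the pre-write contents. It does not track which pages have already been copied; a real handler would presumably need that to avoid overwriting a child's private copy.
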

`void machine_page_fault(uintptr_t *regs, uintptr_t dummy, uintptr_t mepc)`

`void pmp_trap(uintptr_t* regs, uintptr_t mcause, uintptr_t mepc)`

* `int cpu_get_enclave_id()`
    * Use this to find the enclave which caused the access fault
* In the parent, we need to keep track of all pages dirtied since the snapshot was taken; the access fault lets us track this.
* Can find the `dram_base` of the parent enclave by doing `enclaves[parent_eid].dram_base`
* Check if the faulting address is within the parent enclave's EPM and that it is a `write` access. Note that `mepc` holds the PC of the faulting instruction; on RISC-V the faulting data address is reported in the `mtval` CSR.
    * If it is in the parent's EPM, copy the page into the child's EPM.
    * This can be done easily by keeping track of the EPM base of each enclave `runtime_pa_params.dram_base`

    ```
    /* fault_addr is the faulting data address (read from mtval), not mepc */
    offset = (fault_addr & PAGE_MASK) - runtime_pa_params.dram_base;
    memcpy((void *)(enclaves[cpu_get_enclave_id()].dram_base + offset),
           (void *)(enclaves[parent_eid].dram_base + offset), PG_SIZE);
    ```

    * If the faulting address isn't in the parent's EPM then handle the trap normally

```
/*
  Creates a copy of the parent enclave (enclave that called fork)
  and returns the child enclave's eid.
*/
enclave_id enclave_fork(uintptr_t* regs, enclave_id eid, int load_parameters){
  enclave_id child_eid = -1;

  // Copy registers (at time of fork), parameters (dram_base, runtime_base, runtime_entry, etc)
  // Create new enclave with new enclave eid
  child_eid = copy_enclave(regs, load_parameters);

  // Add parent enclave's PMP region to child enclave
  // Set child enclave's parent eid to correct eid

  // Add child to parent enclave's children list
  register_child(child_eid);

  return child_eid;
}
```

* We also set the `$a0` register to `0` for the child process, since the return value of `fork()` in the child should be `0`

# Non-determinism (post-init measurement)

* After the child enclave is initialized and created, it might receive input parameters (e.g. in serverless computing, we can provision the enclave with some function)
* At the time of `fork()`, the parent enclave's state may have changed (e.g. stack and heap are modified) since its initial measurement
* We need to be able to attest some intermediate state of the enclave
* Can we use the initial measurement of the parent enclave to help speed up the intermediate hash?

## Solution 1: Selective Measuring

* The enclave application can designate a region which will not be measured.
    * For enclave applications that are non-deterministic, we can mark sections in code that won't be included in the intermediate measurement
    * This can be a simple ELF section, which must be marked and cannot be modified after the enclave application is initialized in the SM
    * This can be implemented by using a bit in the PTE to mark a page to NOT be included in the measurement
* Initial measurement still contains the non-measured section
    * Initial measurement ensures the non-measured sections are zeroed out
    * Upon any intermediate measurement, marked pages will not be considered.
    * This ensures that no adversary modifies the binary when it is initialized by the SM
* Cons:
    * An adversary can exploit this by putting injected code in the non-measured pages.
        * Adversary can inject code in pages marked as non-measured and create a valid attestation report.
        * We can mitigate this with standard buffer overflow protections (e.g. canaries, bounds checking, etc)

## Solution 2: Measure Log between Host and Enclave

* Non-determinism is from outside the EPM (e.g. inputs given from the host)
* SM keeps track of logs sent and received from an external source (e.g. host, enclave, etc.)
    * All messages received from or sent to an external source will be logged and included in the measurement.
* What about concurrency, where communication between several external sources is in non-deterministic order?
    * We can use a Multi-Set Hash (https://people.csail.mit.edu/devadas/pubs/mhashes.pdf)
        * Hashes do not depend on the order of the set
* Cons:
    * This doesn't solve non-determinism from the program itself (e.g. calling an RNG within an enclave)
    * We assume ALL non-determinism is from communication
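
To illustrate the order-independence property that Solution 2 relies on, here is a minimal sketch of an additive multiset accumulator. It is an assumption-laden toy, not the construction from the linked paper: it sums a per-message FNV-1a hash modulo 2^64, which is not cryptographically secure, and the names `msg_hash`, `struct mset_hash`, `mset_init`, `mset_add` and `mset_equal` are hypothetical. A real SM would presumably use a keyed or collision-resistant per-message hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a (64-bit): a NON-cryptographic stand-in for the per-message hash. */
uint64_t msg_hash(const uint8_t *data, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;               /* FNV prime */
    }
    return h;
}

/* Accumulator over a multiset of logged messages. */
struct mset_hash {
    uint64_t sum;     /* sum of per-message hashes, wrapping mod 2^64 */
    uint64_t count;   /* number of messages added (duplicates counted) */
};

void mset_init(struct mset_hash *m) {
    assert(m != NULL);
    m->sum = 0;
    m->count = 0;
}

/* Add one message. Addition is commutative, so the final value
 * does not depend on the order in which messages were logged. */
void mset_add(struct mset_hash *m, const void *msg, size_t len) {
    assert(m != NULL && (msg != NULL || len == 0));
    m->sum += msg_hash((const uint8_t *)msg, len);
    m->count++;
}

/* Returns 1 if both accumulators hold the same sum and count, else 0. */
int mset_equal(const struct mset_hash *a, const struct mset_hash *b) {
    return a->sum == b->sum && a->count == b->count;
}
```

With this shape, the SM could fold each logged message into the accumulator as it arrives and include the final `sum` and `count` in the measurement; two runs that exchange the same messages in a different interleaving would then produce the same value.
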