# Storing and Cleaning Up Slots (2)

So far, I have been discussing the strategy for implementing _proper cleanup_ for slots in terms of storage primitives exposed to the marketplace, and a combination of active cleanup and garbage collection. After the last round of discussions, a number of those assumptions have shifted:

1. the Marketplace _does not_ call the storage system directly, ever. It emits events through async callbacks, and the code in those callbacks should provide the required guarantees;
2. we do not do active cleanup, and instead rely on garbage collection for everything^[Not my personal preference, but I guess that's beside the point.].

In that sense, the main callbacks we care about are `onStore`, `onExpiryUpdate` and, to a lesser extent, `onClear`.

### Callback 1: onStore

Let the _maximum start time_ for a contract be the deadline that the storage node has for filling some slot. The main guarantee provided by `onStore` is that, when it completes successfully, the slot:

1. is guaranteed to be present in its entirety on disk;
2. is guaranteed to stick around until the _maximum start time_ for the current request.

Since the expiry is set to the maximum start time, any failure to fill the slot -- crashes or otherwise -- will cause the dataset to become eligible for garbage collection within that timeframe.^[Again, not a big fan: you'll probably have to set some very conservative values here so slots do not fail spuriously, meaning failed slots might linger for quite a while.]

In case there are multiple active contracts $c_1, \cdots, c_n$ for the slot with maximum start times $a_1, \cdots, a_n$, `onStore` guarantees that the dataset will be available until time $\max_i a_i$.

### Callback 2: onExpiryUpdate

Like `onStore`, `onExpiryUpdate` guarantees that a slot will remain on disk until the maximum expiry value supplied to the callback, taken over all calls. Fig. 1 shows a slot with $3$ storage contracts. The repeated calls to `onExpiryUpdate` will cause the dataset to expire only at the end of $c_3$, after which it becomes eligible for garbage collection. Since the expiration time passed to this callback corresponds to the expiration of the storage contract itself, this guarantees that the dataset remains available until the end of the contract with the latest expiration time.

![slot](https://hackmd.io/_uploads/S19FcQQAkx.png)

**Figure 1.** $3$ overlapping contracts for some slot causing the expiry to be extended twice. At the end of the last contract, the dataset can be garbage collected.

**Contract cancellations.** Contract cancellations should in general be uneventful, except when the contract being cancelled is the one with the latest expiration; i.e., $c_3$ in Fig. 1. In this case, we have two options:

1. do nothing and let the dataset linger on disk despite the cancelled contract;
2. shrink the expiration for the dataset down to that of the contract with the second-latest expiration; i.e., $c_2$.

For us to be able to do (2), the new expiration must be communicated as part of the cleanup callback. `OnClear` currently does not communicate this:

```nim=
OnClear* = proc(request: StorageRequest, slotIndex: uint64) {.gcsafe, upraises: [].}
```

So it would have to be extended to accommodate it; e.g.:

```nim=
OnClear* = proc(request: StorageRequest, slotIndex: uint64, expiry: SecondsSince1970) {.gcsafe, upraises: [].}
```
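To make option (2) concrete, here is a minimal sketch of a host-side handler matching the extended signature (pragmas omitted for brevity). The stand-in types and the names `slotId` and `datasetExpiries` are hypothetical, not actual nim-codex APIs; the assumption is that the marketplace, which knows all active contracts for the slot, computes the second-latest expiration and passes it in, so the handler only ever shrinks the stored expiry.

```nim=
import std/tables

type
  SecondsSince1970 = int64   # stand-in for the Codex time alias
  SlotId = string            # stand-in; the real type is a hash
  StorageRequest = object    # stand-in carrying only what the sketch needs
    id: string

var datasetExpiries: Table[SlotId, SecondsSince1970]

proc slotId(request: StorageRequest, slotIndex: uint64): SlotId =
  ## Hypothetical derivation; the real code hashes request id and index.
  request.id & "/" & $slotIndex

proc onClear(request: StorageRequest, slotIndex: uint64,
             expiry: SecondsSince1970) =
  ## `expiry` carries the second-latest contract expiration, computed by
  ## the caller. Shrinking only ever moves the expiry down, and never
  ## below a value that some still-active contract depends on.
  let id = slotId(request, slotIndex)
  if id in datasetExpiries and expiry < datasetExpiries[id]:
    datasetExpiries[id] = expiry
```

Having the marketplace compute the new expiration keeps the host side simple: it never needs to track the full set of contracts per slot, only a single expiry per dataset.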
**Racing deletes.** Garbage collection is not instantaneous. It could happen, therefore, that a new contract for a dataset that has already expired gets issued after its expiration time, either as part of a late renewal, or as part of a new request altogether. `onStore` will guarantee that the dataset is either re-downloaded or has its expiry extended. To make that work without racing the garbage collector, we will use locking internally, within the repostore and a future download manager.
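As a sketch of what that locking discipline could look like, the snippet below is written against chronos (which nim-codex already uses); the `RepoStore` layout and proc names are hypothetical. The point is that expiry extension (driven by `onStore`) and the garbage collector's delete acquire the same lock, so a delete can never interleave with a check-and-extend.

```nim=
import std/tables
import pkg/chronos

type
  SecondsSince1970 = int64            # stand-in for the Codex time alias
  RepoStore = ref object              # hypothetical repostore layout
    lock: AsyncLock
    expiries: Table[string, SecondsSince1970]

proc newRepoStore(): RepoStore =
  RepoStore(lock: newAsyncLock())

proc extendExpiry(store: RepoStore, dataset: string,
                  expiry: SecondsSince1970) {.async.} =
  ## Called from `onStore`: re-registers or extends the dataset
  ## atomically with respect to garbage collection.
  await store.lock.acquire()
  try:
    let current = store.expiries.getOrDefault(dataset, 0'i64)
    store.expiries[dataset] = max(current, expiry)
  finally:
    store.lock.release()

proc collect(store: RepoStore, dataset: string,
             now: SecondsSince1970) {.async.} =
  ## Called from the garbage collector: deletes only if the dataset is
  ## still expired once the lock is held.
  await store.lock.acquire()
  try:
    if store.expiries.getOrDefault(dataset, 0'i64) <= now:
      store.expiries.del(dataset)   # stand-in for the on-disk delete
  finally:
    store.lock.release()
```

Under this scheme, a late `onStore` either finds the dataset still registered and extends it, or finds it already collected and triggers a re-download; it can never observe a half-deleted dataset.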