# Delayed Write Buffers:
> A solution to storage access patterns for Secret Tokens
> by Blake Regalia and Ben Adams
###  StarShell - @supdoggie and @darwinzero
## Motivation
When conducting token transfers, Secret contracts must store balances and transfer histories in its internal (encrypted) key-value database. However, even though the database key is encrypted, it is unique to the recipient. This allows an attacker to monitor which database keys are read from and written to each execution to deduce [storage access patterns](https://docs.scrt.network/secret-network-documentation/development/development-concepts/secret-contract-fundamentals/privacy-essentials) and thus correlate transfers to a particular recipient the next time that key is read from or written to.
Since the contract _must_ access a sender's balance when executing an outgoing transfer (to make sure they have enough to cover the spend), this leads to a conundrum: how can we update a recipient's balance during a transfer without accessing a storage area associated with their account?
## Requirements
In order to design of a set of data structures and algorithms that will address these challenges, we first turn to requirements engineering. Taking this approach, we are able to iteratively plan out each necessary component while avoiding redundancies and ensuring that solutions do not violate other requirements.
### R1. Storage Access Patterns: Transfers Table
**Requirement:** When transferring tokens, the contract must not access **both** storage areas of sender (or owner) and recipient, in order to avoid linking them together.
**Solution:** Instead of writing to the recipient's storage area, the contract will write the details of the transfer to a tabular data structure, the entire contents of which get read and overwritten upon each access.
### R2. Finite Table Capacity: Delayed Write Buffer
**Requirement:** Since the capacity of the transfers table is finite, once the table reaches its capacity, the contract must make room for new entries and evict older ones.
**Solution:** Randomly select an entry from the table (i.e., the buffer) to evict and update the stored balance of its recipient (i.e., "settle" the entry).
### R3. Flush Attacks: Accumulation
**Requirement:** Repeated transfers to the same recipient must not cause other entries to settle as this would allow an attacker to "flush" the rest of the buffer where they could record which keys are written to in order to deanonymize the recipient set of the previous N transactions.
**Solution:** If a recipient already has an entry in the buffer, rather than creating a new entry (which would reduce the anonymity set), or replacing the existing entry (which would create a storage access pattern link), that existing entry will instead accumulate new transfers.
### R4. Side Channel Leaks: Constant Time Operations
**Requirement:** The total timing of transfer executions, the timing between reads from storage, and the total amount of gas consumed must be consistent across the various combinations of transfer/buffer conditions. In other words, a malicious node operator running modified software outside the enclave must not be able to deduce information about the internal state of the buffer nor the recipient of transfer executions (such as whether or not the recipient was in the buffer, or whether or not some of the previous transfers were to recipients that were already in the buffer, etc.).
**Solution:** All operations between (a) the beginning of the execution and (b) the last read from storage will be constant-time. Additionally, the final amount of gas used shall be identical across all transfer executions (using gas evaporation as necessary, and with reasonable exceptions for features unrelated to the privacy of transfers).
### R5. Storage Write Leaks: Phony Writes
#### R5a. Recipient Entry
**Requirement:** Due to R3, the contract does not need to evict an entry from the buffer if the recipient already has an entry. However, this would lead to a branch in execution where sometimes an entry gets settled (i.e., if recipient is not in buffer), whereas other times an entry is not settled (i.e., if recipient _is_ already in buffer). More importantly, this would also produce storage write patterns, where the absence of a write would indicate the recipient of the transfer was already in the buffer.
**Solution:** If the recipient is already in the buffer, in order to produce the same storage write pattern, rather than settling an existing entry from the buffer (which would violate R3), the contract will instead write to the storage area of an unrelated address.
#### R5b. Owner/Sender Balance
**Requirement:** For any transfer, the balance of the owner must be checked and changed in order to cover and account for the amount of funds being moved. The amount of an outgoing transfer will either (a) exceed the owner's total balance and fail, (b) exceed the owner's stored balance but not the their total balance, (c) exceed the owner's buffered balance but not their total balance, (d) both b and c, (e) be covered by the owner's stored balance, (f) be covered by the owner's buffered balance, or (g) both e and f. Except for (a), the contract must not leak which of these conditions were true upon executing a transfer.
**Solution:** The contract will always first check the buffer to see if the sender has an entry and settle it. If the owner does not have an entry in the buffer, in order to produce the same storage write pattern, then the contract will write to the storage area of an unrelated address.
### R6. Flooding Attacks: Buckets
**Requirement:** An extremely vigilant attacker with ample funds must not be able to deduce the entries in the buffer through a series of flooding attacks by which they monopolize the buffer with high certainty using owned recipients. Albeit prohibitively expensive to perform at scale, a determined attacker may be able to correlate or de-anonymize a single transfer event to a pre-selected victim recipient.
**Solution:** Instead of storing each owner's/sender's/recipient's balance and history under a single storage key, use buckets to group together addresses into finite anonymity sets, which get read from and written to as a single unit, concealing which items were accessed from the perspective of an observer monitoring the database.
## Delayed Write Buffer (DWB)
We introduce the Delayed Write Buffer, a fixed-width data structure that gets read from and written to on every transfer execution, persisted under a constant storage key.
When a user executes a transfer, instead of writing to the storage key associated with the recipient's balance, the contract inserts an entry into the Delayed Write Buffer containing the recipient's address, the amount, and the ID of the transfer event (which, when dereferenced, contains metadata such as sender's address, date/time, and memo).
Once the buffer has reached its capacity, the contract will select an entry from the buffer at random and "settle" it by updating the corresponding recipient's balance and removing the entry from the buffer, effectively avoiding the direct storage access association between sender and recipient.
## Example
<!--  -->
<!--  -->

### Executing the transfer
First, a new transfer event is saved to storage, keyed by a globally incrementing transfer event ID:
```
let transferId := append(transferEvents, {
sender: Alice
recipient: Carol
amount: 50 TKN
date/time: ...
memo: ...
})
```
Next, the contract loads the Delayed Write Buffer from storage, selects an entry at random, and settles it. For the sake of this example, assume the state of the buffer is as follows:
```
Transfers buffer: [
0: ["Tom", "50 TKN", {"#75"}]
1: ["Edgar", "10 TKN", {"#69"}]
2: ["Bob", "20 TKN", {"#124"}]
...
15: ["Josh", "15 TKN", {"#89"}]
]
```
The contract randomly selects the entry `2: ["Bob", "20 TKN", {"#124"}]`. The contract settles this by updating Bob's _stored_ balance, then replaces it with the new entry: `2: ["Carol", "50 TKN", {"${transferId}"}]`, making the new state of the buffer as follows:
```
Transfers buffer: [
0: ["Tom", "50 TKN", {"#75"}]
1: ["Edgar", "10 TKN", {"#69"}]
2: ["Carol", "50 TKN", {"#124"}]
...
15: ["Josh", "15 TKN", {"#89"}]
]
```
At this point, nothing associated with Carol's storage areas has been accessed. Instead, Alice's execution has incidentally written to the storage area associated with *Bob's* account by writing to his stored balance, even though Alice and Bob have no transfer associations.
This strategy greatly reduces the ability for an attacker to correlate sender and recipient through storage access patterns.
### Querying for balance & history
Entries in the Delayed Write Buffer count towards a user's balance. When Carol queries for her balance, the contract searches the buffer to find any entries that belong to Carol (i.e., where Carol is the recipient). It adds this subtotal to her _stored_ balance to arrive at her actual balance.
```
let balance := userBalances[user]
for entry in transfersBuffer:
if entry.recipient == user:
balance += entry.amount
return balance
```
Similarly, queries for transfer history must also combine data from these two storage areas for a given user's history (more on history below).
### Sender's balance
When executing a transfer, the contract should first check if there is a transfer in the buffer designated for the current sender before updating their balance.
For example, if Alice is transferring tokens to Carol, the contract should settle the transfer in the buffer designated to Alice before updating her balance.
_Before:_
```
Transfers buffer: [
...
7: ["Alice", "15 TKN", {"#210"}],
...
]
```
_After:_
```
Transfers buffer: [
...
7: ["Alice", "0 TKN", {}],
...
]
```
### Distinct recipients
In order to prevent an attacker from 'flushing' a victim's storage key by transferring to them many times, entries in the buffer must be distinct by recipient address, and repeated transfers to the same recipient must accumulate in that entry.
In other words, if the recipient already has an entry designated to them in the buffer, the contract must _update_ this entry with the combined transfer amount rather than creating a new entry.
If Carol transfers 15 TKN to Bob, the contract updates his buffered balance like so:
_Before:_
```
Transfers buffer: [
...
8: ["Bob", "50 TKN", {"#297"}]
...
]
```
_After:_
```
Transfers buffer: [
...
8: ["Bob", "65 TKN", {"#316", #297"}]
...
]
```
### Transaction history events
In order to update the recipient's transfer history without accessing a storage key associated with their account (which would create storage access patterns), a reference to the transfer event (using its ID) is saved to the buffer. However, since multiple transfers to a given recipient can accumulate under a single entry in the buffer, the entry needs a way to store all these events.
Our solution is to use a linked list, where each new transfer inserts at the head of the list:
<!--  -->

With this approach, an entry in the buffer for a given recipient can accumulate events ad infinitum. The storage areas associated with transaction history events are only ever accessed a single time across all executions (when they are written to). The only other time they are accessed is when they are read by the query node at the time a user queries for their transaction history.
## Seeding the buffer
The buffer is effective at providing privacy when it is saturated. If a brand new contract starts settling entries in the buffer before it is saturated, the privacy mechanism is undermined.
To prevent this, contracts should only start settling entries once the buffer reaches its capacity.
A simple way to implement this is to encode a single uint, `emptySpacesCounter`, at the beginning of the data structure to track how many empty spaces are left in the buffer. If the uint is greater than `0`, then the contract can simply write to the index at *that value minus one*. If the uint reads `0`, then the buffer has reached capacity and the contract should choose an entry at random to settle and replace.
When initializing the contract, the `emptySpacesCounter` should be set to the capacity of the buffer.
## Data structure example for token transfers
The Delayed Write Buffer is a simple byte sequence of k entries, prefixed by a single uint encoding how many unused spaces remain:
```
[emptySpaceCount, ...transfers]
```
And where each entry in transfers is a triple of:
```
[recipient, amount, eventsLinkedListId, listLen]
```
```
20 bytes: recipient (canonical address)
+8 bytes: amount (uint64)
+5 bytes: events list ID (fits in uint64)
+2 bytes: list length
---------------------
= 35 bytes per entry
```
## Constant-time search
The contract should avoid leaking data about the contents stored in the Delayed Write Buffer through side channel attacks. Certain operations on the buffer should therefore be constant-time.
### Reasoning
To demonstrate what would happen otherwise, imagine the buffer stored its entries sorted by recipient address and used a binary search to find whether or not the buffer has an incoming transfer for a specific recipient. An attacker could generate lots of accounts and probe the time/gas the contract spends searching the buffer for each one. In theory, the attacker would be able to deductively brute force the leading bytes of an entry's recipient address.
#### Implementation
When searching the buffer for a given recipient's address, a constant-time byte comparison should be used. A few potential options include:
- [constant_time_eq](https://docs.rs/constant_time_eq/latest/constant_time_eq/fn.constant_time_eq_n.html)
- [fixed_time_eq](https://docs.rs/rust-crypto/0.2.36/crypto/util/fn.fixed_time_eq.html)
Pseudocode for the search algorithm:
```
let transferInBuffer := nil
for entryIndex in 0..BUFFER_CAPACITY
let offset := entryIndex * BUFFER_CAPACITY
let recipient := slice(transfersBuffer, offset, offset + 20)
if constantTimeEq(recipient, user):
transferInBuffer = readEntry(entryIndex)
```
## Selecting buffer parameters
Let _k_ be the capacity of the buffer.
The probability that a transfer entry for a particular recipient has not been accessed after _n_ subsequent transfer executions is:
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mrow>
<mi>P</mi>
<mo>(</mo>
<mtext>not yet accessed</mtext>
<mo>)</mo>
<mo>=</mo>
<msup>
<mrow>
<mo>(</mo>
<mfrac>
<mrow>
<mi>k</mi>
<mo>–</mo>
<mn>1</mn>
</mrow>
<mi>k</mi>
</mfrac>
<mo>)</mo>
</mrow>
<mi>n</mi>
</msup>
</mrow>
</math>
#####
Visualizing this as a function of _n_ for various choices of _k_:

Interpretting the chart above, we can see that for $k = 64$, there is still a 20% chance that a recipient's storage area hasn't been accessed after 100 executions. This probability is what protects an attacker from deducing which transaction actually transferred tokens to a given recipient (before considering bucketing).
Another way to think about this chart is to analyze at what point an attacker reaches a high-confidence value (say 99.5%) that a recipient's storage key has been accessed sometime within the last _n_ executions. In other words, for a buffer width of $k = 64$, this is roughly equivalent to an anonymity set of size 336.
$1 - P(\operatorname{not yet accessed}) = 1 - \left(\frac{64-1}{64}\right)^n = 0.995$
$n = \frac{\ln(0.005)}{\ln(63/64)} = 336.4362...$
## Bitwise-Trie of Bucketed Entries (BTBE)
Complementary to all of the privacy benefits above about the Delayed Write Buffer, we introduce a novel data structure to further enhance the privacy of Secret token transfers. The BTBE manages a tree of constant-sized tables (buckets), where each one deterministically groups together a set of addresses (along with their stored balance and transfer history) keyed by the cryptographic hash of their address with some internal contract secret. The bitwise-trie distributes addresses among buckets, ensuring that the query/execution cost of locating and inserting entries grows logarithmically with new recipients.
The following diagram illustrates the insertion of a new entry for Edgar's stored balance, where his address hashes to the hexadecimal value `0xF5` (`0b1111 0101`), showing how the trie changes from left to right as new branches are created until reaching the 2nd leading bit of the hash.
<!--  -->

<!--
Larger values of _k_ improve privacy but incur more expensive gas costs.
import numpy as np
import matplotlib.pyplot as plt
# Define the range for k (from 2 to 100)
k_values = np.arange(0, 200)
k = 4
# Define the function r(k)
r_values = 1 - (((k - 1) / k) ** k_values)
k1 = k * 2
r1_values = 1 - (((k1 - 1) / k1) ** k_values)
k2 = k * 4
r2_values = 1 - (((k2 - 1) / k2) ** k_values)
k3 = k * 8
r3_values = 1 - (((k3 - 1) / k3) ** k_values)
k4 = k * 16
r4_values = 1 - (((k4 - 1) / k4) ** k_values)
k5 = k * 32
r5_values = 1 - (((k5 - 1) / k5) ** k_values)
k6 = k * 64
r6_values = 1 - (((k6 - 1) / k6) ** k_values)
# Plotting
plt.figure(figsize=(8, 4))
plt.plot(k_values, r_values, label="k=4", color='gold')
plt.plot(k_values, r1_values, label="k=8", color='orange')
plt.plot(k_values, r2_values, label="k=16", color='red')
plt.plot(k_values, r3_values, label="k=32", color='purple')
plt.plot(k_values, r4_values, label="k=64", color='blue')
plt.plot(k_values, r5_values, label="k=128", color='teal')
plt.plot(k_values, r6_values, label="k=256", color='green')
plt.title("Probability a transfer settles after\nn subsequent transfer executions")
plt.xlabel("Number of executions (n)")
plt.ylabel("P(transfer settling)")
plt.grid(True)
plt.legend()
plt.show() -->