# Private Ethereum State Retrieval over UBT using Vectorized BatchPIR + Oblivious Compression

## 0. Goal and Threat Model

**Goal.** A client wants to retrieve *Ethereum state values* (account fields, storage slots, contract code chunks, …) from an off‑chain UBT state provider:

* without revealing **which state keys** they are interested in; and
* while being able to **verify** those values against an authenticated UBT root using Merkle proofs.

**Parties & assumptions**

* **Client (wallet / prover):**
  * Knows the UBT key-derivation rules and the canonical UBT root for some block.
  * Holds the HE secret key for PIR.
* **Server (state provider):**
  * Stores a flattened UBT database for one or more block heights.
  * Knows only the HE public key + evaluation keys.
  * Is *honest-but-curious*: follows the protocol, but tries to learn the queried indices.
* **Blockchain:**
  * The on-chain commitment is an MPT root. The server additionally provides an authenticated **MPT→UBT transformation proof + UBT root** so the client can trust UBT proofs.

Privacy notion: the server learns nothing about which UBT locations (and hence which Ethereum accounts / slots) are being queried, beyond the aggregate batch size.

---

## 1. UBT as the State Backing Store

We use the Unified Binary Tree (EIP‑7864) as the canonical off‑chain state representation.

### 1.1 UBT shape and proof size

* **Key–value space.** A single, unified trie over 32‑byte keys and 32‑byte values:
  * Keys: 31‑byte **stem** + 1‑byte **subindex**.
  * Values: 32‑byte payloads (or Merkleized chunks, depending on node type).
* **Binary arity.** The trie is arity‑2, so for a tree with ~2²⁴ leaves:
  * Merkle proof depth ≈ 24.
  * Proof size ≈ 24 siblings × 32 bytes = 768 bytes, ignoring small framing overhead.
* **SNARK‑friendliness.** No extension nodes, no RLP; just a binary Merkle tree over fixed‑width hashes.

This structure is exactly what we need: short proofs (log N) and a clean mapping to a flat key–value database.

### 1.2 Flattening UBT to a PIR Database

The server maintains a **flat key–value database** derived from the UBT:

* **Key:** a 32‑byte *UBT location identifier* (effectively the node location), derived from:
  * the 31‑byte stem (= shared prefix for 256 related leaves); and
  * a 1‑byte subindex that identifies:
    * which account/header leaf or storage/code leaf, or
    * an internal Merkle node in the binary tree.
* **Value:** the serialized node payload:
  * node type, children hashes, and/or the leaf’s value hash/content;
  * enough to reconstruct Merkle paths and leaf values.

This is a standard key–value store behind PIR: indices are just logical array positions over the set of all UBT nodes. The server materializes:

* a mapping from `(stem, subindex)` → **array index** `j ∈ [0, N)`; and
* a fixed ordering of internal nodes and leaves so that `DB[j]` is the value used by the PIR engine (see the sketch after this list).
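A minimal sketch of this flattening step, assuming a simple sorted-key ordering (the `build_flat_db` helper and the ordering are illustrative choices, not part of EIP‑7864; any fixed, public ordering works):

```python
# Hedged sketch of the server-side flattening (Section 1.2): assign each UBT
# node a stable array index and store its payload for the PIR engine.
def build_flat_db(ubt_nodes):
    """ubt_nodes: dict mapping a 32-byte UBT location identifier
    (stem + subindex, or an internal-node location) -> serialized node payload."""
    index_of = {}   # location -> array index j in [0, N)
    db = []         # DB[j] = value handed to the PIR engine
    for j, location in enumerate(sorted(ubt_nodes)):   # fixed, public ordering
        index_of[location] = j
        db.append(ubt_nodes[location])
    return index_of, db

# Toy usage: three dummy nodes keyed by 32-byte identifiers.
nodes = {bytes([i]) + bytes(31): f"node-{i}".encode() for i in range(3)}
index_of, db = build_flat_db(nodes)
assert db[index_of[bytes([1]) + bytes(31)]] == b"node-1"
```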
### 1.3 Mapping Ethereum State to UBT Keys

The client must deterministically derive the right *UBT keys* for each high‑level “state value” it wants. We assume EIP‑7864‑style helper functions like `get_tree_key`, `get_tree_key_for_storage_slot`, and `get_tree_key_for_code_chunk` (names for exposition).

| State element | UBT key layout (logical description) |
| --- | --- |
| **Account basic data** (nonce, balance, code size, version) | Packed into a single 32‑byte value at stem `s = hash(address)` and **subindex 0** (`BASIC_DATA_LEAF_KEY`). Opening one UBT branch therefore suffices to read all of those fields. |
| **Storage slot `k`** | `stem = hash(address ‖ floor(k / 256))`, `subindex = k mod 256`. Slots in a 256‑slot window share a stem and live under a single “storage stem” branch. |
| **Contract code chunk `i`** | Code is chunked into 31‑byte segments plus metadata, then placed at specific subindices via `get_tree_key_for_code_chunk(address, chunk_index)`. The **first 128 chunks** can share the account header stem to improve locality (per the UBT spec). |

Given this, the client can compute:

1. the UBT leaf key `(stem, subindex)` for each desired state value;
2. from there, the *binary path* and the list of internal nodes needed for the Merkle proof from that leaf up to the UBT root.

All of these nodes map to specific database indices `I = {i₁, …, i_ℓ}` in the flattened DB.

---

## 2. From State Values to PIR Indices

### 2.1 What the Client Actually Needs

For each requested **state value** (e.g., “balance of address A” or “storage slot k of contract C”), the client needs:

1. the **leaf value** (the actual 32‑byte state field or code chunk);
2. the sequence of **sibling nodes** on the path from the leaf to the UBT root;
3. the **UBT root** itself and an **MPT→UBT proof** so that the root commitment is anchored in Ethereum consensus.

With (1)–(3), the client can locally recompute the root and verify:

* correctness of the value, and
* consistency with the on‑chain MPT root.

### 2.2 Computing the Batch of Node Indices

For a batch of user‑level state values `S = {x₁, …, x_m}`:

1. For each `x_j`, derive its UBT leaf key `(stem, subindex)` using the appropriate mapping.
2. For each leaf, compute the **binary path** from leaf to root under the UBT rules (public, deterministic).
3. Collect the set of **all unique UBT nodes** needed across all paths:
   $\mathcal{N} = \bigcup_{j=1}^m \text{PathNodes}(x_j)$
4. Map each node in `𝒩` to its flattened DB index using the server’s public indexing scheme (or a client‑reconstructed one if feasible).
5. The result is a batch of indices:
   $I = \{i_1, \dots, i_\ell\}, \quad \ell = |\mathcal{N}|$

This `I` is the input to the BatchPIR protocol. Note that multiple user‑level state values often share large parts of their path, so `ℓ` is typically much smaller than `m · log N`. A toy derivation is sketched below.
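To make steps 2–5 concrete, here is a hedged sketch of the path-to-index derivation. The heap-style numbering in `node_index` (the node with path bits `b` at depth `d` gets index `2^d − 1 + b`) is an illustrative indexing convention, not the server’s actual scheme:

```python
# Toy model of Section 2.2: turn a batch of UBT leaf paths into the
# deduplicated set of flat DB indices covering every Merkle proof.
TREE_DEPTH = 24  # ~2^24 leaves in the running example

def node_index(path_bits):
    """Flat index of the node reached by following path_bits from the root."""
    d = len(path_bits)
    return (1 << d) - 1 + int("".join(map(str, path_bits)) or "0", 2)

def proof_node_indices(leaf_bits):
    """Indices of the leaf plus every sibling on the way up to the root."""
    indices = {node_index(leaf_bits)}
    for depth in range(len(leaf_bits), 0, -1):
        prefix = list(leaf_bits[:depth])
        prefix[-1] ^= 1                      # flip the last bit -> sibling at this depth
        indices.add(node_index(prefix))
    return indices

def batch_indices(leaf_paths):
    all_nodes = set()
    for bits in leaf_paths:                  # shared path prefixes dedupe naturally
        all_nodes |= proof_node_indices(bits)
    return sorted(all_nodes)                 # I = {i_1, ..., i_l}

# Two leaves sharing a long prefix need far fewer than 2 * (TREE_DEPTH + 1) nodes.
a = [0] * TREE_DEPTH
b = [0] * (TREE_DEPTH - 1) + [1]
assert len(batch_indices([a, b])) < 2 * (TREE_DEPTH + 1)
```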
---

## 3. BatchPIR Backbone

We adopt a **single‑server, RLWE‑based BatchPIR** with vectorization and oblivious compression:

* **Vectorized BatchPIR** (Mughees & Ren) for efficient computation + packing of small entries.
* **LSObvCompress / LSObvDecompress** (Bienstock et al.) for oblivious compression of request and response ciphertexts in the cuckoo‑hashing BatchPIR framework.

### 3.1 Cuckoo Hashing and Buckets

Standard pattern:

* The server splits the DB of size `N` into buckets using `w` hash functions (`w ≈ 3`), producing `B ≈ 1.5ℓ` buckets at query time, each containing a subset of DB entries.
* The client maps its batch indices `I` into the same bucket space using **cuckoo hashing** (at most one real index per bucket):
  * each real index is placed into one of its `w` candidate buckets;
  * empty buckets are filled with dummy index 0.
* Each bucket is handled by its own PIR instance, and vectorization packs many buckets into a small number of ciphertexts.

This gives:

* amortized server time ≈ tens of ms per entry;
* baseline response size ≈ `1.5ℓ` logical PIR responses (before vectorization/compression).

### 3.2 Vectorized PIR (compressed, high level)

Vectorization leverages SIMD‑like HE slots:

* Each RLWE plaintext encodes a **vector of entries**; each ciphertext handles many entries simultaneously.
* Within each bucket, the server:
  * arranges DB entries as a 3‑dimensional hypercube;
  * processes the PIR query dimension by dimension with homomorphic multiplications and rotations;
  * ends with a ciphertext that has the requested entry in exactly one slot.

Packing across buckets:

* Requests from multiple buckets are packed into the same ciphertext by interleaving slots.
* Similarly, bucket contents are encoded so rotations don’t create collisions.
* The server merges vectorized responses from many buckets into **one ciphertext** by:
  1. copying each bucket’s non‑zero slot to all slots via rotate‑and‑sum;
  2. masking different slot positions for different buckets;
  3. adding all masked ciphertexts together.

Result: retrieving 256 entries (256‑byte entries, 1M‑entry DB) gives ≈ 19× communication overhead over raw plaintext, with amortized ~123 ms per entry on a single core.

---

## 4. Oblivious Compression: LSObvCompress + LSObvDecompress

Cuckoo‑based BatchPIR introduces **dummy** queries/responses (~0.5ℓ of them). LSObv* eliminates the need to explicitly send encryptions for these zeros.

### 4.1 LSObvDecompress for Request Compression (Client → Server)

Context: among the `1.5ℓ` logical bucket queries, only `ℓ` correspond to real UBT nodes; the others are dummy.

* **Compressor:** the **client** knows the set of relevant logical bucket indices `I ⊂ [1.5ℓ]`, `|I| = ℓ`.
* **Decompressor:** the **server** gets only a compact encoding `ĉ` (≈ `(1+ε)ℓ` ciphertexts) and the public key; it does **not** know which positions in `[1.5ℓ]` are real vs dummy.

Using LSObvDecompress:

1. The client constructs a plaintext vector of query messages (one per logical bucket), with zero entries for dummy buckets.
2. The client runs the compress algorithm (solving a small random linear system) and outputs a compact encrypted representation `ĉ` of dimension ≈ `(1+ε)ℓ`.
3. The server runs the oblivious decompression algorithm, which:
   * homomorphically multiplies by a random band matrix;
   * recovers a vector of ciphertexts `c̃` of length `1.5ℓ`, where all real bucket queries are correctly reconstructed, and dummy slots can be arbitrary encryptions.
4. The server feeds `c̃` into the usual BatchPIR pipeline as if the client had sent all `1.5ℓ` bucket queries explicitly.

This saves ≈ 30% of request communication compared to sending all queries, while preserving query privacy and correctness with negligible failure probability (~2⁻⁴⁰).
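The linear-system view can be illustrated with a plaintext toy model: no encryption, and a dense random matrix standing in for LSObvDecompress’s band matrix, so this is only a structural sketch. The client solves for a short encoding `ĉ` such that the public matrix reconstructs its real queries at exactly the positions it cares about; the remaining positions come out arbitrary.

```python
# Plaintext toy model of the request-compression idea in Section 4.1.
# In the real scheme the "decompress" step runs homomorphically on ciphertexts
# and M is a random band matrix (solvable in near-linear time); here we use
# numpy over the reals purely to show the structure.
import numpy as np

def client_compress(queries, real_positions, M):
    """queries: length-B vector with zeros at dummy positions; M: B x m public matrix."""
    rows = M[real_positions, :]                      # only the real rows must match
    targets = queries[real_positions]
    c_hat, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    return c_hat                                     # length m ~ (1+eps)*l

def server_decompress(c_hat, M):
    return M @ c_hat                                 # correct at real positions, arbitrary elsewhere

B, m = 12, 9                                         # B = 1.5*l buckets, m = (1+eps)*l (l = 8)
rng = np.random.default_rng(0)
M = rng.standard_normal((B, m))
real = [0, 2, 3, 5, 6, 8, 9, 11]
q = np.zeros(B)
q[real] = rng.integers(1, 100, size=len(real))
expanded = server_decompress(client_compress(q, real, M), M)
assert np.allclose(expanded[real], q[real])
```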
### 4.2 LSObvCompress for Response Compression (Server → Client)

On the response side:

* The server obtains `1.5ℓ` logical PIR responses, of which only `ℓ` are non‑zero (real UBT nodes); the remaining `0.5ℓ` responses encrypt zero.

Using LSObvCompress:

1. The server views the responses as a length‑`1.5ℓ` ciphertext vector `c̃`; it knows only `t = ℓ`, *not* which positions are non‑zero.
2. It runs LSObvCompress.ObvCompress on `c̃`:
   * picks a random band matrix `M` with `m = (1+ε)ℓ` rows;
   * performs a matrix–vector product `ĉ = M·c̃` using only homomorphic additions.
3. The client, who *does* know which logical responses are relevant (i.e., which Merkle nodes it asked for), runs Decompress:
   * it reconstructs the subset of non‑zero plaintexts `{(i, p_i)}_{i∈I}` by solving a small linear system in the clear after decryption.

Result:

* Response size shrinks from `1.5ℓ` logical responses to ≈ `(1+ε)ℓ` ciphertexts, with ε ≈ 0.05 in practice, i.e. a ~30% reduction.
* Encoding and decoding cost is dominated by `O(nλ)` and `O(tλ)` additions respectively (no multiplications), which is small relative to the underlying PIR work.

### 4.3 Combined with Vectorization

Vectorization and LSObv* are orthogonal:

* vectorization packs multiple entries per ciphertext (when entries are small);
* LSObv* reduces the total number of ciphertexts required for the logical bucket layer.

Combined, we get:

* **request:** `⌈(1+ε)ℓ / r⌉` ciphertexts instead of `⌈1.5ℓ / r⌉`, where `r` is how many bucket queries we pack per ciphertext;
* **response:** `⌈(1+ε)ℓ / d⌉` ciphertexts instead of `⌈1.5ℓ / d⌉`, where `d` is how many logical responses we pack per ciphertext.

---

## 5. End-to-End Protocol Flow

Putting everything together for a single block height:

### 5.1 Setup (Server, per block height)

1. **Maintain UBT state:**
   * Build (or incrementally update) the binary UBT from the canonical MPT using the MPT→UBT migration logic from EIP‑7864.
   * Compute the UBT root.
2. **Flatten:**
   * Assign each UBT node a stable index `j ∈ [0, N)`.
   * Store `DB[j] = node_value` in a key–value DB keyed by `(stem, subindex)` or the node location.
3. **Preprocess for BatchPIR:**
   * Hash indices into `B` buckets via Angel et al.’s framework.
   * Encode bucket contents for vectorized PIR (hypercube layout, slot packing).
4. **HE setup:**
   * Receive the client’s HE *public key* + evaluation keys (relinearization/rotation).
   * Optional: reuse the same HE parameters across many clients, but keys are per client.

### 5.2 Query (Client → Server)

For a batch of state requests `S = {x₁,…,x_m}`:

1. **Compute UBT leaves:**
   * For each `x_j` (e.g., `(address, storage_slot)`), derive `(stem, subindex)` using the UBT mapping.
2. **Derive path node indices:**
   * For each leaf, compute the Merkle path and corresponding node indices; deduplicate to get `I = {i₁,…, i_ℓ}`.
3. **Cuckoo hashing into buckets:**
   * Run client‑side cuckoo hashing over `I` to assign each index to one of `B ≈ 1.5ℓ` buckets (see the sketch after this list).
   * Fill unused buckets with a dummy index.
4. **Per-bucket index → PIR index:**
   * Convert each DB index to an “index within bucket” using precomputed mapping rules.
5. **Vectorized query generation:**
   * For each bucket, form a standard single‑query PIR request (hierarchical hypercube index).
   * Pack query vectors from multiple buckets into ciphertexts via vectorization.
6. **Request compression:**
   * Apply LSObvDecompress.Compress on the plaintext vector of bucket queries so that only the ℓ real queries drive the compressed encoding.
7. **Encrypt and send:**
   * Encrypt the compressed request and send it to the server.
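A minimal sketch of the client-side cuckoo-hashing step referenced in 5.2 (step 3). The tagged-SHA-256 candidate hashes and the eviction policy are illustrative choices, not those of the underlying BatchPIR constructions:

```python
# Hedged sketch of client-side cuckoo hashing (Sections 3.1 / 5.2): place each
# real DB index in one of its w candidate buckets, at most one index per
# bucket; buckets left as None are later filled with a dummy query.
import hashlib, random

def bucket_candidates(index, num_buckets, w=3):
    """The w candidate buckets for a DB index (tagged SHA-256; illustrative only)."""
    return [int.from_bytes(hashlib.sha256(f"{h}:{index}".encode()).digest(), "big") % num_buckets
            for h in range(w)]

def cuckoo_assign(indices, num_buckets, w=3, max_kicks=500):
    buckets = [None] * num_buckets
    for idx in indices:
        cur = idx
        for _ in range(max_kicks):
            free = next((b for b in bucket_candidates(cur, num_buckets, w)
                         if buckets[b] is None), None)
            if free is not None:
                buckets[free] = cur
                cur = None
                break
            victim = random.choice(bucket_candidates(cur, num_buckets, w))
            buckets[victim], cur = cur, buckets[victim]   # evict and re-insert the displaced index
        if cur is not None:
            raise RuntimeError("cuckoo insertion failed; retry with fresh hash seeds")
    return buckets

# Example: l = 8 real indices into B ~ 1.5 * l buckets.
real_indices = [3, 17, 42, 99, 1234, 5678, 90210, 424242]
assignment = cuckoo_assign(real_indices, num_buckets=12)
assert sorted(x for x in assignment if x is not None) == sorted(real_indices)
```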
### 5.3 Server Processing

1. **Oblivious decompression (requests):**
   * Run LSObvDecompress.ObvDecompress to expand the compressed query into `1.5ℓ` logical bucket queries, without learning which are real.
2. **Bucket PIRs:**
   * For each bucket, run vectorized PIR over its encoded DB subset to produce a bucket response ciphertext with (at most) one non‑zero slot.
   * Merge per‑bucket responses into a small set of vectorized ciphertexts (as in Vectorized BatchPIR).
3. **Oblivious compression (responses):**
   * Apply LSObvCompress.ObvCompress to the vector of bucket responses to shrink from `1.5ℓ` logical responses to ≈ `(1+ε)ℓ` ciphertexts.
4. **Send response:**
   * Return the compressed, vectorized ciphertexts + the current UBT root + the MPT→UBT proof for that root.

### 5.4 Client Verification

1. **Decrypt & decompress:**
   * Decrypt the compressed response.
   * Run LSObvCompress.Decompress to recover the full set of (index, value) pairs for all requested nodes.
2. **Rebuild Merkle proofs:**
   * For each high‑level state value `x_j`:
     * reconstruct its leaf value from the corresponding UBT leaf node;
     * collect all siblings along the path;
     * recompute the UBT root.
3. **Root checking:**
   * Verify each recomputed UBT root equals the root included in the response.
   * Verify the UBT root links to the on‑chain MPT root via the MPT→UBT proof.
4. **Output:**
   * If all checks pass, return the decrypted state values to the application (wallet, verifier, SNARK circuit, etc.).

The server never learns which `(stem, subindex)` pairs (hence which accounts/slots/code chunks) were actually used to assemble the Merkle proofs.

---

## 6. Efficiency Snapshot (for orientation)

Concrete numbers obviously depend on parameter choices and implementation, but the underlying primitives support the following ballpark:

* **Vectorized BatchPIR (no compression):**
  * DB: 1M entries, entry size 256 B; batch ℓ = 256.
  * Amortized server time ≈ 123 ms per entry.
  * Total communication ≈ 19× plaintext size (client + server).
* **With LSObvCompress/Decompress:**
  * Request and response layers shrink by ≈ 30% in terms of ciphertext count for the cuckoo buckets, for both small and large entries.

On top of that, the UBT’s binary structure keeps **state maintenance** cheap:

* Updating one Ethereum state slot touches only `O(log N)` ≈ 24 nodes (for `N ≈ 2²⁴`) in the flattened DB and a corresponding number of hashes.
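For concreteness, the sketch below shows the root recomputation from Section 5.4, which is also why a single-slot update touches only ~24 nodes: one walk from leaf to root, rehashing one sibling per level. The plain `hash(left + right)` SHA‑256 rule is an assumption for illustration; EIP‑7864 defines its own node hashing.

```python
# Structural sketch of the Section 5.4 verification walk over a depth-24
# binary Merkle path. Hashing rule is a stand-in, not the EIP-7864 one.
import hashlib

def hash_pair(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()

def recompute_root(leaf_hash: bytes, path_bits, siblings) -> bytes:
    """path_bits / siblings are ordered leaf -> root; bit 0 means 'current node is the left child'."""
    node = leaf_hash
    for bit, sibling in zip(path_bits, siblings):
        node = hash_pair(node, sibling) if bit == 0 else hash_pair(sibling, node)
    return node

# A depth-24 proof carries 24 sibling hashes = 768 bytes, matching Section 1.1.
DEPTH = 24
leaf = hashlib.sha256(b"leaf value").digest()
siblings = [hashlib.sha256(bytes([d])).digest() for d in range(DEPTH)]
root = recompute_root(leaf, [0] * DEPTH, siblings)
assert len(b"".join(siblings)) == 768
```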