# Detailed Note: Rapidly Verifiable XMSS Signatures ## 1. XMSS (Extended Merkle Signature Scheme) ### 1.1 Hash-Based Cryptography Context Hash-based cryptography provides a secure and reliable way of encrypting, decrypting, and verifying digital signatures, offering cryptographic solutions that can withstand attacks from quantum computers. * **Foundation:** It relies on the cryptographic strength of hash functions and pseudo-random functions, which are well-understood and not known to be broken by quantum computers. * **Mechanism:** * Establishes a mutual secret key using a key agreement protocol relying on hash functions. * Uses a Merkle signature scheme to sign and verify messages using the shared secret key. * The Merkle signature scheme uses a hash tree to generate a signature secure against both classical and quantum attacks. ### 1.2 Stateful vs. Stateless Hash-based signatures are divided into two categories: * **Stateless:** (e.g., SPHINCS) Do not require keeping track of state but often have larger signature sizes. * **Stateful:** (e.g., XMSS) Requires the storage of a secret-key state; the signing key changes after every signature. * **Crucial Constraint:** If an attacker obtains signatures of two different messages created using the same One-Time Signature (OTS) key, it becomes computationally feasible to create forgeries. Therefore, the state must be updated (index incremented) after every signature. ### 1.3 Structure of XMSS XMSS combines the Winternitz One-Time Signature (WOTS+) with a Merkle Tree structure. * **Merkle Tree:** A binary hash tree of height $h$. The root node is the public key, and the leaves are the OTS key pairs. * **WOTS+ (Winternitz One-Time Signature Plus):** * Used for the leaves of the tree to sign the actual messages. * Parameters: * $n$: Security parameter (message digest length). * $w$: Winternitz parameter (trade-off between speed and size). * $l$: The total number of blocks in a signature ($l_1$ message chains + $l_2$ checksum chains). ### 1.4 Evolution: From Classical WOTS to WOTS+ Classical WOTS hash chains were deterministic and unkeyed, introducing several security issues: 1. **Chain Splicing / Multi-target Attacks:** * If an attacker observes multiple signatures, they may collect partial hash chains. * Because chains are not randomized, these partial chains can sometimes be combined. * This weakens security in multi-signature or adaptive settings. 2. **Unsustainability for Tree-based Schemes:** * When WOTS is used as a leaf signature in a Merkle tree, an attacker sees many WOTS public keys and signatures. * The lack of per-step domain separation becomes exploitable. **The WOTS+ Solution:** WOTS+ fixes these problems by introducing randomness to every step of the hash chain. * **Keyed Hashing:** Instead of hashing directly $x \to H(x)$, WOTS+ hashes using a bitwise XOR with a random value $r$ (and domain separators): $$x \to H(x \oplus r)$$ * **Role of $r$:** $r$ is a public, deterministic random value unique to each chain index and step index. This ensures domain separation and mitigates multi-target attacks. ## 2. Optimize XMSS Verification ### 2.1 The Core Goal The primary question addressed is: *"How can we maximize the verification speed of XMSS on embedded devices?"*. * **Context:** In IoT and embedded applications (e.g., vehicle-to-vehicle communication, secure boot), verification happens much more frequently than signing. * **Resource Constraints:** Embedded devices (like ARM Cortex-M4) are constrained in memory and computing power, making verification efficiency critical. ### 2.2 PZMCM / Winternitz Tuning The optimization relies on a technique proposed by PZMCM (Perin et al.), which exploits the probabilistic nature of WOTS+ chain lengths. * **Cost Analysis:** The cost of verifying a WOTS+ signature is largely determined by the values of the $l_1$ integers $h_1, \dots, h_{l_1}$ derived from the message digest. * Verification requires completing the hash chains. * The number of hash computations is $\sum (w - 1 - h_i)$. * **Key Insight:** As $h_i$ increases (the value in the signature is higher), the work remaining for the verifier *decreases*. * **The Trade-off:** * We can loop over $T$ counters and append them to the message: $M_{ctr} = H_{msg}(ctr, message)$. * We search for the counter that maximizes the cumulative chain length (sum of $h_i$). * This reduces the number of steps the verification would take, **hence reducing the verification time**. * *Note:* There are about a factor of 10 fewer checksum chains than message chains, so optimizing message chains yields significant gains. ### 2.3 Rapidly Verifiable XMSS Algorithm To implement this efficiently without exploding the signing time, the algorithm uses an iterative approach that stores the internal state of the hash function. **Algorithm Logic:** Instead of re-hashing the entire message for every counter attempt ($T$ times), we: 1. Compute the hash state for the static part of the message (Prefix + Message Body). 2. Store this `intermediate_state`. 3. Only process the final block (containing the counter) $T$ times. This makes the added signing cost independent of the message length. #### Python Implementation *This implementation replaces the standard signing routine to include the iterative counter search.* ```python import math def XMSS_sign_iterative(M, SK, t, n=32): """ Generates an XMSS signature optimized for verification speed using an iterative counter search. Parameters: M (bytes): The message to be signed. SK (object): The XMSS private key structure. t (int): The exponent for the number of counters to test (Total T = 2^t). n (int): Security parameter (byte length of message digest, e.g., 32 for SHA-256). Returns: tuple: (Updated SK, Signature Sig, best_counter ctr) """ # 1. Retrieve and update the signature index from the secret key idx_sig = getIdx(SK) setIdx(SK, idx_sig + 1) # 2. Initialize Address structure (set to zero) ADRS = toByte(0, 32) # 3. Generate randomness 'r' using a PRF # Derived from part of the SK and the current index sk_prf = getSKPRF(SK) r = PRF(sk_prf, toByte(idx_sig, 32)) # 4. Optimization Setup: Pre-calculate Hash State best_length = -1 # Calculate number of full 512-bit (64-byte) blocks to process once. # Formula: 2 (header blocks) + floor(ceil(len(M))/64) # Note: The '2' covers the prefix data (r, Root, idx) msg_len_bytes = len(M) # equivalent to ceil(log2(M)/8) num_blocks = 2 + math.floor(msg_len_bytes / 64) # Construct the input prefix: r || Root || index || Message # This input is static except for the final block containing the counter input_stream = ( r + getRoot(SK) + toByte(idx_sig, n) + M ) # Compute and store the intermediate SHA-256 state # This avoids re-hashing the large message body for every counter iteration intermediate_state = sha256_inc_blocks(input_stream, num_blocks) # 5. Iterative Search for Best Counter # Loop through 2^t possible counters to find the one producing the fastest-to-verify hash best_ctr = 0 final_h = None for i in range(0, 2**t): # Preserve the pre-calculated state before modifying it temp_state = intermediate_state.copy() # Prepare the final input block: Last part of Message || counter i current_input_block = last_block(M + toByte(i, 8)) # Finalize the hash computation using the preserved state # h is the candidate message digest h = sha256_inc_finalize(temp_state, current_input_block, 1) # Calculate the "length" (cost) of verifying this specific hash # Higher length = faster verification (less hashing required for WOTS+) new_length = wots_getlengths(h) # If this hash is faster to verify than previous best, keep it if new_length > best_length: best_length = new_length best_ctr = i final_h = h M_prime = h # The message digest to sign is the best hash found # 6. Generate the standard XMSS signature using the optimized message digest # treeSig generates the WOTS+ signature and authentication path signature_body = treeSig(M_prime, SK, idx_sig, ADRS) # Construct the final signature bundle: index || randomness || signature Sig = toByte(idx_sig, n) + r + signature_body # Return updated key, signature, and the optimized counter return (SK, Sig, best_ctr) ``` ### 2.4 RFC Compatibility * To maintain compatibility with RFC 8391, the counter `ctr` is appended to the message input rather than altering the internal hash structure. * The modified line in the signing algorithm becomes: `byte[n] M' <- Hmsg(r || getRoot(SK) || toByte(idx_sig, n), (M || ctr))`. **Benefit:** A standard RFC-compliant verifier can verify the signature as long as they are aware the message has an appended counter (which can be handled via a wrapper function). ### 2.5 Performance Analysis **Signing Time:** By storing the hash state, the time added to signing is independent of message size. For a 100KB message and , signing time drops from over 3 hours (without optimization) to just 14 seconds. * **Verification Speed:** * Using counters yields a ~20% speedup. * Using (approx. 1 minute of signing time) creates signatures that are verified 1.44 times faster than standard signatures. * Combining this with Time-Memory Trade-Off (TMTO) optimizations on an ARM Cortex-M4 reduces verification cycles from 13.85 million to 6.56 million (over a factor of 2 reduction). ### 2.6 Security * The PZMCM technique (searching for a specific hash) might introduce bias, but formal analysis proves that as long as the hash function behaves like a random function, security does not significantly degrade. * The parameters listed in the RFC still achieve the same level of security with this tuning applied. **Rust Implemenation:** https://github.com/developeruche/cryptography-n-zk-research/tree/main/digitial-signatures/xmss-sha256 ### Raw hand written notes; *This for personal knowledge graphing context** ![04-02-2026, 02:58 Microsoft Lens 4](https://hackmd.io/_uploads/Hkh3JVxP-e.jpg) ![04-02-2026, 02:58 Microsoft Lens(1) 3](https://hackmd.io/_uploads/r12nkEgDZe.jpg) ![04-02-2026, 02:58 Microsoft Lens(2) 2](https://hackmd.io/_uploads/ryn2J4xPZg.jpg) ![04-02-2026, 02:58 Microsoft Lens(3) 1](https://hackmd.io/_uploads/H13n1Egwbl.jpg)