ACDK25-On the Simulation-Extractability of Proof-Carrying Data

# [ACDK25] On the Simulation-Extractability of Proof-Carrying Data **Authors:** Behzad Abdolmaleki, Matteo Campanelli, Quang Dao, Hamidreza Khoshakhlagh **Venue:** IACR ePrint Archive 2025/2037 **Report Date:** November 2025 --- [TOC] ## Executive Summary :::info **Key Finding** This paper presents the first systematic study of simulation extractability for proof-carrying data systems, introducing tailored security definitions that accommodate recursive proof composition and proving that major construction templates satisfy these definitions under appropriate conditions. ::: This work addresses a critical gap in the security analysis of recursive proof systems by establishing formal foundations for reasoning about non-malleability properties. The research introduces a refined notion of simulation extractability specifically designed for proof-carrying data and incrementally verifiable computation, then demonstrates that the primary construction templates for these systems achieve this security property when their underlying components satisfy natural requirements. The results validate that many deployed proof-carrying data systems already possess strong non-malleability guarantees by design, providing formal security foundations for systems currently processing billions of dollars in cryptocurrency transactions and supporting critical blockchain infrastructure. ## Background and Motivation Proof-carrying data (PCD) has emerged as a transformative paradigm for verifying distributed computations among mutually distrusting parties. The technology enables a prover to proceed incrementally with very low memory footprint, making it particularly attractive for scalability solutions compared to monolithic proof systems that require access to the entire circuit trace. Recent advances in accumulation-based and folding-based recursion have made proof-carrying data increasingly practical, with current deployment in real-world applications including cryptocurrency protocols such as Mina and Lurk, as well as zero-knowledge virtual machines. However, when proof schemes operate within larger protocols, basic security properties such as knowledge soundness and zero-knowledge prove insufficient for compositional security. The stronger property of simulation extractability becomes essential in these settings, requiring that knowledge extraction succeeds even when a malicious prover $\mathcal{A}$ can request simulated proofs $\Pi_{\text{sim}}$ for arbitrary statements from a simulator $\mathcal{S}$ with access to trapdoor $td$. This property directly implies non-malleability, preventing adversaries from successfully tampering with valid proofs to create different valid proofs without knowing the underlying witnesses $w$. :::warning **Real-World Impact** Historical analysis reveals that over 300,000 Bitcoin (valued at billions of dollars) have been involved in malleability-related attacks on the Bitcoin network, demonstrating that these security concerns extend far beyond theoretical considerations. ::: The existing literature has established simulation extractability for numerous non-recursive zero-knowledge succinct non-interactive arguments of knowledge, including Bulletproofs, Spartan, Sonic, PLONK, Marlin, Lunar, and Basilisk. Nevertheless, no prior work has formalized $\text{SIM-EXT}$ for recursive systems like proof-carrying data or incrementally verifiable computation, where proofs nest and interdepend across multiple layers forming complex directed acyclic graph structures. This paper fills that fundamental gap. ## Main Contributions ┌─────────────────────────────────┐ │ Contribution 1: │ │ New Definition │ │ Partial Compliance for PCD │ │ SIM-EXT │ └──────────────┬──────────────────┘ │ ┌─────────────────┴─────────────────┐ │ │ ▼ ▼ ┌──────────────────────────┐ ┌──────────────────────────┐ │ Contribution 2: │ │ Contribution 3: │ │ SIM-EXT SNARK │ │ SIM-EXT NARK + │ │ ⟹ SIM-EXT PCD │ │ KS Accumulation │ │ │ │ ⟹ SIM-EXT PCD │ └──────────────┬───────────┘ └──────────────┬───────────┘ │ │ ▼ ▼ ┌──────────────────────────┐ ┌──────────────────────────┐ │ Instantiations: │ │ Instantiations: │ │ • Groth-Maller │ │ • Nova-based systems │ │ • Fractal │ │ │ └──────────────────────────┘ └──────────────────────────┘ ### Defining Simulation Extractability for Proof-Carrying Data The paper's first major contribution addresses a fundamental definitional challenge stemming from the observation that standard simulation extractability does not directly transfer to recursive settings. The core difficulty emerges from the interaction between simulated proofs and honest proof generation when proofs compose recursively across multiple layers. ```mermaid graph LR subgraph "Standard SIM-EXT (Flat Proofs)" A1[Adversary A] -->|produces π*| A2[Statement x*] A3[Simulator S] -.->|simulated proofs| A1 A2 --> A4{Extractor E} A4 -->|MUST extract| A5[Complete witness w*] A4 -.->|❌ fails if simulated| A6[Cannot handle simulated proofs] style A6 fill:#ffe6e6 style A5 fill:#e1f5e1 end subgraph "PCD SIM-EXT (Recursive Proofs)" B1[Adversary A] -->|Π* for z₀⇝z'| B2[Final output] B3[Simulator S] -.->|sim proof z₀⇝zd| B1 B1 -->|honest proof zd⇝z'| B2 B2 --> B4{Extractor E} B4 -->|✓ extracts| B5[Witnesses for honest segment] B4 -.->|✓ stops at| B6[Simulated proof partial compliance] style B5 fill:#e1f5e1 style B6 fill:#fff4e6 end ``` Consider an incrementally verifiable computation scenario represented by a chain $z_0 \rightsquigarrow z_1 \rightsquigarrow \cdots \rightsquigarrow z_d \rightsquigarrow \cdots \rightsquigarrow z_{d+k}$ where each transition $z_i \rightsquigarrow z_{i+1}$ applies a function $F$ with local witness $w_i$ satisfying $z_{i+1} = F(z_i, w_i)$. An adversary $\mathcal{A}$ first requests from simulator $\mathcal{S}$ a proof $\Pi_{\text{sim}}$ for the statement $z_0 \rightsquigarrow z_d$, then honestly continues the computation from $z_d$ to $z_{d+k}$ using witnesses $w_d, \ldots, w_{d+k-1}$ that $\mathcal{A}$ actually possesses. The adversary returns the final proof $\Pi^*$ claiming $z_0 \rightsquigarrow z_{d+k}$. :::danger **The Extraction Dilemma** An extractor $E$ facing this proof must determine whether it can extract the complete witness sequence $w_0, \ldots, w_{d+k-1}$. While extraction should succeed for the honest segment $z_d \rightsquigarrow z_{d+k}$, extraction necessarily fails for the simulated segment $z_0 \rightsquigarrow z_d$, even if witnesses $w_0, \ldots, w_{d-1}$ theoretically exist but remain computationally infeasible to find. ::: ```mermaid graph LR subgraph "Simulated Segment (from S)" z0[z₀] -->|F, w₀| z1[z₁] z1 -->|F, w₁| z2[z₂] z2 -->|"..."| zd[zd] style z0 fill:#ffe6e6 style z1 fill:#ffe6e6 style z2 fill:#ffe6e6 style zd fill:#ffe6e6 end subgraph "Honest Segment (known to A)" zd -->|F, wd| zd1[zd+1] zd1 -->|F, wd+1| zd2[zd+2] zd2 -->|"..."| zf[z'] style zd1 fill:#e1f5e1 style zd2 fill:#e1f5e1 style zf fill:#e1f5e1 end Extract1["❌ Cannot extract w₀, w₁, ..., wd-1"] -.-> z0 Extract2["✓ Can extract wd, wd+1, ..."] -.-> zd1 style Extract1 fill:#fff style Extract2 fill:#fff ``` Standard simulation extractability would demand extraction of all witnesses $w_0, \ldots, w_{d+k-1}$ throughout the entire chain. However, this requirement proves impossibly strong because witnesses corresponding to simulated proofs may exist abstractly without being computationally accessible to any polynomial-time extractor. The simulator $\mathcal{S}$ generates valid proofs using trapdoor $td$ precisely without needing to compute these witnesses. The paper resolves this through Definition 4.2 introducing the concept of partial compliance. A vertex $u \in V(\text{PCT})$ achieves $(V_{pp}, Q_{\mathcal{S}})$-partial compliance when for all outgoing edges $e = (u,v)$ labeled $(z^{(e)}, \Pi^{(e)})$, the standard compliance conditions hold unless $(\Psi_u, z^{(e)}, \Pi^{(e)}) \in Q_{\mathcal{S}}$ where $Q_{\mathcal{S}}$ denotes the set of tuples returned by the simulation oracle. Specifically, when the outgoing proof does not appear in the simulation log, both the local data must satisfy $$\Psi_u(z^{(e)}, w_{\text{loc}}^{(u)}, z^{(e_1)}, \ldots, z^{(e_{m'})}) = 1$$ and the verifier must accept via $V_{pp}(\Psi_u, z^{(e)}, \Pi^{(e)}) = 1$ for incoming edges $e_1, \ldots, e_{m'}$ to vertex $u$. The transcript $\text{PCT}$ achieves partial compliance when every vertex satisfies this relaxed requirement. ```mermaid graph TD A[Adversary outputs proof Π*] --> B{Is Π* simulated?} B -->|No| C[Extractor must produce witnesses] B -->|Yes| D[Game returns 0: adversary wins trivially] C --> E{Recursively extract from prior proofs} E --> F{Is prior proof simulated?} F -->|No| G[Continue extraction recursively] F -->|Yes| H[Stop extraction: partial compliance satisfied] G --> E H --> I[Output partially compliant transcript] ``` The simulation extractability game for proof-carrying data, formally defined in Figure 9 of the paper, then requires that for any adversary $\mathcal{A}$ producing a non-simulated accepting proof $(\Psi^*, o^*, \Pi^*)$ where $\Psi^* \in \mathcal{F}$, $(\Psi^*, o^*, \Pi^*) \notin Q_{\mathcal{S}}$, and $V(ivk, o^*, \Pi^*) = 1$, there exists an extractor $E_\mathcal{A}$ outputting a transcript $\text{PCT}^*$ satisfying both $(V_{pp}, Q_{\mathcal{S}})$-partial compliance and the output matching condition $o(\text{PCT}^*) = o^*$. The advantage is defined as $$\text{Adv}^{\text{PCD-SE}}_{\mathcal{A}, \mathcal{S}, E_\mathcal{A}, D}(\lambda) := \Pr[\text{PCD-SE}^{\mathcal{A}, \mathcal{S}, E_\mathcal{A}}_{\mathsf{PCD}, \mathcal{F}, D}(1^\lambda) = 1] \leq \text{negl}(\lambda)$$ where the game returns $1$ when the verifier accepts a non-simulated proof but the extractor fails to produce a partially compliant transcript with matching output. This definitional innovation captures the intuition that extraction should proceed as far as possible through honest proof generation, halting only at genuinely simulated components or at base cases of the computation. The approach balances rigorous security requirements with technical feasibility, preventing trivial impossibility results while maintaining meaningful non-malleability guarantees that suffice for protocol composition. ### Simulation Extractability from Succinct Arguments The second contribution, formalized in Theorem 5.2, establishes that simulation-extractable succinct non-interactive arguments of knowledge yield simulation-extractable proof-carrying data when used recursively according to the classical template. This result provides a direct path from well-understood flat proof systems to secure recursive systems with provable non-malleability properties. The construction follows the template introduced by Bitansky, Canetti, Chiesa, and Tromer where proof-carrying data emerges from a succinct argument $\mathsf{ARG} = (G, I, \mathcal{P}, V)$ capable of efficiently verifying its own verification circuit $V^{(\lambda, N, k)}$. For a compliance predicate $\Psi$, the construction defines a recursion circuit $R^{(\lambda, N, k)}_{V, \Psi, pp}$ that takes instance $(ivk, z)$ and witness $(w_{\text{loc}}, (z_i, \pi_i)_{i \in [m]})$ and outputs $1$ precisely when both conditions hold: the compliance predicate accepts via $\Psi(z, w_{\text{loc}}, z_1, \ldots, z_m) = 1$, and for all non-trivial incoming edges where $(z_i, \pi_i) \neq \bot$, the succinct argument verifier accepts via $V^{(\lambda, N, k)}(ivk, (ivk, z_i), \pi_i) = 1$. ```mermaid graph TD subgraph "PCD from SNARK: Recursive Composition" A[Input: z, wloc, prior proofs πi] B[Recursion Circuit RV,Ψ,pp] C[SNARK Prover P] D[Output: PCD proof Π = π] A --> B B -->|"Check Ψ(z,wloc,zi) = 1"| B1[Compliance] B -->|"Check V(ivk,zi,πi) = 1"| B2[Prior proofs valid] B1 --> C B2 --> C C --> D end subgraph "Verification" E[PCD Verifier V] F[SNARK Verifier V] E -->|"delegates to"| F F -->|"verifies recursion circuit"| G[Accept/Reject] end D -.->|"proof π"| E style B fill:#e3f2fd style C fill:#f3e5f5 style F fill:#f3e5f5 ``` At each step, the prover demonstrates compliance by generating $\pi \leftarrow \mathcal{P}(ipk, (ivk, z), (w_{\text{loc}}, (z_i, \pi_i)_{i \in [m]}))$ where the succinct argument proves the recursive relation defined above. The proof-carrying data proof consists simply of the succinct argument proof $\Pi := \pi$, and the proof-carrying data verifier invokes the succinct argument verifier via $V(ivk, (ivk, z), \pi)$. :::success **Theorem 5.2 (SIM-EXT SNARK ⇒ SIM-EXT PCD)** If the underlying succinct argument $\mathsf{ARG}$ satisfies simulation extractability with simulator $\mathcal{S}$ and extractor family $\{E_\mathcal{A}\}$, then the recursive proof-carrying data construction $\mathsf{PCD}$ satisfies simulation extractability with corresponding simulator and extractor. Furthermore, if $\mathsf{ARG}$ admits black-box or straightline extraction, the same property transfers to $\mathsf{PCD}$. ::: The security reduction constructs a sequence of succinct argument adversaries corresponding to each potential node in the computation graph induced by the proof-carrying data adversary's output. The root adversary $\mathcal{A}_{\text{root}}$ simulates the proof-carrying data adversary $\mathcal{A}$, translating simulation queries via the mapping $$\tilde{\mathcal{S}}(\Psi, z) \mapsto \mathcal{S}(R_\Psi, z)$$ where $R_\Psi$ denotes the recursion circuit corresponding to predicate $\Psi$. When $\mathcal{A}$ outputs $(\Psi^*, o^*, \Pi^* = \pi^*)$, the root adversary outputs $(R_{\Psi^*}, o^*, \pi^*)$ interpreted as a succinct argument proof for the recursive relation. ```mermaid graph TD A[PCD Adversary A] --> B[Root SNARK Adversary A_root] B -->|"uses E_root"| C[Extractor E_0] C -->|"produces PCT_0"| D[Depth-0 transcript] D --> E[Layer 1: SNARK Adversary A_1] E -->|"uses E_A1"| F[Extractor E_1] F -->|"extends to PCT_1"| G[Depth-1 transcript] G --> H[Layer 2: SNARK Adversary A_2] H -->|"uses E_A2"| I[Extractor E_2] I -->|"extends to PCT_2"| J[Depth-2 transcript] J -.->|"continue..."| K[...] K -.-> L[Final: Extractor E_d = E_A] L --> M[Complete partially compliant PCT] style C fill:#e3f2fd style F fill:#e3f2fd style I fill:#e3f2fd style L fill:#f3e5f5 ``` The extraction process proceeds recursively by layers, constructing extractors $E_0, E_1, \ldots, E_d$ for transcripts of increasing depth. The base extractor $E_0$ applies the succinct argument extractor $E_{\mathcal{A}_{\text{root}}}$ to obtain the witness $$w^* = (w_{\text{loc}}^*, [(z_i, \pi_i)]_{i=1}^{m'}) \leftarrow E_{\mathcal{A}_{\text{root}}}(\text{Trans}_{\mathcal{A}_{\text{root}}})$$ representing the local data and immediate predecessors in the computation. This yields a partial proof-carrying transcript $\text{PCT}_0$ with a single root node labeled $(o^*, \Pi^*)$ and $m'$ outgoing edges labeled $(z_i, \Pi_i = \pi_i)$ respectively. For each subsequent layer $k$, the process defines a new succinct argument adversary $\mathcal{A}_k$ that identifies each non-simulated leaf edge $(z_k, \Pi_k = \pi_k)$ in $\text{PCT}_{k-1}$ and outputs $(R_{\Psi_k}, z_k, \pi_k)$. The corresponding extractor $E_{\mathcal{A}_k}$ yields witness $w_k = (w_{\text{loc},k}, [(z_{\text{child}}, \pi_{\text{child}})])$ for the relation $R_{\Psi_k}$ at instance $z_k$. The extractor then extends $\text{PCT}_{k-1}$ by adding local witness $w_{\text{loc},k}$ to node $k$ and creating new child nodes for edges $(z_{\text{child}}, \Pi_{\text{child}} = \pi_{\text{child}})$, yielding $\text{PCT}_k$. The overall extractor $E_\mathcal{A}$ executes this recursive process until no further non-simulated leaf nodes require expansion. The success probability analysis employs a hybrid argument across extraction layers. Let $N$ denote the number of nodes in the final extracted transcript for which extraction is attempted. The probability that extraction fails at some node is bounded by the union bound over all potential extraction points: $$\text{Adv}^{\text{PCD-SE}}_{\mathcal{A}, \mathcal{S}, E_\mathcal{A}}(\lambda) \leq \sum_{k \in \text{extracted nodes}} \text{Adv}^{\text{ARG-SE}}_{\mathcal{A}_k, \mathcal{S}, E_{\mathcal{A}_k}}(\lambda)$$ To relate this to a single succinct argument adversary, construct $\mathcal{A}'$ that internally simulates $\mathcal{A}$, randomly selects a node $k$ from the extraction path, and executes the logic of $\mathcal{A}_k$. This reduction yields $$\text{Adv}^{\text{PCD-SE}}_{\mathcal{A}, \mathcal{S}, E_\mathcal{A}}(\lambda) \leq N \cdot \text{Adv}^{\text{ARG-SE}}_{\mathcal{A}', \mathcal{S}, E_{\mathcal{A}'}}(\lambda)$$ Since $\text{Adv}^{\text{ARG-SE}}_{\mathcal{A}', \mathcal{S}, E_{\mathcal{A}'}}$ remains negligible by the succinct argument's simulation extractability and $N$ is polynomially bounded because $\mathcal{A}$ and $E_k$ run in expected polynomial time, the overall advantage remains negligible. The preservation of extraction strength represents a subtle but important aspect of this result. When the underlying argument admits straightline extraction where the extractor $E$ runs the adversary exactly once without rewinding, this property transfers to the proof-carrying data construction. Recent work by Chiesa, Guan, Samocha, and Yogev has shown that straightline extraction enables tighter concrete security bounds for proof-carrying data that remain independent of recursion depth, a result that complements the simulation extractability analysis presented here. | Construction Template | Base SNARK | Resulting PCD | Extraction Type | |:---------------------|:-----------|:--------------|:----------------| | Cycles of Curves | Groth-Maller | SIM-EXT PCD | Black-box | | Fractal | Transparent IOP | SIM-EXT PCD | Straightline | The theorem yields immediate practical instantiations. Applying it to the Groth-Maller simulation-extractable succinct argument instantiated on cycles of pairing-friendly elliptic curves as in the Ben-Sasson, Chiesa, Tromer, and Virza construction produces a simulation-extractable proof-carrying data scheme suitable for deployment in settings requiring pairing-based cryptography. Similarly, the Fractal transparent recursive proof system, which achieves universal composability and hence simulation extractability as shown by Chiesa and Fenzi, generates a simulation-extractable proof-carrying data construction through this template with post-quantum security properties. ### Simulation Extractability from Arguments and Accumulation The third contribution, formalized in Theorem 5.5, addresses more lightweight proof-carrying data constructions that combine non-succinct arguments with accumulation schemes according to the template developed by Bünz, Chiesa, Lin, Mishra, and Spooner. This approach avoids the overhead of fully succinct arguments by using accumulation to efficiently aggregate verification checks, requiring only that the argument prove a relation of manageable size independent of recursion depth. The construction works by proving at each step a recursive relation $R^{(\lambda, N, k)}_{V, \Psi, pp}$ that takes instance $(avk, z, \text{acc}.x)$ and witness $(w_{\text{loc}}, [z_i, \pi_i.x, \text{acc}_i.x]_{i \in [m]}, pf)$ and verifies three conditions. First, the compliance predicate must accept via $\Psi(z, w_{\text{loc}}, z_1, \ldots, z_m) = 1$. Second, for all non-trivial incoming edges, the accumulation verifier must accept via $$V^{(\lambda, m, N, k)}(avk, [qx_i]_{i=1}^m, [\text{acc}_i.x]_{i=1}^m, \text{acc}.x, pf) = 1$$ where $qx_i := ((avk, z_i, \text{acc}_i.x), \pi_i.x)$ represents the instance part of the accumulated predicate input. Third, the new accumulator must pass the decider check via $D(dk, \text{acc}) = 1$. ![IVC from NARK and Accumulation Schemes](https://hackmd.io/_uploads/BJnFPRK1Wg.png) *Figure 1: Architecture showing how NARK and accumulation scheme components compose into a full PCD system. The IVC prover uses the compliance predicate, accumulation prover, and argument prover, while the IVC verifier delegates to the argument verifier and accumulation decider.* The proof-carrying data proof takes the form $\Pi = (\pi_{\text{ARG}}, \text{acc})$ where $\pi_{\text{ARG}} \leftarrow \mathcal{P}(ipk, (avk, z, \text{acc}.x), (w_{\text{loc}}, [z_i, \pi_i.x, \text{acc}_i.x]_{i \in [m]}, pf))$ represents the argument proof certifying the recursive relation, and $\text{acc} = (\text{acc}.x, \text{acc}.w)$ represents the new accumulator produced via $$(\text{acc}, pf) \leftarrow \mathcal{P}(apk, [(qx_i, qw_i)]_{i=1}^m, [\text{acc}_i]_{i=1}^m)$$ where $qw_i = (\text{acc}_i.w, \pi_i.w)$ contains the witness components. The verifier accepts when both $V(ivk, (avk, z, \text{acc}.x), \pi_{\text{ARG}}) = 1$ and $D(dk, \text{acc}) = 1$ hold simultaneously. :::success **Theorem 5.5 (NARK + Accumulation ⇒ SIM-EXT PCD)** Let $\mathsf{ARG}$ be zero-knowledge with simulator $\mathcal{S}$ and $\mathsf{AC}$ be zero-knowledge with simulator $\mathcal{S}'$. If $\mathsf{ARG}$ satisfies simulation extractability and $\mathsf{AC}$ satisfies knowledge soundness with collision resistance, then the proof-carrying data construction $\mathsf{PCD}$ satisfies simulation extractability. ::: The security reduction for this template presents additional challenges compared to the succinct argument case due to potential mix-and-match attacks where adversaries combine components from different simulation queries. A naive approach attempting to reduce directly to simulation extractability of both the argument and the accumulation scheme proves insufficient. The paper resolves this through a more sophisticated analysis requiring only simulation extractability of the argument and knowledge soundness plus collision resistance of the accumulation scheme. The collision resistance property ensures that the accumulator instance $\text{acc}.x$ cryptographically binds to the accumulator witness $\text{acc}.w$, typically achieved by having $\text{acc}.x$ serve as a collision-resistant hash of $\text{acc}.w$. This binding prevents adversaries from maintaining a fixed $\text{acc}.x$ while swapping different $\text{acc}.w$ values across multiple proofs. Combined with the fact that $\text{acc}.x$ appears in the argument instance $(avk, z, \text{acc}.x)$, this ensures that the argument proof $\pi_{\text{ARG}}$ binds to the complete accumulator $(\text{acc}.x, \text{acc}.w)$. ```mermaid graph TD subgraph "Dual Extractor Strategy" A[PCD Adversary A at layer i] A --> B[ARG Adversary Ai] A --> C[AC Adversary A'i] B -->|"SIM-EXT extraction"| D[Extractor E_Ai] C -->|"KS extraction"| E[Extractor E_A'i] D -->|"extracts wloc, [zi,πi.x,acci.x], pf"| F[ARG witnesses] E -->|"extracts [qwi], [acci.w]"| G[AC witnesses] F --> H[Combine witnesses] G --> H H --> I[Extend PCT from layer i to i+1] end I -.->|"repeat for next layer"| A style D fill:#f3e5f5 style E fill:#fff3e0 style H fill:#e8f5e9 ``` The reduction constructs both an argument adversary $\mathcal{A}_i$ and an accumulation scheme adversary $\mathcal{A}'_i$ for each layer $i$ of the transcript. The argument adversary $\mathcal{A}_i$ simulates the proof-carrying data adversary while managing simulation queries carefully. To simulate the proof-carrying data simulator $\tilde{\mathcal{S}}$ on input $(\Psi, z)$, the argument adversary first computes $\text{acc} \leftarrow \mathcal{S}'(pp_{\mathsf{AC}}, pp, R)$ using the accumulation simulator, then obtains $\pi_{\text{ARG}} \leftarrow \tilde{\mathcal{S}}(pp, R, (avk, z, \text{acc}.x))$ from the argument simulation oracle, and finally returns $\Pi = (\pi_{\text{ARG}}, \text{acc})$ to the proof-carrying data adversary. For each node $u$ at depth $i$ with non-simulated proof, the argument adversary outputs $(R^{(\lambda, N, k)}_{V, \Psi^*, pp}, (avk, z^{(u)}, \text{acc}^{(u)}.x), \pi^{(u)}_{\text{ARG}}.x)$ where the proof splits as $\pi^{(u)}_{\text{ARG}} = (\pi^{(u)}_{\text{ARG}}.x, \pi^{(u)}_{\text{ARG}}.w)$. By the simulation extractability of the argument, there exists an extractor $E_{\mathcal{A}_i}$ producing witness $$(w^{(u)}_{\text{loc}}, [z^{(u)}_j, \pi^{(u)}_j.x, \text{acc}^{(u)}_j.x]_{j \in [m]}, pf^{(u)}) \leftarrow E_{\mathcal{A}_i}(\text{Trans}_{\mathcal{A}_i})$$ The accumulation scheme adversary $\mathcal{A}'_i$ then uses this extracted information by constructing predicate input instances $qx^{(u)}_j = ((avk, z^{(u)}_j, \text{acc}^{(u)}_j.x), \pi^{(u)}_j.x)$ and outputting $(R^{(\lambda, N, k)}_{V, \Psi^*, pp}, [qx^{(u)}_j]_{j \in [m]}, [\text{acc}^{(u)}_j.x]_{j \in [m]}, \text{acc}^{(u)}.x, pf^{(u)})$. The knowledge soundness of the accumulation scheme guarantees an extractor $E_{\mathcal{A}'_i}$ producing witness components $$([qw^{(u)}_j]_{j \in [m]}, [\text{acc}^{(u)}_j.w]_{j \in [m]}) \leftarrow E_{\mathcal{A}'_i}(\text{Trans}_{\mathcal{A}'_i})$$ where $qw^{(u)}_j = (\text{acc}^{(u)}_j.w, \pi^{(u)}_j.w)$ completes the witness information needed to extend the transcript. The overall extractor $E_{i+1}$ for layer $i+1$ combines these components by first running $E_{\mathcal{A}'_i}$ to obtain the complete witness structure, then extending the transcript $\text{PCT}_i$ by adding child nodes corresponding to the extracted incoming edges. The success probability analysis bounds the advantage by summing over all layers: $$\text{Adv}^{\text{PCD-SE}}_{\mathcal{A}, \mathcal{S}, E_\mathcal{A}}(\lambda) \leq \sum_{i=0}^{d-1} \left(\text{Adv}^{\text{ARG-SE}}_{\mathcal{A}_i, \mathcal{S}, E_{\mathcal{A}_i}}(\lambda) + \text{Adv}^{\text{AC-KS}}_{\mathcal{A}'_i, E_{\mathcal{A}'_i}}(\lambda)\right)$$ Since the argument satisfies simulation extractability and the accumulation scheme satisfies knowledge soundness, both terms remain negligible for each layer, yielding negligible overall advantage for any constant depth $d$. The critical insight enabling this approach is that the accumulation scheme need not satisfy the stronger simulation extractability property because the argument's simulation extractability already provides necessary protection against malleability at the proof level. The accumulation scheme only needs to guarantee through knowledge soundness that valid accumulators correspond to valid collections of predicate witnesses and prior accumulators, while collision resistance ensures proper binding between instance and witness components. This theorem carries significant practical implications for systems built using folding schemes and accumulation-based recursion. Schemes like Nova, when instantiated with simulation-extractable arguments for the rank-one constraint system relations, obtain simulation extractability for the full proof-carrying data system. The result also informs design of new systems by clarifying which components require the stronger simulation extractability property and which need only standard knowledge soundness, potentially enabling more efficient constructions through targeted optimization of security properties. ## Technical Approach and Methodology The proof techniques throughout the paper exhibit careful attention to the subtleties of recursive extraction in the expected polynomial-time model. A central challenge involves managing the composition of multiple extractors across layers while ensuring that the cascade preserves the expected polynomial-time property. The paper models all parties as non-uniform expected polynomial-time algorithms receiving polynomial-size advice $\text{ai}$ and having access to infinite random tapes such that average running time over randomness choices remains polynomial. The recursive extraction strategy consistently builds extractors layer by layer according to the recurrence $E_{i+1}(\text{Trans}) := \text{Extend}(E_i(\text{Trans}), E_{\mathcal{A}_i}(\text{Trans}), E_{\mathcal{A}'_i}(\text{Trans}))$ where the extension operation adds newly extracted nodes to the transcript. This compositional approach requires verification that running times satisfy $\mathbb{E}[T_{E_{i+1}}] \leq \mathbb{E}[T_{E_i}] + \mathbb{E}[T_{E_{\mathcal{A}_i}}] + \mathbb{E}[T_{E_{\mathcal{A}'_i}}] + \text{poly}(\lambda)$ to ensure the final extractor $E_\mathcal{A} := E_d$ runs in expected polynomial time. The success probability arguments employ hybrid techniques introducing intermediate games $H_0, H_1, \ldots, H_N$ where $H_0$ represents the real simulation extractability game and $H_k$ represents a game where the first $k$ extractions succeed but the $(k+1)$-th may fail. The difference between consecutive hybrids satisfies $$|\Pr[H_{k-1} = 1] - \Pr[H_k = 1]| \leq \text{Adv}^{\text{ARG-SE}}_{\mathcal{A}_k, \mathcal{S}, E_{\mathcal{A}_k}}(\lambda)$$ enabling the union bound to establish overall negligible advantage. The relationship between partial compliance and full compliance represents another technical subtlety requiring precise formalization. Partial compliance allows the extractor to halt at simulated proofs, formalized through the condition that compliance requirements apply only when $(\Psi_u, z^{(e)}, \Pi^{(e)}) \notin Q_{\mathcal{S}}$. However, the security game requires that the adversary's final output $(\Psi^*, o^*, \Pi^*)$ satisfy $(\Psi^*, o^*, \Pi^*) \notin Q_{\mathcal{S}}$, creating an asymmetry between what the adversary must produce and what the extractor must explain. This asymmetry proves essential for avoiding impossibility results while maintaining meaningful security. The treatment of auxiliary input distributions requires careful attention throughout security reductions. The knowledge soundness games for all primitives parameterize by auxiliary input distributions $D$ where for arguments we require $D(pp) \to (\text{ai})$, for accumulation schemes we require $D(pp_{\mathsf{AC}}) \to (pp_\Phi, \text{ai})$ with admissibility meaning $\{pp_\Phi : pp_\Phi \leftarrow H(1^\lambda)\} \approx \{pp'_\Phi : (pp'_\Phi, \cdot) \leftarrow D(1^\lambda)\}$, and for proof-carrying data we require $D(pp) \to (\text{ai})$. The reductions must carefully construct auxiliary input distributions for reduced problems that remain admissible with respect to original problems, particularly affecting the accumulation scheme analysis where the auxiliary input must simulate the proof-carrying data environment including transcripts of simulation queries. ## Significance and Future Directions This work establishes the first formal foundations for reasoning about non-malleability in recursive proof systems, providing security guarantees for deployed systems processing substantial financial value and supporting critical blockchain infrastructure. By demonstrating that the two major construction templates for proof-carrying data yield simulation-extractable systems under natural conditions on their components, the paper validates design choices made in practice and provides rigorous guidance for future system development. The definitional contribution creates a framework applicable beyond the immediate scope of proof-carrying data and incrementally verifiable computation. The partial compliance concept may extend to other settings where proofs compose in sophisticated ways, including incrementally verifiable computation with multiple heterogeneous proof systems operating in parallel, proof-carrying data with dynamic computation topologies that change during execution, or recursive compositions involving interactive protocols that transform into non-interactive form through Fiat-Shamir compilation. Each of these settings may require further refinement of the partial compliance concept to accommodate their specific structural properties. The technical results enable immediate practical application to security analysis of real-world systems. Developers of recursive proof systems can verify that their constructions achieve non-malleability by checking that underlying components satisfy appropriate properties and then applying the composition theorems. For the succinct argument template, this reduces to verifying simulation extractability of the base argument system. For the accumulation template, this requires verifying simulation extractability of the argument component and knowledge soundness plus collision resistance of the accumulation component. This modular approach significantly simplifies security analysis compared to monolithic reasoning about the full recursive construction. :::info **Open Directions** - Extension to online extraction models where the extractor operates during adversary execution - Analysis of concrete security bounds relating proof-carrying data advantage to component advantages with explicit constants - Investigation of simulation extractability for folding schemes with custom constraint systems beyond rank-one constraint systems - Development of simulation extractability notions for proof-carrying data with more complex graph structures ::: The observation about accumulation scheme zero-knowledge, while perhaps appearing secondary, actually simplifies both theoretical analysis and practical implementation substantially. Simulator implementations need not handle oracle reprogramming through maintaining tables of prior queries and responses, and security proofs avoid the associated technical complexity of arguing that reprogramming remains consistent across multiple invocations. This makes accumulation-based constructions more accessible to practitioners and easier to analyze for cryptographers developing new schemes. Looking forward, several research directions emerge as natural extensions of this work. Extending the results to more general extraction models, particularly online extraction where the extractor operates during adversary execution rather than post-hoc on complete transcripts, would strengthen security guarantees and potentially enable tighter bounds. Analyzing simulation extractability for specific novel constructions such as HyperNova with customizable constraint systems or Mangrove with multiple folding layers would expand the set of systems with proven non-malleability properties. Understanding concrete security implications through explicit bounds on how proof-carrying data advantage relates to component advantages as a function of depth and fan-in would inform practical deployment decisions and parameter selection. ## Conclusion This paper resolves a fundamental gap in the security analysis of recursive proof systems by introducing appropriate definitions of simulation extractability for proof-carrying data and establishing that the two major construction templates satisfy these definitions under natural conditions on their components. The work demonstrates that simulation-extractable succinct arguments yield simulation-extractable proof-carrying data through recursive composition according to Theorem 5.2, and that lighter constructions combining arguments with accumulation schemes achieve simulation extractability when the argument component satisfies the property according to Theorem 5.5. These results provide formal assurance that many deployed proof-carrying data systems possess strong non-malleability guarantees, supporting their use in security-critical applications processing substantial financial value. The definitional innovation of partial compliance resolves the inherent tension between standard simulation extractability requirements demanding extraction of all historical witnesses and the realities of recursive proof composition where simulated proofs necessarily appear in computation chains. By permitting extractors to halt at simulated proofs rather than requiring extraction through them, the new definition captures meaningful security properties while remaining technically achievable. This balanced approach, formalized through the $(V_{pp}, Q_{\mathcal{S}})$-partial compliance concept, should inform future work on security definitions for complex cryptographic systems exhibiting recursive structure. The technical results establish that simulation extractability composes well through the primary templates for building proof-carrying data systems, with clear sufficient conditions on underlying components. For the succinct argument template, simulation extractability of the base argument suffices via the reduction in Theorem 5.2. For the accumulation template, simulation extractability of the argument combined with knowledge soundness and collision resistance of the accumulation scheme suffices via the reduction in Theorem 5.5. These composition theorems enable modular security analysis and provide actionable guidance for system designers seeking to ensure non-malleability properties in their recursive proof constructions. --- **Keywords:** Proof-carrying data, incrementally verifiable computation, simulation extractability, non-malleability, recursive proofs, accumulation schemes, zero-knowledge arguments, knowledge soundness, folding schemes **Paper Classification:** Cryptographic protocols, proof systems, security definitions **Relevant Prior Work:** Bünz et al. (accumulation schemes), Bitansky et al. (PCD foundations), Groth-Maller (simulation-extractable SNARKs), Chiesa et al. (recursive composition)