# Shadow support for Polkadot-SDK research
## Summary
The “Shadow support” work investigated running **polkadot-sdk** networks inside the **Shadow** discrete-event network simulator. With the Polkadot-SDK modifications described below, we enabled deterministic and reproducible simulation of a Polkadot network on a single Linux machine, while still running (largely) real node binaries rather than a simplified model.
The necessary Polkadot-SDK modifications are delivered via the PR:
- **PR:** Shadow support
- **URL:** https://github.com/paritytech/polkadot-sdk/pull/10055
- **Branch:** `xDimon:shadow_support` (referenced in instructions)
Shadow itself is an event-driven network simulator that “directly executes real application code”, allowing large private-network experiments with realistic network conditions (latency, bandwidth, packet loss) without requiring real distributed infrastructure.
- Shadow project: https://github.com/shadow/shadow
## What was implemented / delivered
### 1) Ability to run Polkadot networks under Shadow
The core outcome is that a Polkadot network (validators, collators, subxt-client) can be launched as a Shadow simulation, where:
- nodes run as Shadow-managed processes,
- system calls are intercepted by Shadow,
- network conditions are controlled by the simulation configuration,
- the experiment becomes deterministic when a fixed seed is used.
### 2) Tooling to generate a Shadow-compatible network configuration
A helper script (`network_builder.sh`) is provided to:
- generate chainspecs/genesis configuration for a requested topology,
- build any required binaries with “Shadow compatibility” enabled (via `x-shadow` features),
- emit a ready-to-run Shadow configuration (`shadow.yaml`) plus the directory layout Shadow expects.
This reduces the effort from “manual per-node setup” to a single command that outputs a runnable simulation bundle.
## Polkadot-SDK modifications for Shadow support
To run Polkadot nodes under Shadow, several modifications were necessary to ensure compatibility with Shadow’s execution model and to allow deterministic behavior. The changes sit behind a single compile-time feature flag, `x-shadow`, so that non-simulation builds remain unaffected. They are intended for acceptance testing and simulation only, not for production. The key modifications include:
- Disable jemalloc under `x-shadow` builds (jemalloc’s internal locking patterns can deadlock under Shadow).
- Replace the Unix-domain-socket (UDS) based IPC between the relay node and PVF workers with TCP sockets for `x-shadow` builds (Shadow does not support UDS).
## How to use (test procedure)
Below is the documented procedure to launch a Polkadot network under Shadow.
### Prerequisites
- Linux machine (Shadow targets Linux)
- Shadow installed per its README:
- https://github.com/shadow/shadow
### Minimum example: dev network
1. Install Shadow:
- Follow Shadow’s readme: https://github.com/shadow/shadow
2. Check out the Polkadot-SDK branch that includes Shadow support:
- https://github.com/xDimon/polkadot-sdk/tree/shadow_support
3. Build Polkadot-SDK with Shadow support:
```bash
cargo build --release --features x-shadow
```
4. Create `shadow.yaml` for a minimal dev network:
```yaml
general:
  stop_time: 100s
  model_unblocked_syscall_latency: true
network:
  graph:
    type: 1_gbit_switch
hosts:
  server:
    network_node_id: 0
    processes:
      - path: /path/to/polkadot-sdk/target/release/polkadot
        args: --dev
        start_time: 0s
        expected_final_state: running
```
5. Run the Shadow simulation:
```bash
shadow --seed 42 --progress true --parallelism $(nproc) /path/to/shadow.yaml > shadow.log
```
6. Inspect logs:
- Node logs appear under the `shadow.data` directory created by Shadow during the run.
- Note that, for a given seed and configuration, the hashes of the produced blocks are identical between runs.
7. Cleanup between runs:
- Remove `shadow.data` before subsequent runs to avoid mixing outputs:
```bash
rm -rf shadow.data
```
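The determinism noted in step 6 can be spot-checked by comparing the block-import lines of two runs. The following is a minimal sketch and not part of the PR: the function names are hypothetical, and it assumes that node logs live under a `hosts/` subdirectory of the Shadow data directory and contain Substrate-style `Imported #N (...)` lines. Adjust the pattern if your logs look different.

```bash
# Print every "Imported #N (...)" entry found in the per-host logs of a
# Shadow data directory, one per line, prefixed by the log file path.
# ASSUMPTION: logs are stored under <data_dir>/hosts/ and use this line format.
block_fingerprint() {
  (cd "$1" && grep -rHoE 'Imported #[0-9]+ \([^)]*\)' hosts) | LC_ALL=C sort -u
}

# Succeed only if both runs logged at least one block and logged the same set.
same_blocks() {
  a="$(block_fingerprint "$1")"
  b="$(block_fingerprint "$2")"
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

To compare two runs, move `shadow.data` aside after the first run (for example to `run1.data`), rerun with the same seed, and call `same_blocks run1.data shadow.data`. Since the logged hashes may be abbreviated, a match is a strong hint of identical chains rather than a proof.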
### Network example: 8 RC validators, 1 parachain, 4 collators
Follow the same steps as above, but fetch the `network_builder.sh` script and replace step 4 with the generation of a more complex network:
1. Fetch the `network_builder.sh` script:
```bash
cd scripts
wget https://raw.githubusercontent.com/xDimon/polkadot-sdk/refs/heads/shadow_support_squash/scripts/network_builder.sh
cd ..
```
4. Generate the simulation bundle (chainspec + binaries + `shadow.yaml`):
- Example: **8 RC validators**, **1 parachain**, **4 collators**
- Recommended: remove `target/` before first run to avoid mixing incompatible build artifacts.
```bash
HOST_BW="200 Mbit" SHADOW_LATENCY="75 ms" ./scripts/network_builder.sh 8 1 4 /tmp/shadow-net
```
5. Run the Shadow simulation:
```bash
shadow --seed 42 --progress true --parallelism $(nproc) /tmp/shadow-net/shadow.yaml > shadow.log
```
## Benefits of Shadow support
### 1) Deterministic, reproducible networking experiments
With Shadow, the network is simulated deterministically under a fixed seed. This is particularly valuable for Polkadot-SDK because:
- networking and timing issues are often nondeterministic in real deployments,
- reproducing “rare” behaviors becomes feasible by pinning the seed + configuration,
- flaky tests can be stabilized into reliable scenarios,
- debugging becomes faster and more reliable.
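One way to hunt for rare behaviors is to sweep over several seeds and keep each run's output separate, so that an interesting seed can be replayed later with the same configuration. The sketch below assumes Shadow's `--data-directory` option; `SHADOW_BIN` and `run_seeds` are hypothetical names introduced only for this illustration.

```bash
# Path or name of the Shadow executable (hypothetical override variable).
SHADOW_BIN="${SHADOW_BIN:-shadow}"

# Run one simulation per seed.
# Usage: run_seeds <shadow.yaml> <output root> <seed>...
run_seeds() {
  config="$1"; out_root="$2"; shift 2
  mkdir -p "$out_root"
  for seed in "$@"; do
    data_dir="$out_root/seed-$seed"
    rm -rf "$data_dir"   # start each run from a clean data directory
    "$SHADOW_BIN" --seed "$seed" --data-directory "$data_dir" "$config" \
      > "$out_root/seed-$seed.log" 2>&1 \
      || echo "seed $seed: shadow exited with an error" >&2
  done
}
```

For example, `run_seeds /tmp/shadow-net/shadow.yaml /tmp/shadow-runs 1 2 3` would leave three data directories and three logs under `/tmp/shadow-runs`.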
### 2) Realistic network conditions without physical infrastructure
Shadow allows injecting controlled network characteristics (latency, bandwidth, etc.) to study:
- block production and finality behavior under constrained links,
- gossip and propagation dynamics,
- performance of collators and validators under bandwidth pressure,
- censoring of certain nodes (e.g. when malicious collators censor an honest collator).
This can replace (or greatly reduce) the need for expensive multi-machine testbeds or cloud deployments for many classes of experiments.
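A study of this kind can be scripted as a parameter sweep. The following sketch generates one simulation bundle per latency value using the `HOST_BW` and `SHADOW_LATENCY` environment variables and the `8 1 4 <output dir>` arguments shown in this document's `network_builder.sh` example. The `NETWORK_BUILDER` variable, the `sweep_latency` function, and the output directory naming are assumptions made for illustration.

```bash
# Location of the builder script (hypothetical override variable).
NETWORK_BUILDER="${NETWORK_BUILDER:-./scripts/network_builder.sh}"

# Generate one bundle per latency value.
# Usage: sweep_latency <output root> <latency>...
sweep_latency() {
  out_root="$1"; shift
  mkdir -p "$out_root"
  for latency in "$@"; do
    # "75 ms" -> "latency-75ms"
    out_dir="$out_root/latency-$(printf '%s' "$latency" | tr -d ' ')"
    HOST_BW="${HOST_BW:-200 Mbit}" SHADOW_LATENCY="$latency" \
      "$NETWORK_BUILDER" 8 1 4 "$out_dir"
  done
}
```

Calling `sweep_latency /tmp/shadow-sweep "25 ms" "75 ms" "150 ms"` would produce three bundles, each of which can then be run with the same seed so that latency is the only variable that changes.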
## Case study: collator isolation
We utilized the new Shadow support to simulate a specific attack vector targeting the Polkadot availability layer.
### The Attack Scenario
**Setup**: A parachain with 4 collators (**A**, **B**, **C**, and **D**) assigned to consecutive block production slots.
- **Collator B** is the honest victim.
- **Collators A, C and D** are colluding attackers.
**Mechanism**:
1. **Collator A** produces a block and distributes it to the backing validators but selectively **withholds** the block data from **Collator B**.
2. **Collator B**, unable to import A's block (the parent of its own next block) because the data is missing, cannot immediately build on top of it. It must first attempt to recover the data from the availability layer.
3. **Collator C** (the next attacker) produces their block. Because B was delayed waiting for data, C can submit their block to the network before B recovers and produces.
4. **Result**: B's slot is effectively skipped or "censored," and the chain progresses from A directly to C.
### Simulation Implementation
Using the `network_builder.sh` script, we simulated this network topology inside Shadow:
- **Topology**: 4 Collators, 8 Relay Chain Validators.
- **Isolation**: We used the `ISOLATE_COLLATOR_IDX` parameter to simulate the network censorship.
- This sets `packet_loss: 1.0` (100% loss) for P2P connections between the victim Collator B and its peers (A, C and D).
- Connections to Validators were left intact (to allow backing), strictly simulating the "selective withholding" behavior between collators.
- CLI command to generate the setup:
```bash
HOST_BW="200 Mbit" SHADOW_LATENCY="75 ms" ISOLATE_COLLATOR_IDX=2 ./scripts/network_builder.sh 8 1 4 /tmp/shadow-net
```
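In Shadow, per-path packet loss is expressed as a `packet_loss` attribute on the edges of the GML network graph. The sketch below illustrates how such an isolation could be written down; it is not the actual `network_builder.sh` logic. It assumes each collator sits on its own graph node with a 0-based id, covers only collator-to-collator edges (validator edges are omitted), and uses a hypothetical function name.

```bash
# Emit GML edges for a full mesh of N collator nodes (ids 0..N-1).
# Edges between the isolated node and any other collator drop all packets;
# self-loops and all remaining edges are lossless.
# Usage: emit_collator_edges <N> <isolated node id> <latency>
emit_collator_edges() {
  n="$1"; isolated="$2"; latency="$3"
  i=0
  while [ "$i" -lt "$n" ]; do
    j="$i"
    while [ "$j" -lt "$n" ]; do
      if [ "$i" -ne "$j" ] && { [ "$i" -eq "$isolated" ] || [ "$j" -eq "$isolated" ]; }; then
        loss="1.0"
      else
        loss="0.0"
      fi
      printf '  edge [ source %d target %d latency "%s" packet_loss %s ]\n' \
        "$i" "$j" "$latency" "$loss"
      j=$((j + 1))
    done
    i=$((i + 1))
  done
}
```

With four collators and the victim on node 1, `emit_collator_edges 4 1 "75 ms"` marks exactly the three edges touching node 1 as fully lossy, which mirrors the "victim cut off from A, C and D" setup described above.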
### Results
The Shadow simulation demonstrated the efficacy of the attack in a deterministic environment:
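- **Observation**: The block production rate of the isolated honest collator B (`collator-2000-2`) dropped significantly compared to the non-isolated baseline: in the run below, B produced only 14 of 316 blocks (about 4.4%), while each colluding collator produced 98 to 104.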
- **Root Cause**: The latency introduced by the need to fetch missing data via alternative paths (availability recovery) caused B to miss its strict slot deadlines.

```
============================================================
BLOCK PRODUCTION AND FINALIZATION BY COLLATOR
============================================================
collator-2000-1:
Total blocks produced: 98
Blocks finalized: 41 (41.8%)
Blocks NOT finalized: 57
collator-2000-2:
Total blocks produced: 14
Blocks finalized: 11 (78.6%)
Blocks NOT finalized: 3
collator-2000-3:
Total blocks produced: 104
Blocks finalized: 41 (39.4%)
Blocks NOT finalized: 63
collator-2000-4:
Total blocks produced: 100
Blocks finalized: 54 (54.0%)
Blocks NOT finalized: 46
```
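To compare runs at a glance, each collator's share of all produced blocks can be derived from a report in this format. The helper below is a sketch with a hypothetical function name; it relies only on the `collator-...:` and `Total blocks produced:` lines shown above.

```bash
# Read a per-collator report on stdin and print, for each collator:
#   <name> <blocks produced> <share of all produced blocks>
production_share() {
  awk '
    /^[ \t]*collator-[^:]*:/ { name = $1; sub(/:$/, "", name); order[++n] = name }
    /Total blocks produced:/ { produced[name] = $NF; total += $NF }
    END {
      if (total > 0)
        for (i = 1; i <= n; i++)
          printf "%s %d %.1f%%\n", order[i], produced[order[i]], 100 * produced[order[i]] / total
    }
  '
}
```

Saving the report to a file and running `production_share < report.txt` prints one line per collator, which makes the imbalance between the isolated collator and its peers easy to spot.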
### Mitigation Verification
Using the same simulation setup, we tested a mitigation strategy:
- **Fix**: Increasing the **slot duration** configuration in the chainspec for the parachain from the default (6s) to a longer duration (18s).
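- **Outcome**: With longer slot times, the honest collator (B) had sufficient time to recover the withheld data from the availability layer and produce a valid block before the slot deadline expired. In the run below, B produced 28 of 129 blocks (about 21.7%), in line with the 27 to 41 blocks produced by each of the other collators.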
- **Conclusion**: Shadow allowed us to verify that extending slot duration effectively neutralizes this specific censorship vector without requiring a complex physical testbed.

```
============================================================
BLOCK PRODUCTION AND FINALIZATION BY COLLATOR
============================================================
collator-2000-1:
Total blocks produced: 41
Blocks finalized: 21 (51.2%)
Blocks NOT finalized: 20
collator-2000-2:
Total blocks produced: 28
Blocks finalized: 22 (78.6%)
Blocks NOT finalized: 6
collator-2000-3:
Total blocks produced: 27
Blocks finalized: 21 (77.8%)
Blocks NOT finalized: 6
collator-2000-4:
Total blocks produced: 33
Blocks finalized: 21 (63.6%)
Blocks NOT finalized: 12
```
## Future work
### CI integration with x-shadow builds
Set up CI pipelines to build Polkadot-SDK with `x-shadow` feature to ensure continued compatibility and catch regressions early.
### Zombienet-sdk integration
Integrate Shadow support into the zombienet-sdk framework to allow users to easily launch Shadow-based simulations as part of existing zombienet workflows, enabling more deterministic network testing scenarios and avoiding flaky tests.
## Conclusion
Shadow support provides a practical, repeatable, and scalable way to run Polkadot-SDK networks under a simulated but realistic network environment. It improves the developer workflow for network-centric testing and debugging, makes regressions easier to reproduce, and enables larger experiments on local hardware than traditional integration setups.
**References**
- PR: https://github.com/paritytech/polkadot-sdk/pull/10055
- Tracking issue: https://github.com/paritytech/polkadot-sdk/issues/9748
- Shadow: https://github.com/shadow/shadow