**Goal**: To test a new [feature PR](https://github.com/paritytech/polkadot/pull/6278) by executing a given Polkadot/Kusama block against a runtime that includes the PR changes, using the [`try-runtime` CLI](https://paritytech.github.io/substrate/master/try_runtime_cli).
**Context**: As a Substrate/FRAME developer, you have implemented a new non-trivial feature that may affect runtime behaviour in ways we can't easily capture with unit/integration tests. In an ideal world, we'd be able to run the new runtime against real-world chain data. Fortunately, there is a tool for that: the [`try-runtime` CLI](https://paritytech.github.io/substrate/master/try_runtime_cli). `try-runtime` allows us to execute one or many blocks against any runtime (with some strings attached, which we'll touch on later) by populating the local [node's externalities](https://paritytech.github.io/substrate/master/sp_externalities/index.html) with the state from any running chain. In addition, `try-runtime` allows developers to test runtime upgrades, but that specific feature is out of scope for now.
For this exercise, we want to test [this PR](https://github.com/paritytech/substrate/pull/12588) against real chain state. The PR refactors the [elections phragmen pallet](https://paritytech.github.io/substrate/master/pallet_elections_phragmen/index.html). We want to execute a Polkadot block in which an election happens against a runtime built with the changes in the PR, and make sure the state transition and side effects work as expected.
> Note: It is out of scope of this note to explain the ins and outs of `try-runtime`. There are a few resources online for that, namely [this Substrate seminar by Kian](https://www.crowdcast.io/e/substrate-seminar/41), the [`try-runtime` substrate docs](https://docs.substrate.io/reference/command-line-tools/try-runtime/), the [`try-runtime` CLI cargo docs](https://paritytech.github.io/substrate/master/try_runtime_cli), and others. Basic knowledge of the client/runtime architecture and forkless runtime upgrades in Substrate is helpful.
---
### A few notes on node and runtime versions
The separation between the node and the runtime in Substrate clients has the benefit of enabling [forkless runtime upgrades](https://docs.substrate.io/build/upgrade-the-runtime/) in Substrate-based chains. In a nutshell, the fact that the state transition logic of the blockchain (the runtime) can be changed without requiring validators to update their clients, together with the fact that the runtime logic ([`:CODE`](https://paritytech.github.io/substrate/master/sp_storage/well_known_keys/constant.CODE.html)) is stored in the chain storage and part of the consensus, allows the chain to upgrade through an extrinsic [`Call::set_code`](https://paritytech.github.io/substrate/master/frame_system/pallet/enum.Call.html#variant.set_code) without forks.
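As a small aside, the well-known `:CODE` storage key is simply the ASCII bytes of the string `:code`, and storage keys are usually printed hex-encoded. The sketch below shows that encoding, assuming a POSIX shell with `od` and `tr` available; the helper name `to_hex` is only illustrative.

```bash
# Hex-encode a string the way raw storage keys are usually printed.
to_hex() {
  printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

to_hex ':code'; echo   # prints 3a636f6465
```

The same value shows up as a `hashed key` in the `try-runtime` logs later in this note.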
The separation between client and runtime is important to keep in mind for our exercise. First, we want to build a node locally with the same version (or similar) as the remote node where we'll fetch the Polkadot chain data, so that when the `try-runtime` CLI injects the remote block state into the local node externalities, the local node can decode the block successfully and pass the externalities to the runtime to be executed. We also want to make sure the runtime running on the local node has our PR applied.
### Preliminaries
First, we fetch some info from the remote node, namely the client and runtime versions.
> Note: the remote node must be running with the `--ws-external` flag
1. **Check the remote node version**
The remote client version can be fetched through the [`system_version` RPC method](https://polkadot.js.org/docs/substrate/rpc#version-text-1).
```bash!
parity_gpestana ~/parity/polkadot % wscat -c $POLKADOT_REMOTE -x '{"jsonrpc":"2.0", "id":1, "method":"system_version"}' | jq
{
"jsonrpc": "2.0",
"result": "0.9.30-064536093f5",
"id": 1
}
```
2. **Check the remote runtime version**
The spec version can be checked using the `state_getRuntimeVersion` RPC method.
```bash!
parity_gpestana ~/parity/substrate % wscat -c $POLKADOT_REMOTE -x '{"jsonrpc":"2.0", "id":1, "method":"state_getRuntimeVersion"}' | jq | grep specVersion
"specVersion": 9340,
```
3. **Check the remote node chain**
Just to make sure that the remote node is running the chain we expect, we can fetch the running chain:
```bash!
parity_gpestana ~/parity/polkadot % wscat -c $POLKADOT_REMOTE:9944 -x '{"jsonrpc":"2.0", "id":1, "method":"system_chain"}' | jq
{
"jsonrpc": "2.0",
"result": "Polkadot",
"id": 1
}
```
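The three RPC calls above differ only in the method name, so the request body and the extraction of the `result` field can be factored out. The following is a minimal sketch and not part of the original workflow: `rpc_payload` and `rpc_result` are hypothetical helper names, and `rpc_result` uses `sed` instead of `jq` and only handles single-line responses whose `result` is a plain string (so it covers `system_version` and `system_chain`, but not `state_getRuntimeVersion`).

```bash
# Build the JSON-RPC request body for a method that takes no parameters.
rpc_payload() {
  printf '{"jsonrpc":"2.0", "id":1, "method":"%s"}\n' "$1"
}

# Extract a string-valued "result" field from a single-line JSON-RPC response.
rpc_result() {
  printf '%s\n' "$1" | sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

rpc_payload system_chain
rpc_result '{"jsonrpc":"2.0","result":"Polkadot","id":1}'

# Against a live node (assumes wscat is installed and $POLKADOT_REMOTE is set):
# rpc_result "$(wscat -c "$POLKADOT_REMOTE" -x "$(rpc_payload system_version)")"
```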
4. **Check the node and runtime versions of the PR to test**
```bash!
# build the node with the PR changes
$ cd substrate/
$ git checkout gpestana8250_npossolver # checkout substrate PR feature branch
$ cd ../polkadot/
$ git checkout master
$ diener patch --crates-to-patch ../substrate --substrate # patch dependencies to use the local dependencies in the substrate PR
$ git checkout gpestana8250_npossolver_companion
$ cargo build --release --features try-runtime
# check spec version and node version
$ ./target/release/polkadot --version
polkadot 0.9.33-84a1a982a4c
$ ./target/release/polkadot
#...
Native runtime: polkadot-9330 (parity-polkadot-0.tx17.au0)
#...
```
In summary, both the node versions and the runtime versions of the remote and local nodes are out of sync, which may cause some issues when running `try-runtime`. Let's explore the different options we have.
- **Remote node**:
- node version: `0.9.30-064536093f5`
- runtime version: `9340`
- **Local node**:
- node version: `0.9.33-84a1a982a4c`
- runtime version: `9330`
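The comparison in the summary can also be scripted, which is handy when repeating it after each rebuild. This is a minimal sketch using the values listed above; `check_sync` is a hypothetical helper, and comparing only the part of the node version before the commit suffix is an assumption about what counts as "the same version".

```bash
# Values copied from the summary above.
remote_node="0.9.30-064536093f5"; remote_spec=9340
local_node="0.9.33-84a1a982a4c";  local_spec=9330

# Report whether a local value matches the remote one.
# Arguments: label, local value, remote value.
check_sync() {
  if [ "$2" = "$3" ]; then
    echo "$1: in sync ($2)"
  else
    echo "$1: out of sync (local $2, remote $3)"
  fi
}

# ${var%%-*} drops the "-<commit>" suffix, leaving only the release number.
check_sync "node version" "${local_node%%-*}" "${remote_node%%-*}"
check_sync "spec version" "$local_spec" "$remote_spec"
```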
### Attempt 1: `try-runtime` with local/remote node and runtime versions out of sync
In our first attempt, we'll run the `try-runtime` CLI with the local node and runtime binaries compiled in the previous section.
First, we copy the runtime wasm binary located in `./target/release/wbuild/polkadot-runtime/polkadot_runtime.wasm` into polkadot's root dir.
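The copy step could look like the sketch below. The source path is the one given above; the `copy_runtime` helper and its error message are illustrative assumptions, added only to fail early if the build did not produce the wasm file.

```bash
# Path of the runtime wasm produced by the build (taken from the text above).
WASM=./target/release/wbuild/polkadot-runtime/polkadot_runtime.wasm

# Copy the runtime wasm to a destination, failing with a clear message if it is missing.
copy_runtime() {
  src="$1"; dest="$2"
  if [ ! -f "$src" ]; then
    echo "runtime wasm not found: $src (was the node built?)" >&2
    return 1
  fi
  cp "$src" "$dest"
}

# From the polkadot root dir:
# copy_runtime "$WASM" .
```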
We can then run `try-runtime execute-block` with the compiled node and runtime using the following command. It pulls the storage data from the remote node at the block `0x94c6de80849df...` and tries to execute that block with the given local node and runtime.
```bash
$ RUST_BACKTRACE=1 ./target/release/polkadot try-runtime \
--runtime ./polkadot_runtime.wasm \
execute-block \
live --uri ws://$POLKADOT_REMOTE \
-p elections \
--at 0x94c6de80849df826cd9b8580f1de74990fd808cbb014f32994363a79db3f4d0a
```
> Note: the hash of the Polkadot block to execute against the feature runtime is [0x94c6de80849df826cd9b8580f1de74990fd808cbb014f32994363a79db3f4d0a](https://polkadot.subscan.io/block/13708800), which corresponds to a block where an `ElectionsPhragmen::NewTerm` event occurred in Polkadot.
The result of the command is the following:
```bash
2023-01-15 17:40:48 scraping key-pairs from remote at block height 0x94c6de80849df826cd9b8580f1de74990fd808cbb014f32994363a79db3f4d0a
2023-01-15 17:40:48 adding data for hashed prefix: b4c3bd1893eed3200a72f07e8b90c8d3, took 0s
2023-01-15 17:40:48 adding data for hashed key: 3a636f6465
2023-01-15 17:40:48 adding data for hashed key: 26aa394eea5630e07c48ae0c9558cef7f9cce9c888469bb1a0dceaa129672ef8
2023-01-15 17:40:48 adding data for hashed key: 26aa394eea5630e07c48ae0c9558cef702a5c1b19ab7a04f536c519aca4983ac
2023-01-15 17:40:48 initialized state externalities with storage root 0x7023b9ad106a0ce88f6e47b8a8af6cf938ae56b44a2fcefef69a390efd0767f6 and state_version V0
2023-01-15 17:40:49 original spec: RuntimeString::Owned("polkadot")-9300, code hash: 4f946f0caab019f8694a00e54c5bc5889b333360b3b4238b667c9f52276fcce1
2023-01-15 17:40:49 new spec: RuntimeString::Owned("polkadot")-9330, code hash: a16246c978a49f1758af048bba0ca2455c5159aed9091818ca8f9ea9171f62af
2023-01-15 17:40:49 fetching next block: 0x656516d8667ec412704f0382773b6ec0ad4a2bae0fd4708d5ebf1f38df262f6a
2023-01-15 17:40:49 try-runtime: executing block #13708801 / state root check: false / signature check: false / try-state-select: All
2023-01-15 17:40:49 Migration did not execute. This probably should be removed
2023-01-15 17:40:49 Storage to version 1
2023-01-15 17:40:49 Migration did not execute. This probably should be removed
2023-01-15 17:40:49 Skipping CleanupAgendas migration since it was run on the wrong version: StorageVersion(0) != 4
2023-01-15 17:40:49 [13708800] 💸 v13 applied successfully
2023-01-15 17:40:49 Migrating disputes storage to v1
2023-01-15 17:40:49 MigrateToV4 should be removed.
2023-01-15 17:40:49 ⚠️ XcmPallet declares internal migrations (which *might* execute). On-chain `StorageVersion(0)` vs current storage version `StorageVersion(0)`
2023-01-15 17:40:49 ParentBlockRandomness did not provide entropy
2023-01-15 17:40:49 executing transaction UncheckedExtrinsic(Some((<wasm:stripped>, (, , , , , , , , ))), <wasm:stripped>) failed due to <wasm:stripped>. Aborting the rest of the block execution.
2023-01-15 17:40:49 panicked at 'Bitfields and heads must be included every block', /Users/parity_gpestana/parity/polkadot/runtime/parachains/src/paras_inherent/mod.rs:200:17
Error:
0: Invalid input: failed to execute TryRuntime_execute_block: Execution aborted due to trap: unreachable
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⋮ 4 frames hidden ⋮
5: polkadot::main::heeab25705f693f1f
at <unknown source file>:<unknown line>
6: std::sys_common::backtrace::__rust_begin_short_backtrace::h26ef0ba5cb371bb2
at <unknown source file>:<unknown line>
7: std::rt::lang_start::{{closure}}::h25cc62a11c95bb5f
at <unknown source file>:<unknown line>
8: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::ha4b10a239e2af884
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/ops/function.rs:286
9: std::panicking::try::do_call::h6b4bcb7d3635e86a
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:483
10: std::panicking::try::h579c8cca81ff0f69
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:447
11: std::panic::catch_unwind::h4a997c12755a6e33
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panic.rs:137
12: std::rt::lang_start_internal::{{closure}}::hf20057b44f57f87c
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/rt.rs:148
13: std::panicking::try::do_call::hdfca34da16d8863f
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:483
14: std::panicking::try::h4f4022e500de0807
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:447
15: std::panic::catch_unwind::h62721286166676e8
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panic.rs:137
16: std::rt::lang_start_internal::h659a783147314d97
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/rt.rs:148
17: _main<unknown>
at <unknown source file>:<unknown line>
Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
Run with RUST_BACKTRACE=full to include source snippets.
```
**Problem**: The compiled client is not able to decode the block, since the compiled client (from `paritytech/polkadot@master`) is out of sync with the remote node (v0.9.33 vs. v0.9.30).
**Potential solution**: In the next attempt, we'll backport the companion polkadot branch onto a branch with the node version `v0.9.30`, so that the local node and remote node have the same version (note: the runtime versions are still going to be different, but that *may* be OK).
### Attempt 2: `try-runtime` with local/remote node versions in sync but runtime versions out of sync
We'll now build the client at version v0.9.30 and try to run the local node with the PR runtime extracted in the first attempt.
The remote node is running the `v0.9.30-064536093f5` version, so we can check out the `064536093f5` commit in `paritytech/polkadot` and rebuild the node.
```bash!
$ git checkout 064536093f5
$ cargo build --release --features try-runtime
# confirm that the compiled local node version is the expected
$ ./target/release/polkadot --version
polkadot 0.9.30-064536093f5
```
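The commit to check out is simply the suffix of the version string reported by the remote node. A minimal sketch of splitting it with POSIX parameter expansion (the variable names are illustrative):

```bash
# Version string as reported by the remote node's system_version RPC.
remote_version="0.9.30-064536093f5"

release="${remote_version%%-*}"   # everything before the first "-"
commit="${remote_version#*-}"     # everything after the first "-"

echo "release: v$release"
echo "commit:  $commit"
```

The `commit` value is what gets passed to `git checkout` in the step above.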
Now that the local and remote node versions are in sync, we can try to run `try-runtime` locally again with the runtime wasm binary compiled in the previous steps:
```bash
$ RUST_BACKTRACE=1 ./target/release/polkadot try-runtime \
--runtime ./polkadot_runtime.wasm \
execute-block \
live --uri ws://$POLKADOT_REMOTE \
-p elections \
--at 0x94c6de80849df826cd9b8580f1de74990fd808cbb014f32994363a79db3f4d0a
```
The result of the command is the following:
```bash
error: Found argument '--runtime' which wasn't expected, or isn't valid in this context
If you tried to supply `--runtime` as a value rather than a flag, use `-- --runtime`
USAGE:
polkadot try-runtime [OPTIONS] <SUBCOMMAND>
For more information try --help
```
As the error above shows, the v0.9.30 `try-runtime` CLI version does not support loading a different runtime with the `--runtime` flag. We can try to copy the runtime wasm into the `./target/release/wbuild/polkadot-runtime/` folder and try again without providing the runtime binary through the `--runtime` param.
```bash
$ cp ./polkadot_runtime.wasm ./target/release/wbuild/polkadot-runtime/
$ RUST_BACKTRACE=1 ./target/release/polkadot try-runtime \
execute-block \
live --uri ws://$POLKADOT_REMOTE \
-p elections \
--at 0x94c6de80849df826cd9b8580f1de74990fd808cbb014f32994363a79db3f4d0a
```
The result of the command was the following:
```bash
2023-01-15 18:18:04 found matching spec name: "polkadot"
2023-01-15 18:18:04 Custom("[backend]: frontend dropped; terminate client")
====================
Version: 0.9.30-064536093f5
# ...
Thread 'main' panicked at 'spec version mismatch (local 9300 != remote 9340). This could cause some issues.', /Users/parity_gpestana/.cargo/git/checkouts/substrate-7e08433d4c370a21/a3ed011/utils/frame/try-runtime/cli/src/lib.rs:670
```
The same error happens when the `--overwrite-wasm-code` param is passed in the command, which suggests that the `try-runtime` CLI at version `v0.9.30-064536093f5` may only work with the native runtime.
**Potential solution**: We can now try to port both the polkadot companion and substrate feature PRs to the `0.9.30-064536093f5` and `9340` versions, respectively, to make sure both the local node and runtime are in sync with the remote node.
### Attempt 3: `try-runtime` with local/remote node and runtime versions in sync
The remote runtime version is `9340`, which corresponds to the `paritytech/polkadot` tag `v0.9.34`. More importantly, however, the node version is `0.9.30-064536093f5`, which corresponds to the `v0.9.30` release tag. Our plan of action is the following:
1. `git cherry-pick` the changes in the companion PR onto the `v0.9.30` tag of `paritytech/polkadot`
   - `git cherry-pick 90a1f90043b2045d9cd11ff439c84d39916e19f4^..b6baf7d2593f6af5e5422dd2a9ba63380f559514`
2. `git cherry-pick` the changes in the substrate PR onto the v3.0.0 `paritytech/substrate` branch
3. patch the polkadot dependencies to use the local substrate branch
4. build the node with the `try-runtime` feature enabled
At this point, we'll have a `v0.9.30` local node with the substrate changes in the PR as the native runtime, ready to run the `try-runtime execute-block` against the remote node.
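The plan can be written down as a dry-run script that only prints the commands, which makes it easy to review before executing anything. This is a sketch under several assumptions: the `polkadot` and `substrate` checkouts are sibling directories (as in the earlier build step), `v0.9.30` is the tag name, the substrate base branch is already checked out, and `SUBSTRATE_PR_RANGE` is a placeholder because the substrate commit range is not listed in this note. The `run` and `plan` names are hypothetical.

```bash
# Dry-run helper: print each step with a leading "+", execute nothing.
run() { echo "+ $*"; }

# Placeholder: the substrate PR commit range is not given in this note.
SUBSTRATE_PR_RANGE="${SUBSTRATE_PR_RANGE:-<substrate-pr-commit-range>}"

plan() {
  # 1. companion PR commits on top of the polkadot v0.9.30 tag
  run cd polkadot
  run git checkout v0.9.30
  run git cherry-pick '90a1f90043b2045d9cd11ff439c84d39916e19f4^..b6baf7d2593f6af5e5422dd2a9ba63380f559514'
  # 2. substrate PR commits (base branch assumed to be checked out already)
  run cd ../substrate
  run git cherry-pick "$SUBSTRATE_PR_RANGE"
  # 3. patch polkadot to use the local substrate checkout
  run cd ../polkadot
  run diener patch --crates-to-patch ../substrate --substrate
  # 4. build the node with the try-runtime feature enabled
  run cargo build --release --features try-runtime
}

plan
```

To actually execute the steps, `run` could be redefined as `run() { "$@"; }`, ideally after reviewing the printed output and resolving any cherry-pick conflicts by hand.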