Unfuck Moonsama

Moonsama executed the `schedule_code_upgrade` call on the relay chain to upgrade their parachain. The problem with this call is that it sets the `GoAhead` signal, which then causes the parachain to fail because it is not expecting the signal. There is an open issue to solve this by not setting the signal: https://github.com/paritytech/polkadot/issues/7202

I think we should be able to fix this from the parachain by overwriting the runtime code on the parachain side and then issuing a new upgrade.

## Prepare a new runtime upgrade

1. Take the ***same code*** that was used to generate the runtime that is currently running on the parachain. Not the runtime that was passed to `schedule_code_upgrade`.
2. Patch `cumulus` to use a fork (again, the same branch/commit as used for the on-chain runtime). In this fork you need to remove the following code: https://github.com/paritytech/cumulus/blob/21919c89ad99fc4e2e6f12dd6679eb2d41892ac3/pallets/parachain-system/src/lib.rs#L397-L419
3. Build the runtime that includes this patch (let's call it `PATCHED_RUNTIME`).

## Apply the runtime to your parachain

Substrate loads the runtime from the state. So, to build block `X` it takes the runtime from block `X - 1`. However, we support overriding this mechanism using the chain spec, but this then requires that every node on the network uses this chain spec. The following needs to be added to the chain spec (check [`polkadot.json`](https://github.com/paritytech/polkadot/blob/7e1e635eb01dc1d2f0ce140d817b024dc364a9db/node/service/chain-specs/polkadot.json) for an example):

```
"codeSubstitutes": {
	"132724": "HEX_STRING_OF_PATCHED_RUNTIME"
}
```

`132724` is the number of the block from which the patched runtime should be used. This is the number of the latest block that you built.

This chain spec then needs to be distributed to all nodes, and then the collators should start to produce blocks again. However, the blocks will fail validation on the relay chain. Why? Because we removed some code that is still part of the blob registered at the relay chain, so there will be a mismatch at execution (basically it will fail with the same issue as your collators currently do).

## Call `schedule_code_upgrade` again

You need to call `schedule_code_upgrade` again, but this time with the `PATCHED_RUNTIME`. After this has happened, your parachain and the relay chain should use the same code again to build and validate blocks. You will not be able to call `schedule_code_upgrade` right away, as there is currently a "cooldown" that prevents parachains from upgrading every block. Looking at the chain state (`paras::upgradeCooldowns`):

```
...
[
  3,334
  16,253,944
]
```

This means that after block `16,253,944` you can call the function again. I also checked the other requirements, and there should be nothing else that prevents the call from succeeding (hopefully).

## Aftermath

After all the things above are finished, you will need to upgrade your runtime again to the one that you passed the first time to `schedule_code_upgrade` (the one that brought us into this situation). But this time you use `set_code` on your parachain ;)

# Evaluation

I had overlooked [this](https://github.com/paritytech/polkadot/blob/7e1e635eb01dc1d2f0ce140d817b024dc364a9db/runtime/parachains/src/paras/mod.rs#L1767). This meant that you could not schedule another upgrade while one was already active. So, I proposed the following:

> So, what you could do.
>
> You setup some new genesis state
>
> That has PendingValidationCode set https://github.com/paritytech/cumulus/blob/21919c89ad99fc4e2e6f12dd6679eb2d41892ac3/pallets/parachain-system/src/lib.rs#L402-L406
>
> The genesis runtime should be the one that you passed to schedule_code_upgrade
>
> And you also put this runtime into PendingValidationCode
>
> And then you call set_current_head with the header of your new genesis

This worked, and they were able to bring the chain back online.
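
As a footnote to the `codeSubstitutes` step above: the hex string can be produced from the compiled Wasm blob with a few lines of Python. This is only a sketch under some assumptions. The function name `add_code_substitute` and the file names in the usage comment are made up for illustration, and it assumes the chain spec expects a `0x`-prefixed hex string, as the entries in the linked `polkadot.json` appear to use.

```python
import json
from pathlib import Path


def add_code_substitute(spec: dict, block_number: int, wasm: bytes) -> dict:
    """Return a copy of `spec` with a `codeSubstitutes` entry for `block_number`.

    Assumption: the value is the runtime blob as a 0x-prefixed hex string.
    """
    patched = dict(spec)
    # Keep any substitutes that are already present in the chain spec.
    substitutes = dict(patched.get("codeSubstitutes", {}))
    substitutes[str(block_number)] = "0x" + wasm.hex()
    patched["codeSubstitutes"] = substitutes
    return patched


# Hypothetical usage (file names are placeholders, not from the original):
#
#   spec = json.loads(Path("chain-spec.json").read_text())
#   wasm = Path("patched_runtime.compact.compressed.wasm").read_bytes()
#   patched = add_code_substitute(spec, 132724, wasm)
#   Path("chain-spec-patched.json").write_text(json.dumps(patched, indent=2))
```

The function returns a new dictionary instead of mutating its input, so the original chain spec stays untouched until the patched one has been checked and written out.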