
"Conor Daly - 2024 Indianapolis 500 Sponsorship" referenda bug

The referendum was approved by the community, but it failed on execution. Execution was attempted at block 19907903 and failed with the scheduler event CallUnavailable. This means that the pre-image could not be found, and thus the call could not be executed.

What happened?

The pre-image was registered with the following transaction:

{
	"bytes": "0x13030f0040d9dd884d0a000948fe4034185578ec11298db785082bdc7ab98c82e14aac164b4a8d924c0d53"
}

The referendum itself was then submitted with the following transaction:

{
	"proposal_origin": { "Origins": "BigSpender" },  
	"proposal": { 
		"Lookup": {
			"hash": "0x94ed0dbc882ed18b3193fd275f78edab0d3f3827f8da3603ae251656c9ff195a",
			"len": 42
		}
	},
	"enactment_moment": { "After": 100 }
}

When a referendum is successful, it is scheduled as a task and waits to be executed. On the BigSpender track this delay is at least 7 days. When the task comes up for execution, the scheduler runs the following code:

let (call, lookup_len) = match T::Preimages::peek(&task.call) {
	Ok(c) => c,
	Err(_) => {
		Self::deposit_event(Event::CallUnavailable {
			task: (when, agenda_index),
			id: task.maybe_id,
		});

		return Err((Unavailable, Some(task)))
	},
};

This peek method ultimately ends up here:

fn fetch(hash: &T::Hash, len: Option<u32>) -> FetchResult {
	let len = len.or_else(|| Self::len(hash)).ok_or(DispatchError::Unavailable)?;
	PreimageFor::<T>::get((hash, len))
		.map(|p| p.into_inner())
		.map(Into::into)
		.ok_or(DispatchError::Unavailable)
}

The len and hash parameters are the ones that were passed in when the referendum was created:

"Lookup": {
	"hash": "0x94ed0dbc882ed18b3193fd275f78edab0d3f3827f8da3603ae251656c9ff195a",
	"len": 42
}

The important line in the code above is this:

PreimageFor::<T>::get((hash, len))

This line retrieves the pre-image stored in the state under the key (hash, len). If either of these parameters differs from the values used when the pre-image was registered, the pre-image will not be found. In the first transaction above, the following pre-image was registered:
0x13030f0040d9dd884d0a000948fe4034185578ec11298db785082bdc7ab98c82e14aac164b4a8d924c0d53

The pre-image is registered in the note_bytes function. The lines of that function relevant to this issue are the following:

let hash = T::Hashing::hash(&preimage);
let len = preimage.len() as u32;
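The length computation above can be checked by hand against the registered bytes. The following is a minimal standalone sketch, not runtime code: `decode_hex` and `preimage_len` are hypothetical helper names introduced here, and only the Rust standard library is used.

```rust
/// Decode a `0x`-prefixed hex string into raw bytes (standard library only).
/// Hypothetical helper for illustration; not part of the runtime.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    let s = s.strip_prefix("0x").unwrap_or(s);
    // Reject odd lengths and anything that is not a hex digit.
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Same computation as `preimage.len() as u32` in the snippet above.
fn preimage_len(hex: &str) -> Option<u32> {
    decode_hex(hex).map(|bytes| bytes.len() as u32)
}

/// The pre-image bytes from the registration transaction.
const PREIMAGE: &str =
    "0x13030f0040d9dd884d0a000948fe4034185578ec11298db785082bdc7ab98c82e14aac164b4a8d924c0d53";

fn main() {
    let len = preimage_len(PREIMAGE).expect("valid hex");
    println!("pre-image length: {len} bytes"); // prints 43
}
```

The hex string has 86 digits after the `0x` prefix, which is 43 bytes.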

The problem is that len for the registered pre-image is 43, while 42 was passed when the referendum was submitted. This causes the PreimageFor lookup to fail and ultimately leads to the CallUnavailable event that was observed.
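To illustrate why a wrong len makes the lookup fail, here is a toy model of a map keyed by (hash, len). It is only a sketch under simplifying assumptions: the `Preimages` struct and its methods are hypothetical stand-ins for the runtime storage, the hash is passed in as a string instead of being computed, and the stored bytes are placeholders of the right length.

```rust
use std::collections::HashMap;

/// Toy stand-in for a storage map keyed by (hash, len).
/// Hypothetical type for illustration; not the runtime storage.
struct Preimages {
    store: HashMap<(String, u32), Vec<u8>>,
}

impl Preimages {
    fn new() -> Self {
        Preimages { store: HashMap::new() }
    }

    /// Registration: the length part of the key is derived from the bytes.
    /// The hash is supplied by the caller because this sketch does no hashing.
    fn note(&mut self, hash: &str, bytes: Vec<u8>) {
        let len = bytes.len() as u32;
        self.store.insert((hash.to_string(), len), bytes);
    }

    /// Lookup: both parts of the key must match the registered values.
    fn fetch(&self, hash: &str, len: u32) -> Result<&[u8], &'static str> {
        self.store
            .get(&(hash.to_string(), len))
            .map(|bytes| bytes.as_slice())
            .ok_or("Unavailable")
    }
}

fn main() {
    let hash = "0x94ed0dbc882ed18b3193fd275f78edab0d3f3827f8da3603ae251656c9ff195a";
    let mut preimages = Preimages::new();
    // Placeholder content with the real length of 43 bytes.
    preimages.note(hash, vec![0u8; 43]);

    println!("len 43: {:?}", preimages.fetch(hash, 43).map(|b| b.len()));
    println!("len 42: {:?}", preimages.fetch(hash, 42).map(|b| b.len()));
}
```

With the correct hash but len 42, the key does not exist and the lookup returns the error, which corresponds to the Unavailable path in the code above.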

So the root cause was the incorrect len of 42 supplied when the referendum was submitted. The submission was done using polkadot-js (only recommended for advanced users ;)). Multiple attempts to reproduce the issue failed: polkadot-js fills in the len automatically when the pre-image hash is inserted. The assumption is therefore that either polkadot-js had a bug at the time that has since been fixed by accident, or a bit flip caused the incorrect len to be inserted. The exact cause can no longer be reproduced.
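The bit-flip hypothesis is at least consistent with the numbers: 42 and 43 differ in exactly one bit. The short check below shows this. It does not establish the cause, and `differing_bits` is a hypothetical helper name.

```rust
/// Number of bit positions in which two values differ.
fn differing_bits(a: u32, b: u32) -> u32 {
    (a ^ b).count_ones()
}

fn main() {
    // 42 = 0b101010, 43 = 0b101011
    println!("{:06b} vs {:06b}", 42, 43);
    println!("differing bits: {}", differing_bits(42, 43));
}
```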

Learnings

While the runtime itself was not at fault, it can still be improved to reduce the likelihood of such issues. Pull request 3850 in polkadot-sdk was created to introduce a check that compares the len passed at submission with the length of the pre-image stored in the state. This check only works if the pre-image was registered before the referendum is submitted, which is in line with what the current documentation recommends. While this is not a perfect safeguard against a repeat, it would have detected the issue observed here.
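The idea behind such a check can be sketched as follows. This is not the code from pull request 3850; the function name, the error type, and the choice to accept a submission when no pre-image is registered yet are all assumptions made for illustration.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum SubmitError {
    LenMismatch { submitted: u32, stored: u32 },
}

/// Compare the len given at submission with the length of the pre-image
/// already in the state, if any. `stored` is `None` when the pre-image has
/// not been registered yet, in which case nothing can be checked.
fn check_submitted_len(submitted: u32, stored: Option<u32>) -> Result<(), SubmitError> {
    match stored {
        Some(stored) if stored != submitted => {
            Err(SubmitError::LenMismatch { submitted, stored })
        }
        _ => Ok(()),
    }
}

fn main() {
    // The case from this incident: 42 submitted, 43 bytes registered.
    println!("{:?}", check_submitted_len(42, Some(43)));
    // Correct len.
    println!("{:?}", check_submitted_len(43, Some(43)));
    // Pre-image not registered before submission: the check cannot help.
    println!("{:?}", check_submitted_len(42, None));
}
```

The third case shows why the check depends on registering the pre-image first: with nothing stored, there is no length to compare against.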

One actual runtime issue was discovered through this failed referendum and is fixed by pull request 3849: the pre-image is still in a requested state. This means the pre-image cannot be removed from the state right now, because the system still assumes that it is requested for something in the future. Ultimately, this has to be cleaned up by another OpenGov proposal that unrequests the pre-image.
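The effect of a lingering request can be pictured with a small counter model. This is a simplified sketch, not the pallet's actual data structures: `Status`, `Preimage` and the method names are hypothetical, and the only assumption is that a pre-image can be removed once nothing requests it anymore.

```rust
/// Simplified request state of a pre-image (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Unrequested,
    Requested { count: u32 },
}

struct Preimage {
    status: Status,
}

impl Preimage {
    fn new() -> Self {
        Preimage { status: Status::Unrequested }
    }

    /// Something announces that it will need the pre-image later.
    fn request(&mut self) {
        self.status = match self.status {
            Status::Unrequested => Status::Requested { count: 1 },
            Status::Requested { count } => Status::Requested { count: count + 1 },
        };
    }

    /// Drop one outstanding request.
    fn unrequest(&mut self) {
        self.status = match self.status {
            Status::Requested { count } if count > 1 => Status::Requested { count: count - 1 },
            _ => Status::Unrequested,
        };
    }

    /// Removal is only allowed once no request is outstanding.
    fn can_remove(&self) -> bool {
        self.status == Status::Unrequested
    }
}

fn main() {
    let mut preimage = Preimage::new();
    preimage.request();
    // The request was never dropped after the failed execution.
    println!("removable while requested: {}", preimage.can_remove());
    // What the follow-up proposal would have to achieve.
    preimage.unrequest();
    println!("removable after unrequest: {}", preimage.can_remove());
}
```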