Slashing reform

Update: We've a proposal that handles timing with more detail at https://github.com/paritytech/substrate/issues/7398#issuecomment-728875053

We've a "slashing reform" design https://github.com/w3f/research-internal/issues/530 that should basically end slashing of honest nodes across the board. It meshes nicely with two other problems: our session keys never sign their own certificates, and some future session keys require proofs-of-possession.

At present, our session key certificates look morally like

pub trait SessionKey {
    type Secret;
    type Public;
    type Signature;
}
pub struct SessionKeyCert<K: SessionKey> {
    controller_pubkey: AccountKey,
    session_pubkey: K::Public,
    controller_signature: [u8; 32],
}

which we'd expand with some wrapper like

pub struct SessionKeyFullCert<K: SessionKey> {
    certificate: SessionKeyCert<K>,
    counter: u16,
    tag: u64,
    session_back_sig: K::Signature,
}

Anytime a node starts up, it roughly proceeds like:

  • loads its SessionKeyCerts and counter value from disk,
  • increments the counter and writes it back to disk,
  • creates a fresh tag from system randomness,
  • signs a fresh SessionKeyFullCert containing this fresh tag and counter, and
  • announces these full session key certificates for registration on-chain.

We never import our own tag from disk or chain, only create a fresh one upon start. At any time, we permit each validator to have only one SessionKeyFullCert of each session key type registered on-chain. We only register larger counters over smaller counters, which prevents replays. When deciding whether a key has lived on chain for enough epochs to use, we determine key age from the inner SessionKeyCert, not from the outer SessionKeyFullCert wrapper.
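The registration rule above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual runtime logic: `Registry`, `FullCertRecord`, `RegisterError`, `ValidatorId`, and `KeyTypeId` are hypothetical names, the inner certificate and both signatures are omitted, and the sketch ignores `u16` counter exhaustion.

```rust
use std::collections::HashMap;

/// Hypothetical identifiers; the real types would come from the runtime.
pub type ValidatorId = u32;
pub type KeyTypeId = [u8; 4];

/// The replay-relevant fields of a `SessionKeyFullCert` (signatures omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FullCertRecord {
    pub counter: u16,
    pub tag: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RegisterError {
    /// The offered counter is not strictly larger than the registered one.
    StaleCounter { registered: u16, offered: u16 },
}

/// Holds at most one record per (validator, session key type) pair.
#[derive(Default)]
pub struct Registry {
    certs: HashMap<(ValidatorId, KeyTypeId), FullCertRecord>,
}

impl Registry {
    /// Registers `cert`, replacing any older record for the same pair.
    /// Equal or smaller counters are rejected, which blocks replays.
    pub fn register(
        &mut self,
        who: ValidatorId,
        key_type: KeyTypeId,
        cert: FullCertRecord,
    ) -> Result<(), RegisterError> {
        if let Some(old) = self.certs.get(&(who, key_type)) {
            if cert.counter <= old.counter {
                return Err(RegisterError::StaleCounter {
                    registered: old.counter,
                    offered: cert.counter,
                });
            }
        }
        self.certs.insert((who, key_type), cert);
        Ok(())
    }

    /// The tag currently registered for this validator and key type, if any.
    pub fn current_tag(&self, who: ValidatorId, key_type: KeyTypeId) -> Option<u64> {
        self.certs.get(&(who, key_type)).map(|c| c.tag)
    }
}
```

Because a restarted node always writes its incremented counter to disk before signing, its new wrapper wins over whatever is on chain, while a replayed older wrapper is refused.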

All session keys would gain logic that prevents equivocations. This pattern works okay for BABE, but we've slightly varied logic for some session keys; for example, Sassafras registers single-use block sealing keys with its ring VRFs.

GRANDPA

A grandpa node begins signing grandpa messages only three (two?) grandpa rounds after seeing the chain finalize its fresh session key certificates with its own tag and counter. All grandpa messages contain their current tag. All grandpa nodes ignore grandpa messages unless they contain the most recently finalized tag for that grandpa key.
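A minimal sketch of the two checks described above, under assumptions: `FinalizedTags`, `TaggedVote`, `GrandpaKeyId`, and `may_start_signing` are hypothetical names, keys are stood in for by integers, signature verification is omitted, and the round delay is left as a parameter since the text has not settled on two or three.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a GRANDPA public key.
pub type GrandpaKeyId = u32;

/// The fields of a grandpa message that matter for the tag check.
pub struct TaggedVote {
    pub signer: GrandpaKeyId,
    pub tag: u64,
}

/// Tracks the most recently finalized tag for each grandpa key.
#[derive(Default)]
pub struct FinalizedTags {
    tags: HashMap<GrandpaKeyId, u64>,
}

impl FinalizedTags {
    /// Called when a finalized block registers a new full certificate.
    pub fn on_finalized_cert(&mut self, signer: GrandpaKeyId, tag: u64) {
        self.tags.insert(signer, tag);
    }

    /// A vote is considered only if it carries the latest finalized tag
    /// for its signer; unknown signers and stale tags are ignored.
    pub fn accepts(&self, vote: &TaggedVote) -> bool {
        self.tags.get(&vote.signer) == Some(&vote.tag)
    }
}

/// Local side: start signing only `delay_rounds` after the round in which
/// we saw our own certificate finalized.
pub fn may_start_signing(current_round: u64, cert_finalized_round: u64, delay_rounds: u64) -> bool {
    current_round >= cert_finalized_round.saturating_add(delay_rounds)
}
```

A vote signed under a tag from before a restart fails `accepts` as soon as the new certificate is finalized, so a stale copy of the node cannot produce votes that others count alongside the fresh copy's votes.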

In this, grandpa nodes can validate grandpa votes only after viewing finalized blocks. We risk worse grandpa freezes this way, so eventually nodes should give up and consider these full session key certificates finalized, even if grandpa does not consider them finalized. We should formalize this slightly better of course. :)

BABE

We could ignore blocks unless they contain the current tag, which avoids honest equivocations.

A fresh SessionKeyFullCert wrapper does not change the BABE (VRF) key of course, but altering a BABE key itself still requires waiting one epoch.

We'd define catchup blocks in the relative-time protocol using https://github.com/w3f/polkadot-spec/pull/168#issuecomment-717538418

Transport (under consideration)

All validators need some transport key registered on-chain somewhere, so they can do long-term vs long-term key exchanges in Noise IK handshakes. We could do the psk part first.

IKpsk0?:
	<- s
	...
	-> e, es, psk?, s, ss
	<- e, ee, se

In this, psk? means optionally sending H_1(psk) followed by hashing psk into the key exchange. If nodes have psks then they use them, but otherwise validators might consider your node less important. Once a connection succeeds, the psk ratchets forward using the final shared secret.

If a validator comes online then it lacks any psks, but it posts the transaction that updates its wrappers, and this transaction creates a new tag, so everyone sets psk = H(ss | tag) for new nodes that register.

Roads not taken

We could replace tag: u64 with a fresh_extra_block_seal_public_key: sr25519::PublicKey that serves roughly tag's function, but also seals blocks alongside the BABE (VRF?) key. We do not take this path because it seemingly does not add any protections.

We could replace tag: u64 with a fresh_transport_public_key: x25519::PublicKey (or ed25519) that serves roughly tag's function, but also acts like a long-term transport key in Noise. If we do this then validators cannot even "talk like validators" until their SessionKeyFullCert gets registered on-chain. We improve forward secrecy this way because now validators who become compromised never leak anything about their conversation partners. This adds more, but still not much.