People use all kinds of UIs/wallets to sign their transactions and send them to the chain. As they do not want to trust random UIs/wallets on the internet not to steal their private keys, they rely on hardware wallets. These hardware wallets only support signing given data without exposing the key (at least when you are using a proper one ;)). There are multiple different hardware wallets. In the Polkadot ecosystem the best known are Polkadot Vault, Ledger and Kampela (still in an early phase, but paid for by the Polkadot treasury). These hardware wallets need to be able to interpret the data sent by the online UI/wallet. There are currently two different ways this is done in the Polkadot ecosystem. The first is a fixed parser for the transaction format of Polkadot that has to be updated to support new versions of the Polkadot runtime. The second is a transaction parser that uses the metadata for decoding. This requires having the full metadata on the device (more than 1 MiB) and a way to securely update the metadata. Both approaches currently rely on central entities to sign either the metadata or the "fixed" parser.
Supporting parachains also isn't that easy currently, because they need to be included in the metadata portal or would require a custom Ledger app. The problem with the metadata portal is that someone needs to run this portal and sign the metadata. The custom Ledger apps would have the problem of getting approval from Ledger for the official app store, and the parser would also need to be changed for every parachain, as FRAME runtimes don't enforce a particular format for transactions.
So, we have two ways of implementing hardware wallets: "fixed" parsers or metadata-based parsers (okay, you could also use blind signing, but we should not even think about that). As Polkadot supports forkless runtime upgrades for itself and all of its parachains, things can change very quickly. Using a "fixed" parser is therefore almost a no-go, because it would require constant updating as chains evolve. We are left with the metadata-based parser, which currently has the following two problems. First, the metadata is too big and would not fit on every possible hardware wallet currently out there. Second, trusting the metadata currently requires trusting a central entity that signs the metadata for you.
We can solve both of these problems by introducing merkelized metadata. Merkelized metadata basically means chunking the metadata into individual pieces, putting them into an accumulator (e.g. a merkle tree) and using the digest of this accumulator (e.g. the root hash of a merkle tree) as a secure identifier for a particular metadata instance. This solves the metadata size problem, because decoding a transaction then only requires the individual chunks involved, not the full metadata. A hardware wallet wanting to sign a transaction would get proofs for these chunks from the online wallet. The hardware wallet would use the digest of the accumulator to ensure that the proofs are correct. It can also get the digest of the accumulator from the online wallet. To ensure that the online wallet hasn't provided a wrong digest and proofs, the hardware wallet includes the digest in the signed payload of the transaction. The chain, which is aware of the digest as well, ensures that the digest is the same by also including it on chain when checking the signature of the transaction. If the hardware wallet used an incorrect digest, the transaction would be rejected on chain (actually before even entering the transaction pool). So, a user cannot be fooled into signing an incorrect transaction.
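To make the accumulator idea concrete, here is a minimal sketch in Rust using a binary merkle tree over metadata chunks. Everything here is illustrative, not the actual design: the function names (`merkle_root`, `merkle_proof`, `verify_chunk`), the chunking into fixed byte strings, and especially the hash function are assumptions — a real implementation would use a cryptographic hash such as blake2b/blake3 rather than Rust's `DefaultHasher`, and a chunking scheme derived from the metadata's type registry.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustrative stand-in only: a real implementation would use a
// cryptographic hash (e.g. blake2b/blake3), not DefaultHasher.
fn hash_leaf(chunk: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    chunk.hash(&mut h);
    h.finish()
}

fn hash_pair(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

// Combine one level of the tree; an unpaired last node is promoted as-is.
fn next_level(level: &[u64]) -> Vec<u64> {
    level
        .chunks(2)
        .map(|p| if p.len() == 2 { hash_pair(p[0], p[1]) } else { p[0] })
        .collect()
}

/// Digest of the whole metadata: the merkle root over all chunk hashes.
fn merkle_root(mut level: Vec<u64>) -> u64 {
    while level.len() > 1 {
        level = next_level(&level);
    }
    level[0]
}

/// Sibling hashes needed to re-derive the root from the leaf at `index`.
fn merkle_proof(mut level: Vec<u64>, mut index: usize) -> Vec<u64> {
    let mut proof = Vec::new();
    while level.len() > 1 {
        let sibling = index ^ 1;
        if sibling < level.len() {
            proof.push(level[sibling]);
        }
        level = next_level(&level);
        index /= 2;
    }
    proof
}

/// What the hardware wallet runs: check one chunk against the digest,
/// needing only the chunk, its index, the proof and the leaf count.
fn verify_chunk(chunk: &[u8], mut index: usize, mut leaves: usize, proof: &[u64], digest: u64) -> bool {
    let mut node = hash_leaf(chunk);
    let mut used = 0;
    while leaves > 1 {
        let sibling = index ^ 1;
        if sibling < leaves {
            if used == proof.len() {
                return false;
            }
            let s = proof[used];
            used += 1;
            node = if index % 2 == 0 { hash_pair(node, s) } else { hash_pair(s, node) };
        }
        index /= 2;
        leaves = (leaves + 1) / 2;
    }
    used == proof.len() && node == digest
}

fn main() {
    // Pretend the metadata was split into five chunks.
    let chunks: [&[u8]; 5] = [b"chunk0", b"chunk1", b"chunk2", b"chunk3", b"chunk4"];
    let leaves: Vec<u64> = chunks.iter().map(|c| hash_leaf(c)).collect();
    let digest = merkle_root(leaves.clone());

    // The online wallet sends chunk 2 plus its proof; the device verifies
    // both against the digest it will also include in the signed payload.
    let proof = merkle_proof(leaves.clone(), 2);
    assert!(verify_chunk(chunks[2], 2, chunks.len(), &proof, digest));
    // A tampered chunk no longer matches the digest.
    assert!(!verify_chunk(b"evil", 2, chunks.len(), &proof, digest));
    println!("ok");
}
```

Note how verification only ever touches one chunk plus a logarithmic number of sibling hashes — that is what makes the approach viable on devices that cannot hold the full metadata. The digest itself is then bound to the transaction by mixing it into the signed payload, so the chain's signature check fails if the wallet was given a wrong digest.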
This solves both of the stated problems. We now need to work together with the teams at Zondax (for the Ledger app) and Kampela to come up with a working implementation. The main questions left for the implementation are what kind of accumulator to use and how to chunk the data. Both questions are very important, as we want to chunk the metadata as efficiently as possible to produce the smallest possible proofs when decoding a transaction. This will require some research work and tinkering. As Kampela should have enough memory/storage to cache these proofs, only the Ledger app will require some more optimizations to make it work. However, the optimizations will need to be done on the Ledger app side, e.g. by streaming the proofs and decoding the transaction on the fly (Ledger only has around 4 KiB of memory), but that is nothing that should make the implementation impossible.
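The streaming constraint can be illustrated with a toy sketch: rather than buffering the whole metadata (or even a whole proof), a memory-limited device can fold the incoming data through a small fixed buffer, keeping only a constant-size running state. This is not Ledger's actual API or the chosen accumulator; the buffer size, the `fold_chunk` name and the sequential-hash construction are assumptions made purely to show that memory stays bounded regardless of payload size.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for a cryptographic compression function: folds one chunk
// into a constant-size (8-byte) running state.
fn fold_chunk(state: u64, chunk: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    state.hash(&mut h);
    chunk.hash(&mut h);
    h.finish()
}

// Assumed buffer size, well below a ~4 KiB RAM budget.
const BUF: usize = 1024;

// Consume an arbitrarily large payload chunk by chunk; peak memory is
// bounded by one BUF-sized chunk plus the running state.
fn streaming_digest<'a>(chunks: impl Iterator<Item = &'a [u8]>) -> u64 {
    chunks.fold(0, fold_chunk)
}

fn main() {
    // Simulate >1 MiB of metadata arriving in 1 KiB pieces.
    let blob = vec![0x42u8; 1 << 20];
    let device_digest = streaming_digest(blob.chunks(BUF));
    // The online wallet, with no memory constraint, derives the same value.
    let online_digest = streaming_digest(blob.chunks(BUF));
    assert_eq!(device_digest, online_digest);
    // Any flipped byte changes the result.
    let mut tampered = blob.clone();
    tampered[12345] ^= 1;
    assert_ne!(streaming_digest(tampered.chunks(BUF)), device_digest);
    println!("ok");
}
```

The same principle applies to merkle proofs: verifying a proof path needs only the current node and one sibling hash at a time, so the Ledger app can interleave proof verification with on-the-fly transaction decoding instead of materializing everything in RAM.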
In the end we should have a specification that describes how to chunk the metadata and what kind of accumulator to use.