# Reed-Solomon coding in Polkadot
We've found [our Reed-Solomon crate](https://github.com/darrenldl/reed-solomon-erasure) to be considerably slower than we'd like.
We'll use this over the short term since the crate was designed for our use case. This document outlines future work for more drastic improvements.
See also: https://codyplanteen.com/assets/rs/gf256_prim.pdf
## Arithmetic
Our [`simd-accel` feature](https://github.com/darrenldl/reed-solomon-erasure/blob/master/simd_c/reedsolomon.c) already implements $\mathbb{F}_{2^{16}}$ using [Screaming Fast Galois Field Arithmetic Using Intel SIMD Instructions](https://www.ssrc.ucsc.edu/media/pubs/c9a735170a7e1aa648b261ec6ad615e34af566db.pdf) by James Plank, Kevin Greenan, and Ethan Miller. It builds upon $\mathbb{F}_{2^4}$ because SIMD prefers 16-by-16 tables.
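The nibble-table technique computes ordinary $\mathbb{F}_{2^{16}}$ multiplication; the tables just cache the results four bits at a time. As a reference point, here is a scalar shift-and-add version. This is a sketch: `0x1100B` ($x^{16}+x^{12}+x^3+x+1$) is one commonly used primitive polynomial for $\mathbb{F}_{2^{16}}$, not necessarily the one our crate or gf-complete uses.

```rust
// One commonly used primitive polynomial for GF(2^16):
// x^16 + x^12 + x^3 + x + 1. (Illustrative choice, not necessarily
// the modulus our crate uses.)
const POLY: u32 = 0x1100B;

/// Carry-less shift-and-add multiplication in GF(2^16).
/// The SIMD code replaces this loop with nibble-indexed PSHUFB table
/// lookups, but the underlying field arithmetic is identical.
fn gf_mul(a: u16, b: u16) -> u16 {
    let mut a = a as u32;
    let mut b = b as u32;
    let mut acc: u32 = 0;
    while b != 0 {
        if b & 1 != 0 {
            acc ^= a; // addition in GF(2^16) is XOR
        }
        b >>= 1;
        a <<= 1;
        if a & 0x1_0000 != 0 {
            a ^= POLY; // reduce back into 16 bits
        }
    }
    acc as u16
}

fn main() {
    // 1 is the multiplicative identity.
    assert_eq!(gf_mul(1, 0x1234), 0x1234);
    // Multiplication is commutative and distributes over XOR.
    assert_eq!(gf_mul(0xABCD, 0x4321), gf_mul(0x4321, 0xABCD));
    assert_eq!(gf_mul(3, 5) ^ gf_mul(3, 7), gf_mul(3, 5 ^ 7));
    println!("ok");
}
```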
We should check whether this code actually produces reasonable assembly, since rustc churn may have broken it. Syed wants to try plugging in the gf-complete code directly.
We could experiment with $\mathbb{F}_{2^{12}}$, but this looks like considerable work for minimal gain.
### Formal derivatives
[*An Efficient (n,k) Information Dispersal Algorithm based on Fermat Number Transforms*](https://www.citi.sinica.edu.tw/papers/whc/3450-F.pdf) by Sian-Jheng Lin and Wei-Ho Chung provides an impressively fast Reed-Solomon encoding algorithm over Fermat fields. You'll find a brief explanation of its formal derivative trick in [FastECC's readme](https://github.com/Bulat-Ziganshin/FastECC#faster).
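The formal derivative itself is simple in characteristic 2: the integer factor $i$ in $\frac{d}{dx}(a_i x^i) = i \, a_i x^{i-1}$ is taken mod 2, so even-degree terms vanish outright. A minimal sketch (the `formal_derivative` helper and its coefficient layout are ours, not from any of the cited codebases):

```rust
/// Formal derivative of a polynomial over a field of characteristic 2.
/// `coeffs[i]` is the coefficient of x^i. The integer multiplier i is
/// reduced mod 2, so terms of even degree disappear entirely.
fn formal_derivative(coeffs: &[u16]) -> Vec<u16> {
    (1..coeffs.len())
        .map(|i| if i % 2 == 1 { coeffs[i] } else { 0 })
        .collect()
}

fn main() {
    // p(x) = 5 + 7x + 9x^2 + 11x^3  =>  p'(x) = 7 + 11x^2 in char 2:
    // the 9x^2 term has integer factor 2 = 0, so it drops out.
    assert_eq!(formal_derivative(&[5, 7, 9, 11]), vec![7, 0, 11]);
    println!("ok");
}
```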
[*Novel Polynomial Basis and Its Application to Reed-Solomon Erasure Codes*](https://arxiv.org/abs/1404.3458) by Lin, Han and Chung then provides a related design suitable for binary fields and [their FFTs](https://vitalik.ca/general/2019/05/12/fft.html), for which they provide a [research implementation](https://github.com/SianJhengLin/Fast-algorithms-of-Reed-Solomon-erasure-codes) over $\mathbb{F}_{2^8}$.
See also: https://www.citi.sinica.edu.tw/papers/whc/5524-F.pdf
Among the Reed-Solomon libraries over $\mathbb{F}_{2^{16}}$ we've glanced over, the fastest appears to be [leopard](https://github.com/catid/leopard) by Chris Taylor, which builds upon the second paper's research implementation by adding optimized SIMD instructions. We found leopard only implements a high rate code, meaning it [requires fewer parity shards than data shards](https://github.com/catid/leopard/issues/16). We require a low rate code for Polkadot, which makes direct deployment impossible (see also [leopard#14](https://github.com/catid/leopard/issues/14) and [wirehair#16](https://github.com/catid/wirehair/issues/16)).
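To make the rate distinction concrete: the rate is data shards over total shards, and leopard's constraint is `parity <= data`. A tiny sketch with hypothetical, illustrative shard counts (not Polkadot's actual parameters):

```rust
/// leopard's high rate restriction: no more parity shards than data
/// shards, i.e. rate data/(data+parity) >= 1/2.
fn is_high_rate(data_shards: usize, parity_shards: usize) -> bool {
    parity_shards <= data_shards
}

fn main() {
    // Hypothetical availability-style parameters: any 4 of 10 shards
    // reconstruct, so 4 data shards and 6 parity shards.
    // Rate 4/10 < 1/2: a low rate code, outside what leopard supports.
    assert!(!is_high_rate(4, 6));
    // A 2-of-3 code (rate 2/3) would be fine for leopard.
    assert!(is_high_rate(2, 1));
    println!("ok");
}
```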
We also need to understand our code, and so do our auditors, but leopard consists of hard-to-read C and assembler translated directly from the second paper, with bespoke, unexplained optimizations.
In fact, there are two related papers by the same authors that better describe the [high rate code](https://github.com/catid/leopard/blob/master/docs/HighRateDecoder.pdf) roughly corresponding to leopard, and the [low rate code](https://github.com/catid/leopard/blob/master/docs/LowRateDecoder.pdf) corresponding more closely to what Polkadot requires. There is a non-functional `encodeL` function resembling Algorithm 1 from the [Lin, Han and Chung paper](https://arxiv.org/abs/1404.3458), but not this low rate code paper.
<img style="float: right;" src="https://www.thirtythreeforty.net/posts/2020/05/hacking-reolink-cameras-for-fun-and-profit/riir.png" alt="Rewrite it in Rust" width="200" />
We should understand the [low rate code paper](https://github.com/catid/leopard/blob/master/docs/LowRateDecoder.pdf) and the [Lin, Han and Chung paper](https://arxiv.org/abs/1404.3458) along with any dependencies, probably correct the [research implementation](https://github.com/SianJhengLin/Fast-algorithms-of-Reed-Solomon-erasure-codes), and then rewrite it readably in Rust. Initially we'll optimize more for readability, without using SIMD, but perhaps exploring the leopard source for optimization ideas.
We'd then later do a SIMD implementation similar to leopard, but compatible with our readable implementation.
### Frobenius Additive FFT
As discussed in [leopard#5](https://github.com/catid/leopard/issues/5), there is a faster Frobenius FFT described in [*Frobenius Additive Fast Fourier Transform*](https://arxiv.org/pdf/1802.03932.pdf) by Liu, Chen, Chen, Kuo, and Yang, along with a [Python implementation](https://github.com/fast-crypto-lab/Frobenius_AFFT), and previously in [*The Frobenius FFT*](http://www.texmacs.org/joris/ffft/ffft.pdf) by Joris van der Hoeven and Robin Larrieu.
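The Frobenius map underlying these FFTs is just squaring, which in characteristic 2 is GF(2)-linear: $(a+b)^2 = a^2 + b^2$, so squaring a polynomial in $\mathbb{F}_2[x]$ merely spreads its coefficients to even positions. A small sketch demonstrating that fact on bit-packed polynomials (our own illustration, unrelated to either codebase's internals):

```rust
/// Carry-less multiplication of bit-packed polynomials in GF(2)[x]
/// (inputs must fit in 32 bits so the product fits in a u64).
fn clmul(a: u64, b: u64) -> u64 {
    let mut acc = 0u64;
    for i in 0..32 {
        if (b >> i) & 1 == 1 {
            acc ^= a << i;
        }
    }
    acc
}

/// Frobenius ("freshman's dream") squaring in GF(2)[x]: all cross
/// terms a_i a_j x^(i+j) appear twice and cancel mod 2, so bit i of
/// the input simply moves to bit 2i of the output.
fn frobenius_square(a: u64) -> u64 {
    let mut out = 0u64;
    for i in 0..32 {
        if (a >> i) & 1 == 1 {
            out |= 1 << (2 * i);
        }
    }
    out
}

fn main() {
    let p = 0b1011_0110u64; // x^7 + x^5 + x^4 + x^2 + x
    assert_eq!(clmul(p, p), frobenius_square(p));
    println!("ok");
}
```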
In fact, the research code and leopard already implement [Additive Fast Fourier Transforms over Finite Fields](http://www.math.clemson.edu/~sgao/papers/GM10.pdf) by Shuhong Gao and Todd Mateer, so the open question is how much additional improvement the Frobenius variant yields.
We should thus also understand this Frobenius Additive FFT and further optimize our implementation, since Chris Taylor envisions a 5x speed-up.
### See also

- https://gitlab.com/eddyb/medimgrs/
- https://forum.polkadot.network/t/generalized-storage-proofs/1315
