# ark-substrate design choices

https://github.com/paritytech/substrate/pull/13031
https://github.com/paritytech/ark-substrate
https://github.com/w3f/ark-scale/

### Coordinates

Affine coordinates and serialized forms differ between Arkworks and ZCash/BLST, but an easy conversion is given by https://github.com/MystenLabs/fastcrypto/tree/main/fastcrypto-zkp/src/bls12381

We take and return [projective coordinates](https://github.com/w3f/ark-scale/blob/master/src/hazmat.rs) in scalar multiplications, which avoids doing divisions in the runtime. We return projective coordinates from MSMs for consistency with Arkworks. It also improves performance in degenerate 1-MSM calls.

[Arkworks](https://github.com/arkworks-rs/algebra/blob/master/ec/src/models/short_weierstrass/group.rs) and [ZCash](https://github.com/zkcrypto/bls12_381/blob/main/src/g1.rs) use different projective coordinates, with some comparison in [*"Efficient utilization of scalable multipliers in parallel to compute GF(p) elliptic curve cryptographic operations"*](https://www.researchgate.net/publication/290029144_Efficient_utilization_of_scalable_multipliers_in_parallel_to_compute_GFp_elliptic_curve_cryptographic_operations) by Adnan Gutub, but see also https://eprint.iacr.org/2015/1060. Achim showed the conversion needs only 2-3 multiplications each way in https://github.com/achimcc/bls12_381/blob/main/src/g1.rs#L563-L602 (see the sketch following the G_T section below).

### Point preparation

Arkworks `G2Prepared` points are huge, around 16kb. We therefore make `G2Prepared` points be [`G2Affine`](https://github.com/paritytech/ark-substrate/blob/main/models/src/models/bls12/g2.rs#L20), which breaks the multi-preparation optimization in Arkworks.

We could memoize aka cache our `G2Affine -> G2Prepared` map for the whole block run, and drop it only after `on_finalize`, which imposes this optimization even if the underlying code never exploits it. We need real in-memory types in substrate, but a cache avoids waiting for this. We'd want cache eviction here, because otherwise any protocol with user-controlled `G2` points could consume excessive memory. It's likely that parachain authors' weights incorrectly ignore this attack. We worry that cache eviction itself makes weight estimates tricky. (A sketch of such a cache also follows the G_T section below.)

Instead, we could provide an explicit host call which manually caches instances of the `G2Affine -> G2Prepared` map. We'd likely make parachains invoke this explicitly in `on_initialize`. We could instead invoke caching whenever users invoke `G2Affine: Into<G2Prepared>`, but bypass it when they invoke `Pairing::miller_loop::<_, G2Affine>`, which gives fine-grained control, but defaults more towards caching.

### G_T

We'll add scalar multiplications and MSMs for [`PairingOutput`](https://github.com/arkworks-rs/algebra/blob/master/ec/src/pairing.rs#L134), but not the target field itself. We'll likely assume the element lies in `G_T` here, not the full `Pairing::TargetField`, certainly for the MSM, which permits endomorphism optimizations. We do not yet rule out our single scalar multiplications later supporting `Pairing::TargetField`. We already have a separate `final_exponentiation` call though, so separate calls likely make maintenance easier.
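Returning to the Coordinates section: a minimal sketch of that conversion, assuming arkworks uses Jacobian coordinates (x = X/Z^2, y = Y/Z^3) for short Weierstrass `Projective` while ZCash's `bls12_381` uses homogeneous coordinates (x = X/Z, y = Y/Z); both function names are ours, not from either library.

```rust
use ark_bls12_381::Fq;
use ark_ff::Field;

/// Jacobian (X, Y, Z) with x = X/Z^2, y = Y/Z^3, into homogeneous
/// (X', Y', Z') with x = X'/Z', y = Y'/Z'.  Taking Z' = Z^3 gives
/// (X*Z, Y, Z^3): one squaring plus two multiplications, no inversion.
fn jacobian_to_homogeneous(x: Fq, y: Fq, z: Fq) -> (Fq, Fq, Fq) {
    let z2 = z.square();
    (x * z, y, z2 * z)
}

/// Homogeneous (X, Y, Z) with x = X/Z, y = Y/Z, into Jacobian, taking
/// Z' = Z so (X*Z, Y*Z^2, Z): again one squaring plus two multiplications.
fn homogeneous_to_jacobian(x: Fq, y: Fq, z: Fq) -> (Fq, Fq, Fq) {
    let z2 = z.square();
    (x * z, y * z2, z)
}
```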
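For the point preparation cache: a minimal per-block memoization sketch. `PreparedCache` and its methods are hypothetical names, we assume arkworks' `Pairing` trait gives us `G2Prepared: From<G2Affine>`, and a real version still needs the eviction policy discussed above.

```rust
use std::collections::HashMap;

use ark_bls12_381::{Bls12_381, G2Affine};
use ark_ec::pairing::Pairing;
use ark_serialize::CanonicalSerialize;

type G2Prepared = <Bls12_381 as Pairing>::G2Prepared;

/// Hypothetical per-block cache for the expensive `G2Affine -> G2Prepared`
/// conversion, keyed by the point's compressed encoding.  Unbounded, so
/// user controlled `G2` points could consume excessive memory as-is.
#[derive(Default)]
struct PreparedCache(HashMap<Vec<u8>, G2Prepared>);

impl PreparedCache {
    /// Prepare the point at most once per block, reusing the result thereafter.
    fn get_or_prepare(&mut self, g2: &G2Affine) -> &G2Prepared {
        let mut key = Vec::new();
        g2.serialize_compressed(&mut key)
            .expect("writing to a Vec cannot fail");
        self.0.entry(key).or_insert_with(|| (*g2).into())
    }

    /// Drop everything after `on_finalize`, so the cache spans only one block.
    fn clear(&mut self) {
        self.0.clear();
    }
}
```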
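And for G_T: arkworks' `PairingOutput` implements `Group`, so the single scalar multiplication could be a thin wrapper like this sketch; the naive `*` below does not itself exploit the endomorphism optimizations that G_T membership permits on the host side.

```rust
use ark_bls12_381::{Bls12_381, Fr, G1Affine, G2Affine};
use ark_ec::pairing::{Pairing, PairingOutput};

/// Scalar multiplication in G_T, written via the `Group` impl on
/// `PairingOutput`.  We assume the element lies in G_T, not merely in
/// `Pairing::TargetField`.
fn gt_mul(p: G1Affine, q: G2Affine, s: Fr) -> PairingOutput<Bls12_381> {
    Bls12_381::pairing(p, q) * s
}
```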
### Q&A

Remarks extracted from [#13031](https://github.com/paritytech/substrate/pull/13031)

---

We should encode points in whatever way minimizes conversion and maintenance costs. We do not expect that wasm and the host build identical types, meaning we need serialization and deserialization. Can `[u8; N]` be passed more efficiently than `Vec<u8>`? Or does it become `Vec<u8>` too?

In principle, we could pass `[u8; N]` types, but Arkworks does not provide that `N`, except maybe for `PrimeField::BigInt`, so this would all need doing manually, and there is no point if they become `Vec<u8>`s anyway. It appears scale turns `&[u8]` into `Vec<u8>` too, so afaik there is not much better you could do there either.

We should not pass `ArkScale<[T]>` or similar, because MSM and Miller loop methods either already take `Iterator<Item = T>`s, or else should do so in future, so constructing the `ArkScale<[T]>` becomes wasteful.

If the host calls take `ArkScale<T>` or `ArkScaleProjective<T>` directly, then we face two concerns:

- If we want to use blst via say https://github.com/MystenLabs/fastcrypto/tree/main/fastcrypto-zkp/src then we incur an extra serialization layer, or maybe pretty ugly unsafe code that tweaks the flags.
- If a parachain project wants zcash or whatever traits to wrap our host calls, then they need an extra serialization layer.

I'm thus fairly confident `Vec<u8>`s are the right types to pass here, but if anyone says `[u8; N]` works better, then fine. `ArkScale<T>` seems definitely wrong though. I think `pub struct SerializedProjective(pub Vec<u8>)` would not be similarly wrong, but it needs whatever lets scale treat it like a `Vec<u8>`.

---

> With Scaling/Serializing manually, we are going to do double work and serialize twice

Afaik this is false: `Vec<u8>`s are not reserialized (supposedly). We convert an `Iterator<Item = impl Into<AffineRepr>>` directly into its final `Vec<u8>` form here. Your way would first copy into a `Vec<AffineRepr>`, second serialize into a `Vec<Vec<u8>>`, and then third reserialize into a `Vec<u8>` (see the sketch after this exchange).

> and make the code more error prone by accepting Vec instead of CurveGroup or AffineRepr.

We're doing internal interfaces here, nothing end users observe, making this concern irrelevant. Also, it must work for more than arkworks on both sides, like I said above, which completely forbids using `ArkScale<G1Affine>` or similar.

> we should follow the other crypto examples and accept concrete types and let substrate take care of serialization.

No. There is no reason to expect an internal interface to look like an interface designed for a complete protocol. A `Projective` done your way requires an unnecessary division operation. It'd maybe be different if `Projective` might later change, but so far nobody foresees this. Amusingly, arkworks' serialized forms have proven less stable than their internal projective forms over the last couple of years. lol

As an aside, it's fine doing ed25519 this way since [its keys are just bytes](https://github.com/ZcashFoundation/ed25519-zebra/blob/main/src/verification_key.rs#L49), but sr25519 should maybe be optimized to avoid those extra [serializations and deserializations](https://docs.rs/schnorrkel/latest/src/schnorrkel/keys.rs.html#580).
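A minimal sketch of that direct conversion, using raw arkworks serialization; `encode_points` is a hypothetical name, and the real code presumably produces scale-compatible bytes with a length prefix, so this only shows the single-pass shape.

```rust
use ark_bls12_381::G1Affine;
use ark_serialize::CanonicalSerialize;

/// Stream an iterator of points straight into one `Vec<u8>`, instead of
/// first collecting into a `Vec<G1Affine>`, then serializing that into
/// nested vectors, and finally reserializing those into a `Vec<u8>`.
fn encode_points<I>(points: I) -> Vec<u8>
where
    I: IntoIterator,
    I::Item: Into<G1Affine>,
{
    let mut out = Vec::new();
    for p in points {
        p.into()
            .serialize_uncompressed(&mut out)
            .expect("writing to a Vec cannot fail");
    }
    out
}
```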
---

> here we take opaque vectors as input for all the host functions. This is not in line with the rest of the HF where the inputs are typed and serialization/deserialization is performed by the machinery that passes data around transparently using scale encoding.

> What about defining newtypes for G1Affine, G2Affine and co. and implementing Encode/Decode/TypeInfo?

I addressed this up thread..

- https://github.com/paritytech/substrate/pull/13031#issuecomment-1546312415
- https://github.com/paritytech/substrate/pull/13031#issuecomment-1546542620
- https://github.com/paritytech/substrate/pull/13031#issuecomment-1549760947

I'll explain again..

We're of course nowhere near substrate passing types without serialization into `Vec<u8>`s. I've a thread at https://internals.rust-lang.org/t/multi-archetecture-layout/18538 on this btw. As such, we cannot improve performance by passing a wrapper type, although maybe wrappers like `pub struct Projective(Vec<u8>);` could become zero cost if scale handled them exactly like their inner `Vec<u8>`. We do not afaik borrow `&[u8]`s across the boundary either, but that's not too relevant here sadly.

All those other crypto types already live in serialized form, ala `pub struct PublicKey([u8; 32]);`, or else we mistakenly treat them like they do. We've an interface internal to the cryptography here, so the types cannot already be nicely polished bytes like those little signature schemes. Instead we take what benefits the `Vec<u8>` target gives us: we convert an `Iterator<Item = impl Into<G1Affine>>` directly into its final `Vec<u8>` here. `ark_scale::ArkScale<..>` already provides a universal wrapper, but if you pass types like `ArkScale<Vec<G1Affine>>` or `Vec<ArkScale<G1Affine>>` then you first copy the iterator into a `Vec<G1Affine>`, second serialize into a `Vec<Vec<u8>>`, and then third reserialize into a `Vec<u8>`. Ugh, three serialization layers!

We pass affine points for MSMs and Miller loops. Afaik all elliptic curve crates already work this way, in part because you can batch normalize many projective points into affine with only one division (see the sketch at the end of this note). We'd pay this division separately for every single point if we passed affine points to single scalar multiplications though, so instead we avoid these divisions by passing projective points for regular scalar multiplications.

We might pass `ArkScaleProjective<G1>` in a few places in principle, except..

First, we want runtime crates that implement other projects' elliptic curve traits over exactly the same host calls: Zcash's traits in https://github.com/zkcrypto/ff and https://github.com/zkcrypto/group and https://github.com/zkcrypto/pairing, but several other curve ecosystems have their own traits too, like the RustCrypto traits in https://docs.rs/elliptic-curve/latest/elliptic_curve/

Second, we might later swap faster libraries like [blst](https://github.com/supranational/blst) into the host. In theory, one might even swap in hardware accelerated curves, although we'd likely never make validators buy FPGAs or whatever.

At this point, we've likely scenarios in both the host and the runtime where `ArkScaleProjective<G1>` requires an extra serialization and deserialization layer, like https://github.com/MystenLabs/fastcrypto/tree/main/fastcrypto-zkp/src does. We'd then have three serializations if both happened, or *five serialization layers* for MSMs and Miller loops! An MSM could have size `O(transactions)` too! We'll always pass a `Vec<u8>` anyways, so we avoid these cases with extra serialization layers by going directly to bytes.
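As a closing illustration, a hypothetical host-side MSM with opaque bytes on both ends. We use raw arkworks serialization here for brevity, whereas the real host calls use ark-scale's encodings, including the hazmat projective form on output to avoid a final division; the function name is ours.

```rust
use ark_bls12_381::{Fr, G1Affine, G1Projective};
use ark_ec::VariableBaseMSM;
use ark_serialize::{CanonicalDeserialize, CanonicalSerialize};

/// Hypothetical host-side MSM: only here do we commit to arkworks types, so
/// the backend could later be swapped for blst or hardware without changing
/// the `Vec<u8> -> Vec<u8>` interface that wasm sees.
fn bls12_381_msm_g1(bases: Vec<u8>, scalars: Vec<u8>) -> Result<Vec<u8>, ()> {
    let bases =
        Vec::<G1Affine>::deserialize_uncompressed(bases.as_slice()).map_err(|_| ())?;
    let scalars =
        Vec::<Fr>::deserialize_uncompressed(scalars.as_slice()).map_err(|_| ())?;
    let sum: G1Projective = G1Projective::msm(&bases, &scalars).map_err(|_| ())?;
    let mut out = Vec::new();
    sum.serialize_uncompressed(&mut out).map_err(|_| ())?;
    Ok(out)
}
```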
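Finally, the one-division batch normalization mentioned above refers to what arkworks exposes as `CurveGroup::normalize_batch`, which amortizes the field inversion across all points:

```rust
use ark_bls12_381::{G1Affine, G1Projective};
use ark_ec::CurveGroup;

/// Normalize many projective points with a single shared field inversion,
/// versus one inversion per point when converting individually.
fn to_affine_batch(points: &[G1Projective]) -> Vec<G1Affine> {
    G1Projective::normalize_batch(points)
}
```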