Week 45 - HackMD

## Notes/Report week 44 Woking on understanding kernel, fft, ifft. Poly commitments, NTT, MSM, Montgomery. ### code pull apart understand `multiexp.rs` from `ec-gpu` (The Filecoin version) so far, before moving on to bones `multiexp_bound_and_ifft()` etc ``` --------------------------------------- test.rs --> let gpu = multiexp_gpu(&pool, (g.clone(), 0), FullDensity, v.clone(), &mut kern).unwrap(); | | --------------------------------------- multiexp.rs --> pub fn multiexp() // calculate multiexp, This is the main entry point. | | --------------------------------------- multiexp.rs |-->pub fn parallel_multiexp<'s>() | --------------------------------------- multiexp.rs |--> kern.multiexp(bases, exps) | | --------------------------------------- multiexp.rs |--> pub fn multiexp(&self, bases, exponents) | //Run the actual multiexp computation on the GPU. | | ``` ```Rust! let kernel_name = format!("{}_multiexp", G::name()); let kernel = program.create_kernel(&kernel_name, global_work_size, LOCAL_WORK_SIZE)?; kernel .arg(&base_buffer) .arg(&bucket_buffer) .arg(&result_buffer) .arg(&exp_buffer) .arg(&(bases.len() as u32)) .arg(&(num_groups as u32)) .arg(&(num_windows as u32)) .arg(&(window_size as u32)) .run()?; let mut results = vec![G::Curve::identity(); self.work_units]; program.read_into_buffer(&result_buffer, &mut results)?; ``` ### Montgomery representation: In modular arithmetic, computations involving modular multiplication (taking the remainder after multiplying two numbers) can be computationally expensive. Montgomery representation transforms the modular multiplication operation into a series of modular additions, making it more efficient. ### Montgomery Multiplication: The Montgomery multiplication algorithm does not directly compute $ab\,\textrm{mod}\, p$. Instead it computes $abR^{−1}\,\textrm{mod}\, p$ for some carefully chosen number $R$ called the Montgomery radix. Typically, $R$ is set to the smallest power of two exceeding $p$ that falls on a computer word boundary. For example, if p is 381 bits then $R = 2^{6×64} = 2^{384}$ on 64-bit architecture. In order to make use of the Montgomery multiplication the numbers $a$ and $b$ must be encoded into the Montgomery form: instead of storing (a, b), we store the numbers $(\tilde{a},\tilde{b})$ given by $\tilde{a} = aR\,\textrm{mod}\, p$ and $\tilde{b} = aR\,\textrm{mod}\, p$. i.e., Montgomery form: $\tilde{a} = aR\,\textrm{mod}\, p$ A simple calculation shows that the Montgomery multiplication produces the product $ab\,\textrm{mod}\, p$, encoded in the Montgomery form: $(aR)(bR)R^{−1} = abR\,\textrm{mod}\, p.$ The idea is that numbers are always stored in the Montgomery form so as to avoid costly conversions to and from the Montgomery form. The conversion between Montgomery form and the standard form involves multiplying or dividing by the Montgomery parameter $R$. The Montgomery reduction algorithm exploits this property by introducing a much faster multiplication routine which computes the $n$-residue of the product of the two integers whose $n$-residues are given. **WIP** references: https://eprint.iacr.org/2017/1057.pdf ### `FIELD_mul_default()` uses: Coarsely Integrated Operand Scanning (CIOS) veriant for Montgomery multiplication understand L268 in `field.cl`: ```opencl! DEVICE FIELD FIELD_mul_default(FIELD a, FIELD b) { FIELD_limb t[FIELD_LIMBS + 2] = {0}; for(uchar i = 0; i < FIELD_LIMBS; i++) { FIELD_limb carry = 0; for(uchar j = 0; j < FIELD_LIMBS; j++) t[j] = FIELD_mac_with_carry(a.val[j], b.val[i], t[j], &carry); t[FIELD_LIMBS] = FIELD_add_with_carry(t[FIELD_LIMBS], &carry); t[FIELD_LIMBS + 1] = carry; carry = 0; FIELD_limb m = FIELD_INV * t[0]; FIELD_mac_with_carry(m, FIELD_P.val[0], t[0], &carry); for(uchar j = 1; j < FIELD_LIMBS; j++) t[j - 1] = FIELD_mac_with_carry(m, FIELD_P.val[j], t[j], &carry); t[FIELD_LIMBS - 1] = FIELD_add_with_carry(t[FIELD_LIMBS], &carry); t[FIELD_LIMBS] = t[FIELD_LIMBS + 1] + carry; } FIELD result; for(uchar i = 0; i < FIELD_LIMBS; i++) result.val[i] = t[i]; if(FIELD_gte(result, FIELD_P)) result = FIELD_sub_(result, FIELD_P); return result; } ``` ### general notes on `Affine` / `Projective` For the BLS 12-381 `Engine` --> (`blstrs`). The terms "affine" and "projective" refer to different representations of points on an elliptic curve. These representations are used to perform arithmetic operations on points efficiently. 1. **Affine Representation:** - In the affine representation, a point on an elliptic curve is represented by its $x$ and $y$ coordinates. - The coordinates $x$ and $y$ are elements from a finite field, often a prime field. - The equation of the elliptic curve is typically given in the form $y^2 = x^3 + ax + b$, where $a$ and $b$ are constants. 2. **Projective Representation:** - In the projective representation, a point on an elliptic curve is represented by three coordinates: $X$, $Y$, and $Z$. - The coordinates $X$, $Y$, and $Z$ are related to the affine coordinates $x$ and $y$ by the equations $x = X/Z$ and $y = Y/Z$. - Projective coordinates are used to simplify the arithmetic operations on elliptic curve points, making them more efficient. **Affine and Projective Coordinates Relationship:** - The relationship between affine and projective coordinates is established through the homogenization process. Given an affine point $(x, y)$, its projective representation $(X, Y, Z)$ is obtained by setting $X = x$, $Y = y$, and $Z = 1$. - To go from projective to affine, you divide the $X$ and $Y$ coordinates by $Z$, i.e., $x = X/Z$ and $y = Y/Z$. **Advantages of Projective Coordinates:** - Projective coordinates facilitate more efficient point addition and doubling operations on elliptic curves. These operations involve fewer field multiplications, and inversion operations can be deferred until the end of the computation. - Projective coordinates help avoid costly operations like division in finite fields, making arithmetic on elliptic curves more efficient. In the pairing libs/crates the `Engine` trait etc, the terms "affine" and "projective" are used to represent points on the elliptic curve groups G1 and G2. The trait defines associated types (`G1Affine` and `G2Affine`) for the affine representations of points, which are used in pairing operations.