FHE Schemes

This is a WIP.

Some materials we use for below notes:

Summary of FHE Slides [1]
Summary of FHE Blog [2]
FHE survey paper 2022 [3]
Intro to FHE Video [4]
CHIMERA [5]
KPZ22 [6] shows concretely many important formula in the Appendix
TFHE-Tech and TFHE-Overview

Preliminaries on Mumber/Polynomial Representation

Number can be presented in "whole", radix, rns, or tower form:

  • "whole" when the max value is n while the domain is [0,n-1], examples include (native) prime field modulo arithmetic
  • radix is in the form of
    x=0l1xibi
    where
    xi[0,b1]
    and
    b
    is the base, decimal or binary representation are examples of radix form

    radix form is more general than the "whole" where we can say that

    n is the base and we only count
    i
    to
    0

    we can do additions in parallel but we have to accumulate the carries
    we can do comparison fast by going from most significant "bit" to least significant "bit"

  • rns, (residue number system, see more at RNS and apps), is a more general form where we have different bases as long as they are co-prime
    xi=0l1ripi
    (the
    pi
    ' s co-prime with each other)

    if all

    pi' s are different then we have advantage when doing additions in parallel as there is no carry
    we have to convert to radix form or "whole" for comparison

  • tower form, TBD

Nevertheless, when we have bases (radix or rns) we can do SIMD (packing).

Polynomial can be presented in coefficient or evaluation form:

  • coefficient:
    f(X)=0d1aiXi
  • evaluation:
    {xi,f(Xi)}0d1

Coefficient of polynomials are numbers represented in forms above;
Polynomial also has bases (

f(X)=q(X)z(X)):

  • z(X)=Xl1+1
    (used in ( R)LWE [2])
  • z(X)=0l1(Xxi)

When dealing with polynomial we rely on FFT (NTT) for fast computation (as opposed to radix or rns parallelization of numbers computation).

Generations of FHE schemes

1st, 2nd, 3rd ([1]):

Screenshot 2025-02-06 at 09.38.20

4th ([1]):

Screenshot 2025-02-06 at 09.40.07

2nd has efficient packing than 3rd so better for depth 1 or depth 2 computation!
4th is not directly usable because it is on approximate numbers, however its idea should be useful for tweaks (later schemes)

Preliminaries on Encapsulation and Noise Growth Management

Encapsulation of numbers (e.g. encryption) can carry noises, e.g. an encapsulation of 16 "bits" (think in radix or rns form) can have actual value of upto 4 bits and noises upto 12 bits, and we can always recover the actual value if the noises do not exceed that 12 bits threshold.

noises increase "a bit" when adding two encapsulations, in our example if noise is 11 bits and adding two of that we still have only 12 bits and still good, but not beyond that

noises increase a lot when multiplying an encapsulation with a number, e.g. multiply an encapsulation with 8 bit noise and a 4 bit number still get good encapsulation of 12 bits noise

noises increase a lot lot when multiplying two encapsulations, e.g. multiply two encapsulations of 4 bits value and 2 bits noise is safe with 12 bits noise afterwards, but not beyond that

Its all about noise growth management when handling encapsulation.
If we trivially make something like 8 bits value with 1.000.000 bits of noise we can have a lot of multiplication depth but this is expansive in terms of both storage and computation.

Noise Growth Management Strategy

Relinearization (a.k.a Key Switching, 2nd + 3rd gen FHE) [2]

Screenshot 2025-02-06 at 10.00.17

quite circular and we need to keep the "degree" of the relinearization polynomial low below the noise threshold

Modulus Switching (2nd + 3rd gen FHE) [4]

Screenshot 2025-02-06 at 10.28.46

Rescaling (4th gen FHE) [2]

Screenshot 2025-02-06 at 09.40.07

Important Schemes

1st gen FHE [3]

Screenshot 2025-02-06 at 10.08.24

2nd gen FHE [3]

Screenshot 2025-02-06 at 10.09.30

3rd gen FHE [3]

Screenshot 2025-02-06 at 10.13.59

4th gen FHE [3]

Screenshot 2025-02-06 at 10.15.38

Comparison [3]

3rd gen is faster but requires more communication (ciphertext size) > we encrypt (and pack) with 2nd gen for communication and then switch (and unpack) to 3rd gen for computation! (ONLY WHEN BOOSTRAPPING IS NEEDED, OTHERWISE NO NEED CONVERSION)!

Screenshot 2025-02-06 at 10.17.24
Screenshot 2025-02-06 at 10.18.19

For non-binary we use 2nd gen and binary we use 3rd gen

A bit more details on important Schemes

Notation

  • Rq=Zq[X]/(XN+1)
    ; (packing
    N
    elements inside a ciphertext)
  • N
    is a power of 2;
  • q
    is the ciphertext space of ciphertext
    c
    ;
  • t
    is the plaintext space of message
    m
    and
    q>>t
    ;
  • χ
    is the distribution for key and error;
  • sRqχS
    is the secret key;
  • eRqχE
    is the error;
  • LWE:
    c=(a,b)=.(a0,,\aN1,b)ZqN+1
    s.t.
    <a,s>+b=m+e
    ;
  • RLWE:
    m
    becomes
    m(X)
    where
    c=(a,b)Rq2
    s.t.
    as+b=m(X)+e(X)
    ;
  • q
    is a simplication of
    Qi=qiQi1
    where
    qi
    is a chain of moduli that has more or less same bit width (e.g. 32 or 64 bit primes)
  • noise has to be less than
    q/2

General

  • Plaintext messages are vector size

    N of elements (coefficients) modulo
    t
    (prime
    t=p
    but in general
    t=pr
    coprime to
    2N
    for packing)

  • Keygen:

    sχS;
    aRq
    ;
    eχE
    ;
    b=as+2e
    ;

    • sk=(1,s)Rq2
      and
      pk=a=(b,a)Rq2
  • Enc

    mR2 becomes
    m=(m,0)Rq2
    ;
    r,e0,e1χ

    • c=(m)+2(e0,e1)+ar
    • i.e.
      c=(c0,c1)=(m+2e0+br,2e1ar)
  • Dec (remember, all

    mod q arithmetic below)

    • m=<c,s> mod 2=c0+c1s mod 2=m+2(e1+e1s+er) mod 2
  • Add: just add!

  • Mul: just mul (element wise) but look at decryption, now need refresh:

    • <c,s><c,s>=(c0+c1s)(c0+c1s)=c0c0+(c0c1+c1c0)s+c1c1s2=d0+d1s+d2s2
  • Enc of

    m can be scaled with
    Δ=Q/t
    in BFV

BGV/BFV, non-binary, can be fixed point, i.e. Key Switching + Modulus Switching

  • Key Switching
    • BitDecomp(xRqn,q)
      decomposes
      x=0log q2juj
      where
      uR2n
      into
      u0,,ulog q)R2nlog q
      ;
    • Powersof2(xRqn,q)
      returns
      (x,2x,,2log qx)Rqnlog q
      ;
    • SwitchKeyGen(
      s1Rqn1,s2Rqn2
      )
      • A
        = KeyGen(
        s2
        ,
        N=n1log q
        )
      • B=A+Powersof2(s1)
    • SwitchKey(
      B,c1
      ) (
      c1
      in
      s1
      switching to
      c2
      is
      s2
      using
      B
      )
      • c2=BitDecomp(c1)TBRqn2
  • Modulus Switching from modulo
    q
    to
    p
    , i.e.
    <c,s>q=<c,s>p mod 2
    just need scaling (and rounding) by
    p/q
    , this also reset noises (given
    p<q
    sufficiently)
  • Refresh switching key from
    s2
    to
    s
    and do the modulus switching to reduce noise !!! or using the chain of moduli:
    • c1=Powersof2(c,qj)
    • Scale
      c1
      in
      qj
      to
      c2
      in
      qj1
      (goes down the chain)
    • Now in modulo $q_{j-1}:
      c2
      can be switched from
      sj2
      to
      c3
      in
      sj1

If represented as RNS, can also do hybrid key switching, GHS method is efficient but need to double N or do q/2

TRLWE (Torus)

  • Torus is roughly
    m/q mod 1
    (the fractional part only)
  • Message space is
    T=R[X]/(XN+1)
    and ciphertext space is
    T(k+1)l
    ;
  • Decryption requires computing
    κLipschitz
    function
    ϕs:TNxTT
    s.t.
    ϕs(a,b)=bsa

CKKS, approx., i.e. Rescaling

TFHE, binary, short, i.e. Programmable Boostrapping

More recent advances

GBFV

Important concepts (whiteboxing the schemes)

Key Switching

No noise reset but can switch to a new (can be smaller) key

Modulus Switching

Can reset noise and switch to a smaller modulus

Packing/Unpacking

Pack and unpack, think about sending a packed ciphertext (not possible to do mult) that can be unpacked (using some big key) into several ciphertexts (that can do mult)

Amortization

Doing things in a batch so that it cost less in average (caution: may not parallelizable)

Rotation (Automorphism)

Cyclic rotation

xxk

Programmable Bootstrapping

Beyond multiplication, basically a lookup table

Functional Bootstrapping

This is similar to PB?, think about doing a specific hash like SHA256?

Scheme Switching

CHIMERA on LWE/RLWE schemes, think about switching from non-binary to binary and maybe just extraction of some bits:

Screenshot 2025-02-12 at 14.55.15

Transciphering (or Hybrid HE)

Caution

All convenient things may require a BIG one time setup (but may be it is fine we do that and we are free next time)

A bit more complex: Arithmetic Simulation

A whole number modulo p can fit "natively" into an FHE ciphertext prime p base but can be expansive depending on p.

A whole number modulo p can fit "non-natively" into an FHE ciphertext composite n in rns form

n=0l1pi resulting in many ciphertexts, cheaper in computation but has to do modulo reduction on p and this is expansive.

A byte array can fit "natively" into a binary ciphertext.