## Hardness of LWE and $k$-SUM Implies Hardness of Sparse-LWE
* $\big( \mathcal{D}_a, \mathcal{D}_s, \mathcal{D}_e, m\big)$-LWE refers to LWE where $a \sim \mathcal{D}_a$, $s\sim \mathcal{D}_s$ and $e \sim \mathcal{D}_e$, and $m$ is the number of LWE samples given to the adversary. The modulus $q$ is implicit.
* $\big( \ell, n, k\big)$-SUM refers to average-case $k$-SUM where the instance is $(a_1,\ldots,a_n; y)$ with each $a_i \sim \mathcal{U}(\mathbb{Z}_q^\ell)$ i.i.d. and a target $y \in \mathbb{Z}_q^\ell$, and the goal is to find $k$ of the $a_i$'s that sum to $y$ modulo $q$.
**Theorem:** $\big( \mathcal{U}(\mathbb{Z}_q^n), \mathsf{Sp}_{n,k}, \mathcal{N}_{\sigma}, m\big)$-LWE, where $\mathsf{Sp}_{n,k}$ denotes the distribution of $k$-sparse secrets in $\mathbb{Z}_q^n$, is $T$-hard assuming that the following two problems are $T$-hard to solve:
(1) $\big( \mathcal{U}(\mathbb{Z}_q^\ell), \mathcal{U}(\mathbb{Z}_q^\ell), \mathcal{N}_{\sigma}, m\big)$-LWE; and
(2) $\big( \ell, n, k\big)$-SUM
*Proof:* We show reductions $\mathcal{R}_1$ and $\mathcal{R}_2$ such that if there is an adversary that solves sparse LWE, then either $\mathcal{R}_1$ succeeds in breaking $k$-SUM or $\mathcal{R}_2$ succeeds in breaking LWE.
$\mathcal{R}_1$, on input $(A,y)$ where $A \in \mathbb{Z}_q^{\ell \times n}$ and $y \in \mathbb{Z}_q^\ell$, does the following:
* Picks a random matrix $B \in \mathbb{Z}_q^{m\times \ell}$, a noise matrix $N \sim \mathcal{N}_{\sigma'}^{m\times n}$, and a noise vector $e \sim \mathcal{N}_{\sigma''}^{m}$, and feeds the sparse-LWE breaker the tuple $$ (C := BA+N,\; By+e)$$
* If the breaker outputs a $k$-sparse vector $s$ such that $As = y \bmod{q}$, outputs $s$; else aborts.
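A minimal numpy sketch of $\mathcal{R}_1$ (assumptions: `breaker(C, b)` is an oracle returning a $k$-sparse $s$ or `None`, and rounded continuous Gaussians stand in for $\mathcal{N}_{\sigma'}, \mathcal{N}_{\sigma''}$; all names are illustrative):

```python
import numpy as np

def reduction_R1(A, y, q, m, sigma1, sigma2, breaker):
    """R_1: solve the k-SUM instance (A, y), A in Z_q^{l x n}, given a
    sparse-LWE breaker. `breaker(C, b)` is an assumed oracle returning a
    k-sparse s (or None); rounded Gaussians stand in for the N_sigma's."""
    ell, n = A.shape
    B = np.random.randint(0, q, size=(m, ell))                         # uniform B in Z_q^{m x l}
    N = np.rint(np.random.normal(0, sigma1, (m, n))).astype(np.int64)  # noise matrix ~ N_{sigma'}
    e = np.rint(np.random.normal(0, sigma2, m)).astype(np.int64)       # noise vector ~ N_{sigma''}
    C = (B @ A + N) % q          # pseudorandom assuming LWE
    b = (B @ y + e) % q          # equals C s + e' mod q for a solution s
    s = breaker(C, b)
    if s is not None and np.array_equal((A @ s) % q, y % q):
        return s                 # a k-sparse solution to the k-SUM instance
    return None                  # abort
```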
Assuming the $k$-SUM instance has a solution, i.e., a $k$-sparse $s$ such that $As = y \bmod q$, note that $$ By+e = BAs+e = (BA+N)s + (e-Ns) = Cs+e' \bmod{q}$$ where $e' := e-Ns$. If $\sigma'' > \sigma'\sqrt{k}m / \epsilon$, a standard noise-flooding argument shows that the statistical ($\ell_1$) distance between $e-Ns$ and a fresh $e \sim \mathcal{N}_{\sigma''}^{m}$ is at most $\epsilon$; the bound is spelled out below.
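To spell out the flooding step (a sketch: it uses that $s \in \{0,1\}^n$ has $k$ ones, so each coordinate $(Ns)_i \sim \mathcal{N}(0, k\sigma'^2)$ has magnitude $\approx \sigma'\sqrt{k}$, together with the per-coordinate smudging fact $\mathrm{SD}\big(\mathcal{N}(c,\sigma''^2),\mathcal{N}(0,\sigma''^2)\big) \le |c|/(2\sigma'')$): $$ \mathrm{SD}\big(e-Ns,\; e\big) \;\le\; \sum_{i=1}^m \frac{\mathbb{E}\,|(Ns)_i|}{2\sigma''} \;\approx\; \frac{m\,\sigma'\sqrt{k}}{2\sigma''} \;\le\; \epsilon \quad \text{once } \sigma'' \ge \sigma'\sqrt{k}\,m/(2\epsilon). $$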
So, modulo $C$ not being uniformly random and the variation-distance issue with $e'$, this looks like a sparse-LWE instance, which the breaker should solve successfully. In turn, $\mathcal{R}_1$ just solved $k$-SUM.
The only way the breaker can fail is if it distinguishes $C=BA+N$ from a uniformly random matrix. Each column of $C$ is $Ba_j + n_j$, i.e., $m$ LWE samples with secret $a_j$ (the $j$-th column of $A$), so by a simple hybrid argument over the $n$ columns, such distinguishing is as hard as solving LWE. We argue that if this happens, there is an LWE distinguisher.
This is where $\mathcal{R}_2$ comes in. $\mathcal{R}_2$, on input a matrix $C$, samples a random $k$-sparse $s$ and $e' \sim \mathcal{N}_{\sigma''}^m$ (statistically close to the $e'$ above, by the flooding argument), and runs the sparse-LWE solver on input $(C,Cs+e')$. If the sparse-LWE solver succeeds (this can be checked efficiently), $\mathcal{R}_2$ says "uniformly random"; else it says "LWE".
If $\mathcal{R}_1$ did not solve $k$-SUM, then the sparse-LWE solver failed on input $(C,Cs+e')$ where $C=BA+N$ is an LWE matrix. On the other hand, by assumption, the sparse-LWE solver succeeds on input $(C,Cs+e')$ where $C$ is uniformly random. In other words, $\mathcal{R}_2$ distinguishes between an LWE matrix $C$ vs. a random matrix. QED.
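A matching sketch of $\mathcal{R}_2$ under the same assumptions (the efficient success check uses that $b - Ct$ should be small mod $q$ for a correct $t$; the threshold is illustrative):

```python
import numpy as np

def reduction_R2(C, q, k, sigma2, breaker):
    """R_2: decide whether C in Z_q^{m x n} is of the form B A + N (LWE)
    or uniformly random, using the sparse-LWE breaker as an oracle."""
    m, n = C.shape
    s = np.zeros(n, dtype=np.int64)
    s[np.random.choice(n, size=k, replace=False)] = 1             # random k-sparse 0/1 secret
    e = np.rint(np.random.normal(0, sigma2, m)).astype(np.int64)  # e' ~ N_{sigma''}
    b = (C @ s + e) % q
    t = breaker(C, b)
    if t is None:
        return "LWE"             # breaker failed, as it does on R_1's instances
    r = (b - C @ t) % q
    r = np.minimum(r, q - r)     # centered residues |b - C t| mod q
    if np.all(r <= 8 * sigma2):  # efficient success check (threshold illustrative)
        return "uniformly random"
    return "LWE"
```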
It remains to work out parameters. (TODO)
* $\ell$ can be as small as $\omega(\log m) \approx \omega(\log n)$.
* We need $q^{\ell} \gg {n \choose k}$, so $\ell \log q > k \log (n/k) \approx k\log n$. If $q$ is polynomial in $n$, $\ell = \Omega(k)$ suffices. In any case, $k$ can be as small as barely superconstant (see the numeric check after this list).
* There seems to be a tradeoff between the strength of the LWE assumption and that of the $k$-SUM assumption (as should be the case).
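A quick numeric check of the constraint $q^\ell \gg {n \choose k}$ (the parameters are illustrative, not from the note):

```python
import math

def min_ell(n, k, q):
    """Smallest l with q^l >= C(n, k), i.e. l >= log_q C(n, k)."""
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.ceil(log_binom / math.log(q))

# Illustrative numbers: q = n (polynomial modulus), k small.
n, k = 2**20, 8
print(min_ell(n, k, q=n))   # prints 8: l = Theta(k) when log q = Theta(log n)
```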
**What is really going on here?** This theorem is essentially the same as GKPV10, except that the (statistically true) leftover hash lemma has been replaced with the (computationally plausibly true) $k$-SUM assumption, which should, and indeed does, give us more leverage!!
Note/Question to myself: $(\ell \approx k\log n,n,k)$-SUM is syntactically the same as LWE in dimension $n-\ell \approx n-k\log n$ with $n$ samples and a $k$-sparse error. In turn, by the Hermite normal form connection, this is the same as LWE with a $kn/(n-k\log n) \approx k$-sparse secret and a $k$-sparse error. What the theorem above is doing is lifting this further: it says that $k$-sparse LWE with far more samples, i.e., $m = \mathsf{poly}(n)$, is also hard if one additionally assumes LWE.
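One way to see the first correspondence (a sketch; $G$, $t$, $b$ are names I'm introducing): pick $G \in \mathbb{Z}_q^{n \times (n-\ell)}$ whose columns span $\ker A$, which generically has dimension $n-\ell$. An LWE sample vector in dimension $n-\ell$ with $n$ samples and $k$-sparse error $s$ is $b = Gt + s \bmod q$; applying $A$ gives $$ Ab = AGt + As = As \bmod{q}, $$ so recovering the $k$-sparse $s$ from $(A, Ab)$ is syntactically the $(\ell,n,k)$-SUM problem with target $y = Ab$.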