# Jeopardy Writeup
> The choice is yours...choose your curve!
>
> nc 2020.redpwnc.tf 31542
We are provided with a file called `jeopardy.sage`. The server prompts us for `a, b, p` and constructs an elliptic curve using those parameters. It then ensures that the elliptic curve order is at least 200 bits, and finally it checks if the order is prime. Then it asks us to solve an elliptic curve discrete log problem.
What's unusual, however, is the presence of a custom prime checker. Code below:
```python=
def isPrime(n):
if n == 2: return True
if n%2 == 0: return False
r,d = 0,n-1
while d%2 == 0: r += 1; d //= 2
for k in range(1):
a = random.randint(2,n-2)
x = pow(a,d,n)
if x == 1 or x == n-1: continue
for i in range(r-1):
x = pow(x,2,n)
if x == n-1: continue
else: return False
return True
```
This is a Miller-Rabin test with 1 round, which we can confirm by looking at the Miller-Rabin pseudocode on Wikipedia.

An important thing to note is that this is a *probabilistic* test, which means this test may return true for a composite number. The solve method now becomes clear:
1. We generate a composite number $n > 2^{200}$ that can fool the `isPrime` function.
2. We then construct an elliptic curve with order $n$. If all the factors of $n$ are small enough (less than $2^{40}$), then we can easily solve the elliptic curve discrete log problem with the Pohlig-Hellman algorithm.
## Constructing a psuedoprime
### What are we looking for?
If we look back at the code, we realize this isn't exactly Miller-Rabin. The `continue` on the 12th line doesn't actually continue the WitnessLoop, only the inner `for` loop. This thus reduces the isPrime function to checking if $n$ is odd (or 2) and $a^d\equiv\pm1\pmod n$, where $n-1=2^rd$ and $d$ is odd.
Since $n>2^{200}$, we must have $n-1$ is even, so $r\ge1$. If we assume $r=1$, this would imply $n\equiv3\pmod4$. We get $a^{(n-1)/2}\equiv\pm1\pmod n$. Squaring both sides gives us $$a^{n-1}\equiv 1\pmod n.$$ Hey, wait a second - that's just the Fermat primality test! An important thing to note about the Fermat test is that there are composite numbers $n$ that can fool it for *all* $a$ such that $\gcd(a,n)=1.$ These numbers are called **Carmichael numbers**. So, we probably want to look for Carmichael numbers that are $3\bmod4$.
By Korselt's criterion, we have that every Carmichael number $n$ is squarefree, and for every prime $p$ that divides $n$, it is true that $p-1$ is a factor of $n-1$. Importantly, this means that if $n\equiv3\pmod4$, then every prime factor of $n$ must also be equivalent to $3\bmod4$.
### How often will this work?
If we pass a Carmichael number $n=p_1p_2\dots p_k$ that is $3\bmod4$ into the isPrime function, it will check if $a^{(n-1)/2}\equiv\pm1\pmod n$. Let us assume that $a^{(n-1)/2}\equiv b\pmod n$, and $b$ is either $1$ or $-1$. By the Chinese Remainder Theorem, we can split this up into $k$ different congruences: $a^{(p_i-1)/2\cdot x_i}\equiv b\pmod{p_i}$ for each $i$ from $1$ to $k$. Note that we are allowed to write $\frac{n-1}{2}=\frac{p_i-1}{2}\cdot x_i$ by Korselt's criterion. In fact, since $\frac{n-1}{2}$ is odd, every $x_i$ must be odd. So, we have:
\begin{align*}
a^{(p_i-1)/2\cdot x_i}&\equiv b\pmod{p_i}\\
a^{(p_i-1)/2\cdot(2y_i+1)}&\equiv b\pmod{p_i}\\
a^{(p_i-1)y_i+(p_i-1)/2}&\equiv b\pmod{p_i}\\
a^{(p_i-1)y_i}a^{(p_i-1)/2}&\equiv b\pmod{p_i}\\
a^{(p_i-1)/2}&\equiv b\pmod{p_i}
\end{align*}
The last step is justified by Fermat's Little Theorem. Some people may recognize that last result - that's Euler's criterion for the Legendre symbol! The Legendre symbol $\left(\frac{a}{p}\right)$ determines if a number $a$ is a quadratic residue modulo $p$. If $\left(\frac{a}{p}\right)=1$, then $a$ is a quadratic residue modulo $p$; otherwise $\left(\frac{a}{p}\right)=-1$ and $a$ is a quadratic nonresidue modulo $p$ (we ignore the case when it equals $0$ since that happens very rarely). Since quadratic residues comprise half of the numbers from $1$ to $p-1$, the probability that $a^{(p_i-1)/2}\equiv b\pmod{p_i}$ is roughly $\frac12$.
Combining all of the congruences together, we get that the probability that $a^{(n-1)/2}\equiv b\pmod n$ is approximately $(1/2)^k$. So, the probability that $a^{(n-1)/2}\equiv\pm1\pmod n$ is approximately $(1/2)^{k-1}$.
In conclusion, we want a Carmichael number that is $3\pmod4$ and has relatively few factors.
### Finding a desirable number
We will be using [paper #1](https://arxiv.org/abs/1203.6664) and [paper #2](https://arxiv.org/abs/2002.07095) to construct our desired number. However, we're not actually going to use the algorithm described in the first paper, nor the improved algorithm described in the second paper.

Paper #1 describes a method for constructing Carmichael numbers called the **Erdős construction**.

The useful thing about this method is that we can control what primes make up our number. Recall that we are looking for Carmichael numbers that are $3\bmod4$, which means we can only use primes that are $3\bmod4$.
#### Step 1: Constructing $\Lambda$
If we look ahead at step $3$, we want to achieve a number that is $1\bmod\Lambda$. This will be much easier if $\Lambda$ is smaller.
We also look ahead to step $2$. Here, we generate primes such that $p-1$ divides $\Lambda$. However, we only want primes that are $3\bmod4$. So, having more factors of $2$ will not increase the number of primes in $\mathcal{P}$. So, we set $h_0$ to $1$, meaning all $h_i$ will also be set to $1$.
In the end, we let $\Lambda=q_1q_2\cdots q_r$ for some $r$.
#### Step 2: Constructing $\mathcal{P}$
We need $p-1$ to divide $\Lambda$, so we iterate through all factors $f$ of $\Lambda$ and compute $f+1$. We check if this number is a prime that is $3\bmod4$ and greater than $p_r$. We also place some bounds on the primes, since don't want too many factors, but we also don't want the discrete logarithm to take a long time. So, we select the bounds $2^{20}<p<2^{40}$. The set of these numbers makes up $\mathcal{P}$.
#### Step 3: Selecting the primes
We now must find a subset of $\mathcal{P}$ that multiplies to $1\bmod\Lambda$. This is the modular subset product problem, which is what paper #2 focuses on. They describe a straightforward algorithm based on the birthday attack.

They also describe a more memory-efficient method of generating Carmichael numbers, but we don't really care.
We compute the product of every possible subset of the first half of $\mathcal{P}$, and the *inverse* of the product of every possible subset of the second half of $\mathcal{P}$. Then we check if the results have any numbers in common. If so, we can combine the two subsets and they will have a product of $1\bmod\Lambda$. This takes up $O\left(2^{\text{len}(\mathcal{P})/2}\right)$ memory and $O\left(2^{\text{len}(\mathcal{P})/2}\right)$ time, which is really inefficient. But we'll just keep the numbers small ($r=10$) and hope we find something without blowing up our computer.
#### Code
We combine these three steps into a python script below.
```python=
from gmpy2 import mpz, invert, is_prime
from operator import mul
from functools import reduce
from itertools import *
# copied from stackoverflow (https://stackoverflow.com/a/19391111)
def psieve():
import itertools
yield from (2, 3, 5, 7)
D = {}
ps = psieve()
next(ps)
p = next(ps)
assert p == 3
psq = p*p
for i in itertools.count(9, 2):
if i in D: # composite
step = D.pop(i)
elif i < psq: # prime
yield i
continue
else: # composite, = p*p
assert i == psq
step = 2*p
p = next(ps)
psq = p*p
i += step
while i in D:
i += step
D[i] = step
# transcribed from paper
def BA_MSPP(Lambda,P,c):
n = len(P)-1
I1 = P[0:(n+1)//2+1]
I2 = P[(n+1)//2+1:n+1]
U1 = {reduce(mul,u)%Lambda:u for u in chain.from_iterable(combinations(I1, r) for r in range(1,len(I1)+1))}
print("U1 generated")
U2 = {invert(reduce(mul,u),Lambda):u for u in chain.from_iterable(combinations(I2, r) for r in range(1,len(I2)+1))}
print("U2 generated")
inter = U1.keys()&U2.keys()
if inter:
for res in inter:
yield U1[res] + U2[res]
if 1 in U1:
yield U1[1]
if 1 in U2:
yield U2[1]
primes = psieve()
exps = [1]*10
primes = [mpz(next(primes)) for i in range(len(exps))]
Q = []
Lambda = mpz(1)
for i in range(len(primes)):
Q.append(mpz(primes[i]**exps[i]))
Lambda *= Q[-1]
P = []
# copied from stackoverflow (https://stackoverflow.com/a/171784)
def divisorGen():
factors = list(zip(primes[1:],exps[1:]))
nfactors = len(factors)
f = [0] * nfactors
while True:
yield reduce(mul, [factors[x][0]**f[x] for x in range(nfactors)], 1)
i = 0
while True:
f[i] += 1
if f[i] <= factors[i][1]:
break
f[i] = 0
i += 1
if i >= nfactors:
return
print(Lambda, Lambda.bit_length())
for pr in divisorGen():
prod = 2*pr+1
if 2**40>prod>2**20 and Lambda%prod and is_prime(prod):
P.append(prod)
P.sort()
print(len(P))
for i in BA_MSPP(Lambda,P,1):
if len(i)%2:
prod = reduce(mul,i)
if prod.bit_length()>=200:
print(prod,prod.bit_length())
print(list(map(int,i)))
```
We run the program and get quite a few outputs. The one with the smallest number of factors is $\boxed{13144149227846859391587797565803336945061475809444187482498931}$, a 204-bit number which factors into 1193011 \* 1247291 \* 1272811 \* 1939939 \* 2661331 \* 8036887 \* 8731031 \* 14804791 \* 1293938647. Since there are $9$ factors, the probability of this succeeding is about $\frac{1}{256}\approx0.4\%$. This is small, but fine since we can make multiple requests against the server.
## Constructing an Elliptic Curve
If we look up "construct elliptic curve with given order", the first result is [this paper](https://www.math.leidenuniv.nl/scripties/Broker.pdf) titled "Constructing Elliptic Curves of Prescribed Order" which goes into detail about how to construct elliptic curves. Somewhere in there it mentions complex multiplcation and it sounds pretty complicated. Perhaps someone else has already implemented this?
So we look up "site:github.com Constructing Elliptic Curves of Prescribed Order" and land on this tool called [ecgen](https://github.com/J08nY/ecgen). In the README we see that we can generate an elliptic curve with a given order using the `-n / --order=ORDER` option, and the tool uses complex multiplication.
The README seems to imply that the given order must be prime, but if we dig into the source we find the files `src/cm/cm_prime.c` *and* `src/cm/cm_any.c`.

So in fact we can generate a curve with any order.
If we look at `test/ecgen.sh` we can see some example usage, which tells us that we can put in the prime factors of the order instead of the order itself.

In the end our final command is
```bash
./ecgen --order=1193011,1247291,1272811,1939939,2661331,8036887,8731031,14804791,1293938647 --fp 203
```
In the end this gives us some output. These are the only important bits:
```json=
"field": {
"p": "0x082dfbd9185fea44a0ce334dc3f878152b1fddc4e7c79d4790e1"
},
"a": "0x069b456b42f15ca8a64e7cbba890951a5dfabd56ee2e8fb7a444",
"b": "0x0514285447f4934e80aad9187e6c89e5e7345f91af3cc10a3708"
```
This gives us our desired elliptic curve parameters.
## Running Against Server
Recall that our constructed order only fools the `isPrime` checker about $0.4\%$ of the time. So, we need to set up our script to continue retrying connections until one recognizes our order as prime, which will take roughly 250 connections.
We will use sage for this since it has a builtin discrete_log method that uses Pohlig-Hellman. I didn't have pwntools installed on sage so I borrowed a netcat implementation from [this Github script](https://gist.github.com/leonjza/f35a7252babdf77c8421) and modified it a bit. Final solve script below:
```python=
import socket
import sys
class Netcat:
def __init__(self, ip, port):
self.buff = ""
self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
self.socket.connect((ip, port))
def read(self, length = 1024):
d = self.socket.recv(length)
sys.stdout.write(d)
return d
def read_until(self, data):
while not data in self.buff:
d = self.socket.recv(1024)
sys.stdout.write(d)
self.buff += d
pos = self.buff.find(data)
rval = self.buff[:pos + len(data)]
self.buff = self.buff[pos + len(data):]
return rval
def write(self, data):
self.socket.send(data)
def close(self):
self.socket.close()
p = 0x082dfbd9185fea44a0ce334dc3f878152b1fddc4e7c79d4790e1
a = 0x069b456b42f15ca8a64e7cbba890951a5dfabd56ee2e8fb7a444
b = 0x0514285447f4934e80aad9187e6c89e5e7345f91af3cc10a3708
field = GF(p)
E = EllipticCurve(field, [a,b])
print(E)
while 1:
nc = Netcat('2020.redpwnc.tf',31542)
nc.write(str(a)+'\n')
nc.write(str(b)+'\n')
nc.write(str(p)+'\n')
nextline = nc.read_until('\n')
if 'Traceback' in nextline:
nc.close()
continue
x,y,z = map(int,nextline.strip().split("p: ")[-1].strip("()").split(" : "))
G = E(x,y)
print("G = "+str(G))
nextline = nc.read_until('\n')
x,y,z = map(int,nextline.strip().split("p: ")[-1].strip("()").split(" : "))
pub = E(x,y)
print("pub = "+str(pub))
log = G.discrete_log(pub)
print(log)
nc.write(str(log)+'\n')
while 1:
nc.read()
break
```
After running this script for a while, we obtain the flag: `flag{why_cant_penguins_fly_892839288}`.
omg tux <3