Decode This

Cryptography, 200 points

Description

Ram has something to show you, which can be of great help. It definitely contains the piece of text "pctf" and whatever follows it is the flag. Can you figure out what it is?

Solution

We are given a ciphertext :

vuqxyugfyzfjgoccjkxlqvguczymjhpmjkyzoilsxlwtmccclwizqbetwthkkvilkruufwuu

along with a Python script :

import random

file = open("secret.txt","r")
secret = file.read()

flag = ""
for i in secret:
    if i.isalpha():
        flag += i
l = len(flag)

key = [[int(random.random()*10000) for e in range(2)] for e in range(2)]

i = 0
ciphertext = ""

while i <= (l-2):
    x = ord(flag[i]) - 97
    y = ord(flag[i+1]) - 97
    z = (x*key[0][0] + y*key[0][1])%26 + 97
    w = (x*key[1][0] + y*key[1][1])%26 + 97
    ciphertext = ciphertext + chr(z) + chr(w)
    i = i+2

cipherfile = open('ciphertext.txt','w')
cipherfile.write(ciphertext)

So this clearly looks like a Hill cipher with a 2x2 matrix. The key matrix is randomly populated with integers up to 4 digits, but it doesn't matter since all the operations are done modulus 26 ; it is equivalent to search for a matrix whose coefficients are in

{0,...,25}.

Here is a little reminder on the Hill cipher : choose a matrix

P=(abcd)

with coefficients

(a,b,c,d) in
{0,...,25}
such that its determinant is invertible modulus 26, i.e there exists
p
such that
p(adbc)1(mod26)
.

You split the plaintext in groups of two letters, and for each pair of letters

m1,m2 (consider A = 0, Z = 25), the encrypted pair will be

(c1c2)=P(m1m2)

In order to decrypt the ciphertext, you need to run the same calculations but with

P1, the inverse matrix of
P
(mod 26), which is here

P1=(adbc)1(dbca)(mod26)

Now we could just brute-force all the possible decryption matrices (we know there are less than

264=456976 which is easy to brute-force). We can do much more intelligent though, using a KPTA (Known Plain Text Attack) that only runs once through the ciphertext.

Indeed, let's denote by

(p1,p2) a pair of letters from the plaintext and
(c1,c2)
their encrypted form. Same thing for
(p1,p2)
and
(c1,c2)
which are the next two letters (for instance, "PC" and "TF"). We have :

ap1+bp2=c1cp1+dp2=c2ap1+bp2=c1cp1+dp2=c2

and if we let

x=(a,b,c,d),
y=(c1,c2,c1,c2)
and

Q=(p1p20000p1p2p1p20000p1p2)

we thus have

Qx=y.
Now if we choose a decent enough quadruplet
(p1,p2,p1,p2)
, the matrix
Q
has a determinant which is invertible modulus 26. Actually, choosing the crib "PCTF" does yield
detQ=1369
which is invertible (
3×(1369)1(mod26)
).

Now, we simply have to compute

Q1y on each couple of letters in the ciphertext to have a candidate for the encryption matrix, which greatly reduces the amount of matrices to try out.

Here is the final script.

import numpy as np
import math

det = lambda A: int(round(np.linalg.det(A)))
numbers = lambda T: list(map(lambda x: ord(x) - 97, list(T)))
letters = lambda T: ''.join(chr(97 + l) for l in T)

def modMatInv(A, p):
    n = len(A)
    adj = np.zeros(shape=(n, n))
    for i in range(n):
        for j in range(n):
            adj[i][j] = ((-1)**(i + j) * det(minor(A, j, i))) % p
    return (modInv(det(A), p) * adj) % p

def modInv(a, p):
    for i in range(1, p):
        if (i * a) % p == 1:
            return i
    raise ValueError("%s has no inverse mod %s" % (a, p))

def minor(A, i, j):
    n = len(A)
    minor = np.zeros(shape=(n - 1, n - 1))
    p = 0
    for s in range(n - 1):
        if p == i:
            p += 1
        q = 0
        for t in range(n - 1):
            if q == j:
                q += 1
            minor[s][t] = A[p][q]
            q += 1
        p += 1
    return minor

def decrypt(ciphertext, M):
    ciphertext = numbers(ciphertext)
    decrypted = np.array([
        M.dot(ciphertext[i:i + 2]) % 26
        for i in range(0, len(ciphertext), 2)
    ])
    decrypted = map(int, list(decrypted.reshape(-1)))
    return letters(decrypted)

def kpta(ciphertext, crib):
    u, v, w, z = numbers(crib)
    Q = np.array([
        [u, v, 0, 0],
        [0, 0, u, v],
        [w, z, 0, 0],
        [0, 0, w, z],
    ])
    Qinv = modMatInv(Q, 26)
    c = numbers(ciphertext)
    for i in range(0, len(c) - 4):
        y = np.array(c[i:i+4])
        x = Qinv.dot(y) % 26
        M = np.array([[x[0], x[1]], [x[2], x[3]]])
        try:
            M_ = modMatInv(M, 26)
            print(M.flatten(), decrypt(ciphertext, M_))
        except:
            pass

C = "vuqxyugfyzfjgoccjkxlqvguczymjhpmjkyzoilsxlwtmccclwizqbetwthkkvilkruufwuu"
kpta(C, "pctf")

Here is what it outputs.

[ 9. 21. 12. 11.] pctfaqnrrfqhcscypuqthzmkxpesklxcpurfeubkqtrvekcyzwtrrrdprvzclxnbnlugduug
[19. 16. 10. 15.] bskhsyytifthgqqmfgnzozuownislhjafgifuijqnzadmgqmbgezcxctadlcgngvoxeqtueq
[ 5. 14. 13.  9.] nvsryigpixtoyqugbrluolkioxikvibxbrixyyhplugfwgugpbkxadsfgflrslihkzsitbsi
[ 7.  4. 12. 19.] bkidyqgrkvbtmaigpctfybqggdkinztqpckvakbotfilyaigvsapchabilryepibklcibgci
[12. 19. 17.  1.] fnveiopkrwqxaoqqbtwbhgqivgewgjvtbtrwsohdwbzkacqqjzbixahezkzbbehwzieedxee
[17.  1. 13. 12.] utndygtvjbxesmoyidpctfagxbworgkbidjbmsevpchhkyoysrfbbzjhhhodlfvpxjkgkrkg
[21. 11. 14. 11.] pezbiupfltslkyamluyndryctxygeftklultmczkyntnuwamrefdrvjvtnhsrlhlzraqfyaq
[25. 12.  1.  9.] vxknqmedeldsyweunrpouduagfeypctfnrelggnppouxwseutjwjqhyluxzhezoryfosdfos
[21.  3.  0.  5.] telpiepbtfyhoieqpcexrzwenfgsyrnspctfumboexbjcqeqpurfzvpjbjfcnzhxztoeluoe
[17. 15.  2.  5.] nehhsclhtnynsgckrqsvrnoortgagzrgrqtnskpisvlpkmckvwbpnvhblpfahrtfbduwlauw
[15. 10. 18. 17.] ramhasalittlesecretforyourighthereitispctfilikeclimbinghillswhataboutyou
[13. 25. 22. 15.] rsjxwczhlpsvausytecddxsujlyckrjytelpemdccdhraosypctfvxdjhrhgtddfhdygfiyg
[17. 12.  4.  7.] riqjogkvefdhsouaxshfuzgsalesddnyxsefwwfwhfslkcuaxquhidkxslzcadwpijsadusa
[ 4.  7. 21. 21.] bvdiawnmpgiloeowbxivxegqfqcgsrfdbxpgcmhfivpcciowtndspqvwpctfzcnqnukmvlkm
[25. 11. 22.  9.] lkltouxfffuhmeosrayjzzwmjbssmjjoraffkopayjtzyiosnopvjhbrtzpctfjlvrkyhuky
[23.  5. 20.  3.] xsdfqerbvhgpawgepqsfbjgkdxiumjdkpqvhkobisfllasgefapvhxbrllliplbxltiotcio

This contains the original plaintext, as well as the flag. Enjoy!

(yes, I do know brute-forcing all the matrices would have actually been faster to code but I think this solution is way more elegant :p)