PicoCTF Walkthrough: Part 4 (Tiếng Việt)

<h1 style="text-align:center; color:#e1829c"> PicoCTF Walkthrough: Part 4 </h1> Dưới đây là các lời giải cho một số thử thách của **picoCTF** mà mình cảm thấy thú vị và hữu ích. ## It's Not My Fault 1 ![image](https://hackmd.io/_uploads/HJQTV_RClx.png) :::spoiler not_my_fault.py ```python= #!/usr/bin/python3 -u import random import string import hashlib import time from Crypto.Util.number import inverse, getPrime, bytes_to_long, GCD from sympy.ntheory.modular import solve_congruence FLAG = open('flag.txt', 'r').read() def CRT(a, m, b, n): val, mod = solve_congruence((a, m), (b, n)) return val def gen_key(): while True: p = getPrime(512) q = getPrime(512) if GCD(p-1, q-1) == 2: return p, q def get_clue(p, q, BITS): while True: d_p = random.randint(1, 1 << BITS) d_q = random.randint(1, q - 1) if d_p % 2 == d_q % 2: d = CRT(d_p, p - 1, d_q, q - 1) e = inverse(d, (p - 1) * (q - 1)) print("Clue : ", e) return def get_flag(p, q): start = time.time() ans = int(input()) if (time.time() - start) > (15 * 60): print("Too long!") exit() else: if ans == p + q: print(FLAG) else: print("oops...") #PoW vals1 = "".join([random.choice(string.digits) for _ in range(5)]) vals2 = "".join([random.choice(string.hexdigits.lower()) for _ in range(6)]) user_input = input("Enter a string that starts with \"{}\" (no quotes) which creates an md5 hash that ends in these six hex digits: {}\n".format(vals1, vals2)) user_hash = hashlib.md5(user_input.encode()).hexdigest() if user_input[:5] == vals1 and user_hash[-6:] == vals2: p, q = gen_key() n = p * q print("Public Modulus : ", n) get_clue(p, q, 20) get_flag(p, q) ``` ::: Nói ngắn gọn về mục tiêu của bài này thì ta cần phải vượt qua được **Prook of Work (PoW)**, chính là cái yêu cầu oái oăm về cái hàm băm **MD5**, sau đó ta phải tìm được $p$ và $q$ từ $n$ và $e$ để tính tổng $p + q$, từ đó khôi phục lại **flag**. Về phần **PoW**, vì yêu cầu về **MD5** không quá khắt khe khi chỉ cần $6$ ký tự cuối của giá trị băm giống với giá trị yêu cầu của server, từ đó ta có thể tiến hành brute-force bằng cách cho $5$ giá trị đầu mà server đầu đưa cho ta rồi nối chuỗi với các ký tự brute-force. Theo kiểm nghiệm thực tế, thì nếu may mắn chúng ta chỉ cần vài chục nghìn vòng lặp là đã vượt qua, nhưng nếu xui xẻo thì phải mất thậm chí đến hàng trăm triệu vòng lặp thì mới vượt qua được. Sau khi đã vượt qua đượt vòng **PoW**, server sẽ cho ta $n$ và $e$ để ta có thể tìm lại được $p$ và $q$ (hoặc ít nhất $p + q$) để nộp lại $p + q$ cho server. Nếu không có lỗi hoặc điểm yếu trong triển khai, thì nhiệm vụ này không thể thực hiện được, tuy nhiên vì đây là một thử thách CTF nên chắc chắn sẽ có một điểm yếu nằm ở đâu đó. Sau một lúc tìm kiếm và phân tích đề bài, mình phát hiện giá trị $d_p$ có không gian nghiệm rất nhỏ, chỉ $20$ bit, vì vậy mình nghĩ ngay đến việc brute-force $d_p$ và khôi phục $p$, $q$ với thông tin mới này. Vì đây là bài đăng dành cho các bài khó nên mình sẽ mặc định các bạn đang đọc có trình độ cao rồi, ta sẽ có một tính chất như sau trong **RSA** (chứng minh thì bạn có thể tự mình làm, vì cái này không cần kiến thức toán học quá cao siêu): $$ c^{d_p} = m^{ed_p} \equiv m \pmod{p} \Leftrightarrow p|(c^{d_p} - m) $$ trong đó $m$ là văn bản rõ và $c$ là văn bản mã. Do đó nếu nếu ta brute-force đến $d_p$ đúng, thì ta sẽ thu được $p = \gcd(c^{d_p} - m, n)$. Sau khi brute-force ra $p$, ta có thể tìm lại được $q$ và sau đó tính được $p + q$ và nộp lại cho server, vậy là ta đã hoàn thành hướng giải để tìm lại **flag**. >[!Note] Thông tin thêm >Không phải lúc nào brute-force cũng ra kết quả, nguyên nhân vì sao lại như vậy thì bạn có thể đọc [ở đây](https://hackmd.io/@0160ca14/H18qRGuile#RAS-KMA-CTF-2025---l%E1%BA%A7n-2), nó tương tự với trường hợp của bài này (đọc ở phần chú ý trong thử thách). Dưới đây sẽ là toàn bộ lời giải cho thử thách này: :::spoiler solution.py ```python= from pwn import remote from Crypto.Hash import MD5 import re, itertools, string from tqdm import tqdm from Crypto.Util.number import isPrime from multiprocessing.dummy import Pool import random, gmpy2 HOST = "mercury.picoctf.net"; PORT = 27379 connection = remote(HOST, PORT) message = connection.recvline().rstrip().decode() match = re.findall(r"[0-9a-f]{3,6}", message) prefix = match[0]; suffix = match[1] for item in tqdm(itertools.product(string.printable, repeat=10), desc="[~] Brute-forcing MD5...", total= len(string.printable)**10): my_string = prefix + "".join(item) if(MD5.new(my_string.encode()).hexdigest()[-6:] == suffix): connection.info(f"Found string: {my_string}") break connection.sendline(my_string.encode()) connection.recvuntil(b"Public Modulus : ") n = int(connection.recvline().strip().decode()) connection.recvuntil(b"Clue : ") e = int(connection.recvline().strip().decode()) # Hàm khôi phục lại số nguyên tố random_message_list = [gmpy2.mpz(random.randrange(2, n)) for _ in range(7)] random_cipher_list = [gmpy2.powmod(r, e, n) for r in random_message_list] def find_prime(d_p): for random_message, random_cipher in zip(random_message_list, random_cipher_list): p = gmpy2.gcd((gmpy2.powmod(random_cipher, d_p, n) - random_message) % n, n) if(p == 1): continue assert isPrime(p) and n%p == 0, "Critical Error 1!" q = n//p return p, q return None for result in tqdm(Pool(16).imap_unordered(find_prime, range(2, 2**20 + 1)), desc="Finding prime...", total=2**20): if not result: continue p, q = result response = p + q break connection.sendline(f"{response}".encode()) flag = connection.recvline().strip().decode() connection.success(f"Got flag: {flag}") connection.close() # [+] Got flag: picoCTF{1_c4n'7_b3l13v3_17'5_n07_f4ul7_4774ck!!!} ``` ::: :::success **Flag: picoCTF{1_c4n'7_b3l13v3_17'5_n07_f4ul7_4774ck!!!}** ::: >[!Warning] Cảnh báo > Bạn cần phải sử dụng các hàm và phương thức toán học trong thư viện `gmpy2` và lập trình so cho tối ưu về tốc độ chạy nhiều nhất, kết hợp với đa luồng thì mới có thể brute-force thành công trong giới hạn thời gian $15$ phút được! >[!Note] Tại sao các hàm và phương thức toán học trong `gmpy2` lại chạy nhanh hơn? > Các hàm và phương thức toán học trong thư viện `gmpy2` chạy nhanh hơn các hàm và phương thức toán học có sẵn (hoặc trong thư viện `math`) của **python** vì chúng được viết bằng **C/Assembly** và chúng được viết tối ưu về mặt thuật toán, cú pháp và thời gian chạy nhất có thể. > Tốc độ của chúng (theo mình thực nghiệm) nhanh hơn từ $9$ đến $10$ lần so với tốc độ của các hàm và phương thức thông thường (hoặc trong thư viện `math`). ## Scrambled: RSA ![image](https://hackmd.io/_uploads/HJcRHQJ1-e.png) Bài này là một bài không có file thử thách, chỉ có file kết nối, và cách giải bài này cũng phải đoán mò khá nhiều mặc dù vẫn cần tư duy. Ban đầu khi ta kết nối sẽ được cho thông tin như sau: ![image](https://hackmd.io/_uploads/SJIQeHy1Wg.png) Ta thấy **flag** rất kỳ lạ và dài, trong khi $e$ thì bình thường, vì vậy mình tiếp tục gửi tiếp một vài văn bản mã. ![image](https://hackmd.io/_uploads/Hk5dZBk1Wx.png) Sau một thời gian dài phân tích, mình đã phát hiện các quy luật thú vị và có thể khai thác để tìm lại **flag**. Gọi `E(m)` là hàm mã hóa văn bản rõ `m` của server, ta có các tính chất như sau: - Với $1$ chữ cái `c` bất kỳ, ta luôn có chuỗi chữ số `E(c)` sẽ nằm trong chuỗi chữ số `E(c||m)` với `m` là chuỗi bất kỳ. - Với $2$ chữ cái `a` và `b` bất kỳ, gọi `E((a||b)\a)` là chuỗi chữ số được tạo khi loại bỏ chuỗi chữ số `E(a)` ra khỏi chuỗi chữ số `E(a||b)`. Khi đó, ta có các tính chất như sau: - Chuỗi chữ số `E((a||b)\a)` sẽ nằm trong chuỗi chữ số `E(a||b||m)` với `m` là chuỗi bất kỳ. - Mở rộng thêm, ta sẽ có chuỗi chữ số `E(((a||b||c)\a)\(E(a||b)\a))` sẽ thuộc chuỗi chữ số `E(a||b||c||m)` với `m` là chuỗi bất kỳ. Như vậy, với `m` là một chuỗi bất kỳ ta luôn có `E(m)` được cấu tạo bằng cách kết hợp các chuỗi xác định, ta gọi nó là **segment**. - Các giá trị `E(m)` mà server trả vể luôn khác nhau, bởi vì server đã hoán vị các **segment** rồi mới trả về cho chúng ta. Vì bài này phải thực nghiệm nhiều nên các bạn cần phải tự làm để thấy và hiểu rõ hơn các tính chất này. Từ các tính chất này ta có hướng giải như sau: - Brute-force từng ký tự của **flag**, sau đó tìm **segment** được tạo bởi ký tự đó, nếu ký tự đó đúng là ký tự của **flag** thì **segment** sẽ nằm trong văn bản mã của **flag** mà server trả về (`E(flag)`). - **segment** của ký tự đầu tiên `a` trong **flag** chính là `E(a)`, **segment** của hai ký tự đầu tiên `ab` trong **flag** chính là `E(a||b)\a`,... - Cứ lặp lại như vậy cho đến khi ta tìm được **flag** Vì bài này là bài cần suy luận nhiều nên các bạn cần tự kiểm nghiệm và suy luận để hiểu rõ hơn về bài này. Dưới đây là toán bộ lời giải cho thử thách này: :::spoiler solution.py ```python= from pwn import remote from tqdm import tqdm import string HOST = "mercury.picoctf.net"; PORT = 6276 connection = remote(HOST, PORT) connection.recvuntil(b"flag: ") encrypted_flag = connection.recvline().strip().decode() connection.recvuntil(b"n: ") n = int(connection.recvline().strip().decode()) connection.recvuntil(b"e: ") e = int(connection.recvline().strip().decode()) flag = "" segment_list = [] find_flag = False while(True): # Trong thực tế ta phải thay "0123456789_adpicoCTF{}b" bằng string.printable, nhưng vì picoCTF cho cái kết nối quá đần độn nên mình phải chơi bẩn như thế này để cào flag! for char in tqdm("0123456789_adpicoCTF{}b", desc="Recovering character...", total=len("0123456789_adpicoCTF{}b") - 1): possible_flag = flag + char connection.recvuntil(b"I will encrypt whatever you give me: ") connection.sendline(possible_flag.encode()) connection.recvuntil(b"Here you go: ") possible_encrypted_flag = connection.recvline().strip().decode() for segment in segment_list: possible_encrypted_flag = possible_encrypted_flag.replace(segment, "") if(possible_encrypted_flag in encrypted_flag): segment_list.append(possible_encrypted_flag) flag = possible_flag if(char == "}"): find_flag = True break if find_flag: break connection.success(f"Got flag: {flag}") connection.close() # [+] Got flag: picoCTF{bad_1d3a5_2268684} ``` ::: :::success **Flag: picoCTF{bad_1d3a5_2268684}** ::: ## SRA ![image](https://hackmd.io/_uploads/HJ1e9D1Jbx.png) :::spoiler chal.py ```python= from Crypto.Util.number import getPrime, inverse, bytes_to_long from string import ascii_letters, digits from random import choice pride = "".join(choice(ascii_letters + digits) for _ in range(16)) gluttony = getPrime(128) greed = getPrime(128) lust = gluttony * greed sloth = 65537 envy = inverse(sloth, (gluttony - 1) * (greed - 1)) anger = pow(bytes_to_long(pride.encode()), sloth, lust) print(f"{anger = }") print(f"{envy = }") print("vainglory?") vainglory = input("> ").strip() if vainglory == pride: print("Conquered!") with open("/challenge/flag.txt") as f: print(f.read()) else: print("Hubris!") ``` ::: Bài này đơn giản là một bài RSA nhưng đặt tên biến theo [**bảy đại tội**](https://en.wikipedia.org/wiki/Seven_deadly_sins), bài này sẽ chỉ cho chúng ta $c$, $d$ và $e$ và bắt chúng ta tìm lại $m$ để nộp cho server lấy **flag**. Theo công thức, ta có: $$ de \equiv 1 \pmod{\varphi(n)} \Leftrightarrow de - 1 = k_{\varphi}\varphi(n) $$ Vì $p$ và $q$ là các số nguyên tố có độ lớn $128$ bit nên ta sẽ dùng hàm `divisors` của **SageMath** để tìm ra những số ước chính là $p$ (hoặc $q$, tên gọi không quan trọng trong bài này). Cụ thể, ta sẽ brute-force từng ước $a$ của $de - 1$ và kiểm tra xem $a + 1$ có phải số nguyên tố không và ước này có độ lớn $128$ bit hay không. Sau khi có được $p$, ta sẽ tiếp tục tiến hành brute-force $k_{\varphi}$ với $k_{\varphi} = \overline{1, e-1}$ (chứng minh cũng đơn giản nên các bạn hãy tự chứng minh). Với mỗi giá trị của $k_{\varphi}$, ta sẽ kiểm tra xem $\frac{de - 1}{k_{\varphi}(p - 1)} + 1$ có phải là số nguyên tố và có độ lớn $128$ bit hay không. Nếu có thì xin chúc mừng khả năng rất cao bạn đã tìm được $q = \frac{de - 1}{k_{\varphi}(p - 1)} + 1$. Công việc còn lại thì khá đơn giản chúng ta chỉ cần tìm lại $m$ và nộp cho server để lấy **flag** mà thôi 🥳! Dưới đây là toàn bộ lời giải cho thử thách này: :::spoiler solution.py ```python= from pwn import remote from sage.all import divisors, is_prime from tqdm import trange from Crypto.Util.number import long_to_bytes import re HOST = "saturn.picoctf.net"; PORT = 56457 connection = remote(HOST, PORT) connection.recvuntil(b"anger = ") c = int(connection.recvline().strip().decode()) connection.recvuntil(b"envy = ") d = int(connection.recvline().strip().decode()) e = 65537 multiple_of_phi = d*e - 1 found_flag = False for divisor in divisors(multiple_of_phi): p = int(divisor + 1) if(is_prime(p) and p.bit_length() == 128): for k_phi in trange(1, e): if(multiple_of_phi % k_phi != 0): continue q = (multiple_of_phi//(k_phi*divisor)) + 1 if(is_prime(q) == False or q.bit_length() != 128): continue n = p*q m = pow(c, d, n) message = long_to_bytes(m) if(message.isalnum()): connection.sendlineafter(b"> ", message) connection.recvuntil(b"Conquered!") flag = connection.recvall() found_flag = True break if found_flag: break flag = re.search(rb"picoCTF\{[!-~]+\}", flag).group() connection.success(f"Got flag: {flag.decode()}") # [+] Got flag: picoCTF{7h053_51n5_4r3_n0_m0r3_b2f9b414} ``` ::: :::success **Flag: picoCTF{7h053_51n5_4r3_n0_m0r3_b2f9b414}** ::: >[!Note] >Không phải lúc nào ta cũng chạy ra lời giải, vì điều này phụ thuộc vào $de - 1$ (cụ thể là độ trơn hoặc độ bán trơn). Những lúc chạy mà thấy lâu quá, bạn chỉ cần ngắt chương trình và cho chạy lại là được. ## ChaCha Slide ![image](https://hackmd.io/_uploads/B16WlKk1bl.png) :::spoiler challenge.py ```python= import secrets import hashlib from Crypto.Cipher import ChaCha20_Poly1305 flag = open("flag.txt").read().strip() def shasum(x): return hashlib.sha256(x).digest() key = shasum(shasum(secrets.token_bytes(32) + flag.encode())) # Generate a random nonce to be extra safe nonce = secrets.token_bytes(12) messages = [ "Did you know that ChaCha20-Poly1305 is an authenticated encryption algorithm?", "That means it protects both the confidentiality and integrity of data!" ] goal = "But it's only secure if used correctly!" def encrypt(message): cipher = ChaCha20_Poly1305.new(key=key, nonce=nonce) ciphertext, tag = cipher.encrypt_and_digest(message) return ciphertext + tag + nonce def decrypt(message_enc): ciphertext = message_enc[:-28] tag = message_enc[-28:-12] nonce = message_enc[-12:] cipher = ChaCha20_Poly1305.new(key=key, nonce=nonce) plaintext = cipher.decrypt_and_verify(ciphertext, tag) return plaintext for message in messages: print("Plaintext: " + repr(message)) message = message.encode() print("Plaintext (hex): " + message.hex()) ciphertext = encrypt(message) print("Ciphertext (hex): " + ciphertext.hex()) print() print() user = bytes.fromhex(input("What is your message? ")) user_message = decrypt(user) print("User message (decrypted): " + repr(user_message)) if goal in repr(user_message): print(flag) ``` ::: Bài này sẽ là một bài mã hóa khối (và là phần mình ghét nhất, toàn **tool thủ** chơi với nhau), và mã hóa khối sử dụng trong bài này sẽ là [**ChaCha20-Poly1305**](https://en.wikipedia.org/wiki/ChaCha20-Poly1305). ![image](https://hackmd.io/_uploads/BkDDni1yWe.png) (Hình ảnh để dọa những người chơi tuy mới nhưng thích đọc lời giải bài khó 🐧) Thực ra nhìn vào cái ảnh kia chúng ta đã hình dung sơ được cách hoạt động của mã hóa **ChaCha20-Poly1305** rồi. Dưới đây sẽ là mô tả chi tiết hơn nhưng không quá sâu về cách hoạt động của chúng (có dựa vào AI). >[!Note] ChaCha20 Hoạt động như thế nào? (Phần Bảo mật) >ChaCha20 là một **mã hóa luồng**. Nó cực kỳ đơn giản: >1. Nó lấy `key` và `nonce` để tạo ra một dòng byte giả ngẫu nhiên gọi là **Keystream**. Dòng Keystream này là duy nhất cho mỗi cặp `(key, nonce)`. >2. Để mã hóa, nó chỉ cần lấy bản rõ (Plaintext) **XOR** với Keystream. >3. Để giải mã, nó làm điều tương tự: lấy bản mã (Ciphertext) **XOR** với Keystream. > >$$ >\begin{align*} >\text{Ciphertext} = \text{Plaintext} ~ \oplus ~ \text{Keystream} \\[5pt] >\text{Plaintext} = \text{Ciphertext} ~ \oplus ~ \text{Keystream} >\end{align*} >$$ >[!Note] Poly1305 Hoạt động như thế nào? (Phần Toàn vẹn) >Poly1305 là một **Mã xác thực tin nhắn (MAC)**. Nhiệm vụ của nó là tạo ra một con dấu "chống giả mạo" gọi là **tag**. Nếu kẻ tấn công thay đổi dù chỉ 1 bit của ciphertext, tag sẽ không còn hợp lệ. > >Nó hoạt động dựa trên một công thức toán học, nhưng bạn có thể hình dung nó như sau: >$$ >\text{tag} = \text{Poly1305}(\text{auth_key}, \text{ciphertext} + \text{additional_data}) >$$ > >Điểm mấu chốt ở đây là `auth_key`. Đây là một **khóa xác thực dùng một lần** được tạo ra từ `key` và `nonce` ban đầu. Cụ thể, 32 byte đầu tiên của Keystream được tạo bởi ChaCha20 chính là `auth_key` này. Một điều khá thú vị là **ChaCha20-Poly1305** là một hệ mã hóa được kết hợp từ hai thuật toán **ChaCha20** và **Poly1305**. ChaCha20 và Poly1305 là hai thuật toán hoàn toàn riêng biệt được thiết kế bởi cùng một nhà mật mã học thiên tài, Daniel J. Bernstein, và sau đó được ghép lại với nhau để tạo thành một "cặp đôi hoàn hảo". (đoạn văn lấy từ AI). Quay trở lại thử thách, thử thách này sẽ liên quan đến lỗ hổng cốt lõi là **Nonce Reuse**, khi mà nhìn vào file thử thách ta sẽ thấy `nonce` được tái sử dụng và không hề thay đổi. Đây là một tấn công kinh điển và phức tạp nên mình sẽ không trình bày nguyên nhân sâu xa và lỗi bên trong khiến hệ thống này trở nên dễ bị phá mà chỉ cần đi tìm script triển khai tấn công là được. Dưới đây là toàn bộ lời giải cho thử thách này: :::spoiler chacha_poly1305_forgery.py ```python= from sage.all import GF, ZZ, PolynomialRing from Crypto.Cipher.ChaCha20 import ChaCha20Cipher def _poly1305(msg:bytes, key:bytes, byteorder='little'): """ A pure python implementation of the Poly1305 MAC function. Not a standard implementation, abandoned Args: msg (bytes): The message to authenticate key (bytes): The 32 byte key to use Returns: bytes: The 16 byte MAC """ p = 2**130 - 5 # the prime number used in Poly1305 r = int.from_bytes(key[:16], byteorder) r = r & 0x0ffffffc0ffffffc0ffffffc0fffffff s = int.from_bytes(key[16:], byteorder) acc = 0 for i in range(0, len(msg), 16): block = msg[i:i+16] + b'\x01' block = int.from_bytes(block, byteorder) acc = (acc + block) * r % p acc = (acc + s) % p acc = int(acc % 2**128) return acc.to_bytes(16, byteorder) # the implementation of the Poly1305 in the standard lib is as follows: def poly1305(msg:bytes, key:bytes, byteorder='little'): """ A pure python implementation of the Poly1305 MAC function Reference: https://datatracker.ietf.org/doc/html/rfc7539#section-2.5.1 Args: msg (bytes): The message to authenticate key (bytes): The 32 byte key to use Returns: bytes: The 16 byte MAC """ p = 2**130 - 5 # the prime number used in Poly1305 r = int.from_bytes(key[:16], byteorder) r = r & 0x0ffffffc0ffffffc0ffffffc0fffffff s = int.from_bytes(key[16:], byteorder) acc = 0 for i in range(0, len(msg), 16): block = msg[i:i+16] + b'\x01' block = int.from_bytes(block, byteorder) acc = (acc + block) * r % p acc = (acc + s) # no more % p here !!! acc = int(acc % 2**128) return acc.to_bytes(16, byteorder) def construct_poly1305_coeffs(msg:bytes, byteorder='little'): # get coefficients of the polynomial from the message bytes coeff = [] for i in range(0, len(msg), 16): block = msg[i:i+16] + b'\x01' block = int.from_bytes(block, byteorder) coeff.append(block) return coeff def sage_poly1305(msg:bytes, key:bytes, byteorder='little'): r = int.from_bytes(key[:16], byteorder) r = r & 0x0ffffffc0ffffffc0ffffffc0fffffff s = int.from_bytes(key[16:], byteorder) p = 2**130 - 5 # the prime number used in Poly1305 PolynomialRing_GF = PolynomialRing(GF(p), 'x') x = PolynomialRing_GF.gen() poly = x * PolynomialRing_GF(construct_poly1305_coeffs(msg, byteorder)[::-1]) + s val = int(poly(r)) return int(val % 2**128).to_bytes(16, byteorder) def is_valid_r(r): # check if r is a valid Poly1305 key # from RFC specification: https://datatracker.ietf.org/doc/html/rfc7539#section-2.5 return (r & 0x0ffffffc0ffffffc0ffffffc0fffffff) == r def recover_poly1305_key_from_nonce_reuse(msg1:bytes, tag1:bytes, msg2:bytes, tag2:bytes, byteorder='little'): """Recover the Poly1305 key when nonce is reused. Args: msg1 (bytes): The first message tag1 (bytes): The first tag msg2 (bytes): The second message tag2 (bytes): The second tag Returns: list: keys r, s """ p = 2**130 - 5 # the prime number used in Poly1305 pp = 2**128 PolynomialRing_GF = PolynomialRing(GF(p), 'x') x = PolynomialRing_GF.gen() poly1 = x * PolynomialRing_GF(construct_poly1305_coeffs(msg1, byteorder)[::-1]) a1 = int.from_bytes(tag1, byteorder) poly2 = x * PolynomialRing_GF(construct_poly1305_coeffs(msg2, byteorder)[::-1]) a2 = int.from_bytes(tag2, byteorder) # find all possible keys roots = [] print(f"[+] Searching for the key with poly.degree() = {(poly1 - poly2).degree()}") # the range is larger than expected # since in poly1305 acc + s does not have to be modulo p for tag1 in range(a1, p + pp, pp): for tag2 in range(a2, p + pp, pp): f = (poly1 - poly2) - (tag1 - tag2) root = f.roots(multiplicities=False) for r in root: r = int(r) if is_valid_r(r): # print(f"[+] Found a valid key: {r}") # check if the key is valid s = int(a1 - int(poly1(r))) % pp if (int(poly1(r)) + s) % pp == a1 and (int(poly2(r)) + s) % pp == a2: roots.append((r, s)) return list(set(roots)) def derive_poly1305_key(key:bytes, nonce:bytes): assert len(key) == 32 and len(nonce) == 12, "The key should be 32 bytes and the nonce should be 12 bytes" chacha20_block = ChaCha20Cipher(key, nonce).encrypt(b'\x00'*64) return chacha20_block[:32] def chachapoly1305_merger(ad:bytes, ct:bytes): """Merge the associated data and the ciphertext Args: ad (bytes): The associated data ct (bytes): The ciphertext Returns: bytes: The merged data """ def pad(data, block_size=16): if len(data) % block_size == 0: return data return data + b'\x00' * (block_size - len(data) % block_size) la = len(ad) lc = len(ct) out = pad(ad) + pad(ct) + la.to_bytes(8, 'little') + lc.to_bytes(8, 'little') return out def chachapoly1305_nonce_reuse_attack(ad1:bytes, ct1:bytes, tag1:bytes, ad2:bytes, ct2:bytes, tag2:bytes): """Recover the Chacha-Poly1305 key when nonce is reused. The two messages are encrypted with the same nonce and the same key. Args: ad1 (bytes): The first associated data ct1 (bytes): The first ciphertext tag1 (bytes): The first tag ad2 (bytes): The second associated data ct2 (bytes): The second ciphertext tag2 (bytes): The second tag Returns: list: keys r, s """ inp1 = chachapoly1305_merger(ad1, ct1) inp2 = chachapoly1305_merger(ad2, ct2) return recover_poly1305_key_from_nonce_reuse(inp1, tag1, inp2, tag2) def forge_poly1305_tag(ad:bytes, ct:bytes, r:int, s:int): """Forge a Poly1305 tag given the message and the key Args: ad (bytes): The associated data ct (bytes): The ciphertext r (int): The r key s (int): The s key """ key = r.to_bytes(16, 'little') + s.to_bytes(16, 'little') msg = chachapoly1305_merger(ad, ct) return poly1305(msg, key) def chachapoly1305_forgery_attack(ad1:bytes, ct1:bytes, tag1:bytes, ad2:bytes, ct2:bytes, tag2:bytes, known_plaintext1:bytes, target_plaintext:bytes, target_ad:bytes): """Recover the Chacha-Poly1305 key when nonce is reused. Args: ad1 (bytes): The first associated data ct1 (bytes): The first ciphertext tag1 (bytes): The first tag ad2 (bytes): The second associated data ct2 (bytes): The second ciphertext tag2 (bytes): The second tag known_plaintext1 (bytes): The known plaintext of ct1 target_plaintext (bytes): The target plaintext to forge target_ad (bytes): The target associated data to forge return: generator: yields the forged ciphertext and tag """ keys = chachapoly1305_nonce_reuse_attack(ad1, ct1, tag1, ad2, ct2, tag2) if len(keys) == 0: assert False, "Failed to recover the key for poly1305, probably the nonce is not reused" assert len(target_plaintext) <= len(known_plaintext1), "The target plaintext should be shorter than the known plaintext" keystream = [b1 ^ b2 for b1, b2 in zip(known_plaintext1, ct1)] target_ct = bytes([b1 ^ b2 for b1, b2 in zip(keystream, target_plaintext)]) for r, s in keys: target_tag = forge_poly1305_tag(target_ad, target_ct, r, s) yield target_ct, target_tag def chachapoly1305_forgery_attack_general(ads:list[bytes], cts:list[bytes], tags:bytes, known_plaintext1:bytes, target_plaintext:bytes, target_ad:bytes): """Recover the Chacha-Poly1305 key when nonce is reused. Args: ads (list[bytes]): The associated data cts (list[bytes]): The ciphertexts tags (list[bytes]): The tags known_plaintext1 (bytes): The known plaintext of ct1 target_plaintext (bytes): The target plaintext to forge target_ad (bytes): The target associated data to forge returns: target_ct, target_tag : bytes, bytes (if not unique, we simply return the first one) """ assert len(ads) == len(cts) == len(tags) and len(cts) >= 2, "The length of the associated data, ciphertexts, and tags should be the same and at least 2" ad1, ct1, tag1 = ads[0], cts[0], tags[0] keys = [] for ad2, ct2, tag2 in zip(ads[1:], cts[1:], tags[1:]): if len(keys) == 0: keys = chachapoly1305_nonce_reuse_attack(ad1, ct1, tag1, ad2, ct2, tag2) else: _keys = chachapoly1305_nonce_reuse_attack(ad1, ct1, tag1, ad2, ct2, tag2) keys = list(set(keys) & set(_keys)) if len(keys) == 1: break if len(keys) == 0: assert False, "Failed to recover the key for poly1305, probably the nonce is not reused" elif len(keys) > 1: print(f"[!] Found multiple keys {len(keys) = }, trying to forge the message, use the first key") print("[!] You can use more nonce-reuse messages to find the unique key") # anyway, use the first key r, s = keys[0] print(f"[+] Using Key {r = }, {s = } {len(keys) = }") assert len(target_plaintext) <= len(known_plaintext1), "The target plaintext should be shorter than the known plaintext" keystream = [b1 ^ b2 for b1, b2 in zip(known_plaintext1, ct1)] target_ct = bytes([b1 ^ b2 for b1, b2 in zip(keystream, target_plaintext)]) target_tag = forge_poly1305_tag(target_ad, target_ct, r, s) return target_ct, target_tag ``` ::: :::spoiler solution.py ```python= from pwn import remote from chacha_poly1305_forgery import chachapoly1305_forgery_attack HOST = "activist-birds.picoctf.net"; PORT = 65086 connection = remote(HOST, PORT) connection.recvuntil(b"Plaintext (hex): ") plaintext1 = bytes.fromhex(connection.recvline().strip().decode()) connection.recvuntil(b"Ciphertext (hex): ") receive1 = bytes.fromhex(connection.recvline().strip().decode()) ciphertext1 = receive1[:-28]; tag1 = receive1[-28: -12]; nonce1 = receive1[-12:] associated1 = b"" connection.recvuntil(b"Plaintext (hex): ") plaintext2 = bytes.fromhex(connection.recvline().strip().decode()) connection.recvuntil(b"Ciphertext (hex): ") receive2 = bytes.fromhex(connection.recvline().strip().decode()) ciphertext2 = receive2[:-28]; tag2 = receive2[-28: -12]; nonce2 = receive2[-12:] associated2 = b"" assert nonce1 == nonce2, "Something is not right!?" plaintext3 = b"But it's only secure if used correctly!" associated3 = b"" forge3 = chachapoly1305_forgery_attack(associated1, ciphertext1, tag1, associated2, ciphertext2, tag2, plaintext1, plaintext3, associated3) for item in forge3: ciphertext3, tag3 = item my_message = (ciphertext3 + tag3 + nonce1).hex() connection.sendlineafter(b"What is your message? ", my_message.encode()) connection.recvuntil(b"b\"But it's only secure if used correctly!\"\n") flag = connection.recvline().strip().decode() connection.success(f"Got flag: {flag}") connection.close() # [+] Got flag: picoCTF{7urn_17_84ck_n0w_54e24a92} ``` ::: :::success **Flag: picoCTF{7urn_17_84ck_n0w_54e24a92}** ::: ## flag_printer ![image](https://hackmd.io/_uploads/HkJKj3kkbl.png) :::spoiler flag_printer.py ```python= import galois import numpy as np MOD = 7514777789 points = [] for line in open('encoded.txt', 'r').read().strip().split('\n'): x, y = line.split(' ') points.append((int(x), int(y))) GF = galois.GF(MOD) matrix = [] solution = [] for point in points: x, y = point solution.append(GF(y % MOD)) row = [] for i in range(len(points)): row.append(GF((x ** i) % MOD)) matrix.append(GF(row)) open('output.bmp', 'wb').write(bytearray(np.linalg.solve(GF(matrix), GF(solution)).tolist()[:-1])) ``` ::: :::spoiler encoded.txt (vì quá dài nên mình chỉ ghi một phần vào đây) ```text= 0 66 1 77611334 2 7296865972 3 6005985327 4 2768154539 5 3553732454 6 1992438015 7 538931207 8 3393488850 9 1014238907 10 5980514119 11 928281187 12 3328481670 13 2083418691 14 4530729570 15 1519132131 16 4341836794 17 6759701801 18 241051693 19 4103215828 20 2673513650 21 4311612663 22 4913401179 23 3019555041 24 3399769517 25 2003227587 26 6186773879 27 1939246820 28 7040525474 29 4339076365 30 896953233 31 5933157308 32 3724130554 33 5101531330 34 2353859308 35 413013965 36 6843441273 ... ``` ::: Niềm vui lớn nhất khi đọc **write-up** của người khác đó là học được nhiều thứ hay, và điều này đúng với bài này. Bài này chỉ đơn giản là tìm một đa thức $f(x)$ trong vành $\mathbb{F}_p[x]$ sao cho f(x) thỏa mãn tất cả các cặp giá trị $(x, f(x))$ trong file `encoded.txt`. Vì phương thức `.lagrange_polynomial` trong **SageMath** có độ phức tạp thời gian trung bình **(big-O)** là $O(n^2)$, nên nếu dùng nó để chạy khôi phục lại đa thức thì có thể sẽ mất từ hàng chục tiếng đến vài ngày. May mắn thay, số nguyên tố $p$ khá nhỏ nên ta có thể áp dụng một thuật toán mà mình mới OSINT được [**ở đây**](https://github.com/PetePriority/picoctf-2024/blob/main/cryptography/flag_printer/solve_fast.ipynb). Thuật toán này có thể coi là nhanh nhất hiện nay rồi (đối với bài này chỉ chạy mất 1 đến 2 phút là ra kết quả 🤯), về mặt lý thuyết tại sao thuật toán này chạy nhanh đến như vậy thì bạn có thể tham khảo tại: **[1] von zur Gathen, Joachim Gerhard, Jürgen - Modern Computer Algebra, third edition, 2013, Cambridge Press** >[!Note] >Ngoài thuật toán chạy cực nhanh ở trên, ta còn có một thuật toán tuy chậm hơn nhiều nhưng vẫn miễn cưỡng chạy được với số lượng cặp điểm lớn như vậy. Để đọc thêm về thuật toán đó, hãy xem bài viết của **grhkm** [tại đây](https://codeforces.com/blog/entry/94143). >Độ phức tạp thời gian trung bình của thuật toán này chỉ có $O(n\cdot \log^2n)$, hoàn toàn khả thi để chạy với số lượng cặp điểm lớn như vậy (thực tế thì chỉ chạy 6 đến 7 tiếng là ra đáp án). Dưới đây là toàn bộ lời giải cho thử thách này: :::spoiler solve.sage ```python= p = 7514777789 X = [] Y = [] for line in open('encoded.txt', 'r').read().strip().split('\n'): x, y = line.split(' ') X.append(int(x)) Y.append(int(y)) try: K = GF(p) R = PolynomialRing(K, 'x') except NameError: raise Exception("SageMath kernel is required to run this script!") class Tree(): def __init__(self, poly, X, left=None, right=None): self.left = left self.right = right self.poly = poly self.X = X def __len__(self): return len(self.X) def __call__(self, *args, **kwds): self.poly(*args, **kwds) def __mul__(self, other): return Tree(self.poly * other.poly, self.X + other.X, self, other) def __call__(self, *args, **kwds): return self.poly(*args, **kwds) def compTree(X): x = R.gen() height = ceil(log(len(X))/ log(2)) nodes = [] for xk in X: nodes.append(Tree(R(x-xk), [xk])) for i in range(1, height): new_nodes = [] for j in range(0, len(nodes)-1, 2): node = nodes[j] * nodes[j+1] new_nodes.append(node) if len(nodes) % 2 == 1: new_nodes.append(nodes[-1]) nodes = new_nodes assert(len(nodes) == 2) return nodes[0] * nodes[1] def fastEval(f, tree): if f.degree() < 2 or tree.poly.degree() < 2: if tree.poly.degree() == 1: return [f(-tree.poly(0))] if f.degree() == 0: print("degree 0") return [f] A = B = 0 if tree.left: _, r1 = f.quo_rem(tree.left.poly) A = fastEval(r1, tree.left) else: print("left empty") if tree.right: _, r2 = f.quo_rem(tree.right.poly) B = fastEval(r2, tree.right) else: print("right empty") return A + B def calcWeights(Y, tree): print("calculating deriviative") Zp = tree.poly.derivative() print("evaluating") Wj = fastEval(Zp, tree) print("putting it together") return [y/w for y, w in zip(Y, Wj)] def _fast_interpolate(weights, tree): if len(tree) == 1: return weights[0] W1 = weights[:len(tree.left)] W2 = weights[len(tree.left):] r0 = _fast_interpolate(W1, tree.left) r1 = _fast_interpolate(W2, tree.right) return r0 * tree.right.poly + r1 * tree.left.poly def fast_interpolate(X, Y): print("computing tree") tree = compTree(X) print("computing weights") weights = calcWeights(Y, tree) print("interpolating...") return _fast_interpolate(weights, tree) f = fast_interpolate(X, Y) open("output.bmp", "wb").write(bytearray(f.coefficients(sparse=False)[:-1])) ``` ::: :::success **Flag: picoCTF{i_do_hope_you_used_the_favorite_algorithm_of_every_engineering_student}** ![image](https://hackmd.io/_uploads/r1zEcCkJ-l.png) ::: ## Summary Trên đây là các bài ở trang web **picoCTF**, như các bạn đã thấy, chúng cần rất nhiều kiến thức khác nhau và write-up cho mỗi bài cũng rất dài, không thì cũng phải đi tìm tool mạnh thhif mới giải được. Chúc các bạn một ngày tốt lành và hy vọng write-up này sẽ giúp ích cho các bạn trong những cuộc thi quan trọng. Xin chào và hẹn gặp lại các bạn ở phần sau 👋.