# The Gold Bug puzzle (DEF CON 28)
The Gold Bug is an annual DEF CON puzzle hunt, focused on cryptography, run by the Crypto & Privacy Village.
* [Puzzle BBS](https://goldbug.cryptovillage.org/)
* [Hints on Twitter](https://twitter.com/search?q=%23TheGoldBug%20%40CryptoVillage&src=typed_query)
# Quarantine Confusion
> Nothing special here - just a basic crossword.
We are given a crossword puzzle.
Note that there is a number to the right of each clue, along with a series of numbers at the bottom. The count of numbers at the bottom matches the number of clues in the puzzle, so each one likely corresponds to a clue in some way.
If we try filling in words in the crossword puzzle, we notice the intersection areas don't match up. For example, using `GEM` for 5-down and `SPORE` for 8-across leads to a conflict between `E` and `R` in the overlapping area. A different approach seems to be needed.
Note that the challenge description calls this a **basic** crossword, which hints at number bases. Through experimentation, we find that interpreting each answer word as a number in some base and converting it to the base written to the right of the clue gives a new English word. The source base is not given, so we find it for each clue by brute force, using an [online base conversion tool](https://www.rapidtables.com/convert/number/base-converter.html) or a script to speed up the process, and then fill out the puzzle with the new words.
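The conversion described above can be sketched in a few lines of Python. This is only an illustration: the helper names (`to_base`, `convert`, `candidates`) are made up for this write-up, and deciding which candidate is an English word is still left to the reader.

```python
import string

DIGITS = string.digits + string.ascii_uppercase  # digit symbols for bases up to 36


def to_base(n: int, base: int) -> str:
    """Write a non-negative integer using the given base (2..36)."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return "".join(reversed(out))


def convert(word: str, src_base: int, dst_base: int) -> str:
    """Read `word` as a number in src_base and rewrite it in dst_base."""
    return to_base(int(word, src_base), dst_base)


def candidates(word: str, dst_base: int) -> dict:
    """Try every source base in which `word` is a valid number."""
    found = {}
    for src_base in range(2, 37):
        try:
            found[src_base] = convert(word, src_base, dst_base)
        except ValueError:  # some letter is not a valid digit in this base
            continue
    return found


print(convert("SAP", 36, 15))  # ACED
print(candidates("SAP", 15))   # one candidate per usable source base
```

Scanning the output of `candidates` for something that looks like a word replaces the manual trial and error with the online tool.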
The final solved puzzle looks as below:
Here are the initial words and the new words, along with the bases they were interpreted in and converted to (across clues first, then down clues starting at `1: BRAN`):
3: SAP 36 => ACED 15
4: CRUS 32 => FILM 30
6: DORK 29 => BANK 31
8: SPORE 34 => PLACE 35
9: CASHES 34 => GLOUTS 32
11: KEEN 25 => BRAN 30
12: ADDED 18 => SPIT 34
14: HOGS 34 => EXES 36
15: GOGO 29 => BEAD 33
1: BRAN 30 => KEEN 25
2: POOS 29 => DILL 36
3: BEE 28 => ALB 29
5: GEM 27 => MIC 23
7: TRAP 30 => KILN 34
8: ROWS 34 => PETS 35
9: JOKE 30 => GALE 32
10: PLACE 35 => SPORE 34
11: DAB 14 => BEE 15
13: RACY 36 => TOAD 35
Now we need to work with the numbers at the bottom of the puzzle:
Since the count of numbers at the bottom matches the number of clues, each bottom number should pair with a number from its clue.
Note that we have not yet used the source bases, i.e. the bases in which we interpreted the initial words and which we found by trial and error.
Subtracting each bottom number from the corresponding source base modulo 36, and writing the result as a base-36 digit, gives us some plaintext in leetspeak, `H4X0RI5Y0UR0Y4L8LU3`:

    import string

    # a: source bases found by trial and error; b: the numbers at the bottom of the puzzle
    a = [36,32,29,34,34,25,18,34,29,30,29,28,27,30,34,30,35,14,36]
    b = [19,28,32,34,7,7,13,0,29,0,2,28,29,26,13,22,14,20,33]
    f = ''
    s = string.digits + string.ascii_uppercase  # base-36 digit symbols
    for i in range(len(a)):
        f += s[(a[i]-b[i])%36]
    print(f)
The answer to the challenge seems to be `royalblue` which works!
# Computer Lab
> The virology lab is having trouble keeping its samples straight. It seems the swabs have been swapped.
We are given 4 different groups of text.
Starting with the 1st group of words, we find that they appear to be names of well-known computer malware with some letters blanked out, so we can fill them in:
STU_N_T -> STUXNET
_RYP_OLOCKER -> CRYPTOLOCKER
__OVEYOU -> ILOVEYOU
MY__OM -> MYDOOM
ANNA _OURNIKO_A -> ANNA KOURNIKOVA
EX_LORE_IP -> EXPLOREZIP
S_L SL_MMER -> SQL SLAMMER
PS__0T -> PSYBOT
_AN_ACRY -> WANNACRY
_ER_SALEM -> JERUSALEM
T_ANATO_ -> THANATOS
TO_PI_ -> TORPIG
_LA_E -> FLAME
The second and third groups of words don't seem to contain any known English words, however.
We can decrypt the fourth group using [quipqiup](https://quipqiup.com).
FBQZ! SBZI B TSBQZIQ DMMZ BS GMPQ XMQZ BJL HEJL AEYQBOT SBZIJ EJ SFQIIT.
HARK! TAKE A STARKER LOOK AT YOUR WORK AND FIND BIGRAMS TAKEN IN THREES.
If we look at the substitution mapping, we notice that the letters swap in pairs (for example, `A` and `B` map to each other, as do `S` and `T`).
Perhaps the pair of letters filled in from each row from the first group serve as substitution mappings to decrypt the second group?
Based on the substitution mappings of the first group (the missing letters from each row: `XE`, `CT`, `IL`, `DO`, `KV`, `PZ`, `QA`, `YB`, `WN`, `JU`, `HS`, `RG`, `FM`), we can decrypt the second group of words:
T_BZC_IDRB -> C_YPT_LOGY
_DDIX_W IDRLT -> _OOLE_N LOGIC
HB_YD_ -> SY_BO_
_L_XWXGX -> _I_ENERE
_G_ZCDH -> _R_PTOS
Z_R_XW -> P_G_EN
GQLI_XWT_ -> RAIL_ENC_
_D_LQT -> _O_IAC
_QKQ_D -> _AVA_O
TLZ_XGCX_C -> CIP_ERTE_T
__QOGQCLT HLXKX -> __ADRATIC SIEVE
XIILZ_L_ TJGKX -> ELLIP_I_ CURVE
ZQ_H_DGO -> PA_S_ORD
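The pair-swap decryption used for the second group can be sketched as follows. The pair list is read off the letters filled into the first group, one pair per row; the function names are illustrative only.

```python
# Letters filled into each malware name of the first group (one pair per row).
PAIRS = ["XE", "CT", "IL", "DO", "KV", "PZ", "QA", "YB", "WN", "JU", "HS", "RG", "FM"]


def build_swap_table(pairs):
    """Each pair (a, b) means a decrypts to b and b decrypts to a."""
    table = {}
    for a, b in pairs:
        table[a] = b
        table[b] = a
    return table


def decrypt(text, table):
    """Swap every mapped letter; blanks, spaces and other symbols pass through."""
    return "".join(table.get(ch, ch) for ch in text)


SWAP = build_swap_table(PAIRS)
print(decrypt("T_BZC_IDRB", SWAP))      # C_YPT_LOGY
print(decrypt("XIILZ_L_ TJGKX", SWAP))  # ELLIP_I_ CURVE
```

Because every letter appears in exactly one pair, the same table both encrypts and decrypts.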
These seem to be cryptography terms. We can fill in the missing letters:
C_YPT_LOGY -> CRYPTOLOGY
_OOLE_N LOGIC -> BOOLEAN LOGIC
SY_BO_ -> SYMBOL
_I_ENERE -> VIGENERE
_R_PTOS -> KRYPTOS
P_G_EN -> PIGPEN
RAIL_ENC_ -> RAILFENCE
_O_IAC -> ZODIAC
_AVA_O -> NAVAJO
CIP_ERTE_T -> CIPHERTEXT
__ADRATIC SIEVE -> QUADRATIC SIEVE
ELLIP_I_ CURVE -> ELLIPTIC CURVE
PA_S_ORD -> PASSWORD
Similar to how we decrypted the ciphertexts in the second group, we can use the substitution mappings from the second group to decrypt the third group of words.
Substitution mapping for the second group (missing letters from each line): `RO`, `BA`, `ML`, `VG`, `KY`, `IP`, `FE`, `ZD`, `NJ`, `HX`, `QU`, `TC`, `SW`.
Decrypting the third group of words gives us:
WB_P_BLB -> SA_I_AMA
_Q_LBMB -> _U_MALA
W_QCXB_ICRJ -> S_UTHA_PTON
ABOLB__ROFWC -> BARMA__OREST
_P_B -> _I_A
CR__QF CFJR -> TO__UE TENO
YQJ_P_ -> KUN_I_
TR_IR_ -> CO_PO_
_BJJ_ -> _ANN_
_BT_PJPB -> _AC_INIA
QQYQJP_L_ -> UUKUNI_M_
_RO_WZBMF -> _OR_SDALE
_R_TBJB -> _O_CANA
These seem to be names of human viruses. We can fill in the missing letters:
SA_I_AMA -> SAGIYAMA
_U_MALA -> PUUMALA
S_UTHA_PTON -> SOUTHAMPTON
BARMA__OREST -> BARMAHFOREST
_I_A -> ZIKA
TO__UE TENO -> TORQUE TENO
KUN_I_ -> KUNJIN
CO_PO_ -> COWPOX
_ANN_ -> BANNA
_AC_INIA -> VACCINIA
UUKUNI_M_ -> UUKUNIEMI
_OR_SDALE -> LORDSDALE
_O_CANA -> TOSCANA
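Finally, collecting the missing letters from each line of the third group gives us the substitution mapping `GY`, `PU`, `OM`, `HF`, `ZK`, `RQ`, `JN`, `WX`, `BA`, `VC`, `EI`, `LD`, `TS`.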
This matches the substitution mappings from the fourth group! So it seems the intended solution was to use these to decrypt the fourth group ciphertext.
*Comment from puzzle setter: Exactly!*
Going back to what the fourth group ciphertext decrypted to:
HARK! TAKE A STARKER LOOK AT YOUR WORK AND FIND BIGRAMS TAKEN IN THREES.
We need to look back at our work and find bigrams taken in threes.
We use [dcode.fr's bigrams frequency analysis tool](https://www.dcode.fr/bigrams) for each of the decrypted words in the first 3 groups.
Note that we should select the `by sliding/overlapping (ABCDEF => AB,BC,CD,DE,EF)` option in the Bigrams Parameters section so all bigrams in each row can be taken into account.
Computing the frequencies for the 1st group of words (STUXNET, CRYPTOLOCKER, etc.), we find 5 bigrams that appear 3 times:
NA 3× 2.78%
TO 3× 2.78%
LO 3× 2.78%
ER 3× 2.78%
AN 3× 2.78%
For the 2nd group of words (CRYPTOLOGY, BOOLEAN LOGIC, etc.), we find 4 bigrams that appear 3 times:
OL 3× 2.63%
PT 3× 2.63%
IC 3× 2.63%
EN 3× 2.63%
For the 3rd group of words (SAGIYAMA, PUUMALA, etc.), we find 3 bigrams that appear 3 times:
MA 3× 2.97%
OR 3× 2.97%
TO 3× 2.97%
For the 4th group of words (HARK! TAKE A...THREES), we find 3 bigrams that appear 3 times:
TA 3× 5.36%
KE 3× 5.36%
RK 3× 5.36%
Observing all the bigrams that appear 3 times together:
1st group: NA TO LO ER AN
2nd group: OL PT IC EN
3rd group: MA OR TO
4th group: TA KE RK
Note that we may be able to form a word by taking one bigram from each group.
We can get `TO` from the 1st group, `OL` from the 2nd group, `MA` from the 3rd group, and `RK` from the 4th group to spell `TOOLMARK`.
Submitting this as the answer works!
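The bigram counts in this section can also be reproduced offline. The sketch below assumes the counting works on letters only and slides across word boundaries, which is consistent with the percentages reported above (for the 4th group, 3 of 56 bigrams is 5.36%); the function name is made up for illustration.

```python
from collections import Counter


def bigrams_seen_three_times(text):
    """Overlapping bigrams (AB, BC, CD, ...) over letters only, kept if seen exactly 3 times."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    counts = Counter(a + b for a, b in zip(letters, letters[1:]))
    return sorted(bg for bg, n in counts.items() if n == 3)


group4 = "HARK! TAKE A STARKER LOOK AT YOUR WORK AND FIND BIGRAMS TAKEN IN THREES."
print(bigrams_seen_three_times(group4))  # ['KE', 'RK', 'TA']
```

Feeding each group's decrypted words into the same function, joined by spaces, gives the candidate bigrams for that group.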
# Prime Time
> Unfortunately, the order is a bit off.
We are given a large number and the prime factorization of a number:
Factoring the large number `14697688090486313344041676760459967385` using [factordb](http://factordb.com), we find that it factors nicely into small prime numbers:
The following key hints were released for this challenge:
* Consider your factors in order... and observe the gaps
* Once you find the gaps, pull yourself together and start over
* After my second factoring in the morning, I like to sort things; the order may seem arbitrary to other people, but i get excited At the Sequence of Corresponding Integer Items
It seems we need to find the differences ("gaps") between the factors in order. However, since they are all primes, we may get more interesting results if we consider the position of each prime in the sequence of primes.
First, we find the index of each prime factor in the sequence of primes (i.e. 2 is the 1st prime, 3 is the 2nd prime, etc.). Then, we find the differences between consecutive indices. We can write a script to do this:

    from Crypto.Util.number import isPrime

    p = [5,37,73,103,107,109,157,181,193,239,241,293,347,353,397,431,467]
    # Map each prime up to 467 to its 1-based index in the sequence of primes.
    d = dict()
    d[2] = 1
    count = 2
    for i in range(3,468,2):
        if isPrime(i):
            d[i] = count
            count += 1
    # Index of each factor, then the gaps between consecutive indices.
    arr = []
    for num in p:
        arr.append(d[num])
    diffs = []
    for i in range(len(arr)-1):
        diffs.append(arr[i+1]-arr[i])
    print(arr)
    print(diffs)
Running the script, we get:
➜ python solve.py
[3, 12, 21, 27, 28, 29, 37, 42, 44, 52, 53, 62, 69, 71, 78, 83, 91]
[9, 9, 6, 1, 1, 8, 5, 2, 8, 1, 9, 7, 2, 7, 5, 8]
Note that if we put the numbers in the differences array together (`9961185281972758`) and try factoring, the factorization doesn't give good small factors. Since the first factor was a `5` (3rd prime number), we may need to add a `3` (`3-0 = 3`) at the start of the differences array. This gives the number `39961185281972758`.
Factoring `39961185281972758`, we get the following factors:
Based on the hints, it seems we need to convert these to ASCII and possibly rearrange the factors into a new order. Converting the decimal values to ASCII gives 67 => C, 79 => O, 83 => S, 65 => A, 77 => M, 80 => P, 69 => E. 2 and 857 don't convert on their own, but if we append 2 to the end of 857, we get 8572 => 85, 72 => U and H.
Thus, we have the following ASCII characters:
C O S A M P E U H
Unscrambling the letters to form a word, we get `CAMPHOUSE` which works as the answer!
*Comment from puzzle setter: The additional info `2x97x13781` was meant to signal to you the order to take the primes before converting to ASCII. `2x97x13781 = 2673514` which suggests taking the 2nd prime (`67`) then the 6th prime (`6577`), etc. That would have given you `CAMPEOUHS` which was still a bit off from `CAMPHOUSE`.*
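The ordering described in the setter's comment can be checked with a short script. Note that the factor list below is a reconstruction: the write-up only quotes the factors piecemeal (67, 79, 83, 857, 2, 6577, and the 80/69 pair), so `8069` is an assumption that is validated by multiplying the factors back together.

```python
from math import prod

# Reconstructed prime factors of 39961185281972758, in ascending order.
FACTORS = [2, 67, 79, 83, 857, 6577, 8069]
# The extra number from the puzzle: its digits are 1-based positions into FACTORS.
ORDER = str(2 * 97 * 13781)  # "2673514"


def decode(factors, order):
    """Concatenate the factors in the given order, then read two digits per ASCII code."""
    digits = "".join(str(factors[int(pos) - 1]) for pos in order)
    return "".join(chr(int(digits[i:i + 2])) for i in range(0, len(digits), 2))


assert prod(FACTORS) == 39961185281972758  # sanity check on the reconstruction
print(decode(FACTORS, ORDER))  # CAMPEOUHS
```

The result is the near-miss string from the comment, which still has to be unscrambled into `CAMPHOUSE` by hand.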
# Germs Can't Catch Me
> Who am I? Where am I? How deep do I go?
> PSA: Please remember there is at least one R in quaRantine:
> Run, Read, Rotate, and Relax... in some order.
> And don't forget to wash your hands!
We are given a base64 encoded string:
Decoding it as a file, we find that we get a gzip archive:
➜ echo H4sIAMYNIV8C/21Q0RICIQj8Npkxu246Czv+/1NiCQybHkSB3RV2yMZ3rjToWlleZxtauPDGpd1obejjgWBdwKiwoWXHVaRwcDwFWktITSJJdhTbWdlAB4KjA/G0v2RPmr/3Iqq5qWTdKMTBMHlRzRP9mLsnbM+z/DOnT2r8SIsRTnO5sIXn/OZr0nPDvjt8yNF200MH6nkjnDc0wV8B1AEAAA== | base64 -d > dec
➜ file dec
dec: gzip compressed data, last modified: Wed Jul 29 05:48:54 2020, max compression, original size modulo 2^32 468
Extracting the gzip archive, we find there is a text file containing some ciphertext:
➜ 7z e dec
➜ cat dec~
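The same layers (base64, gzip, and the ROT-13 step described next) can also be peeled in one go with the Python standard library. This is a minimal sketch that assumes the decompressed file is plain ASCII text; `peel_layers` is a name chosen for illustration, and the sample below is made-up data rather than the puzzle text.

```python
import base64
import codecs
import gzip


def peel_layers(b64_text: str) -> str:
    """base64-decode, gunzip, then ROT-13 the resulting text."""
    compressed = base64.b64decode(b64_text)
    raw = gzip.decompress(compressed)
    return codecs.decode(raw.decode("ascii"), "rot_13")


# Round-trip check on made-up sample data (not the puzzle text).
rot13_sample = codecs.encode("FIVE ZERO", "rot_13").encode("ascii")
sample = base64.b64encode(gzip.compress(rot13_sample)).decode("ascii")
print(peel_layers(sample))  # FIVE ZERO
```

Passing the base64 string from the challenge to `peel_layers` should give the same text as the manual steps.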
Decoding the ciphertext as ROT-13, we get the plaintext:
This seems to be numbers explicitly written out. We can convert this to numeric form:
Python 3.8.5 (default, Jul 27 2020, 08:42:51)
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Given that there are non-digit characters such as `A`,`B`, etc. in the number, it seems to be a hex string.
Decoding the string as hex, we find another ciphertext:
'PHW HLGQVY YBX KSJD MF PPX TSDURHR ZWRKMFW KZTA VRINXL BSE AYFHU'
We can decrypt this ciphertext using the Vigenere cipher, finding the key through trial and error based on the resulting plaintext using [Cryptii's online decoder](https://cryptii.com/pipes/vigenere-cipher). If we assume the plaintext for `PHW` is `THE`, we get the key `was`. We can assume the next characters may be `washyourhands` based on the challenge description. Through further deduction, we can recover the entire key to be `washyourhandsoftenwithsoapandwaterforatleasttwentyseconds`, and the resulting plaintext:
THE ANSWER YOU SEEK IS THE MAPUCHE WARRIOR WITH KNIVES FOR HANDS
Googling "mapuche warrior with knives for hands", we find the answer that we want to be `Galvarino`.
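The key recovery and decryption above can be reproduced with a small script instead of the online decoder. It is a sketch of a standard Vigenere decryption in which spaces are copied through and do not consume key letters; the function names are illustrative.

```python
from itertools import cycle

A = ord("A")


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract the key from the ciphertext letter by letter (mod 26).

    Non-letters are copied through and do not consume key letters.
    """
    shifts = cycle([ord(k) - A for k in key.upper()])
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - A - next(shifts)) % 26 + A))
        else:
            out.append(ch)
    return "".join(out)


def key_fragment(ciphertext: str, guessed_plaintext: str) -> str:
    """Key letters implied by a guessed plaintext (letters only, same length)."""
    return "".join(
        chr((ord(c) - ord(p)) % 26 + A)
        for c, p in zip(ciphertext.upper(), guessed_plaintext.upper())
    )


CIPHERTEXT = "PHW HLGQVY YBX KSJD MF PPX TSDURHR ZWRKMFW KZTA VRINXL BSE AYFHU"
KEY = "washyourhandsoftenwithsoapandwaterforatleasttwentyseconds"

print(key_fragment("PHW", "THE"))         # WAS
print(vigenere_decrypt(CIPHERTEXT, KEY))
```

Guessing a plaintext word, checking the key fragment it implies, and extending the key is the same loop as the trial and error described above.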
# The Secret Life
> Be careful, the third rail is life
We are given 3 different graphics that look like DNA.
Notice that there seem to be 6 different colors used in total for all the rectangles in each graphic:
We can open the PDF in GIMP and, using the Color Picker tool, find the hex color code for each color:
Based on the color codes, we can find the names for each color:
light-blue #7fffd4 - Aquamarine
mid-blue #40e0d0 - Turquoise
dark-blue #008080 - Teal
light-green #adff2f - GreenYellow
mid-green #7fff00 - Chartreuse
dark-green #008000 - Green
Note that each color name starts with A, T, G, or C, the symbols for the 4 bases used in DNA codons.
We can convert the 3 graphics to represent codons based on the colors:
AGT CTT GAA TGT ACT GAG ATC
CCT GAA GAG GAC GAA CGA TTC
AAT AAC ATG AAC CTA GCA ATA
Converting the codons to amino acid letters based on the [DNA codon table](https://en.wikipedia.org/wiki/DNA_codon_table), we get `SLECTEI`, `PEEDERF`, and `NNMNLAI`.
The challenge description seems to hint at the [rail fence cipher](http://rumkin.com/tools/cipher/railfence.php). Decrypting `SLECTEIPEEDERFNNMNLAI` with the rail fence cipher using 3 rails, we get the plaintext `SIMPLENEEDLECRAFTNINE`.
`SIMPLE NEEDLE CRAFT NINE` seems to imply that the answer to this challenge is a 9-letter word that means "simple needlecraft". We can use tools such as [OneLook](https://onelook.com/) to find such words and submit to BBS until we find our desired answer: `plainwork`.
*Comment from puzzle setter: Many folks were thrown by a red herring in the visual, specifically the size of the dots. This was initially intended to be a repeated pattern (if you tessellate the pattern, it repeats smoothly), but we didn't end up using it that way. Also, this flag was particularly hard to find - looking up the flag gave a clear definition, but searching `simple needlecraft` did not yield this word.*
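The codon translation and the 3-rail decryption from this section can be sketched as below. The codon dictionary is only the subset of the standard DNA codon table that this puzzle needs, and the function names are made up for illustration.

```python
# Subset of the standard DNA codon table (one-letter amino acid codes).
CODON_TO_LETTER = {
    "AGT": "S", "CTT": "L", "GAA": "E", "TGT": "C", "ACT": "T", "GAG": "E",
    "ATC": "I", "CCT": "P", "GAC": "D", "CGA": "R", "TTC": "F", "AAT": "N",
    "AAC": "N", "ATG": "M", "CTA": "L", "GCA": "A", "ATA": "I",
}


def translate(codons: str) -> str:
    """Map each space-separated codon to its amino acid letter."""
    return "".join(CODON_TO_LETTER[c] for c in codons.split())


def rail_fence_decrypt(ciphertext: str, rails: int) -> str:
    """Undo a rail fence cipher with the given number of rails (>= 2)."""
    n = len(ciphertext)
    # Rail index visited by each plaintext position while zigzagging.
    pattern, rail, step = [], 0, 1
    for _ in range(n):
        pattern.append(rail)
        if rail == 0:
            step = 1
        elif rail == rails - 1:
            step = -1
        rail += step
    # The ciphertext lists rail 0 first, then rail 1, and so on (stable sort).
    order = sorted(range(n), key=lambda i: pattern[i])
    plain = [""] * n
    for ch, pos in zip(ciphertext, order):
        plain[pos] = ch
    return "".join(plain)


strands = [
    "AGT CTT GAA TGT ACT GAG ATC",
    "CCT GAA GAG GAC GAA CGA TTC",
    "AAT AAC ATG AAC CTA GCA ATA",
]
letters = "".join(translate(s) for s in strands)
print(letters)                         # SLECTEIPEEDERFNNMNLAI
print(rail_fence_decrypt(letters, 3))  # SIMPLENEEDLECRAFTNINE
```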
# Infected (Meta)
> This is a meta puzzle - it solves the overall Goldbug puzzle hunt once you have the other solutions. It will be hard to complete without the other puzzle solutions.
We are given the following ciphertext:
We can convert these characters to alphabetic form using a Python script:

    import string

    s = "?*1|(;?*5;80: :|?'q8 3|;;8* )6-7 ]4608 )|0q6*3 ;48)8 .?[[08). :|?'(8 )| *5?)8|?), :|? ;4(|] ?. ;]6-8 (8]!). :|?'(8 ,?); 0?-7: 6;') 5*|;48( q6(?), *|; -|q6+-19."
    alphabet = string.ascii_uppercase
    idx = 0
    d = dict()
    # Assign each new symbol the next unused letter, in order of first appearance.
    for c in s:
        if c == " ":
            continue
        if c not in d:
            d[c] = alphabet[idx]
            idx += 1
    # Rewrite the ciphertext with the assigned letters, keeping the spaces.
    ctxt = ''
    for c in s:
        if c == " ":
            ctxt += " "
        else:
            ctxt += d[c]
    print(ctxt)
This gives us:
> ABCDEFABGFHIJ JDAKLH MDFFHB NOPQ RSOIH NDILOBM FSHNH TAUUIHNT JDAKEH ND BGANHDANV JDA FSEDR AT FROPH EHRWNT JDAKEH VANF IAPQJ OFKN GBDFSHE LOEANV BDF PDLOXPCYT
*Comment from puzzle setter: This is the [Gold Bug cipher](https://en.wikipedia.org/wiki/The_Gold-Bug#The_cryptogram)! You didn't need to perform cryptographic analysis to decode this.*
Now we can use [quipqiup](https://quipqiup.com) to decrypt the ciphertext (taking into account punctuation):
> UNFORTUNATELY YOU'VE GOTTEN SICK WHILE SOLVING THESE PUZZLES. YOU'RE SO NAUSEOUS, YOU THROW UP TWICE (EW!). YOU'RE JUST LUCKY IT'S ANOTHER VIRUS, NOT COVID-19.
It seems the answer to this puzzle is related to a human virus.
Given this is the meta puzzle, we may need to use the answers from the prior puzzles to help find the answer for this challenge:
* QUA: `royalblue`
* COM: `toolmark`
* PRI: `camphouse`
* GER: `galvarino`
* SEC: `plainwork`
If we look for anagrams within each answer using an [online anagram solver](http://www.wordfinders.com/), we notice we can find virus-related words for `royalblue` and `galvarino`, each with two letters left over.
* `royalblue`: `rubella` (`oy` left)
* `galvarino`: `variola` (`ng` left)
To find the virus-related words hidden in the remaining answers, we use [ViralZone's human viruses table](https://viralzone.expasy.org/678) for a list of virus names:
* `plainwork`: `norwalk` (`pi` left)
* `toolmark`: `mokola` (`tr` left)
* `camphouse`: `machupo` (`se` left)
Now, taking all of the remaining letters (`oy`, `ng`, `pi`, `tr`, `se`), we try to anagram them into a single word. Using an online anagram solver, we get the word `serotyping`, which seems to be a virus-related word. Submitting this as the answer works!
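The leftover letters can be double-checked with letter counting. This sketch takes the answer and virus pairs from the lists above; `leftover` is an illustrative helper, and finding the virus names in the first place still needs a word list or an anagram solver.

```python
from collections import Counter

# Puzzle answer -> virus name hidden inside it (from the lists above).
HIDDEN = {
    "royalblue": "rubella",
    "galvarino": "variola",
    "plainwork": "norwalk",
    "toolmark": "mokola",
    "camphouse": "machupo",
}


def leftover(answer: str, virus: str) -> str:
    """Letters of `answer` that remain after removing the letters of `virus`."""
    remaining = Counter(answer) - Counter(virus)
    if sum(remaining.values()) != len(answer) - len(virus):
        raise ValueError(f"{virus!r} cannot be spelled from {answer!r}")
    return "".join(sorted(remaining.elements()))


pool = "".join(leftover(answer, virus) for answer, virus in HIDDEN.items())
print(pool)                                  # the ten leftover letters
print(sorted(pool) == sorted("serotyping"))  # True
```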