tags:
machine learning
whysw@PLUS
Attachments are uploaded to Gist and Google Drive.
When on website: +1 spam resistance +10 user annoyance
Gotta be fast! 500 in 10 minutes!
https://captcha.chal.uiuc.tf
Author: tow_nater
As you can see from the comment in index.html, there is a captchas.zip file at https://captcha.chal.uiuc.tf/captchas.zip.
<!--TODO: we don't need /captchas.zip anymore now that we dynamically create captchas. We should delete this file.-->
It contained 69696 PNG files, each with the correct answer to its captcha.
Additionally, these strange characters are the Minecraft Enchantment Table Language. The TTF file was at https://captcha.chal.uiuc.tf/static/mc.ttf.
It is just a one-to-one correspondence with the alphabet, so after solving captchas for about an hour I could distinguish and type these characters in about 5 seconds each. That is still not enough to get the flag: 500 captchas at 5 seconds each would take more than 40 minutes.
My teammates and I tried hard to find other web vulnerabilities, but failed.
So we thought that this challenge might be about machine learning, even though it is in the web category. In that case, captchas.zip must be the training dataset.
Each captcha has 5 characters, so I searched GitHub for TensorFlow code that does OCR on multiple characters.
https://github.com/JackonYang/captcha-tensorflow
And here it is!
The original code on GitHub solves captchas made of 4 digits.
We are dealing with 5 alphabetic characters, so we changed the settings as shown below.
Previous:
H, W, C = 100, 120, 3
N_LABELS = 10
D = 4
Changed to:
H, W, C = 75, 250, 3
N_LABELS = 26
D = 5
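With these settings the network predicts one of 26 letters at each of 5 positions, so every answer has to be turned into `D` one-hot rows of length `N_LABELS`. Here is a minimal sketch of that encoding. The helper names `encode_label` and `decode_label` are hypothetical, and it assumes the answers are lowercase a-z, which this write-up does not state.

```python
import string

N_LABELS = 26  # letters a-z
D = 5          # characters per captcha
ALPHABET = string.ascii_lowercase  # assumption: answers are lowercase a-z


def encode_label(text):
    """Turn a D-letter answer into D one-hot rows of length N_LABELS."""
    if len(text) != D:
        raise ValueError(f"expected {D} characters, got {len(text)}")
    rows = []
    for ch in text:
        row = [0.0] * N_LABELS
        row[ALPHABET.index(ch)] = 1.0
        rows.append(row)
    return rows


def decode_label(rows):
    """Inverse of encode_label: take the highest-scoring letter per position."""
    return "".join(ALPHABET[row.index(max(row))] for row in rows)


target = encode_label("abcde")
print(len(target), len(target[0]))  # 5 26
print(decode_label(target))         # abcde
```

Wrapping the result in `np.array(...)` gives a `(D, N_LABELS)` array that matches the model's `Reshape((D, N_LABELS))` output.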
At first we used exactly the same layer settings as that code, but the model failed at least once in every 10 trials.
input_layer = tf.keras.Input(shape=(H, W, C))
x = layers.Conv2D(32, 3, activation='relu')(input_layer)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Flatten()(x)
x = layers.Dense(1024, activation='relu')(x)
# x = layers.Dropout(0.5)(x)
x = layers.Dense(D * N_LABELS, activation='softmax')(x)
x = layers.Reshape((D, N_LABELS))(x)
To improve it, we first tried removing one layer,
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.MaxPooling2D((2, 2))(x)
but that made things worse…
So instead we added one more layer to the original model!
input_layer = tf.keras.Input(shape=(H, W, C))
x = layers.Conv2D(32, 3, activation='relu')(input_layer)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.MaxPooling2D((2, 2))(x)
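To see what the extra block does to the tensor shapes, the spatial size after each Conv2D + MaxPooling2D pair can be worked out by hand. The sketch below assumes Keras defaults (3x3 convolution with `'valid'` padding and stride 1, 2x2 pooling with stride 2) and that the same Flatten and Dense layers follow the convolutional stack, which the snippet above omits. The function names are hypothetical, and this is only shape arithmetic, not an explanation of why accuracy changed.

```python
def conv_pool_size(size, kernel=3, pool=2):
    """Size after one 'valid' Conv2D(kernel) followed by MaxPooling2D(pool)."""
    return (size - kernel + 1) // pool


def feature_map(h, w, blocks):
    """Spatial size after `blocks` Conv2D + MaxPooling2D pairs."""
    for _ in range(blocks):
        h, w = conv_pool_size(h), conv_pool_size(w)
    return h, w


H, W = 75, 250
FILTERS = 64  # channels produced by the last Conv2D layer

for blocks in (2, 3, 4):
    h, w = feature_map(H, W, blocks)
    # Prints the block count, the (height, width), and the Flatten() length.
    print(blocks, (h, w), h * w * FILTERS)
# 2 (17, 61) 66368
# 3 (7, 29) 12992
# 4 (2, 13) 1664
```

Under these assumptions, the four-block model hands the Dense layer 1664 values instead of 12992.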
The output was awesome. We rarely failed! But this is not the end.
There is no penalty when we fail, which means we can simply try again right after a wrong answer. Because the model ends in a softmax, we could also sort candidate answers by probability.
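import base64
from io import BytesIO
from itertools import product

import numpy as np
import tensorflow as tf
from PIL import Image

# Decode the base64 PNG, scale pixels to [0, 1] and add a batch axis.
im = Image.open(BytesIO(base64.b64decode(data)))
batch = np.array([np.array(im) / 255.0])
y_pred = model.predict_on_batch(batch)  # shape (1, D, N_LABELS)

# Top 3 candidate letters for each of the D positions.
res = tf.math.top_k(y_pred, k=3)
prob = np.array(res.values[0])      # shape (D, 3)
indices = np.array(res.indices[0])  # shape (D, 3)

# For each position, a list of (letter index, probability) pairs.
candidates = [list(zip(indices[i], prob[i])) for i in range(len(prob))]

# Every combination of per-position candidates, scored by summed probability.
combos = list(product(*candidates))

def score(combo):
    return sum(p for _, p in combo)

res = sorted(combos, key=score, reverse=True)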
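The ranking step can be tried without a trained model. The following is a small self-contained sketch of the same idea in plain Python: take the top `k` letters per position, build every combination, and sort by the summed probability. The function names and the toy numbers are made up for illustration.

```python
from itertools import product


def top_k(row, k):
    """Return the k (index, probability) pairs with the highest probability."""
    order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
    return [(i, row[i]) for i in order[:k]]


def ranked_guesses(y_pred, k=3):
    """All k**D letter-index combinations, most likely first.

    y_pred holds one row of per-letter scores for each character position.
    """
    per_position = [top_k(row, k) for row in y_pred]
    combos = product(*per_position)
    # Same scoring as the write-up: sum of the chosen probabilities.
    return sorted(combos, key=lambda c: sum(p for _, p in c), reverse=True)


# Toy prediction: 2 positions, 4 possible letters (made-up numbers).
toy = [
    [0.70, 0.20, 0.05, 0.05],
    [0.10, 0.45, 0.40, 0.05],
]
guesses = ranked_guesses(toy, k=2)
print([tuple(i for i, _ in g) for g in guesses])
# [(0, 1), (0, 2), (1, 1), (1, 2)]
```

With `k=3` and 5 positions this yields 3**5 = 243 candidates per captcha, of which only the best-scoring ones need to be submitted.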
The challenge tracks the 15-minute limit with a session cookie, which means we can open multiple windows that share the same cookie.
So we opened another browser window and used it in emergency situations.
def send(res):
    # Submit the 30 most likely candidates in order until one is accepted.
    for arr in res[:30]:
        trial = "".join(toCh(pair[0]) for pair in arr)
        r = s.post("https://captcha.chal.uiuc.tf/", data={"captcha": trial})
        ret = r.text.split('<h2>')[1].split('</h2>')[0]
        print(ret)
        if ret != "Invalid captcha":
            return True
    return False

while True:
    im, res = solve_captcha(get_img())
    if not send(res):
        # All 30 guesses were rejected: wait for a human to solve this one.
        input("ALEEEEEEEEEEEEEEEEEEEEEERT!!!!!!!!!!!!!")
When all 30 tries fail, we type the answer manually in the browser and press Enter in the Python prompt to continue.
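The `send` function relies on a `toCh` helper that is not shown in this write-up. It presumably maps a class index back to a letter. One plausible definition is sketched below, assuming index 0 stands for 'a' and that answers are lowercase. Both are assumptions, and `guess_to_text` is a hypothetical name.

```python
import string


def toCh(index):
    """Map a class index 0..25 to a letter (assumed encoding: 0 -> 'a')."""
    return string.ascii_lowercase[int(index)]


def guess_to_text(arr):
    """Join the letter of every (index, probability) pair in one candidate."""
    return "".join(toCh(pair[0]) for pair in arr)


print(guess_to_text([(7, 0.9), (4, 0.8), (11, 0.7), (11, 0.6), (14, 0.5)]))
# hello
```

The `int(index)` conversion lets the helper accept NumPy integers as well as plain Python ones.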
AND WE GOT…
output : uiuctf{i_knew_a_guy_in_highschool_that_could_read_this}
p.s. Now I can read this too! haha - whysw