# TSG CTF 2021 Natural Flag Processing Author’s Writeup
author: [@dai](https://twitter.com/__dAi00)
## Problem Overview
Find the text that is accepted by the given RNN model.
The model contains one [RNNCell](https://pytorch.org/docs/stable/generated/torch.nn.RNNCell.html#torch.nn.RNNCell) `cell` and one [Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html?highlight=linear#torch.nn.Linear) `out`. At first the hidden vector $h \in \mathbb{R}^{520}$ is `None`, which `RNNCell` treats as the zero vector `np.zeros(520)`. After processing the input text, the script shows "Congrats!" only when `out(h).sigmoid() > 0.5`.
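For reference, the accept check looks roughly like the sketch below. This is my reconstruction from the model interface and the solver further down, not the challenge's actual `main.py`; the `encode` helper, the function names, and the assumption that the script itself wraps the input in the `'^'`/`'$'` sentinels are all guesses.

```python
import string
import torch
from torch import nn

FLAG_CHARS = string.ascii_letters + string.digits + "{}-"
CHARS = "^$" + FLAG_CHARS  # 67 characters; '^' and '$' mark start/end

def encode(text):
    # One-hot encode each character: e_i in {0,1}^67
    xs = torch.zeros(len(text), 1, len(CHARS))
    for pos, ch in enumerate(text):
        xs[pos, 0, CHARS.index(ch)] = 1.0
    return xs

def check(model, flag):
    h = None  # RNNCell treats None as the zero hidden vector
    for x in encode("^" + flag + "$"):
        h = model.cell(x, h)  # x: (1, 67), h: (1, 520)
    return model.out(h).sigmoid().item() > 0.5  # True => "Congrats!"
```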
## Solution
You may notice that these weights and biases are unnatural. In fact, I didn't run any training code to build them.
This model behaves like a [Deterministic Finite Automaton](https://en.wikipedia.org/wiki/Deterministic_finite_automaton) (DFA), where:
- The state $q_i\in Q\ (0\leq i \lt 520)$ corresponds to the "one-hot" hidden vector $h_i \in \{-1, 1\}^{520}$ whose $i$-th value is $1$ and whose other values are $-1$.
- The alphabet $\Sigma$ of course corresponds to `CHARS`, and `CHARS[i]` is encoded as the one-hot vector $e_i \in \{0,1\}^{67}$.
- The initial state is $q_1$ and it corresponds to $h_1$, which is the output of `cell(<encoded '^'>, None)`.
- The only accepting state is $q_{313}$, corresponding to $h_{313}$; this comes from the weight of `out`, whose (transposed) row is the one-hot vector with its 313-th value set to 1. (This means `main.py` accepts the flag only when the final hidden vector's 313-th value is almost 1.0.)
- There is a special failure state, $q_{fail}$. It corresponds to the vector filled with $-1$.
- The transition function corresponds to the cell's parameters $W_{ih}, W_{hh}, b_{ih}, b_{hh}$ (see the [PyTorch documentation](https://pytorch.org/docs/stable/generated/torch.nn.RNNCell.html#torch.nn.RNNCell)). When the automaton has a transition $(q_i, a_k) \rightarrow q_j\ (a_k\in\Sigma,\ q_i,q_j\in Q)$, then $W_{ih}[j, k] = 7.6$ and $W_{hh}[j,i] = 7.6$ hold; all other entries are 0. The biases are constant: $b_{ih}[:] = 0$ and $b_{hh}[:] = -12$. What happens under these conditions? (A toy version of this construction is sketched below.)
When a transition $(q_i, a_k) \rightarrow q_j$ exists, the RNNCell processes the input character $a_k$ and the hidden vector $h_i$ like this:
$$
\begin{aligned}
h' &= \tanh(W_{ih} e_k + b_{ih} + W_{hh} h_i + b_{hh}) \\
   &= (\tanh(-12), \tanh(-12), \ldots, \tanh(7.6 + 7.6 - 12), \ldots, \tanh(-12))^t \\
   &\simeq (-1, -1, \ldots, 1, \ldots, -1)^t = h_j
\end{aligned}
$$
Yes, the "one-hot" hidden vector will be mapped to another! Actually, tanh maps to "almost 1.0" only when the contributed coefficients in $W_{ih}$ and $W_{hh}$ are both $7.6$. Now we can say this RNN behaves like an automaton.
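You can check the margin numerically; a coordinate reaches "almost 1.0" only when it receives both $7.6$ contributions on top of the $-12$ bias:

```python
import math

print(math.tanh(7.6 + 7.6 - 12))  #  0.9967... -> the target state q_j
print(math.tanh(7.6 - 12))        # -0.9997... -> only one contribution
print(math.tanh(-12))             # -0.9999... -> no contribution
```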
(In fact, this model is NOT exactly equivalent to a DFA: when the model contains the transitions $(q_i, a)\rightarrow q_x$ and $(q_j, b)\rightarrow q_x$, it must also contain $(q_i, b)\rightarrow q_x$ and $(q_j, a)\rightarrow q_x$.)
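To make the construction concrete, here is a toy version I wrote for this writeup (not the challenge's actual generator): a three-state chain $q_0 \xrightarrow{a} q_1 \xrightarrow{b} q_2$ encoded into an `RNNCell` exactly as described above.

```python
import torch
from torch import nn

ALPHABET = "ab"
N_STATES = 3
cell = nn.RNNCell(len(ALPHABET), N_STATES)  # tanh nonlinearity by default

with torch.no_grad():
    cell.weight_ih.zero_()
    cell.weight_hh.zero_()
    cell.bias_ih.zero_()
    cell.bias_hh.fill_(-12.0)
    # Transition (q_i, a_k) -> q_j  <=>  W_ih[j, k] = W_hh[j, i] = 7.6
    for i, k, j in [(0, 0, 1), (1, 1, 2)]:  # (q0,'a')->q1, (q1,'b')->q2
        cell.weight_ih[j, k] = 7.6
        cell.weight_hh[j, i] = 7.6

h = torch.full((1, N_STATES), -1.0)
h[0, 0] = 1.0  # start in the "one-hot" encoding of q0
for ch in "ab":
    x = torch.zeros(1, len(ALPHABET))
    x[0, ALPHABET.index(ch)] = 1.0
    h = cell(x, h)
print(h)  # approximately [[-1., -1., 1.]] -- the automaton ended in q2
```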
Finally, you can backtrack from the accepting state. (You could also run a breadth-first search, pruning $q_{fail}$ and already-visited states; note that the correct path of states contains no loops. A BFS sketch is given after the solver below.)
```solver.py
import string

import torch
from torch import nn

FLAG_CHARS = string.ascii_letters + string.digits + "{}-"
CHARS = "^$" + FLAG_CHARS


class Model(nn.Module):
    def __init__(self, inpt, hidden):
        super().__init__()
        self.cell = nn.RNNCell(inpt, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, xs):
        h = None
        for x in xs:
            h = self.cell(x, h)
        return self.out(h)


model = Model(len(CHARS), 520)
model.load_state_dict(torch.load("model_final.pth"))

Wh = model.cell.weight_hh
Wi = model.cell.weight_ih
bias = model.cell.bias_ih + model.cell.bias_hh  # the constant -12 vector (unused below)

# The accepting state q_313 is the index of the single large weight in `out`.
state = model.out.weight.data[0, :].argmax()

# Walk backwards from the accepting state: row `state` of Wi tells us which
# character leads into `state`, and row `state` of Wh tells us from which state.
answer = ""
while True:
    prev_state = Wh[state, :].argmax()
    answer += CHARS[Wi[state, :].argmax()]
    if Wh[state, prev_state] == 0:
        break  # no incoming transition: we reached the initial state
    state = prev_state

# `answer` is '$' + reversed(flag) + '^'; reverse it and drop both sentinels.
print("FLAG:", answer[-2:0:-1])
# FLAG: TSGCTF{mRNA-st4nDs-f0r-mANuaLLy-tun3d-RecurrEn7-N3uRAl-AutoM4toN}
```
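And for completeness, a sketch of the BFS variant mentioned above, continuing from the objects defined in `solver.py`. The 3.0 threshold for "this weight is one of the 7.6 entries" is my own choice, as is the `edges` reconstruction.

```python
from collections import deque

# Rebuild the transition relation: (q_i, a_k) -> q_j iff Wi[j, k] and
# Wh[j, i] are both large (7.6 rather than 0).
edges = {}  # state i -> list of (char, next state j)
for j in range(520):
    sources = (Wh[j] > 3.0).nonzero().flatten().tolist()
    chars = (Wi[j] > 3.0).nonzero().flatten().tolist()
    for i in sources:
        edges.setdefault(i, []).extend((CHARS[k], j) for k in chars)

accept = int(model.out.weight.data[0, :].argmax())

# The initial state is wherever cell(<encoded '^'>, None) lands.
x = torch.zeros(1, len(CHARS))
x[0, CHARS.index("^")] = 1.0
start = int(model.cell(x, None)[0].argmax())

# BFS, pruning already-visited states (q_fail has no "hot" index, so it
# never shows up in `edges` and is skipped automatically).
queue = deque([(start, "")])
seen = {start}
while queue:
    state, path = queue.popleft()
    if state == accept:
        print("FLAG:", path[:-1])  # drop the trailing '$'
        break
    for ch, nxt in edges.get(state, []):
        if nxt not in seen:
            seen.add(nxt)
            queue.append((nxt, path + ch))
```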
## Appendix: State map
