# TSG CTF 2021 Natural Flag Processing Author’s Writeup

author: [@dai](https://twitter.com/__dAi00)

## Problem Overview

Find the text that is accepted by the given RNN model. The model contains one [RNNCell](https://pytorch.org/docs/stable/generated/torch.nn.RNNCell.html#torch.nn.RNNCell) `cell` and one [Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html?highlight=linear#torch.nn.Linear) layer `out`. At first the hidden vector $h \in \mathbb{R}^{520}$ is `None`, which is equivalent to the zero vector `np.zeros(520)`. After the input text has been processed, the script shows "Congrats!" only when `out(h).sigmoid() > 0.5`.

## Solution

If you inspect the weights and biases, you will notice they are unnatural. In fact, I didn't run any training code to build them. This model behaves like a [deterministic finite automaton](https://en.wikipedia.org/wiki/Deterministic_finite_automaton) (DFA), where:

- The state $q_i \in Q\ (0 \leq i < 520)$ corresponds to the "one-hot" hidden vector $h_i \in \{-1, 1\}^{520}$, whose $i$-th value is $1$ and all other values are $-1$.
- The alphabet $\Sigma$ of course corresponds to `CHARS`, and `CHARS[k]` is encoded as the one-hot vector $e_k \in \{0, 1\}^{67}$.
- The initial state is $q_1$, corresponding to $h_1$, which is the output of `cell(<encoded '^'>, None)`.
- The only accepting state is $q_{313}$, corresponding to $h_{313}$; this comes from the weight of `out`, which is the transpose of the one-hot vector whose 313th value is 1. (This means `main.py` accepts the flag only when the 313th value of the final hidden vector is almost 1.0.)
- There is a special state $q_{fail}$, which corresponds to the vector filled with $-1$.
- The transition function corresponds to the cell's parameters $W_{ih}, W_{hh}, b_{ih}, b_{hh}$ (see the [PyTorch documentation](https://pytorch.org/docs/stable/generated/torch.nn.RNNCell.html#torch.nn.RNNCell)). When the automaton has a transition $(q_i, a_k) \rightarrow q_j\ (a_k \in \Sigma,\ q_i, q_j \in Q)$, then $W_{ih}[j, k] = 7.6$ and $W_{hh}[j, i] = 7.6$; every other entry is $0$. The biases are constant: $b_{ih}[:] = 0$ and $b_{hh}[:] = -12$.

What happens under this construction? When a transition $(q_i, a_k) \rightarrow q_j$ exists, the RNNCell maps the input character $a_k$ and the hidden vector $h_i$ like this:

$$
h' = \tanh(W_{ih} e_k + b_{ih} + W_{hh} h_i + b_{hh}) \\
= (\tanh(-12), \tanh(-12), \ldots, \tanh(7.6 + 7.6 - 12), \ldots, \tanh(-12))^t \\
\simeq (-1, -1, \ldots, 1, \ldots, -1)^t = h_j
$$

Yes, the "one-hot" hidden vector is mapped to another one! Note that $\tanh$ outputs "almost 1.0" only when the contributing coefficients in both $W_{ih}$ and $W_{hh}$ are $7.6$: $\tanh(7.6 + 7.6 - 12) = \tanh(3.2) \simeq 0.997$, whereas a single contribution gives $\tanh(7.6 - 12) = \tanh(-4.4) \simeq -1$. Now we can say this RNN behaves like an automaton.

(In fact, this model is NOT exactly equivalent to a DFA. When the model contains the transitions $(q_i, a) \rightarrow q_x$ and $(q_j, b) \rightarrow q_x$, it necessarily also contains $(q_i, b) \rightarrow q_x$ and $(q_j, a) \rightarrow q_x$, because row $x$ of $W_{ih}$ has $7.6$ at both $a$ and $b$, and row $x$ of $W_{hh}$ has $7.6$ at both $i$ and $j$.)

Finally, you can backtrack from the accepting state to recover the flag. (You can also run a breadth-first search, pruning looped states and $q_{fail}$; note that the correct path of states contains no loops.)
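Before writing the solver, we can confirm the transition mechanics numerically. Below is a minimal sketch: the 3-state, 2-character automaton and its transitions are made up for illustration, but the constants ($7.6$ weights, $-12$ hidden bias) match the challenge model.

```python
import torch
from torch import nn

# Hypothetical automaton with the transitions (q0, 'a') -> q1 and
# (q1, 'b') -> q2, wired with the same constants as the challenge model.
N_STATES, N_CHARS = 3, 2
cell = nn.RNNCell(N_CHARS, N_STATES)
with torch.no_grad():
    cell.weight_ih.zero_()
    cell.weight_hh.zero_()
    cell.bias_ih.zero_()
    cell.bias_hh.fill_(-12.0)
    cell.weight_ih[1, 0] = 7.6; cell.weight_hh[1, 0] = 7.6  # (q0, 'a') -> q1
    cell.weight_ih[2, 1] = 7.6; cell.weight_hh[2, 1] = 7.6  # (q1, 'b') -> q2

h0 = torch.full((1, N_STATES), -1.0)
h0[0, 0] = 1.0                  # "one-hot" hidden vector for q0
a = torch.tensor([[1.0, 0.0]])  # one-hot encoding of 'a'
b = torch.tensor([[0.0, 1.0]])  # one-hot encoding of 'b'

with torch.no_grad():
    h1 = cell(a, h0)
    print(h1)           # ~ [-1,  1, -1]: moved to q1
    print(cell(b, h1))  # ~ [-1, -1,  1]: then to q2
    print(cell(b, h0))  # ~ [-1, -1, -1]: no (q0, 'b') transition => q_fail
```

With that confirmed, the solver walks the $7.6$ entries of $W_{hh}$ and $W_{ih}$ backwards from the accepting state 313: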
```solver.py
import string

import torch
from torch import nn

FLAG_CHARS = string.ascii_letters + string.digits + "{}-"
CHARS = "^$" + FLAG_CHARS

class Model(nn.Module):
    def __init__(self, inpt, hidden):
        super().__init__()
        self.cell = nn.RNNCell(inpt, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, xs):
        h = None
        for x in xs:
            h = self.cell(x, h)
        return self.out(h)

model = Model(len(CHARS), 520)
model.load_state_dict(torch.load("model_final.pth"))

Wh = model.cell.weight_hh  # Wh[j, i] = 7.6 iff some transition (q_i, a) -> q_j exists
Wi = model.cell.weight_ih  # Wi[j, k] = 7.6 iff some transition (q_i, CHARS[k]) -> q_j exists
bias = model.cell.bias_ih + model.cell.bias_hh  # constant -12; not needed for the walk

# The accepting state is the index of the single 1 in the weight of `out`.
state = model.out.weight.data[0, :].argmax()

# Walk the transitions backwards: row `state` of Wi tells which character was
# consumed to enter `state`, and row `state` of Wh tells which state we came from.
answer = ""
while True:
    prev_state = Wh[state, :].argmax()
    answer += CHARS[Wi[state, :].argmax()]
    if Wh[state, prev_state] == 0:
        break  # the initial state q_1 has no predecessor (it comes from h = None)
    state = prev_state

# `answer` was built backwards, from '$' down to '^'; the slice reverses it
# and strips both markers.
print("FLAG:", answer[-2:0:-1])
# FLAG: TSGCTF{mRNA-st4nDs-f0r-mANuaLLy-tun3d-RecurrEn7-N3uRAl-AutoM4toN}
```

## Appendices: State map

![](https://i.imgur.com/9DyIOds.jpg)
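## Appendix: Verifying the flag

As an extra sanity check, we can feed the recovered flag back through the model. This is a minimal sketch reusing `model`, `CHARS`, and `answer` from `solver.py`, and it assumes `main.py` wraps the input with the `^`/`$` markers before one-hot encoding it over `CHARS`:

```python
import torch

def encode(text):
    # One-hot encode each character over CHARS; shape (seq_len, batch=1, len(CHARS)),
    # matching the loop in Model.forward.
    xs = torch.zeros(len(text), 1, len(CHARS))
    for i, c in enumerate(text):
        xs[i, 0, CHARS.index(c)] = 1.0
    return xs

flag = answer[-2:0:-1]
with torch.no_grad():
    score = model(encode("^" + flag + "$")).sigmoid().item()
print(score)  # > 0.5, i.e. main.py would print "Congrats!"
```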