# s/&lt;script&gt;//gi (misc, 115 teams solved) ###### tags: `SECCON CTF 2021` ## Overview > Can you figure out why `s/<script>//gi` is insufficient for sanitizing? > This can be bypassed with `<scr<script>ipt>`. > > Remove `<script>` (case insensitive) from the input until the input contains no `<script>`. In this problem, we are required to speed up collect sanitizing since the size of flag.txt is 64 MiB. ## Bad approach We can do that with `sed` or favorite programming languages. ``` $ echo 'S3CC0N{dum<scr<script>ipt>my}' | sed -e ':loop; s/<script>//gi; tloop' S3CC0N{dummy} $ cat small.txt | sed -e ':loop; s/<script>//gi; tloop' S3CC0N{dummy_flag>_<_pt>>PT><sCRIp<scr<scr<scr!pt>ipt>ipt>} ``` However, for flag.txt, this approach takes very long time and will not finish before the end of the CTF. ``` $ cat flag.txt | sed -e ':loop; s/<script>//gi; tloop' ``` Let $n$ be the size of input. Removing `<script>` once takes $O(n)$. We have to remove `<script>` at most $O(n)$ times. So, overall execution time is $O(n^2)$, which is too long for 64 MiB. ## Good approach To reduce the time complexity, we can use a [stack](https://en.wikipedia.org/wiki/Stack_\(abstract_data_type\)). Push each character of the input and if the top of the stack is `<script>`, pop it. What is left in the stack will be the answer. Since push and pop take $O(1)$ time, overall execution time is $O(n)$. Note that `stack` must be a `list`, not a `string`. If `stack` is `string`, a single instruction `stack = stack[-8:]` takes $O(n)$ time. ```python= import sys with open(sys.argv[1]) as f: data = f.read()[:-1] stack = [] for i, c in enumerate(data): if i%0x10000==0: print(f"{i/len(data)*100:6.2f} %") stack += [c] if "".join(stack[-8:]).lower()=="<script>": for i in range(8): stack.pop() print("".join(stack)) ``` ``` $ time python3 solve.py flag.txt 0.00 % 0.10 % 0.20 % 0.29 % : 99.71 % 99.80 % 99.90 % 100.00 % SECCON{sanitizing_is_not_so_good><_escaping_is_better_iPt><SCript<ScrIpT<scRIp<scRI<Sc<scr!pt>} real 0m27.691s user 0m27.424s sys 0m0.140s ```