You can find a list of popular websites using DataDome over at [builtwith.com](https://trends.builtwith.com/websitelist/DataDome). We recommend you check out [CoinGecko](https://coingecko.com) as an example of a website using DataDome.

## Network log

Let's start by launching a browser and opening the developer tools. Switch to the Network tab and navigate to [CoinGecko](https://coingecko.com). You can filter the network requests by the keyword `datadome` if you wish.

The first DataDome-related request is a GET request that fetches the JavaScript file at `https://js.datadome.co/tags.js`:

![GET request to fetch the `tags.js`](https://i.imgur.com/56TFc9q.png)

As part of the obfuscation, such files are sometimes made to vary from request to request in order to confuse anybody trying to understand them. This can be achieved using [polymorphic JavaScript obfuscation](https://docs.jscrambler.com/code-integrity/documentation/randomization-seed), which randomizes the output. In our case, the script file is static, which can be checked by making multiple similar requests and comparing the responses. This script is what issues the subsequent network requests.

The second network request is a POST request to `https://api-js.datadome.co/js/`:

![](https://i.imgur.com/KSXJ4tc.png)

Its payload contains multiple parameters, but the two most significant are:

- `events`: an array of events captured by the script, such as mouse movements, scrolls and keyboard inputs
- `jsData`: the most significant parameter of them all. It carries the timezone, screen details, user agent, languages and browser details (vendor, plugins, ...). It is used to fingerprint the browser visiting the protected page and ...

If all goes well, the response to this request will contain a cookie named `datadome`, which will be supplied in subsequent requests as proof of clearance.
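The check that `tags.js` is static, mentioned earlier in this section, can be sketched in a few lines. This is only a sketch, assuming a runtime with a global `fetch` (a recent browser console or Node.js 18+). The helper names `allIdentical` and `fetchScriptBodies` are hypothetical and are not part of the DataDome script.

```javascript
// Hypothetical helper: true when every response body is exactly the same string.
function allIdentical(bodies) {
  return bodies.every((body) => body === bodies[0]);
}

// Hypothetical helper: download the same URL `times` times and collect the bodies.
// Assumes a global `fetch` is available in the current runtime.
async function fetchScriptBodies(url, times) {
  const bodies = [];
  for (let i = 0; i < times; i++) {
    const response = await fetch(url);
    bodies.push(await response.text());
  }
  return bodies;
}

// Example usage (performs real network requests, so it is left commented out):
// fetchScriptBodies("https://js.datadome.co/tags.js", 3)
//   .then((bodies) => console.log("static script:", allIdentical(bodies)));
```

A plain string comparison is enough here because we only care whether the responses differ at all. If they did differ, diffing two of the bodies would be the natural next step.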
## De-obfuscating the challenge

In order to make a little more sense out of the DataDome script, we can make use of an online JavaScript deobfuscator, such as [DeObfuscate.io](https://deobfuscate.io). It takes care of renaming variables to more human-friendly names, abstracting proxy functions and simplifying expressions overall.

<TBD> Transition to analysis of techniques used in Obfuscation below </TBD>

### String concealing

To lower the readability of the script, variables and functions are renamed to meaningless names, and all references to strings are replaced with hexadecimal representations. After converting these hexadecimal identifiers to human-readable versions, one can start figuring out the pieces of the puzzle:

- Reference to third-party CAPTCHA services: https://gist.github.com/TAnas0/e624d4d68c6cddeeccb9d78b64c68ff0#file-tags-deobfuscated-js-L4153
- Construction of the payload of the POST request: https://gist.github.com/TAnas0/e624d4d68c6cddeeccb9d78b64c68ff0#file-tags-deobfuscated-js-L3835

Even native JavaScript functions, such as `serializeToString`, `getContext`... can be obscured through a string concealing function: https://gist.github.com/TAnas0/e624d4d68c6cddeeccb9d78b64c68ff0#file-tags-deobfuscated-js-L24

Throughout the code, this function is often renamed and called with an integer parameter in place of a function name. Translating its usages back to the corresponding names is a huge step in the deobfuscation.

We can set a breakpoint in the browser's developer tools at a point where said function is available as `window.<function_name>`, and write a simple loop to generate a mapping of parameters to string outputs:

![](https://i.imgur.com/IqNsqNc.png)

After that, we'll write a script to replace calls to the function `<function_name>` with its string equivalent.
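The two steps described above (building the mapping, then replacing the calls) could look roughly like the following. This is a minimal sketch under assumptions: `buildMapping` and `inlineConcealedStrings` are hypothetical helper names, the concealing function is assumed to be called with a single integer literal, and a regular expression stands in for a proper AST-based rewrite, so calls with computed arguments are left untouched.

```javascript
// Hypothetical helper: call the concealing function for every integer in
// [from, to] and record the strings it returns. Meant to be run while paused
// at the breakpoint, where the real function is reachable.
function buildMapping(concealFn, from, to) {
  const mapping = {};
  for (let i = from; i <= to; i++) {
    try {
      const value = concealFn(i);
      if (typeof value === "string") mapping[i] = value;
    } catch (err) {
      // Indices outside the string table may throw; skip them.
    }
  }
  return mapping;
}

// Hypothetical helper: replace every `fnName(<integer>)` call in `source`
// with the quoted string found in `mapping`. Unknown indices are kept as-is.
function inlineConcealedStrings(source, fnName, mapping) {
  const escapedName = fnName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const callPattern = new RegExp("\\b" + escapedName + "\\((\\d+)\\)", "g");
  return source.replace(callPattern, (call, index) =>
    Object.prototype.hasOwnProperty.call(mapping, index)
      ? JSON.stringify(mapping[index])
      : call
  );
}

// Usage with the two entries taken from the example in this article.
const mapping = { 598: "indexOf", 262: "dd={'cid'" };
const before = "zyquan[christain(598)](christain(262)) > -1";
console.log(inlineConcealedStrings(before, "christain", mapping));
// prints: zyquan["indexOf"]("dd={'cid'") > -1
```

Because the concealing function is renamed throughout the script, the replacement would have to be run once per alias, passing each alias as `fnName`.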
This step helped us turn statements such as `zyquan[christain(598)](christain(262)) > -1) || seth))` into more expressive statements like `zyquan["indexOf"]("dd={'cid'") > -1) || seth))`.

### Control Flow obfuscation

The main function in the script makes heavy use of recursion and nested calls to blur the path of execution. This is noticeable with the `main` function and the `mynette` function that it defines:

```javascript
!(function main(laurien, ananiah, enriquetta) {
  function mynette(jayziah, nerrissa) {
    if (!ananiah[jayziah]) {
      ...
      laurien[jayziah][0].call(
        ...
        function (carliana) {
          return mynette(laurien[jayziah][1][carliana] || carliana);
        },
        ...
        main,
        ...
      );
    }
    return ananiah[jayziah].exports;
  }
  for (...) mynette(enriquetta[zayion]);
  return mynette;
})(...)
```

### Dead code injection

By adding code that will never be used or reached, a JavaScript obfuscator adds considerable overhead for anyone trying to reverse-engineer the script.

For example, all functions defined at the top level of [this array](https://gist.github.com/TAnas0/e624d4d68c6cddeeccb9d78b64c68ff0#file-tags-deobfuscated-js-L686) accept three arguments, but none of them uses all three. Unifying the function signatures makes it harder to differentiate between them.

We can also find unreachable code in the following format:

```javascript
var garnett = "function" == typeof require && require;
if (garnett && ...) {
  ...
}
```

Since in our context the script is loaded in a browser, the expression `"function" == typeof require` will always evaluate to `false`. It can evaluate to `true` in a Node.js environment, but that is out of the scope of our analysis, and is purely environment-specific code meant to load JavaScript modules.

After reversing the string concealing, the first self-calling function boils down to:

```javascript
(function (trionna, franci) {
  ...
  while (true) {
    try {
      var atlys =
        (parseInt("548wJMXnp") / 1) * (-parseInt("398UDVQdb") / 2) +
        (-parseInt("1653891NJWLhq") / 3) * (-parseInt("4nWnRzt") / 4) +
        (parseInt("10XFxYRR") / 5) * (parseInt("1193862saJkAB") / 6) +
        parseInt("2200716biSAhg") / 7 +
        parseInt("176768IqZPke") / 8 +
        parseInt("2451537dkXkOE") / 9 +
        -parseInt("11726860pniIRo") / 10;
      if (atlys === franci) break;
      else ...;
    } catch (byson) {
      ...
    }
  }
})(elena, 276390);
```

When we evaluate the variable `atlys` above, it turns out to be equal to `276390` (`parseInt` only reads the leading digits of each string), which means that the `break` statement is reached in the first iteration. Furthermore, the statement `rhodella.push(rhodella.shift())`, which appears twice, only removes the first element and places it at the end of the array. Lastly, the exception being caught (`byson`) is not defined anywhere.

## Main function??

After all the steps above, we can finally dive into the core of the script, which is the self-invoking function defined at the bottom of the file.

Its first parameter is a dictionary. Its values are arrays whose first element is a function, and the rest are eventual parameters to be passed to it.

- Usage of the third argument
- Modules passed in the dictionary are not all used. Only a single one is.
- The call is hidden in a dummy loop.

## Putting it all together

TBD