[ToC]

# What is DOM?

DOM stands for Document Object Model. **The DOM represents the web page as a hierarchical tree structure**, where each element, attribute, and text node is represented as an object. This tree-like structure allows developers to access, modify, and manipulate the elements and content of a web page.

# What is taint flow?

Taint flow, also known as information flow or data flow, refers to the tracking and analysis of data as it moves through a system or application. It involves identifying the origin of data (a source) and tracking how it propagates and influences other parts of the system (sinks).

The concept of taint flow is commonly used in the context of security and vulnerability analysis.

## Sources

:::info
**A source** is a JavaScript property that accepts data that is potentially attacker-controlled.
:::

An example of a source is the `location.search` property because it reads input from the query string, which is relatively simple for an attacker to control. Ultimately, any property that can be controlled by the attacker is a potential source. This includes the referring URL (exposed by the `document.referrer` string), the user's cookies (exposed by the `document.cookie` string), and web messages.

## Sinks

:::info
**A sink** is a potentially dangerous JavaScript function or DOM object that can cause undesirable effects if attacker-controlled data is passed to it.
:::

For example, the `eval()` function is a sink because it processes the argument that is passed to it as JavaScript. An example of an HTML sink is `document.body.innerHTML` because it potentially allows an attacker to inject malicious HTML and execute arbitrary JavaScript.

# What is DOM-based XSS?
:::info
Fundamentally, DOM-based vulnerabilities arise **when a website passes data from a source to a sink**, which then handles **the data in an unsafe way** in the context of the client's session.
:::

## Exploiting DOM XSS with different sources and sinks

:::warning
### LAB: DOM XSS in `document.write` sink using source `location.search`
:::

![](https://hackmd.io/_uploads/S1lj_5Iy32.png)

Here we have a search function that displays the user's input when a search is performed, like this:

![](https://hackmd.io/_uploads/S1Xv83g2h.png)

As usual, I will try to perform reflected XSS by breaking out of the `<h1>` tag.

![](https://hackmd.io/_uploads/HJO6Dhx3h.png)

It looks like there is a mechanism in place that prevents us from breaking out of the `<h1>` tag. Checking the source code, we can see that they [encode data on output](https://portswigger.net/web-security/cross-site-scripting/preventing#encode-data-on-output). That's why our attempt failed.

![](https://hackmd.io/_uploads/SyQUFhg3h.png)

But there is another place in the HTML that uses our input:

![](https://hackmd.io/_uploads/BJwJAne3h.png)

Let's try to break out of the `<img>` tag with:

```htmlmixed=
"><script>alert('hacked!!');</script>
```

![](https://hackmd.io/_uploads/ByzdCng23.png)

**Bingo!!!** 🥳

#### Let's dive into the source code

![](https://hackmd.io/_uploads/rJTcMTl2n.png)

**`window.location.search`** is typically used in JavaScript to retrieve the query string from the current URL.

For example, if the current URL is `"https://example.com/search?query=apple&page=1"`, then reading `window.location.search` returns `"?query=apple&page=1"`.

The **`.get()`** method is part of the URLSearchParams API in JavaScript.
It is used to retrieve the value of a specific query parameter from a URLSearchParams object.

For example:

```javascript=
// Create a URLSearchParams object
var params = new URLSearchParams('?query=apple&page=1');

// Retrieve the value of a specific parameter using .get()
var query = params.get('query');
var page = params.get('page');

console.log(query); // "apple"
console.log(page); // "1"
```

And we can also clearly see it in our LAB:

![](https://hackmd.io/_uploads/SkuQwag22.png)

The **`document.write`** method allows you to write content directly into the HTML document dynamically.

That's why there is no `<img>` tag in the source code:

![](https://hackmd.io/_uploads/r1bktpxn3.png)

But in the HTML structure rendered by the browser, there is an `<img>` tag:

![](https://hackmd.io/_uploads/rJIy2pg3h.png)

And that's how data flowing from a source to a sink can cause DOM-based XSS.

![](https://hackmd.io/_uploads/SJ3c3Tl3h.png)

:::success
**Solved** :thumbsup:
:::

### LAB: DOM XSS in `document.write` sink using source `location.search` inside a select element

This LAB doesn't have a search box like the LAB above.

![](https://hackmd.io/_uploads/BJw7MRx32.png)

But when we click on a product, it sends a query parameter to the server, and in the source code there is a script using **`window.location.search`** and **`document.write`**.

![](https://hackmd.io/_uploads/SyLg4Cx22.png)

#### Let's dive into the source code to see if we can exploit anything here

After comparing the differences between the source code and the rendered HTML structure, we can confirm that this script generates the select box on the site.
![](https://hackmd.io/_uploads/Ske5x1-h2.png)

And here is how ChatGPT explained it to me :smiley:

```htmlmixed=
<script>
    var stores = ["London","Paris","Milan"];
    var store = (new URLSearchParams(window.location.search)).get('storeId');
    document.write('<select name="storeId">');
    if(store) {
        document.write('<option selected>'+store+'</option>');
    }
    for(var i=0;i<stores.length;i++) {
        if(stores[i] === store) {
            continue;
        }
        document.write('<option>'+stores[i]+'</option>');
    }
    document.write('</select>');
</script>
```

1. The `stores` array contains three city names: "London", "Paris", and "Milan".
2. The code retrieves the value of the `storeId` parameter from the URL query string using `URLSearchParams` and its `get` method.
3. It then dynamically generates the HTML for the dropdown menu using the **`document.write`** method.
4. If a value for `storeId` exists, it writes an `<option selected>` element containing that value.
5. It iterates over the `stores` array and generates an `<option>` element for each city name, except for the one that matches the `storeId` value.
6. Finally, it closes the `<select>` element.

Because the script uses `window.location.search` and the `get` method, we can easily change the data stored in the variable `store` by sending a request with the `storeId` parameter, like this:

![](https://hackmd.io/_uploads/HkvoDeZhh.png)

We can control the data from the source **`location.search`** to the sink **`document.write`** => the website is vulnerable to DOM-based XSS.
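To see this flow without a browser, here is a minimal sketch of the same string building. It assumes a hypothetical helper, `buildSelectHtml`, which is not part of the lab: it mirrors the lab script but collects the pieces into a string instead of passing them to `document.write`. The `Tokyo` and `<b>bold</b>` values are made up for illustration.

```javascript
// Hypothetical helper (not part of the lab): returns the HTML that the lab
// script would pass to document.write, so the flow can be inspected in Node.js.
function buildSelectHtml(search) {
    var stores = ["London", "Paris", "Milan"];
    // Source: the storeId value taken from the query string
    var store = new URLSearchParams(search).get('storeId');
    var html = '<select name="storeId">';
    if (store) {
        // Same unescaped concatenation as in the lab script
        html += '<option selected>' + store + '</option>';
    }
    for (var i = 0; i < stores.length; i++) {
        if (stores[i] === store) {
            continue;
        }
        html += '<option>' + stores[i] + '</option>';
    }
    html += '</select>';
    return html;
}

// A plain value simply becomes the text of the selected option
console.log(buildSelectHtml('?storeId=Tokyo'));

// A value containing markup is copied into the generated HTML verbatim
console.log(buildSelectHtml('?storeId=' + encodeURIComponent('<b>bold</b>')));
```

The second call shows the problem: whatever is placed in `storeId` ends up inside the generated markup without any encoding, so the browser would parse it as HTML rather than as text.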
We will prove it by breaking out of the `<option>` tag and calling the `alert` function.

Payload:

```htmlmixed=
</option> <script>alert("hacked!!")</script>
```

![](https://hackmd.io/_uploads/Sky0sxWn2.png)

**Bingo!!!** 🥳

![](https://hackmd.io/_uploads/S1w4nxZhh.png)

:::success
**Solved** :thumbsup:
:::

### Lab: DOM XSS in jQuery selector sink using a hashchange event

This lab uses jQuery's `$()` selector function to auto-scroll to a given post, whose title is passed via the `location.hash` property.

Let's see how it works :face_with_monocle:

Here we can see a script on the home page:

![](https://hackmd.io/_uploads/Byd9fWfnn.png)

#### Let's dive into the source code

```htmlmixed=
<script>
    $(window).on('hashchange', function(){
        var post = $('section.blog-list h2:contains(' + decodeURIComponent(window.location.hash.slice(1)) + ')');
        if (post) post.get(0).scrollIntoView();
    });
</script>
```

The **`$(window).on('hashchange', function(){ ... })`** code attaches an event handler to the `hashchange` event of the window object. This means the handler runs whenever the URL hash changes.

**`window.location.hash`** returns the hash portion of the URL as a string. For example, if the URL is `http://example.com/page#section1`, then `window.location.hash` is `#section1`.

**`slice(1)`** is a string method that extracts a portion of a string starting from a specified index. In this case, `slice(1)` is used to exclude the first character (the `#` symbol) from the hash string.

**`$()`** allows you to select and manipulate HTML elements in a document.
Here are some examples of how the `$()` selector can be used:

```javascript=
$('div') // Selects all <div> elements in the document
$('.myClass') // Selects all elements with the class "myClass"
$('#myId') // Selects the element with the ID "myId"
$('input[type="text"]') // Selects all <input> elements with the attribute type="text"
$('.myClass').addClass('highlight').text('Hello, jQuery!') // Selects elements with class "myClass", adds the class "highlight", and sets the text content to "Hello, jQuery!"
```

Once you have selected elements using `$()`, you can apply various jQuery methods and functions to perform actions such as modifying their attributes, manipulating their content, binding event handlers, animating them, ...

```javascript=
$('section.blog-list h2:contains(' + decodeURIComponent(window.location.hash.slice(1)) + ')');
```

This line retrieves the hash value from the URL and constructs a jQuery selector to find the `<h2>` element within `<section class="blog-list">` that contains the decoded hash value. **`decodeURIComponent`** is used to decode any special characters in the hash value.

If a matching `<h2>` element is found, the code retrieves the DOM element using **`.get(0)`** and calls the **`scrollIntoView`** method to scroll the element into view.

This is the part of the HTML that the jQuery selector tries to find:

![](https://hackmd.io/_uploads/HJ-t3Wf3h.png)

So if we specify a title in the anchor part of the URL, the page scrolls so that the title we specified is at the top of the page, like this:

![](https://hackmd.io/_uploads/BJXPabzn3.png)

#### Vulnerable $() sink