# [2020q1](http://wiki.csie.ncku.edu.tw/linux/schedule) Week 12 Quiz
###### tags: `linux2020`
:::info
Purpose: assess students' understanding of Linux memory management and of the memfd and epoll system calls
:::
==[Answer form](https://docs.google.com/forms/d/e/1FAIpQLSewyNR3_m8ksRmjkYmOZzZVcdy8iIz6ZbnqNVK4AqWZkNkPhw/viewform?usp=sf_link)==
---
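### Quiz `1`
[你所不知道的 C 語言:連結器和執行檔資訊](https://hackmd.io/@sysprog/c-linker-loader) (What You Don't Know About C: Linkers and Executable Information) introduced the ELF executable format; see [Executable and Linkable Format](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) for more details. Taking 64-bit ELF as an example, the first few bytes have the following meanings: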
| offset | size | Purpose |
|:------:|:----:|:-------:|
| 0x00 | 4 | 0x7F followed by ELF(`45` `4c` `46`) in ASCII; these four bytes constitute the magic number. |
| 0x04 | 1 | This byte is set to either 1 or 2 to signify 32- or 64-bit format, respectively. |
| 0x05 | 1 | This byte is set to either 1 or 2 to signify little or big endianness, respectively. This affects interpretation of multi-byte fields starting with offset `0x10`. |
| 0x06 | 1 | Set to 1 for the original and current version of ELF. |
| ... | ... | ... (to be continued) ... |
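The following code attempts to embed another ELF file (which may be encrypted beforehand) inside an existing ELF file. The goal is to conceal a particular program so that antivirus software or firewalls do not detect it, or to embed a high-value program in a document, an image, or even an audio/video file, from which a dedicated loader extracts the executable and runs it. This technique is not uncommon in the fields of [Digital rights management (DRM)](https://en.wikipedia.org/wiki/Digital_rights_management) and [Digital watermarking](https://en.wikipedia.org/wiki/Digital_watermarking).
Suppose the program to be embedded is named `payload.c`: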
```cpp
#include <stdio.h>
int main() { puts("Hello world!"); return 0; }
```
Compile it and strip the symbols (including those used for debugging):
```shell
$ gcc -Os payload.c -o payload
$ strip -s payload
```
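Next we develop a program that can load an ELF file. Before doing so, consider the following functions and system calls:
* [memfd_create](http://man7.org/linux/man-pages/man2/memfd_create.2.html): see [解析 Linux 共享記憶體機制](https://hackmd.io/@sysprog/linux-shared-memory) (An Analysis of Linux Shared Memory Mechanisms)
* [memmem](http://man7.org/linux/man-pages/man3/memmem.3.html): a GNU extension that searches a given memory range for a byte sequence that is **not** a C-style string (but is still contiguous memory)
* [fexecve](http://man7.org/linux/man-pages/man3/fexecve.3.html): similar to the execve system call, but loads and executes the program from a given file descriptor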
Suppose the loader is named `loader.c`, with the following contents:
```cpp
/* A program that executes a second (embedded) ELF */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* glibc before 2.27 has no wrapper for memfd_create(2), so provide our own. */
#include <sys/syscall.h>
static inline int memfd_create(const char *name, unsigned int flags)
{
    return syscall(__NR_memfd_create, name, flags);
}

/* ELF format
 * https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
 */
static bool valid_elf(char *ptr)
{
    return (ptr[4] == 1 || ptr[4] == 2) /* offset 0x4: 32/64-bit format */ &&
           (ptr[5] == 1 || ptr[5] == 2) /* offset 0x5: endianness */ &&
           (ptr[6] == 1);               /* offset 0x6: current version */
}

int main(int argc, char *argv[], char **envp)
{
    int pid = getpid();
    int ret = 0;
    char proc_path[32];
    sprintf(proc_path, "/proc/%d/exe", pid);
    int filedesc = open(proc_path, O_RDONLY);
    if (filedesc < 0) {
        printf("Invalid file descriptor for /proc: %d\n", filedesc);
        return -1;
    }

    /* Find the size of this executable */
    struct stat st;
    stat(proc_path, &st);
    size_t size = st.st_size;
    char *entirefile = malloc(size);
    if (!entirefile) {
        printf("Insufficient memory.\n");
        return -2;
    }
    read(filedesc, entirefile, size);
    close(filedesc);

    /* Find the second ELF header. An ELF header is 52 or 64 bytes long for
     * 32-bit and 64-bit binaries respectively.
     */
    const char elf_magic[] = {0x7F, 'E', 'L', 'F'};
    char *newelf = memmem(entirefile + 52, size - 52, elf_magic, 4);
    if (newelf && !valid_elf(newelf)) /* search again for a real ELF header */
        newelf = memmem(newelf + 6, size - (intptr_t) newelf - 6, elf_magic, 4);
    if (!newelf || !valid_elf(newelf)) {
        printf("No second ELF header found.\n");
        ret = -3;
        goto cleanup;
    }

    int newsize = AAA;
    int memfd = memfd_create("hidden", 0);
    if (memfd < 0) {
        printf("Invalid memfd.\n");
        ret = -4;
        goto cleanup;
    }

    /* Write ELF to temporary memory file */
    write(memfd, newelf, newsize);

    // Deploy the payload as a different process
    fork();
    if (BBB) {
        ret = fexecve(memfd, argv, envp); /* Execute the in-memory ELF */
        /* The above will only return if there is an error. */
        printf("Fail to execute payload. ret=%d (%s)\n", ret, strerror(errno));
    }

cleanup:
    free(entirefile);
    return ret;
}
```
Compile the loader, embed the `payload` executable built above, and then run it (yes, this really uses the `cat` command):
```shell
$ gcc -Wall loader.c -o loader
$ cat payload >> loader
$ ./loader
```
On x86_64 GNU/Linux (kernel version 4.15+), the expected output is:
```
Hello world!
```
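> Note: only a single line of "Hello world!" is printed
Complete the code; only the x86_64 architecture needs to be considered.
==Answer area== (Note: more than one answer may apply; select every valid answer)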
AAA = ?
* `(a)` `newelf - entirefile`
* `(b)` `size - newelf`
* `(c)` `entirefile - newelf`
* `(d)` `size - newelf - entirefile`
* `(e)` `size - newelf + entirefile`
* `(f)` `newelf - entirefile + size`
* `(g)` `entirefile - newelf - size`
* `(h)` `size - (newelf - entirefile)`
BBB = ?
* `(a)` `getpid()`
* `(b)` `getpid() != pid`
* `(c)` `getpid() == pid`
* `(d)` `0`
* `(e)` `1`
:::success
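Follow-up questions:
1. Explain how the code above works, point out its shortcomings, and improve it;
2. Referring to [Embedding binary data in executables](https://csl.name/post/embedding-binary-data/) and [incbin](https://github.com/graphitemaster/incbin), encrypt the payload and embed it in a given C program, so that the payload is decrypted at run time and then loaded and executed;
3. Study [Digital rights management (DRM)](https://en.wikipedia.org/wiki/Digital_rights_management) techniques and implement an e-book program that encrypts a particular text file and embeds it in the executable, so that it can only be opened and read on a specific machine (for example, by checking the [MAC address](https://simple.wikipedia.org/wiki/MAC_address)), without a plaintext temporary file ever appearing in the file system.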
:::
---
### Quiz `2`
Consider the following event loop implementation, written as a tribute to [libev](http://software.schmorp.de/pkg/libev.html). Its expected output is:
```
Test: oneshot timer
callback called
Test: periodic timer
callback timer periodic called 5
callback timer periodic called 4
callback timer periodic called 3
callback timer periodic called 2
callback timer periodic called 1
run timer cancel test ... passed
Test: raw events
```
The original source [ev.c](https://gist.github.com/jserv/0041219ef251e326c6fa18b3f170e1b8) contains comments and unit tests.
> Compiler options: `-O2 -std=gnu99`
APIs to study beforehand:
* [timerfd_create](http://man7.org/linux/man-pages/man2/timerfd_settime.2.html)
* [signalfd](http://man7.org/linux/man-pages/man2/signalfd.2.html)
Complete the code according to the expected output.
==Answer area== (single choice)
CCC = ?
* `(a)` 0
* `(b)` 1
DDD = ?
* `(a)` 0
* `(b)` 1
EEE = ?
* `(a)` 0
* `(b)` 1
---