Security Vulnerabilities Seen Through the C Language Specification

Consider the following C program:

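#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
unsigned int ui = 0;
unsigned short us = 0;
signed int si = -1;
int main(void) {
    int64_t r1 = ui + si;
    int64_t r2 = us + si;
    /* PRId64 is the matching conversion for int64_t; under LP64 int64_t is long, not long long */
    printf("%" PRId64 " %" PRId64 "\n", r1, r2);
    return 0;
}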

What does it print in an LP64 execution environment?

Hint: INT02-C. Understand integer conversion rules

1. Integer Conversion

C99 Standard (§ 6.3.1.1)
Every integer type has an integer conversion rank defined as follows:

No two signed integer types shall have the same rank, even if they have the same representation.
The rank of a signed integer type shall be greater than the rank of any signed integer type with less precision.
The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
The rank of any extended signed integer type relative to another extended signed integer type with the same precision is implementation-defined, but still subject to the other rules for determining the integer conversion rank.

Following the C99 Standard, the integer conversion ranks can be summarized as:

long long int > long int > int > short int > signed char
rank of unsigned int == rank of signed int (an unsigned type always has the same rank as its corresponding signed type)
ranks among extended signed integer types of the same precision are implementation-defined
a standard integer type ranks higher than an extended integer type of the same width
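Rank itself cannot be printed, but the type of a mixed expression can be inspected. The following is a minimal sketch, assuming a C11 compiler (for `_Generic`); the function names are hypothetical and only serve this illustration.

```c
/* Each function returns 1 when the mixed-type expression has the expected type. */

/* long outranks int, so the int operand is converted to long. */
int long_plus_int_is_long(void) {
    return _Generic(1L + 1, long: 1, default: 0);
}

/* long long outranks long even when both are 64 bits wide (as under LP64). */
int llong_plus_long_is_llong(void) {
    return _Generic(1LL + 1L, long long: 1, default: 0);
}

/* Equal rank, different signedness: the result takes the unsigned type. */
int uint_plus_int_is_uint(void) {
    return _Generic(1u + 1, unsigned int: 1, default: 0);
}
```

All three functions should return 1, which matches the ordering listed above.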

2. Integer Promotion

Integers generally follow the rules above, but arithmetic is subject to one special rule that deserves attention, integer promotion:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.

Footnote 48 of C99 says the following about this rule:

The integer promotions are applied only: as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary +, -, and ~ operators, and to both operands of the shift operators, as specified by their respective subclauses.

Put simply, integer promotion applies in ordinary arithmetic expressions (for example with unary operators such as +, -, and ~): a type of lower rank than int, such as char, is promoted to int:


char c1, c2;  // both operands are char
c1 = c1 + c2; // c1 + c2 is computed as int, then converted back to char on assignment
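The effect of the promotion can be observed without a debugger. This is a small sketch, assuming a 32-bit two's-complement int; the function names are hypothetical.

```c
#include <stddef.h>

/* sizeof does not evaluate c1 + c2, but it reports the size of the promoted type. */
size_t size_of_char_sum(void) {
    char c1 = 1, c2 = 2;
    return sizeof(c1 + c2);
}

/* ~ works on the promoted int 0x000000FF, so the result is 0xFFFFFF00. */
int complement_of_uchar(void) {
    unsigned char c = 0xFF;
    return ~c;
}

/* The sum is computed in int, so 100 + 100 is not limited to the range of char. */
int sum_of_chars(void) {
    char c1 = 100, c2 = 100;
    return c1 + c2;
}
```

Under the stated assumptions, `size_of_char_sum()` equals `sizeof(int)`, `complement_of_uchar()` is -256 rather than 0, and `sum_of_chars()` is 200.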

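When the two operands have the same rank but differ in signedness, as with int and unsigned int, the usual arithmetic conversions (C99 § 6.3.1.8) convert the signed operand to the unsigned type: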



signed int si = -1;   // si and ui have the same rank
unsigned int ui = 0;
int result = si + ui; // si + ui has type unsigned int; the value is converted to int on assignment

Back to the first value:

ui and si have the same rank, so the usual arithmetic conversions convert si to unsigned int; -1 becomes 4294967295 (UINT_MAX), and the sum stored in r1 is → decimal 4294967295

The second value:

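us is an unsigned short, whose rank is lower than that of si (signed int). Because int can represent every unsigned short value, integer promotion converts us to signed int and the sum is computed in int → decimal -1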
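Both results can be checked directly. The sketch below reuses the variables from the question and assumes an LP64 platform with a 32-bit int and a 16-bit short; the function names are hypothetical, and `_Generic` requires C11.

```c
#include <stdint.h>

static unsigned int ui = 0;
static unsigned short us = 0;
static signed int si = -1;

/* Same rank: si is converted to unsigned int before the addition. */
int64_t first_value(void) { return ui + si; }

/* us is promoted to int, so the addition is ordinary signed arithmetic. */
int64_t second_value(void) { return us + si; }

/* Type checks on the two expressions; each returns 1 when the type matches. */
int first_is_unsigned(void) { return _Generic(ui + si, unsigned int: 1, default: 0); }
int second_is_int(void)     { return _Generic(us + si, int: 1, default: 0); }
```

The conversion to `int64_t` happens only after the addition has produced its 32-bit result, which is why the first value is 4294967295 and not -1.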

As an experiment, I deliberately wrote code that fails to compile, so that the compiler diagnostics reveal the actual data type of the expression:

void test(int *a) { }
char a, b;
int main(void) {
    test(a + b); /* deliberately wrong: a + b is an int, not an int * */
}

The compiler's diagnostic reads:

expected int* but argument is of type int

This shows that integer promotion takes place at the moment the operation is evaluated.
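The same question can be answered without provoking a compile error. This is a minimal sketch, assuming a C11 compiler; `IS_INT` is a hypothetical helper macro built on `_Generic`, not something from the original experiment.

```c
/* Hypothetical helper macro: 1 if the expression has type int, 0 otherwise. */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

static char a, b;

/* a on its own keeps the type char. */
int plain_char_is_int(void) { return IS_INT(a); }

/* Binary + promotes both operands, so the sum has type int. */
int char_sum_is_int(void) { return IS_INT(a + b); }

/* Unary + is enough to trigger the promotion. */
int unary_plus_is_int(void) { return IS_INT(+a); }
```

Only the first function should return 0, which is consistent with the diagnostic above: the promotion belongs to the operators, not to the variables.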

Integer Overflow

Ref: http://projects.webappsec.org/w/page/13246946/Integer Overflows

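Integer overflow occurs when the result of an operation exceeds what the data type of the variable storing it can represent, for example when the computed value needs 64 bits but result is of type int.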

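When this happens, a phenomenon called wraparound occurs: once a number exceeds the maximum value, the excess starts over from the minimum value. It works like a clock: one hour past 12:00, at 13:00, the hand points at 1 again.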

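Back to programming: an 8-bit signed integer ranges from -128 to 127. If the program stores 127 and we add 1, intuition says the answer should be 128, but 128 is outside the range of the type, so on a two's-complement machine the stored value typically becomes -128. Note that C guarantees wraparound only for unsigned types; overflow in signed arithmetic is undefined behavior. The opposite problem, a value that is too small, is called underflow, and there the wraparound happens at the minimum of the range.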
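A common defense is to test for overflow before performing the addition. The sketch below shows unsigned wraparound next to a pre-checked signed addition; `wrap_u8` and `checked_add` are hypothetical helper names, not functions from any library.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned arithmetic wraps modulo 2^8 after the conversion back to uint8_t. */
uint8_t wrap_u8(uint8_t x) {
    return (uint8_t)(x + 1u);
}

/* Stores a + b in *out and returns true only when the sum fits in an int.
 * The range test runs before the addition, so no signed overflow occurs. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

`wrap_u8(255)` yields 0, while `checked_add(INT_MAX, 1, &r)` reports failure instead of producing a wrapped value.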

SSH CRC32 Weakness

Consider the following code. On x86_64 with the glibc implementation, what does the printf function print?

#include <stdio.h>
union some {
    int i; float f;
};

int func(union some *up, float *fp) {
    up->i = 123; *fp = -0.0;
    return up->i;
}

int main() {
    union some u;
    printf("%d\n", func(&u, &u.f));
    return 0;
}

Hint: C99 § 6.5.2.3 states:

If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type.

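A union is a special user-defined type that stores different data types in the same memory location; at any one time it holds the value of only one member.

All members declared in a union share the same storage, and the union is sized to hold its largest member (padding may be added for alignment).

Writing any member affects the values of all the other members; only one member's value can be kept at a time.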
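The shared storage can be confirmed with `sizeof` and address comparisons. This is a minimal sketch; `union sample` and the function names are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>

union sample {
    char c;
    int i;
    double d;
};

/* The union must be able to hold its largest member. */
size_t union_size(void) {
    return sizeof(union sample);
}

/* Every member starts at the same address as the union object itself. */
int members_share_address(void) {
    union sample s;
    return (void *)&s.c == (void *)&s
        && (void *)&s.i == (void *)&s
        && (void *)&s.d == (void *)&s;
}
```

`union_size()` is at least `sizeof(double)`, and `members_share_address()` returns 1.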

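In this problem, int and float are both 4 bytes.

In func, up->i = 123; sets the storage to 0x0000007b.

Next, *fp = -0.0; changes the storage to 0x80000000 (only the sign bit is set).

Finally, printf prints the returned up->i, which now reads the modified 0x80000000.

In two's complement representation that is -2147483648.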
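The bit patterns in this walkthrough can be reproduced by accessing both members directly within one function, which avoids passing two pointers to the same object. The sketch assumes a 32-bit two's-complement int and an IEEE 754 single-precision float; the function names are hypothetical.

```c
#include <stdint.h>

union some {
    int i;
    float f;
};

/* Stores 123 through i, overwrites the storage through f, then reads i back. */
int read_int_after_float_store(void) {
    union some u;
    u.i = 123;
    u.f = -0.0f;
    return u.i;
}

/* Raw bits of -0.0f: only the sign bit is set. */
uint32_t bits_of_negative_zero(void) {
    union { float f; uint32_t u; } pun = { .f = -0.0f };
    return pun.u;
}
```

Under these assumptions the first function returns -2147483648 and the second returns 0x80000000, matching the steps described above.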