2019q3 Homework1 (review)

contributed by < uccuser159 >

題目1

考慮以下 C 程式的 align4 巨集的作用是，針對 4-byte alignment，找到給定地址的 round up alignment address。

#include <stdio.h>
#define align4(x) (((x) + K) & (-4))
int main(void) {
    int p = 0x1997;
    printf("align4(p) is %08x\n", align4(p));
    return 0;
}

預期程式輸出 align4(p) is 00001998

K = ?

思考

來看 #define align4(x) (((x) + K) & (-4))

與 (-4)=1100₂ 做AND運算，是要抓出4的倍數：0x0000 . 0x0004 . 0x0008 …
- 1~3 round up to 4
- 5~7 round up to 8
- 9~11 round up to 12
- …以此類推

此處的K即是除以4的最大餘數。將 K 加上去後可以讓 x 往上進位到該有的4的倍數同時不要讓要 x 超過4的下一個倍數，所以 K 值應該為3。

Reference

Align macro

延伸問題

Linux 核心原始程式碼 alignment 巨集

原始程式碼 ( line 33 ~ line 37 )：

#define ALIGN(x, a)		__ALIGN_KERNEL((x), (a))
#define ALIGN_DOWN(x, a)	__ALIGN_KERNEL((x) - ((a) - 1), (a))
#define __ALIGN_MASK(x, mask)	__ALIGN_KERNEL_MASK((x), (mask))
#define PTR_ALIGN(p, a)	((typeof(p))ALIGN((unsigned long)(p), (a)))
#define IS_ALIGNED(x, a)	(((x) & ((typeof(x))(a) - 1)) == 0)

__ALIGN_KERNEL & __ALIGN_KERNEL_MASK ( line 10 & line 11 )

#define __ALIGN_KERNEL(x, a)		__ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
#define __ALIGN_KERNEL_MASK(x, mask)	(((x) + (mask)) & ~(mask))

應用 ( line 35 )：

為確保堆疊的指標所指向的位址是正確的




















int notrace unwind_frame(struct stackframe *frame)
{
	unsigned long high, low;
	unsigned long fp = frame->fp;

	/* only go to a higher address on the stack */
	low = frame->sp;
	high = ALIGN(low, THREAD_SIZE);

	/* check current frame pointer is within bounds */
	if (fp < low + 12 || fp > high - 4)
		return -EINVAL;

	/* restore the registers from the stack frame */
	frame->fp = *(unsigned long *)(fp - 12);
	frame->sp = *(unsigned long *)(fp - 8);
	frame->pc = *(unsigned long *)(fp - 4);

	return 0;
}

-Linux kernel source code

缺乏實際 Linux 核心原始程式碼及其「應用場景」的解說
:notes: jserv

為了提高存取效率，例如記憶體存取、分頁，需要將資料對齊到 2^N ( N 為非負整數 )。
而編譯器在分配記憶體時，會按照宣告的型態去做對齊。

#include <stdio.h>

#define ALIGN(x, a) __ALIGN_MASK(x, (typeof(x))(a) - 1)
#define __ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))

int main() {
    for (int i = 0; i < 16; i++)
        printf("ALIGN(%d, 4): 0x%08x\n", i, ALIGN(i, 4));
    return 0;
}

程式執行結果:

ALIGN(0, 4): 0x00000000
ALIGN(1, 4): 0x00000004
ALIGN(2, 4): 0x00000004
ALIGN(3, 4): 0x00000004
ALIGN(4, 4): 0x00000004
ALIGN(5, 4): 0x00000008
ALIGN(6, 4): 0x00000008
ALIGN(7, 4): 0x00000008
ALIGN(8, 4): 0x00000008
ALIGN(9, 4): 0x0000000c
ALIGN(10, 4): 0x0000000c
ALIGN(11, 4): 0x0000000c
ALIGN(12, 4): 0x0000000c
ALIGN(13, 4): 0x00000010
ALIGN(14, 4): 0x00000010
ALIGN(15, 4): 0x00000010

typeof(x) 表示取 x 的類型，此例的ALIGN(x,a)中 x 是 int ，則typeof(x)為 int ，故 (typeof(x))(a)-1 ，表示把 a 轉化為 x 的類型，再減1，當作mask。以ALIGN(1,4)來說，mask=0000...00011(32bits)，& ~(mask)即是在做 round up 成4的倍數。

GCC Manual 6.1 Statements and Declarations in Expressions 和 [6.7] Referring to a Type with typeof 提到：

For example, the “maximum” function is commonly defined as a macro in standard C as follows:
#define max(a,b) ((a) > (b) ? (a) : (b))
But this definition computes either a or b twice, with bad results if the operand has side effects. In GNU C, if you know the type of the operands (here taken as int), you can avoid this problem by defining the macro as follows:
#define maxint(a,b) \
   ({int _a = (a), _b = (b); _a > _b ? _a : _b; })
Here is how the two together can be used to define a safe “maximum” macro which operates on any arithmetic type and evaluates each of its arguments exactly once:
#define max(a,b) \
({ 
   typeof (a) _a = (a); \
   typeof (b) _b = (b); \
   _a > _b ? _a : _b; 
})

GCC Manual

TODO: typeof 是 GNU extension，找出 GCC Manual 對應描述並解說為何巨集定義用得到此 extension
:notes: jserv

Reference

題目2

考慮以下 C 程式碼:

#include <stdbool.h>
bool func(unsigned int x) {
    return x && ((x & (~x + 1)) == x);       
}

可改寫為以下等價的程式碼:

bool func(unsigned int x) {
    unsigned int unknown = 0;
    while (x && unknown <= 1) {
        if ((x & Z) == 1)
            unknown++;
        x >>= 1;
   }
   return (unknown == 1);
}

請補完程式。

思考

考慮程式碼：

#include <stdbool.h>
bool func(unsigned int x) {
    return x && ((x & (~x + 1)) == x);       
}

先將func函式的參數從1(0001₂)代到8(1000₂)：

工程人員說話應該精確，英文用語怎麼寫，就用對應的中文陳述
:notes: jserv











#include <stdio.h>
#include <stdbool.h>
bool func(unsigned int x) {
     return x && ((x & (~x + 1)) == x);
}
  
int main(void) {
    for (int i = 1; i <= 8; i++){ //try x from 0 to 8
        printf("x=%d => Result:%d\n ", i, func_up(i));
    }
}

執行結果:

x=1 => Result:1
x=2 => Result:1
x=3 => Result:0
x=4 => Result:1
x=5 => Result:0
x=6 => Result:0
x=7 => Result:0
x=8 => Result:1

由結果推論是在判斷一數字 x 是否為2的次方。

return x && ((x & (~x+1))==x)

先來嘗試 x & (~x+1)

x = 1 (0001₂) :

  0001
& 1111
-------
  0001

x = 2 (0010₂) :

  0010
& 1110
-------
  0010

x = 3 (0011₂) :

  0011
& 1101
-------
  0001

x = 6 (0110₂) :

  0110
& 1010
-------
  0010

x = 7 (0111₂) :

  0111
& 1001
-------
  0001

x = 8 (1000₂) :

  1000
& 1000
-------
  1000

發現x & (~x+1)的效果為將數字最右邊 bit 的1保留，其餘 bit 為0。
所以符合x & (~x+1)==x式子的數字 x 應該要是化成二進制時只有其中一個 bit 為1，即2的次方(2ⁿ) (例如: x=0100₂ . 0010₂ …等等)。
而x && (x & (~x+1)==x)即是在 x=0 的狀況下回傳 False，將其狀況排除。

考慮等價的函式(判斷數字 x 是否為2的次冪)：

bool func(unsigned int x) {
    unsigned int unknown = 0;
    while (x && unknown <= 1) {
        if ((x & Z) == 1)
            unknown++;
        x >>= 1;
   }
   return (unknown == 1);
}

Z 值就像一個在LSB檢查"1"的偵測器，檢查 x 的 LSB 是否為1，否則將 x 做右移運算。
有此可知，Z 值應該為1。
while(x && unknown <= 1)會將 x=0 的狀況排除且unknown拿來記錄"1"的個數，當在 if 條件中檢測到"1"即將 unknown 加1，這邊舉三種例子：

例1: x=0000₂

x=0 不會進入 while 迴圈， unknown==0 ，回傳false

例2: x=0010₂

Step1. x!=0 且 unknown==0 進入 while 迴圈 → LSB 不為1 → x做算術右移為0001 
Step2. x!=0 且 unknown==0 進入 while 迴圈 → LSB 為1 → unknown==1 → x做算術右移為0000
Step3. x=0 不會進入 while 迴圈 → unknown==1，回傳 true

例3: x=0110₂

Step1. x!=0 且 unknown==0 進入 while 迴圈 → LSB 不為1 → x做算術右移為0011 
Step2. x!=0 且 unknown==0 進入 while 迴圈 → LSB 為1 → unknown==1 → x做算術右移為0001
Step3. x!=0 且 unknown==1 進入 while 迴圈 → LSB 為1 → unknown==2 → x做算術右移為0000
Step4. x=0 不會進入 while 迴圈 → unknown==2，回傳 false

此等價的函式只想在 x 中找僅有1個bit為"1"的數，所以回傳的布林判斷條件為unknown == 1

延伸問題

透過 bitwise 操作將原有運算改為 const-time 實作的程式

判斷一整數為正 or 負 or 0
























/*
if the integer is: positive print +1, negative print -1, zero print 0
*/
int32_t pos_neg_zero(int32_t a){
    int32_t sign;   // 判斷結果
    sign = (a != 0) | (a >> 31);  
    return sign;
    
int main(void){

    int x = -15;
    int y = 15;
    int z = 0;
    printf("x sign:%d\n",pos_neg_zero(x)); 
    printf("y sign:%d\n",pos_neg_zero(y));
    printf("z sign:%d\n",pos_neg_zero(z));
    return 0;

/*
x sign:-1
y sign:1
z sign:0
*/
}

判斷一整數為非負整數























/*
if the integer is: non-negative print +1, negative print 0
*/
int32_t non_neg(int32_t a){
    int32_t sign;   // 判斷結果
    sign = 1 ^ (a >> 31);  
    return sign;
}

int main(void){

    int x = -15;
    int y = 15;
    int z = 0;
    printf("x sign:%d\n",non_neg(x)); 
    printf("y sign:%d\n",non_neg(y));
    printf("z sign:%d\n",non_neg(z));

/*
x sign:-2
y sign:1
z sign:1
*/

此方法發現與結果不合，問題出現在數值小於0的狀況。
原因是當有號數在做右移運算時，會使用正負號位元填補空出的位元位置，所以小於0的數，經過右移31bits後為1111…111₂(32bits)，而非0000…001₂(32bits)
故將程式改成：
























/*
if the integer is: non-negative print +1, negative print 0
*/
int32_t non_neg(int32_t a){
    int32_t sign;   // 判斷結果
    sign = ~(1 & (a >> 31)) +2;  
    return sign;
}

int main(void){

    int x = -15;
    int y = 15;
    int z = 0;
    printf("x sign:%d\n",non_neg(x)); 
    printf("y sign:%d\n",non_neg(y));
    printf("z sign:%d\n",non_neg(z));
    return 0;

/*
x sign:0
y sign:1
z sign:1
*/

Reference

題目3

[ ]考慮以下程式碼：

#include <stdio.h>
int main() { return (*******puts)("Hello"); }

能否編譯並執行呢？若可，為什麼呢？

Why？
先了解 C99 規格：
1. [ 6.3.2.1-4 ]
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator or the unary & operator, a function designator with type ‘‘function returning type’’ is converted to an expression that has type ‘‘pointer to function returning type’’.
1. [ 6.5.3.2-4 ]
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator

例如：

#include <stdio.h>
int (*fptr1)(int);
int test1(int a){
    a = a + 1;
    return a;
}
int main(){
    fptr1 = test1;
    printf("test1: %x", test1);
    printf("fptr1: %x\n", fptr1);
    printf("fptr1: %x\n", *fptr1);
    printf("fptr1: %x\n", *****fptr1);
    return 0;
    
/*
test1: 400543 
fptr1: 400543
fptr1: 400543
fptr1: 400543
*/
}

A "function designator" is converted to a "pointer to function returning type"
=> test1 是一個 function designator 所以它會轉換成一個 pointer to function
If the " * " operand points to a function, the result is a function designator
=> (*fptr1) 的結果會是一個 function designator

從例子可以得知，只要 * operator 指向一函式，無論 * operater 有幾個最終出來的結果都會是一個function designator，而 function designator 會再轉換成一個 pointer to function，最終指向 test1 此函式。

回頭來看題目

#include <stdio.h>
int main() { return (*******puts)("Hello"); }

先將(*******puts) 可看作 (*(*(*(*(*(*(*puts)))))))
根據剛剛的結論：不管有幾個 * operater，最終出來的結果都會是一個function designator，而 function designator 會再轉換成一個 pointer to function，最終還是指向 puts 函式。

puts 的 function prototype 是 int puts(const char *s)
最終主程式 main 將呼叫 puts 函式 print 出 "Hello"

Reference