Try   HackMD

2023q1 Homework2 (quiz2)

contributed by <vax-r>

測驗一

uint64_t next_pow2(uint64_t x)
{
    x |= x >> 1;
    x |= x >> 1;
    x |= x >> 1;
    x |= x >> 1;
    x |= x >> 1;
    x |= x >> 1;
    x |= x >> 1;
    x |= x >> AAAA;
    x |= x >> 16;
    x |= x >> BBBB;
    return CCCC;
}

程式碼原理

上述函式需要返回大於等於x且最接近x的2的指數
以64位元16進位的方式思考此函式功能如下







next_pow2



input

x

0x0010100011110000



function

 next_pow2 



input:f1->function





output

next_pow2(x)

0x0100000000000000



function->output





也就是將x的msb的左邊一個bit變成1,並把剩下所有bit都設為0
依照題目給的程式碼,可以將上述next_pow2再細分成以下過程







next_pow2



input

x

0x0010100011110000



function

next_pow2

0x0011111111111111

0x0011111111111111 + 1



input:f1->function





output

next_pow2(x)

0x0100000000000000



function->output





由此可知總共有16個bits,已經set 7個bits了,AAAA = 8, BBBB = 32, CCCC = x + 1

用 __builtin_clzl()改寫

此函式__builtin_clzl()會回傳MSB左方有幾個0
所以若 y = __builtin_clzl(x)
則可以知道 z = 0xffffffffffff >> y 就會是next_pow2當中的第一部份
接著再回傳 z+1 即可

uint64_t next_pow2(uint64_t x)
{
    int zeroes = __builtin_clzl(x);
    x = 0xffffffffffffffff >> zeroes;
    return x + 1;
}

編譯器對應之x86指令

next_pow2:
.LFB1:
	.cfi_startproc
	endbr64
	bsrq	%rdi, %rcx
	movq	$-1, %rax
	xorq	$63, %rcx
	shrq	%cl, %rax
	addq	$1, %rax
	ret
	.cfi_endproc

測驗二

int concatenatedBinary(int n)
{
    const int M = 1e9 + 7;
    int len = 0; /* the bit length to be shifted */
    /* use long here as it potentially could overflow for int */
    long ans = 0;
    for (int i = 1; i <= n; i++) {
        /* removing the rightmost set bit
         * e.g. 100100 -> 100000
         *      000001 -> 000000
         *      000000 -> 000000
         * after removal, if it is 0, then it means it is power of 2
         * as all power of 2 only contains 1 set bit
         * if it is power of 2, we increase the bit length
         */
        if (!(DDDD))
            len++;
        ans = (i | (EEEE)) % M;
    }
    return ans;
}

程式碼原理

DDDD是用來檢查將i移除lsb之後是否為0
思考如何利用bitwise operation移除lsb
觀察規律之後可以發現, i & (i-1) 可以達到這樣的效果
所以 DDDD: i & (i-1)
要將i串接在ans最右邊, 所以ans需要進行shift, shift的長度正好是len
所以 EEEE: ans << len

改寫

查閱 https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

Built-in Function: int __builtin_ffs (int x)

​​​​Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero. 

Built-in Function: int __builtin_clz (unsigned int x)

​​​​Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined. 

Built-in Function: int __builtin_ctz (unsigned int x)

​​​​Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined. 

根據以上定義和程式碼需求可以將程式碼改寫成以下

int concatenatedBinary(int n)
{
    const int M = 1e9 + 7;
    int len = 0; /* the bit length to be shifted */
    /* use long here as it potentially could overflow for int */
    long ans = 0;
    for (int i = 1; i <= n; i++) {
        /* removing the rightmost set bit
         * e.g. 100100 -> 100000
         *      000001 -> 000000
         *      000000 -> 000000
         * after removal, if it is 0, then it means it is power of 2
         * as all power of 2 only contains 1 set bit
         * if it is power of 2, we increase the bit length
         */
        int ffs = __builtin_ffs(i);
        if (!(i >> ffs))
            len++;
        ans = (i | (ans << len)) % M;
    }
    return ans;
}

測驗三

判斷 2 個 32 位元寬度的整數是否都是奇數 (odd),可能會這樣撰寫:

#include <stdint.h>
bool both_odd(uint32_t x, uint32_t y) {
    return (x & 1) && (y & 1);
}

因為若num是奇數,則第0位元會是1, 所以才比較兩個數字的第0位元是否都是1

題目中提到
若輸入的字串是一個有效的 UTF-8 序列,則計算其中的後續位元組數量,並將此數字從總字串長度中減去,即可確定字元數量。

size_t swar_count_utf8(const char *buf, size_t len)
{
    const uint64_t *qword = (const uint64_t *) buf;
    const uint64_t *end = qword + len >> 3;

    size_t count = 0;
    for (; qword != end; qword++) {
        const uint64_t t0 = *qword;
        const uint64_t t1 = ~t0;
        const uint64_t t2 = t1 & 0x04040404040404040llu;
        const uint64_t t3 = t2 + t2;
        const uint64_t t4 = t0 & t3;
        count += __builtin_popcountll(t4);
    }

    count = (1 << AAAA) * (len / 8)  - count;
    count += (len & BBBB) ? count_utf8((const char *) end, len & CCCC) : DDDD;

    return count;
}

程式碼原理

可以看到在for loop的部份是在計算此字串中的continuous bytes的數量
算完後需要將此數字從總長度中扣除
在一開始宣告的qword可以知道, for loop計算的時候每次存取8個byte
有可能送入的字串長度不被8整除, 所以for loop實際上只處理到能被8整除的字串長度
剩下沒有被處理到的字元需要另外處理

    count = (1 << 3) * (len / 8)  - count;
    count += (len & 7) ? count_utf8((const char *) end, len & 7) : 0;

在上述程式碼中 (1 << 3) * (len / 8) 是將len當中除以8之後的餘數給消除
而下一行當中的 (len & 7) 是用來檢查len是否被8整除

測驗四

以下程式碼只讓特定規律的數字通過

#include <stdint.h>
#include <stdbool.h>

bool is_pattern(uint16_t x)
{
    if (!x)
        return 0;

    for (; x > 0; x <<= 1) {
        if (!(x & 0x8000))
            return false;
    }

    return true;
}

觀察可得知數字規律如下

1000000000000000
1100000000000000
1110000000000000
.
.
.
1111111111110000

程式碼運作原理

bool is_pattern(uint16_t x)
{
    const uint16_t n = EEEE;
    return (n ^ x) < FFFF;
}

思考後可知
EEEE: ~x + 1
FFFF: x
參考: https://hackmd.io/@hankTaro/quiz2#測驗四