2022q1 Homework5 (quiz8)

contributed by < tommy2234 >

題目連結 quiz8

測驗 1

Overview

當我們想尋找一個字串中第一次出現某字元的位置，最 naive 的方法便是從頭開始一個個 character 逐一檢查，但是此方法一次只能檢查一個 byte。

在本題中我們運用 SIMD 的概念，一次抓取一個 long 長度的 block ，在此 block 之中套用 bitwise 的操作偵測該字元的存在。

/* The fast code reads the source one word at a time and only performs
 * the bytewise search on word-sized segments if they contain the search
 * character, which is detected by XORing the word-sized segment with a
 * word-sized block of the search character and then detecting for the
 * presence of NULL in the result.
 */

DETECT_CHAR

DETECT_CHAR 的概念是利用目標字元去對一個 block 之中每一個 byte 做 XOR ，當字元在 block 中存在，其所在的 byte 就會變成 0，如此就能達到辨別該字元的效果。

memchr_opt() 函式

void *memchr_opt(const void *src_void, int c, size_t length)
{
    const unsigned char *src = (const unsigned char *) src_void;
    unsigned char d = c;

    while (UNALIGNED(src)) {
        if (!length--)
            return NULL;
        if (*src == d)
            return (void *) src;
        src++;
    }

    if (!TOO_SMALL(length)) {
        /* If we get this far, we know that length is large and
         * src is word-aligned.
         */

        /* The fast code reads the source one word at a time and only performs
         * the bytewise search on word-sized segments if they contain the search
         * character, which is detected by XORing the word-sized segment with a
         * word-sized block of the search character and then detecting for the
         * presence of NULL in the result.
         */

        // create a mask of length (8 * sizeof(long)) bits
        unsigned long *asrc = (unsigned long *) src;
        unsigned long mask = d << 8 | d;
        mask = mask << 16 | mask;
        for (unsigned int i = 32; i < LBLOCKSIZE * 8; i <<= 1)
            mask = (mask << i) | mask;

        while (length >= LBLOCKSIZE) {
            /* XXXXX: Your implementation should appear here */
            if(DETECT_CHAR(*asrc, mask))
                break;
            asrc++;
            length -= LBLOCKSIZE;
        }
        
        /* If there are fewer than LBLOCKSIZE characters left, then we resort to
         * the bytewise loop.
         */
        src = (unsigned char *) asrc;
    }

    while (length--) {
        if (*src == d)
            return (void *) src;
        src++;
    }

    return NULL;
}

首先我們將指向字串位置的指標 src 轉型為（unsigned Long *) 之後 assign 給 asrc ，此時 asrc 就是我們當前抓取的 block 。
一個 block 的長度為 sizeof(long) bytes ，定義在 LBLOCKSIZE 。
```
    unsigned long *asrc = (unsigned long *) src;
```

接著我們創造一個 LBLOCKSIZE 個目標字元排列而成的 bit mask 。

    unsigned long mask = d << 8 | d;
    mask = mask << 16 | mask;
    for (unsigned int i = 32; i < LBLOCKSIZE * 8; i <<= 1)
        mask = (mask << i) | mask;

在迴圈中不斷抓取下一個 block 並偵測目標字元的存在，直到字元被偵測到，或者剩下的字串長度少於一個 block 的大小。

    while (length >= LBLOCKSIZE) {
        /* XXXXX: Your implementation should appear here */
        if(DETECT_CHAR(*asrc, mask))
            break;
        asrc++;
        length -= LBLOCKSIZE;
    }

當字元在 block 中出現，我們跳出迴圈，然後逐個 byte 尋找其確切位置並回傳。