contributed by < tommy2234 >
題目連結 quiz8
當我們想尋找一個字串中第一次出現某字元的位置,最 naive 的方法便是從頭開始一個個 character 逐一檢查,但是此方法一次只能檢查一個 byte。
在本題中我們運用 SIMD 的概念,一次抓取一個 long 長度的 block ,在此 block 之中套用 bitwise 的操作偵測該字元的存在。
/* The fast code reads the source one word at a time and only performs
* the bytewise search on word-sized segments if they contain the search
* character, which is detected by XORing the word-sized segment with a
* word-sized block of the search character and then detecting for the
* presence of NULL in the result.
*/
DETECT_CHAR
的概念是利用目標字元去對一個 block 之中每一個 byte 做 XOR
,當字元在 block 中存在 ,其所在的 byte 就會變成 0,如此就能達到辨別該字元的效果。
void *memchr_opt(const void *src_void, int c, size_t length)
{
const unsigned char *src = (const unsigned char *) src_void;
unsigned char d = c;
while (UNALIGNED(src)) {
if (!length--)
return NULL;
if (*src == d)
return (void *) src;
src++;
}
if (!TOO_SMALL(length)) {
/* If we get this far, we know that length is large and
* src is word-aligned.
*/
/* The fast code reads the source one word at a time and only performs
* the bytewise search on word-sized segments if they contain the search
* character, which is detected by XORing the word-sized segment with a
* word-sized block of the search character and then detecting for the
* presence of NULL in the result.
*/
// create a mask of length (8 * sizeof(long)) bits
unsigned long *asrc = (unsigned long *) src;
unsigned long mask = d << 8 | d;
mask = mask << 16 | mask;
for (unsigned int i = 32; i < LBLOCKSIZE * 8; i <<= 1)
mask = (mask << i) | mask;
while (length >= LBLOCKSIZE) {
/* XXXXX: Your implementation should appear here */
if(DETECT_CHAR(*asrc, mask))
break;
asrc++;
length -= LBLOCKSIZE;
}
/* If there are fewer than LBLOCKSIZE characters left, then we resort to
* the bytewise loop.
*/
src = (unsigned char *) asrc;
}
while (length--) {
if (*src == d)
return (void *) src;
src++;
}
return NULL;
}
首先我們將指向字串位置的指標 src 轉型為(unsigned Long *) 之後 assign 給 asrc ,此時 asrc 就是我們當前抓取的 block 。
一個 block 的長度為 sizeof(long) bytes ,定義在 LBLOCKSIZE 。
unsigned long *asrc = (unsigned long *) src;
接著我們創造一個 LBLOCKSIZE 個目標字元排列而成的 bit mask 。
unsigned long mask = d << 8 | d;
mask = mask << 16 | mask;
for (unsigned int i = 32; i < LBLOCKSIZE * 8; i <<= 1)
mask = (mask << i) | mask;
在迴圈中不斷抓取下一個 block 並偵測目標字元的存在,直到字元被偵測到,或者剩下的字串長度少於一個 block 的大小。
while (length >= LBLOCKSIZE) {
/* XXXXX: Your implementation should appear here */
if(DETECT_CHAR(*asrc, mask))
break;
asrc++;
length -= LBLOCKSIZE;
}
當字元在 block 中出現,我們跳出迴圈,然後逐個 byte 尋找其確切位置並回傳。