2021q1 Homework (fibdrv)

contributed by < hankluo6 >

GitHub

環境

$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              1
On-line CPU(s) list: 0
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               158
Model name:          Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz
Stepping:            10
CPU MHz:             2304.002
BogoMIPS:            4608.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            8192K
NUMA node0 CPU(s):   0

大數運算

目前計算 fibonacci 是使用 long long int 來實作，而

f i b (93) = 12, 200, 160, 415, 121, 876, 738 > 9, 223, 372, 036, 854, 775, 807

，後者為 lone lone int 的最大值

2^{63} - 1

，此時算出來的數值將會溢位。

使用 gcc extension 內的 __int128 來計算 fibonacci sequence，而 128 bits 便能容納

2^{128} - 1

以內的數字。

TODO: 指出使用 __int128 來做數值表達後，可計算到 Fib(n) 的有效

n

值

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

128 bits unsigned integer 可以表達的整數上限為

2^{128} - 1 \approx 3.4 \times 10^{38}

，而

f i b (186) \approx 3.3 \times 10^{38}, f i b (187) \approx 5.3 \times 10^{38}

，故 __int128 最多只能表達到

n = 186

的情況。

在 fib_read 中，因要將 128 bits 的數字回傳到 user space，使用 ssize_t 無法表達完整數值範圍，故改為使用 copy_to_user 將資料回傳到 buf 內。

/* calculate the fibonacci number at given offset */
static ssize_t fib_read(struct file *file,
                        char *buf,
                        size_t size,
                        loff_t *offset)
{
    uint128_t ret = fib_sequence(*offset);
    copy_to_user(buf, &ret, sizeof(ret));
    return 1;
}

另外需要實作能正確印出 128 bits integer 的函式，此部份參考 how to print __uint128_t number using gcc?：

static int print_u128_u(uint128_t u128)
{
    int rc;
    if (u128 > UINT64_MAX) {
        uint128_t leading = u128 / P10_UINT64;
        uint64_t trailing = u128 % P10_UINT64;
        rc = print_u128_u(leading);
        rc += printf("%." TO_STRING(E10_UINT64) PRIu64, trailing);
    } else {
        uint64_t u64 = u128;
        rc = printf("%" PRIu64, u64);
    }
    return rc;
}

最後更改 scripts/expected.txt，便能正確印出

f i b (100)

。

使用 sysfs 讀取 kernel space 時間

在 sysfs 底下註冊兩個變數 fib_kt_ns 以及 copy_kt_ns，分別用來紀錄 fib_sequence 計算時間及 copy_to_user 計算時間。

定義 kernoel object 以及 attribute：

static ssize_t kobj_copy_show(struct kobject *kobj,
                    struct kobj_attribute *attr,
                    char *buf)
{
    copy_kt_ns = ktime_to_ns(copy_kt);
    return snprintf(buf, 64, "%ld\n", copy_kt_ns);
}

static ssize_t kobj_fib_show(struct kobject *kobj,
                    struct kobj_attribute *attr,
                    char *buf)
{
    fib_kt_ns = ktime_to_ns(fib_kt);
    return snprintf(buf, 64, "%ld\n", fib_kt_ns);
}

/* store operation is skipped */
static ssize_t kobj_store(struct kobject *kobj,
                     struct kobj_attribute *attr,
                     const char *buf,
                     size_t count)
{
    return count;
}

struct kobject *kobj_ref;
static struct kobj_attribute ktime_attr = __ATTR(fib_kt_ns, 0664, kobj_fib_show, kobj_store);
static struct kobj_attribute ktime_copy_attr = __ATTR(copy_kt_ns, 0664, kobj_copy_show, kobj_store);
static struct attribute *attrs[] = {
    &ktime_attr.attr,
    &ktime_copy_attr.attr,
    NULL,
};

在 init_fib_dev 內註冊 sysfs：

static int __init init_fib_dev(void)
{
    ...
    kobj_ref = kobject_create_and_add("fib_time", kernel_kobj);

    if (!kobj_ref) {
        printk(KERN_ALERT "Failed to create kobject");
        goto failed_device_create;
    }
    if (sysfs_create_group(kobj_ref, &attr_group))
        kobject_put(kobj_ref);
    ...
}

並在卸載模組時 exit_fib_dev 內移除：

static void __exit exit_fib_dev(void)
{
    ...
    kobject_del(kobj_ref);
}

更改 fib_read 以便在 read device 時能同時紀錄時間：

/* calculate the fibonacci number at given offset */
static ssize_t fib_read(struct file *file,
                        char *buf,
                        size_t size,
                        loff_t *offset)
{
    fib_kt = ktime_get();
    uint128_t ret = fib_sequence(*offset);
    fib_kt = ktime_sub(ktime_get(), fib_kt);
    copy_kt = ktime_get();
    copy_to_user(buf, &ret, sizeof(ret));
    copy_kt = ktime_sub(ktime_get(), copy_kt);
    return 0;
}

而在 client.c 內也要將紀錄的時間讀出：

#define FIB_SYS "/sys/kernel/fib_time/"

/* fib == 1 return calculated fibonacci time in kernel
 * else return copy_to_user time in kernel
 */  
static long int get_ktime(int fib)
{
    int kfd;
    if (fib == 1)
        kfd = open(FIB_SYS "fib_kt_ns", O_RDWR);
    else
        kfd = open(FIB_SYS "copy_kt_ns", O_RDWR);
    if (kfd < 0) {
        perror("Failed to open sys kobject");
        exit(1);
    }
    char buf[64];
    read(kfd, buf, 64);
    close(kfd);
    return atol(buf);
}

效能分析

如果直接進行效能分析，會發現每次測量的結果相差很大，且同個實驗內也會出現明顯的波動：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

排除干擾效能分析的因素

根據作業提示，將以下程式加入到 Makefile 中：

plot: all
	@sudo bash -c "echo 0 > /proc/sys/kernel/randomize_va_space"
	@sudo bash -c "echo performance > /sys/devices/system/cpu/cpu$(CPUID)/cpufreq/scaling_governor"
	@sudo bash -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
	$(MAKE) unload
	$(MAKE) load
	$(shell sudo sh performance.sh)
	python3 scripts/driver.py
	gnuplot plot.gp
	$(MAKE) unload

利用統計手法去除極端值

參考 4ce43c4 中的 python script，實作 python script，將 95% 外的 outlier 去除。

可以看到效能已經趨近穩定，且每次實驗執行時間皆固定在一定範圍內，另外可以發現在程式剛執行時時間較高，隨後才會下降。此原因在 KYG-yaya573142 的報告內有提到可能為 page fault 的影響。

Page faults

參考 Threaded RT-application with memory locking and stack handling example，將 page 鎖在 RAM 當中，並預先分配好空間，防止程式在運行時產生 page faults。

實作文章內的 configure_malloc_behavior 以及 reserve_process_memory 函式，並再次測量結果：

會發現結果不但沒有改善，甚至在 offset 為 0 時運行時間比原本更久。將 page fault 印出，會發現在一次 clock_gettime 時會產生一次 page fault，其原因可參考 bakudr18 同學的分析。但當如該同學的敘述在迴圈外額外呼叫一次 clock_gettime 時，page fault 次數的確減少為 0，但運行時間並沒有如期減少，故我推測原因並不是 page fault 造成的。

對照該同學的程式碼與我測試的程式碼，發現內容稍有不同：

int main()
{
    ...
    clock_gettime(CLOCK_MONOTONIC, &t1);
+   read(fd, &val, sizeof(val));
+   clock_gettime(CLOCK_MONOTONIC, &t2);
    for (int i = 0; i <= offset; i++) {
    ...
}

推測應為 read 的原因，將第二個 clock_gettime 移除並沒有產生不同的結果，加上 read 之後的結果：

還是有些地方需要釐清：

先呼叫 read 讓效能提昇的原因
前幾次的 overhead 還是較高

之前電子書提到的 ftrace 就可派上用場

Image Not Showing Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →

jserv

Fast Doubling

使用作業解說的 fast doubling 加速，並加入 __builtin_clz 計算位元位置：

在計算量較小時，使用 sequence 速度較快，而隨著

n

越大，使用 fast doubling 的速度越好，這是因為兩個演算法的時間複雜度不同：

sequence:
$O (n)$
fast doubling:
$O (l o g_{2} (n))$

TODO: 針對較小的

n

值，預先建立查詢表格 (表格的佔用空間很關鍵!)，降低運算量

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

jserv

2021q1 Homework (fibdrv)

環境

大數運算

使用 sysfs 讀取 kernel space 時間

效能分析

排除干擾效能分析的因素

利用統計手法去除極端值

Page faults

Fast Doubling

tags: linux2021

Read more

Linked List Sort

2021q1 Homework5 (sort)

2022q1 Homework5 (quiz5)

2022q1 Homework5 (quiz6)

tags: `linux2021`