# 2021q1 Homework5 (sort) contributed by < [Julian-Chu](https://github.com/Julian-Chu) > ###### tags: `linux2021` > [GitHub](https://github.com/Julian-Chu/ksort) LKM: Linux Loadable Kernel Module --- ## copy_to_user with uint64_t and variable length `dev_read` ```cpp uint64_t value = next(); /* in xoroshiro128plus.c */ /* copy_to_user has the format ( * to, *from, size) and ret 0 on success */ int n_notcopied = copy_to_user(buffer, (char *) (&value), len_); ``` `test_xoro.cc` ```cpp /* Read the response from the LKM */ ssize_t n_bytes_read = read(fd, rx, n_bytes); uint64_t value_ = 0; // uint64_t 8 bytes <-> char * // char 1 byte // little endian for (int b_idx = 0; b_idx < n_bytes_read; b_idx++) { unsigned char b = rx[b_idx]; value_ |= ((uint64_t) b << (8 * b_idx)); } ``` 重點: uint64_t 與 char * 的互換, bit operation, little endian :::warning q1: Test reading different numbers of bytes 設計的使用場景? > 你可嘗試用 `uintptr_t` (C99 規範) > :notes: jserv ::: --- ## 透過 ktime 取得時間並紀錄排序實作佔用的時間 [Github branch](https://github.com/Julian-Chu/ksort/tree/integrate-sort) `make check`: run check and plot ### `main.c` next()回傳 uint64, 使用 cmpint 比較 uint64 會導致排序錯誤, 新增 cmpuint64 ```cpp static int __init cmpuint64(const void *a, const void *b) { uint64_t a_val = *(uint64_t *) a; uint64_t b_val = *(uint64_t *) b; if (a_val > b_val) return 1; if (a_val == b_val) return 0; return -1; ``` 在 dev_read 內部利用 next() 產生待排序資料, 利用 cmpuint64 跟 sort_impl 進行排序, 量測排序時間並檢驗排序結果 ```cpp static ktime_t kt; static ssize_t dev_read(struct file *filep, char *buffer, size_t len, loff_t *offset) { size_t len_ = (len > 8) ? 8 : len; uint64_t value; // generate data to sort uint64_t sizes[TEST_LEN]; for (int i = 0; i < TEST_LEN; i++) { sizes[i] = next(); value = sizes[i]; } // measure sorting time kt = ktime_get(); sort_impl(sizes, TEST_LEN, sizeof(*sizes), cmpuint64, NULL); kt = ktime_sub(ktime_get(), kt); printk(" %llu\n", kt); /* copy_to_user has the format ( * to, *from, size) and ret 0 on success */ int n_notcopied = copy_to_user(buffer, (char *) (&value), len_); if (0 != n_notcopied) { printk(KERN_ALERT "XORO: Failed to read %d/%ld bytes\n", n_notcopied, len_); return -EFAULT; } // check sorted data #define ESORT 999 for (int i = 0; i < TEST_LEN - 1; i++) if (sizes[i] > sizes[i + 1]) { pr_err("test has failed\n"); return -ESORT; } printk(KERN_INFO "XORO: read %ld bytes\n", len_); return len_; } ``` 對 LKM 新增 write 方法回傳量測時間 ```cpp // declaration of write static ssize_t dev_write(struct file *, const char *, size_t size, loff_t *); static struct file_operations fops = { .open = dev_open, .read = dev_read, .release = dev_release, // assign function to write .write = dev_write, }; static ssize_t dev_write(struct file *file, const char *buf, size_t size, loff_t *offset) { printk(" %llu\n", kt); return ktime_to_ns(kt); } ``` ### `test_xoro.c` 更動 `test_xoro.c`, 由於用到 write, 需要設定 fd 可寫入, 將原本的 `O_RDONLY` 改為 `O_RDWR`, 使用 write 回傳資料 記得 read/write 回傳都要 error check, 善用 `errno` 提高 debug 效率 ```cpp int main(int argc, char *argv[]) { int fd = open("/dev/xoroshiro128p", O_RDWR); if (0 > fd) { perror("Failed to open the device."); return errno; } char write_buf[] = "testing writing"; /* Test reading different numbers of bytes */ for (int n_bytes = 0; n_bytes < 10; n_bytes++) { /* Clear/zero the buffer before copying in read data */ zero_rx(); /* Read the response from the LKM */ ssize_t n_bytes_read = read(fd, rx, n_bytes); if (0 > n_bytes_read) { perror("Failed to read all bytes."); return errno; } uint64_t value_ = 0; // uint64_t 8 bytes <-> char * // char 1 byte // little endian for (int b_idx = 0; b_idx < n_bytes_read; b_idx++) { unsigned char b = rx[b_idx]; value_ |= ((uint64_t) b << (8 * b_idx)); } // get sorting time long long kt = write(fd, write_buf, strlen(write_buf)); if (kt < 0) { perror("Failed to write the message to the device."); return errno; } printf("%d %lld\n", n_bytes, kt); } return 0; } ``` 跑 10 次的結果 ![](https://i.imgur.com/jDTTpqB.png) --- ## 改寫 [swenson/sort](https://github.com/swenson/sort) 的程式碼並整合進 ksort 修改 sort.h 使之可以被使用在 Linux 核心,移除 C 標準函式庫和執行環境的依賴,改用 Linux 核心內建的 API。 ```cpp malloc(size) -> kmalloc(size, GFP_KERNEL) // what's different between flags free -> kfree printf(stderr...)-> printk(KERN_ERR ...) exit(1) -> 移除 // 以下錯誤爲使用空間超過 stack 最大值的設定, 改用 heap 即可 BUG: stack guard page was hit at 000000005b18ddff (stack is 00000000e8fcc7a9..00000000f3c40e99) ``` 整合後發現排序錯誤, 原因是使用 demo 裡面的定義給 int 的 SORT_CMP 會導致 uint 的情況錯誤, (x-y) 不會小於零, 使用預設 SORT_CMP 即可修正 ```cpp #define SORT_CMP(x, y) (x - y) /* You can redefine the comparison operator. The default is #define SORT_CMP(x, y) ((x) < (y) ? -1 : ((x) == (y) ? 0 : 1)) but the one below is often faster for integer types. */ #define SORT_CMP(x, y) (x - y) ``` 改寫 `dev_read` 讓回傳時間可以透過 `copy_to_user` 回傳 ```cpp uint64_t *times = kmalloc(sizeof(uint64_t)*8, GFP_KERNEL); ....... // shell sort memcpy(dst, src, sizeof(uint64_t) * TEST_LEN); kt = ktime_get(); ksort_shell_sort(dst, TEST_LEN); times[(int)shell_sort] = ktime_sub(ktime_get(), kt); printk(KERN_INFO " %llu\n", kt); for (int i = 0; i < TEST_LEN - 1; i++) if (dst[i] > dst[i + 1]) { pr_err("test has failed with shell_sort\n"); kfree(src); kfree(dst); return -ESORT; } ...... int n_notcopied = copy_to_user(buffer, (void *) (times), sizeof(uint64_t)*8); ``` > todo: 利用 preprocessor 與 goto 簡化程式碼 排序 100 筆資料 ![](https://i.imgur.com/iPWnoqc.png) 排序 1000 筆資料 ![](https://i.imgur.com/TRfueTW.png) 排序 1500 筆資料 ![](https://i.imgur.com/RNkTlxQ.png) 嘗試排序 2000 筆資料以上會發生 page fault 錯誤, 找問題中 ```shell [31087.396151] BUG: unable to handle page fault for address: ffffffffc064b00f [31087.396152] #PF: supervisor instruction fetch in kernel mode [31087.396153] #PF: error_code(0x0010) - not-present page ``` :::warning q: 除了 dmesg, 在 kernel debug 的技巧? > 透過 UML 搭配 GDB 來除錯! > :notes: jserv ::: sort.h 遇到的 static analysis issue, 新增下列的 cppcheck arguments 暫時忽略。 ```shell --suppress=variableScope -i sort.h --suppress=redundantAssignment -i sort.h ``` --- ## 實作 introsort 或 pdqsort,整合進 ksort