# 2021q1 Homework5 (sort)
contributed by < [Julian-Chu](https://github.com/Julian-Chu) >
###### tags: `linux2021`
> [GitHub](https://github.com/Julian-Chu/ksort)
LKM: Linux Loadable Kernel Module
---
## copy_to_user with uint64_t and variable length
`dev_read`
```cpp
uint64_t value = next(); /* in xoroshiro128plus.c */
/* copy_to_user has the format ( * to, *from, size) and ret 0 on success */
int n_notcopied = copy_to_user(buffer, (char *) (&value), len_);
```
`test_xoro.cc`
```cpp
/* Read the response from the LKM */
ssize_t n_bytes_read = read(fd, rx, n_bytes);
uint64_t value_ = 0;
// uint64_t 8 bytes <-> char *
// char 1 byte
// little endian
for (int b_idx = 0; b_idx < n_bytes_read; b_idx++) {
unsigned char b = rx[b_idx];
value_ |= ((uint64_t) b << (8 * b_idx));
}
```
重點: uint64_t 與 char * 的互換, bit operation, little endian
:::warning
q1: Test reading different numbers of bytes 設計的使用場景?
> 你可嘗試用 `uintptr_t` (C99 規範)
> :notes: jserv
:::
---
## 透過 ktime 取得時間並紀錄排序實作佔用的時間
[Github branch](https://github.com/Julian-Chu/ksort/tree/integrate-sort)
`make check`: run check and plot
### `main.c`
next()回傳 uint64, 使用 cmpint 比較 uint64 會導致排序錯誤, 新增 cmpuint64
```cpp
static int __init cmpuint64(const void *a, const void *b)
{
uint64_t a_val = *(uint64_t *) a;
uint64_t b_val = *(uint64_t *) b;
if (a_val > b_val)
return 1;
if (a_val == b_val)
return 0;
return -1;
```
在 dev_read 內部利用 next() 產生待排序資料, 利用 cmpuint64 跟 sort_impl 進行排序, 量測排序時間並檢驗排序結果
```cpp
static ktime_t kt;
static ssize_t dev_read(struct file *filep,
char *buffer,
size_t len,
loff_t *offset)
{
size_t len_ = (len > 8) ? 8 : len;
uint64_t value;
// generate data to sort
uint64_t sizes[TEST_LEN];
for (int i = 0; i < TEST_LEN; i++) {
sizes[i] = next();
value = sizes[i];
}
// measure sorting time
kt = ktime_get();
sort_impl(sizes, TEST_LEN, sizeof(*sizes), cmpuint64, NULL);
kt = ktime_sub(ktime_get(), kt);
printk(" %llu\n", kt);
/* copy_to_user has the format ( * to, *from, size) and ret 0 on success */
int n_notcopied = copy_to_user(buffer, (char *) (&value), len_);
if (0 != n_notcopied) {
printk(KERN_ALERT "XORO: Failed to read %d/%ld bytes\n", n_notcopied,
len_);
return -EFAULT;
}
// check sorted data
#define ESORT 999
for (int i = 0; i < TEST_LEN - 1; i++)
if (sizes[i] > sizes[i + 1]) {
pr_err("test has failed\n");
return -ESORT;
}
printk(KERN_INFO "XORO: read %ld bytes\n", len_);
return len_;
}
```
對 LKM 新增 write 方法回傳量測時間
```cpp
// declaration of write
static ssize_t dev_write(struct file *, const char *, size_t size, loff_t *);
static struct file_operations fops = {
.open = dev_open,
.read = dev_read,
.release = dev_release,
// assign function to write
.write = dev_write,
};
static ssize_t dev_write(struct file *file,
const char *buf,
size_t size,
loff_t *offset)
{
printk(" %llu\n", kt);
return ktime_to_ns(kt);
}
```
### `test_xoro.c`
更動 `test_xoro.c`, 由於用到 write, 需要設定 fd 可寫入, 將原本的 `O_RDONLY` 改為 `O_RDWR`, 使用 write 回傳資料
記得 read/write 回傳都要 error check, 善用 `errno` 提高 debug 效率
```cpp
int main(int argc, char *argv[])
{
int fd = open("/dev/xoroshiro128p", O_RDWR);
if (0 > fd) {
perror("Failed to open the device.");
return errno;
}
char write_buf[] = "testing writing";
/* Test reading different numbers of bytes */
for (int n_bytes = 0; n_bytes < 10; n_bytes++) {
/* Clear/zero the buffer before copying in read data */
zero_rx();
/* Read the response from the LKM */
ssize_t n_bytes_read = read(fd, rx, n_bytes);
if (0 > n_bytes_read) {
perror("Failed to read all bytes.");
return errno;
}
uint64_t value_ = 0;
// uint64_t 8 bytes <-> char *
// char 1 byte
// little endian
for (int b_idx = 0; b_idx < n_bytes_read; b_idx++) {
unsigned char b = rx[b_idx];
value_ |= ((uint64_t) b << (8 * b_idx));
}
// get sorting time
long long kt = write(fd, write_buf, strlen(write_buf));
if (kt < 0) {
perror("Failed to write the message to the device.");
return errno;
}
printf("%d %lld\n", n_bytes, kt);
}
return 0;
}
```
跑 10 次的結果

---
## 改寫 [swenson/sort](https://github.com/swenson/sort) 的程式碼並整合進 ksort
修改 sort.h 使之可以被使用在 Linux 核心,移除 C 標準函式庫和執行環境的依賴,改用 Linux 核心內建的 API。
```cpp
malloc(size) -> kmalloc(size, GFP_KERNEL) // what's different between flags
free -> kfree
printf(stderr...)-> printk(KERN_ERR ...)
exit(1) -> 移除
// 以下錯誤爲使用空間超過 stack 最大值的設定, 改用 heap 即可
BUG: stack guard page was hit at 000000005b18ddff (stack is 00000000e8fcc7a9..00000000f3c40e99)
```
整合後發現排序錯誤, 原因是使用 demo 裡面的定義給 int 的 SORT_CMP 會導致 uint 的情況錯誤, (x-y) 不會小於零, 使用預設 SORT_CMP 即可修正
```cpp
#define SORT_CMP(x, y) (x - y)
/* You can redefine the comparison operator.
The default is
#define SORT_CMP(x, y) ((x) < (y) ? -1 : ((x) == (y) ? 0 : 1))
but the one below is often faster for integer types.
*/
#define SORT_CMP(x, y) (x - y)
```
改寫 `dev_read` 讓回傳時間可以透過 `copy_to_user` 回傳
```cpp
uint64_t *times = kmalloc(sizeof(uint64_t)*8, GFP_KERNEL);
.......
// shell sort
memcpy(dst, src, sizeof(uint64_t) * TEST_LEN);
kt = ktime_get();
ksort_shell_sort(dst, TEST_LEN);
times[(int)shell_sort] = ktime_sub(ktime_get(), kt);
printk(KERN_INFO " %llu\n", kt);
for (int i = 0; i < TEST_LEN - 1; i++)
if (dst[i] > dst[i + 1]) {
pr_err("test has failed with shell_sort\n");
kfree(src);
kfree(dst);
return -ESORT;
}
......
int n_notcopied = copy_to_user(buffer, (void *) (times), sizeof(uint64_t)*8);
```
> todo: 利用 preprocessor 與 goto 簡化程式碼
排序 100 筆資料

排序 1000 筆資料

排序 1500 筆資料

嘗試排序 2000 筆資料以上會發生 page fault 錯誤, 找問題中
```shell
[31087.396151] BUG: unable to handle page fault for address: ffffffffc064b00f
[31087.396152] #PF: supervisor instruction fetch in kernel mode
[31087.396153] #PF: error_code(0x0010) - not-present page
```
:::warning
q: 除了 dmesg, 在 kernel debug 的技巧?
> 透過 UML 搭配 GDB 來除錯!
> :notes: jserv
:::
sort.h 遇到的 static analysis issue, 新增下列的 cppcheck arguments 暫時忽略。
```shell
--suppress=variableScope -i sort.h --suppress=redundantAssignment -i sort.h
```
---
## 實作 introsort 或 pdqsort,整合進 ksort