2019q1 Homework2 (kcalc)

# 2019q1 Homework2 (kcalc) contrinuted by < `lineagech` > ## Self-Checking Items * 解釋浮點運算在 Linux 核心中為何需要特別對待，以及 context switch 的過程中，涉及到 FPU/SIMD context，該注意什麼？ The ariticle mentions that floating point operation will cause trap, and it is not easy to handle in kernel mode. Also, CPU may not save whole FPU registers when context swithces. If just using floating point instruction, it may damage user process' FPU state. * 在給定的 `calc.c` 檔案中，和 [fibdrv](https://hackmd.io/s/SJ2DKLZ8E) 一樣有 character device，但註冊用的 kernel API 不同 (`register_chrdev` vs. `alloc_chrdev_region`)，能否解釋其差異和適用場合呢？ ```clike= /* * major: major device number or 0 for dynamic allocation * name: name of this range of devices * fops: file operations associated with this devices * */ int register_chrdev (unsigned int major, const char *name, const struct file_operations *fops); ``` `register_chrdev` will register a range of 256 minor numbers. From `calc.c`, it uses dynamically allocation for device major number. For `alloc_chrdev_region`: ```clike= /* * dev: output parameter for first assigned number * baseminor: first of the requested range of minor numbers * count: the number of minor numbers required * name: the name of the associated device or driver */ int alloc_chrdev_region (dev_t *dev, unsigned baseminor, unsigned count, const char *name); ``` It also uses dynamically allocation for major number. But you can assign minor base number and its number. Basically, this function can help driver to distinguish different devices sharing with the same major number. * 在 `scripts/test.sh` 檔案中，有一道命令為 `sudo chmod 0666`，這個作用為何？對於我們測試有何幫助？能否對 [fibdrv](https://hackmd.io/s/SJ2DKLZ8E) 建立的 `/dev/fibonacci` device file 也套用類似修改呢？另外，請解釋 device file 在核心及使用者層級的功能 * 在 `calc.c` 檔案中，用到 `copy_to_user` 這個 kernel API，其作用為何？本例使用該 API 做了什麼事？若是資料量增長，是否會有效能的嚴重衝擊呢？ `copy_to_user` will copy data from kernel space to user space. ```clike= static __always_inline unsigned long __must_check copy_to_user(void __user *to, const void *from, unsigned long n) { if (likely(check_copy_size(from, n, true))) n = _copy_to_user(to, from, n); return n; } ``` `_copy_to_user` will check if the pointer is in the user space range, if it is valid, then it will call `raw_copy_to_user` to move data: ```clike= static __always_inline __must_check unsigned long raw_copy_to_user(void __user *dst, const void *src, unsigned long size) { int ret = 0; if (!__builtin_constant_p(size)) return copy_user_generic((__force void *)dst, src, size); ......... ``` In the case, `dev_read` will put results from kernel space to the buffer user has provided: ```clike= snprintf(message, 64, "%d\n", result); size_of_message = strlen(message); error_count = copy_to_user(buffer, message, size_of_message); ``` * 找出至少 3 個 Linux 核心中使用定點數的案例，搭配程式碼解說 Calculating CPU load will involve fixed-point number operations (v4.18): ```clike= /* * a1 = a0 * e + a * (1 - e) */ static unsigned long calc_load(unsigned long load, unsigned long exp, unsigned long active) { unsigned long newload; newload = load * exp + active * (FIXED_1 - exp); if (active >= load) newload += FIXED_1-1; return newload / FIXED_1; } ``` ```clike= #define FSHIFT 11 /* nr of bits of precision */ #define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */ #define LOAD_FREQ (5*HZ+1) /* 5 sec intervals */ #define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-point */ #define EXP_5 2014 /* 1/exp(5sec/5min) */ #define EXP_15 2037 /* 1/exp(5sec/15min) */ ``` we can know linux kernel sets fractional precision as 11 bits. And it will calculate based on integer, so that it will multiply $2^{11}$ on bot sides of equations and it will get result$*2^{11}$, and finally shift left 11 bits to get real result. * 是否知曉 MathEx 如何解析給定字串，從而分離出變數和對應數學表示法呢？ After tracing code, I found MathEx will separte each token of string and do corresponding operations. It also uses `os` as operation vector and `es` as expression vector to store current operations and expressions. And when one expression is complete, like encountering `,`, it will call `expr_bind` to pop `es` twice to retrieve and combine to a new expression with one operator retrieving from `os`. * 如何對 `MathEx` 進行效能分析？能否找出相似的 math expression 來分析比較呢？ * 提示: 參照 [muparserSSE - A Math Expression Compiler](http://beltoforion.de/article.php?a=muparsersse) * 在 `MathEx` 原始程式碼的 `expression.[ch]` 裡頭 `vec` 相關的程式碼，主要做什麼事？有沒有發現類似 [list](https://hackmd.io/s/S12jCWKHN) 使用到的技巧呢？ `expression` just defines `vec` as generic type and user can just set the type they want: ```clike= #define vec(T) \ struct { \ T *buf; \ int len; \ int cap; \ } ``` And also it provides several macro of operations, like push and pop etc., without need to care the type of buffer inside `vec`. `vec_push` will call `vec_expand` to expand vector size when `len` exceeds the limit and it will grow double of original size, that's will amortize time complexity as $O(1)$. ```clike= static inline int vec_expand(char **buf, int *length, int *cap, int memsz) { if (*length + 1 > *cap) { void *ptr; int n = (*cap == 0) ? 1 : *cap << 1; ptr = realloc(*buf, n * memsz); if (ptr == NULL) { return -1; /* allocation failed */ } *buf = (char *) ptr; *cap = n; } return 0; } ``` I found `vec_foreach` is kind of similar as `list_for_each_entry`: ```clike= #define vec_foreach(v, var, iter) \ if ((v)->len > 0) \ for ((iter) = 0; (iter) < (v)->len && (((var) = (v)->buf[(iter)]), 1); \ ++(iter)) ``` ```clike= #define list_for_each_entry(entry, head, member) \ for (entry = list_entry((head)->next, __typeof__(*entry), member); \ &entry->member != (head); \ entry = list_entry(entry->member.next, __typeof__(*entry), member)) ``` The macros just hide complex code in a macro, and user can intiutively use them. * 解釋 `MathEx` 一類的 math expression 在真實世界有哪些應用？甚至，是否在 Linux 核心就存在類似的程式碼？ * 提示: 參照 [A thorough introduction to eBPF](https://lwn.net/Articles/740157/) * 如果要將使用者層級的 C 語言程式，搬到 Linux 核心作為核心模組 (LKM)，該注意哪些議題呢？請舉例說明 `__KERNEL__` is only defined in kernel space, if user program just includes kernel stuffs, `__KERNEL` can just avoid including those stuffs. Another is that it cannot include standard C library in kernel module. For some functions, you can just use linux header, like `linux/string.h`. Or just need to implement our own version rather than standard C library one. For memory allocation, just can use `kmalloc` to allocate memory smaller than page size. flags you can assign flags as `GFP_USER`, Allocate memory on behalf of user, or `GFP_KERNEL`, Allocate normal kernel ram. ```clike= void * kmalloc ( size_t size, gfp_t flags ); ``` or use `vmalloc` to allocate continuous virtual memory. Refering from [kmalloc v.s. vmalloc](https://blog.csdn.net/macrossdzh/article/details/5958368), briefly saying, vmalloc ensures it will be virtually continuous, but kmalloc ensures it will be physically continuous. Depends on application, you can decide which to use. Finally, if using floating point instructions, you will get error when compiling kernel modules.: ```sh error: SSE register return with SSE disabled ``` * [fixed point arithmetic for RT-Linux](http://netwinder.osuosl.org/pub/netwinder/docs/nw/rt_fixed/doc/html/rt_fix.html) 的運作原理為何？給定的程式碼已經存在超過 20 年，很多細節已有出入，可否嘗試移植到 Linux `v4.15+` 呢？ This implementation adopts 32-bit integer containing integer and fractional part. It also contains a variable recording integer length, which means that it can vary its integer and fractional precision. For those operations, like add, sub etc., it will calculate its final integer part length to make sure that it will not overflow, like add operation, we all know the sum's bit length will increase 1 at most: ```clike= /*rt_fix rt_fix_add(rt_fix x, rt_fix y) */ unsigned short iwl = (x._iwl > y._iwl) ? x._iwl + 1 : y._iwl + 1; ...... if(x._iwl > y._iwl) { x._value = (x._value >> 1) - (y._value >> (x._iwl - y._iwl + 1)); } else if(x._iwl < y._iwl) { x._value = (x._value >> (y._iwl - x._iwl + 1)) - (y._value >> 1); } else { x._value = (x._value >> 1) - (y._value >> 1); } ``` Or like multiply, integer part length of result will have at most the sum of two parameters' integer part length: ```clike= /*rt_fix rt_fix_mul(rt_fix x, rt_fix y) */ unsigned short iwl = x._iwl + y._iwl; ...... if(y._iwl < iwl) { y._value >>= x._iwl; } if(x._iwl < iwl) { x._value >>= y._iwl; } ``` And it will move their integer part to the same precision, which means that numbers will have same integer and fractional precision and can calculate on the same base. At `rt_fix_mul` or `rt_fix_div`, they finally use `__asm__` to perform the operation, like: ```clike= __asm__("imull %%ebx\n\t" /* Multiply a * b */ "shrdl %%cl, %%edx, %%eax\n\t" : "=a" (x._value) : "a" (x._value), "b" (y._value), "c" (RT_FIXWL - x._iwl) : "edx"); ``` It firstly uses `imull` instruction to multiply `%%eax`, namely `x`, with `%%ebx`, that is `y`. And then uses instruction `shrdl` to perform right shift `%%cl` bits, which is the number of bit of frac, and the put the right-shift part to `%%edx`. So the remaing is left as`%%eax` with integer `iwl` bits and fraction `RT_FIXWL-iwl` bits. `rt_fix_div` is similar. I refer this implementation for MathEx: [fix_p_num](https://github.com/lineagech/kcalc/blob/master/mathex/fix_p_num.c) ## Fixed-Point Number Operation I define and implement fix point number operation in file `mathex/fix_p_num.h` and `mathex/fix_p_num.c` based on the implementation of RT Linux. And modify corresponding operation, like add, sub, mul, and div etc. in file `mathex/expression.c` with `#if FIXPOP`. ```clike= typedef struct fix_p_num{ int value; unsigned short int_len; } fix_p_num; ``` Just run with `mathex/test-fix.c`, the fixed point number operations should work. but I am still struggling with transforming string number to fixed point number in `expr_parse_number`. I still cannot move to kernel module because of `error: SSE register return with SSE disabled` caused by this function.