[2019q3](http://wiki.csie.ncku.edu.tw/sysprog/schedule) 第 8 週測驗題

# [2019q3](http://wiki.csie.ncku.edu.tw/sysprog/schedule) 第 8 週測驗題 :::info 目的: 檢驗學員對 CS:APP 第 5 章的認知 ::: --- ### 測驗 `1` [bc](https://linux.die.net/man/1/bc) 工具很好用，可讓我們快速計算特定數值表示，如: ```shell $ echo "2^64" | bc -l ``` 預期輸出為: ```shell 18446744073709551616 ``` 這個十進位的輸出需要幾個字元來展現呢？ ```shell $ echo "2^64" | bc -l | wc -c ``` 其中 [wc](https://linux.die.net/man/1/wc) 可計算字元數量，於是我們知道 2^64^ 用十進位表示，需要 21 個位數。倘若我們要計算 48 個位數的十進位運算，顯然不能用 long 或 long long (在 [LP64 data model](https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models) 中，這兩者都是 64 位元的寬度) 型態來處理，這時候我們需要 [大數運算](https://openhome.cc/Gossip/AlgorithmGossip/BigNumber.htm) 的實作。考慮到 64 位元的有效數值上限是 `0xFFFFFFFFFFFFFFF`，而十進位 $99999999×99999999$ 的結果在這有效範圍以內，被乘數 (前者) 和乘數 (後者) 都是 8 個位數 (有利於 x86-64 架構)。於是下方程式碼嘗試依照每 8 個十進位數字，進行分割運算: (`big-mul`): ```cpp #include <stdio.h> #include <stdlib.h> #include <string.h> static inline void add_with_carry(char *input, char num, int idx) { char input_num = input[idx] + num; if (input_num <= 9) { input[idx] = input_num; return; } char carry = input_num / 10; input[idx] = input_num % 10; add_with_carry(input, carry, idx - 1); } #define max(a, b) \ ({ \ __typeof__(a) _a = (a); \ __typeof__(b) _b = (b); \ _a > _b ? _a : _b; \ }) static int add(const char *a1, const char *a2, char *ret, int *ppos) { int len1 = strlen(a1), len2 = strlen(a2); int len = max(len1, len2); int i; for (i = len - 1; i >= 0; i--) { unsigned char tmp; int len_diff = len1 - len2; if (len_diff >= 0) { tmp = ARR1[i] - '0'; if (i > len_diff - 1) tmp += ARR2[i - len_diff] - '0'; } else { tmp = ARR3[i] - '0'; if (i > -len_diff - 1) tmp += ARR4[i + len_diff] - '0'; } add_with_carry(ret, tmp, i + 1); } *ppos = 1; if (ret[BBB]) { len++; *ppos = 0; } for (i = 0; i < len + 1; i++) ret[i] += '0'; return len; } char *mul(const char *m1, const char *m2) { int len = strlen(m1); int ret_len = len + strlen(m2); int n = 0, pos = 0, nblocks = len / 8; char str1[10], str2[10], resstr[20]; char *tmp_ret = calloc(ret_len, 1); char *pow1 = calloc(len * 2, 1), *pow2 = calloc(len * 2, 1); char *ret = calloc(ret_len, 1); /* split into several blocks, 8 digits per block */ for (int i = 0; i < nblocks; i++) { memcpy(str1, m1 + len - i * 8 - 8, 8); unsigned long im1 = atoi(str1); for (int j = 0; j < nblocks; j++) { memcpy(str2, m2 + len - j * 8 - 8, 8); unsigned long im2 = atoi(str2); unsigned long tmp_res = im1 * im2; /* multiplying pow(10,n) equals to preserving trailing '0' */ int pow1size = i * 8, pow2size = j * 8; int reslen = sprintf(resstr, "%lu", tmp_res); int totlen = reslen + pow2size; memset(pow2, '0', totlen); memcpy(pow2, resstr, reslen); reslen = totlen; totlen += pow1size; memset(pow1, '0', totlen); pow1[totlen] = '\0'; memcpy(pow1, pow2, reslen); memset(ret, 0, n + pos); n = add(pow1, tmp_ret, ret, &pos); memcpy(tmp_ret, ret + pos, n); } } memset(ret, 0, n + pos); memcpy(ret, tmp_ret, n); free(tmp_ret); free(pow1); free(pow2); return ret; } int main(int argc, char **argv) { if (argc < 3) return -1; char *m1 = argv[1], *m2 = argv[2]; char *ret = mul(m1, m2); printf("%s\n", ret); free(ret); return 0; } ``` 參考執行方式和輸出: ```shell ./mul 119334567890334449388883313579158334567098134455 \ 667908995633221198765432134678040000123411113456 ``` 預期可得: ``` 79704631383957730438879843848804741889926116047138197998269353980447530720116354515912427726480 ``` 其中被乘數和乘數都用 48 個字元 (再加 1 個 null terminator) 的字串來表示 (可執行 `$ echo -n "119334567890334449388883313579158334567098134455" | wc -c`)，最終乘法結果用 95 個字元 (再加 1 個 null terminator) 的字串來表示。請補完程式碼。 ==作答區== ARR1 = ? * `(a)` a1 * `(b)` a2 * `(c)` ret ARR2 = ? * `(a)` a1 * `(b)` a2 * `(c)` ret ARR3 = ? * `(a)` a1 * `(b)` a2 * `(c)` ret ARR4 = ? * `(a)` a1 * `(b)` a2 * `(c)` ret BBB = ? * `(a)` len * `(b)` `len - 1` * `(c)` i * `(d)` 1 * `(e)` 0 :::success 延伸問題: 1. 解釋上述程式碼運作原理，並指出影響運算正確的缺陷; 2. 修正上述程式碼的缺陷，嘗試依據 CS:APP 第 5 章的手法予以分析執行時期的成本，從而改善效能; 3. 研究 [Karatsuba algorithm](https://en.wikipedia.org/wiki/Karatsuba_algorithm) 快速相乘演算法，嘗試和本實作比較，被乘數和乘數至少該是十進位 48 個位數 - 解說影片: [Fast Multiplication: From Grade-School Multiplication To Karatsuba's Algorithm](https://youtu.be/-dfsxsiGoC8) - 善用 [GMP: GNU Multiple Precision Arithmetic Library](https://gmplib.org/) 4. 研讀 [隱藏的傅立葉分析](https://hackmd.io/@sysprog/S1jR0mHfN) 提到透過傅立葉轉換來進行乘法運算，嘗試引入到大數運算並比較正確性及效能表現; - 參照 [FFT based multiplication of large numbers](http://numbers.computation.free.fr/Constants/Algorithms/fft.html) 及 [Multiplying Huge Integers using Fourier Transforms](http://www.cs.rug.nl/~ando/pdfs/Ando_Emerencia_multiplying_huge_integers_using_fourier_transforms_paper.pdf) ![](https://i.imgur.com/qgWxPjF.png) > A plot showing the average time spent per multiplication operation as a function of the sum of the lengths of the integers to be multiplied. Both axes are dawn on a logarithmic scale (base 10) :::