contributed by < IepIweidieng
>
我切 E。
切出 32 GiB。
404 - Not Found
載點 dead 了,換個。
我選元智。
404 - Not Found
每個載點點點,發現全都 dead。
到 ftp 頁面裡面看了看,發現沒有 18.04.3 這個版本,不過有 18.04.4。
http://ftp.ubuntu-tw.org/mirror/ubuntu-releases/
Name Last Modified Size Type
[...]
18.04/ 2020-Feb-12 22:32:24 - Directory
18.04.4/ 2020-Feb-12 22:32:24 - Directory
觀察 Last Modified
這個欄位,可以發現是 2 月初才剛更新。
可能是因為這個原因,剛才的下載點列表還沒更新,所以裡面的連結才會失效。
http://ftp.ubuntu-tw.org/mirror/ubuntu-releases/18.04/
Name Last Modified Size Type
Parent Directory/ - Directory
[...]
SHA1SUMS 2020-Feb-12 22:32:05 0.1K application/octet-stream
SHA1SUMS.gpg 2020-Feb-12 22:32:41 0.9K application/octet-stream
SHA256SUMS 2020-Feb-12 22:32:24 0.2K application/octet-stream
SHA256SUMS.gpg 2020-Feb-12 22:32:44 0.9K application/octet-stream
ubuntu-18.04.4-desktop-amd64.iso 2020-Feb-04 02:40:24 1.9G application/x-iso9660-image
ubuntu-18.04.4-desktop-amd64.iso.torrent 2020-Feb-12 21:41:56 79.5K application/x-bittorrent
ubuntu-18.04.4-desktop-amd64.iso.zsync 2020-Feb-12 21:41:52 3.9M application/octet-stream
ubuntu-18.04.4-desktop-amd64.list 2020-Feb-04 02:40:25 7.8K application/octet-stream
[後略...]
因為離下載完成還要很久,先用 WSL (WSL 1) 開始本次作業。
安裝 WSL 的說明:https://docs.microsoft.com/zh-tw/windows/wsl/install-win10
我先前就已準備好 WSL 環境,所以將其直接開啟。
看看它的版本。
$ uname -a
Linux IID-PC-acer-N17Q3 4.4.0-18362-Microsoft #476-Microsoft Fri Nov 01 16:53:00 PST 2019 x86_64 x86_64 x86_64 GNU/Linux
查看我電腦的規格。
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
Vendor ID: AuthenticAMD
CPU family: 21
Model: 112
Model name: AMD A9-9420 RADEON R5, 5 COMPUTE CORES 2C+3G
Stepping: 0
CPU MHz: 3000.000
CPU max MHz: 3000.0000
BogoMIPS: 6000.00
Virtualization: AMD-V
Hypervisor vendor: Windows Subsystem for Linux
Virtualization type: container
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave osxsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm perfctr_core perfctr_nb bpext ptsc mwaitx fsgsbase bmi1 avx2 smep bmi2
$ sudo aptitude install build-essential git-core valgrind cppcheck clang-format aspell lordiff valgrind
$ cppcheck --version
Cppcheck 1.82
太舊,不要了。
$ sudo aptitude remove cppcheck
$ sudo snap install cppcheck
Interacting with snapd is not yet supported on Windows Subsystem for Linux.
This command has been left available for documentation purposes only.
看來需要自行編譯 cppcheck
。
cppcheck
- 2020-02-27 18:03$ cd ~
$ git clone git://github.com/danmar/cppcheck.git
$ cd cppcheck/
$ ls
AUTHORS cmake gui requirements.txt
CMakeLists.txt console_common.pri htmlreport rules
COPYING cppcheck-errors.rng lib runastyle
Cppcheck.xcodeproj cppcheck.cbp man runastyle.bat
Makefile cppcheck.cppcheck naming.json samples
addons cppcheck.sln oss-fuzz snap
appveyor.yml createrelease philosophy.md test
benchmarks.txt cve-test-suite platforms tools
build-pcre.txt democlient pylintrc_travis webreport.sh
build.bat doxyfile readme.md win_installer
cfg externals readme.txt
cli generate_coverage_report readmeja.md
$ vim readme.md
看了說明後,我決定使用 cmake
編譯。
$ mkdir build/
$ cd build/
$ cmake .. -DHAVE_RULES=ON -DUSE_MATCHCOMPILER=ON
$ cmake --build .
編好了可以安裝了。
$ sudo checkinstall
**********************************************************************
Done. The new package has been installed and saved to
/home/iid/cppcheck/build/build_20200227-1_amd64.deb
You can remove it from your system anytime using:
dpkg -r build
**********************************************************************
終於裝好了,那我們可以開始了。
queue
開發過程O(1)
的 q_insert_tail()
& q_size()
如果我們只知道 list 的開頭,那麼 q_insert_tail()
需要找出 list 的 tail 才能完成;q_size()
則需要從頭 travel 到 tail 才行。
但如果把 traveling 完的結果與位置記起來,只要這個結果本身不改變,這個位置不移動,就不必再 traveling 到這裡。
我們可以試著把 traveling 的結果 (對應 tail) 與位置 (對應 size) 記起來。
我們可以在一開始 travel 整個 list,記下結果。
不過,通常整個 list 在一開始都會是空的,所以沒有 tail,而 size 為 0
。這時如果去 travel,一開始就結束了。
然而,我們想要處理的是會改變的 list。
為了讓記起來的結果保持正確,需要在結果改變時,更新我們之前記住的東西。
可能需要更新 tail 的場合:
q_free()
q_remove_head()
,且 head 為 tail
q_new()
(視 list root 為 tail)
q_insert_tail()
q_insert_head()
,且 list 為空
q_reverse()
q_sort()
可能需要更新 size 的場合:
q_insert_tail()
q_insert_head()
q_free()
q_remove_head()
當 list 被 free,它的存在便不成立,它的頭尾、它的長度,都失去意義,不再需要我們更新從前的記憶。
Tail 消失了
需要進行 traveling 來更新 tail,通常這會造成 O(n)
的時間複雜度。
但是只有 q_remove_head()
在 head 為 tail 時需要 traveling 來更新 tail,而 head 為 tail 時,list 長度為 1
;需要 travel 的 node 數量固定了,因此時間複雜度降為 O(1)
。
而 size 不需要位置資訊,只要在 insert 和 remove 時跟著更新就行,size 取得操作的時間複雜度可為 O(1)
首先先把 q_new()
及 q_free()
寫好。
q_new()
Declaration:
queue_t *q_new();
這是不正確的。這是 C 不是 C++。
不像 C++,C 函數宣告的 parameter list 留空的話,會允許 pass 任意 type 與數量的 arguments 進去。
所以在 C 下,我這樣呼叫也是可以的:
q_new("" L" 4" " 2" L" ", &06[L"L"],
&(struct Struct {
union Union {
int long unsigned long (((i)));
char (((byte))[sizeof(const long unsigned int long volatile)]);
} *noinu;
} const){&(union Union){.i = (9ULL),},});
這種函數宣告在 C 中叫做 K&R-style function declaration,是在 C 的 function prototype 出現前的函數宣告語法。
C++ 沒有 K&R-style function declaration,而是將空的 parameter list 視為沒有參數,改用 int func(...)
表示接受任意 type 與數量的 arguments 的函數。
比較正確的作法是宣告成:
queue_t *q_new(void);
這是 C 與 C++ 都接受的語法。
繼續研究後發現,C2x 不再接受 K&R-style function declaration。所以函數定義中的 int func()
與 int func(void)
會是相等的,都不接受任何參數,變得與 C++ 相似。
但在函數宣告中,int func()
仍然不限制參數的 type 與數量,所以在 q_new()
的定義之前,或在沒有定義 q_new()
的檔案中,還是可以使用剛才的方法呼叫 q_new()
。
There are other issues within the field of function syntax, namely — declarations (that are not definitions) may omit parameter information completely, and […]
The first, the property to provide declarations without prototypes, has also been marked obsolescent in the C standard, but still is relatively widely used. […] Therefore, we will not attempt to remove this specific feature here, this would need much more preparation and
discussion.
– Remove support for function definitions with identifier lists proposal for C2x
相關資料:
1.1. Integrated changes:
[…]
N2432. Remove support for function definitions with identifier lists
Definition:
/*
* Create empty queue.
* Return NULL if could not allocate space.
*/
queue_t *q_new()
{
queue_t *q = malloc(sizeof(queue_t));
if (q)
q->head = NULL;
return q;
}
q_free()
Declaration:
void q_free(queue_t *q);
Definition:
/* Free all storage used by queue */
void q_free(queue_t *q)
{
if (!q)
return;
/* Free queue elements */
for (list_ele_t *k = q->head; k;) {
list_ele_t *const next = k->next;
free(k->value);
free(k);
k = next;
}
/* Free queue structure */
free(q);
}
不應該用 q_remove_head()
,因為會不必要地設定 q->head
以及檢查要不要 copy k->value
。
接下來要考慮 insert_tail()
與 size()
的 O(1)
時間實作。
延續之前的討論,可以嘗試把 tail 和 size 記在 queue_t
中。
diff --git a/queue.h b/queue.h
index 3b48875..c67dd87 100644
--- a/queue.h
+++ b/queue.h
@@ -25,10 +25,8 @@ typedef struct ELE {
/* Queue structure */
typedef struct {
list_ele_t *head; /* Linked list of elements */
- /* TODO: You will need to add more fields to this structure
- * to efficiently implement q_size and q_insert_tail.
- */
- /* TODO: Remove the above comment when you are about to implement. */
+ list_ele_t *tail; /* The tail element of the list */
+ size_t size; /* The size of the list */
} queue_t;
/* Operations on queue */
q_new()
- revised不過,通常整個 list 在一開始都會是空的,所以沒有 tail,而 size 為
0
。這時如果去 travel,一開始就結束了。
將 list root 設為 tail
要這麼做的話,tail
的 type 得是 list_ele_t **
才能指向 list root (queue_t::head
)。
但我將 tail
的 type 設為了 list_ele_t *
,所以不可行,因此將其直接設為 NULL
。
diff --git a/queue.c b/queue.c
index 48fef4e..f18cf9c 100644
--- a/queue.c
+++ b/queue.c
@@ -12,8 +12,11 @@
queue_t *q_new()
{
queue_t *q = malloc(sizeof(queue_t));
- if (q)
+ if (q) {
q->head = NULL;
+ q->tail = NULL;
+ q->size = 0;
+ }
return q;
}
q_size()
Declaration:
int q_size(queue_t *q);
Definition:
/*
* Return number of elements in queue.
* Return 0 if q is NULL or empty
*/
int q_size(queue_t *q)
{
return (q) ? q->size : 0;
}
我習慣把 ?:
的條件用 ()
括起來。
如果 q
是空的話,q->size
會是 0
,回傳 q->size
即可,不須額外處理。
q_insert_head()
& q_insert_tail()
Declarations:
bool q_insert_head(queue_t *q, char *s);
bool q_insert_tail(queue_t *q, char *s);
Parameter s
被宣告為 char *
,這是不好的。
因為這兩個函數都不會修改 s
指向的字串的內容(註解有提到要用複製的),所以應該宣告成 const char *
。
const char *
與 char *
在 C 與 C++ 中都可以 implicit convert 為 const char *
型別,所以如果不需要修改 pass 進來的字串內容的話,完全可以宣告為 const char *
。這可以提示其他看到這個函數宣告的寫程式者說:「你可以相信我不會改你給我的字串!」
另外,string literal 在 C 中的型別是 char [N]
,但在 C++ 中卻是 const char [N]
,所以如果有人使用:
q_insert_head(q, "aaq");
呼叫這個函數的話,這人用 C++ 編譯器編譯時,如果編譯器沒有相關的 extension,就會編譯失敗。
所以應該改成:
bool q_insert_head(queue_t *q, const char *s);
bool q_insert_tail(queue_t *q, const char *s);
Definitions:
我自己定義了函數 ele_alloc()
來新增 node。
注意我用 const char *s
。
/*
* Return `NULL` if could not allocate space.
* Return non-`NULL` if successful.
* Argument `s` points to the string to be stored and should not be `NULL`.
* Note: `newh->next` will not be initialized.
*/
static list_ele_t *ele_alloc(const char *s)
{
/* Don't forget to allocate space for the string and copy it */
list_ele_t *const newh = malloc(sizeof(list_ele_t));
if (newh) {
const size_t len = strlen(s) + 1;
newh->value = malloc(len);
if (newh->value) {
memcpy(newh->value, s, len);
} else {
/* What if either call to malloc returns NULL? */
free(newh);
return NULL;
}
}
return newh;
}
用了 strlen()
後就不需要 strcpy()
或 strncpy()
了,應該使用 memcpy()
。因為 strcpy()
以及 strncpy()
會去算字串長度,可是已經用 strlen()
算過了,會浪費時間。
由於 C Programming Lab 有提到:
For functions that provide strings as arguments, you must create and store a copy of the string by calling
malloc
to allocate space (remember to include space for the terminating character) and then copying from
the source to the newly allocated space.
所以我使用 malloc()
+ memcpy()
而非 strdup()
來複製字串。
另外,strdup()
並非 C 與 C++ 的標準函數,而是由 POSIX 標準規範的函數。
strdup()
與 strndup()
是 C2x 的標準函數了。
相關資料:
1.1. Integrated changes:
[…]
N2353. the strdup and strndup functions
q_insert_head()
要注意 tail 可能會 appear。
/*
* Attempt to insert element at head of queue.
* Return true if successful.
* Return false if q is NULL or could not allocate space.
* Argument s points to the string to be stored.
* The function must explicitly allocate space and copy the string into it.
*/
bool q_insert_head(queue_t *q, char *s)
{
list_ele_t *newh;
if (!q)
return false;
newh = ele_alloc(s);
if (newh) {
newh->next = q->head;
q->head = newh;
if (!q->tail) // The tail will appear
q->tail = newh;
++q->size;
return true;
}
return false;
}
q_insert_tail()
則要注意 tail 會被 appended。
另外 q
為空時,head 會 appears。
注意沒有 head 時也不會有 tail,所以這裡用 else if (q->tail)
而不是單純的 if (q->tail)
以免去不必要的判斷。
/*
* Attempt to insert element at tail of queue.
* Return true if successful.
* Return false if q is NULL or could not allocate space.
* Argument s points to the string to be stored.
* The function must explicitly allocate space and copy the string into it.
*/
bool q_insert_tail(queue_t *q, char *s)
{
list_ele_t *newh;
if (!q)
return false;
newh = ele_alloc(s);
if (newh) {
newh->next = NULL;
if (!q->head) // The head will appear
q->head = newh;
else if (q->tail) // The original tail will be appended
q->tail->next = newh;
q->tail = newh;
++q->size;
return true;
}
return false;
}
q_remove_head()
Declaration:
bool q_remove_head(queue_t *q, char *sp, size_t bufsize);
要注意如果 head 剛好是 tail 時,tail 會消失。
Definition:
/*
* Attempt to remove element from head of queue.
* Return true if successful.
* Return false if queue is NULL or empty.
* If sp is non-NULL and an element is removed, copy the removed string to *sp
* (up to a maximum of bufsize-1 characters, plus a null terminator.)
* The space used by the list element and the string should be freed.
*/
bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
{
list_ele_t *node;
if (!q || !q->head)
return false;
node = q->head;
if (sp) {
const size_t len = strnlen(node->value, bufsize);
memcpy(sp, node->value, len);
if (len == bufsize && len > 0)
sp[len - 1] = '\0';
}
free(node->value);
q->head = q->head->next;
if (node == q->tail) // The tail will disappear
q->tail = NULL;
free(node);
--q->size;
return true;
}
strnlen()
是有步數限制的 strlen()
。
呼叫 strnlen(str, len)
時,如果找到 str[len - 1]
時都沒發現 '\0'
結尾,strnlen()
就會回傳 len
。這可以防止 buffer overrun。
為了確保 sp
有 '\0'
結尾,我在 strnlen()
傳回最大值時,就將 sp
強制加上 '\0'
結尾。
查看了 man page 發現,strncpy()
在複製後會用 '\0'
填滿剩下的空間,所以理論上可以直接 sp[bufsize - 1] = '\0'
而不改變結果。
而 C11 的 strncpy_s()
在複製後只會包含 1 個 '\0'
,而不會用 '\0'
填滿剩下的空間,相較下比較有效率。
q_reverse()
Declaration:
void q_reverse(queue_t *q);
Definition:
/*
* Reverse elements in queue
* No effect if q is NULL or empty
* This function should not allocate or free any list elements
* (e.g., by calling q_insert_head, q_insert_tail, or q_remove_head).
* It should rearrange the existing ones.
*/
void q_reverse(queue_t *q)
{
list_ele_t *prev;
if (!q || !q->head)
return;
/* Re-assign `k->next` */
prev = NULL;
for (list_ele_t *k = q->head; k;) {
list_ele_t *const next = k->next;
k->next = prev;
prev = k;
k = next;
}
/* `prev` is now the original tail */
/* Swap `head` and `tail` */
q->tail = q->head;
q->head = prev;
}
q_sort()
算法構想我使用了 recursion 版本的 merge sort 來實作 q_sort()
。
構想是:與 q1 測驗題中的 insertion sort 相似,但是讓 right
先走到 list 的中間,再去左右分別遞迴下去,這樣就是 merge sort。
在撰寫 q_sort()
的途中,發現了幾個可能值得探討之處:
left
和 right
被拿去遞迴時應當都不會是 NULL
。
right
走到一半前會變 NULL
。left
遞迴下去就足夠了,之後也不必合併 left
與 right
的 lists,所以可以直接回傳 left
遞迴下去的結果。left
與 right
的列表時,它們都不會是 NULL
,所以可以在迴圈前就設定好 merge list 的開頭,不用在迴圈裡一直檢查 !merge
。Declaration:
void q_sort(queue_t *q);
Definitions:
我寫了另外的函數用以遞迴。
注意不要相信傳進來的 len
,要讓函數在 len
給錯時還能正確完成操作。
/*
* Sort `len` elements start with `head` in ascending order using merge sort
* No effect if `head` is `NULL` or its member `next` is `NULL`.
* The function reduces to insertion sort when `len` < 4
*/
static list_ele_t *ele_sort(list_ele_t *head, size_t len)
{
list_ele_t *left;
list_ele_t *right;
list_ele_t *merge;
if (!head || !head->next)
return head;
left = right = head;
for (size_t k = 1, n = len / 2; k < n && right; ++k)
right = right->next;
if (right) {
list_ele_t *const next = right->next;
right->next = NULL;
right = next;
}
left = ele_sort(left, len / 2);
if (!right) // `right` is `NULL` => `left` is the head of the sorted list
return left;
right = ele_sort(right, (len + 1) / 2);
/* Now `left` and `right` are both the head of a non-`NULL` sorted list */
{
/* Set up the beginning of the merged list */
list_ele_t **const phead =
(strcmp(left->value, right->value) < 0) ? &left : &right;
head = *phead;
merge = *phead = (*phead)->next;
}
while (left || right) {
if (!right || (left && strcmp(left->value, right->value) < 0)) {
/* `left` should be linked from `merge` */
merge->next = left;
} else {
/* `right` should be linked from `merge` */
merge->next = right;
}
merge = merge->next;
}
return head;
}
/*
* Sort elements of queue in ascending order
* No effect if q is NULL or empty. In addition, if q has only one
* element, do nothing.
*/
void q_sort(queue_t *q)
{
if (!q || !q->head)
return;
ele_sort(q->head, q->size);
}
$ make test
scripts/driver.py -c
--- Trace Points
+++ TESTING trace trace-01-ops:
# Test of insert_head and remove_head
ERROR: Removed value dolphinXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value dolphin
ERROR: Removed value bearXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value bear
ERROR: Removed value gerbilXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value gerbil
--- trace-01-ops 0/6
+++ TESTING trace trace-02-ops:
# Test of insert_head, insert_tail, and remove_head
ERROR: Removed value dolphinXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value dolphin
ERROR: Removed value bearXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value bear
ERROR: Removed value gerbilXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value gerbil
ERROR: Removed value meerkatXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value meerkat
ERROR: Removed value bearXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value bear
Error limit exceeded. Stopping command execution
--- trace-02-ops 0/6
+++ TESTING trace trace-03-ops:
# Test of insert_head, insert_tail, reverse, and remove_head
ERROR: Removed value squirrelXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value squirrel
ERROR: Removed value gerbilXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value gerbil
ERROR: Removed value bearXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value bear
ERROR: Removed value meerkatXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value meerkat
ERROR: Removed value gerbilXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value gerbil
Error limit exceeded. Stopping command execution
--- trace-03-ops 0/6
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
--- trace-04-ops 0/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
ERROR: Removed value dolphinXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX[後略...] != expected value dolphin
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
--- trace-05-ops 0/5
+++ TESTING trace trace-06-string:
# Test of truncated strings
ERROR: Removed value aardvark_bear_dolphin_gerbil_jaguarXXX[後略...] != expected value aardvark_bear_dolphin_gerbil_jaguar
ERROR: Removed value meerkat_panda_squirrel_vulture_wolfXXX[後略...] != expected value meerkat_panda_squirrel_vulture_wolf
ERROR: Removed value aardvark_bear_dolphin_gerbil_jaguarXXX[後略...] != expected value aardvark_bear_dolphin_gerbil_jaguar
ERROR: Removed value meerkat_panda_squirrel_vulture_wolfXXX[後略...] != expected value meerkat_panda_squirrel_vulture_wolf
--- trace-06-string 0/6
+++ TESTING trace trace-07-robust:
# Test operations on NULL queue
--- trace-07-robust 6/6
+++ TESTING trace trace-08-robust:
# Test operations on empty queue
--- trace-08-robust 6/6
+++ TESTING trace trace-09-robust:
# Test remove_head with NULL argument
--- trace-09-robust 6/6
+++ TESTING trace trace-10-malloc:
# Test of malloc failure on new
--- trace-10-malloc 6/6
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on insert_head
--- trace-11-malloc 6/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on insert_tail
--- trace-12-malloc 6/6
+++ TESTING trace trace-13-perf:
# Test performance of insert_tail
--- trace-13-perf 6/6
+++ TESTING trace trace-14-perf:
# Test performance of size
--- trace-14-perf 6/6
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
--- trace-15-perf 0/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
--- trace-16-perf 0/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
--- trace-17-complexity 5/5
--- TOTAL 53/100
q_remove()
和 q_sort()
不正確。
q_remove()
複製字串時少複製了 '\0'
。
diff --git a/queue.c b/queue.c
index 2226509..35a38c1 100644
--- a/queue.c
+++ b/queue.c
@@ -128,11 +128,11 @@ bool q_remove_head(queue_t *q, char *sp, size_t bufsize)
return false;
node = q->head;
- if (sp) {
- const size_t len = strnlen(node->value, bufsize);
- memcpy(sp, node->value, len);
- if (len == bufsize && len > 0)
- sp[len - 1] = '\0';
+ if (sp && bufsize) {
+ const size_t len = strnlen(node->value, bufsize - 1);
+ memcpy(sp, node->value, len + 1);
+ if (len + 1 == bufsize)
+ sp[len] = '\0';
}
free(node->value);
注意如果 bufsize
為 0
時沒有東西可 copy。
q_sort()
則有幾個問題:
right
遞迴下去時,手動給的 len
可能會 overflow 變成 0
(size_t
是 unsigned type)。next
上,而這個 head
才剛設定好,它的 next
還是合併前的 next
,如果其 list 長度是 1 的話就是 NULL
。next
指向自己,造成無限迴圈。@@ -209,24 +209,26 @@ static list_ele_t *ele_sort(list_ele_t *head, size_t len)
left = ele_sort(left, len / 2);
if (!right) // `right` is `NULL` => `left` is the head of the sorted list
return left;
- right = ele_sort(right, (len + 1) / 2);
+ right = ele_sort(right, len / 2 + len % 2);
/* Now `left` and `right` are both the head of a non-`NULL` sorted list */
{
/* Set up the beginning of the merged list */
list_ele_t **const phead =
(strcmp(left->value, right->value) < 0) ? &left : &right;
- head = *phead;
- merge = *phead = (*phead)->next;
+ merge = head = *phead;
+ *phead = (*phead)->next;
}
while (left || right) {
if (!right || (left && strcmp(left->value, right->value) < 0)) {
/* `left` should be linked from `merge` */
merge->next = left;
+ left = left->next;
} else {
/* `right` should be linked from `merge` */
merge->next = right;
+ right = right->next;
}
merge = merge->next;
}
但解決了這兩個問題後,q_sort()
還是有錯誤,原因不確定,用 valgrind
找找看。
$ sudo aptitude install libc6-dbg
libc6-dbg is already installed at the requested version (2.27-3ubuntu1)
$ valgrind -q --leak-check=full ./qtest
==181== error calling PR_SET_PTRACER, vgdb might block
cmd> new
cmd> new
q = []
cmd> it RAND 16
cmd> it RAND 16
q = [tblipgjq hhutscu yblakru xjsjz wlwagfii ofwirvrl umjrxhrfm kbbwx zwhwos lhkpxzb ykamabi hrhitzevx gdkhb ahpblakjy hvpbvtyul zkfgmhpbi]
cmd> sort
cmd> sort
==181== Invalid read of size 8
==181== at 0x109CD2: do_sort (qtest.c:561)
==181== by 0x10B9D2: interpret_cmda (console.c:220)
==181== by 0x10BF46: interpret_cmd (console.c:243)
==181== by 0x10C6B5: cmd_select (console.c:607)
==181== by 0x10C75C: run_console (console.c:628)
==181== by 0x10AE81: main (qtest.c:772)
==181== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==181==
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
==181== 5 bytes in 1 blocks are still reachable in loss record 1 of 30
[後略...]
(qtest
的 echo
option 沒關,所以會將使用者輸入再印一次)
$ vim qtest.c
:561
if (strcasecmp(e->value, e->next->value) > 0) {
e->next
很可疑。
bool ok = true;
if (q) {
for (list_ele_t *e = q->head; e && --cnt; e = e->next) {
/* Ensure each element in ascending order */
/* FIXME: add an option to specify sorting order */
if (strcasecmp(e->value, e->next->value) > 0) {
report(1, "ERROR: Not sorted in ascending order");
ok = false;
break;
}
}
}
找到問題了,因為 e->next
在 e
到達 list tail 時已經是 NULL
了,再 ->value
下去就會爆。
只要在 e->next
為 NULL
時停止迴圈就可以了。
diff --git a/qtest.c b/qtest.c
index 86d68fb..6ec8d68 100644
--- a/qtest.c
+++ b/qtest.c
@@ -554,8 +554,8 @@ bool do_sort(int argc, char *argv[])
set_noallocate_mode(false);
bool ok = true;
- if (q) {
- for (list_ele_t *e = q->head; e && --cnt; e = e->next) {
+ if (q && q->head) {
+ for (list_ele_t *e = q->head; e->next && --cnt; e = e->next) {
/* Ensure each element in ascending order */
/* FIXME: add an option to specify sorting order */
if (strcasecmp(e->value, e->next->value) > 0) {
注意只要比較 size - 1
次就好;在 size 為 1
時則沒東西可比較。
重看了這裡的 code 發現,for
迴圈中的 --cnt
會讓迴圈最多執行 size - 1
步,這就足以防止 e
為 tail 時 e->next
為 NULL
的相關問題了。
所以這裡會 crash 是因為 q
本身出了問題。(見下)
再試試看。
$ make test
scripts/driver.py -c
--- Trace Points
[...]
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
ERROR: Removed value dolphin != expected value bear
--- trace-04-ops 0/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
ERROR: Removed value gerbil != expected value bear
ERROR: Removed value meerkat != expected value bear
ERROR: Removal from queue failed (1 failures total)
ERROR: Removal from queue failed (2 failures total)
ERROR: Removal from queue failed (3 failures total)
Error limit exceeded. Stopping command execution
--- trace-05-ops 0/5
[...]
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
ERROR: Freed queue, but 3999998 blocks are still allocated
--- trace-15-perf 0/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
ERROR: Freed queue, but 19998 blocks are still allocated
ERROR: Freed queue, but 119996 blocks are still allocated
ERROR: Freed queue, but 319994 blocks are still allocated
--- trace-16-perf 0/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
--- trace-17-complexity 5/5
--- TOTAL 77/100
看來這次是我的問題了。
因為想不到原因,所以從頭看了這份報告,然後看到這段:
可能需要更新 tail 的場合:
[…]
- Tail 往前移動了
q_reverse()
q_sort()
我想起來了,我還沒有設定 sort 完的 list 的 head 與 tail。
diff --git a/queue.c b/queue.c
index cbe1628..b25556a 100644
--- a/queue.c
+++ b/queue.c
@@ -183,19 +183,24 @@ void q_reverse(queue_t *q)
q->head = prev;
}
+typedef struct {
+ list_ele_t *head;
+ list_ele_t *tail;
+} list_span_t;
+
/*
* Sort `len` elements start with `head` in ascending order using merge sort
* No effect if `head` is `NULL` or its member `next` is `NULL`.
* The function reduces to insertion sort when `len` < 4
*/
-static list_ele_t *ele_sort(list_ele_t *head, size_t len)
+static list_span_t ele_sort(list_ele_t *head, size_t len)
{
list_ele_t *left;
list_ele_t *right;
list_ele_t *merge;
if (!head || !head->next)
- return head;
+ return (list_span_t){head, head};
left = right = head;
for (size_t k = 1, n = len / 2; k < n && right; ++k)
@@ -206,10 +211,13 @@ static list_ele_t *ele_sort(list_ele_t *head, size_t len)
right = next;
}
- left = ele_sort(left, len / 2);
- if (!right) // `right` is `NULL` => `left` is the head of the sorted list
- return left;
- right = ele_sort(right, len / 2 + len % 2);
+ {
+ const list_span_t left_res = ele_sort(left, len / 2);
+ left = left_res.head;
+ if (!right) // `left` is the head of the sorted list
+ return left_res;
+ }
+ right = ele_sort(right, len / 2 + len % 2).head;
/* Now `left` and `right` are both the head of a non-`NULL` sorted list */
{
@@ -233,7 +241,7 @@ static list_ele_t *ele_sort(list_ele_t *head, size_t len)
merge = merge->next;
}
- return head;
+ return (list_span_t){head, merge};
}
/*
@@ -243,8 +251,11 @@ static list_ele_t *ele_sort(list_ele_t *head, size_t len)
*/
void q_sort(queue_t *q)
{
- if (!q || !q->head)
+ list_span_t span;
+ if (!q || !q->head || !q->head->next)
return;
- ele_sort(q->head, q->size);
+ span = ele_sort(q->head, q->size);
+ q->head = span.head;
+ q->tail = span.tail;
}
我讓 ele_sort()
一次回傳兩個值,分別是 list sort 完後的 head 與 tail。
/*
* Sort elements of queue in ascending order
* No effect if q is NULL or empty. In addition, if q has only one
* element, do nothing.
*/
注意在 q
的 size 為 1
時,do nothing
,所以這時我就不呼叫 ele_sort()
也不設定 head 與 tail 了。
$ make test
scripts/driver.py -c
--- Trace Points
[...]
+++ TESTING trace trace-04-ops:
# Test of insert_head, insert_tail, size, and sort
--- trace-04-ops 6/6
+++ TESTING trace trace-05-ops:
# Test of insert_head, insert_tail, remove_head, reverse, size, and sort
--- trace-05-ops 5/5
[...]
+++ TESTING trace trace-15-perf:
# Test performance of insert_tail, size, reverse, and sort
--- trace-15-perf 6/6
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
--- trace-16-perf 6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
--- trace-17-complexity 5/5
--- TOTAL 100/100
我選擇參考 Ruby 版本的 natural comparison 實作,重寫 C 的版本。
https://github.com/sourcefrog/natsort/blob/master/natcmp.rb
先看看 Ruby 版本是怎麼做的?
class String
# 'Natural order' comparison of two strings
def String.natcmp(str1, str2, caseInsensitive=false)
作者將這個 method 直接 implement 進 Ruby 的內建 class String
裡。
提供了不區分大小寫的選項;不給時則分大小寫。
注意這不是 instance method (要寫成 def self.natcmp(str2, caseInsensitive=false)
),所以只能用 String.natcmp "sth" "else"
的形式呼叫,不能用 "sth".natcmp "else"
的形式來呼叫。
str1, str2 = str1.dup, str2.dup
複製字串,因為後面會去除空白字元,以及可能要轉小寫。(Ruby 可以省略 method 呼叫的 ()
)
這種 assignment 的語法叫 destructing assignment。
C++17 有個類似的語法叫 structured binding declaration:
auto [buf1, buf2] = std::make_tuple(strdup(argv[1]), strdup(argv[2]));
接下來,
compareExpression = /^(\D*)(\d*)(.*)$/
這是用以分開字串的 regular expression。
Natural comparison 要分成「非數字」、「數字」、以及「餘下部分」作比較。
這個 regular expression 中的三個 groups 就分別對應這三個部分。
if caseInsensitive
str1.downcase!
str2.downcase!
end
如果不分大小寫,轉小寫。(!
結尾的 Ruby method 通常表示「破壞性的」,會直接變更原本的物件,所以前面用 .dup
複製,避免改到原本的物件)
# Remove all whitespace
str1.gsub!(/\s*/, '')
str2.gsub!(/\s*/, '')
去除空白字元。
while (str1.length > 0) or (str2.length > 0) do
注意要不斷地比較餘下部分(一開始是整個字串)。
# Extract non-digits, digits and rest of string
str1 =~ compareExpression
chars1, num1, str1 = $1.dup, $2.dup, $3.dup
str2 =~ compareExpression
chars2, num2, str2 = $1.dup, $2.dup, $3.dup
將 str1
/str2
分別用 regular expression match,matched 的 3 個 groups 分別 assign 給代表開頭非數字部分的 chars1
/chars2
、代表中間連續數字部分的num1
/num2
、以及代表餘下字串的 str1
/str2
。
這裡使用 .dup
的原因,可能是因為作者認為進行 matching (=~
) 時,Ruby 會直接改變 matching group $1
/$2
/$3
/… 所 reference 的字串物件的內容,而不是將這些 variables reference 到新 allocate 出來的字串。
用類似 C 的 pseudo code 表示作者認為的樣子:
void *$3 = str_new(""); // Initial value of `$3`
str_cpy($3, str_new("/* sth */")); // `str1 =~ compareExpression`
void *str1 = str_dup($3); // `str1 = $3.dup`
str_cpy($3, str_new("/* sth else */")); // `str2 =~ compareExpression`
void *str2 = str_dup($3); // `str2 = $3.dup`
不過用 Ruby 2.6.5 測試,Ruby 的做法是將這些 variables $1
/$2
/$3
/… reference 到新 allocate 出來的字串,而不會改變原先字串的內容,所以這裡並不需要使用 .dup
。
用類似 C 的 pseudo code 表示實際執行的樣子:
void *$3 = NULL; // Initial value of `$3`
$3 = str_new("/* sth */"); // `str1 =~ compareExpression`
void *str1 = str_dup($3); // `str1 = $3.dup`
$3 = str_new("/* sth else */"); // `str2 =~ compareExpression`
void *str2 = str_dup($3); // `str2 = $3.dup`
可以看到兩個 .dup
(這裡用 str_dup()
表示對應的 C 函數) 都是不必要的。
節錄 Ruby 對 Pre-defined variables
的說明:
$1, $2…
Contains the subpattern from the corresponding set of parentheses in the last successful pattern matched, […], ornil
if the last pattern match failed. (Mnemonic: like \digit.) These variables are all read-only.
– Pre-defined variables
$1
/$2
/$3
/… 可以 nil
,而不是一直都是字串,所以應該是改變 reference 的字串,而不是改變原字串內容。
These variables are all read-only
這又是怎麼回事?
實際測試一下:
irb(main):001:0> "asdf" =~ /(\w{2})/
=> 0
irb(main):002:0> $1
=> "as"
irb(main):003:0> $1[0] = "i"; $1
=> "as"
直接從 $1
改其所 reference 的字串內容會失敗。
irb(main):004:0> a = $1
=> "as"
irb(main):005:0> a.equal? $1
=> false
irb(main):006:0> b = $1
=> "as"
irb(main):007:0> b.equal? a
=> false
irb(main):008:0> b[0] = "i"; b
=> "is"
irb(main):009:0> a
=> "as"
irb(main):010:0> $1
=> "as"
發現嘗試 reference 到 $1
所指的字串時,會 reference 到複製出來的字串。所以無法去更改 $1
所 reference 的字串內容。
繼續剛才的 code。
# Compare the non-digits
case (chars1 <=> chars2)
Ruby 的 <=>
是比較大小的意思,大於時會給 1
,相等時給 0
,小於時給 -1
。
有點像是 strcmp()
,但 C/C++ 標準沒有規定 strcmp()
的值必須是 -1
、0
、或 1
之一。
C++20 也有 <=>
,不過和 strcmp()
一樣沒有限制值域。
when 0 # Non-digits are the same, compare the digits...
# If either number begins with a zero, then compare alphabetically,
# otherwise compare numerically
if (num1[0] != 48) and (num2[0] != 48)
num1, num2 = num1.to_i, num2.to_i
end
case (num1 <=> num2)
when -1 then return -1
when 1 then return 1
end
48
是字元 '0'
的 ASCII 值;在 Ruby 中沒有「字元」,只有「字串」,不能寫 '0'
,這在 Ruby 中是字串。比較好的寫法是 "0".ord
。
先比較非數字部分,一樣的話 ((chars1 <=> chars2) == 0
) 再比較數字部分,如果數字部分也一樣的話,就給下一圈迴圈比較。
如果兩個字串的數字部分都不以 '0'
開頭,就轉成數字比較;但如果有一個的數字部分以 '0'
開頭,就使用字串比較。
when -1 then return -1
when 1 then return 1
end # case
一旦遇到非數字部分不一樣時就 return
。
end # while
# Strings are naturally equal
return 0
end
迴圈裡都沒 return
就沒不同。
end # class String
前面的 qtest.c
的 do_sort()
處有一段。
/* FIXME: add an option to specify sorting order */
所以要使用 option 來指定 comparison 的方法。
然後是作業要求:
修改排序所用的比較函式,變更為 natural sort,在 “simulation” 也該做對應的修改,得以反映出 natural sort 的使用。
用 rg
(ripgrep
) 稍微找了一下,還暫時看不出 在 “simulation” 也該做對應的修改
的意思,可能是在 simulation mode 下要給預設的 comparison 方法。
使用 add_param()
增加 comparison method 的參數 (int
),然後在 do_sort()
時根據這個參數取得對應的 comparison function 進行 sort。
不應該從 queue.o
存取 qtest.o
裡的全域變數,因為這樣是不太好的設計,會造成 queue.o
的 link dependency。
所以 q_sort()
以及 ele_sort()
應該要增加指定 comparison function 用的 parameter,避免依賴 qtest.o
中的全域變數。
首先增加 compare.h
與 compare.c
。
compare.h
:
#ifndef LAB0_COMPARE_H
#define LAB0_COMPARE_H
#include <string.h> /* strcmp */
#include <strings.h> /* strcasecmp */
typedef int (*cmp_func_t)(const char *s1, const char *s2);
cmp_func_t cmp_get_func(int index);
const char *cmp_get_func_str(int index);
/* Name of the comparison functions */
#define CMP_FUNC_NAME_0 strcasecmp
#define CMP_FUNC_NAME_1 strcmp
#define CMP_FUNC_NAME_2 negstrcasecmp
#define CMP_FUNC_NAME_3 negstrcmp
/* Function name of the comparison functions */
#define CMP_FUNC_NAME(index) CMP_FUNC_NAME_##index
#define CMP_FUNC_STR(index) CMP_STR(CMP_FUNC_NAME(index))
/* Enumeration name of the comparison functions */
#define CMP_ENUM_NAME(index) CMP_CAT(k_, CMP_FUNC_NAME_##index)
/* Concatenate with expansion */
#define CMP_CAT(x, ...) CMP_CAT_NOEXPAND(x, __VA_ARGS__)
#define CMP_CAT_NOEXPAND(x, ...) x##__VA_ARGS__
/* Stringify with expansion */
#define CMP_STR(...) CMP_STR_NOEXPAND(__VA_ARGS__)
#define CMP_STR_NOEXPAND(...) #__VA_ARGS__
/* Define macro `CMP_EXPAND_FMT(n)` first and then invoke `CMP_EXPAND()` */
#define CMP_EXPAND() \
CMP_EXPAND_FMT(0) CMP_EXPAND_FMT(1) CMP_EXPAND_FMT(2) CMP_EXPAND_FMT(3)
#define CMP_EXPAND_FMT(n) CMP_ENUM_NAME(n) = n,
typedef enum { CMP_EXPAND() } cmp_idx_t;
#undef CMP_EXPAND_FMT
int negstrcasecmp(const char *s1, const char *s2);
int negstrcmp(const char *s1, const char *s2);
#endif /* LAB0_COMPARE_H */
CMP_FUNC_NAME(index)
以及 CMP_FUNC_STR(index)
是用來從指定的 index
取得函數名稱的 macros。
其中 CMP_FUNC_NAME()
取得的是 function designator,可用於呼叫;CMP_FUNC_STR()
取得的是字串,可用於顯示。
cmp_func_t cmp_get_func(int index)
以及 const char *cmp_get_func_str(int index)
則分別是 macros CMP_FUNC_NAME(index)
與 CMP_FUNC_STR(index)
所對應的 C 函數版本。
CMP_ENUM_NAME(index)
則是用來取得 index
所對應的 enum cmp_idx_t
的元素的名稱。
CMP_EXPAND()
要配合 #define CMP_EXPAND_FMT(n)
使用,使用上就像是 foreach
的感覺。
如果想要新增比較函數 ninecmp()
,對應的 index 為 9
,只要 #define CMP_FUNC_NAME_9 ninecmp
,然後在 #define CMP_EXPAND()
後面加上 CMP_EXPAND_FMT(9)
就行了。我認為很方便。
compare.c
:
#include "compare.h"
#define CMP_EXPAND_FMT(n) \
case CMP_ENUM_NAME(n): \
return CMP_FUNC_NAME(n);
cmp_func_t cmp_get_func(int index)
{
switch (index) {
default:
CMP_EXPAND()
}
}
#undef CMP_EXPAND_FMT
#define CMP_EXPAND_FMT(n) \
case CMP_ENUM_NAME(n): \
return CMP_FUNC_STR(n);
const char *cmp_get_func_str(int index)
{
switch (index) {
default:
CMP_EXPAND()
}
}
#undef CMP_EXPAND_FMT
int negstrcasecmp(const char *s1, const char *s2)
{
return -strcasecmp(s1, s2);
}
int negstrcmp(const char *s1, const char *s2)
{
return -strcmp(s1, s2);
}
negstrcasecmp()
以及 negstrcmp()
是我增加的字串比較函數,作用是在 sort 的時候可以由大排到小。
接著把 compare.o
加進 Makefile
。
diff --git a/Makefile b/Makefile
index 172a41f..34ed55e 100644
--- a/Makefile
+++ b/Makefile
@@ -25,7 +25,7 @@ $(GIT_HOOKS):
@scripts/install-git-hooks
@echo
-OBJS := qtest.o report.o console.o harness.o queue.o \
+OBJS := qtest.o report.o console.o harness.o queue.o compare.o \
random.o dudect/constant.o dudect/fixture.o dudect/ttest.o
deps := $(OBJS:%.o=.%.o.d)
然後把 q_sort()
加上用來指定比較函數的 parameter。
diff --git a/queue.c b/queue.c
index b25556a..26e5737 100644
--- a/queue.c
+++ b/queue.c
@@ -2,6 +2,7 @@
#include <stdlib.h>
#include <string.h>
+#include "compare.h"
#include "harness.h"
#include "queue.h"
@@ -193,7 +194,7 @@ typedef struct {
* No effect if `head` is `NULL` or its member `next` is `NULL`.
* The function reduces to insertion sort when `len` < 4
*/
-static list_span_t ele_sort(list_ele_t *head, size_t len)
+static list_span_t ele_sort(list_ele_t *head, size_t len, cmp_func_t cmp)
{
list_ele_t *left;
list_ele_t *right;
@@ -212,24 +213,24 @@ static list_span_t ele_sort(list_ele_t *head, size_t len)
}
{
- const list_span_t left_res = ele_sort(left, len / 2);
+ const list_span_t left_res = ele_sort(left, len / 2, cmp);
left = left_res.head;
if (!right) // `left` is the head of the sorted list
return left_res;
}
- right = ele_sort(right, len / 2 + len % 2).head;
+ right = ele_sort(right, len / 2 + len % 2, cmp).head;
/* Now `left` and `right` are both the head of a non-`NULL` sorted list */
{
/* Set up the beginning of the merged list */
list_ele_t **const phead =
- (strcmp(left->value, right->value) < 0) ? &left : &right;
+ (cmp(left->value, right->value) < 0) ? &left : &right;
merge = head = *phead;
*phead = (*phead)->next;
}
while (left || right) {
- if (!right || (left && strcmp(left->value, right->value) < 0)) {
+ if (!right || (left && cmp(left->value, right->value) < 0)) {
/* `left` should be linked from `merge` */
merge->next = left;
left = left->next;
@@ -246,16 +247,17 @@ static list_span_t ele_sort(list_ele_t *head, size_t len)
/*
* Sort elements of queue in ascending order
- * No effect if q is NULL or empty. In addition, if q has only one
+ * No effect if `q` is `NULL` or empty. In addition, if `q` has only one
* element, do nothing.
+ * Argument `cmp` should not be `NULL`.
*/
-void q_sort(queue_t *q)
+void q_sort(queue_t *q, cmp_func_t cmp)
{
list_span_t span;
if (!q || !q->head || !q->head->next)
return;
- span = ele_sort(q->head, q->size);
+ span = ele_sort(q->head, q->size, cmp);
q->head = span.head;
q->tail = span.tail;
}
diff --git a/queue.h b/queue.h
index c67dd87..c0d3bcb 100644
--- a/queue.h
+++ b/queue.h
@@ -11,6 +11,8 @@
#include <stdbool.h>
#include <stddef.h>
+#include "compare.h"
+
/* Data structure declarations */
/* Linked list element (You shouldn't need to change this) */
@@ -91,6 +93,6 @@ void q_reverse(queue_t *q);
* No effect if q is NULL or empty. In addition, if q has only one
* element, do nothing.
*/
-void q_sort(queue_t *q);
+void q_sort(queue_t *q, cmp_func_t cmp);
#endif /* LAB0_QUEUE_H */
在 qtest.c
中增加全域變數 static int cmp_func_idx = 0;
,然後用 add_param()
增加 compare
選項,並寫個 setter compare_setter()
來告訴使用者,比較函數換成了哪一個。
接著在 do_sort()
裡取得比較函數,用以傳給 q_sort()
以及驗證 sort 的結果。
diff --git a/qtest.c b/qtest.c
index 6ec8d68..c4dc7f8 100644
--- a/qtest.c
+++ b/qtest.c
@@ -6,7 +6,6 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
-#include <strings.h> /* strcasecmp */
#include <sys/stat.h>
#include <sys/wait.h>
#include <time.h>
@@ -31,6 +30,8 @@
*/
#include "queue.h"
+#include "compare.h" /* comparison functions */
+
#include "console.h"
#include "report.h"
@@ -62,6 +63,9 @@ static int string_length = MAXSTRING;
#define MAX_RANDSTR_LEN 10
static const char charset[] = "abcdefghijklmnopqrstuvwxyz";
+/* Comparison method */
+static int cmp_func_idx = 0;
+
/* Forward declarations */
static bool show_queue(int vlevel);
static bool do_new(int argc, char *argv[]);
@@ -77,6 +81,11 @@ static bool do_show(int argc, char *argv[]);
static void queue_init();
+static void compare_setter(int oldval)
+{
+ printf("Will use %s for sorting.\n", cmp_get_func_str(cmp_func_idx));
+}
+
static void console_init()
{
add_cmd("new", do_new, " | Create new queue");
@@ -104,6 +113,11 @@ static void console_init()
NULL);
add_param("fail", &fail_limit,
"Number of times allow queue operations to return false", NULL);
+#define CMP_EXPAND_FMT(n) " " #n ": " CMP_FUNC_STR(n)
+ add_param("compare", &cmp_func_idx,
+ "Comparison function to be used (default: 0)" CMP_EXPAND(),
+ compare_setter);
+#undef CMP_EXPAND_FMT
}
static bool do_new(int argc, char *argv[])
@@ -547,9 +561,10 @@ bool do_sort(int argc, char *argv[])
report(3, "Warning: Calling sort on single node");
error_check();
+ const cmp_func_t cmp = cmp_get_func(cmp_func_idx);
set_noallocate_mode(true);
if (exception_setup(true))
- q_sort(q);
+ q_sort(q, cmp);
exception_cancel();
set_noallocate_mode(false);
@@ -557,8 +572,7 @@ bool do_sort(int argc, char *argv[])
if (q && q->head) {
for (list_ele_t *e = q->head; e->next && --cnt; e = e->next) {
/* Ensure each element in ascending order */
- /* FIXME: add an option to specify sorting order */
- if (strcasecmp(e->value, e->next->value) > 0) {
+ if (cmp(e->value, e->next->value) > 0) {
report(1, "ERROR: Not sorted in ascending order");
ok = false;
break;
$ ./qtest
cmd> help
cmd> help
Commands:
# ... | Display comment
free | Delete queue
help | Show documentation
ih str [n] | Insert string str at head of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
it str [n] | Insert string str at tail of queue n times. Generate random string(s) if str equals RAND. (default: n == 1)
log file | Copy output to file
new | Create new queue
option [name val] | Display or set options
quit | Exit program
reverse | Reverse queue
rh [str] | Remove from head of queue. Optionally compare to expected value str
rhq | Remove from head of queue without reporting value.
show | Show queue contents
size [n] | Compute queue size n times (default: n == 1)
sort | Sort queue in ascending order
source file | Read commands from source file
time cmd arg ... | Time command execution
Options:
compare 0 Comparison function to be used (default: 0) 0: strcasecmp 1: strcmp 2: negstrcasecmp 3: negstrcmp
echo 1 Do/don't echo commands
error 5 Number of errors until exit
fail 30 Number of times allow queue operations to return false
length 1024 Maximum length of displayed string
malloc 0 Malloc failure probability percent
simulation 0 Start/Stop simulation mode
verbose 4 Verbosity level
cmd> new
cmd> new
q = []
cmd> it RAND 16
cmd> it RAND 16
q = [cwxvawh qenuidu fqcrxv yxwugtmlc jxqosdvq oprjot iywxl hdkyr vcdqwrhqc jakuahtmi tnpks veftxnj jwvwsvf tywgsm mdnbop xeaerjtt]
cmd> sort
cmd> sort
q = [cwxvawh fqcrxv hdkyr iywxl jakuahtmi jwvwsvf jxqosdvq mdnbop oprjot qenuidu tnpks tywgsm vcdqwrhqc veftxnj xeaerjtt yxwugtmlc]
cmd> option compare 2
cmd> option compare 2
Will use negstrcasecmp for sorting.
cmd> sort
cmd> sort
q = [yxwugtmlc xeaerjtt veftxnj vcdqwrhqc tywgsm tnpks qenuidu oprjot mdnbop jxqosdvq jwvwsvf jakuahtmi iywxl hdkyr fqcrxv cwxvawh]
cmd>
$ git commit -a
compare.c:27:0: style: The function 'negstrcasecmp' is never used. [unusedFunction]
^
compare.c:32:0: style: The function 'negstrcmp' is never used. [unusedFunction]
^
Fail to pass static analysis.
真奇怪,我剛剛使用的函數是沒有被使用的函數?
如果函數沒有被使用到,那我剛才使用的是什麼?
如果我剛才使用的是這個函數,為何會未被使用?
cppcheck
可能不接受這種只用在 switch
內的用法,所以以為沒有使用這些函數。
看來不能用 switch
,得真的建表了。
compare.c
:
#define CMP_EXPAND_FMT(n) [CMP_ENUM_NAME(n)] = CMP_FUNC_NAME(n),
static const cmp_func_t funcs[] = {CMP_EXPAND()};
#undef CMP_EXPAND_FMT
#define CMP_EXPAND_FMT(n) [CMP_ENUM_NAME(n)] = CMP_FUNC_STR(n),
static const char *const func_strs[] = {CMP_EXPAND()};
#undef CMP_EXPAND_FMT
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))
cmp_func_t cmp_get_func(int index)
{
if (index >= 0 && index < ARRAY_SIZE(funcs))
return funcs[index];
return funcs[CMP_ENUM_NAME(0)];
}
const char *cmp_get_func_str(int index)
{
if (index >= 0 && index < ARRAY_SIZE(func_strs))
return func_strs[index];
return func_strs[CMP_ENUM_NAME(0)];
}
$ git commit -a
[master f6b25e9] Add an option to specify sorting order
6 files changed, 111 insertions(+), 14 deletions(-)
create mode 100644 compare.c
create mode 100644 compare.h
$ git commit
Rename elements of enum `cmp_idx_t` to `k_<func>` [line 1]
- Possible misspelled word(s): enum
How to Write a Git Commit Message: https://chris.beams.io/posts/git-commit/
Proceed with commit? [e/n/?]
C 的 keyword enum
居然被認為是錯字。雖然它本來就是。
用全稱取代:
Rename elements of enumeration type `cmp_idx_t` to `k_<func>` [line 1]
- Possible misspelled word(s): cmp
How to Write a Git Commit Message: https://chris.beams.io/posts/git-commit/
Proceed with commit? [e/n/?]
連常用的簡寫 cmp
=> compare
也不行。
可能是我寫得太 specific 了。
寫得更 generic 一點。
[master d71ea32] Refine the name of comparison method enumerations
1 file changed, 7 insertions(+), 7 deletions(-)
compare.c
:
#include <ctype.h>
#include <stdbool.h>
/* Natural comparison */
/* For `strspn()` */
#define WHITESPACES " \f\n\r\t\v"
#define DECDIGITS "0123456789"
static const char digit_space[] = DECDIGITS WHITESPACES;
static const char space[] = WHITESPACES;
用來呼叫 strspn()
用的字串。
這裡用 macros 是為了避免重複兩次同樣的字串,以免之後需要修改時需要修改兩個地方。
先看主函數實作:
/*
* The basic implement of `natstrcmp()` and `natstrcasecmp()`.
*/
static int natstrcmp_base(const char *s1, const char *s2, bool ignore_case)
{
const char *p1 = s1;
const char *p2 = s2;
while (*p1 || *p2) {
int diff;
/* Skip spaces */
p1 += strspn(p1, space);
p2 += strspn(p2, space);
if (isdigit(*p1) || isdigit(*p2)) {
diff = natstrcmp_digit(&p1, &p2); /* Compare digits */
} else {
const char c1 = (ignore_case) ? tolower(*p1) : *p1;
const char c2 = (ignore_case) ? tolower(*p2) : *p2;
diff = c1 - c2;
}
if (diff || !*p1 || !*p2)
return diff;
++p1;
++p2;
}
return 0;
}
從頭開始對 2 個字串 s1
與 s2
的字元進行比對。
先跳過連續空白,檢查有沒有遇到數字部分,有的話交給數字部分處理,沒有則直接比較。
如果有不同,或是遇到字串結尾,就回傳結果,否則比對下個字元。
數字處理的部分:
/*
* The digit comparison part of natural comparison.
* Both `!isspace(**ps1) && !isspace(**ps2)`
* and `isdigit(**ps1) || isdigit(**ps2)` should hold.
*/
static int natstrcmp_digit(const char **ps1, const char **ps2)
{
const char *p1;
const char *p2;
int diff; // The difference of the first mismatch characters
bool numcmp; // Perform number comparison or string comparison
if (!isdigit(**ps1)) {
*ps2 += strspn(*ps2, digit_space);
return -1;
}
if (!isdigit(**ps2)) {
*ps1 += strspn(*ps1, digit_space);
return 1;
}
p1 = *ps1;
p2 = *ps2;
*ps1 += strspn(*ps1, digit_space);
*ps2 += strspn(*ps2, digit_space);
diff = 0;
numcmp = !(**ps1 == '0' || **ps2 == '0');
for (;;) {
for (;;) {
p1 += strspn(p1, space);
p2 += strspn(p2, space);
if (isdigit(p1) && isdigit(p2)) {
if (!diff) {
diff = *p1 - *p2;
if (diff)
break;
}
} else {
/* strcmp/numcmp: Returns on end of either numbers */
if (!diff)
diff = isdigit(p1) - isdigit(p2);
return diff;
}
++p1;
++p2;
}
/* strcmp: Returns on first mismatch */
if (!numcmp)
return diff;
}
}
先檢查是不是 2 個字串 *ps1
與 *ps2
都以數字開頭。不是的話,代表有一個是數字,另一個不是,可以直接回傳結果。
我在回傳結果前把指向 指向數字字元的指標 的指標 ps1
/ps2
指向的指標 *ps1
/*ps2
(也就是 natstrcmp_base()
中的 p1
/p2
) 移到數字部分的結束。這個動作其實不是必要的,因為回傳結果非 0
,natstrcmp_base()
會跟著 return
,並不會用到這個 由參數 ps1
/ps2
指向的 指標 *ps1
/*ps2
的值。
確定 2 個字串都以數字字元開頭後,將 *ps1
與 *ps2
移到數字部分的結束。與前面的不同,這一步是需要的,因為可能因數字部分相等而回傳 0
。
如果 2 個字串中有 1 個以 0
開頭,就使用字串比較,否則當作整數比較。
整數字串比較特別,而處理單純字串比較簡單,所以先討論整數比較,再討論字串比較好。
首先,不可以轉成數字,因為會 overflow。
在做整數比較時,我們會看哪個數字長,一樣長時則從開頭看哪個數字大。
但是不應該看完長度再回去一一比較,因為中途可能有些空白,所以數字的長度,必須走過一遍才能明白。
因此我採用的策略是,在比較的過程中,記住第一個不一樣的數字,然後讓我們繼續看下去。
如果先看到其中一個的尾巴,就表示它比較短;否則一樣長,需要看數字大小。
而因為已經記住了第一個不一樣的數字,而十進制數字愈靠近開頭的數位,權重愈大,所以第一個不一樣的數字決定了哪個數大。
而做字串比較時,只要看到第一個不一樣的數字,就可以回傳結果。
因此,整數比較與字串比較都有「尋找第一個不同點」的共同點。
所以我將比較迴圈寫成兩層的無限迴圈,找到第一個不同點時就退出內層迴圈,在外層迴圈中根據是字串比較還是整數比較,判斷要不要繼續比較。
而當遇到非數字字元時,數字字元就比較結束了。這時整數比較可以判斷哪個整數比較長,字串比較可以判斷哪個字串比較短。
因此,整數比較與字串比較還有「尋找字元長度」的共同點。
不同的是,整數比較還要考慮是不是找到了第一個不同點,而字串比較則不用。
因為整數比較與字串比較的這些共同點,所以我把它們寫成同一個迴圈。
最後是實際提供的函數。
額外加上 neg*()
版本。
int natstrcasecmp(const char *s1, const char *s2)
{
return natstrcmp_base(s1, s2, true);
}
int natstrcmp(const char *s1, const char *s2)
{
return natstrcmp_base(s1, s2, false);
}
int negnatstrcasecmp(const char *s1, const char *s2)
{
return -natstrcasecmp(s1, s2);
}
int negnatstrcmp(const char *s1, const char *s2)
{
return -natstrcmp(s1, s2);
}
接著將這些函數宣告入 compare.h
。
diff --git a/compare.h b/compare.h
index fa09be9..663bee6 100644
--- a/compare.h
+++ b/compare.h
@@ -14,6 +14,10 @@ const char *cmp_get_func_str(int index);
#define CMP_FUNC_NAME_1 strcmp
#define CMP_FUNC_NAME_2 negstrcasecmp
#define CMP_FUNC_NAME_3 negstrcmp
+#define CMP_FUNC_NAME_4 natstrcasecmp
+#define CMP_FUNC_NAME_5 natstrcmp
+#define CMP_FUNC_NAME_6 negnatstrcasecmp
+#define CMP_FUNC_NAME_7 negnatstrcmp
/* Function name of the comparison functions */
#define CMP_FUNC_NAME(index) CMP_FUNC_NAME_##index
@@ -31,8 +35,12 @@ const char *cmp_get_func_str(int index);
/* Define macro `CMP_EXPAND_FMT(n)` first and then invoke `CMP_EXPAND()` */
-#define CMP_EXPAND() \
- CMP_EXPAND_FMT(0) CMP_EXPAND_FMT(1) CMP_EXPAND_FMT(2) CMP_EXPAND_FMT(3)
+#define CMP_EXPAND() \
+ CMP_EXPAND_FMT(0) \
+ CMP_EXPAND_FMT(1) \
+ CMP_EXPAND_FMT(2) \
+ CMP_EXPAND_FMT(3) \
+ CMP_EXPAND_FMT(4) CMP_EXPAND_FMT(5) CMP_EXPAND_FMT(6) CMP_EXPAND_FMT(7)
#define CMP_EXPAND_FMT(n) CMP_ENUM_NAME(n) = n,
typedef enum { CMP_EXPAND() } cmp_idx_t;
@@ -40,5 +48,9 @@ typedef enum { CMP_EXPAND() } cmp_idx_t;
int negstrcasecmp(const char *s1, const char *s2);
int negstrcmp(const char *s1, const char *s2);
+int natstrcasecmp(const char *s1, const char *s2);
+int natstrcmp(const char *s1, const char *s2);
+int negnatstrcasecmp(const char *s1, const char *s2);
+int negnatstrcmp(const char *s1, const char *s2);
#endif /* LAB0_COMPARE_H */
然後直接 make
。
在 qtest
中就可以使用這些函數了。
而且完全不需要碰 qtest.c
的 code。
$ ./qtest
cmd> help
cmd> help
Commands:
[...]
time cmd arg ... | Time command execution
Options:
compare 0 Comparison function to be used (default: 0) 0: strcasecmp 1: strcmp 2: negstrcasecmp 3: negstrcmp 4: natstrcasecmp 5: natstrcmp 6: negnatstrcasecmp 7: negnatstrcmp
echo 1 Do/don't echo commands
[...]
cmd>
用 sourcefrog/natsort 的 README.md
中的範例測試:
x2-g8 < x2-y7 < x2-y08 < x8-y8
cmd> new
cmd> new
q = []
cmd> ih x2-g8
cmd> ih x2-g8
q = [x2-g8]
cmd> ih x2-y7
cmd> ih x2-y7
q = [x2-y7 x2-g8]
cmd> ih x2-y08
cmd> ih x2-y08
q = [x2-y08 x2-y7 x2-g8]
cmd> ih ih x8-y8
cmd> ih x8-y8
q = [x8-y8 x2-y08 x2-y7 x2-g8]
cmd> sort
cmd> sort
q = [x2-g8 x2-y08 x2-y7 x8-y8]
cmd> option compare 4
cmd> option compare 4
Will use natstrcasecmp for sorting.
cmd> sort
cmd> sort
Segmentation fault occurred. You dereferenced a NULL or invalid pointer
Aborted (core dumped)
發生什麼事?
用 valgrind
檢查發現不小心直接把指標當作 isdigit()
的參數。
diff --git a/compare.c b/compare.c
index 8a541e3..6fe39ab 100644
--- a/compare.c
+++ b/compare.c
@@ -80,7 +80,7 @@ static int natstrcmp_digit(const char **ps1, const char **ps2)
p1 += strspn(p1, space);
p2 += strspn(p2, space);
- if (isdigit(p1) && isdigit(p2)) {
+ if (isdigit(*p1) && isdigit(*p2)) {
if (!diff) {
diff = *p1 - *p2;
if (diff)
@@ -89,7 +89,7 @@ static int natstrcmp_digit(const char **ps1, const char **ps2)
} else {
/* strcmp/numcmp: Returns on end of either numbers */
if (!diff)
- diff = isdigit(p1) - isdigit(p2);
+ diff = isdigit(*p1) - isdigit(*p2);
return diff;
}
連 cppcheck
都沒提出警告!
如果是 C++ 編譯器就會在編譯時直接擋下來:
$ g++ compare.c
compare.c: In function ‘int natstrcmp_digit(const char**, const char**)’:
compare.c:83:27: error: invalid conversion from ‘const char*’ to ‘int’ [-fpermissive]
if (isdigit(p1) && isdigit(p2)) {
^
In file included from compare.c:3:0:
/usr/include/ctype.h:111:1: note: initializing argument 1 of ‘int isdigit(int)’
__exctype (isdigit);
^
compare.c:83:42: error: invalid conversion from ‘const char*’ to ‘int’ [-fpermissive]
if (isdigit(p1) && isdigit(p2)) {
^
In file included from compare.c:3:0:
/usr/include/ctype.h:111:1: note: initializing argument 1 of ‘int isdigit(int)’
__exctype (isdigit);
^
compare.c:92:38: error: invalid conversion from ‘const char*’ to ‘int’ [-fpermissive]
diff = isdigit(p1) - isdigit(p2);
^
In file included from compare.c:3:0:
/usr/include/ctype.h:111:1: note: initializing argument 1 of ‘int isdigit(int)’
__exctype (isdigit);
^
compare.c:92:52: error: invalid conversion from ‘const char*’ to ‘int’ [-fpermissive]
diff = isdigit(p1) - isdigit(p2);
^
In file included from compare.c:3:0:
/usr/include/ctype.h:111:1: note: initializing argument 1 of ‘int isdigit(int)’
__exctype (isdigit);
找找看原因。
編譯器版本?
$ gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
寫個 test.c
:
#include <ctype.h>
isdigit("42")
gcc test.c -E
:
[前略...]
# 327 "/usr/include/ctype.h" 3 4
# 2 "test.c" 2
((*__ctype_b_loc ())[(int) ((
# 2 "test.c"
"42"
# 2 "test.c" 3 4
))] & (unsigned short int) _ISdigit)
原來 isdigit()
在 gcc
上是 macro,有強制轉型,所以不會提出警告。
g++ test.c -E
:
[前略...]
# 327 "/usr/include/ctype.h" 3 4
}
# 2 "test.c" 2
# 2 "test.c"
isdigit("42")
在 g++
上則是 function。
因為有 prototype 的 function 會對 arguments 做型別檢查,所以可以發現參數型別不合的問題。
如果把 gcc
preprocessor 的輸出拿給 g++
當作 C++ 程式編譯呢?
test.c
:
#include <ctype.h>
int main(void)
{
isdigit("42");
}
$ gcc test.c -E | g++ -xc++ -
test.c: In function ‘int main()’:
test.c:5:5: error: cast from ‘const char*’ to ‘int’ loses precision [-fpermissive]
isdigit("42");
還是會被 C++ 編譯器擋。
在 C++ 中進行 pointer 與整數型別的互相強制轉型時,會使用 reinterpret_cast<>()
的規則,而 reinterpret_cast<>()
在將 pointer 轉換成整數型別時,要求這個整數型別的值域要足夠完整表示 pointer 的值。
而在 LP64 (可看成 long
& pointer 64-bit) data model 上,int
是 32-bit,而 pointer 是 64-bit。
在這個例子中,int
的值域不足以完整表示 const char *
的值,會被 C++ 編譯器視為轉型失敗,終而報錯。
另外,如果用:
test.c
:
#include <ctype.h>
int main(void)
{
isdigit('4');
}
$ gcc test.c -E | g++ -xc++ -
則會編譯成功。
在 C++ 下之所以不使用 macro,應該是因為 C++ 在 header <locale>
中有定義相同名稱的 function template:
template <class charT>
bool isspace (charT c, const locale& loc);
而這需要透過編譯器執行 overload resolution 才能確定實際的操作。
除非使用利用了 C99/C++11 的 variadic macro 功能的 narg C macro 技巧與 C11 的 _Generic
,不然沒辦法實作對應的 macro。
而且它在 namespace std
下,可以用 std::isspace
表示,這不是 C macro 所能正確處理的語法。
修復問題後再度測試:
x2-g8 < x2-y7 < x2-y08 < x8-y8
cmd> new
cmd> new
q = []
cmd> option compare 4
cmd> option compare 4
Will use natstrcasecmp for sorting.
cmd> ih x2-g8
cmd> ih x2-g8
q = [x2-g8]
cmd> ih x2-y7
cmd> ih x2-y7
q = [x2-y7 x2-g8]
cmd> ih x2-y08
cmd> ih x2-y08
q = [x2-y08 x2-y7 x2-g8]
cmd> ih x8-y8
cmd> ih x8-y8
q = [x8-y8 x2-y08 x2-y7 x2-g8]
cmd> sort
cmd> sort
q = [x2-g8 x2-y08 x2-y7 x8-y8]
cmd> option compare 1
cmd> option compare 1
Will use strcmp for sorting.
cmd> sort
cmd> sort
q = [x2-g8 x2-y08 x2-y7 x8-y8]
怎麼不正確?
剛才的 Ruby 實作是正確的嗎?
擷取 README.md
中的另一個範例:
1.001 < 1.002 < 1.010 < 1.02 < 1.1 < 1.3
irb(main):047:0> String.natcmp("1.010", "1.02")
=> 1
結果前面 Ruby 的實作也是不正確的。
原因在於對 Ruby string index 會得到 Ruby string 而非 ASCII 碼數字:"010"[0] == "0"
,所以前面的 Ruby 實作會一律轉成數字再比較。
Sorting
Strings are sorted as usual, except that decimal integer substrings are compared on their numeric value. For example,
a < a0 < a1 < a1a < a1b < a2 < a10 < a20
Strings can contain several number parts:
x2-g8 < x2-y7 < x2-y08 < x8-y8
in which case numeric fields are separated by nonnumeric characters.
到目前為止的敘述大致符合之前的實作。
Leading spaces are ignored. This works very well for IP addresses from log files, for example.
開始不符合了。只有 leading spaces 需要忽略,但在剛才的實作中,將全部的 spaces 都忽略了。
Leading zeros are not ignored, which tends to give more reasonable results on decimal fractions.
1.001 < 1.002 < 1.010 < 1.02 < 1.1 < 1.3
這是符合的。
正確的實作是當數字部分以 '0'
開頭時,將其看作小數點後的數字來比較,而這恰巧等同於將其當作字串比較。
而 Ruby 版本的實作則是因為存在邏輯問題,而導致結果不正確。
Some applications may wish to change this by modifying the test that calls
isspace
.Performance is linear: each character of the string is scanned at most once, and only as many characters as necessary to decide are considered.
問題來了。
原範例:
1.001 < 1.002 < 1.010 < 1.02 < 1.1 < 1.3
x2-g8 < x2-y7 < x2-y08 < x8-y8
擷取其中部分:
1.02 < 1.1
x2-y7 < x2-y08
去除相同的非數字部分:
02 < 1
7 < 08
不影響結果的數字代換:
02 < 1
1 < 02
移項:
02 < 1
02 > 1
在只有規格所述的規則下,這兩個範例根本是互相衝突的!
而根據實際 sort 結果:
https://github.com/sourcefrog/natsort/blob/master/sorted-words
x2-y08 < x2-y7
,這等同於 02 < 1
,才是正確的。
所以這個 README.md
內的範例是有誤的。
既然知道只有數字前的空白字元需要忽略,那就可以來修改並優化實作了。
diff --git a/compare.c b/compare.c
index 6fe39ab..2a2c0bd 100644
--- a/compare.c
+++ b/compare.c
@@ -41,16 +41,12 @@ int negstrcmp(const char *s1, const char *s2)
/* For `strspn()` */
-#define WHITESPACES " \f\n\r\t\v"
-#define DECDIGITS "0123456789"
-
-static const char digit_space[] = DECDIGITS WHITESPACES;
-static const char space[] = WHITESPACES;
+static const char digit[] = "0123456789";
+static const char space[] = " \f\n\r\t\v";
/*
* The digit comparison part of natural comparison.
- * Both `!isspace(**ps1) && !isspace(**ps2)`
- * and `isdigit(**ps1) || isdigit(**ps2)` should hold.
+ * `isdigit(**ps1) || isdigit(**ps2)` should hold.
*/
static int natstrcmp_digit(const char **ps1, const char **ps2)
{
@@ -59,27 +55,27 @@ static int natstrcmp_digit(const char **ps1, const char **ps2)
int diff; // The difference of the first mismatch characters
bool numcmp; // Perform number comparison or string comparison
+ *ps1 += strspn(*ps1, space);
+ *ps2 += strspn(*ps2, space);
+
if (!isdigit(**ps1)) {
- *ps2 += strspn(*ps2, digit_space);
+ *ps2 += strspn(*ps2, digit);
return -1;
}
if (!isdigit(**ps2)) {
- *ps1 += strspn(*ps1, digit_space);
+ *ps1 += strspn(*ps1, digit);
return 1;
}
p1 = *ps1;
p2 = *ps2;
- *ps1 += strspn(*ps1, digit_space);
- *ps2 += strspn(*ps2, digit_space);
+ *ps1 += strspn(*ps1, digit);
+ *ps2 += strspn(*ps2, digit);
diff = 0;
numcmp = !(**ps1 == '0' || **ps2 == '0');
for (;;) {
for (;;) {
- p1 += strspn(p1, space);
- p2 += strspn(p2, space);
-
if (isdigit(*p1) && isdigit(*p2)) {
if (!diff) {
diff = *p1 - *p2;
@@ -113,10 +109,6 @@ static int natstrcmp_base(const char *s1, const char *s2, bool ignore_case)
while (*p1 || *p2) {
int diff;
- /* Skip spaces */
- p1 += strspn(p1, space);
- p2 += strspn(p2, space);
-
if (isdigit(*p1) || isdigit(*p2)) {
diff = natstrcmp_digit(&p1, &p2); /* Compare digits */
} else {
因為手動輸入測試資料太麻煩了,所以我想研究 scripts/driver.py
看看自動測試是怎麼實作的。
$ vim scripts/driver.py
一打開就看到可疑的 dictionary。
traceDict = {
1: "trace-01-ops",
2: "trace-02-ops",
3: "trace-03-ops",
4: "trace-04-ops",
5: "trace-05-ops",
6: "trace-06-string",
7: "trace-07-robust",
8: "trace-08-robust",
9: "trace-09-robust",
10: "trace-10-malloc",
11: "trace-11-malloc",
12: "trace-12-malloc",
13: "trace-13-perf",
14: "trace-14-perf",
15: "trace-15-perf",
16: "trace-16-perf",
17: "trace-17-complexity"
}
這些 values 的名稱與 traces/
底下的檔案的主檔名一樣。
*
(vim
向下搜尋某 word 的 hotkey)
發現 runTrace()
與 run()
2 個 methods,其中 run()
會呼叫 runTrace()
。
看看有沒有可疑的關鍵字。
分數是怎麼給的?查查可疑的 maxScores
。
只有這裡有。
maxval = self.maxScores[t]
真相只有一個。
for t in tidList:
tname = self.traceDict[t]
if self.verbLevel > 0:
print("+++ TESTING trace %s:" % tname)
ok = self.runTrace(t)
maxval = self.maxScores[t]
tval = maxval if ok else 0
if tval < maxval:
self.printInColor("---\t%s\t%d/%d" % (tname, tval, maxval), self.RED)
else:
self.printInColor("---\t%s\t%d/%d" % (tname, tval, maxval), self.GREEN)
score += tval
maxscore += maxval
scoreDict[t] = tval
可以看到與 maxval
直接相關的 variables 有 tval
與 maxscore
。
而 score
是不同 tval
的累加,maxscore
則是不同 maxval
的累加。
而 tval
是看 ok
的值,非 maxval
即 0
。這說明了為什麼每個項目都會拿到滿分或 0 分,而不是對多少給多少。
ok
則是從 self.runTrace(t)
回傳的。
看看這個 class
的 runTrace()
method。
def runTrace(self, tid):
if not tid in self.traceDict:
self.printInColor("ERROR: No trace with id %d" % tid, self.RED)
return False
fname = "%s/%s.cmd" % (self.traceDirectory, self.traceDict[tid])
vname = "%d" % self.verbLevel
clist = self.command + ["-v", vname, "-f", fname]
try:
retcode = subprocess.call(clist)
except Exception as e:
self.printInColor("Call of '%s' failed: %s" % (" ".join(clist), e), self.RED)
return False
return retcode == 0
可以看到這個 method 會生成類似 execv()
的陣列參數的指令,讓 clist
reference 這個陣列物件,然後使用 subprocess.call(clist)
執行這個指令。
其中 self.command
是 "./qtest"
。
參數用法 qtest
有附。
$ ./qtest -h
Usage: ./qtest [-h] [-f IFILE][-v VLEVEL][-l LFILE]
-h Print this information
-f IFILE Read commands from IFILE
-v VLEVEL Set verbosity level
-l LFILE Echo results to LFILE
這段 code 使用 exception handling 進行呼叫失敗時的處理——告知使用者,並且回傳 False
代表本題不通過。
正常情況下不會 raise
exception,所以 retcode = subprocess.call(clist)
會成功執行,回傳 qtest
的回傳值。
-f
參數會讓 qtest
將指定檔案,也就是 "%s/%s.cmd" % (self.traceDirectory, self.traceDict[tid])
,當作輸入。
而當 qtest
發生錯誤退出時,會回傳非 0
值。因此用 return retcode == 0
判斷 qtest
有沒有成功執行。
所以判斷操作成不成功是由 qtest
決定的。
什麼時候 qtest
會回傳非 0
值?
qtest.c
的 main()
的最後幾行:
bool ok = true;
ok = ok && run_console(infile_name);
ok = ok && finish_cmd();
return ok ? 0 : 1;
}
console.c
:
bool run_console(char *infile_name)
{
if (!push_file(infile_name)) {
report(1, "ERROR: Could not open source file '%s'", infile_name);
return false;
}
while (!cmd_done())
cmd_select(0, NULL, NULL, NULL, NULL);
return err_cnt == 0;
}
err_cnt
是個全域變數,會被 record_error()
增加,record_error()
會在 interpret_cmda()
中 next_cmd->operation(argc, argv)
回傳 false
時被呼叫。
而 next_cmd->operation
是由 add_cmd()
加進來的 callback function。
所以可以尋找有哪些與 list 操作相關的 callback functions。如果這些函數回傳 false
,那麼會導致 qtest
回傳非 0
值。
其中我們有興趣的是可以驗證 queue 的內容的 command。而 rh
可以讓我們加上預期要 remove 的 node 所儲存的字串,不符時回傳 false
,可用於驗證 queue 的內容。
可以來寫測試的 script 了。
以下測試資料來自 https://github.com/sourcefrog/natsort。
由於 qtest
的 parsing 方法的關係,參數不能有空白,所以我修改了 parse_args
增加雙引號 ""
語法,以接受參數中的空白。
diff --git a/console.c b/console.c
index 3a754ab..25b66ac 100644
--- a/console.c
+++ b/console.c
@@ -163,11 +163,12 @@ static char **parse_args(char *line, int *argcp)
char *src = line;
char *dst = buf;
bool skipping = true;
+ bool quote = false;
int c;
int argc = 0;
while ((c = *src++) != '\0') {
- if (isspace(c)) {
+ if (!quote && isspace(c)) {
if (!skipping) {
/* Hit end of word */
*dst++ = '\0';
@@ -179,9 +180,14 @@ static char **parse_args(char *line, int *argcp)
argc++;
skipping = false;
}
- *dst++ = c;
+ if (c == '\"')
+ quote = !quote;
+ else
+ *dst++ = c;
}
}
+ if (quote)
+ report(1, "Warning: Unbalanced quote '%s'", line);
/* Now assemble into array of strings */
char **argv = calloc_or_fail(argc, sizeof(char *), "parse_args");
quote
成立表示在引號中。在引號中時把空白字元當成一般字元處理。
遇到引號時,切換 quote
,然後忽略引號,不要寫到參數中。
測試:
$ ./qtest
cmd> option echo 0
cmd> option echo 0
cmd> "new"
q = []
cmd> "it" "tes"t" "space
q = [test space]
cmd> rh"" "test space"
Removed test space from queue
q = []
cmd> """"""""quit""""""""
Freeing queue
$ ./qtest
cmd> "new
cmd> "new
Warning: Unbalanced quote '"new
'
Unknown command 'new
'
cmd>
traces/trace-18-natsort.cmd
:
取自 test-dates 以及 sorted-dates
# Test of sort with natural sort - Date
option fail 0
option malloc 0
option compare 4
new
ih "2000-1-10"
ih "2000-1-2"
ih "1999-12-25"
ih "2000-3-23"
ih "1999-3-3"
sort
rh "1999-3-3"
rh "1999-12-25"
rh "2000-1-2"
rh "2000-1-10"
rh "2000-3-23"
traces/trace-19-natsort.cmd
:
取自 test-fractions 以及 sorted-fractions
# Test of sort with natural sort - Fractions
option fail 0
option malloc 0
option compare 4
new
ih "Fractional release numbers"
ih "1.011.02"
ih "1.010.12"
ih "1.009.02"
ih "1.009.20"
ih "1.009.10"
ih "1.002.08"
ih "1.002.03"
ih "1.002.01"
sort
rh "1.002.01"
rh "1.002.03"
rh "1.002.08"
rh "1.009.02"
rh "1.009.10"
rh "1.009.20"
rh "1.010.12"
rh "1.011.02"
rh "Fractional release numbers"
traces/trace-20-natsort.cmd
:
取自 test-words 以及 sorted-words
# Test of sort with natural sort - Words
option fail 0
option malloc 0
option compare 4
new
ih "fred"
ih "pic2"
ih "pic100a"
ih "pic120"
ih "pic121"
ih "jane"
ih "tom"
ih "pic02a"
ih "pic3"
ih "pic4"
ih "1-20"
ih "pic100"
ih "pic02000"
ih "10-20"
ih "1-02"
ih "1-2"
ih "x2-y7"
ih "x8-y8"
ih "x2-y08"
ih "x2-g8"
ih "pic01"
ih "pic02"
ih "pic 6"
ih "pic 7"
ih "pic 5"
ih "pic05"
ih "pic 5 "
ih "pic 5 something"
ih "pic 4 else"
sort
rh "1-02"
rh "1-2"
rh "1-20"
rh "10-20"
rh "fred"
rh "jane"
rh "pic01"
rh "pic02"
rh "pic02a"
rh "pic02000"
rh "pic05"
rh "pic2"
rh "pic3"
rh "pic4"
rh "pic 4 else"
rh "pic 5"
rh "pic 5 "
rh "pic 5 something"
rh "pic 6"
rh "pic 7"
rh "pic100"
rh "pic100a"
rh "pic120"
rh "pic121"
rh "tom"
rh "x2-g8"
rh "x2-y08"
rh "x2-y7"
rh "x8-y8"
$ ./qtest -f traces/trace-18-natsort.cmd
cmd> # Test of sort with natural sort - Date
cmd> option fail 0
cmd> option malloc 0
cmd> option compare 4
Will use natstrcasecmp for sorting.
cmd> new
q = []
cmd> ih "2000-1-10"
q = [2000-1-10]
cmd> ih "2000-1-2"
q = [2000-1-2 2000-1-10]
cmd> ih "1999-12-25"
q = [1999-12-25 2000-1-2 2000-1-10]
cmd> ih "2000-3-23"
q = [2000-3-23 1999-12-25 2000-1-2 2000-1-10]
cmd> ih "1999-3-3"
q = [1999-3-3 2000-3-23 1999-12-25 2000-1-2 2000-1-10]
cmd> sort
q = [1999-12-25 1999-3-3 2000-1-10 2000-1-2 2000-3-23]
cmd> rh "1999-3-3"
Removed 1999-12-25 from queue
ERROR: Removed value 1999-12-25 != expected value 1999-3-3
q = [1999-3-3 2000-1-10 2000-1-2 2000-3-23]
cmd> rh "1999-12-25"
Removed 1999-3-3 from queue
ERROR: Removed value 1999-3-3 != expected value 1999-12-25
q = [2000-1-10 2000-1-2 2000-3-23]
cmd> rh "2000-1-2"
Removed 2000-1-10 from queue
ERROR: Removed value 2000-1-10 != expected value 2000-1-2
q = [2000-1-2 2000-3-23]
cmd> rh "2000-1-10"
Removed 2000-1-2 from queue
ERROR: Removed value 2000-1-2 != expected value 2000-1-10
q = [2000-3-23]
cmd> rh "2000-3-23"
Removed 2000-3-23 from queue
q = []
發現怎麼和 strcmp()
的結果一樣?
寫個簡單的測試程式,test.c
:
#include <stdio.h>
#include "compare.h"
int main(int argc, char *argv[])
{
int res = natstrcmp(argv[1], argv[2]);
if (res < -1)
res = -1;
else if (res > 1)
res = 1;
printf("\"%s\" %c \"%s\"\n", argv[1], "<=>"[res + 1], argv[2]);
}
和 compare.o
一起編譯。
gcc -o test test.c compare.o -g -Wall -Wextra -pedantic
$ ./test "22" "3"
"22" < "3"
只好出動 gdb
。
$ gdb -q ./test
Reading symbols from ./test...done.
(gdb) b natstrcmp_base
Breakpoint 1 at 0xa31: file compare.c, line 105.
(gdb) r 22 3
Starting program: /home/iid/Linux-Kernel-Design-2020/lab0-c/test 22 3
Breakpoint 1, natstrcmp_base (s1=0x7ffffffee23e "22", s2=0x7ffffffee241 "3", ignore_case=false) at compare.c:105
105 {
(gdb) n
109 while (*p1 || *p2) {
(gdb)
112 if (isdigit(*p1) || isdigit(*p2)) {
(gdb)
113 diff = natstrcmp_digit(&p1, &p2); /* Compare digits */
(gdb) s
natstrcmp_digit (ps2=<synthetic pointer>, ps1=<synthetic pointer>) at compare.c:58
58 *ps1 += strspn(*ps1, space);
(gdb) p *ps1
$1 = 0x7ffffffee23e "22"
(gdb) n
59 *ps2 += strspn(*ps2, space);
(gdb) p *ps1
$2 = 0x7ffffffee23e "22"
(gdb) n
61 if (!isdigit(**ps1)) {
(gdb)
65 if (!isdigit(**ps2)) {
(gdb)
72 *ps1 += strspn(*ps1, digit);
(gdb)
73 *ps2 += strspn(*ps2, digit);
(gdb)
76 numcmp = !(**ps1 == '0' || **ps2 == '0');
(gdb)
找到了 1 個問題。
仔細檢查後,發現有 3 個問題:
ps1
/ps2
當作數字部分的開頭。break;
後沒有移到下一個字元,造成同一個字元比兩次。diff --git a/compare.c b/compare.c
index 2a2c0bd..84a5c0e 100644
--- a/compare.c
+++ b/compare.c
@@ -73,7 +73,7 @@ static int natstrcmp_digit(const char **ps1, const char **ps2)
*ps2 += strspn(*ps2, digit);
diff = 0;
- numcmp = !(**ps1 == '0' || **ps2 == '0');
+ numcmp = !(*p1 == '0' || *p2 == '0');
for (;;) {
for (;;) {
if (isdigit(*p1) && isdigit(*p2)) {
@@ -84,8 +84,9 @@ static int natstrcmp_digit(const char **ps1, const char **ps2)
}
} else {
/* strcmp/numcmp: Returns on end of either numbers */
- if (!diff)
- diff = isdigit(*p1) - isdigit(*p2);
+ const int diff_len = isdigit(*p1) - isdigit(*p2);
+ if (diff_len || !numcmp)
+ diff = diff_len;
return diff;
}
@@ -95,6 +96,8 @@ static int natstrcmp_digit(const char **ps1, const char **ps2)
/* strcmp: Returns on first mismatch */
if (!numcmp)
return diff;
+ ++p1;
+ ++p2;
}
}
$ make compare.o
CC compare.o
$ gcc -o test test.c compare.o -Wall -Wextra -pedantic -g
test.c: In function ‘main’:
test.c:5:14: warning: unused parameter ‘argc’ [-Wunused-parameter]
int main(int argc, char *argv[])
^~~~
$ ./test "22" "3"
"22" > "3"
$ make
LD qtest
$ ./qtest -f traces/trace-18-natsort.cmd
cmd> # Test of sort with natural sort - Date
[...]
q = [1999-3-3 2000-3-23 1999-12-25 2000-1-2 2000-1-10]
cmd> sort
q = [1999-3-3 1999-12-25 2000-1-2 2000-1-10 2000-3-23]
cmd> rh "1999-3-3"
Removed 1999-3-3 from queue
q = [1999-12-25 2000-1-2 2000-1-10 2000-3-23]
cmd> rh "1999-12-25"
Removed 1999-12-25 from queue
q = [2000-1-2 2000-1-10 2000-3-23]
cmd> rh "2000-1-2"
Removed 2000-1-2 from queue
q = [2000-1-10 2000-3-23]
cmd> rh "2000-1-10"
Removed 2000-1-10 from queue
q = [2000-3-23]
cmd> rh "2000-3-23"
Removed 2000-3-23 from queue
q = []
Freeing queue
看起來 OK 了。
$ ./qtest -f traces/trace-19-natsort.cmd
cmd> # Test of sort with natural sort - Fractions
[...]
q = [1.002.01 1.002.03 1.002.08 1.009.10 1.009.20 1.009.02 1.010.12 1.011.02 Fractional release numbers]
cmd> sort
q = [Fractional release numbers 1.002.01 1.002.03 1.002.08 1.009.02 1.009.10 1.009.20 1.010.12 1.011.02]
cmd> rh "1.002.01"
Removed Fractional release numbers from queue
ERROR: Removed value Fractional release numbers != expected value 1.002.01
q = [1.002.01 1.002.03 1.002.08 1.009.02 1.009.10 1.009.20 1.010.12 1.011.02]
cmd> rh "1.002.03"
Removed 1.002.01 from queue
ERROR: Removed value 1.002.01 != expected value 1.002.03
q = [1.002.03 1.002.08 1.009.02 1.009.10 1.009.20 1.010.12 1.011.02]
cmd> rh "1.002.08"
Removed 1.002.03 from queue
ERROR: Removed value 1.002.03 != expected value 1.002.08
q = [1.002.08 1.009.02 1.009.10 1.009.20 1.010.12 1.011.02]
cmd> rh "1.009.02"
Removed 1.002.08 from queue
ERROR: Removed value 1.002.08 != expected value 1.009.02
q = [1.009.02 1.009.10 1.009.20 1.010.12 1.011.02]
cmd> rh "1.009.10"
Removed 1.009.02 from queue
ERROR: Removed value 1.009.02 != expected value 1.009.10
q = [1.009.10 1.009.20 1.010.12 1.011.02]
Error limit exceeded. Stopping command execution
cmd> rh "1.009.20"
cmd> rh "1.010.12"
cmd> rh "1.011.02"
cmd> rh "Fractional release numbers"
diff --git a/compare.c b/compare.c
index 0aa7d28..0279caa 100644
--- a/compare.c
+++ b/compare.c
@@ -60,11 +60,11 @@ static int natstrcmp_digit(const char **ps1, const char **ps2)
if (!isdigit(**ps1)) {
*ps2 += strspn(*ps2, digit);
- return -1;
+ return 1;
}
if (!isdigit(**ps2)) {
*ps1 += strspn(*ps1, digit);
- return 1;
+ return -1;
}
p1 = *ps1;
寫反了。這是非數字部分的結束,所以開頭非數字者大。
$ ./qtest -f traces/trace-19-natsort.cmd
[...]
q = [1.002.01 1.002.03 1.002.08 1.009.10 1.009.20 1.009.02 1.010.12 1.011.02 Fractional release numbers]
cmd> sort
q = [1.002.01 1.002.03 1.002.08 1.009.02 1.009.10 1.009.20 1.010.12 1.011.02 Fractional release numbers]
cmd> rh "1.002.01"
Removed 1.002.01 from queue
q = [1.002.03 1.002.08 1.009.02 1.009.10 1.009.20 1.010.12 1.011.02 Fractional release numbers]
cmd> rh "1.002.03"
Removed 1.002.03 from queue
q = [1.002.08 1.009.02 1.009.10 1.009.20 1.010.12 1.011.02 Fractional release numbers]
cmd> rh "1.002.08"
Removed 1.002.08 from queue
q = [1.009.02 1.009.10 1.009.20 1.010.12 1.011.02 Fractional release numbers]
cmd> rh "1.009.02"
Removed 1.009.02 from queue
q = [1.009.10 1.009.20 1.010.12 1.011.02 Fractional release numbers]
cmd> rh "1.009.10"
Removed 1.009.10 from queue
q = [1.009.20 1.010.12 1.011.02 Fractional release numbers]
cmd> rh "1.009.20"
Removed 1.009.20 from queue
q = [1.010.12 1.011.02 Fractional release numbers]
cmd> rh "1.010.12"
Removed 1.010.12 from queue
q = [1.011.02 Fractional release numbers]
cmd> rh "1.011.02"
Removed 1.011.02 from queue
q = [Fractional release numbers]
cmd> rh "Fractional release numbers"
Removed Fractional release numbers from queue
q = []
Freeing queue
OK。
$ ./qtest -f traces/trace-20-natsort.cmd
cmd> # Test of sort with natural sort - Words
[...]
q = [pic 4 else pic 5 something pic 5 pic05 pic 5 pic 7 pic 6 pic02 pic01 x2-g8 x2-y08 x8-y8 x2-y7 1-2 1-02 10-20 pic02000 pic100 1-20 pic4 pic3 pic02a tom jane pic121 pic120 pic100a pic2 fred]
cmd> sort
q = [1-02 1-2 1-20 10-20 fred jane pic01 pic02a pic02 pic02000 pic05 pic2 pic3 pic4 pic 4 else pic 5 pic 5 pic 5 something pic 6 pic 7 pic100a pic100 pic120 pic121 tom x2-g8 x2-y08 x2-y7 x8-y8]
[...]
q = [pic02a pic02 pic02000 pic05 pic2 pic3 pic4 pic 4 else pic 5 pic 5 pic 5 something pic 6 pic 7 pic100a pic100 pic120 pic121 tom x2-g8 x2-y08 x2-y7 x8-y8]
cmd> rh "pic02"
Removed pic02a from queue
ERROR: Removed value pic02a != expected value pic02
q = [pic02 pic02000 pic05 pic2 pic3 pic4 pic 4 else pic 5 pic 5 pic 5 something pic 6 pic 7 pic100a pic100 pic120 pic121 tom x2-g8 x2-y08 x2-y7 x8-y8]
cmd> rh "pic02a"
Removed pic02 from queue
ERROR: Removed value pic02 != expected value pic02a
q = [pic02000 pic05 pic2 pic3 pic4 pic 4 else pic 5 pic 5 pic 5 something pic 6 pic 7 pic100a pic100 pic120 pic121 tom x2-g8 x2-y08 x2-y7 x8-y8]
[...]
q = [pic100a pic100 pic120 pic121 tom x2-g8 x2-y08 x2-y7 x8-y8]
cmd> rh "pic100"
Removed pic100a from queue
ERROR: Removed value pic100a != expected value pic100
q = [pic100 pic120 pic121 tom x2-g8 x2-y08 x2-y7 x8-y8]
cmd> rh "pic100a"
Removed pic100 from queue
ERROR: Removed value pic100 != expected value pic100a
q = [pic120 pic121 tom x2-g8 x2-y08 x2-y7 x8-y8]
發現在 natstrcmp_digit()
數字相同回傳 0
時,會漏掉一個字元的比較。
diff --git a/compare.c b/compare.c
index 0279caa..a4b7b6b 100644
--- a/compare.c
+++ b/compare.c
@@ -110,11 +110,12 @@ static int natstrcmp_base(const char *s1, const char *s2, bool ignore_case)
const char *p2 = s2;
while (*p1 || *p2) {
- int diff;
+ int diff = 0;
if (isdigit(*p1) || isdigit(*p2)) {
diff = natstrcmp_digit(&p1, &p2); /* Compare digits */
- } else {
+ }
+ if (!diff) {
const char c1 = (ignore_case) ? tolower(*p1) : *p1;
const char c2 = (ignore_case) ? tolower(*p2) : *p2;
diff = c1 - c2;
$ ./qtest -f traces/trace-20-natsort.cmd >/dev/null; echo $?
0
終於通過 natural comparison 的測試了。
試著放到 scripts/driver.py
中。
diff --git a/scripts/driver.py b/scripts/driver.py
index f0abc2c..bc33b60 100755
--- a/scripts/driver.py
+++ b/scripts/driver.py
@@ -35,7 +35,10 @@ class Tracer:
14: "trace-14-perf",
15: "trace-15-perf",
16: "trace-16-perf",
- 17: "trace-17-complexity"
+ 17: "trace-17-complexity",
+ 18: "trace-18-natsort",
+ 19: "trace-19-natsort",
+ 20: "trace-20-natsort",
}
traceProbs = {
@@ -55,10 +58,13 @@ class Tracer:
14: "Trace-14",
15: "Trace-15",
16: "Trace-16",
- 17: "Trace-17"
+ 17: "Trace-17",
+ 18: "Trace-18",
+ 19: "Trace-19",
+ 20: "Trace-20",
}
- maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5]
+ maxScores = [0, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 4, 4, 5]
RED = '\033[91m'
GREEN = '\033[92m'
$ make test
scripts/driver.py -c
--- Trace Points
[...]
+++ TESTING trace trace-16-perf:
# Test performance of sort with random and descending orders
# 10000: all correct sorting algorithms are expected pass
# 50000: sorting algorithms with O(n^2) time complexity are expected failed
# 100000: sorting algorithms with O(nlogn) time complexity are expected pass
--- trace-16-perf 6/6
+++ TESTING trace trace-17-complexity:
# Test if q_insert_tail and q_size is constant time complexity
Probably constant time
Probably constant time
--- trace-17-complexity 5/5
+++ TESTING trace trace-18-natsort:
# Test of sort with natural sort - Date
Will use natstrcasecmp for sorting.
--- trace-18-natsort 4/4
+++ TESTING trace trace-19-natsort:
# Test of sort with natural sort - Fractions
Will use natstrcasecmp for sorting.
--- trace-19-natsort 4/4
+++ TESTING trace trace-20-natsort:
# Test of sort with natural sort - Words
Will use natstrcasecmp for sorting.
--- trace-20-natsort 5/5
--- TOTAL 113/113
可以 commit
了。
$ git commit -a
[master e1517c2] Make `make test` test natural sort
4 files changed, 113 insertions(+), 3 deletions(-)
create mode 100644 traces/trace-18-natsort.cmd
create mode 100644 traces/trace-19-natsort.cmd
create mode 100644 traces/trace-20-natsort.cmd
dudect
實作探討dudect
?
“simulation”
初探 - 2020-02-29 17:14作業要求中有個:
修改排序所用的比較函式,變更為 natural sort,在 “simulation” 也該做對應的修改,得以反映出 natural sort 的使用。
“simulation”
是什麼意思?
用 rg
(ripgrep
) 找找看:
$ rg simulation
console.c
20:bool simulation = false;
102: add_param("simulation", (int *) &simulation, "Start/Stop simulation mode",
console.h
9:extern bool simulation;
qtest.c
254: if (simulation) {
256: report(1, "%s does not need arguments in simulation mode", argv[0]);
473: if (simulation) {
475: report(1, "%s does not need arguments in simulation mode", argv[0]);
dudect/constant.c
27:/* Implement the necessary queue interface to simulation */
traces/trace-17-complexity.cmd
2:option simulation 1
5:option simulation 0
發現在 qtest.c
下的 do_insert_tail()
與 do_size()
在 simulation
成立時,會去呼叫 is_*_const()
函數。
$ rg 'is_\w+_const'
qtest.c
259: bool ok = is_insert_tail_const();
478: bool ok = is_size_const();
dudect/fixture.c
156:bool is_insert_tail_const(void)
177:bool is_size_const(void)
dudect/fixture.h
8:bool is_insert_tail_const(void);
9:bool is_size_const(void);
進入 dudect/fixture.c
,發現 is_*_const()
函數都會去重複呼叫 doit()
,而 doit()
呼叫的 measure()
名字看來很可疑。
$ rg -w measure
dudect/constant.h
37:void measure(int64_t *before_ticks,
dudect/constant.c
55:void measure(int64_t *before_ticks,
dudect/fixture.c
136: measure(before_ticks, after_ticks, input_data, mode);
進入 dudect/constant.c
,發現裡面有一堆 dut_*()
函數。
$ rg 'dut_'
dudect/constant.c
64: dut_new();
65: dut_insert_head(
69: dut_insert_tail(s, 1);
71: dut_free();
75: dut_new();
76: dut_insert_head(
80: dut_size(1);
82: dut_free();
dudect/constant.h
5:#define dut_new() \
10:#define dut_size(n) \
16:#define dut_insert_head(s, n) \
23:#define dut_insert_tail(s, n) \
30:#define dut_free() \
發現是 #define
開頭的,原來是 function-like macros。
進入 dudect/constant.h
:
#define dut_new() \
{ \
q = q_new(); \
}
#define dut_size(n) \
do { \
for (int __iter = 0; __iter < n; ++__iter) \
q_size(q); \
} while (0);
#define dut_insert_head(s, n) \
do { \
int j = n; \
while (j--) \
q_insert_head(q, s); \
} while (0);
#define dut_insert_tail(s, n) \
do { \
int j = n; \
while (j--) \
q_insert_tail(q, s); \
} while (0);
#define dut_free() \
{ \
q_free(q); \
}
既然已經用 do {} while (0)
包起來了,就不應該有後面的 ;
了。
因為當寫程式者以為它是函數時,會用如 dut_size(42);
的方法使用這個 macro,而這樣就會造成多出的 ;
。
請送 pull request 過來。
:notes: jserv
註 - 2020-03-12 17:40
已送:https://github.com/sysprog21/lab0-c/pull/39
如果用如
if (cond)
dut_size(42);
else
dut_free();
的方法使用的話,會展開成:
if (cond)
do { \
for (int __iter = 0; __iter < 42; ++__iter) \
q_size(q); \
} while (0);;
else
{ \
q_free(q); \
};
(行尾的 \
是為排版而加,實際上展開後應已併為單行而不存在)
寫程式者自己加上的 ;
已經在 if
block 外面,後面的 else
不會對應前面的 if (cond)
,而是更外層的 if
;如果這已經是最外層的 if
的話,就會產生語法錯誤。
而底下的 else
後的寫程式者自己加上的 ;
,也不屬於 else
block。
而 macros dut_new()
與 dut_free()
完全可以寫成 expression 的形式,可以完美避免這種 statement 才會遇到的問題:
#define dut_new() \
( \
(void)(q = q_new()) \
)
#define dut_free() \
( \
(void)(q_free(q)) \
)
加上 (void)
,是避免有人使用例如 q_size dut_new()
(會展開成 q_size (q = q_new())
) 這種看似不合 C/C++ 語法而令人困惑的方法呼叫函數。
而將轉成 void
的 expression 用 ()
括起,是避免有人使用例如 dut_new()->head++
(會展開成 (void)(q = q_new())->head++
) 的方式繞過轉型成 void
後不能操作的限制,以免這人改用 ++dut_new()->head
(會展開成 ++(void)(q = q_new())->head
) 時發生編譯錯誤而抱怨。
Dude, is my code constant time?
內容整理