2024q1 Homework2 (quiz1+2)

contributed by <Howard Chen>

第一周測驗

測驗 1:

This function was used to calcaute the length of the list by traversing from head to the end whcih is NULL.

@@ -61,7 +61,7 @@ int list_length(node_t **left)
     int n = 0;
     while (*left) {
         ++n;
-        left = &(BBBB);
+        left = &((*left)->next);
     }
     return n;
 }

This function was implemented to return the last node of the list. Since this is a singly linked list, the last node was linked with NULL. We have to traverse until the end to get the last node.

@@ -52,7 +52,7 @@
 node_t *list_tail(node_t **left)
 {
     while ((*left) && (*left)->next)
-        left = &(AAAA);
+        left = &((*left)->next);
     return *left;
 }

This part of the code is responsible for partitioning the sublist into two smaller sublists (left and right). It iterates through each node in the sublist p and adds it to either the left or right sublist based on its value compared to the pivot value. This partitioning step divide the list into smaller parts for sorting.

@@ -107,16 +112,16 @@ void quick_sort(node_t **list)

            while (p) {
                node_t *n = p;
-                p = CCCC;
+                p = p->next;
                list_add(n->value > value ? &right : &left, n);
            }

In below code segment, after partitioning the sublist into left, pivot, and right sublists, the pointers begin and end are updated to reflect the boundaries of these sublists. Additionally, the pointers left and right are reset to NULL, and the loop counter i is incremented by 2 to move to the next level of sublists.

begin[i] = left and end[i] = list_tail(&left): Set the beginning and ending nodes of the left sublist.
begin[i + 1] = pivot and end[i + 1] = pivot: Set the beginning and ending nodes of the pivot sublist (which is essentially just the pivot node).
begin[i + 2] = right and end[i + 2] = list_tail(&right): Set the beginning and ending nodes of the right sublist.
left = right = NULL: Reset the left and right pointers to NULL to prepare for the next iteration.

             begin[i] = left;
-            end[i] = DDDD;
+            end[i] = list_tail(&left);;
             begin[i + 1] = pivot;
             end[i + 1] = pivot;
             begin[i + 2] = right;
-            end[i + 2] = EEEE;
+            end[i + 2] = list_tail(&right);;

測驗 2:

Timsort 程式運作原理：

The main spriti of timsort is that it divides the data into smaller chunks called "runs" and then uses a combination of insertion sort and merge sort to sort these runs efficiently.

About weekly test

During the weekly test, Google sheet deteremined my answer for BBBB and CCCC were incorrect. However, the real answers of &a->next and &b->next meaning the same expression as what I filled as below. They're all aim to assign the address of a->next and b->next to tail since tail is a pointer that pointed to the address of head.

         if (cmp(priv, a, b) <= 0) {
             *tail = a;
-            tail = BBBB;
+            tail = &(a->next);
             a = a->next;
             if (!a) {
                 *tail = b;

@@ -130,7 +130,7 @@ static struct list_head *merge(void *priv,
             }
         } else {
             *tail = b;
-            tail = CCCC;
+            tail = &(b->next);
             b = b->next;
             if (!b) {
                 *tail = a;

Merge timsort to lab0-c

In the original timsort code, the cmp funtion was comparing integer number. However, we need to modified to compare char* in order to used timsort in lab0-c
Commit: Add support to return queue size
Commit: Complete basic functions for make test pass

-    int res = list_entry(b, element_t, list)->value -
-              list_entry(a, element_t, list)->value;
+    int res = strcmp(list_entry(a, element_t, list)->value,
+                     list_entry(b, element_t, list)->value);

Comparing timsort and q_sort performance

We use perf command to help analysis the performance and cache-miss of q_sort and timsort. In order to use the same data, we copy trace-15-perf.cmd and modified the sort command to timsort

sudo perf stat -r 100 -e cache-misses,cache-references,instructions,cycles ./qtest -f ./traces/trace-tim15.cmd

We could find that by using timsort, we could have less cache miss and better performance. The outcome confirms that Timsort exhibits a characteristic advantageous for memory locality.

Because Timsort divides the input data into small blocks called "runs." These runs represent contiguous subsequences of the input data that are already partially sorted. By sorting small chunks of data at a time, Timsort improves memory locality because it increases the likelihood that the elements being accessed during sorting are stored close together in memory.

`q_sort`

 Performance counter stats for './qtest -f ./traces/trace-15-perf.cmd' (100 runs):

           1870609      cache-misses              #    4.760 % of all cache refs      ( +-  1.42% )
          39672133      cache-references                                              ( +-  0.23% )
         772755378      instructions              #    0.97  insn per cycle           ( +-  0.00% )
         795976486      cycles                                                        ( +-  0.14% )

          0.170038 +- 0.000270 seconds time elapsed  ( +-  0.16% )

`timsort`

 Performance counter stats for './qtest -f ./traces/trace-tim15.cmd' (100 runs):

           1563182      cache-misses              #    4.017 % of all cache refs      ( +-  1.78% )
          39160639      cache-references                                              ( +-  0.24% )
         772729482      instructions              #    0.99  insn per cycle           ( +-  0.00% )
         772852364      cycles                                                        ( +-  0.16% )

          0.167184 +- 0.000283 seconds time elapsed  ( +-  0.17% )

第二周測驗

測驗 1

透過前序與中序建構樹程式運作原理

hlist_node: This structure represents a node in a hash list.
- next: Points to the next node in the hash list.
- pprev: Points to the pointer that points to this node. In other words, it points to the previous node's next, which allows efficient removal of nodes from the list without traversing the list from the beginning.
hlist_head: This structure represents the head of a hash list.
- first: Points to the first node in the hash table.
order_node: This structure represents a node containing ordered information.
- node: An instance of the hlist_node structure, allowing it to be part of a hash list.
- val: Holds the value of the node.
- idx: Holds the index of the node.

struct hlist_node {
    struct hlist_node *next, **pprev;
};

struct hlist_head {
    struct hlist_node *first;
};

struct order_node {
    struct hlist_node node;
    int val;
    int idx;
};

Image Not Showing Possible Reasons

The image was uploaded to a note which you don't have access to
The note which the image was originally uploaded to has been deleted

Learn More →

The hlist_add_head function is used to add a node to the head of a hash list.

static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    if (h->first)
        h->first->pprev = &n->next;
    n->next = h->first;
    n->pprev = &h->first;
    h->first = n;
}

+ n->next = h->first;
- n->next = AAAA;

The find funtion is used to find the index of root from indorer list

static int find(int num, int size, const struct hlist_head *heads)
{
    struct hlist_node *p;
    int hash = (num < 0 ? -num : num) % size;
    hlist_for_each (p, &head[hash]) {
        struct order_node *on = container_of(p, struct order_node, node);
        if (num == on->val)
            return on->idx;
    }
    return -1;
}





+ struct order_node *on = container_of(p, struct order_node, node);
- struct order_node *on = CCCC(p, struct order_node, node);

+ hlist_for_each (p, &head[hash]) {
- hlist_for_each (p, BBBB) {

`dfs`

Checks if the current subtree bounds are valid.
Allocates memory for a new TreeNode and sets its value to the first element in the preorder traversal.
Finds the index of the root node in the inorder list
Recursively constructs the left subtree:
- Updates the subtree bounds and recursively calls dfs for the left subtree.
Recursively constructs the right subtree:
- Updates the subtree bounds and recursively calls dfs for the right subtree.
Returns the root node of the constructed binary tree.

static struct TreeNode *dfs(int *preorder,
                            int pre_low,
                            int pre_high,
                            int *inorder,
                            int in_low,
                            int in_high,
                            struct hlist_head *in_heads,
                            int size)
{
    if (in_low > in_high || pre_low > pre_high)
        return NULL;

    struct TreeNode *tn = malloc(sizeof(*tn));
    tn->val = preorder[pre_low];
    int idx = find(preorder[pre_low], size, in_heads);
    tn->left = dfs(preorder, pre_low + 1, pre_low + (idx - in_low), inorder,
                   in_low, idx - 1, in_heads, size);
    tn->right = dfs(preorder, pre_high - (in_high - idx - 1), pre_high, inorder,
                    idx + 1, in_high, in_heads, size);
    return tn;
}

The meaning of`pre_low + (idx - in_low)`:

The expression is used to determine the range of elements in the preorder list that correspond to the left subtree of the current node being constructed.

Breakdown of the Expression:

pre_low: Index of the root node in the preorder list.
idx: Index of the root node in the inorder list.
in_low: Starting index of the current range in the inorder list.

Explanation:

In the inorder list, elements to the left of the root node (from index in_low to idx - 1) belong to the left subtree of the current node.
To find the corresponding range of elements in the preorder list for the left subtree:
- The left subtree in the preorder list immediately follows the root node.
- The number of elements in the left subtree of the current node is determined by the difference between the index of the root node in the inorder list (idx) and the starting index of the current range in the inorder list (in_low).
- Therefore, idx - in_low gives us the count of elements in the left subtree. Adding this count to the index of the root node in the preorder list (pre_low) gives us the index of the first element of the left subtree in the preorder list.

The meaning of `pre_high - (in_high - idx - 1)`

Same as previous, this expression is aim for the right subtree.

The right subtree in the preorder list is located after the left subtree and the root node.
We need to consider the number of elements in the right subtree of the current node, which is determined by the difference between the index of the last element in the inorder list (in_high) and the index of the root node (idx), plus 1 to include the root node.
Subtracting this count from the index of the last element in the preorder list (pre_high) gives us the index of the first element of the right subtree in the preorder list.

`node_add`

This function add the node to the hash list and count the hash key using it's value and the list size.

static inline void node_add(int val,
                            int idx,
                            int size,
                            struct hlist_head *heads)
{
    struct order_node *on = malloc(sizeof(*on));
    on->val = val;
    on->idx = idx;
    int hash = (val < 0 ? -val : val) % size;
    hlist_add_head(&on->node, &heads[hash]);
}

+ hlist_add_head(&on->node, &heads[hash]);
- hlist_add_head(&on->node, DDDD);

`buildTree`

First of all, build a hash table for inorder list. Then dfs is used to build the tree

static struct TreeNode *buildTree(int *preorder,
                                  int preorderSize,
                                  int *inorder,
                                  int inorderSize)
{
    struct hlist_head *in_heads = malloc(inorderSize * sizeof(*in_heads));

    for (int i = 0; i < inorderSize; i++)
        INIT_HLIST_HEAD(&in_heads[i]);

    for (int i = 0; i < inorderSize; i++)
        node_add(inorder[i], i, inorderSize, in_heads);

    return dfs(preorder, 0, preorderSize - 1, inorder, 0, inorderSize - 1,
               in_heads, inorderSize);
}

測驗 2

LRU 資料結構:

The LRUCache and LRUNode structures are used together to implement a Least Recently Used (LRU) cache.

typedef struct {
    int capacity;
    int count;
    struct list_head dhead;
    struct hlist_head hhead[];
} LRUCache;

typedef struct {
    int key;
    int value;
    struct hlist_node node;
    struct list_head link;
} LRUNode;

`LRUCache`

LRUCache represents the main cache structure.
It contains the following members:
- capacity: The maximum capacity of the cache.
- count: The current number of elements in the cache.
- dhead: The head of a doubly linked list used to maintain the order of elements based on their usage.
- hhead[]: An array of hash lists used for quick lookup of elements by their key.

`LRUNode`

LRUNode represents a node in the cache.
Each node contains the following members:
- key: The key associated with the node.
- value: The value associated with the key.
- node: An instance of hlist_node used to link nodes within a hash list.
- link: An instance of list_head used to link nodes within the doubly linked list maintained by the LRUCache.
Each LRUNode is stored in both a hash list and the doubly linked list.
The hash list provides fast access to nodes based on their key, while the doubly linked list maintains the order of nodes based on their usage.

Reference from marvin0102 figure to illustrate the relationship of LRUCache and LRUNode as above.

The lRUCacheFree function is responsible for freeing the memory allocated for the LRUCache object and its associated LRUNodes. It iterates through the doubly linked list of LRUNodes, deletes each node from the list, frees the memory associated with it, and finally frees the memory occupied by the LRUCache object itself.

void lRUCacheFree(LRUCache *obj)
{
    struct list_head *pos, *n;
    list_for_each_safe (pos, n, &obj->dhead) {
-       LRUNode *cache = list_entry(pos, LRUNode, FFFF);
+       LRUNode *cache = list_entry(pos, LRUNode, link);
-       list_del(GGGG);
+       list_del(&(cache->link));
        free(cache);
    }
    free(obj);
}

The lRUCacheGet function is used to retrieve the value associated with a given key from the LRUCache object.

It calculates the hash value for the key using the modulo operation % with the capacity of the LRUCache.
It then iterates through the hash list associated with the computed hash value.
Within the loop, it obtains each LRUNode corresponding to the current hash value using the list_entry.
It checks if the key of the current LRUNode matches the key being searched for.
- If a match is found, it moves the LRUNode to the front of the doubly linked list using the list_move function to indicate that it was recently accessed and returns the value associated with the key.
If no matching key is found, it returns -1 indicating that the key was not found in the LRUCache.

@@ -140,9 +140,9 @@ int lRUCacheGet(LRUCache *obj, int key)
     int hash = key % obj->capacity;
     struct hlist_node *pos;
     hlist_for_each (pos, &obj->hhead[hash]) {
-        LRUNode *cache = list_entry(pos, LRUNode, HHHH);
+        LRUNode *cache = list_entry(pos, LRUNode, node);
         if (cache->key == key) {
-            list_move(IIII, &obj->dhead);
+            list_move(&(cache->link), &obj->dhead);
             return cache->value;
         }
     }

The lRUCachePut function is used to insert a key-value pair into the LRUCache.

It calculates the hash value for the key using the modulo operation % with the capacity of the LRUCache and iterates through the hash list associated with the computed hash value to check if the key already exists in the cache.
If the key exist:
- If the key already exists, it moves the LRUNode to the front of the doubly linked list to indicate that it was recently accessed.
- It updates the cache pointer with the found LRUNode.
If the key does not exist:
- If the cache is full, it removes the LRUNode associated with the least recently used key from the cache by moving it to the front of the doubly linked list and updating the hash list.
- It creates a new LRUNode for the key-value pair, adds it to the cache, and updates the count.
Finally, it sets the value of the LRUNode to the specified value.

@@ -155,9 +155,9 @@ void lRUCachePut(LRUCache *obj, int key, int value)
     int hash = key % obj->capacity;
     struct hlist_node *pos;
     hlist_for_each (pos, &obj->hhead[hash]) {
-        LRUNode *c = list_entry(pos, LRUNode, JJJJ);
+        LRUNode *c = list_entry(pos, LRUNode, node);
         if (c->key == key) {
-            list_move(KKKK, &obj->dhead);
+            list_move(&c->link, &obj->dhead);
             cache = c;
         }
     }

測驗 3

sz % BITS_PER_LONG calculates the remainder when sz is divided by BITS_PER_LONG. It determines if there are any remaining bits after dividing sz into groups of BITS_PER_LONG.
The result of this expression is used as a condition to check if there are any remaining bits in the last word. If there are remaining bits, the macro performs additional operations to handle the last word separately.

@@ -49,7 +49,7 @@ static inline unsigned long hweight_long(unsigned long w)
             nr -= w;                                            \
         }                                                       \
                                                                 \
-        if (sz CCCC BITS_PER_LONG)                              \
+        if (sz % BITS_PER_LONG)                              \
             tmp = (FETCH) & BITMAP_LAST_WORD_MASK(sz);          \
     found:                                                      \
         sz = min(idx * BITS_PER_LONG + fns(tmp, nr), sz);       \

The __ffs function efficiently finds the position of the first set (non-zero) bit in a 64-bit word. The function first checks if the lower 32 bits of the word are all zeros. This is done by bitwise AND operation with 0xffffffff, which isolates the lower 32 bits.

@@ -67,7 +67,7 @@ static inline unsigned long __ffs(unsigned long word)
     int num = 0;

 #if BITS_PER_LONG == 64
-    if ((word & AAAA) == 0) {
+    if ((word & 0xffffffff) == 0) {
         num += 32;
         word >>= 32;
     }

__clear_bit function calculate the mask for the specific bit to be cleared using BIT_MASK(nr). This creates a bitmask with all bits set to 1 except for the bit at position nr, which is set to 0. For the address, it calculates the address of the word (a group of bits) containing the bit to be cleared. This is done by adding the word index (determined by BIT_WORD(nr)) to the base address addr. In the end, it perform a bitwise AND operation between the value pointed to by p and the complement of the mask (~mask). This clears the bit at position nr in the word.

@@ -98,7 +98,7 @@ static inline void __clear_bit(unsigned long nr, volatile unsigned long *addr)
     unsigned long mask = BIT_MASK(nr);
     unsigned long *p = ((unsigned long *) addr) + BIT_WORD(nr);

-    *p &= BBBB;
+    *p &= ~mask;
 }

2024q1 Homework2 (quiz1+2)

第一周測驗

測驗 1:

測驗 2:

Timsort 程式運作原理：

About weekly test

Merge timsort to lab0-c

Comparing timsort and q_sort performance

q_sort

timsort

第二周測驗

測驗 1

透過前序與中序建構樹程式運作原理

dfs

The meaning ofpre_low + (idx - in_low):

Breakdown of the Expression:

Explanation:

The meaning of pre_high - (in_high - idx - 1)

node_add

buildTree

測驗 2

LRU 資料結構:

LRUCache

LRUNode

測驗 3

Read more

2022q1 Homework1 (lab0)

linux2024-homework5

2024q1 Homework6 (integration)

2024q1 Homework4 (quiz3+4)