2022q1 Homework3 (quiz3)

contributed by <yaohwang99 >

quiz3

Problem 1

In Linux kernel,GENMASK gernerates bitmask for the given range.

GENMASK(6, 4) produces 01110000₂
GENMASK(39, 21) produces 0x000000FFFFE00000

For the above 2 examples to be true, we can deduct that LEFT is 63 - h and RIGHT is l.

To clear the more significant bits, LEFT must be 63 - h.
To clear the less significant bits, RIGHT is l.

#define GENMASK(h, l) \
    (((~0UL) >> (LEFT)) & ((~0UL) >> (l) << (RIGHT)))

However, I believe that the code is not correct because:

GENMASK(63, 8) produces 0xFFFFFFFFFFFFFFFF & 0x00FFFFFFFFFFFF00 == 0x00FFFFFFFFFFFF00 which is not the correct result.

I believe the following code is correct.

#define GENMASK(h, l) \
    (((~0UL) >> (63-h) & ((~0UL) << l)))

Problem 2

Memory alignment

struct foo;
  
struct fd {
    struct foo *foo;
    unsigned int flags;
};

enum {
    FOO_DEFAULT = 0,
    FOO_ACTION,                           
    FOO_UNLOCK,
} FOO_FLAGS;

static inline struct fd to_fd(unsigned long v)
{
    return (struct fd){EXP1, v & 3};
}

The function to_fd use the parameter v , align down to 4, and use it as the address of *foo.
Align down to 4 means that we clear the last 2 bits of v, therefore, EXP1 is v & ~3.

Problem 3








#include <stdint.h>
uint8_t rev8(uint8_t x)
{
    x = (x >> 4) | (x << 4);               
    x = ((x & 0xCC) >> 2) | (EXP2);
    x = ((x & 0xAA) >> 1) | (EXP3);
    return x;
}

0xCC = 11001100₂
0xAA = 10101010₂
line 4 swaps the most significant 4 bits with the less significant 4 bits
line 5 swaps every 2 bits
line 6 swaps every bit.
EXP2: x & 0x33) << 2
EXP3: x & 0x55) << 2

Problem 4

    int i;
    foreach_int(i, 0, -1, 100, 0, -99) {
        printf("i is %i\n", i);
    }
    const char *p;
    foreach_ptr(p, "Hello", "world") {
        printf("p is %s\n", p);
    }

#include <assert.h>
#define _foreach_no_nullval(i, p, arr) \
    assert((i) >= sizeof(arr) / sizeof(arr[0]) || (p))

#define foreach_int(i, ...)                                            \
    for (unsigned _foreach_i = (((i) = ((int[]){__VA_ARGS__})[0]), 0); \
         _foreach_i < sizeof((int[]){__VA_ARGS__}) / sizeof(int);      \
         (i) = ((int[]){__VA_ARGS__, 0})[EXP4])

#define foreach_ptr(i, ...)                                                 \
    for (unsigned _foreach_i =                                              \
             (((i) = (void *) ((typeof(i)[]){__VA_ARGS__})[0]), 0); \
         (i); (i) = (void *) ((typeof(i)[]){__VA_ARGS__,            \
                                                    NULL})[EXP5],   \
                  _foreach_no_nullval(_foreach_i, i,                        \
                                      ((const void *[]){__VA_ARGS__})))

From C99 Standard

6.5.17 Comma operator
Syntax

expression:
+ assignment-expression
+ expression , assignment-expression
Semantics
The left operand of a comma operator is evaluated as a void expression; there is a sequence point between its evaluation and that of the right operand. Then the right operand is evaluated; the result has its type and value.
For example, the function f(a, (t=3, t+2), c) has three arguments, the second of which has the value 5.

unsigned _foreach_i = (((i) = ((int[]){__VA_ARGS__})[0]), 0) assigns ((int[]){__VA_ARGS__})[0]) to i, then assign the second 0 to _foreach_i

Just like a common for(int i = 0; i < n; ++i) , we can deduct that EXP4 and EXP5 is ++_foreach_1.

Problem 5

#include <limits.h>
int divide(int dividend, int divisor)
{
    int signal = 1;
    unsigned int dvd = dividend;
    if (dividend < 0) {
        signal *= -1;
        dvd = ~dvd + 1;
    }

    unsigned int dvs = divisor;
    if (divisor < 0) {
        signal *= -1;
        dvs = ~dvs + 1;
    }

    int shift = 0;
    while (dvd > (EXP6))
        shift++;

    unsigned int res = 0;
    while (dvd >= dvs) {
        while (dvd < (dvs << shift))
            shift--;                         
        res |= (unsigned int) 1 << shift;
        EXP7;
    }

    if (signal == 1 && res >= INT_MAX)
        return INT_MAX;
    return res * signal;
}

Problem 6






























    for (i = 0; i < pointsSize; i++) {
        for (j = i + 1; j < pointsSize; j++) {
            if (points[i].x == points[j].x)
                hori_cnts[i]++, hori_cnts[j]++;

            if (points[i].y == points[j].y)
                vert_cnts[i]++, vert_cnts[j]++;

            if (points[i].x == points[j].x && points[i].y == points[j].y)
                dup_cnts[i]++, dup_cnts[j]++;

            if (points[i].x != points[j].x && points[i].y != points[j].y) {
                int dx = points[j].x - points[i].x;
                int dy = points[j].y - points[i].y;
                int tmp = gcd(dx, dy);
                dx /= tmp;
                dy /= tmp;
                int hash = dx * dy - 1333 * (dx + dy);
                if (hash < 0)
                    hash = -hash;
                hash %= slope_size;
                if (can_insert(&heads[hash], i, j)) {
                    struct point_node *pn = malloc(sizeof(*pn));
                    pn->p1 = i;
                    pn->p2 = j;
                    EXP9;
                }
            }
        }
    }

Line 3 to 7 counts the number of points in a vertical line and horizontal line for a given point. For example, if hori_cnts[1] is 5, that means points[1] is in a horizontal line consist of other 5 points and itself.
Line 9 and 10 counts the number of duplicated points for a given point. For example, if dup_cnts[2] is 2, that means there are 2 other identical points.
Line 12 to 28 uses different slope as hash and put the points into the hash. Therefore, EXP9 is list_add(pn, &heads[hash])

In line 22, we first check if we can insert the pair.

static bool can_insert(struct list_head *head, int p1, int p2)
{
    struct point_node *pn;
    list_for_each_entry (pn, head, link)
        return EXP8;
    return true;
}

To deduct EXP8, consider the following example.

The traverse order is <points[1], points[2]>

\to

<points[1], points[3]>

\to

<points[1], points[4]>

\to

,<points[2], points[3]>

\to

<points[2], points[4]>

\to

<points[3], points[4]>. To count the correct number of points in the line, we need each pair consist of the point with the smallest index(because of the traverse order). Therefore, we check if we can insert by EXP8: p1 == pn->p1.

Problem 7

If v > 0, then ret must be greater or equal to 1.
Check if any of the 32 more significant bits are set.
If so, then ret must be greater then 32.
Set v as the remaining 32 bits.

int ilog32(uint32_t v)
{
    int ret = v > 0;
    int m = (v > 0xFFFFU) << 4;
    v >>= m;
    ret |= m;

Check if any of the 16 more significant bits are set.
If so, then ret must be greater then 16.
Set v as the remaining 16 bits.
Repeat the same process for 8,4,2 more significant bits.
Therefore, EXP10 is (v > 0x3) << 1

    m = (v > 0xFFU) << 3;
    v >>= m;
    ret |= m;
    m = (v > 0xFU) << 2;
    v >>= m;
    ret |= m;
    m = EXP10;
    v >>= m;
    ret |= m;

v then becomes 01₂, 10₂,or 11₂
We will need one more bit if v > 1, therefore, EXP11 is ret += v > 1


    EXP11;
    return ret;
}

Problem 8

Use the technique of indirect pointer.

void remove_data(tree &t, int d)
{
    tnode **p = &t;
    while (*p != 0 && (*p)->data != d) {
        if (d < (*p)->data)
            EXP12;
        else
            EXP13;
    }

Traverse to left or right child according to current value.
EXP12: p = &(*p)->left
EXP13: p = &(*p)->right

For a binary tree, after removing the target, move the smallest child element in the right subtree to target, or move the largest element in the left subtree to target.
In the following code, we move the smallest child element in the right subtree to target.

    tnode *q = *p;
    if (!q)
        return;

    if (!q->left)
        *p = q->right;
    else if (!q->right)
        *p = q->left;
    else {
        tnode **r = &q->right;
        while ((*r)->left)
            r = EXP14;
        q->data = (*r)->data;
        q = *r;
        *r = q->right;
    }
    delete q;
}

Therefore, EXP14 is r = &(*r)->left

Problem 9

/* maximum alignment needed for any type on this platform, rounded up to a
   power of two */
#define MAX_ALIGNMENT 16

/* Given a size, round up to the next multiple of sizeof(void *) */
#define ROUND_UP_TO_ALIGNMENT_SIZE(x) \
    (((x) + MAX_ALIGNMENT - MMM) & ~(NNN))

Using the fact that

⌈ \frac{a}{b} ⌉ = ⌊ \frac{a}{b} + \frac{b - 1}{b} ⌋

x round up to 16

= ⌈ \frac{x}{16} ⌉ * 16 = ⌊ \frac{x + 15}{16} ⌋ * 16

Therefore, MMM is 1, NNN is MAX_ALIGNMENT-1

Problem 10

#define DIVIDE_ROUND_CLOSEST(x, divisor)                       \
    ({                                                         \
        typeof(x) __x = x;                                     \
        typeof(divisor) __d = divisor;                         \
        (((typeof(x)) -1) > 0 || ((typeof(divisor)) -1) > 0 || \
         (((__x) > 0) == ((__d) > 0)))                         \
            ? ((RRR) / (__d))                  \
            : ((SSS) / (__d));                 \
    })

r o u n d (\frac{a}{b}) = ⌊ \frac{a + \frac{b}{2}}{b} ⌋

x or divisor is unsigned
x and divisor is both positive or negetive
then return (__x)+(__d)>>1
otherwise, return -(__x)+(__d)>>1

Therefore, RRR is (__x)+(__d)>>1, SSS is -(__x)+(__d)>>1.

Problem 11

fls() returns the position of the most significant bit.

Set the current value to 63, the greatest possible number.
If all of the 32 more significant bit is 0, the reult is 32 less than the current number.
Repeat step 2. for 16, 8, 4, 2, 1, return the final number.

static inline unsigned long fls(unsigned long word)
{
    int num = 64 - 1;
        
    if (!(word & (~0ul << 32))) {
        num -= 32;
        word <<= 32;
    }
    if (!(word & (~0ul << (64 - 16)))) {
        num -= 16;
        word <<= 16;
    }
    if (!(word & (~0ul << (64 - 8)))) {
        num -= 8;
        word <<= 8;
    }
    if (!(word & (~0ul << (64 - 4)))) {
        num -= 4;
        word <<= 4;
    }   
    if (!(word & (~0ul << (64 - 2)))) {
        num -= 2;
        word <<= 2;
    }   
    if (!(word & (~0ul << (64 - 1))))
        num -= 1;
    return num;
}

The following code is used fore calculating

⌊ \sqrt{x} ⌋

unsigned long i_sqrt(unsigned long x)
{
    unsigned long b, m, y = 0;

    if (x <= 1)
        return x;

    m = 1UL << (fls(x) & ~1UL);
    while (m) {
        b = y + m;
        XXX;

        if (x >= b) {
            YYY;
            y += m;
        }
        ZZZ;
    }

    return y;
}

To understand how the code works, we need to know how to calculate the square root of a binary number.
source

We calculate the answer every two digits.
y corresponds the leftmost column.
m = 1UL << (fls(x) & ~1UL); the return value of fls() is rounded down to even number.

For the above example,

x = 169_{1} 0 = 10101001_{2}

f l s (x) = 7

m = 1000000_{2}

y of the next iteration is equal to (y+m)>>1 if x > y + m.
Therefore,XXX is y >>= 1, YYY is x -= b, ZZZ is m >>= 2.