Yu Jen Lin
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    # 2020q3 Homework3 (quiz3) contributed by < `sammer1107` > ###### tags: `進階電腦系統理論與實作` > [測驗題](https://hackmd.io/@sysprog/2020-quiz3) ## 測驗 1 ```cpp int asr_i(signed int m, unsigned int n) { const int logical = (((int) -1) OP1) > 0; unsigned int fixu = -(logical & (OP2)); int fix = *(int *) &fixu; return (m >> n) | (fix ^ (fix >> n)); } ``` + **`logical`**: 首先 `-1` 對應到 `0xFFFFFFFF` 也就是二進位的全 1。如果我們將他右移,可作為判斷右移實作的根據。 + 要是編譯器的實作會複製 sign bit,結果則為 `0xFFFFFFFF`=`-1` + 要是編譯器的實作只會補 0 ,則結果為 `0x7FFFFFFF`,為正的最大值 所以透過判斷唯一之後是否大於 0,就知道編譯器的實作種類。根據上面推論,**OP1** 為 `>> 1`,且 logical 為 `1` (需要把高位改成1)或 `0`(不需要修正)。 + **`fixu`**: 除了判斷編譯器實作,我們也需要判斷 `m` 是否為負,因為只有負數會因為 shift 的實作會有不同結果。故合理的答案為 **`OP2`** = `m < 0`,觀察這兩種情況: + `m >= 0`: 故 `-(logical & m<0)` 得到 0 + `m < 0`: 故 `-(logical & m<0)` 得到 + `logical == 1`: 得到二進位的全 1 。 + `logical == 0`: 得到 0 。 + 故只有當 `m` 為負,且實作需要修正我們會得到全 1 ,其他都是 0 代表不需修正。 + 這裡我們把 `-1` 轉成 unsigned,根據 c99 - **6.3.1.3 Signed and unsigned integers** 是會得到全 1 沒錯。 > Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. + **`fix`**: 只是把 fixu 再轉成有號數,值為 -1,不過**為什麼還要 fixu 不直接使用 fix 就好了?** 這我想不到 + 假設現在需要修正,所以 `fix` 為 `0xFFFFFFFF`則 + `fix` 長這樣(簡化長度為 8) ```graphviz graph { node [shape=none]; a [label=< <TABLE BORDER="0" CELLBORDER="1" cellspacing="0"> <TR> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> </TR> </TABLE>>] } ``` + `fix >> n` 長這樣(假設 n = 2 ) ```graphviz graph { node [shape=none]; a [label=< <TABLE BORDER="0" CELLBORDER="1" cellspacing="0"> <TR> <TD>0</TD> <TD>0</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> <TD>1</TD> </TR> </TABLE>>] } ``` + 故 `fix ^ (fix >> n)` 為 ```graphviz graph { node [shape=none]; a [label=< <TABLE BORDER="0" CELLBORDER="1" cellspacing="0"> <TR> <TD>1</TD> <TD>1</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> <TD>0</TD> </TR> </TABLE>>] } ``` + 我們的 `m` 假如也右移 2 ,則會空出兩個 0 在左邊,所以 `m >> 2 | fix ^ (fix >> n)` 就是正確結果了。 ## 測驗 2 ```cpp bool isPowerOfFour(int num) { return num > 0 && (num & (num - 1))==0 && !(__builtin_ctz(num) OPQ); } ``` + 我們首先要確定 `num > 0`,因為要是 `num = 0`,後面都會成立,但 0 不是 power-of-four + **`(num & (num - 1))==0`**: 這個可用來確定是否是 2 的冪次方。 + 原因大概證明:假設 n 不是 2 的冪次方,則 n 的二進位表示式至少有兩個 1 。當我們對 n 減 1,會發現低位的 0 都被翻轉,直到遇到最低位的 1 。而且最低位 1 也會被翻轉,但往後的 bit 全都不動。因此,`(num & (num - 1))` 一定不等於 0。 故 `(num & (num - 1))==0` 時一定為二的次方。 + 同理可得到當 n 為二的次方, `(num & (num - 1))==0` + 今天已經知道 `num` 為二的次方,故 `num` 的二元表示,只有一個 1 。觀察四的次方,可以發現低位的 0 數量一定是偶數: + | num | Binary | | -------- | -------- | | $4$ | 00000100 | | $4^2$ | 00010000 | | $4^3$ | 01000000 | + 故 `__builtin_ctz(num) & 0x1` 可以判斷低位 0 的個數是否為奇數(奇數得到 1),在加上 NOT 就是判斷是否為四的次方了。 ## 延伸問題 - 其他實作 ```cpp bool isPowerOfFour(int num){ return (__builtin_popcount(num) == 1) && !(__builtin_ctz(num) & 0x1); } ``` + 我想到前兩個判斷事實上可以整合成**判斷二進位中 1 的個數**,所以用 `(__builtin_popcount(num) == 1)`來取代前兩個判斷。 :::info TODO: - [ ] 練習 LeetCode 1009. Complement of Base 10 Integer 和 41. First Missing Positive,應善用 clz; - [ ] 研讀 2017 年修課學生報告,理解 clz 的實作方式,並舉出 Exponential Golomb coding 的案例說明,需要有對應的 C 原始程式碼和測試報告; > x-compressor 是個以 Golomb-Rice coding 為基礎的資料壓縮器,實作中也用到 clz/ctz 指令,可參見 Selecting the Golomb Parameter in Rice Coding。 ::: ## 測驗 3 ```cpp int numberOfSteps (int num) { return num ? AAA + 31 - BBB : 0; } ``` 根據題目,我們照著規則走即可找出通式,從最低位開始: + 遇到 0: 做一個 shift,花費一個 Step + 遇到 1: 減 1 後還要再 shift,花費兩個 Step。 + 直到最後一個 1: 減 1 後結束,花費一個 Step。 令 1 的個數為 $p$,最高位的 1 以上的 0 個數為 $l$ 而總共有 32 bit。則總 Step 數為 $$\begin{align} &(2p-1)+(32-l-p)\cdot 1 \\ =& p + 31 - l \end{align} $$ 對應到答案 `__builtin_popcount(num) + 31 - __builtin_clz(num)` :::info - [ ] 改寫上述程式碼,提供等價功能的不同實作並解說。 避免用到編譯器的擴充功能,只用 C99 特徵及 bitwise 運算改寫 LeetCode 1342. Number of Steps to Reduce a Number to Zero,實作出 branchless 解法,儘量縮減程式碼行數和分支數量; ::: ## 測驗 4 ```cpp= #include <stdint.h> uint64_t gcd64(uint64_t u, uint64_t v) { if (!u || !v) return u | v; int shift; for (shift = 0; !((u | v) & 1); shift++) { u /= 2, v /= 2; } while (!(u & 1)) u /= 2; do { while (!(v & 1)) v /= 2; if (u < v) { v -= u; } else { uint64_t t = u - v; u = v; v = t; } } while (XXX); return YYY; } ``` + **第3行**: 首先檢查有沒有數字為0,如果有,就回傳不是0的那一個(兩個都0回傳0) + **4~7行**: + 這裡 `!((u | v) & 1)` 的作用在於檢查是不是兩個都是偶數 + 如果是,我們就把兩個都除以二(右移),並把右移次數紀錄在 `shift` 變數中。 + 整段的作用其實就是**提出所有公因數 2** ,並在後來求完公因數後,再把這些 2 重新乘回去(左移)。 + **8~9行**: 這裡判斷如果 u 是偶數,就把他所有的質因數 2 去掉。 + 原因是若 u 是偶數,則 v 一定是奇數,(前面迴圈的停止條件是至少其中一個是奇數) + 若一個是偶數一個是奇數,則 2 一定不會是公因數,把 2 除掉也不會影響最大公因數 + **10~21行**: 這裡透過不斷的做減法與除以2,來將 u 與 v 變得愈來愈小,直到最後一個為 0 gcd 的結果就顯而易見了。 + 這裡用到了兩個**性質** 1. 當 $u$ 為偶數,$v$ 為奇數則 $\text{gcd}(u,v) = \text{gcd}(\frac{u}{2},v)$ 2. 當 $u>v$, $\gcd(u,v) = \gcd(u-v,v)$ + **11~12 行**: 首先注意到首次進迴圈時, $u$ 一定是奇數,因為我們在外面處理過 $u$ 了,所以如果 $v$ 是偶數,我們可以利用 **性質1** 把 2 除掉。我們之後會確保 $u$ 會一直是奇數,所以在每次迴圈開頭,我們都可以利用一次**性質1**。 + **13~19 行**: 這裡我們利用**性質2**將大的減去小的。 + 這裡值得注意的是,$u,v$ 都是奇數,所以大減小一定會產生偶數。我們固定把偶數(大減小的結果)放在 $v$,奇數放在 $u$,好在下次迴圈開頭利用**性質1**。 + **20~21 行**: 這裡我們要判斷是否要中止迴圈。當 $v=0$,$\gcd(u,v)=\gcd(u,0)=u$。所以`XXX`=`v`,最後`YYY`必須回傳 `u << shift` :::warning :warning: 不能用 `XXX`=`u-v`: 雖然當 $u-v=0$ 時,$\gcd(u,v)=\gcd(u,u)=u$,會得到正確結果。但要是 $v=2u$,迴圈不會結束。但下次迴圈 `v` 會被除為 `u`,所以迴圈結束時 `v=0`,`u` 不變,但這時 `u-v != 0` ,所以會陷入無窮迴圈。 ::: ### 使用 ctz 改寫 程式碼 :::spoiler ```cpp #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <time.h> uint64_t gcd64_ctz(uint64_t u, uint64_t v) { if (!u || !v) return u | v; int shift = __builtin_ctz(u | v); u >>= shift; v >>= shift; u >>= __builtin_ctz(u); do { v >>= __builtin_ctz(v); if (u < v) { v -= u; } else { uint64_t t = u - v; u = v; v = t; } } while (v); return u << shift; } uint64_t gcd64(uint64_t u, uint64_t v) { if (!u || !v) return u | v; int shift; for (shift = 0; !((u | v) & 1); shift++) { u /= 2, v /= 2; } while (!(u & 1)) u /= 2; do { while (!(v & 1)) v /= 2; if (u < v) { v -= u; } else { uint64_t t = u - v; u = v; v = t; } } while (v); return u << shift; } int main(void) { printf("i, no ctz, ctz\n"); for (size_t i=10000; i<50000; i=i+100){ clock_t start, end; uint32_t time=0, time_ctz=0; uint64_t gcd, gcd_ctz; for (size_t j=1; j<i; ++j) { start = clock(); gcd_ctz = gcd64_ctz(i,j); end = clock(); time_ctz += (double)(end-start); start = clock(); gcd = gcd64(i,j); end = clock(); time += (double)(end-start); if (gcd != gcd_ctz) { printf("error: gcd(%lu,%lu)\n", i, j); break; } } printf("%lu, %f, %f\n", i, (double)time / CLOCKS_PER_SEC, (double)time_ctz / CLOCKS_PER_SEC); } return 0; } ``` ::: 環境 ```shell $ uname -a Linux Aspire-V5-591G 4.15.0-118-generic #119~16.04.1-Ubuntu SMP Tue Sep 8 14:54:40 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux $ gcc --version gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 Copyright (C) 2015 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ cat /proc/cpuinfo | grep name model name : Intel(R) Core(TM) i5-6300HQ CPU @ 2.30GHz model name : Intel(R) Core(TM) i5-6300HQ CPU @ 2.30GHz model name : Intel(R) Core(TM) i5-6300HQ CPU @ 2.30GHz model name : Intel(R) Core(TM) i5-6300HQ CPU @ 2.30GHz ``` ### 測試結果 ![](https://i.imgur.com/l5ESRwq.png) + 可以看出來用 ctz 改寫的版本的確比較快。隨著數字增大,運算時間上升的斜率較小。 + 推測原因主要是少了 while 迴圈的分支結構,改成以硬體指令一次完成的關係。 ## 測驗 5 ```cpp #include <stddef.h> size_t improved(uint64_t *bitmap, size_t bitmapsize, uint32_t *out) { size_t pos = 0; uint64_t bitset; for (size_t k = 0; k < bitmapsize; ++k) { bitset = bitmap[k]; while (bitset != 0) { uint64_t t = KKK; int r = __builtin_ctzll(bitset); out[pos++] = k * 64 + r; bitset ^= t; } } return pos; } ``` ### 程式作用 + 這段程式的作用在於把 bitmap 中的 1 的位置,全部紀錄下來,而紀錄的方式就是把這些 1 的 index 紀錄在 `out` 這個 array 中。 + 我們總共有 `bitmapsize` 個 bitmap ,每一個 bitmap 有 64 bit 。因此總共會有 `bitmapsize*64` bit 。 + index 計算方式為從 `bitmap[0]` 開始,`bitmap[0]` 的 LSB 為 0, MSB 為 63。 `bitmap[1]` 的 LSB 為 64, MSB 為 127。依此類推 ### 程式原理 + 外層迴圈很好理解,就是把 bitmap 逐一拿出來,一次處理 64 bit。 + 拿到每一串 bitmap 後,我們要把裡面的 1 都找出來。 ```cpp=10 int r = __builtin_ctzll(bitset); out[pos++] = k * 64 + r; ``` + 我們可以用 ctz 的方式直接找到最低 set bit 的位置(`r`),而不用逐 bit 查找。找到之後,就在 `out` 中紀錄下當前的 bit 的位置。 + 找到了最低位的 1 後,我們要把這個 1 去掉,這樣下次最低位的 1 就是下一個 1 了。所以假如當這次最低位的 1 在 index **2** 的時候,`KKK` 應該要是 `0b100` ,這樣才可以透過 XOR 把當前的 bit 消掉。 + 我們可以觀察 `KKK` = `bitset & -bitset` 的作用。二補數中的負號相當於 NOT 後加 1 。 1. 所以假如 bitset 為 `10101000` 2. NOT 後為 `01010111` 3. 加 1 後所有低位的 bit 都將再次翻轉,直到遇到第一個 0 , 0 也被翻轉成 1 。 結果為 `01011000` 4. `bitset & -bitset` 得到 `00001000` 5. 所以 `bitset & -bitset` 會保留 `bitset` 最低位的 1 以下的部份(包含 1),以上則變成 0 ,因為以上的半段只翻轉一次, AND 之後一定為 0。所以最後得到的就是只有一個 set bit 的 bitmask, XOR 之後就可清除最低位的 1 。

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully