sysprog
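# 2026-03-31 Q&A Notes

## [Daihatsu: 30 years of falsification that hurt Toyota, and what lies behind Akio Toyoda's 90-degree bow](https://youtu.be/ydFupdQiCkc)

* A third-party investigation found 174 procedural irregularities at Daihatsu, involving 64 vehicle models and 3 engines. The earliest case dates back to 1989, a span of more than thirty years. The impact was not limited to Daihatsu's own brand; it also covered OEM models supplied to Toyota, Mazda, and Subaru. After the disclosure, Daihatsu halted shipments of all affected models still in production, which shows that the problem was not a local defect but the long-term accumulation of an institutional deviation.
* Japan's Ministry of Land, Infrastructure, Transport and Tourism then stepped in and, in January 2024, formally revoked the type designation of three models because they failed the safety standard for preventing fuel leakage in a rear-end collision. Further verification completed in June of the same year confirmed that most models still met the standards, while those three models did not. This shows the regulator starting to correct the existing certification system in substance rather than stopping at document review.
* Toyota, Mazda, Honda, Suzuki, and Yamaha also committed irregularities in their type-designation applications. Toyota's case involved seven models, with crash-test methods that did not follow the rules, improper handling of test data, and test conditions that deviated from the specification. Mazda had 5 models involved, including rewriting the engine control software during output tests. Honda had false entries in the noise-test documents of as many as 22 discontinued models. Suzuki and Yamaha had violations in the documents and conditions of brake tests and noise tests, respectively. Together these cases point to one fact: the problem is not an occasional slip by individual companies but the result of a certification culture gradually drifting away from engineering practice.
* Hitachi Astemo was found to have, over several decades, submitted data for brake and suspension components without running the tests and shipped parts that did not meet the standard, affecting millions of products and multiple automakers. Hino, another company in the Toyota group, was found to have falsified part of its engine emission data since 2003, a span of more than 20 years.
* Every small deviation can be rationalized, and the deviations eventually accumulate into long-term, systemic distortion. This is not only a failure of quality control but also the result of engineering practice and institutional design gradually decoupling across the whole industry.

$\to$ [Even Lexus is breaking? 150,000 vehicles recalled: is the quality guarantee of Japanese luxury cars still there?](https://youtu.be/tfXCdK-xgG0)
> [Toyota and Lexus Recall Thousands of Vehicles Over Label Error](https://autos.yahoo.com/safety-and-recalls/articles/toyota-lexus-recall-thousands-vehicles-130000892.html?guccounter=1)

:::warning
Quality is the starting point of dignity and value
:::

---
> Review of [Homework 2](https://hackmd.io/@sysprog/linux2026-homework2)

## deantee

```c
/* Round x up to a multiple of align (align must be a power of two). */
#define round_up(x, align) \
    (((x) + (align) - 1) & ~((align) - 1))
```

```c
/* Uses GNU statement expressions. Only the top two bits below align are
 * examined; a tie (binary 10) goes to the even multiple. Needs align >= 4. */
#define round_half_up(x, align) ({                          \
    int k = __builtin_ctz(align);                           \
    int high = (x) >> k;                                    \
    int low = ((x) >> (k - 2)) & 3;                         \
    (high + (low > 2 || (low == 2 && (high & 1)))) << k;    \
})
```

```c
/* Same rounding, but compares all k low bits against align / 2.
 * Needs align >= 2. */
#define round_half_up_2(x, align) ({                            \
    int k = __builtin_ctz(align);                               \
    int high = (x) >> k;                                        \
    int low = (x) & ((1UL << k) - 1);                           \
    int half = 1UL << (k - 1);                                  \
    (high + (low > half || (low == half && (high & 1)))) << k;  \
})
```

## PinkNekoFist

Consider the following code:

```c
#include <stdio.h>

int main(void)
{
    signed int x = -1;
    unsigned int y = 0;
    printf("%s", (x > y) ?
           "yes" : "no");
    return 0;
}
```

The output is `yes`, because according to C99 6.3.1.8

> if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

In other words, when a signed integer is compared with an unsigned integer of the same rank, the signed operand is converted to the unsigned type, so `-1` becomes `UINT_MAX`. This step belongs to the usual arithmetic conversions (it is not integer promotion, which only applies to types narrower than `int`).

In 2002, FreeBSD developers found a serious security risk in the implementation of the getpeername function. The vulnerability came from a bounds check on a data length that did not handle the conversion between signed and unsigned integers correctly.

The following simplified code models the vulnerability:

```c
/*
 * Illustration of code vulnerability similar to that found in
 * FreeBSD's implementation of getpeername()
 */
#include <string.h> /* memcpy, size_t */

/* Kernel memory region holding user-accessible data */
#define KSIZE 1024
char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */
int copy_from_kernel(void *user_dest, int maxlen)
{
    /* Byte count len is minimum of buffer size and maxlen */
    int len = KSIZE < maxlen ? KSIZE : maxlen;
    memcpy(user_dest, kbuf, len);
    return len;
}
```

`maxlen` is declared as a signed integer, so a negative value can be passed in, which makes `len` negative as well. The parameter `n` of `memcpy` has the unsigned type `size_t`, so according to C99 a negative argument is converted to a very large positive value, and memory outside the intended region is accessed.

It can be fixed in either of the following ways:

```c
/* Alternative 1: use size_t for maxlen */
int copy_from_kernel(void *user_dest, size_t maxlen)
{
    size_t len = KSIZE < maxlen ? KSIZE : maxlen;
    memcpy(user_dest, kbuf, len);
    return (int) len;
}

/* Alternative 2: check for a negative value */
int copy_from_kernel(void *user_dest, int maxlen)
{
    if (maxlen < 0)
        return -1;
    int len = KSIZE < maxlen ? KSIZE : maxlen;
    memcpy(user_dest, kbuf, len);
    return len;
}
```

## c8763yee

- `out.i &= ~(out.i >> 31);` Well defined / undefined behavior?

ISO C99 6.5.7 Bitwise shift operators states:

> The result of E1 >> E2 is E1 right-shifted E2 bit positions.
> If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of $E1 / 2^{E2}$.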
> If E1 has a signed type and a negative value, the resulting value is implementation-defined

So right-shifting a signed integer (here `out.i`) is well defined when the value is nonnegative, where shifting by $k$ is equivalent to dividing `out.i` by $2^k$. When `out.i` is negative the behavior is implementation-defined, so the concrete behavior of each compiler has to be looked up.

ISO C99 3.4.1 implementation-defined behavior

> unspecified behavior where each implementation documents how the choice is made

#### GCC

The GCC manual, 4.5 Integers implementation (Using the GNU Compiler Collection (GCC)), states:

> - *The results of some bitwise operations on signed integers* (C90 6.3, C99 and C11 6.5).
>
> Bitwise operators act on the representation of the value including both the sign and value bits, where the **sign bit is considered immediately above the highest-value value bit**. Signed ‘>>’ acts on negative numbers by sign extension.

So for a right shift GCC copies the sign bit into the highest value bits, which preserves the sign of the result. This is an arithmetic shift.

The generated assembly from GCC 15.2 on [x86-64](https://godbolt.org/z/c9f75adc4), [ARM64](https://godbolt.org/z/crrxc74xe), and [RISC-V (64-bit)](https://godbolt.org/z/bq1TqMWvq) confirms this:

```asm
ReLU:
    ;...
    mov     eax, DWORD PTR [rbp-4]
    mov     edx, DWORD PTR [rbp-4]
    sar     edx, 31         ; in x86-64
    ; asr   w0, w0, 31      ;; in arm64
    ; sraiw a5,a5,31        ;; in RISC-V 64bit
    not     edx
    and     eax, edx
    ;...
```

GCC uses an arithmetic shift on all three architectures, which matches both the manual and the expected behavior.

#### Clang

According to [a StackOverflow question](https://stackoverflow.com/questions/36335071/where-does-clang-document-implementation-defined-behavior) and the corresponding [GitHub issue](https://github.com/llvm/llvm-project/issues/11644), the Clang manual does not actually document how the compiler handles implementation-defined behavior, even though ISO C requires every implementation to document its choice for all implementation-defined behavior.

> - 4 Conformance p8
> An implementation **shall** be accompanied by a document that defines all implementation-defined and locale-specific characteristics and all extensions.

For now the only option is to inspect the generated assembly on [x86-64](https://godbolt.org/z/5634PahxK), [ARM64 (armv8-a)](https://godbolt.org/z/65bP3nqb1), and [RISC-V 64-bit (rv64gc)](https://godbolt.org/z/noY961G8v):

```asm
ReLU:
    ; ...
    mov     eax, dword ptr [rbp - 8]
    sar     eax, 31                     ; x86-64
    ; bic   w8, w8, w9, asr #31         ;; in armv8-a
    ; srli  a1, a1, 31                  ;; in RISC-V 64bit
    xor     eax, -1
    and     eax, dword ptr [rbp - 8]
    ; ...
```

Clang also uses an arithmetic shift on x86-64 and ARM64, but on RISC-V rv64gc it uses a logical shift instead.

The [RISC-V 64 specification](https://five-embeddev.github.io/riscv-docs-html//riscv-user-isa-manual/Priv-v1.12/rv64.html) shows that `lw` sign-extends the loaded 32-bit value, filling the upper bits with the sign bit, before writing it to register `a0` (`x10`).

> The LW instruction loads a 32-bit value from memory and sign-extends this to 64 bits before storing it in register rd for RV64I
>
> The SD, SW, SH, and SB instructions store 64-bit, 32-bit, 16-bit, and 8-bit values from the low bits of register rs2 to memory respectively.

When `out.i` is negative the register holds `0xFFFFFFFFxxxxxxxx`; otherwise it holds `0x00000000xxxxxxxx`.

So unlike the other architectures, which rely on an arithmetic shift, the RISC-V 64-bit code builds the bitmask from the sign extension performed by `lw`.

The generated assembly:

```asm
ReLU:
    ; ...
    lw      a0, -24(s0)     ; a0 = out.i
                            ; out.i >= 0: 0x00000000xxxxxxxx
                            ; out.i <  0: 0xFFFFFFFFxxxxxxxx
    not     a1, a0          ; a1 = ~a0
    srli    a1, a1, 31      ; a1 >>= 31 (logical)
                            ; out.i >= 0: a1 = 0x1FFFFFFFF
                            ; out.i <  0: a1 = 0
    and     a0, a0, a1      ; a0 &= a1, leaving a0 or 0
    sw      a0, -24(s0)     ; out.i = a0
    ;...
```

This matches the expected behavior: a bitmask is produced by a shift and then ANDed with the original value. The assembly above literally corresponds to the C expression `out.i &= ~out.i >> 31`; changing the source to that form produced identical assembly.

## keep90ing

```c
out.i &= ~(out.i >> 31);
```

#### Is this well-defined or undefined behavior under the C99 standard?

First, [Things you did not know about C: bitwise operations](https://hackmd.io/@sysprog/c-bitwise) mentions one case of shift behavior that the standard leaves open:

> When a negative number is shifted right, the shift may be logical or arithmetic; the C standard does not define which.
> So when a negative number is shifted right, the result may be positive or negative, and you have to check how the compiler implements it. A compiler may even offer an option that changes this semantics. GCC implements it as an arithmetic shift (sign extension).

Next, C99 Standard [6.5.7-5] ***Bitwise shift operators***:

> The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 divided by the quantity, 2 raised to the power E2.
> If E1 has a signed type and a negative value, the resulting value is implementation-defined.

C99 Standard [3.4.1] ***implementation-defined behavior*** uses exactly this as its example:

> - unspecified behavior where each implementation documents how the choice is made
> - EXAMPLE
> An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right.

Because the type of `out.i` is `int32_t`, C99 [6.5.7-5] gives:

- If `out.i` $\ge 0$, the result is well-defined.
- If `out.i` $< 0$, a negative number is being shifted and the result is implementation-defined: whether `>>` is an arithmetic or a logical shift is decided entirely by the compiler.

Shift instructions in the different instruction set architectures:

- x86_64: see [AMD64 Architecture Programmer's Manual Volume 3: General Purpose and System Programming Instructions](https://docs.amd.com/v/u/en-US/24594_3.37)
  - Arithmetic shift instruction: SAR
    > The SAR instruction does not change the sign bit of the target operand. For each bit shift, it copies the sign bit to the next bit, preserving the sign of the result.
  - Logical shift instruction: SHR
    > For each bit shift, the instruction clears the most-significant bit to 0.
    > The effect of this instruction is unsigned division by powers of two.
- ARM64: see [Arm Architecture Reference Manual for A-profile architecture-C3.5.10 Shift (immediate)](https://developer.arm.com/documentation/ddi0487/maa/-Part-C-The-AArch64-Instruction-Set/-Chapter-C3-A64-Instruction-Set-Overview/-C3-5-Data-processing---immediate/-C3-5-10-Shift--immediate-?lang=en)
  - Arithmetic shift instruction: [ASR](https://developer.arm.com/documentation/ddi0487/maa/-Part-C-The-AArch64-Instruction-Set/-Chapter-C6-A64-Base-Instruction-Descriptions/-C6-2-Alphabetical-list-of-A64-base-instructions/-C6-2-20-ASR--immediate-?lang=en#isa_asr_sbfm)
    > Arithmetic shift right (immediate)
    > This instruction shifts a register value right by an immediate number of bits, shifting in copies of the sign bit in the upper bits and zeros in the lower bits, and writes the result to the destination register.
  - Logical shift instruction: [LSR](https://developer.arm.com/documentation/ddi0487/maa/-Part-C-The-AArch64-Instruction-Set/-Chapter-C6-A64-Base-Instruction-Descriptions/-C6-2-Alphabetical-list-of-A64-base-instructions/-C6-2-272-LSR--immediate-?lang=en#isa_lsr_ubfm)
    > Logical shift right (immediate)
    > This instruction shifts a register value right by an immediate number of bits, shifting in zeros, and writes the result to the
- RISC-V (64 bits): see [The RISC-V Instruction Set Manual Volume I: Unprivileged Architecture-4.1.2.1. Integer Register-Immediate Instructions](https://docs.riscv.org/reference/isa/unpriv/rv64.html#4-1-2-1-integer-register-immediate-instructions)
  - Arithmetic shift instruction: SRAI
    > SRAI is an arithmetic right shift (the original sign bit is copied into the vacated upper bits).
  - Logical shift instruction: SRLI
    > SRLI is a logical right shift (zeros are shifted into the upper bits).

Now look at how the two compilers implement it:

- GCC

  According to [4.5 Integers](https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html), under GCC, if `out.i` is negative (the sign bit of the IEEE 754 representation is 1), `out.i >> 31` yields `0xFFFFFFFF` through sign extension, `~(0xFFFFFFFF)` equals 0, and finally `out.i &= 0` clears the result to 0.
  > Bitwise operators act on the representation of the value including both the sign and value bits, where the sign bit is considered immediately above the highest-value value bit. Signed ‘>>’ acts on negative numbers by sign extension.
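The sign-extension rule quoted above can be probed at run time. The following is a minimal sketch, assuming a 32-bit `int32_t` in two's complement; the helper names `shift_is_arithmetic`, `sign_mask`, and `check_sign_mask` are made up for illustration and do not appear in the homework. `sign_mask` builds the same all-ones/all-zeros mask without ever right-shifting a negative value, so it does not depend on the implementation-defined choice.

```c
#include <assert.h>
#include <stdint.h>

/* 1 if signed >> sign-extends on this implementation (arithmetic shift),
 * 0 if it shifts in zeros (logical shift). */
int shift_is_arithmetic(void)
{
    int32_t minus_one = -1;
    return (minus_one >> 31) == -1;
}

/* All ones for negative v, zero otherwise. The shift is applied to the
 * unsigned representation, so it is well-defined under C99 6.5.7. */
uint32_t sign_mask(int32_t v)
{
    return 0u - ((uint32_t) v >> 31);
}

/* Hypothetical self-check: where >> is arithmetic, both masks agree. */
void check_sign_mask(int32_t v)
{
    if (shift_is_arithmetic())
        assert((uint32_t) (v >> 31) == sign_mask(v));
}
```

Calling `check_sign_mask` on a few positive and negative inputs from `main` is enough to see that, on a compiler documenting sign extension, `v >> 31` and the portable mask are interchangeable.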
  All of the following are compiled with gcc 15.2 and correspond to `out.i &= ~(out.i >> 31);`. The 32-bit value of `out.i` in memory is written as `0xXXXXXXXX`:

  - x86_64
    ```asm
    mov     eax, DWORD PTR [rbp-4]  ; eax = 0xXXXXXXXX (upper half of rax zero-extended)
    mov     edx, DWORD PTR [rbp-4]  ; edx = 0xXXXXXXXX (upper half of rdx zero-extended)
    sar     edx, 31                 ; shift arithmetic right (sign extension):
                                    ;   if out.i >= 0, edx = 0x00000000
                                    ;   if out.i <  0, edx = 0xFFFFFFFF
    not     edx                     ; edx = ~edx (upper half of rdx zero-extended)
    and     eax, edx                ; eax &= edx
    mov     DWORD PTR [rbp-4], eax
    ```
    According to [AMD64 Architecture Programmer's Manual Volume 1: Application Programming-3.4.5 High 32 Bits](https://docs.amd.com/v/u/en-US/24592_3.24), for a 32-bit result the hardware zero-extends and clears bits[63:32] to 0.
    > In 64-bit mode, the following rules apply to extension of results into the high 32 bits when results smaller than 64 bits are written:
    > - Zero-Extension of 32-Bit Results: 32-bit results are zero-extended into the high 32 bits of 64-bit GPR destination registers.
  - ARM (64-bits)
    ```asm
    ldr     w1, [sp, 24]
    ldr     w0, [sp, 24]
    asr     w0, w0, 31      // arithmetic shift right
    mvn     w0, w0
    and     w0, w1, w0
    str     w0, [sp, 24]
    ```
  - RISC-V (64-bits)
    ```asm
    lw      a4,-24(s0)
    lw      a5,-24(s0)
    sraiw   a5,a5,31        # arithmetic right shift
    sext.w  a5,a5
    not     a5,a5
    sext.w  a5,a5
    and     a5,a4,a5
    sext.w  a5,a5
    sw      a5,-24(s0)
    ```

- Clang

  llvm-project [GitHub issue #11644](https://github.com/llvm/llvm-project/issues/11644) points out that Clang lacks formal documentation of the implementation-defined behavior in the C/C++ standards. Unlike GCC, it therefore gives no explicit statement for the implementation-defined case of C99 [6.5.7-5], and the behavior has to be verified from the assembly.

  All of the following are compiled with clang 22.1.0 and correspond to `out.i &= ~(out.i >> 31);`:

  - x86_64
    ```asm
    mov     eax, dword ptr [rbp - 8]    ; eax = out.i
    sar     eax, 31                     ; shift arithmetic right (sign extension):
                                        ;   if out.i >= 0, eax = 0x00000000
                                        ;   if out.i <  0, eax = 0xFFFFFFFF
    xor     eax, -1                     ; eax = eax ^ 0xFFFFFFFF
                                        ;   if out.i >= 0, eax = 0xFFFFFFFF
                                        ;   if out.i <  0, eax = 0x00000000
    and     eax, dword ptr [rbp - 8]    ; eax = out.i & eax
                                        ;   if out.i >= 0, eax = out.i & 0xFFFFFFFF = out.i
                                        ;   if out.i <  0, eax = out.i & 0x00000000 = 0
    mov     dword ptr [rbp - 8], eax
    ```
  - armv8-a
    ```asm
    ldr     s0, [sp, #8]
    fmov    w9, s0
    ldr     s0, [sp, #8]
    fmov    w8, s0                  // w8 = out.i
    bic     w8, w8, w9, asr #31     // bit clear: w8 = w8 AND NOT(w9 ASR 31)
                                    // w9 asr #31 is an arithmetic shift right by 31 bits:
                                    //   if out.i >= 0, w9 asr 31 = 0x00000000
                                    //   if out.i <  0, w9 asr 31 = 0xFFFFFFFF
                                    // NOT and AND happen in the same instruction:
                                    //   if out.i >= 0, w8 = out.i & ~0x00000000 = out.i
                                    //   if out.i <  0, w8 = out.i & ~0xFFFFFFFF = 0
    str     w8, [sp, #8]
    ```
  - RISC-V rv64gc
    ```asm
    lw      a0, -24(s0)     # a0 = out.i (sign-extension)
                            #   if out.i >= 0, a0 = 0x00000000_XXXXXXXX
                            #   if out.i <  0, a0 = 0xFFFFFFFF_XXXXXXXX
    not     a1, a0          # a1 = ~a0 (every bit inverted)
                            #   if out.i >= 0, a1 = 0xFFFFFFFF_XXXXXXXX
                            #   if out.i <  0, a1 = 0x00000000_XXXXXXXX
    srli    a1, a1, 31      # logical right shift by 31 bits:
                            #   if out.i >= 0, a1 = 0x00000001_FFFFFFFF
                            #   if out.i <  0, a1 = 0x00000000_00000000
    and     a0, a0, a1      # a0 = out.i & a1
                            #   if out.i >= 0, low 32 bits: out.i & 0xFFFFFFFF = out.i
                            #   if out.i <  0, low 32 bits: out.i & 0x00000000 = 0
    sw      a0, -24(s0)
    ```
    According to [The RISC-V Instruction Set Manual Volume I: Unprivileged Architecture-4.1.3. Load and Store Instructions](https://docs.riscv.org/reference/isa/unpriv/rv64.html#4-1-3-load-and-store-instructions)
    > The LW instruction loads a 32-bit value from memory and sign-extends this to 64 bits before storing it in register rd for RV64I.

    On rv64gc, `lw` first sign-extends from 32 to 64 bits, so in the low 32 bits `not` followed by `srli 31` produces exactly the same mask as `sar 31` followed by `not` on x86_64. `sw` writes back only the low 32 bits, so the upper half does not affect the final result. Clang thus uses `srli` in place of `srai` to get the same semantics.

#### How can the branchless version be modified?

The change is as follows:

```diff
--- original_relu.c
+++ modified_relu.c
@@ -3,10 +3,10 @@
 float ReLU(float x)
 {
     union {
         float f;
-        int32_t i;
+        uint32_t i;
     } out = {.f = x};
-    out.i &= ~(out.i >> 31);
+    out.i &= ((out.i >> 31) - 1);
     return out.f;
 }
```

According to C99 [6.5.7-5] ***Bitwise shift operators***

> The result of E1 >> E2 is E1 right-shifted E2 bit positions.
> If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 divided by the quantity, 2 raised to the power E2.

Because `uint32_t` is an unsigned type, the result of `out.i >> 31` is `out.i / 2^31`, which can only be 0 or 1: the sign bit of the IEEE 754 representation.

Then, according to C99 [6.5.7] ***Bitwise shift operators***

> If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

The shift amount (31) satisfies $0\le31<32$, where 32 is the width of `uint32_t`, so the undefined behavior described in C99 [6.5.7] is not triggered. This part is therefore well-defined, with a result of 0 or 1.

Next, according to C99 [6.2.5-9] ***Types***

> A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

So the result of `(out.i >> 31) - 1` falls into two cases:

- If `(out.i >> 31) == 1`, the input has its sign bit set (including $-0.0$, $-\infty$, $-NAN$), and the result is `0x00000000`.
- If `(out.i >> 31) == 0`, the input has its sign bit clear (including $+0.0$, $+\infty$, $+NAN$), and the result is `(0 - 1) mod 2^32 = 0xFFFFFFFF`.

This part is therefore also well-defined.

Finally, `out.i &= ((out.i >> 31) - 1);` gives two cases:

- If the input is a positive floating-point number or +0.0: `out.i & 0xFFFFFFFF = out.i`, so the original value is kept.
- If the input is a negative floating-point number or -0.0: `out.i & 0x00000000 = 0x00000000`, which is +0.0 in IEEE 754.

This part is therefore also well-defined.

## sakinu

> How can the following program reduce the number of comparisons?
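This question does not examine how random `rand` is; assume it is truly random.

```c
int x;
do {
    x = rand();
} while (x >= (RAND_MAX - RAND_MAX % n));
x %= n;
```

One optimization came to mind on the spot, though it may not be good enough, and a compiler can do something similar. `(RAND_MAX - RAND_MAX % n)` is loop-invariant, so it can be computed once and reused, avoiding the expensive modulo on every iteration. The real point is to discard the surplus outputs that would otherwise make some residues modulo n appear more often. Since any run of surplus values can be discarded, one subtraction can be saved by rejecting the low end instead. `rand()` has `RAND_MAX + 1` possible outputs, so the surplus count is `(RAND_MAX + 1) % n`, computed below as `(RAND_MAX % n + 1) % n` so that it cannot overflow `int`.

```c
int x;
/* Surplus count (RAND_MAX + 1) % n, written so that int cannot overflow. */
int cond = (RAND_MAX % n + 1) % n;
do {
    x = rand();
} while (x < cond);
x %= n;
```

The logic of the original program is: whenever $x \geq$ `RAND_MAX - RAND_MAX % n`, draw a new $x$ until it falls below that bound, which spreads the rejected tail evenly back over the accepted range.

Let $a$ be the probability that a draw has to be repeated and $b = 1-a$ the probability that it is accepted. The accepted range of the original loop holds $n \lfloor \mathrm{RAND\_MAX}/n \rfloor$ values, which is at least half of the `RAND_MAX + 1` possible outputs whenever $1 \le n \le$ `RAND_MAX`, so $b \geq 1/2 \geq a$ and the expected number of `rand()` calls is at most 2.

With glibc, `RAND_MAX` is $2^{31}-1$.

Perhaps a table could record, for n in a given interval, which k is good enough that the cost of calling rand k times and combining the results is tolerable in exchange for a lower probability of repeating the while loop.
> How would this be computed?

This approach has to bring a large enough gain, because the original expected number of calls is already at most 2, and the only thing being repeated is a `rand()` call whose cost is not particularly high.

## ericlin1231

C99 6.5.7 states that right-shifting a signed value is well-defined only when the value is nonnegative and is implementation-defined when it is negative. The GCC manual, section 4.5, defines signed right shift as an arithmetic shift, so the original `ReLU` computes the expected value when compiled with GCC. It still relies on an implementation-specific choice rather than on behavior guaranteed by ISO C, so it is not portable.

The implementation below changes the integer in the `union` to an unsigned integer, so the computation follows the rules of unsigned arithmetic. It extracts the sign bit of the floating-point number: if the sign bit is 1, then `out.i = out.i & 0`; otherwise `out.i = out.i & 0xFFFFFFFF`.

```c
float ReLU(float x)
{
    union {
        float f;
        uint32_t i;
    } out = {.f = x};
    /* Sign bit 1 -> mask 0; sign bit 0 -> mask 0xFFFFFFFF.
     * (Masking with ~(out.i & (1u << 31)) would only clear the sign
     * bit, which computes fabsf(x) rather than ReLU.) */
    out.i &= (out.i >> 31) - 1u;
    return out.f;
}
```

---

## As an EECS student, you have to face it squarely: "do something big, or nothing"

Hollywood actress [Milla Jovovich, star of the Resident Evil film series, took part in developing an AI memory system called MemPalace, which reportedly scored full marks on the LongMemEval memory benchmark](https://x.com/bensig/status/2041236952998171118). A look at its [public benchmark notes](https://github.com/milla-jovovich/mempalace/blob/main/benchmarks/BENCHMARKS.md) shows that some of the numbers were obtained under specific adjustments. For example, the claimed 100% on LongMemEval was reached after patching the three questions that originally failed and adding reranking by a large language model; on the held-out test set the score is 98.4%. The 100% on LoCoMo was obtained with the retrieval range widened to top-k equal to 50, which exceeds what is reasonable for a single session, plus reranking; limited to a more reasonable top-10 without reranking, the score is 88.9%.

> [Milla Jovovich's AI memory palace system](https://www.facebook.com/hinet/posts/pfbid0tEhWp4PTpp1NXkDA6AdAK89qxtac1NFMHoAhPxKjayTAv5DzcimGrajr8bSW5XwFl)

Milla Jovovich has long been active as a model and actress, with works including The Fifth Element and the Resident Evil series, and has no public background in engineering or AI research. AI memory systems are indeed a fast-moving research direction, covering vector retrieval, long-term memory management, and the memory and decision mechanisms of agents across multi-turn interaction. Many challenges remain, such as keeping semantics consistent within a limited context, avoiding the accumulation of wrong memories, and balancing efficiency against accuracy. On one hand, cases like this show that generative AI is lowering the barrier to building systems, so people from different backgrounds can take part in implementation and experiments. On the other hand, they remind us that when facing impressive numbers and narratives, we should pay more attention to the transparency of the evaluation method and the reproducibility of the results. For learners in computer science and related fields, the value of engineering lies not only in building a system quickly but also in establishing a methodology that is credible, verifiable, and able to withstand scrutiny.

## Recent updates to [kbox](https://github.com/sysprog21/kbox)

kbox does not isolate on top of an existing Linux kernel. Instead it [moves the whole Linux kernel into user space and uses it directly as a library](https://hackmd.io/@sysprog/linux-lkl), intercepts the system calls of the guest program, and forwards them to this in-process kernel or to the host kernel.

Architecturally this is a completely different design point from traditional sandboxes (containers, VMs, ptrace-based tools). The guest, the kernel, and the dispatcher can even coexist in the same address space, while syscall interception is chosen dynamically among three mechanisms: seccomp, SIGSYS trap, or binary rewriting.

The biggest problem of traditional sandboxes is incomplete syscall semantics. proot and gVisor, for example, have to "emulate Linux", and such emulation inevitably deviates in edge cases. kbox handles syscalls with the real Linux kernel, which turns the compatibility problem into a non-problem. This matters especially when running complex toolchains, multi-threaded programs, or unusual syscall usage. Another property is that it is rootless and needs no container or VM. It does not depend on namespaces, daemons, or a hypervisor, and does not even need ptrace, so it remains usable in restricted environments (for example a shared host, Termux, or a CI sandbox). This actually makes it closer to a "library OS" than to a "container runtime".

Because the kernel and the guest live in the same process, kbox can read the guest's /proc directly, provide a per-syscall event stream, and even ships a built-in web observatory and GDB hooks. Such a "kernel with built-in observability" hardly exists in traditional sandboxes.

This shift is particularly interesting for AI agent sandboxing, because the needs of an AI agent differ fundamentally from those of a traditional sandbox. Traditional sandboxes focus on isolation, whereas an AI agent cares more about three things: semantic correctness, observability, and controllability. kbox happens to address all three.

* Semantic correctness. AI agents often need to run real toolchains such as gcc, git, pip, or even a whole build system. Any syscall inconsistency can make the agent's behavior deviate from what is expected. kbox handles syscalls with a real kernel, which makes the agent's execution environment almost identical to production Linux. This is very important for letting an agent learn real-world tools.
* Observability. kbox's per-syscall trace and kernel-level introspection effectively provide a fine-grained execution trace. This has two potential uses for AI agents: explainability of tool use, where every syscall can be recorded and analyzed, and a training signal, for example learning which syscall patterns correspond to successful or failed operations.
* Controllability. kbox's dispatch layer is essentially a programmable syscall router. This means policy can be applied at the syscall level, for example restricting file access, rewriting network behavior, or even simulating certain resources. For AI agent sandboxing this is finer-grained and more composable than traditional coarse-grained isolation such as containers.

kbox is not a highly isolating sandbox; it is closer to a rootless execution environment. In trap or rewrite mode the guest and the host side share one address space, which is a risk for containing malicious code. If an AI agent has to run untrusted code, an outer sandbox (for example a VM or strict seccomp mode) may still be needed. The second concern is attack surface. Pulling the kernel into user space exposes the kernel's attack surface directly in the application context. Although LKL itself reduces complexity and risk, problems in the kernel itself still have to be considered.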
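The "programmable syscall router" idea from the controllability point can be pictured with a toy first-match policy table. This is only a sketch under loud assumptions: none of the names below (`verdict`, `rule`, `policy`, `route`) come from kbox, no real interception is performed, and syscalls are identified by plain strings purely for readability.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical verdicts; kbox's actual dispatcher is not shown here. */
enum verdict { ALLOW_GUEST_KERNEL, FORWARD_TO_HOST, DENY };

struct rule {
    const char *syscall_name; /* e.g. "openat" */
    const char *path_prefix;  /* NULL matches any (or no) path */
    enum verdict verdict;
};

/* Illustrative policy: rules are tried in order, first match wins. */
static const struct rule policy[] = {
    {"openat", "/etc/", DENY},
    {"openat", NULL, ALLOW_GUEST_KERNEL},
    {"connect", NULL, DENY},
    {"clock_gettime", NULL, FORWARD_TO_HOST},
};

/* Returns the verdict of the first matching rule, or DENY if none match. */
enum verdict route(const char *name, const char *path)
{
    for (size_t i = 0; i < sizeof(policy) / sizeof(policy[0]); i++) {
        const struct rule *r = &policy[i];
        if (strcmp(r->syscall_name, name) != 0)
            continue;
        if (r->path_prefix &&
            (!path ||
             strncmp(path, r->path_prefix, strlen(r->path_prefix)) != 0))
            continue;
        return r->verdict;
    }
    return DENY; /* default-deny for anything not listed */
}
```

With this table, `route("openat", "/etc/passwd")` is denied while `route("openat", "/tmp/x")` goes to the in-process kernel. A default-deny fallback is one possible design choice; it trades compatibility for a smaller surface when the policy is incomplete.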
