---
title: ByteWise - Breakdown of `BUILD_BUG_ON_ZERO`
tags: bytewise, fml, integer
---
# Breakdown of [`BUILD_BUG_ON_ZERO`](https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/tools/include/linux/build_bug.h#L16)
If you are as ignorant as I was and think the Linux kernel has none of the software testing mechanisms we commonly see, then you are making a big mistake. On the contrary, the Linux kernel does it more delicately than you might imagine. Without further ado, let us get into one of its **compile-time** assertions, which is built on the rules of the **bit-field**.
The macro does not look straightforward at first sight, but after breaking it down, we will see and feel the beauty of it.
:::info
:brain: That is either genius, or a seriously diseased mind.
:::
At its core the macro expands to the expression below, so we need to know a little about the **bit-field** before going further.
```c
sizeof(struct { int : -!!(e); })
```
## Bit-field
Let us quote from [Wikipedia](https://en.wikipedia.org/wiki/Bit_field):
> A bit field is a common idiom used in computer programming to compactly store multiple logical values as a short series of bits where each of the single bits can be addressed separately.
[...]
>```c
>struct BoxProps
>{
> unsigned int opaque : 1;
> unsigned int fill_color : 3;
> unsigned int : 4; // fill to 8 bits
> unsigned int show_border : 1;
> unsigned int border_color : 3;
> unsigned int border_style : 2;
> unsigned char : 0; // fill to nearest byte (16 bits)
> unsigned char width : 4, // Split a byte into 2 fields of 4 bits
> height : 4;
>};
>```
Also, as always, let us extract some first-hand information from the C standard.
> #### § 6.7.2.1
> **4.)** The expression that specifies the width of a **bit-field** shall be an integer constant expression with a nonnegative value that does not exceed the width of an object of the type [...] If the value is zero, the declaration shall have no declarator.
>
> **9.)** A member of a structure or union may have any complete object type other than a variably modified type. In addition, a member may be declared to consist of a specified number of bits (including a sign bit, if any). Such a member is called a **bit-field**; its width is preceded by a colon.
>
> **12.)** A **bit-field** declaration with no declarator, but only a colon and a width, indicates an unnamed **bit-field**. As a special case, a **bit-field** structure member with a width of 0 indicates that no further **bit-field** is to be packed into the unit in which the previous **bit-field**, if any, was placed.
>
> **[126]** An unnamed **bit-field** structure member is useful for padding to conform to externally imposed layouts.
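
To make paragraph 12 a little more concrete, here is a minimal sketch that compares two hypothetical structs, `Packed` and `Split` (the names and helper functions are mine, not from the standard or the kernel). Bit-field layout is implementation-defined, so the exact sizes depend on the compiler and ABI and are deliberately not hard-coded here.

```c
#include <stddef.h>

/* Hypothetical example structs, for illustration only. */

/* Both 4-bit fields may share one storage unit. */
struct Packed {
    unsigned int a : 4;
    unsigned int b : 4;
};

/* The zero-width unnamed bit-field closes the unit that holds `a`,
 * so `b` has to start in a fresh unit. */
struct Split {
    unsigned int a : 4;
    unsigned int   : 0;
    unsigned int b : 4;
};

/* Helpers that report the sizes the compiler actually chose. */
size_t packed_size(void) { return sizeof(struct Packed); }
size_t split_size(void)  { return sizeof(struct Split); }
```

On a typical GCC x86-64 build `split_size()` should come out larger than `packed_size()`, but treat that as an observation about one ABI rather than a guarantee.
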
## Breakdown
```c
sizeof(struct { int : -!!(e); })
```
`e` is an expression. The double negation `!!(e)` normalizes its value to `0` or `1`, and the unary minus then leaves us with two cases:
1. if `e` is equal to `0`, `-!!(e)` is `0`
2. if `e` is non-zero, `-!!(e)` is `-1`
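
The two cases above can be sketched as a tiny function. `negated_truth` is a hypothetical helper used only for illustration; the real macro needs `e` to be a compile-time constant, while this function simply shows the arithmetic.

```c
/* Hypothetical helper: the same arithmetic the macro applies to `e`. */
int negated_truth(int e)
{
    /* !e   is 1 when e == 0, otherwise 0
     * !!e  is 0 when e == 0, otherwise 1
     * -!!e is 0 when e == 0, otherwise -1 */
    return -!!e;
}
```

Any non-zero input, positive or negative, collapses to `-1`, which is exactly the value that makes the bit-field width illegal.
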
Now it comes to the **bit-field** part
* Before jumping into the logical breakdown, let us check that it is a legal declaration. According to **(§ 6.7.2.1 12.)**, a declaration with only a colon and a width is an unnamed **bit-field**, and footnote **[126]** notes that such a member is useful for padding to conform to externally imposed layouts.
* Now let us break down its logical behavior
    1. According to **(§ 6.7.2.1 12.)**, a **bit-field** structure member with a width of `0` indicates that no further **bit-field** is to be packed into the unit in which the previous **bit-field**, if any, was placed. The declaration is therefore accepted, and with GCC the `sizeof` expression evaluates to `0`.
    2. According to **(§ 6.7.2.1 4.)**, the width of a **bit-field** shall be an integer constant expression with a **nonnegative** value, so if it is `-1`, the compiler will complain:
```sh
error: negative width in bit-field ‘<anonymous>’
3 | #define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int : -!!(e); }))
```
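
Putting both cases together, a minimal sketch could look like the following. The macro has the same shape as the one quoted in the error message above; `char_size_check` is a hypothetical caller. Strictly speaking, ISO C leaves a struct with no named members undefined, so this assumes GCC or Clang, which accept it and give it size `0`.

```c
#include <stddef.h>

/* Same shape as the definition shown in the compiler error above. */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int : -!!(e); }))

/* Hypothetical caller: `sizeof(char) != 1` is always false, so the
 * bit-field width is 0 and the whole expression evaluates to 0. */
size_t char_size_check(void)
{
    return BUILD_BUG_ON_ZERO(sizeof(char) != 1);
}

/* Uncommenting the function below should stop the build with a
 * "negative width in bit-field" error, because the condition is true:
 *
 * size_t broken_check(void)
 * {
 *     return BUILD_BUG_ON_ZERO(sizeof(char) == 1);
 * }
 */
```

Because the successful case yields `0`, the macro can be added to another expression without changing its value.
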
## Use case
```c
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))
```
`ARRAY_SIZE` is a straightforward and convenient macro to calculate the number of elements in an array, **BUT** instead of treating it as entry-level common sense, its pitfall should be emphasized more often.
:::success
:door: Portals
1. [ARR01-C. Do not apply the sizeof operator to a pointer when taking the size of an array](https://wiki.sei.cmu.edu/confluence/display/c/ARR01-C.+Do+not+apply+the+sizeof+operator+to+a+pointer+when+taking+the+size+of+an+array)
2. [Difference Between Arrays and Pointers](https://www.oreilly.com/library/view/understanding-and-using/9781449344535/ch04.html#DifferencesBetweenArraysAndPointersSection)
:::
Let us look at the following code and guess what the result is:
```c
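#include <stdio.h>

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))

int main(void)
{
    int test_arr[10];
    int *test_ptr = test_arr;

    /* sizeof yields a size_t, so print it with %zu */
    printf("%zu\n", ARRAY_SIZE(test_arr));
    printf("%zu\n", ARRAY_SIZE(test_ptr));
    return 0;
}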
```
:::spoiler **Answer**:
* **64-bit** platform
```bash
10
2
```
* **32-bit** platform
```bash
10
1
```
:::
<br>
The reason is that the `sizeof` operator works differently for an **array** and a **pointer**. `sizeof` applied to an array returns the number of bytes allocated to the array, while `sizeof` applied to a pointer returns the number of bytes taken by the pointer itself, typically *4 bytes on a 32-bit platform* and *8 bytes on a 64-bit platform*.
:::info
:pencil2: **Why `sizeof` operator works differently for array vs pointer?**
Remember, `sizeof` is a **compile-time operator** (variable length arrays aside), which means that any code written as its operand is **not evaluated/executed** [^1]; only the type of the operand matters. The type of `int arr[10]` carries the element count, so `sizeof` can infer the number of bytes taken by the whole array. The type of `int *arr_ptr` is just a pointer, so all `sizeof` can report is the size of the pointer itself.
:::
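
The same effect shows up as soon as an array is handed to a function, because only a pointer arrives on the other side. The helper names below are hypothetical and exist purely for illustration.

```c
#include <stddef.h>

/* Hypothetical helpers, for illustration only. */

/* Inside this function `values` has the array type int[10]. */
size_t bytes_seen_as_array(void)
{
    int values[10];
    return sizeof(values);   /* 10 * sizeof(int) */
}

/* Here only a pointer is visible; the length is not part of its type. */
size_t bytes_seen_as_pointer(const int *values)
{
    return sizeof(values);   /* size of the pointer itself */
}
```

Dividing the second result by `sizeof(int)` is what produces the surprising `2` or `1` in the answer above.
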
In Linux, this is one of the cases where `BUILD_BUG_ON_ZERO` comes to the rescue.
```c
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
```
`ARRAY_SIZE` adds a second term, `__must_be_array(arr)`, to check whether the input argument really is an array.
```c
#define __same_type(_a, _b) \
__builtin_types_compatible_p(__typeof__(_a), __typeof__(_b))
#define __must_be_array(a) \
BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
```
1. `__must_be_array(arr)` uses `BUILD_BUG_ON_ZERO` to check the `__same_type(...)` expression, which leads us to the `__same_type(_a, _b)` macro.
2. `__same_type(_a, _b)` uses [`__builtin_types_compatible_p`](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) to check whether the `__typeof__` of `_a` and the `__typeof__` of `_b` are compatible. It returns `1` if they are compatible; `0` otherwise.
3. The reasoning is
* if `a` is an array, `&(a)[0]` is a pointer to its first element, so the two types are not compatible and `__same_type((a), &(a)[0])` is `0`, making it a legal expression that contributes `0` to the `ARRAY_SIZE` macro
```c
sizeof(struct { int : 0; })
```
* if `a` is already a pointer, `&(a)[0]` has the same pointer type, so `__same_type((a), &(a)[0])` is `1`, hence the compiler complains about it.
```c
sizeof(struct { int : -1; })
```
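
For reference, the pieces above can be assembled into one self-contained sketch. It assumes GCC or Clang, since `__typeof__` and `__builtin_types_compatible_p` are compiler extensions, and it redefines the kernel macro names locally just for the demonstration. `count_test_arr` is a hypothetical caller.

```c
#include <stddef.h>

/* Local copies of the macros discussed above (GCC/Clang extensions). */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int : -!!(e); }))
#define __same_type(_a, _b) \
    __builtin_types_compatible_p(__typeof__(_a), __typeof__(_b))
#define __must_be_array(a) \
    BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
#define ARRAY_SIZE(arr) \
    (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))

/* Hypothetical caller: compiles because test_arr is a real array. */
size_t count_test_arr(void)
{
    int test_arr[10];

    // Uncommenting the next two lines should fail to compile with a
    // "negative width in bit-field" error, because test_ptr is a pointer:
    // int *test_ptr = test_arr;
    // return ARRAY_SIZE(test_ptr);

    return ARRAY_SIZE(test_arr);
}
```

The pointer misuse that silently printed `2` or `1` earlier now becomes a build failure instead of a wrong number at run time.
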
[^1]: Scaler Topics, *"sizeof() in C"*, https://www.scaler.com/topics/c/size-of-in-c/