contributed by < zoanana990 >
Follow-up questions:
Solution approach:
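From `__u.__c` we can see that the variable in question is `__c`. The data read is stored in `res`, and `res` is `__c`. The `default` case shows that any size other than the listed ones is copied directly into `__c`. From the layout of the `union`, `__c` needs to use the storage of `__val`; this property of a `union` means that whatever is copied to the address of `__c` is copied exactly into `__val`. `__c` must not occupy more space than `__val`, because `__c` is not the main member, so the element type of `__c` is `char`, which takes the least space. In addition, `__c` cannot be a pointer, because it needs fixed storage of its own, so the final answer is `char __c[1]`.

`READ_ONCE()` is f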
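The union reasoning can be sketched in user space roughly as follows. This is a minimal sketch modeled on the quiz code, not the kernel source: `READ_ONCE_SKETCH`, `read_once_size`, and this `barrier()` definition are names assumed here, and the macro relies on the GNU C extensions `__typeof__` and statement expressions (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compiler-only barrier (GNU C); it emits no machine instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* __c shares storage with __val and adds no size to the union. */
static_assert(sizeof(union { long __val; char __c[1]; }) == sizeof(long),
              "char __c[1] must not enlarge the union");

/* Copy size bytes from p to res, with one volatile load when possible. */
static inline void read_once_size(const volatile void *p, void *res, size_t size)
{
    switch (size) {
    case 1: *(uint8_t *)res  = *(const volatile uint8_t *)p;  break;
    case 2: *(uint16_t *)res = *(const volatile uint16_t *)p; break;
    case 4: *(uint32_t *)res = *(const volatile uint32_t *)p; break;
    case 8: *(uint64_t *)res = *(const volatile uint64_t *)p; break;
    default:
        /* Any other size: plain byte copy fenced by compiler barriers. */
        barrier();
        memcpy(res, (const void *)p, size);
        barrier();
    }
}

/*
 * Bytes written through __u.__c become the value of __u.__val.
 * An array (char __c[1]) decays to a pointer into the union itself,
 * whereas a char pointer member would point somewhere else.
 */
#define READ_ONCE_SKETCH(x)                              \
({                                                       \
    union { __typeof__(x) __val; char __c[1]; } __u;     \
    read_once_size(&(x), __u.__c, sizeof(x));            \
    __u.__val;                                           \
})
```

For example, `int v = READ_ONCE_SKETCH(shared);` reads an `int` named `shared` with a single 4-byte volatile load, while a 3-byte struct goes through the `default` branch.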
In short, ACCESS_ONCE() forces the variable to be treated as being a volatile type, even though it (like almost all variables in the kernel) is not declared that way. The problem reported by Christian is that GCC 4.6 and 4.7 will drop the volatile modifier if the variable passed into it is not of a scalar type. It works fine if x is an int, for example, but not if x has a more complicated type. For example, ACCESS_ONCE() is often used with page table entries, which are defined as having the pte_t type
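The main point of this article is that `ACCESS_ONCE` works correctly on scalar types such as `int`, `float`, and so on. When it is applied to a struct or another non-scalar type, a compiler bug (GCC 4.6, GCC 4.7) silently drops the `volatile` qualifier, which makes `ACCESS_ONCE` meaningless. This is not only a compiler problem; it also shows that the macro is not robust enough.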
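A minimal sketch of the idea behind the macro, assuming GNU C `__typeof__`. The name `ACCESS_ONCE_SKETCH` and the counter helpers are hypothetical and only illustrate the scalar case that the article says works correctly.

```c
#include <assert.h>

/* Access x through a volatile-qualified lvalue of the same type. */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical shared counter; note that it is not declared volatile. */
static int shared_counter;

static int read_counter_once(void)
{
    /* The cast makes this one access volatile, so the compiler must
     * emit the load here and may not reuse an earlier value. */
    return ACCESS_ONCE_SKETCH(shared_counter);
}

static void write_counter_once(int v)
{
    ACCESS_ONCE_SKETCH(shared_counter) = v;
}

static void access_once_demo(void)
{
    write_counter_once(7);
    assert(read_counter_once() == 7);
}
```

Only the individual access is treated as volatile; every other use of `shared_counter` can still be optimized normally.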
Christian started by looking for ways to work around the problem, only to be informed that normal kernel practice is to avoid working around compiler bugs whenever possible; instead, the buggy versions should simply be blacklisted in the kernel build system. But 4.6 and 4.7 are installed on a lot of systems; blacklisting them would inconvenience many users.
The article also discusses whether GCC 4.6 and GCC 4.7 should be blacklisted, but both versions were still widely used, so in the end this was not done.
One way of being less fragile would be to change the affected ACCESS_ONCE() calls to point to the scalar parts of the relevant non-scalar types. So, if code does something like:
Here the variable can be written like this:
It can of course also be written like this:
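The two forms can be sketched as follows. This assumes a simplified stand-in for the kernel's `pte_t` (a struct wrapping a single `unsigned long`); `pte_sketch_t`, `read_whole`, `read_scalar`, and `ACCESS_ONCE_SKETCH` are names invented for illustration.

```c
#include <assert.h>

#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Simplified stand-in for pte_t: a struct wrapping one scalar. */
typedef struct { unsigned long pte; } pte_sketch_t;

/* Non-scalar form: the whole struct goes through the volatile cast.
 * This is the pattern affected by the GCC 4.6/4.7 bug. */
static pte_sketch_t read_whole(pte_sketch_t *ptep)
{
    pte_sketch_t p = ACCESS_ONCE_SKETCH(*ptep);
    return p;
}

/* Scalar form: only the unsigned long member is accessed. */
static unsigned long read_scalar(pte_sketch_t *ptep)
{
    unsigned long v = ACCESS_ONCE_SKETCH(ptep->pte);
    return v;
}

static void pte_demo(void)
{
    pte_sketch_t entry = { 0x1234UL };
    assert(read_whole(&entry).pte == read_scalar(&entry));
}
```

Both functions return the same value with a correct compiler; the difference is only in which type the `volatile` qualifier is attached to.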
This type of change requires auditing all ACCESS_ONCE() calls, though, to find the ones using non-scalar types; that would be a lengthy and error-prone process that would not prevent the addition of new bugs in the future.
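Replacing every `pte_t` with `unsigned int` at this point would be very laborious and impractical. It would indeed avoid the bug, but every existing use of the macro would have to be audited, which is a lengthy and error-prone process.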
Another approach to the problem explored by Christian was to remove a number of problematic ACCESS_ONCE() calls and just put in a compiler barrier with barrier() instead. In many cases, a barrier is sufficient, but in others it is not. Once again, a detailed audit is required, and there is nothing preventing new code from adding buggy ACCESS_ONCE() calls.
The article therefore mentions replacing the problematic `ACCESS_ONCE` calls with `barrier()`, but this does not solve every case, which led to `READ_ONCE` being proposed.
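As for what `barrier()` is, according to Wikipedia: a compiler normally optimizes a program while compiling it, and `barrier()` acts like a fence, so that the code before `barrier()` and the code after `barrier()` are not mixed together by compile-time reordering.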
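A compiler barrier of this kind is commonly written in GNU C as an empty inline assembly statement with a `"memory"` clobber. The sketch below assumes GCC or Clang; `data_sketch`, `ready_sketch`, and `publish` are hypothetical names.

```c
#include <assert.h>

/* Empty asm with a "memory" clobber: the compiler must assume that
 * all memory may have changed, so it cannot move loads or stores
 * across this point. No machine instruction is emitted. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static int data_sketch;   /* hypothetical payload */
static int ready_sketch;  /* hypothetical "payload is valid" flag */

static void publish(int value)
{
    data_sketch = value;
    barrier();            /* keeps the two stores in program order */
    ready_sketch = 1;
}

static void barrier_demo(void)
{
    publish(5);
    assert(ready_sketch == 1 && data_sketch == 5);
}
```

This only constrains the compiler. It does not stop the CPU itself from reordering memory accesses, which is what hardware memory barriers are for.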
TODO
Like volatile, the kernel primitives which make concurrent access to data safe (spinlocks, mutexes, memory barriers, etc.) are designed to prevent unwanted optimization. If they are being used properly, there will be no need to use volatile as well. If volatile is still necessary, there is almost certainly a bug in the code somewhere. In properly-written kernel code, volatile can only serve to slow things down.
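The original text says that the Linux kernel has many mechanisms that, like `volatile`, prevent unwanted compiler optimization. When these mechanisms are used properly there is no need for `volatile`; if it is still needed, there is almost certainly a bug in the code.
Consider the following code: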
If shared_data were declared volatile, the locking would still be necessary. But the compiler would also be prevented from optimizing access to shared_data within the critical section, when we know that nobody else can be working with it. While the lock is held, shared_data is not volatile. When dealing with shared data, proper locking makes volatile unnecessary - and potentially harmful.
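Even without `volatile`, `spin_lock` already prevents unwanted compiler optimization around the critical section, which makes `volatile` redundant and potentially harmful.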
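The locking pattern can be sketched in user space with a C11 `atomic_flag`. This is an assumption-laden stand-in for the kernel's spinlock, not its implementation: `spin_lock_sketch`, `spin_unlock_sketch`, and `add_twice` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_flag the_lock = ATOMIC_FLAG_INIT;
static long shared_data;   /* deliberately not volatile */

static void spin_lock_sketch(atomic_flag *l)
{
    /* Acquire ordering: accesses after the lock cannot move before it. */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;  /* spin until the current holder clears the flag */
}

static void spin_unlock_sketch(atomic_flag *l)
{
    /* Release ordering: accesses before the unlock cannot move after it. */
    atomic_flag_clear_explicit(l, memory_order_release);
}

static long add_twice(long n)
{
    long result;

    spin_lock_sketch(&the_lock);
    shared_data += n;      /* inside the critical section the compiler */
    shared_data += n;      /* is free to merge these two updates       */
    result = shared_data;
    spin_unlock_sketch(&the_lock);
    return result;
}
```

Declaring `shared_data` as `volatile` would force two separate read-modify-write sequences inside the lock without making the code any safer.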
Another example is a processor busy-waiting on the value of a variable. Consider the following code:
The cpu_relax() call can lower CPU power consumption or yield to a hyperthreaded twin processor; it also happens to serve as a compiler barrier, so, once again, volatile is unnecessary.
Here we can see that `cpu_relax()` not only lowers CPU power consumption but also acts as a compiler barrier, so `volatile` is again unnecessary.
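A user-space sketch of such a polling loop follows. It models only the compiler-barrier part of `cpu_relax()`; `cpu_relax_sketch` and `wait_for_value` are hypothetical names, and the `max_spins` limit is an addition so that the example terminates when nothing else updates the variable.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for cpu_relax(): only its compiler-barrier effect. */
#define cpu_relax_sketch() __asm__ __volatile__("" ::: "memory")

static int my_variable;    /* not volatile */

/* Poll until my_variable equals want; give up after max_spins polls. */
static bool wait_for_value(int want, unsigned long max_spins)
{
    while (my_variable != want) {
        if (max_spins-- == 0)
            return false;
        /* The barrier forces my_variable to be reloaded from memory
         * on the next iteration instead of being cached in a register. */
        cpu_relax_sketch();
    }
    return true;
}

static void busy_wait_demo(void)
{
    my_variable = 3;
    assert(wait_for_value(3, 10));
}
```

Without the barrier, the compiler could legally load `my_variable` once and spin on the stale value.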
However, there are still situations in the Linux kernel where `volatile` is needed; the original text gives four examples:
volatile
The global variable jiffies holds the number of ticks that have occurred since the system booted. On boot, the kernel initializes the variable to zero, and it is incremented by one during each timer interrupt. Thus, because there are HZ timer interrupts in a second, there are HZ jiffies in a second. The system uptime is therefore jiffies/HZ seconds.
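The uptime arithmetic in the quote can be sketched as below. The tick rate of 250 and all `_sketch` names are assumptions made for illustration; in the kernel `HZ` is a build-time setting.

```c
#include <assert.h>

#define HZ_SKETCH 250UL    /* assumed tick rate */

/* Stand-in for jiffies: updated asynchronously by the timer interrupt
 * in the kernel, hence volatile. */
static volatile unsigned long jiffies_sketch;

/* What each timer interrupt does to the counter. */
static void timer_tick_sketch(void)
{
    jiffies_sketch++;
}

/* Uptime in whole seconds: jiffies / HZ. */
static unsigned long uptime_seconds(void)
{
    return jiffies_sketch / HZ_SKETCH;
}

static void jiffies_demo(void)
{
    for (unsigned long i = 0; i < 2 * HZ_SKETCH; i++)
        timer_tick_sketch();
    assert(uptime_seconds() == 2);
}
```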
For most code, none of the above justifications for volatile apply. As a result, the use of volatile is likely to be seen as a bug and will bring additional scrutiny to the code. Developers who are tempted to use volatile should take a step back and think about what they are truly trying to accomplish.
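Most code in the Linux kernel does not meet any of the four conditions above for using `volatile`, so using `volatile` carelessly may be harmful. The end of the article also notes that patches removing `volatile` are welcome, provided they come with proper justification.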
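Two parts of this article are most relevant to this problem: "Memory Ordering and Barrier" and "Processor architectures and their Memory Order". Their use cases are discussed separately here.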
To trace how the `READ_ONCE` / `WRITE_ONCE` macros evolved with `git log`, use the command:
The log shows 9 commits in total, 4 of which improve the code. None of the results resembles the code in the quiz, so an additional search was done.
Looking directly at the source code:
What appears there is not the quiz code, so I looked into the purpose of the `__unqual_scalar_typeof(x)` macro. It is defined in `include/linux/compiler_types.h` as follows:
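A simplified sketch in the spirit of that macro is given here. It is not a copy of the kernel definition, which may differ between versions; `UNQUAL_SCALAR_TYPEOF_SKETCH` and `SCALAR_CASES_SKETCH` are names assumed for this note, and it relies on `__typeof__` (GNU C) together with C11 `_Generic`.

```c
#include <assert.h>

/* Expands to the unsigned and signed associations for one base type. */
#define SCALAR_CASES_SKETCH(type)          \
    unsigned type: (unsigned type)0,       \
    signed type: (signed type)0

/*
 * The controlling expression of _Generic loses its qualifiers, so a
 * "volatile int" matches the "signed int" association, whose result
 * expression has the plain type. Non-scalar types fall through to
 * default and keep their original type.
 */
#define UNQUAL_SCALAR_TYPEOF_SKETCH(x) __typeof__(   \
    _Generic((x),                                    \
        char: (char)0,                               \
        SCALAR_CASES_SKETCH(char),                   \
        SCALAR_CASES_SKETCH(short),                  \
        SCALAR_CASES_SKETCH(int),                    \
        SCALAR_CASES_SKETCH(long),                   \
        SCALAR_CASES_SKETCH(long long),              \
        default: (x)))

static volatile int vol_counter = 5;

static int copy_unqualified(void)
{
    /* tmp is a plain int, so later uses of tmp are ordinary accesses. */
    UNQUAL_SCALAR_TYPEOF_SKETCH(vol_counter) tmp = vol_counter;
    assert(tmp == 5);
    return tmp;
}
```

The point of stripping the qualifier is that a temporary declared with plain `__typeof__` would itself be `volatile`, forcing extra memory accesses on a local copy.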
At this point `_Generic` was still something I did not understand, so I searched for it and found that section 6.5.1.1, Generic selection, of the C11 specification defines `_Generic`.
Constraints:
- A generic selection shall have no more than one default generic association. The type name in a generic association shall specify a complete object type other than a variably modified type. No two generic associations in the same generic selection shall specify compatible types. The controlling expression of a generic selection shall have type compatible with at most one of the types named in its generic association list. If a generic selection has no default generic association, its controlling expression shall have type compatible with exactly one of the types named in its generic association list.
Semantics:
- The controlling expression of a generic selection is not evaluated. If a generic selection has a generic association with a type name that is compatible with the type of the controlling expression, then the result expression of the generic selection is the expression in that generic association. Otherwise, the result expression of the generic selection is the expression in the default generic association. None of the expressions from any other generic association of the generic selection is evaluated.
- The type and value of a generic selection are identical to those of its result expression. It is an lvalue, a function designator, or a void expression if its result expression is, respectively, an lvalue, a function designator, or a void expression.
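The semantics above can be illustrated with a small type-dispatch macro. `TYPE_NAME_SKETCH` and `generic_demo` are hypothetical names; the example needs a C11 compiler.

```c
#include <assert.h>
#include <string.h>

/* Selects a string according to the type of x; x is never evaluated. */
#define TYPE_NAME_SKETCH(x) _Generic((x), \
    int: "int",                           \
    double: "double",                     \
    default: "other")

static int generic_demo(void)
{
    int i = 0;

    /* The controlling expression i++ only supplies a type (int);
     * it is not evaluated, so i keeps its value. */
    const char *name = TYPE_NAME_SKETCH(i++);

    assert(strcmp(name, "int") == 0);
    return i;
}
```

A `float` argument has no matching association here, so it selects the `default` branch, and at most one `default` is allowed per generic selection.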