contributed by < Julian-Chu >
linux2021
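The design shares many ideas with xs from quiz3: both decide between stack and heap storage based on the required size. vector goes further by being generic over element types (xs targets strings only), and it also has to handle repeated push and pop operations, so the key design points are how to reuse the existing space (buf) and which growth strategy to use.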
quiz3's xs union
First look at __attribute__, which is used to control various properties of variables
__attribute__
The keyword __attribute__ allows you to specify special properties of variables, function parameters, or structure, union, and, in C++, class members. This __attribute__ keyword is followed by an attribute specification enclosed in double parentheses. Some attributes are currently defined generically for variables. Other attributes are defined for variables on particular target systems.
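What is used here is cleanup, one of the common variable attributes: when the variable leaves the scope in which it was declared, the cleanup function runs automatically. This property can be used to implement auto free or a smart pointer.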
cleanup
The cleanup attribute runs a function when the variable goes out of scope. This attribute can only be applied to auto function scope variables; it may not be applied to parameters or variables with static storage duration. The function must take one parameter, a pointer to a type compatible with the variable. The return value of the function (if any) is ignored.
If -fexceptions is enabled, then cleanup_function is run during the stack unwinding that happens during the processing of the exception. Note that the cleanup attribute does not allow the exception to be caught, only to perform an action. It is undefined what happens if cleanup_function does not return normally.
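A minimal sketch of the auto free idea, assuming GCC or Clang. The names free_charp, AUTO_FREE, freed_count, and copy_and_measure are hypothetical and are not taken from the quiz code; the counter exists only so the effect of the handler can be observed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter so the effect of the cleanup handler is observable. */
static int freed_count = 0;

/* The handler receives a pointer to the annotated variable (here char **). */
static void free_charp(char **p)
{
    free(*p); /* free(NULL) is a no-op, so a failed malloc is harmless */
    *p = NULL;
    freed_count++;
}

/* Hypothetical convenience macro wrapping the GNU C attribute. */
#define AUTO_FREE __attribute__((cleanup(free_charp)))

/* Returns the length of a heap copy of s. The copy is released
 * automatically when buf goes out of scope, on every return path. */
static size_t copy_and_measure(const char *s)
{
    AUTO_FREE char *buf = malloc(strlen(s) + 1);
    if (!buf)
        return 0;
    strcpy(buf, s);
    return strlen(buf); /* evaluated before the cleanup handler runs */
}
```

Calling copy_and_measure("hello") returns 5 and runs the handler once, without an explicit free in the function body.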
_Static_assert
The constant expression is evaluated at compile time and compared to zero. If it compares equal to zero, a compile-time error occurs and the compiler must display message (if provided) as part of the error message (except that characters not in basic source character set aren't required to be displayed).
Otherwise, if expression does not equal zero, nothing happens; no code is emitted.
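A small illustration of _Static_assert at file scope and at block scope, assuming a C11 compiler. The struct small_buf and the function small_buf_capacity are hypothetical and only serve to give the assertions something to check.

```c
#include <limits.h>

/* Hypothetical layout used only for illustration. */
struct small_buf {
    unsigned char size;
    char data[15];
};

/* File scope: compilation fails if the layout ever changes size. */
_Static_assert(sizeof(struct small_buf) == 16, "small_buf must stay 16 bytes");

static int small_buf_capacity(void)
{
    /* Block scope is allowed too; no code is emitted for the check. */
    _Static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
    return (int) sizeof(((struct small_buf *) 0)->data);
}
```

If either condition were false, the build would stop with the given message instead of producing a binary that misbehaves at run time.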
q1: I tested that using _Static_assert alone also works. What is the reason for using struct, dummy, and void?
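q2: In the code that computes the array size, why is an extra zero inserted and then one subtracted at the end? At first glance I assumed an empty array would be a problem, but it still works after removing the 0.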
Try passing a pointer type as the input and check again.
jserv
bit field technique
Computing the round up to the next power of 2 relies on two techniques:
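The original listing is not reproduced in these notes. As a sketch only, assuming the two techniques are the s & (s - 1) power-of-two test and a count-leading-zeros builtin, rounding up could look like the following. The function names are hypothetical, and __builtin_clzll is a GCC/Clang builtin.

```c
#include <stddef.h>

/* Nonzero when s is a power of two. Only meaningful for s != 0:
 * clearing the lowest set bit of a power of two leaves zero. */
static int is_power_of_2(size_t s)
{
    return (s & (s - 1)) == 0;
}

/* Rounds s up to the next power of two. Values whose result would not
 * fit in size_t are not handled in this sketch. */
static size_t next_power_of_2(size_t s)
{
    if (s <= 1)
        return 1;
    if (is_power_of_2(s))
        return s;
    /* __builtin_clzll counts leading zero bits of a 64-bit value and is
     * undefined for 0, which the checks above rule out. */
    return (size_t) 1 << (64 - __builtin_clzll((unsigned long long) s));
}
```

For example, 5 has 61 leading zeros in 64 bits, so the result is 1 << 3, which is 8.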
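The pop function is quite simple: it only decrements size by 1 and does not touch the existing memory. Only on push is size used to decide whether to allocate memory or to overwrite existing content.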
First, look at NON_NULL, which is used by the private reserve and push functions.
Similar to cleanup, this uses nonnull from the common function attributes.
The nonnull attribute may be applied to a function that takes at least one argument of a pointer type. It indicates that the referenced arguments must be non-null pointers.
It is used to check at compile time that the arguments passed to the function are not NULL.
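A short sketch of the attribute in use, assuming GCC or Clang. The macro spelling mirrors the NON_NULL name mentioned above, while count_nonzero is a hypothetical function. Note that the compile-time diagnostic (-Wnonnull, part of -Wall) only fires when the compiler can see that an argument is NULL; a null pointer computed at run time is not caught.

```c
#include <stddef.h>

#define NON_NULL __attribute__((nonnull))

/* With no argument list, nonnull applies to every pointer parameter.
 * A call such as count_nonzero(NULL, 3) draws a -Wnonnull warning. */
static NON_NULL size_t count_nonzero(const int *a, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] != 0)
            count++;
    return count;
}
```

Because the compiler may also assume the pointer is never NULL when optimizing, a manual NULL check inside such a function can be removed, so the attribute is a contract rather than a run-time guard.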
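The reserve function allocates new memory when capacity is smaller than the argument n and copies the existing data into the new memory. If the original storage was on the stack, the data is forcibly moved to the heap.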
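The behaviour described above can be sketched with a simplified int-only vector. Everything here (struct ivec, its fields, and the ivec_ functions) is hypothetical and much simpler than the quiz's generic union-based layout; it only illustrates the move from inline storage to the heap.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified vector of int with a small inline buffer. */
struct ivec {
    size_t size, capacity;
    bool on_heap;
    int *ptr;   /* valid only when on_heap is true */
    int buf[4]; /* inline storage, lives wherever the struct lives */
};

static void ivec_init(struct ivec *v)
{
    v->size = 0;
    v->capacity = 4;
    v->on_heap = false;
    v->ptr = NULL;
}

static int *ivec_data(struct ivec *v)
{
    return v->on_heap ? v->ptr : v->buf;
}

/* Grows capacity to at least n. Returns false if allocation fails.
 * Overflow of n * sizeof(int) is not checked in this sketch. */
static bool ivec_reserve(struct ivec *v, size_t n)
{
    if (n <= v->capacity)
        return true; /* enough room already, nothing to do */
    if (v->on_heap) {
        int *p = realloc(v->ptr, n * sizeof(int));
        if (!p)
            return false;
        v->ptr = p;
    } else {
        /* Inline storage cannot grow: copy the elements to the heap. */
        int *p = malloc(n * sizeof(int));
        if (!p)
            return false;
        memcpy(p, v->buf, v->size * sizeof(int));
        v->ptr = p;
        v->on_heap = true;
    }
    v->capacity = n;
    return true;
}
```

After a successful ivec_reserve with n larger than the inline capacity, the caller owns heap memory and must eventually free v->ptr.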
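Regarding the high execution cost of ilog_factor: since high precision is not required here, the following approaches can be tried.
- log2f(FACTOR) is a constant, so its result can be written directly in a #define.
- x/log2f(FACTOR) uses floating-point division, which costs more than multiplication. It can be replaced by multiplying by the reciprocal of log2f(FACTOR): x * 1.70951129.
- For log2f(n): if the capacity range is bounded, as in folly::fbvector, the values for n within that range can be precomputed and then looked up in a table for the repeated computations. If the capacity range is not bounded, an approximation formula can be considered. The two links below are reference material (still to be studied).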
https://tech.ebayinc.com/engineering/fast-approximate-logarithms-part-iii-the-formulas/
https://github.com/etheory/fastapprox/blob/master/fastapprox/src/fastlog.h
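One possible way to realize the table lookup idea, sketched under two assumptions: FACTOR is 1.5 (consistent with the reciprocal 1.70951129 above) and the number of growth steps is bounded by a hypothetical MAX_STEPS. Instead of tabulating log2f(n), this variant tabulates the powers of FACTOR and searches for the smallest exponent that reaches n, which avoids logarithms and division entirely. All names are hypothetical.

```c
#include <stddef.h>

#define FACTOR 1.5f  /* assumption: growth factor of 1.5 */
#define MAX_STEPS 64 /* hypothetical bound on the number of growth steps */

static float factor_pow[MAX_STEPS]; /* factor_pow[k] == FACTOR^k once built */

static void build_factor_table(void)
{
    float p = 1.0f;
    for (size_t k = 0; k < MAX_STEPS; k++) {
        factor_pow[k] = p;
        p *= FACTOR;
    }
}

/* Returns the smallest k with FACTOR^k >= n, i.e. ceil(log_FACTOR(n))
 * for n >= 1. Results are clamped to MAX_STEPS - 1 for very large n. */
static size_t ilog_factor_lut(float n)
{
    if (factor_pow[0] == 0.0f) /* table not built yet */
        build_factor_table();

    size_t lo = 0, hi = MAX_STEPS - 1;
    while (lo < hi) { /* binary search over the increasing table */
        size_t mid = lo + (hi - lo) / 2;
        if (factor_pow[mid] >= n)
            hi = mid;
        else
            lo = mid + 1;
    }
    return lo;
}
```

With 64 entries the search takes at most six comparisons, and the table costs 256 bytes.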
https://gcc.gnu.org/onlinedocs/cpp/Standard-Predefined-Macros.html
__LINE__
This macro expands to the current input line number, in the form of a decimal integer constant. While we call it a predefined macro, it’s a pretty strange macro, since its “definition” changes with each new line of source code.
##name##line
concatenation
References
https://gcc.gnu.org/onlinedocs/cpp/Concatenation.html
你所不知道的 C 語言:前置處理器應用篇
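A small sketch of why the concatenation goes through a helper macro. Operands of ## are pasted without being macro-expanded first, so __LINE__ has to pass through one extra macro level before it reaches the paste. The macro and function names here are hypothetical; stringification is used only to make the generated identifier visible.

```c
#include <string.h>

#define PASTE3(name, line) demo_##name##line
#define PASTE2(name, line) PASTE3(name, line) /* expands __LINE__ first */
#define UNIQUE(name) PASTE2(name, __LINE__)

#define STR2(x) #x
#define STR(x) STR2(x)

/* Pasting directly keeps the token __LINE__ as text instead of a number. */
static int direct_paste_is_literal(void)
{
    return strcmp(STR(PASTE3(tmp, __LINE__)), "demo_tmp__LINE__") == 0;
}

/* Going through PASTE2 yields demo_tmp followed by the current line number. */
static const char *unique_name(void)
{
    return STR(UNIQUE(tmp));
}
```

Two uses of UNIQUE(tmp) on different source lines therefore produce different identifiers, which is what makes the generated labels unique.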
&&
Labels as Values
https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
你所不知道的C語言: goto 和流程控制篇
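A minimal sketch of the Labels as Values extension (GNU C, so GCC or Clang is assumed): && takes the address of a label and goto * jumps to a stored address. The tiny opcode interpreter and its names are hypothetical; opcodes outside 0 to 2 are not handled.

```c
/* Hypothetical opcodes: 0 = halt, 1 = increment, 2 = double. */
static int eval(const unsigned char *ops, int acc)
{
    /* Table of label addresses, indexed by opcode. */
    static void *const dispatch[] = {&&op_halt, &&op_inc, &&op_double};

    goto *dispatch[*ops]; /* computed goto to the first handler */

op_inc:
    acc += 1;
    goto *dispatch[*++ops];
op_double:
    acc *= 2;
    goto *dispatch[*++ops];
op_halt:
    return acc;
}
```

Each handler jumps straight to the next one through the table, so there is no central switch statement in the loop.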
cr_label
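For each coroutine, the macro expands into label blocks inside the function scope of each loop. It uses __LINE__ to generate unique labels and, together with the goto *(o)->label inside cr_begin, jumps to the corresponding code block.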
stdin_loop after expansion by gcc
The main function can be divided into two main parts:
Checking the state of the file descriptors and setting the O_NONBLOCK flag
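fcntl takes two fixed parameters and one optional variadic argument: the file descriptor, the command to perform (cmd, an int constant such as F_GETFL or F_SETFL, not a function pointer), and the argument passed to that command.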
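Setting O_NONBLOCK: select may spuriously report a socket file descriptor as ready, so a subsequent read could still block. Setting O_NONBLOCK ensures that operations on the socket FD never block.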
man select
Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks. This could for example happen when data has arrived but upon examination has wrong checksum and is discarded. There may be other circumstances in which a file descriptor is spuriously reported as ready. Thus it may be safer to use O_NONBLOCK on sockets that should not block.
FCNTL(2)
int fcntl(int fd, int cmd, ... /* arg */ );
DESCRIPTION
fcntl() performs one of the operations described below on the open file descriptor fd. The operation is determined by cmd.
fcntl(fd, F_SETFL, flags | O_NONBLOCK)
fcntl() can take an optional third argument. Whether or not this argument is required is determined by cmd. The required argument type is indicated in parentheses after
each cmd name (in most cases, the required type is int, and we identify the argument using the name arg), or void is specified if the argument is not required.
F_SETFL and F_GETFL
File status flags
Each open file description has certain associated status flags, initialized by open(2) and possibly modified by fcntl(). Duplicated file descriptors (made with dup(2), fcntl(F_DUPFD), fork(2), etc.) refer to the same open file description, and thus share the same file status flags.
The file status flags and their semantics are described in open(2).
F_GETFL (void)
Return (as the function result) the file access mode and the file status flags; arg is ignored.
F_SETFL (int)
Set the file status flags to the value specified by arg. File access mode (O_RDONLY, O_WRONLY, O_RDWR) and file creation flags (i.e., O_CREAT, O_EXCL, O_NOCTTY,
O_TRUNC) in arg are ignored. On Linux, this command can change only the O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME, and O_NONBLOCK flags. It is not possible to change
the O_DSYNC and O_SYNC flags; see BUGS, below.
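The F_GETFL then F_SETFL sequence quoted above can be wrapped as follows. The helper names set_nonblocking and is_nonblocking are hypothetical. Reading the current flags first matters because F_SETFL replaces the whole set of status flags rather than adding to it.

```c
#include <fcntl.h>
#include <unistd.h>

/* Adds O_NONBLOCK to the status flags of fd.
 * Returns 0 on success, -1 on failure (errno is set by fcntl). */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL); /* arg is ignored for F_GETFL */
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Returns 1 if O_NONBLOCK is set on fd, 0 otherwise (including errors). */
static int is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    return flags != -1 && (flags & O_NONBLOCK) != 0;
}
```

Since the flags belong to the open file description, a descriptor duplicated with dup(2) sees the same O_NONBLOCK setting.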
man socket: note the different domains and types
int socket(int domain, int type, int protocol);
socket() creates an endpoint for communication and returns a file descriptor that refers to that endpoint. The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open for the process.
The domain argument specifies a communication domain; this selects the protocol family which will be used for communication. These families are defined in <sys/socket.h>.
The formats currently understood by the Linux kernel include:
Name Purpose Man page
AF_UNIX Local communication unix(7)
AF_LOCAL Synonym for AF_UNIX
AF_INET IPv4 Internet protocols ip(7)
...
The socket has the indicated type, which specifies the communication semantics. Currently defined types are:
SOCK_STREAM Provides sequenced, reliable, two-way, connection-based byte streams. An out-of-band data transmission mechanism may be supported.
SOCK_DGRAM Supports datagrams (connectionless, unreliable messages of a fixed maximum length).
...
man connect: note that the way connect is used may differ depending on the socket type
int connect(int sockfd, const struct sockaddr *addr,
socklen_t addrlen);
DESCRIPTION
The connect() system call connects the socket referred to by the file descriptor sockfd to the address specified by addr. The addrlen argument specifies the size of addr. The format of the address in addr is determined by the address space of the socket sockfd; see socket(2) for further details.
If the socket sockfd is of type SOCK_DGRAM, then addr is the address to which datagrams are sent by default, and the only address from which datagrams are received. If the socket is of type SOCK_STREAM or SOCK_SEQPACKET, this call attempts to make a connection to the socket that is bound to the address specified by addr.
Generally, connection-based protocol sockets may successfully connect() only once; connectionless protocol sockets may use connect() multiple times to change their association. Connectionless sockets may dissolve the association by connecting to an address with the sa_family member of sockaddr set to AF_UNSPEC (supported on Linux since kernel 2.2).
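To make the SOCK_DGRAM case concrete, here is a sketch that assumes a Linux host with a working loopback interface. The helper names and the port numbers used with them are hypothetical. For a datagram socket, connect only records a default peer, so no server needs to be listening and the call can be repeated.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sets 127.0.0.1:port as the default peer of sockfd.
 * Returns the result of connect(): 0 on success, -1 on error. */
static int set_default_peer(int sockfd, unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return connect(sockfd, (struct sockaddr *) &addr, sizeof(addr));
}

/* Dissolves the association by connecting to an AF_UNSPEC address. */
static int dissolve_peer(int sockfd)
{
    struct sockaddr addr;
    memset(&addr, 0, sizeof(addr));
    addr.sa_family = AF_UNSPEC;
    return connect(sockfd, &addr, sizeof(addr));
}
```

With a socket created by socket(AF_INET, SOCK_DGRAM, 0), set_default_peer can be called several times with different ports. On a SOCK_STREAM socket the same call would attempt a real connection and would generally succeed only once.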
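select monitors events on multiple fds. When there is no event or data it blocks, and it resumes only once an event fires or data is ready, which avoids burning CPU on repeated checks. Because it can watch several fds at once, select is used for I/O multiplexing, so that one blocking I/O operation does not hold up the others.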
man select
int select(int nfds, fd_set *restrict readfds,
fd_set *restrict writefds, fd_set *restrict exceptfds,
struct timeval *restrict timeout);
select() allows a program to monitor multiple file descriptors,
waiting until one or more of the file descriptors become "ready"
for some class of I/O operation (e.g., input possible). A file
descriptor is considered ready if it is possible to perform a
corresponding I/O operation (e.g., read(2), or a sufficiently
small write(2)) without blocking.
select() can monitor only file descriptors numbers that are less
than FD_SETSIZE; poll(2) and epoll(7) do not have this
limitation. See BUGS.
select(fd + 1, &fds, NULL, NULL, NULL);
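Why fd + 1 is used: nfds must be the highest-numbered file descriptor in any of the sets plus 1, because select checks descriptors up to nfds-1. See also select#BUGS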
According to POSIX, select() should check all specified file
descriptors in the three file descriptor sets, up to the limit
nfds-1. However, the current implementation ignores any file
descriptor in these sets that is greater than the maximum file
descriptor number that the process currently has open. According
to POSIX, any such file descriptor that is specified in one of
the sets should result in the error EBADF.
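A minimal sketch of the select call pattern on a single descriptor. The helper name wait_readable is hypothetical. Unlike the select(fd + 1, &fds, NULL, NULL, NULL) call in the notes, which blocks indefinitely, this version passes a timeout so that a demonstration cannot hang.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Waits up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * fd must be smaller than FD_SETSIZE. */
static int wait_readable(int fd, int timeout_ms)
{
    fd_set fds;
    struct timeval tv = {
        .tv_sec = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000,
    };

    FD_ZERO(&fds);    /* always start from an empty set */
    FD_SET(fd, &fds); /* select overwrites the set, so rebuild it per call */

    int n = select(fd + 1, &fds, NULL, NULL, &tv); /* nfds = highest fd + 1 */
    if (n <= 0)
        return n;
    return FD_ISSET(fd, &fds) ? 1 : 0;
}
```

On an empty pipe the call returns 0 after the timeout; once a byte has been written to the other end it returns 1 immediately.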
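A file descriptor set is an array of bit masks that records the corresponding fds.
Using fd / __NFDBITS as the array index and fd % __NFDBITS as the bit position gives the location of fd in the table.
The same technique can be seen in quiz3 xs_trim.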
The macros below are all bit operations, and work on the same principle as set_bit / check_bit in quiz3 xs_trim.
FD_ZERO()
This macro clears (removes all file descriptors from) set.
It should be employed as the first step in initializing a
file descriptor set.
FD_SET()
This macro adds the file descriptor fd to set. Adding a
file descriptor that is already present in the set is a
no-op, and does not produce an error.
FD_CLR()
This macro removes the file descriptor fd from set.
Removing a file descriptor that is not present in the set
is a no-op, and does not produce an error.
FD_ISSET()
select() modifies the contents of the sets according to
the rules described below. After calling select(), the
FD_ISSET() macro can be used to test if a file descriptor
is still present in a set. FD_ISSET() returns nonzero if
the file descriptor fd is present in set, and zero if it
is not.
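The index and bit position arithmetic can be illustrated with a toy set. This is not glibc's fd_set definition: the type toy_fd_set, the constants TOY_SETSIZE and TOY_NFDBITS, and the toy_ functions are all hypothetical stand-ins for FD_ZERO, FD_SET, FD_CLR, and FD_ISSET. Callers must keep fd within 0 and TOY_SETSIZE - 1.

```c
#include <limits.h>
#include <string.h>

#define TOY_SETSIZE 1024
#define TOY_NFDBITS ((int) (sizeof(unsigned long) * CHAR_BIT))

typedef struct {
    unsigned long bits[TOY_SETSIZE / (sizeof(unsigned long) * CHAR_BIT)];
} toy_fd_set;

static void toy_zero(toy_fd_set *s)
{
    memset(s, 0, sizeof(*s));
}

/* fd / TOY_NFDBITS picks the word, fd % TOY_NFDBITS picks the bit. */
static void toy_set(int fd, toy_fd_set *s)
{
    s->bits[fd / TOY_NFDBITS] |= 1UL << (fd % TOY_NFDBITS);
}

static void toy_clr(int fd, toy_fd_set *s)
{
    s->bits[fd / TOY_NFDBITS] &= ~(1UL << (fd % TOY_NFDBITS));
}

static int toy_isset(int fd, const toy_fd_set *s)
{
    return (int) ((s->bits[fd / TOY_NFDBITS] >> (fd % TOY_NFDBITS)) & 1UL);
}
```

With 64-bit words, fd 70 lands in word 1 at bit 6, so setting it leaves every other word untouched.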
See Linux 核心設計: 檔案系統概念及實作手法 - I/O 事件模型
poll, epoll
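The gcc-expanded stdin_loop is used to follow how the coroutine runs. The do ... while(0) here mainly serves to avoid the dangling else problem; it is removed to make the code easier to read.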
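On entering stdin_loop, cr_begin makes its decision based on the information carried by cr. If the status is CR_FINISHED, it returns immediately and does nothing. If there is no label information, execution follows the normal path, updating status and label along the way until the function returns; at that point cr holds the latest status and the label block from which the next run should start. On the next call to stdin_loop, provided the status is not CR_FINISHED, cr_begin jumps to the label block that was executed last time and continues from there. This repeats until stdin has been read completely or the status becomes CR_FINISHED.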
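The flow above can be traced with a reduced sketch of the macros. This is a simplified reconstruction under assumptions, not the quiz's exact code: the macro bodies, the CR_LINE helpers, and the two_step coroutine are written for illustration, and GNU C (labels as values) is required.

```c
#include <stddef.h>

enum { CR_BLOCKED = 0, CR_FINISHED = 1 };

struct cr {
    void *label; /* where to resume; NULL before the first run */
    int status;
};

#define CR_LINE3(name, line) cr_##name##line
#define CR_LINE2(name, line) CR_LINE3(name, line)
#define CR_LINE(name) CR_LINE2(name, __LINE__)

#define cr_begin(o)                     \
    do {                                \
        if ((o)->status == CR_FINISHED) \
            return;                     \
        if ((o)->label)                 \
            goto *(o)->label;           \
    } while (0)

/* Records the status and a unique resume point for this source line. */
#define cr_label(o, stat)                               \
    do {                                                \
        (o)->status = (stat);                           \
        CR_LINE(label) : (o)->label = &&CR_LINE(label); \
    } while (0)

#define cr_wait(o, cond)         \
    do {                         \
        cr_label(o, CR_BLOCKED); \
        if (!(cond))             \
            return;              \
    } while (0)

#define cr_end(o) cr_label(o, CR_FINISHED)

/* Hypothetical coroutine: advances *steps each time a condition is met. */
static void two_step(struct cr *o, const int *ready, int *steps)
{
    cr_begin(o);
    cr_wait(o, *ready >= 1);
    (*steps)++;
    cr_wait(o, *ready >= 2);
    (*steps)++;
    cr_end(o);
}
```

Calling two_step repeatedly while raising *ready from 0 to 2 resumes at the last cr_wait each time; after cr_end the status is CR_FINISHED and further calls return immediately. Local variables do not survive across calls, which is why state lives behind pointers here.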
read
returns 0: end of file
recv
returns 0: stream socket peer has performed an orderly shutdown
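A short sketch of treating a return value of 0 as end of file, shown for read on a blocking descriptor. The function name drain is hypothetical. On a nonblocking descriptor with no data available, read returns -1 with errno set to EAGAIN, which is a different condition from end of file and is reported as an error by this sketch.

```c
#include <errno.h>
#include <unistd.h>

/* Reads fd until end of file. Returns the total number of bytes read,
 * or -1 on error (including EAGAIN on a nonblocking descriptor). */
static long drain(int fd)
{
    char buf[64];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total; /* 0 means end of file, not "no data yet" */
        if (errno == EINTR)
            continue; /* interrupted by a signal, retry */
        return -1;
    }
}
```

For a pipe, end of file is reported only after every write end has been closed, so the writer must close its descriptor before drain can return.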
socket_read_loop and socket_write_loop follow the same logic; the only difference is that the file descriptor they operate on is a socket.