task        daemon
========    ==========
send_req    close_devfd
            close_anonfd
            daemon_read
            (for OPEN) copen (write on devfd)
            (for READ) CACHEFILES_IOC_READ_COMPLETE ioctl (on anonfd)
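When a process accessing an erofs file triggers an on-demand read, it adds a request to the xarray through cachefiles_ondemand_send_req(). This step is called "enqueue request" below. The three callers all end up in cachefiles_ondemand_send_req():
    cachefiles_ondemand_init_object
    cachefiles_ondemand_clean_object
    cachefiles_ondemand_read
        cachefiles_ondemand_send_req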
When the daemon closes devfd, the requests still in the xarray have to be flushed. This step is called "flush request" below.
    cachefiles_daemon_release
        cachefiles_flush_reqs
The enqueue request and flush request steps are synchronized through the CACHEFILES_DEAD bit of cache->flags:
    # flush request
    cachefiles_daemon_release
        set CACHEFILES_DEAD bit
        cachefiles_flush_reqs
            # flush request
    # enqueue request
    cachefiles_ondemand_send_req
        # if CACHEFILES_DEAD bit not set:
            # enqueue request
There are two orderings to watch out for here, however.
    /*
     * Stop enqueuing the request when daemon is dying. The
     * following two operations need to be atomic as a whole.
     *   1) check cache state, and
     *   2) enqueue request if cache is alive.
     * Otherwise the request may be enqueued after xarray has been
     * flushed, leaving the orphan request never being completed.
     *
     * CPU 1                           CPU 2
     * =====                           =====
     *                                 test CACHEFILES_DEAD bit
     * set CACHEFILES_DEAD bit
     * flush requests in the xarray
     *                                 enqueue the request
     */
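A spinlock (xarray->xa_lock) is therefore used to satisfy the atomicity requirement above: the enqueue path tests the bit and enqueues inside one critical section, and the flush path drains the xarray under the same lock.
    # enqueue request
    cachefiles_ondemand_send_req
        xa_lock
        # if CACHEFILES_DEAD bit not set:
            # enqueue request
        xa_unlock
    # flush request
    cachefiles_daemon_release
        set CACHEFILES_DEAD bit
        cachefiles_flush_reqs
            xa_lock
            # flush request
            xa_unlock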
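The locking scheme above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: `struct toy_cache`, `toy_send_req()`, `toy_daemon_release()`, the fixed-size array and the `ERR_IO` constant are all hypothetical stand-ins, and a C11 `atomic_flag` spinlock plays the role of xarray->xa_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_REQS 8      /* hypothetical capacity of the toy request table */
#define ERR_IO   (-5)   /* stands in for -EIO */

/* Toy stand-in for the cache: a lock, a DEAD flag and a request table. */
struct toy_cache {
    atomic_flag lock;   /* plays the role of xarray->xa_lock */
    atomic_bool dead;   /* plays the role of the CACHEFILES_DEAD bit */
    int reqs[MAX_REQS];
    int nr_reqs;
};

static void toy_cache_init(struct toy_cache *c)
{
    atomic_flag_clear(&c->lock);
    atomic_init(&c->dead, false);
    c->nr_reqs = 0;
}

static void toy_lock(struct toy_cache *c)
{
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;               /* spin until the lock is free */
}

static void toy_unlock(struct toy_cache *c)
{
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Enqueue side: test the flag and insert inside one critical section. */
static int toy_send_req(struct toy_cache *c, int req)
{
    int ret = 0;

    toy_lock(c);
    assert(c->nr_reqs >= 0 && c->nr_reqs <= MAX_REQS);
    if (atomic_load(&c->dead) || c->nr_reqs == MAX_REQS)
        ret = ERR_IO;   /* cache is dying (or table full): refuse */
    else
        c->reqs[c->nr_reqs++] = req;
    toy_unlock(c);
    return ret;
}

/* Flush side: mark the cache dead, then drain under the same lock.
 * Returns the number of requests that were flushed. */
static int toy_daemon_release(struct toy_cache *c)
{
    int flushed;

    atomic_store(&c->dead, true);
    toy_lock(c);
    flushed = c->nr_reqs;
    c->nr_reqs = 0;
    toy_unlock(c);
    return flushed;
}
```

Because the test of the flag and the insertion share one critical section with the drain, a request is either inserted before the drain (and flushed by it) or refused afterwards. The sketch uses a sequentially consistent store for the flag, so it says nothing yet about the ordering problem discussed next.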
    /*
     * Make sure the following two operations won't be reordered.
     *   1) set CACHEFILES_DEAD bit
     *   2) flush requests in the xarray
     * Otherwise the request may be enqueued after xarray has been
     * flushed, leaving the orphan request never being completed.
     *
     * CPU 1                           CPU 2
     * =====                           =====
     * flush requests in the xarray
     *                                 test CACHEFILES_DEAD bit
     *                                 enqueue the request
     * set CACHEFILES_DEAD bit
     */
So, to keep the two operations on the flush request path, a) set CACHEFILES_DEAD bit and b) flush request, from being reordered, a memory barrier is placed between them:
    # flush request
    cachefiles_daemon_release
        set CACHEFILES_DEAD bit
        cachefiles_flush_reqs
            smp_mb();
            xa_lock
            # flush request
            xa_unlock
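Could the xa_lock on the flush request path serve as this memory barrier?
The lock operation of a spinlock implies only read-acquire semantics. Read acquire guarantees that every memory access issued after the acquire also executes after it, which amounts to suppressing LoadLoad and LoadStore reordering:
    read acquire
    ---------------------
    all memory operations stay below the line
What is needed here, however, is to suppress StoreLoad and StoreStore reordering (the store being the one that sets the CACHEFILES_DEAD bit), so the read-acquire semantics implied by xa_lock do not solve the problem.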
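The ordering argument can be spelled out with C11 atomics. The sketch below is a userspace analogy under assumptions, not the kernel implementation: `struct toy_cache` and `toy_flush_reqs()` are hypothetical names, `atomic_thread_fence(memory_order_seq_cst)` is used as a rough counterpart of smp_mb(), and an `atomic_flag` spinlock stands in for xa_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_cache {
    atomic_flag lock;   /* plays the role of xarray->xa_lock */
    atomic_bool dead;   /* plays the role of the CACHEFILES_DEAD bit */
    int nr_reqs;        /* number of pending requests */
};

/* Flush side with the ordering made explicit step by step.
 * Returns the number of requests that were flushed. */
static int toy_flush_reqs(struct toy_cache *c)
{
    int flushed;

    /* 1) Relaxed store: on its own it carries no ordering guarantee. */
    atomic_store_explicit(&c->dead, true, memory_order_relaxed);

    /* 2) Full fence (rough counterpart of smp_mb()): the store above is
     *    ordered before every load and store that follows. */
    atomic_thread_fence(memory_order_seq_cst);

    /* 3) Lock with acquire ordering. Acquire keeps later accesses from
     *    moving above the lock, but does not keep the earlier store from
     *    moving below it, which is why step 2 is there. */
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;

    assert(c->nr_reqs >= 0);
    flushed = c->nr_reqs;
    c->nr_reqs = 0;     /* "flush requests in the xarray" */

    atomic_flag_clear_explicit(&c->lock, memory_order_release);
    return flushed;
}
```

A single-threaded run cannot exhibit the reordering itself; the sketch only shows where the fence sits relative to the store and the lock.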
Similarly, when the daemon closes an anonfd, the requests in the xarray that belong to that anonfd, mainly READ and CLOSE requests, have to be flushed. This is again a flush request step.
Since the failover feature was introduced, closing an anonfd no longer needs to flush READ requests, but CLOSE requests still have to be flushed; for the reason, see "flush CLOSE requests when anon fd is closed".
Here the enqueue request and flush request steps are synchronized through object->ondemand_id:
    # flush request
    cachefiles_ondemand_fd_release
        object->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED
        # flush request
    # enqueue request
    cachefiles_ondemand_send_req
        # if object->ondemand_id valid (ondemand_id > 0):
            # enqueue request
As before, the two operations on the enqueue request path, a) test ondemand_id and b) enqueue request, must be atomic as a whole, so the spinlock (xarray->xa_lock) is used here as well:
    # flush request
    cachefiles_ondemand_fd_release
        xa_lock
        object->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED
        # flush request
        xa_unlock
    # enqueue request
    cachefiles_ondemand_send_req
        xa_lock
        # if object->ondemand_id valid (ondemand_id > 0):
            # enqueue request
        xa_unlock
Later, once the failover feature added support for object state, this became:
    # flush request
    cachefiles_ondemand_fd_release
        xa_lock
        set_object_close
        # flush request
        xa_unlock
    # enqueue request
    cachefiles_ondemand_send_req
        xa_lock
        # if object is not in close state:
            # enqueue request
        xa_unlock
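The per-object variant can be sketched in the same userspace style. Everything here is a hypothetical stand-in chosen for illustration: `struct toy_object`, `enum toy_state`, the shared fixed-size `table`, `toy_send_req()` and `toy_fd_release()` are not kernel names, and `ERR_IO` stands in for -EIO. The point is that closing one object's anonfd flushes and blocks only that object's requests.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS 8      /* hypothetical capacity of the shared table */
#define ERR_IO   (-5)   /* stands in for -EIO */

enum toy_state { TOY_OBJ_OPEN, TOY_OBJ_CLOSE };

struct toy_object {
    enum toy_state state;
};

struct toy_req {
    struct toy_object *object;  /* NULL means the slot is free */
};

static atomic_flag table_lock = ATOMIC_FLAG_INIT;  /* plays xa_lock */
static struct toy_req table[NR_SLOTS];

static void toy_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&table_lock, memory_order_acquire))
        ;
}

static void toy_unlock(void)
{
    atomic_flag_clear_explicit(&table_lock, memory_order_release);
}

/* Enqueue side: check the object state and insert under one lock.
 * Returns the slot id, or ERR_IO if the object is closed or the
 * table is full. */
static int toy_send_req(struct toy_object *obj)
{
    int ret = ERR_IO;

    assert(obj != NULL);
    toy_lock();
    if (obj->state != TOY_OBJ_CLOSE) {
        for (int id = 0; id < NR_SLOTS; id++) {
            if (table[id].object == NULL) {
                table[id].object = obj;
                ret = id;
                break;
            }
        }
    }
    toy_unlock();
    return ret;
}

/* Flush side: mark the object closed and drop its requests, all under
 * the same lock. Returns the number of requests flushed. */
static int toy_fd_release(struct toy_object *obj)
{
    int flushed = 0;

    assert(obj != NULL);
    toy_lock();
    obj->state = TOY_OBJ_CLOSE;
    for (int id = 0; id < NR_SLOTS; id++) {
        if (table[id].object == obj) {
            table[id].object = NULL;
            flushed++;
        }
    }
    toy_unlock();
    return flushed;
}
```

Since the state change happens inside the lock, no separate memory barrier is needed on this path, unlike the devfd case where the DEAD bit is set outside the lock.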
cachefiles_ondemand_daemon_read() contains two operations: a) search xarray and b) erase xarray.
    # for other request types
    xa_lock
    search the xarray to find a valid request
    clear CACHEFILES_REQ_NEW mark
    xa_unlock
    id = xas.xa_index;
    copy this request to user buffer
    on error path:
        xa_erase(..., id)
For a CLOSE request, the erase xarray operation is performed right after the request has been read:
    # for CLOSE requests
    xa_lock
    search the xarray to find a valid request
    clear CACHEFILES_REQ_NEW mark
    xa_unlock
    id = xas.xa_index;
    copy this request to user buffer
    xa_erase(..., id)
As shown above, the daemon_read path contains two operations, a) search xarray and b) erase xarray, and the two are not atomic as a whole.
As described earlier, the daemon also flushes requests when it closes an anonfd, so the following interleaving is possible:
         P1                                         P2
    ------------                                -----------
    xa_lock
    search the xarray to find a valid request
    clear CACHEFILES_REQ_NEW mark
    xa_unlock
    id = xas.xa_index;
    copy this request to user buffer
                                                close anon fd
                                                  xa_lock
                                                  flush related requests
                                                  xa_unlock
                                                another request may be enqueued
                                                into the xarray, reusing the
                                                previous id
    xa_erase(..., id) # oops
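Wrapping both operations on the daemon_read path, a) search xarray and b) erase xarray, in a single spinlock has two problems. First, it is awkward to implement: the corresponding code in the daemon: close_anonfd path would have to be wrapped in the same spinlock. Second, the "copy this request to user buffer" step on the daemon_read path may involve other operations, for example cachefiles_ondemand_get_fd() for OPEN requests, and these may block, so they cannot be called while a spinlock is held.
The current fix is therefore that the daemon: close_anonfd path flushes only the requests that still carry the CACHEFILES_REQ_NEW mark: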
         P1                                         P2
    ------------                                -----------
    xa_lock
    search the xarray to find a valid request
    clear CACHEFILES_REQ_NEW mark
    xa_unlock
    id = xas.xa_index;
    copy this request to user buffer
                                                close anon fd
                                                  xa_lock
                                                  flush related requests with
                                                  CACHEFILES_REQ_NEW marked
                                                  xa_unlock
                                                the request processed by P1
                                                is not flushed
    xa_erase(..., id)
For details, see "race between reading/flush requests".
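The two interleavings above can be replayed sequentially in a small userspace C sketch. It is an illustration under assumptions rather than the kernel code: the `toy_*` functions, the fixed-size `table` and the lowest-free-slot id allocation (used to mimic id reuse) are hypothetical, and locking is left out because the steps of P1 and P2 are replayed one at a time.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS 4      /* hypothetical capacity of the toy request table */

struct toy_req {
    bool used;          /* slot holds a request */
    bool is_new;        /* plays the role of the CACHEFILES_REQ_NEW mark */
    int  object;        /* hypothetical id of the owning object */
    int  payload;       /* lets the caller tell requests apart */
};

static struct toy_req table[NR_SLOTS];

/* Insert into the lowest free slot, mimicking id reuse.
 * Returns the id, or -1 if the table is full. */
static int toy_enqueue(int object, int payload)
{
    for (int id = 0; id < NR_SLOTS; id++) {
        if (!table[id].used) {
            table[id] = (struct toy_req){ true, true, object, payload };
            return id;
        }
    }
    return -1;
}

/* daemon_read, step a): pick a NEW request and clear its mark. */
static int toy_daemon_read(void)
{
    for (int id = 0; id < NR_SLOTS; id++) {
        if (table[id].used && table[id].is_new) {
            table[id].is_new = false;
            return id;
        }
    }
    return -1;
}

/* close anonfd: flush the requests of `object`. With only_new set,
 * requests already picked up by daemon_read are left alone. */
static int toy_fd_release(int object, bool only_new)
{
    int flushed = 0;

    for (int id = 0; id < NR_SLOTS; id++) {
        struct toy_req *r = &table[id];

        if (!r->used || r->object != object)
            continue;
        if (only_new && !r->is_new)
            continue;   /* in flight on the daemon_read path: skip */
        r->used = false;
        flushed++;
    }
    return flushed;
}

/* daemon_read, step b): erase by id. Returns the payload of the request
 * that was actually erased, or -1 if the slot was empty. */
static int toy_erase(int id)
{
    assert(id >= 0 && id < NR_SLOTS);
    int payload = table[id].used ? table[id].payload : -1;
    table[id].used = false;
    return payload;
}
```

Replaying the first diagram with `only_new` set to false, the request read by P1 is flushed, its id is handed to an unrelated request, and `toy_erase()` removes that unrelated request. With `only_new` set to true the in-flight request keeps its slot, so the later erase hits the request that P1 actually read.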
Similarly, the daemon flushes all requests when it closes devfd, so could the race described above also be triggered by daemon: close_devfd?
The answer is no, because daemon: daemon_read and daemon: close_devfd never run in parallel. daemon: daemon_read is triggered by a read on devfd, and as long as a read on devfd is still in progress, devfd cannot have been closed yet.