# Time Sequence Constraint
## flush CLOSE requests when anon fd is closed
When an anonymous fd is closed, all CLOSE requests associated with it are flushed from the xarray, to avoid the following race.
```
P1 P2
------------ -----------
umount
enqueue CLOSE request with a object_id
close anon fd
free the object_id
this object_id is reused for another blob
read one CLOSE request with outdated object_id
close anon fd for other blob # oops
```
However, this mechanism cannot cover the race described by the following sequence:
```
P1 P2
------------ -----------
(daemon) read one CLOSE request,
and come back to the user space
(daemon) close anon fd
flush CLOSE requests still inside the xarray
find no CLOSE requests in the xarray
this object_id is reused for another blob
go on to process this CLOSE request
close anon fd for other blob # oops
```
In this case, the user daemon is responsible for avoiding the above sequence.
## race between reading/flush requests
The user daemon reads "/dev/cachefiles" to fetch one request to handle, which searches the xarray for a valid request. On the error path, the request is removed from the xarray directly, since it has already been marked as non-CACHEFILES_REQ_NEW. In addition, CLOSE requests are removed from the xarray as soon as the user daemon reads them, since CLOSE requests have no reply. The procedure can be described as follows:
```
# for CLOSE requests
xa_lock
search the xarray to find a valid request
xa_unlock
id = xas.xa_index;
copy this request to user buffer
xa_erase(..., id)
```
```
# for other request types
xa_lock
search the xarray to find a valid request
xa_unlock
id = xas.xa_index;
copy this request to user buffer
on error path:
xa_erase(..., id)
```
The above operations on the xarray (searching the xarray, and xa_erase()) are not atomic as a whole, so they race with flushing CLOSE requests in cachefiles_ondemand_fd_release(). Consider the following sequence:
```
P1 P2
------------ -----------
xa_lock
search the xarray to find a valid request
xa_unlock
id = xas.xa_index;
copy this request to user buffer
close anon fd
xa_lock
flush CLOSE requests
xa_unlock
another request may be enqueued into the xarray,
reusing the previous id
xa_erase(..., id) # oops
```
This can be fixed by only flushing CLOSE requests marked with CACHEFILES_REQ_NEW in cachefiles_ondemand_fd_release().
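The effect of this rule can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `req_table`, `enqueue_req()`, `pick_new_req()`, `erase_req()` and `flush_new_close()` are hypothetical names, a fixed array stands in for the xarray, and the `is_new` field stands in for the CACHEFILES_REQ_NEW mark. Locking is omitted; the model only shows which slots each step touches.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS 8

enum req_op { REQ_OPEN, REQ_CLOSE, REQ_READ };

/* One slot of the (hypothetical) request table; the index is the msg id. */
struct req {
    bool used;        /* slot occupied, i.e. the id is allocated */
    bool is_new;      /* models the CACHEFILES_REQ_NEW mark */
    enum req_op op;
    int object_id;
};

struct req_table {
    struct req slot[NR_SLOTS];
};

/* Enqueue into the lowest free slot, so a freed id is reused immediately. */
int enqueue_req(struct req_table *t, enum req_op op, int object_id)
{
    for (int id = 0; id < NR_SLOTS; id++) {
        if (!t->slot[id].used) {
            t->slot[id] = (struct req){ true, true, op, object_id };
            return id;
        }
    }
    return -1;
}

/* Daemon read, step 1: pick a NEW request and clear its NEW mark. */
int pick_new_req(struct req_table *t)
{
    for (int id = 0; id < NR_SLOTS; id++) {
        if (t->slot[id].used && t->slot[id].is_new) {
            t->slot[id].is_new = false;
            return id;
        }
    }
    return -1;
}

/* Daemon read, step 2: erase by id after copying to the user buffer. */
void erase_req(struct req_table *t, int id)
{
    assert(id >= 0 && id < NR_SLOTS);
    t->slot[id].used = false;
}

/* fd release: flush only CLOSE requests of this object still marked NEW. */
int flush_new_close(struct req_table *t, int object_id)
{
    int flushed = 0;

    for (int id = 0; id < NR_SLOTS; id++) {
        struct req *r = &t->slot[id];

        if (r->used && r->is_new && r->op == REQ_CLOSE &&
            r->object_id == object_id) {
            r->used = false;
            flushed++;
        }
    }
    return flushed;
}
```

In this model, a CLOSE request that has already been picked by `pick_new_req()` is skipped by `flush_new_close()`, so its slot stays occupied and a newly enqueued request cannot take the same id before the reader calls `erase_req()`.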
For other request types, although the operations on the xarray (searching the xarray, and xa_erase() on the error path) are also not atomic as a whole, there is no race with cachefiles_ondemand_fd_release(), since it only flushes CLOSE requests.
## constraint for failover
When the anon fd is closed prematurely, i.e. while there is still an inflight READ request, the cachefiles_object switches to *close* state, and the failover mechanism automatically resends an OPEN request to reallocate an anon fd (in which case the cachefiles_object switches to *opening* state). Once the OPEN request is completed (with a successful copen replied), the cachefiles_object switches to *open* state.
The current implementation doesn't cover the potential race described by the following sequence:
```
P1
------------
when reopen is triggered,
object switches to opening state
(daemon) read OPEN request
close anon fd
object switches to close state
reply (a successful) copen
object switches to open state # oops object is in open state with invalid object_id (CACHEFILES_ONDEMAND_ID_CLOSED)
```
This cannot be fixed by the following attempt, which makes the object switch to open state only when it is in opening state.
```
cachefiles_ondemand_copen
cmpxchg(&req->object->state, CACHEFILES_OBJECT_STATE_opening, CACHEFILES_OBJECT_STATE_open)
```
This is because the object may be back in opening state on the error path, as described by the following sequence:
```
P1 P2
------------ -----------
when reopen is triggered,
object switches to opening state
(daemon) read OPEN request
close anon fd
object switches to close state
since object is in close state now,
reopen again, and enqueue a new OPEN request,
and object switches to opening state
reply (a successful) copen
object switches to open state # oops object is in open state with invalid object_id (CACHEFILES_ONDEMAND_ID_CLOSED)
```
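The sequence above can be replayed in a small single-threaded model using C11 atomics. This is a sketch, not kernel code: `struct object`, `trigger_reopen()`, `daemon_read_open()`, `fd_release()` and `copen_success_guarded()` are hypothetical stand-ins, the `STATE_*` values model CACHEFILES_OBJECT_STATE_*, and `ID_CLOSED` models CACHEFILES_ONDEMAND_ID_CLOSED (its numeric value here is an assumption). The model also assumes the object_id is only assigned when the daemon reads an OPEN request.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum obj_state { STATE_OPEN, STATE_CLOSE, STATE_OPENING };

#define ID_CLOSED (-1) /* models CACHEFILES_ONDEMAND_ID_CLOSED (assumed value) */

struct object {
    _Atomic int state;
    int object_id;
};

void object_init(struct object *o)
{
    assert(o != NULL);
    atomic_init(&o->state, STATE_CLOSE);
    o->object_id = ID_CLOSED;
}

/* Failover: an OPEN request is enqueued and the object enters opening state. */
void trigger_reopen(struct object *o)
{
    atomic_store(&o->state, STATE_OPENING);
}

/* The daemon reads an OPEN request; an anon fd and object_id are allocated. */
void daemon_read_open(struct object *o, int new_id)
{
    o->object_id = new_id;
}

/* The daemon closes the anon fd: the id is invalidated, state becomes close. */
void fd_release(struct object *o)
{
    o->object_id = ID_CLOSED;
    atomic_store(&o->state, STATE_CLOSE);
}

/* The attempted fix: a successful copen only moves opening -> open. */
bool copen_success_guarded(struct object *o)
{
    int expected = STATE_OPENING;

    return atomic_compare_exchange_strong(&o->state, &expected, STATE_OPEN);
}
```

Calling `trigger_reopen()`, `daemon_read_open()`, `fd_release()`, `trigger_reopen()` and then `copen_success_guarded()` for the first (stale) OPEN request leaves the object in open state while `object_id` is still `ID_CLOSED`. The compare-and-exchange cannot tell which OPEN request put the object into opening state.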
Besides, there are other possible sequences that interfere with the object state machine.
```
P1 P2
------------ -----------
when reopen is triggered,
object switches to opening state
(daemon) read OPEN request
close anon fd
object switches to close state
since object is in close state now,
reopen again, and enqueue a new OPEN request,
and object switches to opening state
(daemon) read OPEN request
reply (a successful) copen
object switches to open state
reply (a fail) copen
object switches to close state # oops
```
A potential fix is to also flush OPEN requests when the anon fd is closed.
1. If cachefiles_ondemand_fd_release() runs before cachefiles_ondemand_copen(), i.e. the user daemon closes the anon fd before replying copen, then inside cachefiles_ondemand_fd_release() the object switches to close state, but the OPEN request itself is not removed from the xarray (no request type that expects a reply can be flushed, considering the following sequence).
```
P1 P2
------------------------------- -----------------------------
cachefiles_ondemand_daemon_read cachefiles_ondemand_fd_release
xa_lock
read OPEN request, and its (msg) id
xa_unlock
xa_lock
flush READ requests
xa_unlock
cachefiles_ondemand_send_req
enqueue another request, and
reuse the former (msg) id
error encountered, and
xa_erase(..., id)
```
As for cachefiles_ondemand_copen(), it makes no change to the object state if cachefiles_ondemand_fd_release() has already been called.
```
cachefiles_ondemand_copen()
if (req->error) # i.e. cachefiles_ondemand_fd_release() has been called before
return
else
make the object switch to open/close state according to the copen
```
2. If the user daemon replies copen before closing the anon fd, then cachefiles_ondemand_copen() makes the object switch state according to the copen:
- if a successful copen is replied, the object switches to open state in cachefiles_ondemand_copen()
- otherwise the object stays in opening state until the anon fd is closed, at which point it switches to close state in cachefiles_ondemand_fd_release(). If the object instead switched to close state inside cachefiles_ondemand_copen() when a bad copen is received, that would race with setting the close state inside cachefiles_ondemand_fd_release().
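The two cases above can be modelled in user space as follows. This is a minimal sketch of the proposed behaviour, not the kernel implementation: `struct open_req`, `fd_release()` and `copen()` are hypothetical, locking is omitted, and the `-EIO` value stored in `error` is an assumption chosen only to make the field non-zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum obj_state { STATE_OPEN, STATE_CLOSE, STATE_OPENING };

#define ID_CLOSED (-1) /* models CACHEFILES_ONDEMAND_ID_CLOSED (assumed value) */

struct object {
    enum obj_state state;
    int object_id;
};

/* An in-flight OPEN request that still waits for its copen reply. */
struct open_req {
    struct object *object;
    int error; /* non-zero once fd_release() has run for this request */
};

/*
 * Anon fd closed: the object goes to close state. A pending OPEN request
 * is not removed from the table; it is only marked as failed.
 */
void fd_release(struct object *o, struct open_req *pending)
{
    o->object_id = ID_CLOSED;
    o->state = STATE_CLOSE;
    if (pending != NULL)
        pending->error = -EIO; /* assumed error code */
}

/*
 * copen reply. Returns false if the reply is ignored because the anon fd
 * was already closed. A failed copen leaves the object in opening state;
 * only fd_release() ever sets close state, so the two cannot race on it.
 */
bool copen(struct open_req *req, bool success)
{
    assert(req != NULL && req->object != NULL);
    if (req->error)
        return false;
    if (success)
        req->object->state = STATE_OPEN;
    return true;
}
```

With this split, close state has a single writer (`fd_release()`), and a late copen for a request whose fd is already gone has no effect on the state machine.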
However, the above fix seems quite complicated. Thus the current implementation does not cover the race described above, and the user daemon is responsible for avoiding the sequence in which the anon fd is closed before copen is replied.