# Eventfd shim idea

## Introduction

```rust
pub unsafe extern "C" fn eventfd(init: c_uint, flags: c_int) -> c_int
//source: https://docs.rs/libc/latest/libc/fn.eventfd.html
```

``eventfd`` stores a counter which can be increased by calling ``write``, and retrieved and reset to 0 by calling ``read``. This syscall is Linux specific.

- Function parameters:
  - init: Initial value of the counter
  - flags:
    - ``EFD_CLOEXEC``: Miri does not support close-on-exec, so this flag is silently discarded
    - ``EFD_NONBLOCK``: non-blocking flag
    - ``EFD_SEMAPHORE`` (won't be supported): semaphore mode. ``tokio`` does not use it, so it will be left as a ``FIXME``.

According to strace, we only need to support ``eventfd2(0,EFD_CLOEXEC|EFD_NONBLOCK)`` for ``tokio``.

## Implementation

```rust
struct Event {
    counter: u64,
    is_nonblock: bool,
    clock: VClock,
}
```

### Eventfd

Fail with ``ErrorKind::InvalidInput`` if any flag other than ``EFD_CLOEXEC``, ``EFD_NONBLOCK``, and ``EFD_SEMAPHORE`` is set in the flags argument.

### Read

``read`` copies the counter into the supplied buffer and, on success, returns the number of bytes read. There are several cases that can happen:

1. If the size of the supplied buffer is less than 8 bytes
   - Fail with ``ErrorKind::InvalidInput``
2. If the counter == 0
   - for non-blocking: fail with ``ErrorKind::WouldBlock``
   - for blocking: block (currently will throw_unsup_format because blocking is not implemented)
3. Happy case: counter != 0
   - store the counter in the buffer supplied to ``read``, reset the counter to 0, then return the number of bytes read.

### Write

``write`` takes a buffer of 8 bytes, converts the value in the buffer into an integer, then adds that integer to the counter. The maximum value of the counter is ``max_u64 - 1``, which is ``0xfffffffffffffffe``.

There are several cases that can happen:

1. If the supplied buffer is < 8 bytes or the value in the buffer is ``0xffffffffffffffff``
   - Fail with ``ErrorKind::InvalidInput``
2. If the buffer size is > 8 bytes, only the first 8 bytes will be used
3. If the addition would cause the counter value to exceed the maximum:
   - for non-blocking: fail with ``ErrorKind::WouldBlock``
   - for blocking: block until a ``read`` is performed (currently will throw_unsup_format because blocking is not implemented)
4. Happy case: If the addition does not exceed the maximum, just add to the counter and return the number of bytes written.

### Synchronisation

Similar to ``socketpair``, ``eventfd`` can also be used for synchronisation, as demonstrated in the test below, so we might also need to call ``this.release_clock()`` in ``write`` and ``this.acquire_clock(&clock)`` in ``read``.

```rust
//@compile-flags: -Zmiri-preemption-rate=0
use std::thread;

fn main() {
    test_race();
}

fn test_race() {
    static mut VAL: u8 = 0;
    let flags = libc::EFD_NONBLOCK | libc::EFD_CLOEXEC;
    let fd = unsafe { libc::eventfd(0, flags) };
    let thread1 = thread::spawn(move || {
        let mut buf: [u8; 8] = [0; 8];
        let res: i32 = unsafe {
            libc::read(fd, buf.as_mut_ptr().cast(), buf.len() as libc::size_t)
                .try_into()
                .unwrap()
        };
        // read returns the number of bytes read, which is always 8.
        assert_eq!(res, 8);
        // read stores the counter in the native byte order of the target.
        let counter = u64::from_ne_bytes(buf);
        assert_eq!(counter, 1);
        // Read from the static mutable variable.
        unsafe { assert_eq!(VAL, 1) };
    });
    // Write to the static mutable variable.
    unsafe { VAL = 1 };
    // write expects the value in the native byte order of the target.
    let data: [u8; 8] = 1_u64.to_ne_bytes();
    let res: i64 = unsafe {
        libc::write(fd, data.as_ptr() as *const libc::c_void, 8).try_into().unwrap()
    };
    // write returns the number of bytes written, which is always 8.
    assert_eq!(res, 8);
    thread::yield_now();
    thread1.join().unwrap();
}
```

### Test

- All the tests below are for ``eventfd(0, EFD_CLOEXEC|EFD_NONBLOCK)`` because ``tokio`` uses this.
  1. Both read and write happy case (i.e. a write that does not cause the addition to exceed the maximum counter value, and a read when the counter value != 0)
  1. Read when counter == 0
  1. Write with buffer size > 8 bytes
  1. Read with a supplied buffer < 8 bytes
  1. Write with a supplied buffer < 8 bytes
  1. Read with a supplied buffer size > 8 bytes
  1. Addition that exceeds the maximum value
- Blocking tests for blocking eventfd (will be in fail-dep):
  1. Blocking read: read when counter == 0
  2. Blocking write: addition that exceeds the maximum value
- Race test in the synchronisation section
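
As a rough illustration of the ``read`` and ``write`` rules described in the Implementation section, here is a minimal stand-alone sketch. It is not Miri code: ``EventModel``, ``MAX_COUNTER``, and ``would_block_or_unsupported`` are hypothetical names, the vector clock is left out, and ``ErrorKind::Unsupported`` is only a stand-in for the ``throw_unsup_format`` path taken in blocking mode.

```rust
use std::io::{Error, ErrorKind};

/// Largest value the counter may hold: u64::MAX - 1 (0xfffffffffffffffe).
const MAX_COUNTER: u64 = u64::MAX - 1;

/// Hypothetical model of the shim state, without the vector clock.
struct EventModel {
    counter: u64,
    is_nonblock: bool,
}

impl EventModel {
    fn new(init: u32, is_nonblock: bool) -> Self {
        EventModel { counter: u64::from(init), is_nonblock }
    }

    /// Copies the counter into `buf`, resets it to 0, and returns 8.
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Error> {
        if buf.len() < 8 {
            return Err(Error::from(ErrorKind::InvalidInput));
        }
        if self.counter == 0 {
            return Err(self.would_block_or_unsupported());
        }
        // Only the first 8 bytes are filled, in native byte order.
        buf[..8].copy_from_slice(&self.counter.to_ne_bytes());
        self.counter = 0;
        Ok(8)
    }

    /// Adds the first 8 bytes of `buf` (native byte order) to the counter.
    fn write(&mut self, buf: &[u8]) -> Result<usize, Error> {
        if buf.len() < 8 {
            return Err(Error::from(ErrorKind::InvalidInput));
        }
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(&buf[..8]);
        let value = u64::from_ne_bytes(bytes);
        if value == u64::MAX {
            return Err(Error::from(ErrorKind::InvalidInput));
        }
        match self.counter.checked_add(value) {
            Some(sum) if sum <= MAX_COUNTER => {
                self.counter = sum;
                Ok(8)
            }
            // The addition would exceed the maximum counter value.
            _ => Err(self.would_block_or_unsupported()),
        }
    }

    /// Non-blocking mode reports WouldBlock; blocking is not modelled here,
    /// so Unsupported stands in for the unimplemented blocking path.
    fn would_block_or_unsupported(&self) -> Error {
        if self.is_nonblock {
            Error::from(ErrorKind::WouldBlock)
        } else {
            Error::from(ErrorKind::Unsupported)
        }
    }
}
```

With ``EventModel::new(0, true)``, a ``read`` on the fresh object fails with ``WouldBlock``, two writes of 1 and 2 followed by a ``read`` yield 3, and the next ``read`` fails with ``WouldBlock`` again because the counter was reset. Using ``checked_add`` keeps the overflow check explicit instead of relying on wrapping arithmetic.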