# TSanv3 Runtime
This document lays out the components of TSan and serves as a quick reference to key information about each one, both for implementing new algorithms and for identifying areas of improvement as new research problems.
The diagram below gives an intuition of how they all interact with one another.

## Instrumentation
The entry points into the runtime are the [instrumentation added via LLVM](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp#L545-L584) (**memory accesses and function calls**) and the interceptors for library functions (**synchronization**).
Specifically, the instrumentation [creates](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp#L569-L581) [function](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp#L718-L741) [calls](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp#L637-L653) to the relevant handlers.
:::warning
The instrumentation *introduces the largest overhead* to the target program, because these function calls impede CPU optimizations such as speculative execution and instruction-level parallelism (ILP). Even when the handlers are modified to do nothing, the program still slows down severalfold.
TODO: do experiments and show evidence
:::
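To make the shape of the instrumentation concrete, the sketch below shows roughly what a small function looks like once calls to the handlers have been inserted, written out by hand. The hook names (`func_entry`, `func_exit`, `read4`, `write4`) are hypothetical stand-ins modelled on the runtime interface, and the logging bodies are placeholders for the real handlers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {

// Stand-in event log; the real handlers update shadow memory and the Trace.
std::vector<std::string>& Log() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-ins for the runtime hooks (not the real symbols).
void func_entry(const void* /*caller_pc*/) { Log().push_back("func_entry"); }
void func_exit() { Log().push_back("func_exit"); }
void read4(const void* /*addr*/) { Log().push_back("read4"); }
void write4(const void* /*addr*/) { Log().push_back("write4"); }

// Original source:
//   void Copy(const int32_t* src, int32_t* dst) { *dst = *src; }
// Hand-written approximation of the instrumented version.
void CopyInstrumented(const std::int32_t* src, std::int32_t* dst) {
  assert(src != nullptr && dst != nullptr);
  func_entry(nullptr);       // assumed: the real hook receives a caller PC
  read4(src);                // announced before the 4-byte load
  std::int32_t v = *src;
  write4(dst);               // announced before the 4-byte store
  *dst = v;
  func_exit();
}

}  // namespace sketch
```

Even with empty hook bodies, every load and store is now preceded by a call, which is the overhead the warning above refers to.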
### Memory Accesses
The instrumentation pass in LLVM performs [capture tracking](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp#L452-L454) to ignore variables that will not be accessed by other threads.
:::warning
It [seems possible](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp#L542-L543) that more variable accesses could be ignored.
:::
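As an illustration of the distinction capture tracking draws, the sketch below contrasts a stack variable whose address never escapes with one whose address is published. The function and variable names are hypothetical, and which accesses the pass actually skips depends on the optimization pipeline, so treat the comments as intent rather than a guarantee.

```cpp
#include <cassert>

namespace sketch {

int* g_published = nullptr;  // hypothetical global reachable by other threads

// `acc` lives on the stack and its address never leaves the function, so no
// other thread can reach it. Accesses like these are candidates for skipping.
int SumTo(int n) {
  int acc = 0;
  for (int i = 1; i <= n; ++i) acc += i;
  return acc;
}

// The address of `slot` is stored in a global, i.e. it is captured. Another
// thread could read through g_published, so accesses to `slot` would have to
// stay instrumented.
int PublishAndRead(int value) {
  int slot = value;
  g_published = &slot;
  int result = slot;
  g_published = nullptr;  // do not leave a dangling pointer behind
  return result;
}

}  // namespace sketch
```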
### Interception
TSan intercepts every system library function that performs some form of synchronization, so that it can compute new happens-before (HB) edges around the call to the actual library function.
Examples are the interceptors for [`pthread_mutex_lock`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp#L1340) and [`pthread_mutex_unlock`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp#L1374).
The interception mechanism is outside our current area of interest. The links to the source code below are for reference only.
* [How interception works](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/interception/interception.h#L71)
* [Generic interceptor](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/interception/interception.h#L303-L306) & [TSan interceptor](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interceptors.h#L85)
* [InitializeInterceptors](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp#L2857)
:::warning
It is noteworthy that the TSan developers have made an immense effort to intercept a very large number of library functions that introduce synchronization. **It is not practical to reinvent the wheel for this mechanism (there are thousands of lines of code!).**
:::
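The general pattern of an interceptor can be sketched as a wrapper that keeps a pointer to the real function and adds bookkeeping around the call. Everything below is a hypothetical stand-in: `FakeMutex` replaces `pthread_mutex_t`, the `real_*` functions replace the libc implementations, and the event log replaces the HB bookkeeping. The placement of the bookkeeping (after a successful lock, before the unlock) is one plausible ordering, not a transcription of the runtime.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

struct FakeMutex { bool locked; };  // stand-in for pthread_mutex_t

// Stand-ins for the real library functions.
int real_mutex_lock(FakeMutex* m) { m->locked = true; return 0; }
int real_mutex_unlock(FakeMutex* m) { m->locked = false; return 0; }

// The interception layer remembers where the real functions live.
using MutexFn = int (*)(FakeMutex*);
MutexFn real_lock_ptr = real_mutex_lock;
MutexFn real_unlock_ptr = real_mutex_unlock;

// Stand-in for the runtime's HB bookkeeping.
std::vector<std::string>& Events() {
  static std::vector<std::string> events;
  return events;
}

// Wrapper the program ends up calling instead of the real lock.
int intercepted_mutex_lock(FakeMutex* m) {
  assert(m != nullptr);
  int res = real_lock_ptr(m);
  if (res == 0) Events().push_back("acquire");  // only once the lock is held
  return res;
}

// Wrapper for unlock: publish the release before giving up the lock.
int intercepted_mutex_unlock(FakeMutex* m) {
  assert(m != nullptr);
  Events().push_back("release");
  return real_unlock_ptr(m);
}

}  // namespace sketch
```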
## Handlers
As mentioned above, the instrumentation just provides an entry point into the runtime, which does the actual work of race detection.
The handlers read/write [`VectorClock`](/GsKyuDeET1GCehne31irEQ), [`Trace`](/tf3W3jVoTSGdUIiRNXgj9Q), [`ThreadState`](/k6zBUzW8SMSHGDXbXT6_lQ) and [`Slot`](/SE-3gyaBSked_W2FvR8_eQ) contents. This section briefly describes the interactions between them. The other pages listed in the side navbar contain more details.
### Memory Access
Memory accesses are all handled the same in principle, just slightly differently based on the access's size, alignment and atomicity. The handlers are located in [tsan_interface.inc](https://github.com/llvm/llvm-project/blob/llvmorg-19-init/compiler-rt/lib/tsan/rtl/tsan_interface.inc) and [tsan_interface_atomic.cpp](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp#L265). They are all wrappers that call [a](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L420) [variant](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L454) [of](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L490) [`MemoryAccess`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L547), which [updates](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L440) [the](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L18) [thread's](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L467) [`Trace`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L57), [checks](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L442) [for races](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L194-L232), 
[updates](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L230) the shadow memory region with a read/write epoch, and [reports a race](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L220) if detected.
Race detection is similar to FastTrack, [comparing](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L218) `VectorClock` entries with the local epoch ([loaded from `ThreadState`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L432-L433)).
The memory access events are recorded in a `Trace`, as traversing the `Trace` is also [part of](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L166) [the race](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L771) [detection routine](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L428-L432), due to the usage of slots. See [Race Detection](/GTABsgldQyWrFy8XdqJOvA) for more details.
There is also some [special handling of deallocations](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L589-L597), but I have not spent time understanding it because it has not affected our experiments so far.
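The FastTrack-style comparison mentioned above can be sketched as follows. This is a minimal model under simplifying assumptions: one last-write epoch per variable, no read tracking, and no slots or shadow encoding. The type and function names are hypothetical and do not correspond to the runtime's own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

using Tid = std::size_t;
using Clock = std::uint32_t;
using VectorClock = std::vector<Clock>;

// Last write to a variable: W_x = (t, C_t[t]).
struct Epoch {
  Tid tid;
  Clock clock;
};

// The earlier access happens-before the current one iff the current thread's
// vector clock has already caught up with the earlier thread's epoch.
bool HappensBefore(const Epoch& prev, const VectorClock& cur) {
  assert(prev.tid < cur.size());
  return prev.clock <= cur[prev.tid];
}

// Handles a write by thread `tid` with vector clock `vc`. Returns true if the
// write races with the previous write, then records the new epoch.
bool OnWrite(Epoch& last_write, Tid tid, const VectorClock& vc) {
  assert(tid < vc.size());
  bool race = last_write.tid != tid && !HappensBefore(last_write, vc);
  last_write = Epoch{tid, vc[tid]};
  return race;
}

}  // namespace sketch
```

With clocks starting at 1, an initial `Epoch{0, 0}` is ordered before every access, so the first write never reports a race.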
:::warning
We observed in VTune that the memory access handlers take up a significant amount of time.
This overhead could be due to cache misses when loading from shadow memory that is shared across threads (another thread may have written to it, or the cache line may have been evicted because this shadow is accessed infrequently).
The overhead could also come from appending the current event to the `Trace`, which has to be locked because another thread could access this `Trace` for race detection.
TODO: Do experiments to confirm
:::
:::info
The implementation of `R_x` and `W_x` is much more complicated (tens of lines and even using SSE) than FastTrack in which a simple `W_x = (t, C_t)` suffices. This is due to the ability to perform variable-size accesses in C++, where an access could be to only a part of a variable or overlapping multiple variables. TSan addresses these possibilities by recording multiple epochs for the same memory cell in shadow memory.
However, is this really necessary?
:::
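One way to picture why variable-size accesses force several records per memory cell is the sketch below: each 8-byte word gets a few shadow slots, and each slot remembers which bytes of the word an access touched. The layout, the slot count `kSlotsPerWord`, the eviction choice, and all names are assumptions made for illustration. The happens-before check that would follow a byte overlap is deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

namespace sketch {

// Bitmask of the bytes touched inside one 8-byte aligned word.
std::uint8_t AccessMask(unsigned offset, unsigned size) {
  assert(size >= 1 && offset + size <= 8);
  return static_cast<std::uint8_t>(((1u << size) - 1u) << offset);
}

struct ShadowSlot {
  std::uint8_t mask;    // 0 means the slot is empty
  std::uint8_t tid;
  std::uint32_t epoch;
  bool is_write;
};

constexpr int kSlotsPerWord = 4;  // assumed number of slots per word
using ShadowWord = std::array<ShadowSlot, kSlotsPerWord>;

// Two accesses are race candidates only if they share a byte, come from
// different threads, and at least one of them is a write.
bool MayConflict(const ShadowSlot& a, const ShadowSlot& b) {
  return (a.mask & b.mask) != 0 && a.tid != b.tid && (a.is_write || b.is_write);
}

// Compares `cur` with every recorded access, then stores it in the first
// empty slot (or overwrites slot 0 when full; this eviction is arbitrary).
bool CheckAndStore(ShadowWord& word, const ShadowSlot& cur) {
  bool conflict = false;
  for (const ShadowSlot& old : word) {
    if (old.mask != 0 && MayConflict(old, cur)) conflict = true;
  }
  for (ShadowSlot& slot : word) {
    if (slot.mask == 0) {
      slot = cur;
      return conflict;
    }
  }
  word[0] = cur;
  return conflict;
}

}  // namespace sketch
```

In this model, two 4-byte writes to the lower and upper halves of the same word never conflict, while a 4-byte read straddling both halves does overlap each of them. A single `W_x` epoch per variable cannot express that.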
See [Memory Access](/gc8jLz1SRr-AUNzGybC14g), [Shadow Memory](/abwNpC1GRrqguBAvVTGa2A), [Race Detection](/GTABsgldQyWrFy8XdqJOvA) for more details.
### Function Calls
TSan instruments function calls [that have instrumented accesses](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp#L569) within them, so that they [can](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interface.inc#L168-L170) [record these](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L779-L805) [function calls](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L746-L758) in a `Trace`.
This information is [used](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L515-L529) when reporting a race.
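A simplified picture of how recorded function calls help at report time: replaying the trace up to a given event, pushing on entry and popping on exit, recovers the call stack that was active at that event. The event layout and the name `RestoreStack` in this sketch are hypothetical and much simpler than the runtime's trace format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

enum class EventKind { kFuncEnter, kFuncExit, kAccess };

struct Event {
  EventKind kind;
  std::uintptr_t pc;  // call site for kFuncEnter, access PC for kAccess
};

// Replays events [0, upto] and returns the call stack active at trace[upto].
std::vector<std::uintptr_t> RestoreStack(const std::vector<Event>& trace,
                                         std::size_t upto) {
  assert(upto < trace.size());
  std::vector<std::uintptr_t> stack;
  for (std::size_t i = 0; i <= upto; ++i) {
    const Event& ev = trace[i];
    if (ev.kind == EventKind::kFuncEnter) {
      stack.push_back(ev.pc);
    } else if (ev.kind == EventKind::kFuncExit && !stack.empty()) {
      stack.pop_back();
    }
  }
  return stack;
}

}  // namespace sketch
```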
:::warning
When doing experiments that perform sampling, we observed in VTune that the function entry/exit handlers take up a significant amount of time. I have not confirmed why such simple operations can be so costly.
TODO: do experiments to confirm the cause of overhead
:::
### Synchronization
TSan handles many different means of synchronization, such as locks and atomic operations with release/acquire memory order.
For locks, the handler computes metadata for HB race detection ([vector clock operations](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp#L242), [local epoch increments](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp#L254)) and [deadlock detection](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp#L250). The metadata is stored in a [SyncVar](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp#L226).
For atomics, the handler [performs](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp#L237-L244) [the](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp#L275-L280) [relevant](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp#L290-L300) [vector clock operation](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp#L426-L435) and [increments](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp#L281) [the](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp#L301-L302) [local epoch](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp#L443-L444) when necessary.
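For the lock case, the vector clock operations and the local epoch increment can be sketched in the FastTrack style. This is a textbook model, not the runtime's code: `SyncObject` plays the role of a `SyncVar`, and `Thread`, `Acquire`, and `Release` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

using Clock = std::uint32_t;
using VectorClock = std::vector<Clock>;

struct Thread {
  std::size_t tid;
  VectorClock vc;
};

// Per-mutex metadata, playing the role of a SyncVar.
struct SyncObject {
  VectorClock vc;
};

// Acquire: the thread learns everything published by the last release,
// i.e. C_t := C_t joined with L_m (element-wise maximum).
void Acquire(Thread& t, const SyncObject& m) {
  assert(t.vc.size() == m.vc.size());
  for (std::size_t i = 0; i < t.vc.size(); ++i) {
    t.vc[i] = std::max(t.vc[i], m.vc[i]);
  }
}

// Release: publish the thread's clock on the mutex, then start a new local
// epoch so that later accesses are not covered by this release.
void Release(Thread& t, SyncObject& m) {
  assert(t.tid < t.vc.size());
  m.vc = t.vc;
  ++t.vc[t.tid];
}

}  // namespace sketch
```

After `Release` by one thread and `Acquire` by another, the acquiring thread's clock entry for the releasing thread equals the released epoch, which is what makes earlier accesses by the releaser ordered before later accesses by the acquirer.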
See [Race Detection](/GTABsgldQyWrFy8XdqJOvA), [Synchronization](/XJWHpVbxTo6pN25T5Bu1IQ), [SyncVar](/BCgm3ahoSkm9nKXKgNsqtQ) for more details.