# TSanv3 Trace
According to [their paper](https://dl.acm.org/doi/abs/10.1145/1791194.1791203), TSan was designed with informative race reporting as an important feature. The `Trace` component in TSan serves this purpose and is closely integrated into the race detection process.
Each thread maintains its own `Trace` to keep a history of [all events](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L22-L29) (e.g. memory access, synchronization, function calls), which will be consulted when a race is detected so that an error report can be produced from traversing the trace.
## Structure
[A `Trace`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L189-L211) consists of [multiple](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L191) `TracePart`s. [A `TracePart`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L172-L186) simply consists of an array of [32763 `Event`s](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L175-L183). There are different [`Event` structs](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L31-L41), although having different sizes, they all [must](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L61) [be](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L69) [a](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L93) [multiple](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L111) [of](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L133) [8](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L147) [bytes](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L162).
A `Trace` starts off with 1 `TracePart`, and when it is full, allocates a new `TracePart`, and so on. When a `TracePart` is full, it may be recycled (see [Switching Trace Parts](#Switching-Trace-Parts)). The [`kMinParts` constant](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L203-L207) ensures that a `Trace` keeps (i.e. cannot recycle) a [certain minimum number](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L964-L966) of its `TracePart`s. This is worth knowing to not misunderstand that each `Trace` immediately consumes 3 `TracePart`s worth of memory.
Each trace has a single writer (the thread that owns it) and multiple readers (other threads that traverse it during race reporting). Thus, a trace is protected by [a `Mutex`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L190) [that spins](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/sanitizer_common/sanitizer_mutex.h#L170-L188). However, this mutex only needs to be locked very rarely, so it is not a concern for performance. The mutex is only locked during [race reporting](#Race-Reporting), [trace part switching](#Switching-Trace-Parts), and slot reset (refer to [Slot](/SE-3gyaBSked_W2FvR8_eQ)).
Each thread also remembers a [*trace position*](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L167-L168) which points to the last `Event` of the last `TracePart`, i.e. the location to append the next `Event`. The writer thread stores an `Event` to the trace position then advances it; a reader thread begins traversing a trace from the current trace position.
There are several other fields for optimization and memory management purposes which will be explored in more detail later.
## Usage
The `Trace` structure records events so that they can be replayed to produce a race report.
### Appending Events
[`TraceEvent`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L735) is a generic helper function that appends an `Event` to a `Trace`.
The append begins with attempting to [*acquire*](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L737) [the trace position](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L704). It is possible for the acquire to [fail](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L722-L723) if the current `TracePart` has reached its end (i.e. the trace position points to the end). In such case, [`TraceSwitchPart` is called](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L738-L739) to switch to a new `TracePart`, and the [acquire is attempted again](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L739) (it will definitely succeed this time because it is on a fresh part).
After acquiring an `Event*`, TSan [stores the new `Event`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L742) to it and [*releases*](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L743) this pointer by [advancing the trace position](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L731). Note that although different `Event`s may have different sizes, the pointer will be advanced to the correct next position with a generic `evp + 1`.
#### Time Event
The [time event](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L134-L146) is the simplest. It is appended [after a local epoch increment](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp#L511), to mark the advance of the thread's timestamp.
This information is useful [during](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L771) [race](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L486) [reporting](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L373), for replaying the trace to identify the conflicting memory access event given its epoch.
#### Synchronization Events
When a mutex is [locked](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp#L175) or [unlocked](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp#L221), a corresponding `Event` is [appended](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L105-L132) to the trace position.
This information is useful [during race reporting](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_report.cpp#L282), when printing a [racy memory access](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_report.cpp#L332-L333) ([`Mop` probably means memory op](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L177-L183)), to[ print the mutex set](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_report.cpp#L158) that the thread is currently in.
#### Memory Access Events
Upon a memory access, a corresponding `Event` is [appended](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L18-L54) to the trace position, with the same code pattern of [acquire](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L25-L26)-[store](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L31-L35)-[release](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp#L38) described above. Note that acquiring/releasing an `Event*` from the trace position does not do any locking despite the name sounding like it does.
To report a race, the trace is [replayed to find](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L428-L432) the conflicting memory access.
### Switching Trace Parts
`TraceSwitchPartImpl` (`TraceSwitchPart` is just a wrapper) is rather complex. The goal is to switch a "finished" `TracePart` with a "fresh" one. It is worth knowing what it does, in case one needs to make changes that might introduce side effects to it.
First, [`TraceSkipGap` is called](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L941-L942) to check whether the `TracePart` has not actually reach its end. This is necessary due to `TraceAcquire`'s ["cheat" way](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L718-L723) to quickly check the `TracePart`'s end. `TraceSkipGap` will [properly check](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L925) for the end, and if it was not actually so, [fill the "gap" with `NopEvent`s](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L928-L929).
After the check, [`TracePartAlloc` is called](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L960) to obtain a `TracePart` [from the *recycle queue*](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L87). The recycle queue contains traces that are [no longer in use](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L89-L93). The new `TracePart` is [added](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L975) to the `Trace`, and the trace position is [set to](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L976-L977) the start of the new `TracePart`.
[If necessary](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L971-L974), the first active ([not in the recycle queue](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L192-L193)) `TracePart` is [inserted into the recycle queue](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L1025-L1026). This will later be returned by a future call to `TracePartAlloc`.
Next, the new `TracePart` is populated with the stack frames that the program is currently in. This is necessary for race reporting. The frames can be [retrieved from the *shadow stack*](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L983-L989). These frames were [appended during function entry](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L791-L792) and [popped during function exit](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.h#L804). Note that `TraceSkipGap` is called here again after `TraceFunc`, for a similar purpose as done at the start of `TraceSwitchPart`.
A similar thing is [done for mutex sets](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L996-L1001), [updated when recording](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp#L77-L83) a mutex lock/unlock event.
Lastly, the corresponding *slot* is [pushed to the end](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L1021-L1024) of the *slot queue*. The purpose is for "refreshing" slots that are being actively used. More details are discussed in [Slot](/SE-3gyaBSked_W2FvR8_eQ).
### Critical Sections
Each `Trace` has a [mutex](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_trace.h#L190) which is locked very sparingly.
It is locked during [race reporting](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L466), [trace part switching](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L968), and [slot](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L328) [reset](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L150). In the latter two cases, a thread may [replace](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L975-L977) or [free](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L175-L176) ([move to the recycle queue](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L120)) its active `TracePart`. For race reporting, when a thread T2 detects a race with thread T1, T2 will access T1's active `TracePart` via the [trace position](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp#L473-L475). It is thus important for T2 to lock the mutex so that it does not access any recycled `TracePart`.
There is no need to lock any mutex when appending an `Event` to a `TracePart`. This action is performed by the owning thread so the active `TracePart` will not be replaced/recycled concurrently. (Not exactly so, see [Advanced Information](#Advanced-Information) for more details but bottom line it is safe.)
Furthermore, it is fine for the owning thread to append an `Event` concurrently while another thread traverses the active `TracePart` for race reporting. The only shared variable is the trace position. It is insignificant whether the reporting thread reads the latest trace position, because the raced `Event` must be located much earlier in the `TracePart`.
### Race Reporting
Sorry, not going to spend much effort in elaborating on this. This process doesn't seem too important to us at the moment. Considering that races are rare, there is negligible overhead from `Trace` in this process.
## Advanced Information
The details below are not important for just a brief understanding of `Trace`. However these might be important when wanting to make some changes that affects `Trace`, to make sure nothing breaks.
You might only care about these if you need more understanding.
### Cleanup
`TracePart`s are recycled in `SlotReset` and `DoResetImpl`.
The key thing to note is, when `DoReset` happens, the slots/vector clocks start afresh, so the traces are obsolete and they [should all be recycled](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L175-L176). A small catch: the thread that executes `DoReset` will clear the `Trace` [*of all threads*](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L142-L144), i.e. it may race with the owning thread if it tries to append an `Event`. However, this operation is safe because the attached (active) `TracePart` [will not be recycled just yet](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L161-L174).
`SlotDetachImpl` also [recycles a `TracePart`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L339-L342) [in an unlikely event](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L330-L337) after a slot is [preempted by `DoReset`](https://github.com/llvm/llvm-project/blob/987087df90026605fc8d03ebda5a1cd31b71e609/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L324).