When software prefetching targets short array streams, irregular memory address patterns, and L1 cache miss reduction, there is an overall positive impact, as shown with code examples.
Software prefetching can interfere with the training of the hardware prefetcher, resulting in strong negative effects, especially when software prefetch instructions are used for only part of the streams.
The access patterns of different data structures affect how prefetching should be done.
Recursive Data Structures (RDS)
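For an RDS such as a linked list, the address of the next element is only known after the current node has been loaded, so a prefetch cannot look far ahead the way it can in an array. The sketch below shows one simple option, prefetching the next node while the current one is processed. The `node` type and `list_sum` function are hypothetical names for illustration, and `_mm_prefetch` assumes an x86 compiler with SSE support.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 (x86 SSE) */

/* Hypothetical node type, for illustration only. */
struct node {
    int value;
    struct node *next;
};

/* Sum a singly linked list, issuing a prefetch for the next node
 * before doing the work on the current one. */
long list_sum(const struct node *head)
{
    long sum = 0;
    for (const struct node *n = head; n != NULL; n = n->next) {
        if (n->next != NULL)
            _mm_prefetch((const char *)n->next, _MM_HINT_T0);
        sum += n->value;
    }
    return sum;
}
```

With so little work per node the prefetch has almost no time to run ahead, so this pattern is more likely to help when the loop body is heavier.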
Intrinsics from the x86 SSE SIMD extensions are translated into 2 instructions (direct address) or 4 instructions (indirect address).
Prefetch classification: by timing (Timely, Late, Early), redundant (Redundant_dc, Redundant_mshr), and wrong (Incorrect).
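A minimal sketch of the two addressing cases, assuming the intrinsic in question is `_mm_prefetch` from `<xmmintrin.h>`. The function names and the fixed distance `PF_DIST` are hypothetical. The bounds checks are there only to keep the index load in range, and they add instructions beyond the 2 or 4 counted in the note.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 (x86 SSE) */

#define PF_DIST 16  /* hypothetical prefetch distance, in iterations */

/* Direct addressing: the prefetch address is simply &a[i + PF_DIST]. */
long sum_direct(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PF_DIST < n)
            _mm_prefetch((const char *)&a[i + PF_DIST], _MM_HINT_T0);
        sum += a[i];
    }
    return sum;
}

/* Indirect addressing: the index b[i + PF_DIST] has to be loaded
 * before the prefetch address &a[b[i + PF_DIST]] can be formed,
 * which is why this case needs more instructions. */
long sum_indirect(const int *a, const int *b, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PF_DIST < n)
            _mm_prefetch((const char *)&a[b[i + PF_DIST]], _MM_HINT_T0);
        sum += a[b[i]];
    }
    return sum;
}
```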
Software prefetch distance. D: prefetch distance; l: prefetch latency; s: length of the shortest path through the loop body. D must be large enough to cover the memory latency, but if it is too large, data in the cache is evicted early and cache misses increase.
Harmful software prefetching: stress on the cache and on memory bandwidth.
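The three symbols are usually related by the bound D >= ceil(l / s): the prefetch must be issued at least enough iterations ahead for l cycles to pass. The sketch below assumes that bound and measures both l and s in cycles. `prefetch_distance` and `sum_with_distance` are hypothetical helper names, and the numbers in the usage note are made-up inputs, not measurements.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 (x86 SSE) */

/* Smallest D that satisfies D >= ceil(l / s).
 * l: prefetch latency in cycles, s: shortest loop-body path in cycles.
 * Assumption: both are given in the same unit. */
size_t prefetch_distance(size_t l, size_t s)
{
    if (s == 0)
        return 0;            /* no meaningful distance for an empty loop */
    return (l + s - 1) / s;  /* integer ceiling division */
}

/* Sum an array while prefetching d iterations ahead. */
long sum_with_distance(const int *a, size_t n, size_t d)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (d < n && i < n - d)  /* only prefetch addresses inside the array */
            _mm_prefetch((const char *)&a[i + d], _MM_HINT_T0);
        sum += a[i];
    }
    return sum;
}
```

For example, `prefetch_distance(300, 20)` gives 15 iterations. Choosing a much larger d keeps the result of `sum_with_distance` the same but risks the early-eviction problem described above.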
EXPERIMENTAL METHODOLOGY
where K is a constant factor, L is the average memory latency, IPC_bench is the profiled average IPC of each benchmark, and W is the average instruction count in one loop iteration
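The equation these symbols belong to is not recorded here. They fit a prefetch-distance formula of the following form, which is a reconstruction from the definitions and should be checked against the source:

```latex
\[
  \mathrm{Distance} = \frac{K \times L \times \mathrm{IPC}_{\mathrm{bench}}}{W}
\]
```

Read this way, L times the benchmark IPC estimates how many instructions execute during one memory latency, and dividing by W converts that count into loop iterations. K then scales the result.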