Zero871015
Design of a High Performance Branch Predictor Based on Global History Considering Hardware Cost
Try
HackMD
Zero871015
·
Follow
Last edited by
Zero871015
on
Jan 13, 2022
Linked with GitHub
Contributed by
0
Comments
Feedback
Log in to edit or delete your comments and be notified of replies.
Sign up
Already have an account? Log in
There is no comment
Select some text and then click Comment, or simply add a comment to this page from below to start a discussion.
Discard
Send
Design of a High Performance Branch Predictor Based on Global History Considering Hardware Cost
Abstract
分支預測(Branch prediction)是提高效能的關鍵技術
CPU 執行時,分支佔所有指令的 20~30%
基於 global history 的動態分支預測結構簡單,占用的內存少,但有 index aliasing 的問題
本文對分支預測改進,將預測生成的 index 和指令 PC 位址最後 8-bits 做 XOR,緩解 index aliasing 的問題
使用 simplescalar 3.0 模擬
實驗表明,GHR 為 12 時,改進的分支預測器效果最好,平均比原本的高出 1.93%
Introduction
以一般 MIPS 架構的 five-level pipeline 為例,如果沒有分支預測,在遇到分支指令時可能導致 pipeline 中所有指令都被清除
分支預測分為靜態和動態
靜態的通常預測率較低
基於全局歷史寄存器(GHR)的動態分支預測器使用移位的分支歷史寄存器記錄第一級最近 n 條指令的執行情況,並使用具有四個狀態值的模式歷史表(PHT)來預測分支的跳轉方向二級指令
因為 GHR 寬度有限,不同指令可能有相同的 GHR,導致預測率下降
本文目標是改進 Gselect 類型的分支預測器
Shortages of Several Classical Dynamic Branch Predictors
Branch Predictor Based on 1-bit Saturation Counter
用 1-bit 來紀錄上次有沒有跳轉
如果一直預測成功就永遠不會改變狀態
如果預測結果頻繁改變效果就會很差,實務上沒有過這種作法
Branch Predictor Based on 2-bit Saturation Counter
用 2-bits 紀錄前兩次有沒有跳轉
變成四種狀態的狀態機,概念和上面的基本一樣
One-Level Branch Predictor
用一部分的 PC 值當作 index 去找 PHT
Image Not Showing
Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →
第一個問題是不同 PC 會有同樣的 index
第二個問題是某些狀況之下預測率會很低
Two-level branch predictor
有三個 PHT
Image Not Showing
Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →
用 Choice-PHT 決定要用哪張 PHT
只改變 Choice-PHT
硬體成本高
看不太懂= =
Hybird and Neural Network Predictor
結合多個預測器
延遲高,需要時間訓練
硬體成本也高
New Scheme Proposal
Gselect 分支預測器使用 PC 高位和 GHR 低位當作 PHT 的 index,而沒使用 PC 低位
因為沒辦法準確區分沒使用到的 PC,導致 Gselect 預測率下降
為了充分使用 PC,本文將 Gselect 分支預測器生成的 index 和 PC 低位 8-bits XOR
Image Not Showing
Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →
假設 PC 和 GHR 為 16-bits,Gselect 將 PC 的高 8 位和 GHR 的低 8 位合併得到 index;然後我們再將 PC 的低 8 位和結果做 XOR
只影響結果的低 8 位
一個 Gselect 會重複 index 但 XOR 不會的例子
Index Name
Value
Gselect
XOR-Gselect
PC1
1010001111001111
1010001110101010
1010001101100101
GHR1
0011110010101010
PC2
1010001100000001
1010001110101010
1010001110101011
GHR2
1010111110101010
無法證明 XOR 有更好的表現,因此使用 simplescalar 執行一堆指令來判斷效能
Experimental Environment and Results
Experimental Environment
使用 SimpleScalar 3.0 sim-bpred 模擬,並使用 Gselect, Bimode 比較
Experimental Results
GHR 的長度會影響預測的成功率,因此這會是我們需要研究的因素之一
Image Not Showing
Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →
長度為 12 的時候飽和
PC 低位使用的數量也會影響結果(for XOR)
Image Not Showing
Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →
Image Not Showing
Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →
根據上面的結果,GHR 長度使用 12,並和 Bimode, Gselect 比較的結果
Image Not Showing
Possible Reasons
The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported
Learn More →
平均比 Gselect 好 1.93%,但比 Bimode 差
Bimode 的硬體成本高,因此輸了很正常
Conclusion
使用 XOR-Gselect 改善 index aliasing 的問題
實驗表明 GHR 長度為 12 時效果最好,並且 XOR 能改善 Gselect 的預測率
tags:
paper
Design of a High Performance Branch Predictor Based on Global History Considering Hardware Cost
Abstract
Introduction
Shortages of Several Classical Dynamic Branch Predictors
Branch Predictor Based on 1-bit Saturation Counter
Branch Predictor Based on 2-bit Saturation Counter
One-Level Branch Predictor
Two-level branch predictor
Hybird and Neural Network Predictor
New Scheme Proposal
Experimental Environment and Results
Experimental Environment
Experimental Results
Conclusion
Expand all
Back to top
Go to bottom
Design of a High Performance Branch Predictor Based on Global History Considering Hardware Cost
Abstract
Introduction
Shortages of Several Classical Dynamic Branch Predictors
Branch Predictor Based on 1-bit Saturation Counter
Branch Predictor Based on 2-bit Saturation Counter
One-Level Branch Predictor
Two-level branch predictor
Hybird and Neural Network Predictor
New Scheme Proposal
Experimental Environment and Results
Experimental Environment
Experimental Results
Conclusion
Expand all
Back to top
Go to bottom
×
Sign in
Email
Password
Forgot password
or
By clicking below, you agree to our
terms of service
.
Sign in via Facebook
Sign in via Twitter
Sign in via GitHub
Sign in via Dropbox
Sign in with Wallet
Wallet (
)
Connect another wallet
New to HackMD?
Sign up
Comment