https://www.gem5.org/documentation/general_docs/cpu_models/execution_basics
https://www.gem5.org/documentation/general_docs/cpu_models/O3CPU
gem5/src/arch/riscv/isa/templates/vector_mem.isa VlIndexMicroInitiateAcc
fault = initiateMemRead(xc, EA, mem_size, memAccessFlags, byte_enable);
(for o3 model)
->gem5/src/cpu/o3/dyn_inst.cc DynInst::initiateMemRead
cpu->pushRequest(...)
->gem5/src/cpu/o3/cpu.hh pushRequest
iew.ldstQueue.pushRequest(...)
->gem5/src/cpu/o3/lsq.cc LSQ::pushRequest
request->initiateTranslation();
->gem5/src/cpu/o3/lsq.cc LSQ::????::initiateTranslation()
sendFragmentToTranslation(...)
->gem5/src/cpu/o3/lsq.cc LSQ::LSQRequest::sendFragmentToTranslation
_port.getMMUPtr()->translateTiming(...)
(TLB translation happens here)
->gem5/src/arch/riscv/tlb.cc TLB::translateTiming
This is probably the path our implementation will actually take. To bypass TLB translation, we may need to create our own LSQRequest or use UnsquashableDirectRequest.
https://ieeexplore.ieee.org/document/10027181
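Pseudocode (draft)
pn = []      // translated page number per element
vs = []      // index vector (vs2)
mask = []    // 1 = element still needs a translation; initially all 1
rs           // scalar base address (rs1)
base = 0
for(n in N steps) {
  pn[base] = translate(rs + vs[base])
  for(i in vs) {
    if (vs[i] == vs[base]){
      pn[i] = pn[base]
      mask[i] = 0
    }
  }
  base = first index that is 1 in mask   // stop early if mask is all 0
}
for(i in vs) {
  if (mask[i] == 1){
    pn[i] = translate(rs + vs[i])
  }
}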
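The pseudocode above can be sketched as runnable Python to check the control flow. This is only a model under stated assumptions, not gem5 code: `coalesce_translations`, its `translate` callback, and `n_steps` are hypothetical names, and elements are matched on equal index values exactly as in the draft.

```python
def coalesce_translations(rs, vs, translate, n_steps):
    """Return one translation result per element of vs, reusing results
    for elements whose index equals the current base element's index.

    rs        -- scalar base address (rs1)
    vs        -- list of index values (vs2)
    translate -- callback standing in for the TLB lookup (hypothetical)
    n_steps   -- number of coalescing rounds before the per-element fallback
    """
    pn = [None] * len(vs)
    mask = [1] * len(vs)              # 1 = element still needs a translation
    base = 0 if vs else None

    for _ in range(n_steps):
        if base is None:              # every element is already translated
            break
        pn[base] = translate(rs + vs[base])
        for i, v in enumerate(vs):
            if v == vs[base]:         # same index value, so same address
                pn[i] = pn[base]
                mask[i] = 0
        # "first index that is 1 in mask", or None when the mask is empty
        base = next((i for i, m in enumerate(mask) if m), None)

    # Fallback: translate whatever is still pending, one element at a time.
    for i, v in enumerate(vs):
        if mask[i]:
            pn[i] = translate(rs + v)
    return pn
```

With indices `[0, 8, 0, 8, 0x2000]` and `n_steps=2`, this model calls `translate` three times instead of five: once per coalescing round and once in the fallback loop.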
For indexed instructions:
decode
gem5/src/arch/riscv/isa/decoder.isa
...
0x3: VlIndexOp::vloxei32_v(
{{ Vd_vu[vdElemIdx] = Mem_vc.as<vu>()[0]; }},
{{ EA = Rs1 + Vs2_uw[vs2ElemIdx]; }},
inst_flags=SimdIndexedLoadOp
);
...
instruction constructor
gem5/src/arch/riscv/isa/templates/vector_mem.isa
def template VlIndexConstructor
...
microop = new %(class_name)sMicro<ElemType>(machInst, vdRegIdx, vdElemIdx, vs2RegIdx, vs2ElemIdx, elen, vlen);
...
microop initiates the memory request
gem5/src/arch/riscv/isa/templates/vector_mem.isa
def template VlIndexMicroInitiateAcc
// EA = Rs1 + Vs2_uw[vs2ElemIdx];
fault = initiateMemRead(xc, EA, mem_size, memAccessFlags, byte_enable);
microop completes the memory request
gem5/src/arch/riscv/isa/templates/vector_mem.isa
def template VlIndexMicroCompleteAcc
...
memcpy(Mem.as<uint8_t>(), pkt->getPtr<uint8_t>(), pkt->getSize());
//Vd_vu[vdElemIdx] = Mem_vc.as<vu>()[0];
%(memacc_code)s;
...
vmseq, vmsne: compare each vector element with an element (equal / not equal) and produce a mask
vfirst: find the index of the first set bit in a mask vector
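These instructions look like candidates for the inner loop of the pseudocode and for its "first index that is 1 in mask" step. The Python model below is a rough sketch of that mapping, not an implementation: the functions only imitate the unmasked vector-scalar forms on plain lists, and `coalesce_step` is a hypothetical helper name.

```python
def vmseq(vs2, scalar):
    """Mask bit i is 1 where vs2[i] == scalar."""
    return [1 if e == scalar else 0 for e in vs2]

def vmsne(vs2, scalar):
    """Mask bit i is 1 where vs2[i] != scalar."""
    return [1 if e != scalar else 0 for e in vs2]

def vfirst(mask):
    """Index of the first set mask bit, or -1 if no bit is set."""
    for i, bit in enumerate(mask):
        if bit:
            return i
    return -1

def coalesce_step(vs, pending, base):
    """One coalescing round (hypothetical helper).

    Returns (match, new_pending, next_base): the elements that share the
    base element's index, the pending mask with those elements cleared,
    and the next base index (-1 when nothing is pending).
    """
    match = vmseq(vs, vs[base])
    new_pending = [p & (1 - m) for p, m in zip(pending, match)]
    return match, new_pending, vfirst(new_pending)
```

Calling `coalesce_step(vs, [1] * len(vs), 0)` repeatedly, feeding back the returned mask and base, walks through the same rounds as the draft loop until `vfirst` returns -1.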