
# **"Advanced SoC Design" Course Study Journal**

This study journal was written by Yeh Cheng-Hong (112501538), CoSR department, NTHU.

Course: NTHU 11220EE525200 Advanced SoC Design (高階系統晶片設計)

For last semester's SoC Design course study journal, please click [here](https://hackmd.io/@whywhytellmewhy/S1joJxm1p).

## Lab 1: Integrate FIR into FSIC
- GitHub location of the assignment: https://github.com/bol-edu/fsic_fpga/tree/main
- Updated GitHub location (as announced on eeclass): https://github.com/bol-edu/caravel-soc_fpga-lab/tree/main/fsic-sim
- Files that were changed:
    1. Add FIR-related files from `/Final_project` (in [SoC design course](https://github.com/whywhytellmewhy/SOC-design)) into `/lab_1/fsic_fpga/rtl/user/user_subsys/user_prj/user_prj1/rtl`
    2. `/lab_1/fsic_fpga/rtl/user/user_subsys/user_prj/user_prj1/rtl/user_prj1.v`
    3. `/lab_1/fsic_fpga/rtl/user/user_subsys/user_prj/user_prj1/rtl/rtl.f`
    4. `/lab_1/fsic_fpga/rtl/user/testbench/tc/filelist`
    5. `/lab_1/fsic_fpga/rtl/user/testbench/tb_fsic.v`
- Issues
    1. In tb_fsic, is it "Program ['h3000_5000] = **32'h01**" (**rather than 32'h02**) that enables user_project_1?
        - When the register is not programmed at all, it keeps its default value 0, and user_project_0 runs.
        - When it is explicitly programmed to 1, user_project_1 runs and user_project_0 does not.
        - This can be seen from the following excerpt of `/lab_1/fsic_fpga/rtl/user/config_ctrl/rtl/config_ctrl.v`:
          ![image](https://hackmd.io/_uploads/rkQaeh3qT.png)
          and from the following excerpt of `/lab_1/fsic_fpga/rtl/user/user_subsys/axil_slav/rtl/axil_slav.v`:
          ![image](https://hackmd.io/_uploads/B1TLg3hqp.png)
        - However, this is inconsistent with [this page of the workbook](https://docs.google.com/presentation/d/1V06u84HeIAGzz7sjz3WoeaC0yEsZNmcE/edit#slide=id.p16).
        - ➜ In the end, I decided to use "Program ['h3000_5000] = **32'h01**" for now.
    2.
       If the FIR is placed in user_project_1, the following happens when reading the SoC configuration:
       ![image](https://hackmd.io/_uploads/r1IyQnn9p.png)
       ~~Even though the FIR has completed the handshake, WB still does not send back an ACK.~~
       :::info
       **(Updated)** In the end, I first got the design working in user_project_0 and then moved the FIR into user_project_1. An ACK is now returned, but the value read back is user_project_0's AA55AA55.
       ![image](https://hackmd.io/_uploads/Hkq9B57op.png)
       :::
       ➜ However, if the FIR is placed in user_project_0 (and the design originally in user_project_0 is moved to user_project_1):
       ![image](https://hackmd.io/_uploads/S12SQ23cp.png)
       and "Program ['h3000_5000] = 32'h01" is **left unset**, everything works correctly:
       ![image](https://hackmd.io/_uploads/rk2XNn256.png)
       ➜ In the end, I decided to **place it in user_project_0 for now**.
        - **Solution**:
            1. Searching for "wbs_rdata" in `/lab_1/fsic_fpga/rtl/user/rtl/fsic.v` shows that it comes from `/lab_1/fsic_fpga/rtl/user/config_ctrl/rtl/config_ctrl.v`.
            2. Searching for "wbs_rdata" in `config_ctrl.v` shows that it comes from "wb_axi_rdata":
               ![image](https://hackmd.io/_uploads/BJCjgtMsT.png)
               ![image](https://hackmd.io/_uploads/HknagYGo6.png)
               ➜ Tracing further back, it comes from "m_axi_rdata".
               ➜ "m_axi_rdata" comes from "axi_rdata2" (the part related to the user projects):
               ![image](https://hackmd.io/_uploads/SytrzKzia.png)
               ![image](https://hackmd.io/_uploads/B15nzYfia.png)
            3. Tracing further back, it comes from "axi_rdata" in `/lab_1/fsic_fpga/rtl/user/user_subsys/axil_slav/rtl/axil_slav.v`.
               ➜ In `axil_slav.v`:
               ![Screenshot 2024-02-09 001635](https://hackmd.io/_uploads/rktpR_Gsp.png)
               ➜ ~~The output is obtained by directly ORing the corresponding outputs of user_project_0 ~ user_project_3, rather than being selected through user_prj_sel.~~ **The `...? XX_0 : 0 ;` on lines 117, 122, 126, ... of the figure above should be changed to `XX_1`.**

## Lab 2-1: Catapult FIR
- GitHub location of the assignment: https://github.com/bol-edu/caravel-soc_fpga-lab/tree/main/catapult_hls
- Tasks to complete:
    1. catapult_setup(NTHU).pptx
        - [x] Modify `.tcshrc`
        - [x] Copy the `/home/course/ee5252/catapult` folder to my own working space
        - [x] Run `source sun_catapult`
        - [x] Run lab1_fir once ➜ completed at 03:03 a.m. on 2024-04-05, in Pingtung
        - [x] Create a "bin" folder to save the executable
- Implementation notes:
    1. **01_walkthrough_loops**
        1. `cd /catapult/catapult-for-soc-course/lab1_fir/01_walkthrough_loops`
        2. `mkdir bin`; otherwise `make` fails with an error
        3. `make`
        4. `./bin/SCVerify_fir.exe`
        5. For the rest, follow the steps on pp. 11 ~ 31 of "Step_by_step_lab1_FIR.pdf"
    2. **02_mem_ifc**
        1. Follow the steps on pp. 32 ~ 40 of "Step_by_step_lab1_FIR.pdf"
    3.
       **03_multi_blks**
        1. Follow the steps on pp. 41 ~ 59 of "Step_by_step_lab1_FIR.pdf"
        2. Improve the throughput from 10 to 1:
            1. Apply "Unroll" to "SHIFT" and "MAC" in block0 and block1
            2. Apply "Pipeline (II=1)" to "main" in block0, block1, and block2
            3. After the "RTL" step, the result is
               ![image](https://hackmd.io/_uploads/r1pjFTa10.png)
               where top.v1 is the initial setting, top.v2 is the result before applying "Pipeline (II=1)" to "main" in block2, and top.v3 is the final result.
        3. Follow the steps on p. 61 of "Step_by_step_lab1_FIR.pdf"
- ### ==**Things to watch out for when converting C/C++ to HLS**==
    1. From "Step_by_step_lab1_FIR.pdf"
        - Remember to add these headers:
            ```C
            #include <ac_int.h>
            #include <ac_channel.h>
            #include <mc_scverify.h>
            ```
          Sometimes fixed-point fractional numbers are also needed (similar to floating point, but the hardware must fix the number of bits):
            ```C
            #include <ac_fixed.h>
            ```
          :::warning
          If the C code has `#include <math.h>`, change it to `#include <ac_math.h>`, which is provided by Catapult.
          :::
        - Every class must have an initialization/constructor, for example:
            ```C
            class fir {
            private:
                ...
            public:
                fir() {
                    ...
                }
            };
            ```
        - Add `#pragma hls_design interface` before the top function in the class, and rewrite the top function as follows:
            1. The original C code is
                ```C
                void run(int input, int coeffs[8], int &output) {
                    ...
                }
                ```
            2. After rewriting it for HLS, it becomes
                ```C
                #pragma hls_design interface
                void CCS_BLOCK(run)(ac_channel<int8> &input, int8 coeffs[8], ac_channel<int8> &output) {
                    ...
                }
                ```
        - Reading an input value changes from `A=input;` to `A=input.read();`
        - Writing an output value changes from `output=temp;` to `output.write(temp);`
        - Except for the `int i` declared in a for loop, `int temp;` must be rewritten as `intN temp;`, where N is the bit width (for example `int8`)
        - Add a label before every for loop, and iterate `i` in the same order across loops whenever possible, for example:
            1. The original C code is
                ```C
                for (int i=7; i>=0; i--) {
                    ...
                }
                for (int i=0; i<8; i++) {
                    ...
                }
                ```
            2. After rewriting it for HLS, it becomes
                ```C
                SHIFT:for (int i=7; i>=0; i--) {
                    ...
                }
                MAC:for (int i=7; i>=0; i--) {
                    ...
                }
                ```
        - ### Memory Interface
          To switch to a memory interface, the following changes are needed:
            - The original
                ```C
                #pragma hls_design interface
                void CCS_BLOCK(run)(ac_channel<int8> &input, int8 coeffs[8], ac_channel<int8> &output) {
                    ...
                }
                ```
              becomes
                ```C
                #pragma hls_design interface
                void CCS_BLOCK(run)(ac_channel<int8> &input, ac_int<8> coeffs[32][8], ac_channel<ac_int<5,false>> &coeff_addr, ac_channel<int8> &output) {
                    ...
                }
                ```
            - Except for the `int i` declared in a for loop, `intN temp;` must be rewritten as `ac_int<N> temp;`
            - The original `coeffs[8]` was an input; it must be changed to a memory interface
                ```C
                ac_int<5,false> addr = coeff_addr.read();
                ```
              and its values are then read with `coeffs[addr][i]`
        - ### Multiple Blocks
          To implement multiple blocks, note the following:
            - The top design uses `#pragma hls_design interface top`; the other (lower-level) blocks use `#pragma hls_design interface`
            - The blocks are connected to each other with ac_channel, used in the top function like this:
                ```C
                class top {
                    ac_channel<ac_int<8>> connect0;
                    fir block0;
                    fir block1;
                public:
                    top () {}
                    #pragma hls_design interface top
                    void CCS_BLOCK(run)(ac_channel<ac_int<8>> &din, ..., ac_channel<ac_int<8>> &dout) {
                        block0.run(din, ..., connect0);
                        block1.run(connect0, ..., dout);
                    }
                };
                ```

## Lab 2-2: Catapult Edge Detect
- GitHub location of the assignment: https://github.com/bol-edu/caravel-soc_fpga-lab/tree/main/catapult_hls (the same GitHub repository as lab 2-1)
- Implementation notes:
    1. **01_edgedetect**
        1. `cd /catapult/catapult-for-soc-course/lab2_edgedetec_fsic/01_edgedetect`
        2. `make`
        3. `cd bin`
        4. `source run.sh` (running `./run.sh` directly gives Permission denied, probably because the file lacks the execute bit)
        5. `cd ../catapult_work`
        6. `catapult &`, then follow the steps on pp. 34 ~ p.???????????? of "Step_by_step_lab2_EdgeDetect.pdf"
    2. **02_edgedetect_fsic**
        1. `cd /catapult/catapult-for-soc-course/lab2_edgedetec_fsic/02_edgedetect_fsic/hls_c/inc`
        2. `cp ../../../01_edgedetect/hls_c/inc/* .` and modify the copied files
- Key takeaways:
    1. From "Step_by_step_lab2_EdgeDetect.pdf"
        - "DirectInput" is used for stable ports, which avoids inserting unnecessary pipeline registers
        - The data input channel is set to "ccs_in_wait_coupled", which avoids an extra FIFO buffer
    2. From "EdgeDetect_VerDer.h"
        - When the upper bound of a for loop is a variable, the loop can still be made synthesizable by removing the bound from the loop condition, declaring `i` with a bit-accurate data type so that all possible iteration counts are covered, and adding a break condition at the end:
            ```C
            for (uint8 i = 0; ; i++) {
                ...
                if (i == upper_bound) {
                    break;
                }
            }
            ```

## Lab 3: Synopsys IC flow
- Problems encountered during implementation and their solutions:
    1. Error in `/lab2_pnr`:
       ![S__47546384](https://hackmd.io/_uploads/rkSvvamxR.jpg)
       :::success
       **Solution**: After switching servers (ws44 ➜ ws27), the error went away.
       :::
    2.
       Error in `/lab_pt`:
       ![image](https://hackmd.io/_uploads/Sy2v5TXxA.png)
       :::success
       **Solution**: In `/lab_pt/work/Makefile`, replace `/bin/tclsh ./../script/gen_pt_cmd.tcl` with
       ```console
       tclsh ./../script/gen_pt_cmd.tcl
       ```
       :::
    3. Error in `/lab_pt`:
       ![image](https://hackmd.io/_uploads/SJnhspme0.png)
       Searching online for `primetime Error: Library Compiler executable path is not set. (PT-063)` shows that the environment variable `SYNOPSYS_LC_ROOT` is not set ([reference 1](https://www.fasteda.cn/post/240.html), [reference 2](https://bbs.eetop.cn/thread-906741-1-1.html), [reference 3](https://www.eecs.umich.edu/dco/docs/ecad/synopsys.html), [reference 4](https://blog.csdn.net/m0_61544122/article/details/129873760), [reference 5](https://zhuanlan.zhihu.com/p/401603501)).
       :::info
       💡 Running `printenv SYNOPSYS_LC_ROOT` in the terminal returns nothing, which confirms that `SYNOPSYS_LC_ROOT` is indeed not set.
       :::
       A Google search finally turned up the folder that the path setting needs to point to:
       ![image](https://hackmd.io/_uploads/SkoYAaQxR.png)
       and `ls /usr/cad/synopsys/lc/cur` on the EE department workstation confirmed that this folder exists.
       :::success
       **Solution**: Add this line to `.tcshrc`:
       ```console
       setenv SYNOPSYS_LC_ROOT /usr/cad/synopsys/lc/cur
       ```
       and everything runs correctly!
       :::
    4.
       Error in `/lab4_finishing`:
       ![image](https://hackmd.io/_uploads/r1Z4GVNxC.png)
       This error took a very long time to solve QAQ. The clue came from line 23 of `/lab4_finishing/scripts/step7_finishing.tcl`, "Use correct ICV version":
       ![image](https://hackmd.io/_uploads/rJNSODuxC.png)
       So I suspected a version mismatch between IC Compiler II (icc2_shell) and IC Validator (icv): icc2_shell does not support that icv version, so it cannot pick up the binary file. I first searched online for the related commands, mainly relying on [this site](https://www.cnblogs.com/ASIC-Horizon/p/17071009.html), and learned that the environment variables "ICV_HOME_DIR", "ICV_INCLUDES", and "PATH" are involved. In addition, p. 543 of "Reference > icc2ug.pdf" in the [ASoC course materials](https://drive.google.com/drive/folders/1cGNs2LfW5ZdoaEpi_Nh6iANl7V4f6rtv?usp=sharing) (shown below; found by searching for the keyword "ICV") confirms that a matching version is required for it to run correctly.
       ![image](https://hackmd.io/_uploads/r196aDOxR.png)
       Running `report_versions` inside `icc2_shell` showed that the icc2_shell set up by `source /usr/cadtool/user_setup/08-icc2.csh` is a 2020 version, which does not support the 2022 version of icv (`source /usr/cadtool/user_setup/08-icv.csh`). These versions can also be checked on the [EE department CAD Tool List page](https://web.ee.nthu.edu.tw/p/405-1175-169285,c4918.php?Lang=zh-tw). There is also a 2021.06 version of icc2_shell (`source /usr/cadtool/cad/synopsys/CIC/icc2_2021.cshrc`), but it still does not support the 2022 icv. Later I stumbled on the fact that the Synopsys tools are split across two main folders, `/usr/cadtool/cad/synopsys` and `/usr/cadtool/user_setup`; the latter seems to be a newer area created to make version management easier. `/usr/cadtool/cad/synopsys/icvalidator` contains a 2021.06 version, which is compatible with the 2021.06 version of icc2_shell! However, there is no ready-made .csh file to source for it, so, following the style of the .csh / .cshrc files mentioned above and the figure's note that "ICV_HOME_DIR" and "path" must be set, I wrote `/home/course/m112501538/ASoC_lab3.csh` and sourced it from `.tcshrc`.
       :::success
       **Solution**: Create the file `/home/course/m112501538/ASoC_lab3.csh` with the following content:
       ```console
       setenv ICV_HOME_DIR "/usr/cadtool/cad/synopsys/icvalidator/2021.06"
       set path = (/usr/cadtool/cad/synopsys/icvalidator/2021.06/bin/LINUX.64 $path)
       ```
       Then make these 3 changes in `.tcshrc`:
       1. Change `source /usr/cadtool/user_setup/08-icc2.csh` to `source /usr/cadtool/cad/synopsys/CIC/icc2_2021.cshrc`
       2. Comment out `source /usr/cadtool/user_setup/08-icv.csh`
       3. Add this line: `source /home/course/m112501538/ASoC_lab3.csh`
       :::
    5.
       Error in `/lab_formal`:
       ![image](https://hackmd.io/_uploads/HyOalBulR.png)
       :::success
       **Solution**: In `/lab_formal/work/Makefile`, replace `/bin/tclsh ./../script/gen_formality_cmd.tcl` with
       ```console
       tclsh ./../script/gen_formality_cmd.tcl
       ```
       :::
    6. Error in `/lab_formal`:
       ![image](https://hackmd.io/_uploads/ry_sGrde0.png)
       The cause is that Formality cannot be launched with the `fm_shell` command on the EE department workstation:
       ![image](https://hackmd.io/_uploads/Sktt2UulC.png)
       Opening the setup script with `vim /usr/cadtool/user_setup/08-formality.csh`
       ![image](https://hackmd.io/_uploads/HyRbTLdlC.png)
       shows that the tool is installed under `/usr/cad/synopsys/formality/`. Descending through the folders level by level locates the executable:
       ![image](https://hackmd.io/_uploads/By5NlPuxA.png)
       So Formality can be launched with
       ```console
       /usr/cad/synopsys/formality/2021.06/bin/fm_shell
       ```
       as shown here:
       ![image](https://hackmd.io/_uploads/HknulvOg0.png)
       :::success
       **Solution**: In `/lab_formal/work/Makefile`, replace `FM_EXEC = fm_shell` with
       ```console
       FM_EXEC = /usr/cad/synopsys/formality/2021.06/bin/fm_shell
       ```
       and everything runs correctly!
       :::

## Lab 4: Simulation & Validation of FSIC with FPGA
- On pushing the lab 1 update (/fsic_fpga/rtl/user/user_subsys/axil_slav/rtl/axil_slav.v) to [bol-edu/fsic_fpga](https://github.com/bol-edu/fsic_fpga/tree/main): many thanks to Michael Kao for providing this [reference site](https://gitbook.tw/chapters/github/pull-request)
    1. **Fork** the [bol-edu/fsic_fpga](https://github.com/bol-edu/fsic_fpga/tree/main) repository to my own GitHub account
    2. Modify the files in the fork (the copy under my own account)
    3. `git add`, `git commit`, and `git push` to my own account
    4. Open a pull request on [bol-edu/fsic_fpga](https://github.com/bol-edu/fsic_fpga/tree/main)
- Findings during implementation:
    - (/fsic_fpga/vivado/fsic_tb.sv) For what `fork ... join_none` is used for, see [this site](https://blog.csdn.net/a52228254/article/details/106184602)
    - /fsic_fpga/vivado/fsic_tb.sv contains:
        ```verilog
        fork
            fw_mb_st_t();
        join_none
        @(fw_mb_st_event);
        fork
            fw_mb_wd_t();
        join_none
        @(fw_mb_wd_event);
        ```
      where the two tasks `fw_mb_st_t()` and `fw_mb_wd_t()` are defined as:
        ```verilog
        task fw_mb_st_t;
            begin
                wait(DUT.design_1_i.caravel_0_mprj_o[37] == 1'b0);
                $display($time, "=> FW starts MB writing, caravel_0_mprj_o[37] = %0b", DUT.design_1_i.caravel_0_mprj_o[37]);
                ->> fw_mb_st_event;
            end
        endtask
        task fw_mb_wd_t;
            begin
                wait(DUT.design_1_i.caravel_0_mprj_o[37] == 1'b1);
                $display($time, "=> FW finishs MB writing, caravel_0_mprj_o[37] = %0b", DUT.design_1_i.caravel_0_mprj_o[37]);
                ->> fw_mb_wd_event;
            end
        endtask
        ```
      From this it follows that the comments in /fsic_fpga/testbench/fsic/fsic.c
        ```C
        case 5:
            reg_mprj_datah = 0x00; //set mprj_io[37] to 1'b0 to indicate FW going to waiting fpga MB test
            reg_fsic_aa_mb0 = 0x5a5a5a5a;
            reg_fsic_cc = 0x00000004;
            reg_mprj_datah = 0x20; //set mprj_io[37] to 1'b0 to indicate FW going to waiting fpga MB test
            break;
        ```
      should be changed to
        ```C
        reg_mprj_datah = 0x00; //set mprj_io[37] to 1'b0 to indicate FW going to "write" fpga MB test
        reg_mprj_datah = 0x20; //set mprj_io[37] to "1'b1" to indicate FW "has finished writing" fpga MB
        ```
    - (/fsic_fpga/vivado/fsic_tb.sv and /fsic_fpga/testbench/fsic/fsic.c) Mailbox reads and writes go in two directions:
        - Task SocLocal_MbWrite() is the SoC side writing to the FPGA side. The flow is:
            1. First write 1 to address "PL_AA (16'h2100) offset=0"; I am not sure whether this is meant to enable aa_irq_en on the FPGA side
            2. The SoC side writes a value to the mailbox (reg_fsic_aa_mb0) and signals completion through mprj_io[37]
            3. The FPGA side goes to address PL_AA_MB (16'h2000) to receive the value sent through the mailbox
            4. FPGA side: the aa_mb_irq status (the value of aa_mb_irq) should now be 1, which can be checked by reading PL_AA (16'h2100) offset=4
            5. When the FPGA side writes **1** to address "PL_AA (16'h2100) offset=4", the bit automatically becomes **0** (meaning aa_mb_irq is reset to 0), quite magical! (This looks like write-1-to-clear behavior.)
        - Task FpgaLocal_MbWrite() is the FPGA side writing to the SoC side. The flow is:
            1. The SoC side writes 1 to reg_fsic_aa_irq_en and signals completion through reg_fsic_cc
            2. The FPGA side writes a value to the mailbox (PL_AA_MB (16'h2000) offset=0)
            3. The SoC side goes to reg_fsic_aa_mb0 to receive the value sent through the mailbox
            4.
               SoC side: the aa_irq status (the value of reg_fsic_aa_irq_sts) should now be 1
            5. When the SoC side writes **1** to reg_fsic_aa_irq_sts via `reg_fsic_aa_irq_sts = 1;`, the bit automatically becomes **0** (meaning reg_fsic_aa_irq_sts is reset to 0), quite magical!
- Files that were changed:
    - Integrate FIR into FSIC (see "Files that were changed" in lab 1):
        1. Add FIR-related files from `/lab1-1/lab_1/fsic_fpga/rtl/user/user_subsys/user_prj/user_prj1/rtl` (in [Github of Advanced SoC design course](https://github.com/whywhytellmewhy/Advanced-SoC-design)) into `/ASoC_lab4_FSIC_FPGA/rtl/user/user_subsys/user_prj/user_prj1/rtl`
        2. Copy the content from `/lab1-1/lab_1/fsic_fpga/rtl/user/user_subsys/user_prj/user_prj1/rtl/user_prj1.v` to `/ASoC_lab4_FSIC_FPGA/rtl/user/user_subsys/user_prj/user_prj1/rtl/user_prj1.v`
        3. Copy the content from `/lab1-1/lab_1/fsic_fpga/rtl/user/user_subsys/user_prj/user_prj1/rtl/rtl.f` to `/ASoC_lab4_FSIC_FPGA/rtl/user/user_subsys/user_prj/user_prj1/rtl/rtl.f`
        4. `/lab1-1/lab_1/fsic_fpga/rtl/user/testbench/tc/filelist`
        5. Copy the content from `/lab1-1/lab_1/fsic_fpga/rtl/user/testbench/tb_fsic.v` to `/ASoC_lab4_FSIC_FPGA/rtl/user/testbench/tb_fsic.v`
    - Integrate FIR into the source files of the Vivado project:
        1. Add FIR-related files from `/lab1-1/lab_1/fsic_fpga/rtl/user/user_subsys/user_prj/user_prj1/rtl` into `/ASoC_lab4_FSIC_FPGA/vivado/vvd_srcs/caravel_soc/rtl/user/user_subsys/user_prj/user_prj1/rtl`
        2. Copy the content from `/lab1-1/lab_1/fsic_fpga/rtl/user/user_subsys/user_prj/user_prj1/rtl/user_prj1.v` to `/ASoC_lab4_FSIC_FPGA/vivado/vvd_srcs/caravel_soc/rtl/user/user_subsys/user_prj/user_prj1/rtl/user_prj1.v`
        3. Copy the content from `/lab1-1/lab_1/fsic_fpga/rtl/user/user_subsys/user_prj/user_prj1/rtl/rtl.f` to `/ASoC_lab4_FSIC_FPGA/vivado/vvd_srcs/caravel_soc/rtl/user/user_subsys/user_prj/user_prj1/rtl/rtl.f`
    - Modify userDMA
        1. Modify "BUF_LEN" in `/ASoC_lab4_FSIC_FPGA/vivado/vitis_prj/hls_userdma/userdma.h`
        2. Comment out the lines related to "in_Img_width" in `/ASoC_lab4_FSIC_FPGA/vivado/vitis_prj/hls_userdma/userdma.cpp`
        3.
           Create a new Vitis project in the `/ASoC_lab4_FSIC_FPGA/vivado/vitis_prj/userdma_fir` directory
    - Simulation testbench
        1. Modify the original `fsic_tb.sv` into `/ASoC_lab4_FSIC_FPGA/vivado/fsic_tb2.sv`
        2. Modify `/ASoC_lab4_FSIC_FPGA/vivado/vvd_caravel_fpga_fsic_sim.tcl`
        3. After running `./run_vivado_fsic_sim` in `/ASoC_lab4_FSIC_FPGA/vivado`, the .vcd file can be found in the `/ASoC_lab4_FSIC_FPGA/vivado/vvd_caravel_fpga_sim/vvd_caravel_fpga_sim.sim/sim_1/behav/xsim` folder, and "updma_input.log" and "updma_output.log" can be found in the `/Advanced_SoC/lab_4/fsic_fpga/vivado` folder
    - Validation
        1. Modify `/ASoC_lab4_FSIC_FPGA/vivado/vvd_caravel_fpga_fsic.tcl`
        2. After running `./run_vivado_fsic` in `/ASoC_lab4_FSIC_FPGA/vivado`, open the Vivado project located in `/ASoC_lab4_FSIC_FPGA/vivado/vvd_caravel_fpga`, add our userDMA IP to the block design, and wire it up as described in [this discussion thread](https://github.com/bol-edu/HLS-SOC-Discussions/discussions/221): "the two streams, instream and outstream, have to be connected manually to userdmaso and userdmasir"
        3. Put the generated .bit and .hwh files in the `/ASoC_lab4_FSIC_FPGA/vivado/jupyter_notebook_fir` folder
        4. Upload the relevant files to onlineFPGA, as described in the README.md of the [shared GitHub repository](https://github.com/ZheChen-Bill/ASoC_lab4_FSIC_FPGA/tree/main/vivado/jupyter_notebook_result)
        5.
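           Download the files produced by onlineFPGA and put them in the `/ASoC_lab4_FSIC_FPGA/vivado/jupyter_notebook_result` folder
- The two major difficulties along the way:
    - During **run simulation**, a deadlock still occurred even though "BUF_LEN" had been changed to 64, a new Vitis project had been created, and the IP had been re-exported. It took from 2024-05-17 until 2024-05-18 to resolve. In the end, after asking a junior classmate, we learned that changing only the value of "BUF_LEN" and re-exporting the project should be enough, so we reverted the other code that had been added for debugging, and it suddenly worked. We still do not know what the underlying cause was.
    - During **run onlineFPGA**, reading SoC-internal state (for example, reading the value at the SOC_CC address) broke the whole board: its state became "unknown" and it could no longer be used. This debugging session started around 4:00 a.m. on 2024-05-18. After breaking several boards there were none left, so we had to wait for the system reset between 5 and 6 o'clock before boards became available again. The next afternoon (2024-05-19), after breaking a few more boards, we finally found that the code that configures LA_DMA or userDMA is what makes the board break on a subsequent "read of SoC-internal state" (we do not know whether a "write of SoC-internal state" also fails, because the value cannot be read back to check), while a "read of LA_DMA/userDMA internal state" still succeeds. So we decided to "configure the SoC-internal signals first, then configure the userDMA signals, and turn the LA_DMA function off", and it finally worked! >w<

## Final project: Optical Flow
- For the final project there is a separate shared [Github repository](https://github.com/whywhytellmewhy/ASoC-Final_project-optical_flow)
- About the Catapult HLS implementation:
    1. While implementing the earlier functions, I ran into a great many issues that made the HLS simulation results differ from the algorithm's results, but at the time they were only recorded in the GitHub commits and not noted here.
    2. After implementing `OpticalFlow_flow_calc.h`, the testbench results differed greatly from the algorithm at the borders, and even in the interior almost every error exceeded 1.

       **\<Debug process\>**
        1. From the information the testbench prints to the screen, pick one pixel, (x, y)=(451,62), to observe
        2. Add debug code to `OpticalFlow_flow_calc.h` (HLS):
           ![image](https://hackmd.io/_uploads/r1Yqly8SR.png)
        3. Add debug code to `OpticalFlow_Algorithm.h` (algorithm C):
           ![image](https://hackmd.io/_uploads/H17l-JIS0.png)
        4. The printed results show that the input values hardly differ, yet in the outputs the denominator alone is off by a factor of 10:
           ![image](https://hackmd.io/_uploads/SJFVZJ8SA.png)
        5.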
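           It occurred to me that the precision might be insufficient. In `OpticalFlow_outer_product.h`:
           ![image](https://hackmd.io/_uploads/BkUAb1LSR.png)
           two `pixel_t` values are multiplied (the components of the `gradient_t` type are of `pixel_t` type) to give an `outer_pixel_t` value (the components of the `outer_t` type are of `outer_pixel_t` type). In `OpticalFlow_defs.h`, the integer part has indeed been widened to $13 \times 2 + 1 = 27$ bits, but that leaves only 5 bits for the fractional part:
           ![image](https://hackmd.io/_uploads/H1_oZkUH0.png)
           This quantization error causes very large differences in the subsequent computation (working through the values above confirms that the two results really do differ a lot).

       **\<Solution\>**
       Change
       ```C
       const int OUTER_PIXEL_T_BIT_WIDTH = 32;
       ```
       to
       ```C
       const int OUTER_PIXEL_T_BIT_WIDTH = 64;
       ```
       The result printed to the screen:
       ![image](https://hackmd.io/_uploads/H1XqdkUrC.png)
       Working through the numbers shows that the two are now very close!

       **\<Remaining issue\>**
       Although the pixel above is fixed, running the testbench shows that many pixels still have a large error, for example (x, y)=(362,399):
       ![image](https://hackmd.io/_uploads/ryFEmxLS0.png)
       From the figure above, the HLS inputs should give approximately $143.485 \times 307.77 - 189.532 \times 189.532 = 8237.999426$, which is also very close to the algorithm, yet the output is 46.8251189.... This looks like overflow, that is, not enough bits.

       **\<Solution\>**
       In `OpticalFlow_flow_calc.h`, `denominator_value` was originally of pixel_t type. From its computation, `denominator_value = tensor_value.val[0]*tensor_value.val[1] - tensor_value.val[3]*tensor_value.val[3];`, the worst case needs an integer part of $27 \times 2 + 1 = 55$ bits, but the integer part of pixel_t has only 13 bits, so no wonder it overflows (the integer part of 8237.999426 needs at least 15 bits to avoid overflow). Therefore `denominator_value` was changed to the `vel_pixel_t` type:
       ![image](https://hackmd.io/_uploads/BJb_wgLBC.png)
       The simulation result for the same pixel is now much more accurate:
       ![image](https://hackmd.io/_uploads/Bk_FNgUBR.png)
       In addition, because `total_output_value` is the value before division by the denominator, it should also use 64 bits to avoid overflow. So in `OpticalFlow_defs.h`, change
       ```C
       const int VEL_PIXEL_T_BIT_WIDTH = 32;
       const int VEL_PIXEL_T_INTEGER_PART = 13;
       ```
       to
       ```C
       const int VEL_PIXEL_T_BIT_WIDTH = 64;
       const int VEL_PIXEL_T_INTEGER_PART = 56;
       ```
       which fixes this pixel!

       **\<Remaining issue\>**
       Although the pixels above are fixed, running the testbench shows that many pixels still produce inf, for example (x, y)=(317,189):
       ![image](https://hackmd.io/_uploads/HkNAsl8SC.png)
       This means the denominator is 0, possibly because there are too few fractional bits to record very small values:
       ![image](https://hackmd.io/_uploads/HJEvjgUBR.png)
       The figure shows that this time, when all the values are very small, too few fractional bits can make the error large.

       **\<Solution\>**
       In `OpticalFlow_defs.h`, change
       ```C
       const int VEL_PIXEL_T_INTEGER_PART = 56;
       ```
       to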
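       ```C
       const int VEL_PIXEL_T_INTEGER_PART = 32;
       ```
       This reduces the number of integer bits (the worst case should not occur that often) so that more bits can be allocated to the fractional part:
       ![image](https://hackmd.io/_uploads/BJ9uoxIrC.png)
       which fixes this pixel!

       **\<Remaining issue\>**
       Although the pixels above are fixed, running the testbench shows that many pixels still produce inf or an error greater than 1, for example (x, y)=(785,287):
       ![image](https://hackmd.io/_uploads/HJfALGLSA.png)
       The fractional part still seems too short and needs even more bits. However, some pixel values are very large (possibly over ten thousand) while others are very small. For very large values, the part after the binary point hardly matters; for very small values, the integer part is all zeros and does not need 32 bits. The currently preferred solution is therefore to "**adaptive shift**" the significant bits into a 32-bit window and record the shift amount, which handles both very large and very small data. (The software version does not have this problem; it still computes a result even at the e-10 order of magnitude.)

       **\<Solution\>**
       See the changes in the GitHub commit "`update with 'shift' feature in OpticalFlow_flow_calc.h`". With the shift feature added, very small fractional values also give correct results. (Along the way there was a problem with a type that could not be converted; in the end the ac_fixed value is first converted to a 32-bit integer with the `.to_int()` method, the multiplication is done on integers, and the result is stored in a 64-bit integer type for output, which removes the compile error.) Because the numerator and the denominator are both multiplied by $2^N$ (where $N$ is the number of shifts), the quotient is already the final result and no shift back is needed.
       ![image](https://hackmd.io/_uploads/BJWNx-PSC.png)

       **\<Remaining issue\>**
       Although the pixels above are fixed, running the testbench shows that many results still have an error greater than 1, for example (x, y)=(354,277):
       ![image](https://hackmd.io/_uploads/BJDH0zPrC.png)
       It looks like errors remain when tensor_value is even smaller.

       **\<Solution\>**
       1. Changing the bit widths of the various types made little difference; only reducing `PIXEL_T_INTEGER_PART` below 13 improved things, and even then the results were not very close. So the only option was to print the output of every function in both the algorithm and HLS and compare them. It turned out that the HLS Ix, which should be 00000...000, had become 11111...111 (the input frame has five consecutive I values of 255, 255, 255, 255, 255 around this pixel), so Ix was nonzero after conversion to decimal, and the relative error is especially large when the values are small:
          ![image](https://hackmd.io/_uploads/SyyComvS0.png)
       2.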
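          While reading the [Catapult HLS bluebook](https://cse.usf.edu/~haozheng/teach/cda4253/doc/hls/hls_bluebook_uv.pdf), I found the `AC_RND` option:
          ![image](https://hackmd.io/_uploads/SyHN67vrA.png)
          So I tried changing
          ```C
          typedef ac_fixed<PIXEL_T_BIT_WIDTH,PIXEL_T_INTEGER_PART, true, AC_TRN, AC_WRAP> pixel_t;
          ```
          to
          ```C
          typedef ac_fixed<PIXEL_T_BIT_WIDTH,PIXEL_T_INTEGER_PART, true, AC_RND, AC_WRAP> pixel_t;
          ```
          This is much more accurate, and Ix becomes 00000...000:
          ![image](https://hackmd.io/_uploads/S1U60QPSC.png)
          ![image](https://hackmd.io/_uploads/BkDACmwBC.png)
          Using `AC_RND` for the other types as well makes it more accurate still:
          ![image](https://hackmd.io/_uploads/Sk9WJVwHC.png)
          ![image](https://hackmd.io/_uploads/Byuz1EvBR.png)
       3. The version up to this point is in the GitHub commit "debug with input of very very small differences, using AC_RND to replace AC_TRN"

       **\<Remaining issue\>**
       Although the pixels above are fixed, running the testbench shows that many results still have an error greater than 1, for example (x, y)=(358,250) and (586,150):
       ![image](https://hackmd.io/_uploads/HJI5ISPSC.png)
       ![image](https://hackmd.io/_uploads/SJijLHDHA.png)
       The "Algorithm_denominator_value" line in the figures shows that these all occur when the values are even smaller (possibly down to the e-20 level or below).

       **\<Solution\>**
       Move the shift feature to where the tiny values originate, so that lower-order bits survive the shift. (The shift currently sits in `OpticalFlow_flow_calc.h`, but by that point the information in the lower bits, closer to the LSB, is already gone.) The figures above show that the source is at "Algorithm_tensor/HLS_tensor", so I decided to move the shift feature into `OpticalFlow_tensor_weight_x.h`. There are two ways to implement this:
       1. This version is the GitHub commit "failed version: shift and slice first, then do filtering". Because the output of this function is set to 32 bits, the two values to be multiplied (the input and the filter coefficient) are each cut down to 16 bits first, so that their product is 32 bits. To do this, the fractional value is shifted left and the 16 MSBs are sliced out. The advantage is that only a 16-bit by 16-bit multiplier is needed. However, simulation showed that it did not work very well (or perhaps it did, and the result only looks much worse because the chosen pixel happens to be an outlier with (Ix,Iy,It)=(0,0,0)).
       2.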
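          This version is the GitHub commit "complete moving 'shift' feature from OpticalFlow_flow_calc.h to OpticalFlow_tensor_weight_x.h". The exact product is first computed and held in a 64+32-bit register, and then the 32 MSBs are taken out as the output. This method further improved the accuracy at (x, y)=(354,277):
          ![image](https://hackmd.io/_uploads/rkXj0auSR.png)
          ![image](https://hackmd.io/_uploads/HJWTRpdSA.png)
          ![image](https://hackmd.io/_uploads/S1pMJCuHC.png)

       **\<Remaining issue\>**
       Although the pixels above are fixed, running the testbench shows that many results still have an error greater than 1. Most of them have a very small denominator (possibly down to the e-20 level or below), for example (x, y)=(371,147):
       ![image](https://hackmd.io/_uploads/ryWSlRdHA.png)

       **\<Solution\>**
       The figure above shows that this pixel actually has (Ix,Iy,It)=(0,0,0), so the tiny computed values are probably just quantization error and should really be close to 0. When the values differ even slightly, the division produces a huge difference! We should therefore check whether a pixel has (Ix,Iy,It)=(0,0,0) before evaluating the LK equations and, if so, directly output "u=0, v=0". This has not been implemented yet.
    3. The Catapult tool allows at most 128 FIFOs, so Ix, Iy, and It cannot be computed the original way, because that would require storing Ix and It in FIFOs spanning 2 rows, which is too many. We changed it to an approach similar to lab 2; the figure below shows the computation flow we settled on after discussion:
       ![image](https://hackmd.io/_uploads/HyzBH4cSA.png =245x206)
       The interfaces between the functions were reworked accordingly, and (x, y)=(354,277) was used to confirm that the result is the same as before the change (the on-screen output was saved to a file with `source run.sh > simulation_ver1--no_store_pixel_data.log`). The modified version is the GitHub commit "rearrange the order of functions: first compute Iy, then compute Ix & It".
    4. When pushing to GitHub, the upload failed, possibly because ws44 was overloaded:
       ![image](https://hackmd.io/_uploads/By8KVznr0.png)
       After searching for the error online and reading [reference 1](https://stackoverflow.com/questions/77816301/git-error-rpc-failed-http-400-curl-22-the-requested-url-returned-error-400) and [reference 2](https://www.cnblogs.com/yourstars/p/15533706.html), I used method 2 from reference 2 and raised the buffer to about 1 GB (5242880000 ÷ 5 = 1048576000), which solved it. Add these lines to "/ASoC-Final_project-optical_flow/.git/config":
       ```console
       [http]
           postBuffer = 1048576000
       ```
- About the FSIC-FPGA implementation (a flow similar to lab 4), files that were changed:
    - Integrate OpticalFlow into the source files of the Vivado project
        1. Add OpticalFlow-related files (`spram.v`, `concat_rtl.v`) from `/ASoC-Final_project-optical_flow/rtl/user/user_subsys/user_prj/user_prj2/rtl` into `/ASoC-Final_project-optical_flow/vivado/vvd_srcs/caravel_soc/rtl/user/user_subsys/user_prj/user_prj2/rtl`
        2.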
           Copy the content from `/ASoC-Final_project-optical_flow/rtl/user/user_subsys/user_prj/user_prj2/rtl/user_prj2.v` to `/ASoC-Final_project-optical_flow/vivado/vvd_srcs/caravel_soc/rtl/user/user_subsys/user_prj/user_prj2/rtl/user_prj2.v`
        3. Copy the content from `/ASoC-Final_project-optical_flow/rtl/user/user_subsys/user_prj/user_prj2/rtl/rtl.f` to `/ASoC-Final_project-optical_flow/vivado/vvd_srcs/caravel_soc/rtl/user/user_subsys/user_prj/user_prj2/rtl/rtl.f`
    - Modify userDMA
        1. Modify "BUF_LEN" in `/ASoC-Final_project-optical_flow/vivado/vitis_prj/hls_userdma/userdma.h`
        2. Modify some lines (marked "--> uncomment in final froject" and "modified in final project") in `/ASoC-Final_project-optical_flow/vivado/vitis_prj/hls_userdma/userdma.cpp`
        3. Create a new Vitis project in the `/ASoC-Final_project-optical_flow/vivado/vitis_prj/userdma_opticalFlow` directory

      :::info
      💡 Everything up to this point corresponds to the GitHub commit "`[double update][About FSIC-FPGA] copy OpticalFlow-related project into Vivado project; modify userDMA`".
      :::
    - Simulation testbench
        1. Copy the content from `/ASoC-Final_project-optical_flow/rtl/user/testbench/tc/pattern` to `/ASoC-Final_project-optical_flow/vivado/test_pattern`
        2. Modify `fsic_tb.sv`
        3. Modify `/ASoC-Final_project-optical_flow/vivado/vvd_caravel_fpga_fsic_sim.tcl`
        4. After running `./run_vivado_fsic_sim` in `/ASoC-Final_project-optical_flow/vivado`, "updma_input.log", "updma_output.log", and "updma_output_gold.log" can be found in the `/ASoC-Final_project-optical_flow/vivado` folder. In addition, if these lines are not commented out:
           ![image](https://hackmd.io/_uploads/B1WB-pTr0.png)
           the .vcd file can be found in the `/ASoC-Final_project-optical_flow/vivado/vvd_caravel_fpga_sim/vvd_caravel_fpga_sim.sim/sim_1/behav/xsim` folder.
    - Validation
        1. Modify `/ASoC-Final_project-optical_flow/vivado/vvd_caravel_fpga_fsic.tcl`
        2. Modify `/ASoC-Final_project-optical_flow/vivado/jupyter_notebook/caravel_fpga_fsic.ipynb`
        3.
           The rest follows the procedure in [Lab 4](https://hackmd.io/@whywhytellmewhy/r15D2Ao56#Lab-4-Simulation-amp-Validation-of-FSIC-with-FPGA) above

### How to run HLS simulation
```console
cd optical_flow_catapult
make
cd bin
source run.sh
```

### How to run simulation about RTL integrated into FSIC
```console
cd rtl/user/testbench/tc
make all
```

### How to run FSIC-FPGA simulation
```console
cd vivado
./run_vivado_fsic_sim
```

### How to run FSIC-FPGA validation
```console
cd vivado
./run_vivado_fsic
```

## Optional lab 1: Design for Test
- From "design_for_test_tutorial.pdf":
    1. DFT deployment generally follows, or runs in parallel with, the synthesis flow, but in this lab DFT is inserted into the flow after synthesis.
    2. Synthesis must be done first to obtain a gate-level netlist; only then can ATPG be run. See the flow chart on p. 14 of the PDF (Figure 2.1).

## Appendix: Useful git commands
- ([Reference](https://june.monster/git-github-checkout-reset-revert/)) **Undo an earlier commit** without deleting any commit: a new commit is created whose content reverses the changes of the chosen commit:
    ```console
    # Reverse the latest commit on the current branch; afterwards there is one more commit, whose content undoes the changes of the latest commit
    git revert HEAD
    # Add a new commit whose content reverses the changes made by 6be9cb5
    git revert 6be9cb5
    ```
  :::warning
  :warning: Note in particular:
  1. A revert directly applies the inverse of the chosen commit's changes as a new commit; it is **not a merge**. Files that the reverted commit added are deleted again. When the reverted commit is the latest one, the repository returns exactly to its previous state.
  2. `git revert 6be9cb5` undoes the changes made by `6be9cb5`, so the affected files return to their content in the commit **just before** `6be9cb5`; changes made by other commits are kept.
  :::

## Reference
1. [How to obtain a GitHub PAT (personal access token)](https://hackmd.io/@cIKaSz9PQoq3bVHvsCI30Q/rkq1vXIgK)
2. [How to avoid typing the username and password every time when accessing GitHub over HTTPS](https://blog.51cto.com/u_15302822/5687803): `git config --global credential.helper store` (`--global` can be omitted, in which case only the current repository is configured)
3. [About GitHub forks and other Git knowledge](https://gitbook.tw/chapters/github/pull-request)
4. [To show an image with given size in HackMD.io](https://www.facebook.com/hackmdio/photos/now-you-can-show-the-image-with-given-size/1098614316862443/)
5. [How to change the message of the most recent git commit](https://gitbook.tw/chapters/using-git/amend-commit1): `git commit --amend -m "<new message>"`