# **Hardware/Software Co-design HW1**
### A Faster 4x4 Array Multiplier Implementation
##### Department of Electrical Engineering, National University of Kaohsiung
##### M1145137 方信筌
##### Instructor: Prof. 林宏益
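# (1) Objective
This assignment practices improving the *Lab. 1: 4x4 Carry-Propagate Array Multiplier* from the in-class exercise by rewriting it as a 4x4 CSA (carry-save adder) multiplier on EDA Playground.
# (2) Procedure
1. Start from the 4x4 Carry-Propagate Array Multiplier presented in the lecture notes *Lect_4_RTL_Coding_2-2.pdf*. Its Testbench.sv and waveform are shown below.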
#### Testbench.sv
```
module rca_arrayMultipiler_4x4_tb;
parameter DIN_WIDTH = 4;
parameter DOUT_WIDTH = 8;
reg[DIN_WIDTH-1:0] xin, yin;
reg[DOUT_WIDTH-1:0] result;
top top_u0(.x(xin), .y(yin),
.product(result)
);
initial
begin
$dumpfile("multiperResult.vcd");
$dumpvars(1, top_u0);
xin = 4'd0; yin =4'd0;
#50 xin = 4'd1; yin =4'd1;
#50 xin = 4'd1; yin =4'd3;
#50 xin = 4'd1; yin =4'd5;
#50 xin = 4'd1; yin =4'd7;
#50 xin = 4'd1; yin =4'd9;
#50 xin = 4'd1; yin =4'd11;
#50 xin = 4'd2; yin =4'd1;
#50 xin = 4'd4; yin =4'd1;
#50 xin = 4'd6; yin =4'd1;
#50 xin = 4'd8; yin =4'd1;
#50 xin = 4'd10; yin =4'd1;
#50 xin = 4'd5; yin =4'd5;
#50 xin = 4'd6; yin =4'd6;
#50 xin = 4'd7; yin =4'd7;
#50 xin = 4'd8; yin =4'd8;
#50 xin = 4'd9; yin =4'd9;
#50 xin = 4'd10; yin =4'd10;
#50 xin = 4'd11; yin =4'd11;
#50 xin = 4'd12; yin =4'd12;
#50 xin = 4'd13; yin =4'd13;
#50 xin = 4'd14; yin =4'd14;
#50 xin = 4'd15; yin =4'd15;
#50 $finish;
end
endmodule
```
#### Waveform

2. Write my own 4x4 CSA multiplier and testbench (mostly based on GitHub and the instructor's lecture notes).
3. Run the simulation and collect the results.
# (3) Results
#### design.sv
```
//codes from https://github.com/cvonk/FPGAmath/tree/main/multiplier/carry_save/altera
// Math, carry save multiplier, main
//
// Platform: Altera Cyclone IV using Quartus >=16.1
// Documentation: https://coertvonk.com/hw/building-math-circuits/faster-parameterized-multiplier-in-verilog-30774
//
// GNU GENERAL PUBLIC LICENSE Version 3, check the file LICENSE for more information
// (c) Copyright 2015-2016,2022, Johan Vonk and Coert Vonk
// All rights reserved. Use of copyright notice does not imply publication.
// All text above must be included in any redistribution
//-----------------------------------------------------------------------------------------------------
// Math, carry save multiplier, multiplier adder module
//
// Platform: Altera Cyclone IV using Quartus >=16.1
// Documentation: https://coertvonk.com/hw/building-math-circuits/faster-parameterized-multiplier-in-verilog-30774
//
// GNU GENERAL PUBLIC LICENSE Version 3, check the file LICENSE for more information
// (c) Copyright 2015-2016,2022, Johan Vonk and Coert Vonk
// All rights reserved. Use of copyright notice does not imply publication.
// All text above must be included in any redistribution
//`timescale 1ns/1ps
`default_nettype none
// 1-bit CSA Cell
module math_multiplier_ma_block(
input wire x, y, si, ci,
output wire so, co
);
wire p = x & y;
assign so = si ^ p ^ ci;
assign co = (si & p) | (ci & (si ^ p));
endmodule
// CSA Multiplier
module math_multiplier_carrysave #(parameter N = 4)(
input wire [N-1:0] a,
input wire [N-1:0] b,
output wire [2*N-1:0] p
);
wire [N+1:0] s [N:0];
wire [N+1:0] c [N:0];
generate
genvar ii, jj;
for (ii = 0; ii <= N; ii = ii + 1) begin: gen_ii
for (jj = 0; jj < N; jj = jj + 1) begin: gen_jj
math_multiplier_ma_block ma(
.x (ii < N ? a[jj] : (jj > 0) ? c[N][jj-1] : 1'b0),
.y (ii < N ? b[ii] : 1'b1),
.si (ii > 0 && jj < N - 1 ? s[ii-1][jj+1] : 1'b0),
.ci (ii > 0 ? c[ii-1][jj] : 1'b0),
.so (s[ii][jj]),
.co (c[ii][jj])
);
if (ii == N) assign p[N+jj] = s[N][jj];
end
assign p[ii] = s[ii][0];
end
endgenerate
endmodule
```
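The wiring in the generate loop can be read as a weight-matching argument. The following is a sketch in notation introduced here for illustration (it is not taken from the original source): $i$ stands for `ii`, $j$ for `jj`, and $s_{i,j}$, $c_{i,j}$ stand for `s[ii][jj]`, `c[ii][jj]`, with missing neighbours replaced by 0 as in the code. In rows $i < N$, cell $(i,j)$ handles the partial-product bit $a_j b_i$ of weight $2^{i+j}$, and its other two inputs have the same weight:

$$
\underbrace{s_{i-1,\,j+1}}_{2^{(i-1)+(j+1)}} + \underbrace{a_j b_i}_{2^{i+j}} + \underbrace{c_{i-1,\,j}}_{2 \cdot 2^{(i-1)+j}} = 2\,c_{i,j} + s_{i,j}
$$

No later cell has weight $2^{i}$, so $s_{i,0}$ is already final, which is consistent with `assign p[ii] = s[ii][0]`. In the last row ($i = N$) the `x` input is $c_{N,\,j-1}$ and `y` is tied to 1, so the row behaves as a ripple-carry adder that merges the remaining sum and carry vectors:

$$
s_{N-1,\,j+1} + c_{N,\,j-1} + c_{N-1,\,j} = 2\,c_{N,j} + s_{N,j}
$$

Every term on the left has weight $2^{N+j}$, which matches `assign p[N+jj] = s[N][jj]`.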
#### testbench.sv
```
//idea came from "https://github.com/cvonk/FPGAmath/blob/main/multiplier/carry_save/altera/math_multiplier_carrysave_tb.v"
//rebuilt with the help of "Lect_4_RTL_Coding_2-2.pdf"
// Math, carry save multiplier, test bench
//
// Platform: Altera Cyclone IV using Quartus >=16.1
// Documentation: https://coertvonk.com/hw/building-math-circuits/faster-parameterized-multiplier-in-verilog-30774
//
// GNU GENERAL PUBLIC LICENSE Version 3, check the file LICENSE for more information
// (c) Copyright 2015-2016,2022, Johan Vonk and Coert Vonk
// All rights reserved. Use of copyright notice does not imply publication.
// All text above must be included in any redistribution
//`timescale 1ns / 1ps
`default_nettype none
module top(
// Explicit net types, since `default_nettype none is in effect
input wire [3:0] x,
input wire [3:0] y,
output wire [7:0] product
);
// Instantiate the CSA multiplier (adapted from the reference example)
math_multiplier_carrysave #(4) csa_u0(
.a(x),
.b(y),
.p(product)
);
endmodule
module rca_arrayMultipiler_4x4_tb;
reg [3:0] xin, yin;
wire [7:0] result;
top top_u0(.x(xin), .y(yin), .product(result));
initial begin
$dumpfile("multiperResult.vcd");
$dumpvars(0, top_u0);
xin = 4'd0; yin = 4'd0;
#50 xin = 4'd1; yin = 4'd1;
#50 xin = 4'd1; yin = 4'd3;
#50 xin = 4'd1; yin = 4'd5;
#50 xin = 4'd1; yin = 4'd7;
#50 xin = 4'd1; yin = 4'd9;
#50 xin = 4'd1; yin = 4'd11;
#50 xin = 4'd2; yin = 4'd1;
#50 xin = 4'd4; yin = 4'd1;
#50 xin = 4'd6; yin = 4'd1;
#50 xin = 4'd8; yin = 4'd1;
#50 xin = 4'd10; yin = 4'd1;
#50 xin = 4'd5; yin = 4'd5;
#50 xin = 4'd6; yin = 4'd6;
#50 xin = 4'd7; yin = 4'd7;
#50 xin = 4'd8; yin = 4'd8;
#50 xin = 4'd9; yin = 4'd9;
#50 xin = 4'd10; yin = 4'd10;
#50 xin = 4'd11; yin = 4'd11;
#50 xin = 4'd12; yin = 4'd12;
#50 xin = 4'd13; yin = 4'd13;
#50 xin = 4'd14; yin = 4'd14;
#50 xin = 4'd15; yin = 4'd15;
#50 $finish;
end
initial begin
$monitor("Time=%0t : xin=%d yin=%d result=%d", $time, xin, yin, result);
end
endmodule
```
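The directed vectors above exercise 23 of the 256 possible input pairs. As a minimal sketch of how the coverage could be extended, the hypothetical testbench below (the module name `csa_exhaustive_tb` and its variables are my own, not part of the original submission) loops over every pair and compares the output with the behavioral `*` operator. It assumes `design.sv` and the `top` module above are compiled alongside it, and that it replaces the original testbench as the simulation top.

```verilog
// Hypothetical self-checking testbench (illustration only).
// Assumes module top and design.sv from this report are compiled with it.
module csa_exhaustive_tb;
  reg  [3:0] xin, yin;
  wire [7:0] result;
  reg  [7:0] expected;
  integer i, j, errors;

  top top_u0(.x(xin), .y(yin), .product(result));

  initial begin
    errors = 0;
    for (i = 0; i < 16; i = i + 1) begin
      for (j = 0; j < 16; j = j + 1) begin
        xin = i[3:0];
        yin = j[3:0];
        // The 8-bit assignment target widens the multiply to 8 bits.
        expected = xin * yin;
        #10;  // let the combinational array settle
        // !== also flags X or Z bits on the output
        if (result !== expected) begin
          errors = errors + 1;
          $display("MISMATCH: %0d * %0d = %0d, expected %0d",
                   xin, yin, result, expected);
        end
      end
    end
    if (errors == 0)
      $display("All 256 input pairs match");
    else
      $display("%0d mismatches found", errors);
    $finish;
  end
endmodule
```

Printing only mismatches keeps the log short, so a clean run ends with a single summary line instead of 256 rows to inspect by eye.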
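## Simulation Output
In addition to the waveform, the `$monitor` system task prints a log line whenever a signal changes, which makes it possible to check that the $4 \times 4$ CSA multiplier (the `math_multiplier_carrysave` module) produces the correct $8$-bit product for each tested input pair. The testbench applies 23 directed input pairs drawn from the $4$-bit range 0 to 15, not all 256 combinations.
#### Log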
| Time | Input X (xin) | Input Y (yin) | Actual output (result) | Expected (xin × yin) |
| -------- | -------- | -------- | -------- | -------- |
| 0 | 0 | 0 | 0 | 0 |
| 50 | 1 | 1 | 1 | 1 |
| 100 | 1 | 3 | 3 | 3 |
| 150 | 1 | 5 | 5 | 5 |
| 200 | 1 | 7 | 7 | 7 |
| 250 | 1 | 9 | 9 | 9 |
| 300 | 1 | 11 | 11 | 11 |
| 350 | 2 | 1 | 2 | 2 |
| 400 | 4 | 1 | 4 | 4 |
| 450 | 6 | 1 | 6 | 6 |
| 500 | 8 | 1 | 8 | 8 |
| 550 | 10 | 1 | 10 | 10 |
| 600 | 5 | 5 | 25 | 25 |
| 650 | 6 | 6 | 36 | 36 |
| 700 | 7 | 7 | 49 | 49 |
| 750 | 8 | 8 | 64 | 64 |
| 800 | 9 | 9 | 81 | 81 |
| 850 | 10 | 10 | 100 | 100 |
| 900 | 11 | 11 | 121 | 121 |
| 950 | 12 | 12 | 144 | 144 |
| 1000 | 13 | 13 | 169 | 169 |
| 1050 | 14 | 14 | 196 | 196 |
| 1100 | 15 | 15 | 225 | 225 |
#### Log screenshot

#### Waveform

#### Schematic
A circuit diagram found online that happens to match the design used in this assignment.

##### CSA Block
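The cell corresponds to `math_multiplier_ma_block` in design.sv: it ANDs `x` and `y` into a partial-product bit and adds it to `si` and `ci` like a full adder. As a rough sketch of how that behavior could be checked in isolation, the hypothetical testbench below (the name `ma_block_check_tb` is my own, and design.sv is assumed to be compiled alongside it) tries all 16 input combinations and compares `{co, so}` with the 2-bit arithmetic sum.

```verilog
// Hypothetical cell-level check (illustration only).
// Assumes math_multiplier_ma_block from design.sv is compiled with it.
module ma_block_check_tb;
  reg  x, y, si, ci;
  wire so, co;
  integer k, errors;

  math_multiplier_ma_block dut(
    .x(x), .y(y), .si(si), .ci(ci), .so(so), .co(co)
  );

  initial begin
    errors = 0;
    for (k = 0; k < 16; k = k + 1) begin
      {x, y, si, ci} = k[3:0];
      #1;  // let the continuous assignments settle
      // Full adder over si, (x & y), ci: the 2-bit sum must equal {co, so}.
      if ({co, so} !== ({1'b0, si} + {1'b0, x & y} + {1'b0, ci})) begin
        errors = errors + 1;
        $display("MISMATCH x=%b y=%b si=%b ci=%b -> co=%b so=%b",
                 x, y, si, ci, co, so);
      end
    end
    if (errors == 0)
      $display("ma_block: all 16 input combinations match");
    else
      $display("ma_block: %0d mismatches", errors);
    $finish;
  end
endmodule
```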

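# (4) Reflections
Although this is not the first time since taking this course that I have used EDA Playground to complete an assignment, I still did not fully understand parts of the logic while writing the code, and in the end I could only lean heavily on the reference material to make sure the assignment was correct. Even so, my testbench went through many mistakes along the way, and after quite some time I still could not confirm that what I had written was entirely correct. I hope to improve gradually in the future.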
# (5) References
[彭皓楷 / Hardware-Software Lab Report 2]
https://hackmd.io/@NLfB0EwWT9im7FMj8Ae9GQ/Hy6ni9fbp#%E5%9B%9B-%E5%AF%A6%E9%A9%97%E7%B5%90%E6%9E%9C
[wikipedia/Carry-save_adder]
https://en.wikipedia.org/wiki/Carry-save_adder
[github/cvonk/FPGAmath]
https://github.com/cvonk/FPGAmath
[Multiplier circuit Johan Vonk]
https://coertvonk.com/hw/building-math-circuits/parameterized-multiplier-in-verilog-30772
[A faster multiplier circuit]
https://coertvonk.com/hw/building-math-circuits/faster-parameterized-multiplier-in-verilog-30774
[SystemC-based SoC Accelerator Design: The Multiplier]
https://www.ecb.torontomu.ca/~courses/coe838/labs/lab2a.pdf