# NCKU DSP Arch NOTE contributed by <[`yanjiun`](https://github.com/yanjiunhaha)> ###### tags: `mynote` 課程:[VLSI design for digital communication systems -N2109 of NCKU](http://vlsilab.ee.ncku.edu.tw/vlsi.html) 學生:顏君翰 MS student 指導老師:[謝明德](http://vlsilab.ee.ncku.edu.tw/prof.html) PhD ## Retiming (請假) * 目標: * Clock period miniretiming * Registers minimization (Linear Problem) * 特性: * Retiming dose not alter the iteration bound in a DFG * Retiming 使用 Shortest Path Algorithm 解 constrains graph 得出 $r(V)$ * Constrains 有三種 1. feasible constrains : 不能讓邊上變負的。(必須) 2. period constrains : 增加最大延遲條件。(通常會加上,需使用合理時間) 3. registers constrains : 增加某點的fanout條件。(使問題變成線性規劃問題,解題複雜度大幅提升,斟酌使用) * 如要使用需要最先使用 registers constrains,再算後續條件。 ### Demo * 對以下電路實作完整 Retiming ![](https://i.imgur.com/Uw3faR2.png) :::info 1. feasible constrains 2. Registers constrains 3. Period constrains ::: #### Fesaible constrains (origin DFG) 轉換為 DFG 表示法。 ![](https://i.imgur.com/BNcGZbI.png) #### Registers minimization (add dummy node and breath $\beta$ ) 增加虛擬點在要限制暫存器之 fanout ==$w(\hat{e})=w_{max}-w(e), where\ w_{max}=max\{W_{out}(Node)\}$== ==$\beta=count.(E_{out}(Node))$== 1. 限制點 $Node_1$ 2. $w_{max}=2D$ $\beta=2$ 3. $w_{35}=2D-w_{13}=D$ $w_{45}=2D-w_{14}=0$ ![](https://i.imgur.com/sqPPBgG.png) #### Clock period constrains (computing $W(U,V)$ and $D(U,V)$) ==$w'=Mw(e)-t(U),where\ M=t_{max}N$== ![](https://i.imgur.com/Cl9Oj7C.png) ==$S'_{UV} : 對上圖做 Shorest Path Algorithm$== ![](https://i.imgur.com/OmKBJ02.png) ==$W(U,V)=\lceil \frac{S'_{UV}}{M}\rceil,when\ U\ne V$== ![](https://i.imgur.com/vzuNb6w.png) ==$D(U,V)=MW(U,V)-S'_{UV}+t(V)$== ![](https://i.imgur.com/1Yt7QVf.png) * 限制最長延遲 $c=2$ ==$r(U)-r(V) \leq W(U,V)-1,\ when\ D(U,V)>c$== ![](https://i.imgur.com/rrv4Otg.png) * 加上原圖 ![](https://i.imgur.com/h6LEYcr.png) ==使用 Shortest path algorithm 求解== * 並且最小化 COST [ILP Algorithm](https://www.youtube.com/watch?v=vx-hDqqYzKw) ![](https://i.imgur.com/zEQwPC8.png) ![](https://i.imgur.com/Hr8vs8s.png) ### Definations * $G$: Original DFG, $G_r$: Retimed DFG * $r(V)$: Delay transferred from outgoing edges of node V to the incident deges of node V. ```graphviz digraph D { rankdir=LR subgraph cluster_rU { label = "r(U)"; style=dotted U } subgraph cluster_rV { label = "r(V)"; style=dotted V } fw[shape=point width=0 ] fw -> U w[shape=point width=0 label=w] U -> V [label=w] //w [arrowhead = none] //w -> V [label=w] //w -> fw } ``` * $w(e)$: weigth(delay) of the edge $e$ * $w_r(e)$: weight(delay) of the dege $e$ after retiming. * $U \xrightarrow{w_r(e)} V$ after retiming has a weight of $w_r(e)=w(e)+r(V)-r(U)$ * 因為當前的個數 $w(e)$ 加上前面拿入的 $r(V)$ 減去拿出到後面的 $r(U)$ * weighty of path $w(p)=\sum_{i=0}^{k-1}w(e_i)$ * path computaion time $t(p)=\sum_{i_0}^{k}t(V_i)$ #### Minimium clock period * $\Phi(G)=max\{t(p):w(p)=0\}$ * $W(U,V)=min\{w(p):U\xrightarrow{p}V\}$ * Minimum number of registers on any path from node U to node V * $D(U,V)=max\{t(p):U\xrightarrow{p}V\ and\ w(p)=W(U,V)\}$ * Maximum computation time among all paths from U to V with weight $W(U,V)$ * $M=t_{max}n$ * $t_max$:max node computation time * $n$: number of nodes * $w'(e)=Mw(e)-t(U)$ * for all edges to create new graph $G'$ * $S'_{UV}$: Solve shortest path problem on $G'$ * $If\ U \ne V\ then$ $W(U,V)=\lceil{\frac{S'_UV}{M}}\rceil$ $D(U,V)=MW(U,V)-S'_{UV}+t(V)$ * $If\ U = V\ then$ $W(U,V)=0$ $D(U,V)=t(U)$ #### Minimium registers * $R_v=\max_{V\xrightarrow{e}?}\{w_r(e)\}$ * non-linear function, need reformulate. ==Reformulate graph== 1. add a dummy node $\hat{U}$ to $V$ * $t(\hat{U})=0$ : with no computing time. * $w(\hat{e_i})=w_{max}-w(e_i)$, where $w_{max}$ 是 $V(e)$ 邊中最大的延遲 * $\beta=1/k$ : 原本$V$點有$k$條邊,$\hat{U}$每條邊則為$1/k$條 ![](https://i.imgur.com/TuWu7Qh.png) * Registers constrains * Minimium the $\sum_V{r(V)(\sum_{?->V}\beta(e)-\sum_{V->?}\beta(e))}$ ![](https://i.imgur.com/YG7mOqO.png) ### 介紹 forward cutsets retiming(pipeline) < cutsets retiming < retiming * Retiming ![](https://i.imgur.com/fVjnF7R.png) * Cutset retiming rules ![](https://i.imgur.com/WQcN5iz.png) * forward cutsets retiming(pipeline) ![](https://i.imgur.com/p3SZZc3.png) * $uN-slow$ DFG ==Replace each delay in a DFG with N delays.== ![](https://i.imgur.com/JuzM3Fd.png) ![](https://i.imgur.com/uJzGxrJ.png) * Retiming 2-slow Graph ![](https://i.imgur.com/z5cb6qY.png) ### Shortest Path Algorithm #### Floyd-Warshall algorithm ==如果沒有*負圈*,就會找到最短路徑== ```graphviz digraph spa{ rankdir=LR 1 -> 2 [label=-3] 2 -> 3 [label=1] 2 -> 4 [label=2] 3 -> 4 [label=2] 4 -> 1 [label=1] } ``` 1. Initial frist matrix by DFG $R^{(1)}= \begin{bmatrix} \infty & -3 & \infty & \infty \\ \infty & \infty & 1 & \infty \\ \infty & \infty & \infty & 2 \\ 1 & \infty & \infty & \infty \end{bmatrix}$ * if $k^{(k+1)}(U,V)>r^k(U,k)+r^k(k,V)$ 2. $R^{(2)}= \begin{bmatrix} \infty & -3 & \infty & \infty\\ \infty & \infty & 1 & 2 \\ \infty & \infty & \infty & 2\\ 1 & -2 & \infty & \infty & \end{bmatrix}$ * if $k^{(k+1)}(U,V)>r^k(U,k)+r^k(k,V)$ 3. $R^{(3)}= \begin{bmatrix} \infty & -3 & -2 & -1\\ \infty & \infty & 1 & 2\\ \infty & \infty & \infty & 2\\ 1 & -2 & -1 & 0 & \end{bmatrix}$ * if $k^{(k+1)}(U,V)>r^k(U,k)+r^k(k,V)$ 4. $R^{(4)}= \begin{bmatrix} \infty & -3 & -2 & -1\\ \infty & \infty & 1 & 2\\ \infty & \infty & \infty & 2\\ 1 & -2 & -1 & 0 & \end{bmatrix}$ * if $k^{(k+1)}(U,V)>r^k(U,k)+r^k(k,V)$ 5. $R^{(5)}= \begin{bmatrix} 0 & -3 & -2 & -1\\ 3 & 0 & 1 & 2\\ 3 & 0 & 1 & 2\\ 1 & -2 & -1 & 0 & \end{bmatrix}$