---
# System prepended metadata

title: Digital Design Interview Note
tags: [工作]

---

# Digital Design Interview Note

[TOC]

## ICD Knowledge

### Latch
![](https://i.imgur.com/E6NY4ca.png =50%x)


### Flip-flop
![](https://i.imgur.com/tv3FbCD.png)


### **Metastability**
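Metastability occurs when the flip-flop output (Q pin) needs longer than the clock-to-Q delay ($t_{cq}$) to settle to a stable value, typically because the data input changed inside the setup/hold window.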
![](https://i.imgur.com/y3WG22Q.png)
![](https://i.imgur.com/kp5y9dk.png)

`Solution (double flip-flop synchronizer)`: add one more flip-flop, driven by the same clk, directly after the receiving flip-flop.
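
A minimal sketch of the double flip-flop synchronizer described above. The module name `sync_2ff` and its port names are hypothetical, chosen only for illustration.

```verilog
// Two-flip-flop synchronizer sketch (names are hypothetical).
module sync_2ff (
    input  wire clk_dst,  // destination-domain clock
    input  wire rst_n,    // active-low asynchronous reset
    input  wire d_async,  // signal coming from another clock domain
    output wire q_sync    // synchronized copy for use in the clk_dst domain
);

reg meta;    // first stage: may go metastable
reg stable;  // second stage: gives the first stage one cycle to settle

always @(posedge clk_dst or negedge rst_n) begin
    if (!rst_n) begin
        meta   <= 1'b0;
        stable <= 1'b0;
    end
    else begin
        meta   <= d_async;
        stable <= meta;
    end
end

assign q_sync = stable;

endmodule
```

Both stages share the same clock, so the second flip-flop samples the first one a full cycle later, which greatly reduces the chance that an unresolved value propagates further.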

### **Clock domain crossing** (CDC)

#### **single bit signal**
$\rightarrow$ Can be solved with a double flip-flop synchronizer.
However, this is not suitable for pulse signals, so a pulse synchronizer is used instead.

**Pulse synchronizer**: an XOR gate converts the pulse signal into a level signal, the level passes through a double flip-flop synchronizer, and a second XOR gate converts the level back into a pulse.
![](https://i.imgur.com/UuK9bvn.png)
Limitation of the pulse synchronizer: two consecutive pulses in the source clk domain must be spaced far enough apart to cover 3 clock edges in the destination domain.
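
The pulse synchronizer can be sketched as a toggle register in the source domain followed by three flip-flops in the destination domain. This is one common arrangement and not necessarily identical to the figure; the module name `pulse_sync` and all port names are hypothetical.

```verilog
// Toggle-based pulse synchronizer sketch (names are hypothetical).
module pulse_sync (
    input  wire clk_src,
    input  wire rst_src_n,
    input  wire pulse_src,  // one-clk_src-cycle pulse
    input  wire clk_dst,
    input  wire rst_dst_n,
    output wire pulse_dst   // one-clk_dst-cycle pulse
);

reg toggle_src;           // level signal: flips on every source pulse
reg sync1, sync2, sync3;  // 2FF synchronizer plus one stage for edge detection

// Pulse -> level: XOR with the pulse toggles the level
always @(posedge clk_src or negedge rst_src_n) begin
    if (!rst_src_n) toggle_src <= 1'b0;
    else            toggle_src <= toggle_src ^ pulse_src;
end

// Level crosses the clock domain through the flip-flop chain
always @(posedge clk_dst or negedge rst_dst_n) begin
    if (!rst_dst_n) {sync1, sync2, sync3} <= 3'b000;
    else            {sync1, sync2, sync3} <= {toggle_src, sync1, sync2};
end

// Level -> pulse: XOR of adjacent stages is high for one clk_dst cycle
assign pulse_dst = sync2 ^ sync3;

endmodule
```

The three destination flip-flops are why consecutive source pulses must be spaced apart: if the level toggles twice before the destination samples it, both pulses are lost.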

#### **Multi bit signal**
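A 2F/F synchronizer cannot be used to synchronize multi-bit data. Its delay is not deterministic (a bit may take one cycle or two to get through), so the bits of a bus can arrive in different cycles and the destination may sample an inconsistent value.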

Three solutions:
* **Load signal**: use a pulse synchronizer to generate a load signal
![](https://i.imgur.com/gDR7VW7.png)

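* **Two extra flip-flop stages**: synchronize the data into the destination domain with a double flip-flop synchronizer, register it through two more flip-flop stages, then compare the 3 stages. If all three are equal, the synchronized value is stable.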
* **Asynchronous FIFO**
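
The second option in the list (two extra flip-flop stages) might look like the sketch below. It assumes the source data changes slowly compared with the destination clock; the module name `cdc_stable_check`, the parameter `W`, and the port names are hypothetical.

```verilog
// Multi-bit CDC by comparing three consecutive samples (names are hypothetical).
module cdc_stable_check #(
    parameter W = 8
) (
    input  wire         clk_dst,
    input  wire         rst_n,
    input  wire [W-1:0] data_src,  // data from the source clock domain
    output reg  [W-1:0] data_dst   // updated only when the samples agree
);

reg [W-1:0] meta, sync;  // double flip-flop synchronizer (per bit)
reg [W-1:0] d1, d2;      // two extra stages

always @(posedge clk_dst or negedge rst_n) begin
    if (!rst_n) begin
        meta     <= {W{1'b0}};
        sync     <= {W{1'b0}};
        d1       <= {W{1'b0}};
        d2       <= {W{1'b0}};
        data_dst <= {W{1'b0}};
    end
    else begin
        meta <= data_src;
        sync <= meta;
        d1   <= sync;
        d2   <= d1;
        // Accept the value only if all three stages hold the same word
        if ((sync == d1) && (d1 == d2))
            data_dst <= sync;
    end
end

endmodule
```

If the source updates faster than the destination can collect three equal samples, `data_dst` simply never updates, which is why this approach suits slowly changing buses.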


### **Asynchronous FIFO**
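Handles the multi-bit CDC problem.

The read pointer and write pointer are converted to Gray code. Because only one bit of a Gray code changes on each edge, the pointers can be transferred to the destination domain through a 2F/F synchronizer.

#### Gray code encoding
1. Two adjacent Gray codes differ in exactly one bit.
2. When bit N of the binary code goes from 0 to 1, the lower N-1 bits of the following Gray codes mirror the first half (reflected about that axis), while the bits above bit N stay the same.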
![](https://i.imgur.com/Mfsh1nk.png)
![](https://i.imgur.com/CVVPyQy.png)
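
A small sketch of the conversions in both directions, using the standard formula $gray = bin \oplus (bin \gg 1)$. The module names `bin2gray` and `gray2bin` and the parameter `W` are hypothetical.

```verilog
// Binary -> Gray: each Gray bit is the XOR of two neighbouring binary bits
module bin2gray #(
    parameter W = 4
) (
    input  wire [W-1:0] bin,
    output wire [W-1:0] gray
);

assign gray = bin ^ (bin >> 1);

endmodule

// Gray -> binary: binary bit i is the XOR of Gray bits W-1 down to i
module gray2bin #(
    parameter W = 4
) (
    input  wire [W-1:0] gray,
    output reg  [W-1:0] bin
);

integer i;

always @(*) begin
    for (i = 0; i < W; i = i + 1)
        bin[i] = ^(gray >> i);  // reduction XOR of the upper bits
end

endmodule
```

In an asynchronous FIFO, the pointer is typically converted to Gray code and registered in its own domain before it enters the synchronizer, so that only one bit can be in transition at any sampling edge.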

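If the asynchronous FIFO depth is not a power of 2, use the mirror symmetry of Gray code and shift the starting point, so that every pair of adjacent Gray codes (including the wrap-around from the last code back to the first) still differs in only one bit.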
![](https://i.imgur.com/yBLXnAv.png)


## **MUX**

### **Full adder**


|  a  |  b  | Cin | Sum | Cout |
|:---:|:---:|:---:|:---:|:----:|
|  0  |  0  |  0  |  0  |  0   |
|  0  |  0  |  1  |  1  |  0   |
|  0  |  1  |  0  |  1  |  0   |
|  0  |  1  |  1  |  0  |  1   |
|  1  |  0  |  0  |  1  |  0   |
|  1  |  0  |  1  |  0  |  1   |
|  1  |  1  |  0  |  0  |  1   |
|  1  |  1  |  1  |  1  |  1   |

$Sum=a\oplus b \oplus Cin$
$Cout=ab+aCin+bCin$

![](https://i.imgur.com/k9lPhQQ.png)
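
The two equations above map directly to a gate-level description. This is a plain Boolean sketch rather than the MUX-based structure in the figure; the module name `full_adder` is hypothetical.

```verilog
// 1-bit full adder written straight from the Sum and Cout equations
module full_adder (
    input  wire a,
    input  wire b,
    input  wire cin,
    output wire sum,
    output wire cout
);

assign sum  = a ^ b ^ cin;
assign cout = (a & b) | (a & cin) | (b & cin);

endmodule
```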

## **Synthesis**

### **Technology library**
- fast.db
- slow.db
- typical.db

### **Undefined interconnect**
Before placement and routing, the interconnect parasitics are unknown. Synthesis estimates them with a wire load model.

### Clock gating
The clock signal reaches the registers only when data is to be switched
$\rightarrow$ Reduce dynamic power dissipation
![](https://i.imgur.com/oazpEi2.png)

Clock gating with a plain AND gate may glitch if the enable signal changes while clk is high.
$\rightarrow$ **Glitch prevention**: generate the enable through a <font color="#FF0000">latch clocked on the negative clk level</font>
![](https://i.imgur.com/WfrnP41.png)
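
A behavioral sketch of the latch-based gating described above. Real designs normally instantiate the integrated clock-gating cell from the technology library instead of writing this by hand; the module name `clk_gate` and its port names are hypothetical.

```verilog
// Latch-based clock gate sketch (names are hypothetical).
module clk_gate (
    input  wire clk,
    input  wire en,
    output wire gclk
);

reg en_lat;

// The latch is transparent while clk is low, so en_lat
// cannot change while clk is high.
always @(clk or en) begin
    if (!clk)
        en_lat = en;
end

// The AND gate only ever sees an enable that is stable during clk high
assign gclk = clk & en_lat;

endmodule
```

Because `en_lat` is frozen for the whole high phase of `clk`, a late or noisy `en` can no longer chop the gated clock pulse.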


## STA

### **DTA vs. STA**
* `DTA (Testbench simulation)`
    Cons.    
    1. Impossible to do exhaustive analysis.
    2. Hard to identify the cause of failure.
    3. Needs more resources.
* `STA`
    Cons.
    1. Synchronous only.
    2. Tricky constraints for some special cases.
        - False path
        - Multicycle path
        - Multiple clocks 
 
### **Types of timing path**
* reg (clk) $\rightarrow$ reg (D)
* reg (clk) $\rightarrow$ OUTPUT
* INPUT $\rightarrow$ reg (D)
* INPUT $\rightarrow$ OUTPUT
 
### **Types of STA**
*  Path-based STA
    - Realistic: considers the actual delay of every path
    - Computationally expensive
*  Block-based STA
    - Considers only the best/worst case at each node
    - More pessimistic
 
### **Setup & Hold check**
* `Setup` (Max delay)
$AT=T_0+T_{clk,latency}+T_{cq}+T_{pd}$
$RT=T_0+T_{clk,latency}-T_{skew}+T_{cycle}-T_{setup}$
$Slack=RT-AT$
![](https://i.imgur.com/MNbkt4s.png)
 
* `Hold` (Min delay)
$AT=T_0+T_{clk,latency}+T_{cq}+T_{pd}$
$RT=T_0+T_{clk,latency}+T_{skew}+T_{hold}$
$Slack=AT-RT$
![](https://i.imgur.com/ZJxtQoR.png)
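
As a worked example of the two checks, plug in some purely hypothetical numbers (all in ns): $T_0=0$, $T_{clk,latency}=1$, $T_{cq}=0.2$, $T_{skew}=0.1$, $T_{cycle}=5$, $T_{setup}=0.3$, $T_{hold}=0.1$, with a longest data path of $T_{pd}=3$ for the setup check and a shortest data path of $T_{pd}=0.4$ for the hold check.

$$
\begin{aligned}
\text{Setup:}\quad AT &= 0 + 1 + 0.2 + 3 = 4.2 \\
RT &= 0 + 1 - 0.1 + 5 - 0.3 = 5.6 \\
Slack &= RT - AT = 5.6 - 4.2 = 1.4 \\[4pt]
\text{Hold:}\quad AT &= 0 + 1 + 0.2 + 0.4 = 1.6 \\
RT &= 0 + 1 + 0.1 + 0.1 = 1.2 \\
Slack &= AT - RT = 1.6 - 1.2 = 0.4
\end{aligned}
$$

Both slacks are positive, so neither check is violated with these assumed values. Note that $T_{cycle}$ appears only in the setup check, so changing the clock period cannot fix a hold violation.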

### **Delay bound of D flip-flop**

![](https://i.imgur.com/NLxP60C.png)
![](https://i.imgur.com/Jd9shRH.png)

### **Special timing path**
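* `False paths`: timing paths that STA ignores
    1. Unexercised path
        - A path that is not used in normal operation
        - e.g., probe for debugging
    2. Irrelevant path
        - A path that is slow by nature or whose speed does not matter
        - e.g., reset
    3. Asynchronous path
        - A path between different clock domains
        - clock domain crossing (CDC): transfer data from clk1 to clk2
        - CDC paths need more advanced timing techniques to handle
    4. Logically impossible path
        - Exists in the circuit, but no data can ever propagate through it
        - Should be detected by PrimeTime
    5. Combinational loops

* `Multicycle paths`: timing paths that take more than one cycle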

## **Clock Divider**

### **Divide-by-2**

#### Without cnt

```javascript= 
module div2 (
   input      clk,
   input      rst_n,
   output reg o_clk
);
 
always@(posedge clk or negedge rst_n) begin
    if (!rst_n)
        o_clk <= 1'b0;
    else
        o_clk <= ~o_clk;
end
 
endmodule
```

#### With cnt

```javascript= 
module div2 (
   input      clk,
   input      rst_n,
   output reg o_clk
);

reg cnt;
 
always@(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
        o_clk <= 1'b0;
        cnt <= 1'b0;
    end
    else begin
        o_clk <= (cnt < 1) ? 1'b0 : 1'b1;
        cnt <= cnt + 1'b1;
    end
end
 
endmodule
```  

### **Divide-by-N**
```javascript= 
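module divn    (
  input  clk,
  input  rst_n,
  output o_clk
);

parameter WIDTH = 3; // counter width, must satisfy 2**WIDTH >= N
parameter N     = 6; // division ratio

reg [WIDTH-1:0] cnt_p; // counter clocked on the rising edge
reg [WIDTH-1:0] cnt_n; // counter clocked on the falling edge
reg             clk_p; // divided clock derived from cnt_p
reg             clk_n; // divided clock derived from cnt_n

// N == 1: pass clk through.
// Odd N : OR the two phases, which are half a clk cycle apart, for a 50% duty cycle.
// Even N: clk_p alone already has a 50% duty cycle.
assign o_clk = (N == 1) ? clk :
               (N[0])   ? (clk_p | clk_n) : (clk_p);
       
// Rising-edge counter: 0 .. N-1
always@(posedge clk or negedge rst_n) begin
  if (!rst_n)
    cnt_p <= 0;
  else
    cnt_p <= (cnt_p == (N-1)) ? 0 : cnt_p + 1;
end

// clk_p is high while cnt_p is in the first floor(N/2) counts
always@(posedge clk or negedge rst_n) begin
  if (!rst_n)
    clk_p <= 1;
  else 
    clk_p <= (cnt_p < (N>>1));  // non-blocking, consistent with the reset branch
end

// Falling-edge counter: 0 .. N-1
always@(negedge clk or negedge rst_n) begin
  if (!rst_n)
    cnt_n <= 0;
  else
    cnt_n <= (cnt_n == (N-1)) ? 0 : cnt_n + 1;
end

// clk_n is the same waveform as clk_p, shifted by half a clk cycle
always@(negedge clk or negedge rst_n) begin
  if (!rst_n)
    clk_n <= 1;
  else
    clk_n <= (cnt_n < (N>>1));  // non-blocking, consistent with the reset branch
end

endmodule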
```
Divide-by-5 waveform
![](https://i.imgur.com/AKh5gQR.png)


## **Digital IC Interview Notes**

### **MTK**
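Departments: CAI, SPD1, SPD3, ADCT
Questions: setup/hold calculation, CDC (multi bit), differences between false path and multicycle path and how to set them, input/output delay, synchronous vs. asynchronous, gray code, write RTL from a given waveform, IC design flow, SystemVerilog, build an adder and a multiplier from muxes, async FIFO, build OR from NAND, divide-by-3 circuit, pre-sim vs. post-sim, which files synthesis needs, divide-by-1.5 circuit, how a cache speeds up the CPU, clock gating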

### **RTK**
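Departments: CN, RDC, MM
Questions: blocking/non-blocking, clock divider, critical path calculation, build a DFF from two latches, low power design, down counter, effect of clock skew on setup/hold, how to fix setup/hold violations, which coding styles infer a latch, design a NAND with muxes, IC design flow, how clock timing is determined during synthesis, whether hold time can be 0, pipeline, how to set the multicycle value, PVT violation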

### **NTK**
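Departments: TCON, iHome
Questions: divide-by-2 circuit (without a counter), draw the synthesized circuit for a given RTL, explain metastability, purpose of input/output delay and how large to set it, pipelining, fault coverage, set max/min delay for violation, async reset issues, blocking/non-blocking, build combinational circuits from NAND/NOR, cases where setup/hold is negative, why hold time is independent of clk, full adder, number base conversion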

### **PHISON**
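Departments: SSD, EMMC
Questions: setup/hold inequalities, clock gating, IC design flow, synchronous vs. asynchronous, clock divider, build NAND and INV from PMOS and NMOS, CDC (multi bit), metastability, XOR truth table, multicycle and false path, design an XOR with muxes, how many cycles a counter takes from input to output, critical path, how to avoid clock skew/latch, characteristics of hold time in the cell library, purpose of SystemVerilog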

### **SMI**
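Departments: UFS, SSD
Questions: CDC, FSM, draw a mux with AND/OR, low power design, how to handle timing violations, IC design flow, SDF contents, latch vs. DFF, causes of glitches, build AND from NOR, input/output delay, explain a timing report, whether two DFF stages solve every CDC problem, draw the clock/data path of setup/hold, how to place a MUX to save area, build XOR from an inverter and a mux, divide-by-3 circuit, edge/level trigger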

### **GUC**
Departments: APR, DFT
Questions: APR flow, power ring, purpose of CTS, IR drop, scan chain, test coverage vs. fault coverage, scan reorder, lockup latch, BIST, stuck-at fault, transition delay fault, crosstalk, electromigration, draw an SDFF, boundary scan, why DFT is needed, DRC/LVS, ATPG, how to confirm that the post-APR function matches the RTL, APR commands, problems advanced process nodes face in the back-end, contents of LEF and DB files, wire load model

## **Reference**

### CDC
https://www.zhihu.com/people/li-hong-jiang-54

### Digital IC interview notes
https://www.dcard.tw/f/tech_job/p/238023076

### Clock divider
https://www.cnblogs.com/oomusou/archive/2008/07/31/verilog_clock_divider.html

###### tags: `工作`