# CVSD Final Note
## Videos
Successive-Cancellation (SC) decoder
- [(2) 錯誤控制編碼(16) 極化碼,編碼. Polar code, encoding - YouTube](https://www.youtube.com/watch?v=nxYcZLaPBPo&list=PLu6Wg31FIjYOnEMlVUKhf2sp57Le6zvXY&index=18)
- [(2) 錯誤控制編碼(17)極化碼,SC 解碼;Polar code, SC decoding - YouTube](https://www.youtube.com/watch?v=YD4fF12TVB4&list=PLu6Wg31FIjYOnEMlVUKhf2sp57Le6zvXY&index=18)
Polar codes transform, SC encoding, SC decoding
(Use tree to illustrate sequence of operations, 只能被一個一個解出來, 使用frozen bit可以跳過)
- [(2) Channel Polarization, Definition of (N,K) Polar Code and Encoding - YouTube](https://www.youtube.com/watch?v=1uYEq4ueOok&list=PLyqSpQzTE6M81HJ26ZaNv0V3ROBrcv-Kc&index=28)
- [(2) Successive Cancellation(SC) Decoder for a General (N,K) Polar Code - YouTube](https://www.youtube.com/watch?v=O3JWkvEY8Lc&list=PLyqSpQzTE6M81HJ26ZaNv0V3ROBrcv-Kc&index=32)
## Background for Basic SC decoder
Use N=4 as an example

**Function**
- Internal node: step **F**, **G**, **U(Combine)**
- Leaf node: Return decision **u**

**Node operation**
- If not leaf, do the following in sequence
1. Do step **F** and go to left child
2. When decision is received from left child, do Step **G** and go to right child
3. When decision is received from right child, do Step **U(Combine)** and go to parent
- If leaf, **hard decision** (also can viewed as sign)
Sequence of Operations, use (K,N)=(10,16) as example

## Improvement
ref
- [Fast Polar Decoders: Algorithm and Implementations](https://arxiv.org/pdf/1307.7154.pdf)
- [Area-Efficient Partially-Pipelined Architecture for Fast-SSC Decoding of Polar Codes](https://ieeexplore.ieee.org/document/9827219)
- [Multi-mode Unrolled Architectures for Polar Decoders](https://arxiv.org/pdf/1505.01459.pdf)
- **[A Semi-Parallel Successive-Cancellation Decoder for Polar Codes](https://ieeexplore.ieee.org/document/6327689)**
這篇講蠻仔細的,有列出需要的 alpha-RAM 跟 beta-RAM,PE的數量,計算複雜度與 resource 利用率
Based some observations, some nodes can be merged as a new-node to reduce traversals, following are some conditions to merge, see below example

**發生頻率: Rate-0 > Rate-1 > Rep >>> SPC**
| New-node | conditions |
| ------ |--|
| Rate-1 | root of sub-tree whoose leaves are all information bits</br>Output is equal to the **sign of its inputs** (just treat as leaf node) |
| Rate-0 | root of sub-tree whoose leaves are all frozen bits</br>Output is **0**</br> |
| Rep. | root of a sub-tree whoose leaves except the last leaf are frozen bits</br> The only information bit gets repeated over N_v output |
| SPC. | root of a sub-tree whoose leaves except the first leaf are information bits</br>|
## Hardware Design
1. output routing for different K
2. node type identification for different K
3. **frozen conditions** 希望可以用數學式表達
- [Channel Polarization: A Method for Constructing Capacity-Achieving Codes for Symmetric Binary-Input Memoryless Channels](https://ieeexplore.ieee.org/document/5075875)
- [How to Construct Polar Codes](https://ieeexplore.ieee.org/document/6557004)
結論: 他們都沒有提到找 frozen bit (排reliability sequence) 的方式,太複雜直接跳過,看來只能 look up 助教提供的 reliability sequence 得知 leaf node 是否 frozen,目前的想法是 non-leaf node 的 node type 可以由該 node 的兩個 children node 的 node-type 求得
Functions need to be implemented by decoder, refer to paper [Fast Polar Decoders: Algorithm and Implementations](https://arxiv.org/pdf/1307.7154.pdf)

## Idea
* 用一個sequence存frozen/info
* 判斷node type
* 判斷是否輸出
* index -> reliability? K變動就要從頭來過(?
1. N=128, 256, 512的reliability都寫進去
* coding方便,不確定繞線cost?
2. 只用N=512的reliability來判斷frozen/info
* counter: 對N=128, 256會多花很多cycle,但應該可以平行做
* N=128: 0~407 (or 0~300)
* N=256: 0~491 (or 0~443)
* N=512: 0~511
* reliability -> index? K變動只重新判斷ΔK個、要寫3組
* Storage:
* if_seq: 512 bits
* L: total **512 x 12** bits on the path (不包括 input llr storage), input llr 的儲存可以跟 L share,不然就需要 **1024 x 12** bits
* u: **512 x 1** bits
* ucap: **512 x 1** bits
* state table for each node: 1023 internal nodes, use 3 bits to represent a state, need **1023 x 3** bits
* state
* L
* R
* U
* Rate-0
* Rate-1
* Rep
## Sub-Module

## Progress
### Synthesis
* clk: 6.5 (15-bit without known_K)
* area: 548939 (worse)
* clk: 6.5 (15-bit with known_K)
* area: 542650
* note: timing violation before sending data (@32.5ns), **better A*T**
* clk: 6.0 (15-bit with known_K)
* area: 598803
* note: timing violation after sending data (@36ns) (failed?)
* clk: 5.8 (15-bit with known_K)
* area: 635696
* note: timing violation before sending data (@5848ps)
* clock gating (remove unknown_K)
* clk: 6.0
* area: 481846
* 8.2840 mW
* clock gating (15-bits remove unknown_k)
* clk: 6.0
* area 620435
* 13.0006 mW
* clock gating (15-bits remove unknown_k)
* clk: 6.5
* area 548611
* 8.68 mW
* clock gating (15-bits remove unknown_k) **better**
* clk: 7.0
* area 511365
* 7.83
### APR
* clk: 6.5ns (15-bit with known_K)
* area: 775239.92 um^2
* power: 29.5 mW
* note: line 73756 'ANTENNA' is unresolved
timing violation before sending data (@606ps)
* clock gating (15-bits remove known_k)
* clk: 7.0
* area: 730873.28 um^2
* power: 14.7 mW
* note: timing violation before sending data (@899ps)
Warning message

## Improvement
相較於 baseline 的改動 12/12
- `polar_deocder.v` if_seq 更新,port 中的 `clear` 訊號拿掉
可以改進的地方
- nodetype判斷 (至少針對baseline pattern優化)
- 存少一點L
- 每層回傳ucap也許不需要 1 cycle/layer
- LLR input排程