## This Week TODO
- 看 [In-Memory Data Parallel Processor](https://dl.acm.org/doi/pdf/10.1145/3173162.3173171) 是怎麼把一些比較複雜的operation去做分解(lowering instruction)
## In-Memory Data Parallel Processor
### Introdruction
The aim is to design a programmable in-memory processor architecture and a data-flow based programming framework that can levrage the massive parallelism and low data movement of Non-Volatile Memories (NVMs).
### How support complex operations
Natively, The arrays can perform basic arithmetic operations such as **dot product, addition, multiplication, and subtraction**.
However, general-purpose computation requires supporting a diverse set of operations such as **division, exponents, and transcendental functions**.
To support these complex operations, they are converted into a set of lookup table (LUT), addition, and multiplication instructions based on algorithms used in Intel IA-64 processors.
The compiler uses either **Newton-Raphson** or **Maclaurin Goldschmidt methods** that iteratively apply a set of instructions to an initial seed from the look-up table and refine the result.
#### Newton-Raphson Method
```
xn = xn-1 – f(xn-1)/f'(xn-1)
Where,
xn-1 is the estimated (n-1)th root of the function,
f(xn-1) is the value of the equation at (n-1)th estimated root, and
f'(xn-1) is the value of the first order derivative of the equation or function at xn-1.
```
In division, we can use this method to find the reciprocal(倒數) of divisor iteratively, then compute the quotient by multiplying the dividend by the reciprocal of the divisor.

Only add and mul.
#### Maclaurin-Goldschmidt methods
##### Maclaurin method
The Maclaurin series is a special case of a Taylor series, is a method for representing a function as an infinite sum of terms.

##### Goldschmidt methods
The Goldschmidt method is an iterative numerical algorithm used to perform high-precision floating-point arithmetic operations, primarily for division, square root extraction, and reciprocal approximation.
**In division:**
The steps for Goldschmidt division are:
1. Generate an estimate for the multiplication factor Fi .
2. Multiply the dividend and divisor by Fi .
3. If the divisor is sufficiently close to 1, return the dividend, otherwise, loop to step 1.
Assuming N/D has been scaled so that 0 < D < 1, each Fi is based on D:

After a sufficient number k of iterations, Q = N

## Reference
[In-Memory Data Parallel Processor](https://dl.acm.org/doi/pdf/10.1145/3173162.3173171)
[Newton-Raphson Method](https://www.geeksforgeeks.org/newton-raphson-method/)
[Division algorithm](https://en.wikipedia.org/wiki/Division_algorithm)