# [Study] Memory System -- DRAM study note
###### tags: `research-GraphRC`

> by Bruce Jacob, David T. Wang, Spencer W. Ng, Samuel Rodriguez
[TOC]
## DRAM limitation
### Speed
- DRAM designers have increased the **data rate at the DRAM’s I/O pins** without also having to increase the speed of the DRAM’s core
- DDR improves speed relative to single data rate (SDR) by moving to a **2n prefetch architecture**: each access fetches twice as many bits from the DRAM core as the I/O pins transfer per beat, so the core can run at half the I/O data rate
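
A minimal sketch of the prefetch arithmetic described above: the per-pin data rate is the core clock multiplied by the prefetch depth. The function name and the 200 MHz core clock are illustrative assumptions, not values from the book.

```python
def io_data_rate_mtps(core_clock_mhz: float, prefetch: int) -> float:
    """Per-pin data rate (MT/s) when `prefetch` bits per pin are fetched per core cycle."""
    return core_clock_mhz * prefetch


# Assumed core clock of 200 MHz, used only for illustration.
CORE_CLOCK_MHZ = 200.0

for name, prefetch in [("SDR (1n)", 1), ("DDR (2n prefetch)", 2)]:
    rate = io_data_rate_mtps(CORE_CLOCK_MHZ, prefetch)
    print(f"{name}: {rate:.0f} MT/s per pin at a {CORE_CLOCK_MHZ:.0f} MHz core")
```

With the same core speed, doubling the prefetch depth doubles the rate seen at the pins, which is the point of the first bullet.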
### Cost concern
- DRAM design will be far more concerned about pin count than transistor count
- Pin cost is becoming a major concern
### Power and Heat Dissipation
- Previous-era DIMMs exhibited power dissipation on the order of **1W**. Current **FB-DIMMs dissipate nearly ten times** that
## DRAM access latency
:::spoiler Breaking down access delay

:::
### Timing parameter
- Captured from Chapter 11, page 428

| Parameter | Description |
| --------- | ----------- |
| tAL | Added Latency to column accesses, used in DDRx SDRAM devices for posted CAS commands. |
| tBURST | Data burst duration. The time period that data burst occupies on the data bus. Typically 4 or 8 beats of data. In DDR SDRAM, 4 beats of data occupy 2 full clock cycles. |
| tCAS | Column Access Strobe latency. The time interval between column access command and the start of data return by the DRAM device(s). Also known as tCL. |
| tCCD | Column-to-Column Delay. The minimum column command timing, determined by internal burst (prefetch) length. Multiple internal bursts are used to form longer bursts for column reads. tCCD is 2 beats (1 cycle) for DDR SDRAM, and 4 beats (2 cycles) for DDR2 SDRAM. |
| tCMD | Command transport duration. The time period that a command occupies on the command bus as it is transported from the DRAM controller to the DRAM devices. |
| tCWD | Column Write Delay. The time interval between issuance of the column-write command and placement of data on the data bus by the DRAM controller. |
| tFAW | Four (row) bank Activation Window. A rolling time-frame in which a maximum of four bank activations can be engaged. Limits peak current profile in DDR2 and DDR3 devices with more than 4 banks. |
| tOST | ODT Switching Time. The time interval to switching ODT control from rank to rank. |
| tRAS | Row Access Strobe. The time interval between row access command and data restoration in a DRAM array. A DRAM bank cannot be precharged until at least tRAS time after the previous bank activation. |
| tRC | Row Cycle. The time interval between accesses to different rows in a bank. tRC = tRAS + tRP. |
| tRCD | Row to Column command Delay. The time interval between row access and data ready at sense amplifiers. |
| tRFC | Refresh Cycle time. The time interval between Refresh and Activation commands. |
| tRP | Row Precharge. The time interval that it takes for a DRAM array to be precharged for another row access. |
| tRRD | Row activation to Row activation Delay. The minimum time interval between two row activation commands to the same DRAM device. Limits peak current profile. |
| tRTP | Read to Precharge. The time interval between a read and a precharge command. |
| tRTRS | Rank-to-rank switching time. Used in DDR and DDR2 SDRAM memory systems; not used in SDRAM or Direct RDRAM memory systems. One full cycle in DDR SDRAM. |
| tWR | Write Recovery time. The minimum time interval between the end of a write data burst and the start of a precharge command. Allows sense amplifiers to restore data to cells. |
| tWTR | Write To Read delay time. The minimum time interval between the end of a write data burst and the start of a column-read command. Allows I/O gating to overdrive sense amplifiers before read command starts. |
#### Row-Read Command
- The purpose of a row access command is to move data from the cells in the DRAM arrays to the sense amplifiers and then restore the data back into the cells in the DRAM arrays as part of the same command.
- **tRCD**: The time it takes for the row access command to move data from the DRAM cell arrays to the sense amplifiers is known as the Row-Column (Command) Delay.
- **tRAS**: The time it takes for a row access command to discharge and restore data from the row of DRAM cells is known as the Row Access Strobe latency or tRAS.
#### Column-Read Command
- A column-read command moves data from the array of sense amplifiers of a given bank of DRAM arrays through the data bus back to the memory controller.
- **tCAS**: Column Access Strobe Latency (tCAS, or tCL) is the time it takes for the DRAM device to place the requested data onto the data bus after issuance of the column-read command.
- **tCCD, tBURST**: The internal burst length of the DRAM device is labelled as tCCD in Figure 11.4, and the duration of the data burst on the data bus for a single column-read command is labelled as tBURST.
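
As a rough sketch of how the row and column timings above combine, the read latency depends on the state of the target bank when the request arrives. The class, the function, and the cycle counts are assumptions for illustration; command transport time (tCMD) is ignored, and the conflict case assumes tRAS has already been satisfied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadTiming:
    """Timing parameters in memory-clock cycles (illustrative values only)."""
    tRP: int
    tRCD: int
    tCAS: int
    tBURST: int


def read_latency(t: ReadTiming, row_state: str) -> int:
    """Cycles from the first command to the end of the data burst.

    row_state is one of:
      "hit"      - the target row is already held in the sense amplifiers
      "closed"   - the bank is precharged, so a row access is needed first
      "conflict" - another row is open, so precharge + row access are needed
    """
    column_part = t.tCAS + t.tBURST
    if row_state == "hit":
        return column_part
    if row_state == "closed":
        return t.tRCD + column_part
    if row_state == "conflict":
        return t.tRP + t.tRCD + column_part
    raise ValueError(f"unknown row state: {row_state!r}")


example = ReadTiming(tRP=11, tRCD=11, tCAS=11, tBURST=4)  # assumed values
for state in ("hit", "closed", "conflict"):
    print(state, read_latency(example, state))
```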
#### Column-Write Command
- A column-write command moves data from the memory controller to the sense amplifiers of the targeted bank.
- **tCWD**: The column-write delay specifies the timing between assertion of the column-write command on the command bus and the placement of the write data onto the data bus by the memory controller.
- **tWR**: The write recovery time, tWR, is the time it takes for the write data to propagate into the DRAM arrays.
- **tWTR**: The write-to-read time, tWTR, accounts for the time needed for the write command to release the I/O gating resources so that a subsequent read command can use them.
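
The three write parameters can be combined into minimum command spacings, following the definitions in the timing table (tWR and tWTR are both measured from the end of the write data burst). This is a sketch only; the function names and cycle counts are assumptions.

```python
def write_to_read_gap(tCWD: int, tBURST: int, tWTR: int) -> int:
    """Minimum cycles from a column-write command to a following column-read command.

    The write data starts tCWD after the write command, occupies the bus for
    tBURST, and the read must then wait a further tWTR.
    """
    return tCWD + tBURST + tWTR


def write_to_precharge_gap(tCWD: int, tBURST: int, tWR: int) -> int:
    """Minimum cycles from a column-write command to a precharge of the same bank."""
    return tCWD + tBURST + tWR


# Assumed cycle counts, for illustration only.
print(write_to_read_gap(tCWD=8, tBURST=4, tWTR=6))
print(write_to_precharge_gap(tCWD=8, tBURST=4, tWR=12))
```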
#### Precharge Command
- The precharge command completes the row access sequence as it resets the sense amplifiers and the bitlines and prepares them for another row access command to the same array of DRAM cells.
- **tRP**: The row precharge duration. A time tRP after the assertion of the precharge command, the bitlines and sense amplifiers of the selected bank are properly precharged, and a subsequent row access command can be sent to the just-precharged bank of DRAM cells.
- **tRC**: tRC = tRAS + tRP. The minimum amount of time that a DRAM device needs to bring data from the DRAM cell arrays into the sense amplifiers, restore the data to the DRAM cells, and precharge the bitlines to the reference voltage level for another row access command.
- tRC is also commonly referred to as the random row-cycle time of a DRAM device.
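
A small worked example of the row-cycle relation tRC = tRAS + tRP, and of the upper bound it places on back-to-back accesses to different rows of one bank. The nanosecond values are assumed for illustration and do not come from the book.

```python
def row_cycle_time_ns(tRAS_ns: float, tRP_ns: float) -> float:
    """Random row-cycle time: tRC = tRAS + tRP."""
    return tRAS_ns + tRP_ns


def max_row_cycles_per_second(tRC_ns: float) -> float:
    """Upper bound on accesses per second to different rows of a single bank."""
    return 1e9 / tRC_ns


tRC = row_cycle_time_ns(tRAS_ns=35.0, tRP_ns=13.75)  # assumed values
print(f"tRC = {tRC} ns")
print(f"at most {max_row_cycles_per_second(tRC) / 1e6:.1f} M row cycles/s per bank")
```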
#### Refresh Command
## Timing
### Read cycle
### Write cycle
## Overall architecture
- Memory Controller
- Channel
- Rank
- Chip (Device)
- Bank
- Subarray (Array)
- Mat (Tile)
- Row, Column
- Cell
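
One way to picture the controller-visible part of this hierarchy is as bit fields of a flat address. The sketch below is an assumption for illustration: the field order and widths are hypothetical, real controllers use many different mappings, and the subarray and mat levels are left out because they are normally internal to the device.

```python
from typing import Dict, List, Tuple

# Hypothetical field widths in bits, listed from least to most significant.
FIELDS: List[Tuple[str, int]] = [
    ("column", 10),
    ("channel", 1),
    ("bank", 3),
    ("rank", 1),
    ("row", 15),
]


def decode_address(addr: int) -> Dict[str, int]:
    """Split a flat address into channel/rank/bank/row/column indices."""
    fields: Dict[str, int] = {}
    for name, width in FIELDS:
        fields[name] = addr & ((1 << width) - 1)
        addr >>= width
    return fields


def encode_address(fields: Dict[str, int]) -> int:
    """Inverse of decode_address for the same field layout."""
    addr, shift = 0, 0
    for name, width in FIELDS:
        addr |= (fields[name] & ((1 << width) - 1)) << shift
        shift += width
    return addr


location = {"column": 5, "channel": 1, "bank": 3, "rank": 0, "row": 7}
print(decode_address(encode_address(location)))
```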
## DRAM Channel
:::info
- A channel is the **collection of all banks** that share **a common physical link** (command, address, data buses) to the processor.
- Each system controller has a single **DRAM memory controller (DMC)**, and each DRAM memory controller controls a single **channel** of memory. While banks from the same channel experience contention at the physical link, **banks from different channels** can be accessed **completely independently** of each other.
:::
- In modern DRAM memory systems, commodity DRAM memory modules are standardized with **64-bit-wide data buses**, and the 64-bit data bus width of the memory module matches the data bus width of the typical personal computer system controller.
- Commodity **error correcting** memory systems are standardized with a **72-bit-wide data bus**.
- Modern memory systems with **one DRAM memory controller and multiple physical channels** of DRAM devices are typically designed with the **physical channels operating in lockstep** with respect to each other.
- Figure explanation
- Typical system controller: The system controller controls a **single 64-bit-wide channel**.
- Intel i875P system controller: The system controller connects to a single channel of DRAM with a **128-bit-wide data bus**. The system controller requires matching pairs of 64-bit wide memory modules to operate with the 128-bit-wide data bus. The paired-memory module configuration of the i875P is often referred to as a **dual channel** configuration.
- The use of independent DRAM memory controllers can lead to higher sustainable bandwidth characteristics. As a result, newer system controllers are often designed with **multiple memory controllers** despite the additional die cost.
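
To make the effect of data bus width concrete, peak channel bandwidth is the bus width in bytes multiplied by the transfer rate. The 400 MT/s data rate below is an assumed value chosen only to compare a 64-bit channel with a 128-bit lockstep ("dual channel") configuration.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_mtps: float) -> float:
    """Peak bandwidth in GB/s (10^9 bytes/s) for a given bus width and transfer rate."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mtps / 1000


DATA_RATE_MTPS = 400  # assumed data rate, for illustration

for label, width in [("single 64-bit channel", 64), ("128-bit lockstep pair", 128)]:
    print(f"{label}: {peak_bandwidth_gb_s(width, DATA_RATE_MTPS):.1f} GB/s")
```

These are peak numbers only; the sustainable bandwidth mentioned in the last bullet depends on how well the controller keeps the bus busy.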
## DRAM Rank
:::info
- **Rank** is now used to denote a set of DRAM devices that operate in lockstep to respond to a given command in a memory system.
- Banks in different ranks are fully decoupled with respect to their device-level electrical operation and, consequently, offer **better bank-level parallelism than banks in the same rank.**
- A DRAM rank typically consists of **eight DRAM chips**, each of which has **eight banks**. Since the chips operate in lockstep, **the rank has only eight independent banks**, each of which is the set of the ith bank across all chips.
:::
## DRAM Device (Chip)
- Each row in this DRAM device consists of **1024 columns**, and **each column is 16 bits wide**. That is, the **16-bit-wide column is the basic addressable unit of memory in this device**, and each column access that follows the row access would read or write 16 bits of data from the same row of DRAM.
- In a given generation, a DRAM device may be configured with different data bus widths for use in different applications. The 64 Meg x4 device consists of **4 bits of data per column**, **2048 columns of data per row**, and **8192 rows per bank**, and there are **4 banks in the device**. Alternatively, a 256-Mbit SDRAM device with a 16-bit-wide data bus will have 16 bits of data per column, 512 columns per row, and 8192 rows per bank; there are 4 banks in the 16 Meg x16 device.
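
A quick arithmetic check of the two organizations above: multiplying bits per column, columns per row, rows per bank, and banks should give the same 256-Mbit capacity for both the x4 and the x16 part. The function name is an assumption; the numbers are the ones quoted in the bullet.

```python
def device_capacity_bits(bits_per_column: int, columns_per_row: int,
                         rows_per_bank: int, banks: int) -> int:
    """Total device capacity in bits."""
    return bits_per_column * columns_per_row * rows_per_bank * banks


MBIT = 2 ** 20

x4_bits = device_capacity_bits(bits_per_column=4, columns_per_row=2048,
                               rows_per_bank=8192, banks=4)
x16_bits = device_capacity_bits(bits_per_column=16, columns_per_row=512,
                                rows_per_bank=8192, banks=4)

print("64 Meg x4 :", x4_bits // MBIT, "Mbit")
print("16 Meg x16:", x16_bits // MBIT, "Mbit")
```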
:::info
Recap: What is the red 8 in the upper left corner? The number of subarrays? Or the number of
- Figure from last week 
- Updated figure 
- **Conclusion:**
- Given SALP, we have a **subarray ID** field in the memory address. We can select a specific **slice** from the bank **volume**
- Without SALP, we DON'T have a **subarray ID** field in the memory address. We automatically retrieve a **cube** of bits from the bank **volume** (i.e. 8 bits for x8 DRAM, 4 bits for x4 DRAM)
:::
## DRAM Bank
:::info
- **Bank** is only used to denote a set of independent memory arrays inside a DRAM device.
- **Banks** are the smallest memory structures that can be accessed in parallel with respect to each other. This is referred to as **bank-level parallelism**.
- This is NOT true for an RC-NVM bank
:::
- DRAM device with **4 banks** of DRAM arrays internally.
- Modern DRAM devices contain **multiple banks** so that multiple, independent accesses to different DRAM arrays can **occur in parallel**. In this design, each **bank of memory is an independent** array that can be in different phases of a row access cycle.
- Multiple banks within a given DRAM device can be **activated independently from each other** -- subject to the **power constraints** of the DRAM device that may specify how closely such activations can occur in a given period of time.
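
The power constraint on bank activations is expressed through tRRD and tFAW in the timing table. The sketch below is a simplified model, not a real controller: it assumes activations to one device are issued in request order and computes the earliest legal cycle for each. The function name and cycle counts are assumptions.

```python
from typing import List


def schedule_activations(requests: List[int], tRRD: int, tFAW: int) -> List[int]:
    """Earliest legal issue cycle for each row activation to one device.

    requests: cycles at which each activation becomes ready (non-decreasing).
    tRRD: minimum spacing between two consecutive activations.
    tFAW: rolling window that may contain at most four activations.
    """
    issued: List[int] = []
    for ready in requests:
        t = ready
        if issued:
            t = max(t, issued[-1] + tRRD)   # row-to-row activation delay
        if len(issued) >= 4:
            t = max(t, issued[-4] + tFAW)   # fifth activation must leave the window
        issued.append(t)
    return issued


# Six activations all ready at cycle 0, with assumed tRRD = 4 and tFAW = 20.
print(schedule_activations([0] * 6, tRRD=4, tFAW=20))
```

In this example the first four activations are spaced by tRRD, and the fifth is pushed out until the first one has left the tFAW window.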
## DRAM Subarray (Array)

- A **x4 DRAM** (pronounced “by four”) indicates that the DRAM has at least four memory arrays and that a **column width is 4 bits** (each column read or write transmits 4 bits of data). In a x4 DRAM part, four arrays each read 1 data bit in unison, and the part **sends out 4 bits** of data each time the memory controller makes a **column read request**.
- Due to their large parasitic capacitance, long bitlines have two disadvantages.
- First, they make it difficult for a DRAM cell to cause the necessary perturbation required for reliable sensing.
- Second, a sense-amplifier takes longer to drive a long bitline to a target voltage-level, thereby increasing the latency of activation and precharging.
- All tiles in the horizontal direction – a **“row of tiles” – share the same set of global wordlines**. Therefore, these tiles are activated and precharged in lockstep. We abstract such a “row of tiles” as a single entity that we refer to as a **subarray**. More specifically, a subarray is a collection of cells that **share a local row-buffer** (all sense-amplifiers in the horizontal direction) and a **subarray row-decoder**.
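
The first bitline disadvantage listed above can be illustrated with the standard charge-sharing relation: when a cell is connected to a bitline precharged to half the supply voltage, the voltage swing seen by the sense amplifier shrinks as the bitline capacitance grows. The half-supply precharge level and all numeric values are assumptions for illustration.

```python
def bitline_perturbation_v(v_cell: float, v_dd: float,
                           c_cell_ff: float, c_bitline_ff: float) -> float:
    """Bitline voltage change after charge sharing with one cell.

    Assumes the bitline starts at v_dd / 2 and the cell at v_cell; the two
    capacitances then share charge and settle at a common voltage.
    """
    return (v_cell - v_dd / 2) * c_cell_ff / (c_cell_ff + c_bitline_ff)


V_DD = 1.5        # assumed supply voltage (V)
C_CELL_FF = 25.0  # assumed cell capacitance (fF)

for label, c_bl in [("short bitline", 75.0), ("long bitline", 150.0)]:
    dv = bitline_perturbation_v(V_DD, V_DD, C_CELL_FF, c_bl)
    print(f"{label} ({c_bl:.0f} fF): {dv * 1000:.0f} mV perturbation")
```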
## DRAM Row
:::info
- **Row** is simply a group of storage cells that are **activated in parallel** in response to a row activation command.
:::
- DRAM **devices** can be connected in parallel to form **a rank of memory**. The effect of DRAM devices connected as ranks of DRAM devices that **operate in lockstep** is that a row activation command will activate the **same addressed row in ALL DRAM devices** in a given rank of memory.
- From the perspective of the memory controller, the **size of a row** is simply the **size of a row in a given DRAM device** multiplied by the number of DRAM devices in a given rank. 
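
A short sketch of the row-size rule above, using the x4 and x16 device organizations from the DRAM Device section. The devices-per-rank counts assume that enough devices are used to fill a 64-bit data bus; the function name is hypothetical.

```python
def rank_row_size_bytes(columns_per_row: int, bits_per_column: int,
                        devices_per_rank: int) -> int:
    """Row size seen by the memory controller: device row size x devices in the rank."""
    device_row_bits = columns_per_row * bits_per_column
    return device_row_bits * devices_per_rank // 8


DATA_BUS_BITS = 64  # assumed rank data bus width

# (label, columns per row, bits per column)
for label, columns, width in [("x4 devices", 2048, 4), ("x16 devices", 512, 16)]:
    devices = DATA_BUS_BITS // width
    size = rank_row_size_bytes(columns, width, devices)
    print(f"{label}: {devices} per rank, {size // 1024} KB per rank-level row")
```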
## DRAM Column
:::info
- **Column** of data is the smallest addressable unit of memory.
:::
- **The size of a column of data is the same as the width of the data bus.**
:::info
- A **beat** is simply a data transition on the data bus.
:::
- In SDRAM memory systems, there is **one data transition per clock cycle**, so **one beat of data is transferred per clock cycle**.
- In DDRx SDRAM memory systems, **two data transfers can occur in each clock cycle**, so **two beats of data are transferred in a single clock cycle**.
- In DDRx SDRAM memory systems, **each column access command fetches multiple columns of data depending on the programmed burst length**.
- For example, in a DDR2 DRAM device, each memory read command returns a minimum of 4 columns of data.
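
The relation between columns, beats, and bursts can be sketched as follows: each beat moves one column (one bus width) of data, so a column access moves bus width times burst length bits, and the burst occupies burst length divided by beats-per-cycle clock cycles. The function names and the 64-bit bus width are assumptions for illustration.

```python
def bytes_per_column_access(bus_width_bits: int, burst_length: int) -> int:
    """Bytes moved by one column access: one column of data per beat."""
    return bus_width_bits // 8 * burst_length


def burst_cycles(burst_length: int, beats_per_cycle: int) -> int:
    """Clock cycles that the data burst occupies on the data bus (tBURST)."""
    return burst_length // beats_per_cycle


BUS_WIDTH_BITS = 64  # assumed rank data bus width

# (label, burst length in beats, beats per clock cycle)
for label, bl, bpc in [("SDRAM, burst of 4", 4, 1), ("DDR2, burst of 4", 4, 2)]:
    print(f"{label}: {bytes_per_column_access(BUS_WIDTH_BITS, bl)} bytes "
          f"in {burst_cycles(bl, bpc)} cycles")
```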
## DRAM Cell design
- Currently, DRAM device manufacturers such as Micron, Samsung, Elpida, Hynix, and the majority of the DRAM manufacturing industry use the stacked capacitor structure, while Qimonda, Nanya, and several other smaller DRAM manufacturers use the trench capacitor structure
### Trench capacitor design
### Stacked capacitor design