---
title : An Efficient Page-level FTL to Optimize Address Translation in Flash Memory
tags : SSD
description : 2023/05/24 reading group
---
# An Efficient Page-level FTL to Optimize Address Translation in Flash Memory
##### Link : [paper](https://dl.acm.org/doi/pdf/10.1145/2741948.2741949)
###### paper origin: EuroSys '15
## Introduction
- Problem :
    - With the increasing capacity of SSDs, the **mapping table grows too large to be cached** entirely in RAM
    - With only part of the table cached, address translation incurs **extra flash operations**, which should be reduced even with a small mapping cache
- Proposal :
    - Use a relatively **small mapping cache**
    - **Cluster** the cached mapping entries that belong to the same translation page
## Background
An SSD mainly consists of
- A software layer, the FTL (flash translation layer)
- An internal RAM
- Flash memory
## Experiment
1. Distribution of entries in the mapping cache
   - a PPN (physical page number) takes 4B
   - a flash page is 4KB, so one translation page holds 4KB / 4B = 1024 mapping entries

This result shows that **only a small fraction** (less than 15%, roughly 150 out of 1024) of the entries in a cached translation page **are recently used**.

We can see that 53%-71% of cached translation pages have more than one dirty entry cached, and **the average number of dirty entries per page is above 15**.

2. Spatial locality in workloads
Although Financial1 is a random-dominant workload, it is evident that **sequential accesses**, denoted by the diagonal lines, **are very common**.

**The decline is because sequential accesses** require consecutive mapping entries, which are concentrated in a few translation pages.

## Design of TPFTL
### Overview
- cache
    - a small set of TP nodes (see the sketch below)
    - each TP node maintains a cluster of L2P (logical-to-physical) entry nodes
    - a counter tracks changes in the number of TP nodes, which drives the loading policy
- flash memory
    - data blocks
    - translation blocks
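
A minimal sketch of this two-level structure, assuming Python and invented names (`TPNode`, `TPFTLCache`, `ENTRIES_PER_TP`); the paper does not prescribe a concrete implementation:

```python
from collections import OrderedDict

ENTRIES_PER_TP = 1024  # 4KB translation page / 4B per PPN (geometry from the experiment above)

class TPNode:
    """One cached translation page: a cluster of L2P entry nodes."""
    def __init__(self, tvpn):
        self.tvpn = tvpn              # virtual translation-page number
        self.entries = OrderedDict()  # lpn -> (ppn, dirty), kept in LRU order

class TPFTLCache:
    """The mapping cache: an LRU-ordered set of TP nodes."""
    def __init__(self, capacity):
        self.capacity = capacity       # total entry nodes the cache may hold
        self.tp_nodes = OrderedDict()  # tvpn -> TPNode, kept in LRU order
        self.num_entries = 0           # current number of cached entry nodes
        # The loading policy (below) watches how len(self.tp_nodes) changes.
```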

### Page-level LRU
- A TP node usually holds **multiple entry nodes with different hotness**
- Note that the hotness of each entry node is obscured by the page-level hotness, which makes temporal locality harder to exploit fully (see the lookup sketch below)
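
One possible reading of the lookup path, reusing the hypothetical `TPFTLCache` sketched above: on a hit, the TP node moves to the MRU end of the page-level list and the entry moves to the MRU end within its node, so an entry's global eviction order is driven mainly by its page's recency.

```python
def lookup(cache, lpn):
    """Translate lpn -> ppn through the two-level LRU cache."""
    tvpn = lpn // ENTRIES_PER_TP           # translation page that holds lpn
    node = cache.tp_nodes.get(tvpn)
    if node is None or lpn not in node.entries:
        return None                        # miss: must read a translation page from flash
    cache.tp_nodes.move_to_end(tvpn)       # page-level promotion to MRU
    node.entries.move_to_end(lpn)          # entry-level promotion within the TP node
    ppn, _dirty = node.entries[lpn]
    return ppn
```
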
### Loading Policy
**Sequential accesses are very common**
- Request-level prefetching
    - **Splitting a request into one or more page accesses** according to its start address and length, then loading their mapping entries together (see the sketch after this list)
    - It is efficient for large requests
- Selective prefetching
    - prefetches a dynamic number of consecutive entries
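
For the request-level part, a request is decomposed into the logical pages it covers so that all of their mapping entries can be loaded in one batch. A minimal sketch, assuming 4KB pages and byte-addressed requests (names are illustrative):

```python
PAGE_SIZE = 4096  # bytes per flash page, as in the experiment above

def split_request(start_addr, length):
    """Return the LPNs covered by a request, whose mapping entries
    are then fetched together (request-level prefetching)."""
    first_lpn = start_addr // PAGE_SIZE
    last_lpn = (start_addr + length - 1) // PAGE_SIZE
    return list(range(first_lpn, last_lpn + 1))

# An unaligned 20KB request starting at byte 10240 touches LPNs 2..7:
assert split_request(10240, 20480) == [2, 3, 4, 5, 6, 7]
```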

1. If the **number of TP nodes continues to decrease** by a threshold, TPFTL assumes sequential accesses are happening
    - It then performs selective prefetching whenever a cache miss occurs
2. If the number then continuously increases by the threshold
    - Selective prefetching stops
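
This trigger behaves like a simple hysteresis on the TP-node count: a sustained drop means cached entries are clustering into fewer translation pages (sequential accesses), while a sustained rise means they are scattering again. A sketch with an invented `THRESHOLD`; the paper's exact bookkeeping may differ:

```python
THRESHOLD = 8  # hypothetical trigger value; treated as tunable

class PrefetchController:
    """Turns selective prefetching on/off from TP-node count trends."""
    def __init__(self, initial_tp_count=0):
        self.enabled = False
        self.baseline = initial_tp_count  # count when the current trend started

    def observe(self, num_tp_nodes):
        if num_tp_nodes <= self.baseline - THRESHOLD:
            # Continuous decrease: assume sequential accesses and start
            # selective prefetching on the next cache miss.
            self.enabled = True
            self.baseline = num_tp_nodes
        elif num_tp_nodes >= self.baseline + THRESHOLD:
            # Continuous increase: accesses look random again, so stop.
            self.enabled = False
            self.baseline = num_tp_nodes
```

In this reading, the controller is consulted whenever `len(cache.tp_nodes)` changes; while `enabled` is set, a cache miss additionally loads a dynamic number of consecutive entries from the same translation page.
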
### Replacement Policy
- Batch-update replacement
    - When a dirty entry node becomes the victim, TPFTL **writes back all the dirty entry nodes of its TP node** in a single translation-page write, and only the victim itself is evicted
    - During GC, if the translation page being migrated is cached, all its cached dirty entries are written back as well
- Clean-first replacement
    - Chooses the LRU **clean entry node as the victim**, since evicting a clean entry requires no flash write (see the sketch below)
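
Putting the two rules together over the hypothetical structures above: eviction prefers an LRU clean entry, which can simply be dropped, and when only dirty entries remain, one translation-page write flushes every dirty entry of the victim's TP node while only the victim leaves the cache. A sketch; TPFTL's exact search order may differ:

```python
def write_translation_page(tvpn, node):
    """Stub for the single flash write that updates translation page tvpn."""
    pass  # hypothetical flash back-end

def evict_one(cache):
    """Free one entry node: clean-first, with batch-update write-back."""
    # Clean-first: scan TP nodes from the LRU end for a clean entry node.
    for tvpn, node in cache.tp_nodes.items():            # LRU -> MRU order
        for lpn, (_ppn, dirty) in node.entries.items():  # LRU -> MRU order
            if not dirty:
                del node.entries[lpn]                    # drop clean victim: no flash write
                if not node.entries:
                    del cache.tp_nodes[tvpn]
                cache.num_entries -= 1
                return
    # All cached entries are dirty: victimize the LRU entry of the LRU TP node.
    tvpn, node = next(iter(cache.tp_nodes.items()))
    victim_lpn = next(iter(node.entries))
    write_translation_page(tvpn, node)                   # batch update: one page write
    for lpn, (ppn, _dirty) in list(node.entries.items()):
        node.entries[lpn] = (ppn, False)                 # all entries are now clean
    del node.entries[victim_lpn]                         # only the victim is evicted
    if not node.entries:
        del cache.tp_nodes[tvpn]
    cache.num_entries -= 1
```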

## Evaluation
- FlashSim simulation platform
- trace-driven workloads (e.g., Financial1)

### Results

## Conclusion
- **Extra operations** introduced by address translation **degrade both performance and lifetime**
- Both a high **hit ratio** and a **low probability of replacing a dirty entry** in the mapping cache are crucial for reducing system response time and overall write amplification