Alleviating Garbage Collection Interference through Spatial Separation in All Flash Arrays
===

###### tags: `usenix` `atc`

Jaeho Kim, Kwanghyun Lim‡, Young-Don Jung†, Sungjin Lee†, Changwoo Min, Sam H. Noh⋆

Virginia Tech, ‡Cornell University, †DGIST, ⋆UNIST

---

## All Flash Array (AFA or SSA)

* Storage infrastructure that contains only flash memory drives
* AFA architecture is not much different from a traditional HDD-based storage server (e.g., RAID structure)

---

## Problem

* This design (replacing HDDs with SSDs) is not adequate to take full advantage of high-speed SSDs
* AFA-level GC interferes with user I/O -> throughput drops
* Storage bandwidth and network bandwidth are unbalanced

---

* However, even with such a small number of SSDs in use, providing ideal, consistent performance is impossible because of GC interference -> *The only way is to hide the GC effect!*

---

## Existing AFAs

* In-place write AFAs

| System | Media type | Modification | Disk organization | GC interference alleviation |
|-|-|-|-|-|
| Harmonia | SSDs | Array controller | RAID-0 | Globally coordinated GC algorithm -> does not completely eliminate interference |
| HPDA | SSDs & HDDs | Host layer | RAID-4 | - (none) |
| GC-Steering | SSDs | Host layer | RAID-4/5 | Staging disks -> space constraint |

---

* Log-structured write AFAs

| System | Media type | Modification | Disk organization | GC interference alleviation |
|-|-|-|-|-|
| SOFA | SSDs | Host layer | Log-RAID | Globally coordinated GC algorithm -> does not completely eliminate interference |
| SALSA | SSDs & SMR | Host layer | Log-RAID | - (unmentioned) |
| Purity | SSDs | Host layer | Log-RAID | - (unmentioned) |
| SWAN | SSDs | Host layer | 2D Array | Spatial separation (this work) |

---

## Architecture of SWAN

* Log-structured writing on RAID
    * Segment-based, append-only writes
    * Mapping table: 4 KB logical blocks <-> segments
* **Spatial separation** (a conceptual sketch is appended at the end of these notes)
    * Front end: serves all write requests
    * Back end: performs SWAN's GC
* Implemented at the block I/O layer

---

## Procedure of I/O handling

![](https://i.imgur.com/0a5YfFC.jpg)

1. I/O sequence <w1, r7, w3, r8> arrives from the network
2. The requests are handled as follows:
    * Writes are appended to a segment, but are actually distributed across the SSDs in the front-end R-group and written in parallel
    * Reads can be served by any of the three R-groups, depending on where the requested block resides

![](https://i.imgur.com/Env8Od3.jpg)
![](https://i.imgur.com/9kVn5i7.jpg)
![](https://i.imgur.com/cae6Ynh.jpg)

---

## Feasibility

* How many SSDs in an R-group?
    * Enough to saturate the network throughput
* Minimum number of R-groups?
    * Enough to avoid GC falling behind the writes (GC becoming the bottleneck)
    * For each R-group, once it becomes a back end: GC time < recycle time (time until it becomes a front end again) (see the back-of-the-envelope conditions at the end of these notes)
* Time to trim?
    * trim: marks the old pages as "invalid"; those pages remain in the block, but GC skips (does not copy) them during GC

---

## Evaluation

* Environment
    * Dell R730 server with Xeon CPUs and 64 GB DRAM
    * Up to 9 SATA SSDs are used (up to 1 TB capacity)
    * Open-channel SSD for monitoring the internal activity of an SSD
* Target configurations
    * RAID0/4: traditional RAID
    * Log-RAID0/4: log-based RAID
    * SWAN0/4: our solution
* Workloads
    * Micro-benchmark: random write requests
    * YCSB workload C benchmark

---

## Write performance (Micro-benchmark)
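
---

## Appendix: conceptual sketch of spatial separation

To make the front-end/back-end idea concrete, here is a minimal Python sketch of the append/read/rotation path described above. It is my own illustration, not SWAN's implementation: the names (`RGroup`, `Swan`, `SEGMENT_BLOCKS`) and the tiny sizes are assumptions, and parity, striping, trim, and the actual GC copy loop are omitted.

```python
# Minimal, hypothetical sketch of SWAN-style spatial separation.
# One R-group is the front end and absorbs every write; the other R-groups
# are back ends, so their GC never competes with foreground writes.
from collections import deque

SEGMENT_BLOCKS = 4          # 4 KB logical blocks per segment (tiny, for illustration)


class RGroup:
    """A group of SSDs treated as one append-only log region."""

    def __init__(self, name, capacity_segments):
        self.name = name
        self.capacity = capacity_segments
        self.segments = []              # sealed segments (lists of LBAs)
        self.open_segment = []          # segment currently being filled

    def append(self, lba):
        """Append one logical block; return its (segment_id, offset)."""
        if len(self.open_segment) == SEGMENT_BLOCKS:
            self.segments.append(self.open_segment)   # seal the full segment
            self.open_segment = []
        self.open_segment.append(lba)
        return len(self.segments), len(self.open_segment) - 1

    def is_full(self):
        return len(self.segments) >= self.capacity


class Swan:
    def __init__(self, num_rgroups=3, capacity_segments=2):
        self.rgroups = deque(RGroup(f"R{i}", capacity_segments)
                             for i in range(num_rgroups))
        self.mapping = {}               # LBA -> (R-group, segment_id, offset)

    @property
    def front_end(self):
        return self.rgroups[0]          # rgroups[1:] are back ends (GC runs there)

    def write(self, lba):
        # Every write is appended to the front-end R-group (in the real system
        # the blocks are striped across that R-group's SSDs in parallel).
        if self.front_end.is_full():
            self.rgroups.rotate(-1)     # full front end retires to the back end;
                                        # back-end GC (not shown) reclaims it there
        seg, off = self.front_end.append(lba)
        self.mapping[lba] = (self.front_end, seg, off)   # old copy becomes stale

    def read(self, lba):
        # Reads follow the mapping table and may hit any R-group, front or back.
        rg, seg, off = self.mapping[lba]
        return f"{rg.name}: segment {seg}, offset {off}"


afa = Swan()
for lba in (1, 7, 3, 7):                # overwriting LBA 7 only remaps it
    afa.write(lba)
print(afa.read(7))                      # -> R0: segment 0, offset 3
```

The design choice this illustrates: because only the front-end R-group receives foreground writes, back-end R-groups can be cleaned without write requests competing for their bandwidth, while reads are still served from whichever R-group holds the requested block.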
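
---

## Appendix: back-of-the-envelope feasibility conditions

The two Feasibility questions can be written as rough conditions. These formulas are my paraphrase of the slide's "saturate the network" and "GC time < recycle time" rules, not equations from the paper; $B_{net}$, $B_{ssd}^{write}$, $k$, $T_{GC}$, and $T_{fill}$ are assumed symbols, and the recycle-time estimate assumes simple round-robin rotation of R-group roles.

* SSDs per R-group (enough aggregate write bandwidth to saturate the network):

$$
N_{ssd} \;\gtrsim\; \frac{B_{net}}{B_{ssd}^{write}}
$$

* GC keeps up (with $k$ R-groups, a back-end R-group becomes the front end again only after the other $k-1$ R-groups have each been filled):

$$
T_{GC} \;<\; T_{recycle} \;\approx\; (k-1)\cdot T_{fill}
$$

For example, with a hypothetical 10 Gb/s (about 1.25 GB/s) network and roughly 500 MB/s of sequential write bandwidth per SATA SSD, about three SSDs in the front-end R-group would saturate the link; these numbers are illustrative, not taken from the evaluation.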