Mass Storage System

--- tags: Operating System --- # Mass Storage System ## Disk Structure ![](https://i.imgur.com/HiORO3t.png) Disk drives are addressed as large 1-dim arrays of logical blocks (sector). 1. Logical blocks are mapped onto disk sequentially. 1. Constant Linear Velocity (CLV) 1. Density of bits per track is uniform 2. To keep the same data rate, increasing rotation speed in inner cylinders is needed. 2. Constant Angular Velocity (CAV) 1. There is larger bit density on inner tracks. 2. Same data rate 2. Disk I/O Disk drive attached to a computer by an I/O bus. * I/O bus is controlled by controllers 1. Host controller 2. Disk controller ## Disk Scheduling 1. Disk-access time has 3 major components 1. Seek Time The time that move disk arm to the desired cylinder. 2. Rotational Latency The time that rotate disk head to the desired sector 3. Read Time Content transfer time 2. Disk Bandwidth ![](https://i.imgur.com/gGIbavO.png) $\frac{number\ of\ bytes\ transferred}{complete\ of\ last\ req – start\ of\ first\ req}$ 3. Disk Scheduling Minimize seek time ### FCFS ![](https://i.imgur.com/uMNHTMb.png) ### SSTF (Shortest-Seek-Time-First) ![](https://i.imgur.com/ZjTDLom.png) ### SCAN Scheduling ![](https://i.imgur.com/YtC8qmq.png) Disk head moves from one end to the other end ### C-SCAN Scheduling ![](https://i.imgur.com/pAskX5Q.png) Disk head moves in one direction only ### C-LOOK Scheduling ![](https://i.imgur.com/C0UEl5o.png) Disk head moves only to the last request location ## Disk Management ### Disk Formatting Low-level formatting (or physical formatting) divides a disk into sectors that disk controller can read and write. * $sector = header + data area + trailer$, where 1. $header$: $sector\ number$ 2. $trailer$: ECC (error-correcting code) * ECC is calculated based on all bytes in data area. OS does the next 2 steps to use the disk 1. Partition the disk into one or more groups of cylinders 2. Logical Formatting ### Booting 1. Bootstrap Program It initializes CPU, registers, device controllers, memory, and then starts OS. 1. First bootstrap code is stored in ROM. 2. It completes bootstrap in the boot block of the boot disk (system disk) 2. Booting from a Disk in Windows 2000 ![](https://i.imgur.com/ANzfs2a.png) 1. Run bootstrap code in ROM 2. Read boot code in MBR (master boot record) 3. Find boot partition from partition table 4. Read boot sector/block and continue booting ### Bad Blocks 1. Simple Disks Like IDE Disks They manually use format program to mark the corresponding FAT entry of the bad block and the bad blocks are locked away from allocation. 2. Sophisticated Disks Like SCSI Disks 1. Disk controller maintains the list of bad blocks 2. List is updated over the life of the disk 3. The lost data can be reused 1. Sector Sparing (Forwarding) It remaps bad block to a spare one. * It could affect disk-scheduling performance 2. Sector Slipping It ships sectors all down one spot. ### Swap-Space Management 1. Swap-Space Virtual memory use disk space (swap-space) as an extension of main memmory. 2. Management 1. Part of a Normal File System 2. Separate Disk Partition 3. Allows access to both types 3. Allocation 1. 1st Version Copy entire process between contiguous disk regions and memory. 2. 2nd Version Copy pages to swap space. 1. Solaris 1 1. Text segments read from file system, thrown away when pageout 2. Only anonymous memory (stack, heap) is stored in swap space 2. Solaris 2 Swap-space allocation only when pageout rather than virtual memory creation time 4. Data Structures for Swapping (Linux) ![](https://i.imgur.com/N1JLKrw.png) ## Redundant Arrays of Inexpensive (RAID) Disks RAID is arranged into different levels 1. Striping 2. Mirror (Replication) 3. Error-correcting code (ECC) & Parity bit ### RAID 0 & RAID 1 ![](https://i.imgur.com/2ULQVt0.png) #### RAID 0: Non-Redundant Striping Strip data into several disks. 1. Properties 1. I/O bandwidth is proportional to the striping count. 2. It improves performance via parallelism. 3. It has no fault tolerence. 2. Recommended Applications 1. Non-critical data storage 2. Application requiring high bandwidth (such as video editing) #### RAID 1: Mirrored Disks Mirror data into several disks. 1. Properties 1. It provides reliability via redundancy. 2. Read bandwidth increases by $N$ (the number of disks) times. 2. Recommended Applications * Application requiring very high availability #### Hybrid RAID ![](https://i.imgur.com/kEZpBEz.png) 1. RAID 0+1: Stripe then Replicate. 2. RAID 1+0: Replicate then Stripe. * First level often control by a controller. Therefore, RAID 1+0 has better fault tolerance than RAID 0+1 when multiple disk fails. ### RAID 2: Hamming code ![](https://i.imgur.com/jvN43jU.png) ![](https://i.imgur.com/oQn4YSY.png) 1. Properties 1. It can fix any single disk failure. 2. It is inefficient. 2. Recommended applications No commercial implementations exist / not commercially viable ### RAID 3 & RAID 4: Parity Bit ![](https://i.imgur.com/XOGvSVB.png) Parallel transfer with parity in bit-level or block-level. Disk controller can detect whether a sector has been read correctly so a single parity bit is enough to correct error from a single disk failure. 1. RAID 3: Bit-Level Striping 2. RAID 4: Block-Level Striping * RAID4 has higher I/O throughput, because controller does not need to reconstruct block from multiple disks. ### RAID 5: Distributed Parity ![](https://i.imgur.com/Vw3fzGq.png) It spreads data & parity across all disks to prevent over use of a single disk. * Properties 1. Disk failure has a medium impact on throughput 2. When one disk failed, user have to rebuild the RAID array. 3. Controller design is complex. ### RAID 6: P+Q Dual Parity Redundancy ![](https://i.imgur.com/Slb6ipb.png) 1. It is like RAID 5, but it stores extra redundant information to guard against multiple disk failure. 2. It uses ECE code (Error Correction Code) instead of single parity bit. **Reference** NTHU OCW 作業系統 by 周志遠