Brief #
- RAID
- S/W (file system)
- H/W (hardware) disks, RAIDs, SSD
- File System
RAID #
- Why?
- Performance
- Capacity
- Reliability (Durability)
- Failure model
- Entire drive (fail-stop):
- Working or (Completely) failed
- Easily detected (by RAID controller)
- RAID “Levels”
- Level 0: striping, no redundancy
| Disk 0 | Disk 1 | Disk 2 |
|---|---|---|
| Block 0 | Block 1 | Block 2 |
| Block 3 | Block 4 | Block 5 |
- No redundancy: can’t handle failure
- Level 1: Mirroring
- Keep a copy of each block on at least one other drive
| Disk 0 | Disk 1 | Disk 2 | Disk 3 |
|---|---|---|---|
| 0 | 0 | 1 | 1 |
| 2 | 2 | 3 | 3 |
- More advanced failure model:
- Block could become corrupt
- Solutions
- Keep more than 2 copies and take a majority vote
- Store a checksum with each block to detect corruption
- Good:
- Performance (1 logical write => 2 physical writes, issued in parallel; a read can go to either copy)
- Tolerate failure
- Bad:
- Capacity (only 1/2 is usable with a 2-way mirror)
- Level 4: Parity
- Bit-level example: each row has an even number of 1s (the parity bit is the XOR of the data bits)
| Disk 0 | Disk 1 | Disk 2 | Parity Disk |
|---|---|---|---|
| 0 | 1 | 0 | 1 |
| 0 | 0 | 0 | 0 |
- “Full stripe write”
- Write: the data for Disks 0, 1, 2 goes to the RAID controller, which computes the parity
- Do all writes in parallel
- Random write: 1 block
| Disk 0 | Disk 1 | Disk 2 | Parity |
|---|---|---|---|
| 0 | 1 | 2 | P0,1,2 |
| 3 | 4 | 5 | P3,4,5 |
| 6 | 7 | 8 | P6,7,8 |
- How to write 4?
- Approach #1 (Additive):
- Read 3, 5
- Compute Parity (Over 3, 4, 5)
- Write 4, P3,4,5
- Approach #2 (Subtractive):
- Read old data
- If different:
- Read old parity
- Compute the new parity: flip each parity bit where the new data differs from the old data
- Write new data, new parity
- 2 reads + computation (free) + 2 writes
- RAID 4: 1 logical write => 4 physical I/Os, and every small write goes through the single parity disk
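
The two small-write approaches above can be sketched in a few lines of Python. This is only an illustration, not how a real controller is implemented: blocks are modeled as equal-length `bytes` objects, and the function names `parity` and `update_parity` are made up for this example.

```python
from functools import reduce


def parity(blocks):
    """Additive parity: XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def update_parity(old_data, new_data, old_parity):
    """Subtractive parity: flip the parity bits where old and new data differ."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# One stripe holding blocks 3, 4, 5 (one byte each, to keep the example small).
b3, b4, b5 = b"\x0f", b"\x33", b"\x55"
p345 = parity([b3, b4, b5])

# Overwrite block 4: both approaches must produce the same new parity.
new_b4 = b"\xa0"
additive = parity([b3, new_b4, b5])             # read 3 and 5, recompute
subtractive = update_parity(b4, new_b4, p345)   # read old 4 and old parity
assert additive == subtractive

# If the disk holding block 4 fails, XOR of the survivors rebuilds it.
assert parity([b3, b5, p345]) == b4
```

The subtractive version touches only the data disk and the parity disk no matter how wide the stripe is, which is where the "2 reads + 2 writes" count comes from.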
- Level 5
- Stagger (rotate) the parity across the disks
| Disk 0 | Disk 1 | Disk 2 | Disk 3 |
|---|---|---|---|
| 0 | 1 | 2 | P0,1,2 |
| 3 | 4 | P3,4,5 | 5 |
| 6 | P6,7,8 | 7 | 8 |
- Removes the single-parity-disk write bottleneck
- More parallel reads (every disk holds data)
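
To make the block layouts concrete, here is a minimal sketch that maps a logical block number to a disk. It assumes the exact layouts drawn in the Level 0 and Level 5 tables above (parity starting on the last disk and moving one disk to the left per stripe, data filling the remaining disks left to right). Real arrays use several different rotation schemes, and the names `raid0_locate` and `raid5_locate` are hypothetical.

```python
def raid0_locate(block, num_disks):
    """Return (disk, offset) for a logical block under plain striping."""
    return block % num_disks, block // num_disks


def raid5_locate(block, num_disks):
    """Return (data_disk, parity_disk, stripe) for a logical block.

    Assumes parity sits on the last disk for stripe 0 and moves one disk
    to the left on each following stripe, as in the Level 5 table.
    """
    data_per_stripe = num_disks - 1
    stripe, index = divmod(block, data_per_stripe)
    parity_disk = (num_disks - 1 - stripe) % num_disks
    # Data blocks fill the non-parity disks from left to right.
    data_disk = index if index < parity_disk else index + 1
    return data_disk, parity_disk, stripe


# Block 4 on a 3-disk RAID 0: second disk, second row.
assert raid0_locate(4, 3) == (1, 1)

# Block 5 on a 4-disk RAID 5: data on Disk 3, parity on Disk 2, stripe 1.
assert raid5_locate(5, 4) == (3, 2, 1)
```

Because `parity_disk` changes from stripe to stripe, small writes to different stripes update parity on different disks instead of queuing on one.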
Mirroring vs. RAID 5 #
| | Mirroring | RAID 5 |
|---|---|---|
| Small writes | 2 writes per write | 4 I/Os per write |
| Sequential I/O | similar | similar |
| Capacity | Wastes 1/2 or more | Much more efficient |
| Reliability | 1 failure (for sure) | same |
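
The capacity and small-write rows can be checked with a little arithmetic. The sketch below assumes equal-sized disks, a mirror with `copies` replicas of every block, and a RAID 5 array that spends one disk's worth of space on parity; the function names are illustrative only.

```python
def usable_fraction(scheme, num_disks, copies=2):
    """Fraction of raw capacity left for data (equal-sized disks assumed)."""
    if scheme == "mirror":
        return 1 / copies
    if scheme == "raid5":
        return (num_disks - 1) / num_disks
    raise ValueError(f"unknown scheme: {scheme}")


def small_write_ios(scheme, copies=2):
    """Physical I/Os needed for one single-block logical write."""
    if scheme == "mirror":
        return copies      # one write per copy
    if scheme == "raid5":
        return 4           # read old data + old parity, write new data + new parity
    raise ValueError(f"unknown scheme: {scheme}")


for scheme in ("mirror", "raid5"):
    print(scheme, usable_fraction(scheme, num_disks=4), small_write_ios(scheme))
```

With 4 disks, a 2-way mirror keeps half the raw capacity while RAID 5 keeps three quarters, at the cost of twice as many I/Os per small write.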
- File System:
- 2 abstractions
- File:
- An array of bytes of some size that can be read, written, grown, deleted, …
- FS doesn’t care about file contents
- Has a low-level name (inode number)
- Directory
- List of files, directories
- Map “human readable” name => low-level name
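
The two abstractions can be modeled as a toy in-memory structure. This is a sketch under simplifying assumptions, not how any real file system lays things out on disk: files are byte arrays keyed by inode number, directories are dictionaries from human-readable names to inode numbers, and the class `TinyFS` and its methods are invented for this example.

```python
class TinyFS:
    """Toy model: files are byte arrays, directories map names to inode numbers."""

    ROOT = 0  # inode number of the root directory (an assumption of this sketch)

    def __init__(self):
        self.files = {}                # inode number -> bytearray
        self.dirs = {self.ROOT: {}}    # inode number -> {name: inode number}
        self.next_inode = 1

    def create(self, parent, name, is_dir=False):
        """Allocate a new inode and link it under `name` in directory `parent`."""
        inode = self.next_inode
        self.next_inode += 1
        if is_dir:
            self.dirs[inode] = {}
        else:
            self.files[inode] = bytearray()
        self.dirs[parent][name] = inode
        return inode

    def lookup(self, path):
        """Resolve a human-readable path to its low-level name (inode number)."""
        inode = self.ROOT
        for name in path.strip("/").split("/"):
            if name:
                inode = self.dirs[inode][name]
        return inode

    def write(self, inode, offset, data):
        """Write bytes at `offset`, growing the file with zeros if needed."""
        buf = self.files[inode]
        end = offset + len(data)
        if end > len(buf):
            buf.extend(b"\0" * (end - len(buf)))
        buf[offset:end] = data

    def read(self, inode, offset, length):
        """Return up to `length` bytes starting at `offset`."""
        return bytes(self.files[inode][offset:offset + length])


fs = TinyFS()
home = fs.create(TinyFS.ROOT, "home", is_dir=True)
notes = fs.create(home, "notes.txt")
fs.write(notes, 0, b"hello")
assert fs.lookup("/home/notes.txt") == notes
assert fs.read(notes, 0, 5) == b"hello"
```

Note that `write` and `read` never interpret the bytes, which mirrors the point that the file system does not care about file contents.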