Brief

  • Hard Drive: today
  • Later: Flash-based SSD
  • RAID: (Redundant Array of Independent Drives)
  • Collection => faster, more reliable

Hard Drive

  • Persistent
  • Storage
  • $T_{i/o}=T{seek}+T{rotate}+T{transfer}$
    • $T_{seek}$: Move arm to correct track
    • $T_{rotate}$: Wait
  • Performance
    • Random, small I/Os:
      • Really bad (spend most time in seek rotate)
    • Sequential, large I/Os:
      • better perf (most time is spent in transfer)

Disk Scheduling:

  • Disk: queue of request
    • How to schedule them?
    • What order?
  • Strategies:
    • FIFO (not so good)
    • SSTF: Shortest seek time first
      • Variants: SCAN (elevator) or C-SCAN
    • SPTF/SATF: Shortest positioning (access) time first
      • (Accounts for seek & rotate)
      • Not “optimal” but used in practice
    • Problem
      • Starvation
    • Solutions Bounded SATF
      • Window of requests (some fixed number)
      • Services all requests in window before subsequent requests

RAID

  • Evaluate: Different Approaches to RAID organization

    • Performance
    • Reliability
    • Capacity
  • RAID levels (approaches to RAID)

    • RAID level 0 (Striping or JBOD - just bunch of disks)

      • No redundancy
      • Performance -> good
        • Random I/O performance:
          • 1 drive -> R MB/s
          • N drives -> N * R MB/s
        • Sequential I/O performance:
          • 1 drive -> S MB/s
          • N drives -> N * S MB/s
      • Capacity
        • Useful capacity
          • 1 drive -> D GB of capacity
          • N drives -> N * D GB
      • Reliability
        • 1 failure -> lose data
    • RAID level 1 (Mirroring, 2-way [2-copies])

      • No redundancy
      • Performance
        • Random read I/O performance:
          • 1 drive -> R MB/s
          • N drives -> N * R MB/s
        • Random write I/O performance:
          • 1 drive -> R MB/s
          • N drives -> N/2 * R MB/s
        • Sequential I/O performance:
          • 1 drive -> S MB/s
          • read N drives -> ?????
          • write N drives -> N/2 * S MB/s
      • Capacity
        • Useful capacity
          • 1 drive -> D GB of capacity
          • N drives -> N/2 * D GB
      • Reliability
        • tolerate 1 failure for sure
        • N drives -> if lucky, tolerate up to N/2
    • RAID level 4 (Parity)

      • Data drives and a Parity Drive
      Data0 Data1 Data2 Data3     Parity
      0     1     1     0         0      [stripe 0]
      1     0     0     0         1      (even # of 1s in each stripe)