Today

  • File Systems implementation

Static

  • on-disk structures
----------------------------------
| (1) | inode |       Data       |
----------------------------------
(1) superblock, inode bitmap, data bitmap

Performance Problem

  • Example: new file creation (1 block)
  • creat("/foo")
Reads    Writes
x        x         inode bitmap
x        x         data  bitmap
x        x         inode (foo)
x        x         inode (root dir)
x        x         data  (root dir)
         x         data (foo)
  • lots of small, random I/O (Bad)
  • Solutions
    • caching (reads)
    • write buffering (writes)
      • doesn’t remove write in general

Dynamic: Log-structured File System

Disk
---------------------------------------
|write|write|write|...................|
---------------------------------------

Attempt #1

  • buffer updates in memory =>segment (~ 1 MB)
  • When segment fills, write to end of log
Creating /foo
---------------------------------
| Data  | Foo's | Data  | Dir   |
| (foo) | Inode | (dir) | inode |
---------------------------------
   ^        |       ^         |
   |--------|       |---------|

Problem: Read a File?

  •   fd = open("/foo", O_RDONLY);
      read(fd, buffer, 4KB);
    
    • read root inode
    • read root data (look for “foo”)
    • read inode of foo
    • read data foo
  • How to find inode?
    • scan disk (SLOW)
    • inode map (array)
      • inode#: index
      • contents: address of latest version of inode
    • write inode map (pieces) => log too
      -----------------------------------------------------
      | CR | Data  | Foo's | Data  | Dir   | Piece of  | Piece|
      |    | (foo) | Inode | (dir) | inode | inode map | (dir)|
      ----------------------------------------------------
      
    • once in a while update checkpoint region (beginning of disk)

Problem Garbage

  • New versions of inodes, data, etc., written
  • Garbage Collection
    • reclaim disk space
    • Problem
      • How to determine what is live/dead?
      • how to reuse space efficiently?
  • Efficient GC
    • Read N segments (with fragmentation)
    • Write M segments (W/ live) to the end of block