CS537

CS537 12/14

Today
    Flash-based SSD
        Solid State => Circuitry
        persistent storage
        basic characteristics
        from “raw” flash => storage device

Basics
    NAND Flash
        “trap” charge, encode bit(s)
    Bank/Chip
        operations
            read (page) (~2KB, 4KB)
                ---------------------------------
                |     |page|     |     |
                ---------------------------------
                |             block             |
            erase (block) (~256KB)
            program (page)
        to write
            erase entire block
            then, you can program each page within block exactly once
            if you want to overwrite, must erase/program again

Performance
    read: 10s of microseconds
        much faster than hard drive
    erase: a few milliseconds
    program: 100s of microseconds

Reliability
    “wear out”
        once you erase/program a block “too many” times, it stops working
        “too many”: 10k ~ 100k times

SSD: Block-level Device
    API: read, write block
        map to read, erase/program
    Parallel storage device
        |------------------------
        | ------------  ------  |
        | |Controller|  |DRAM|  |
        | ------------  ------  |
        | ---  ---              |
        | |F|  |F| ....
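
The erase/program rules above can be sketched as a toy model of one flash block. This is only an illustration: the class name `FlashBlock`, the 64 pages per block (roughly 256KB / 4KB), and the wear limit are assumptions chosen to match the numbers in the notes, not a real device interface.

```python
class FlashBlock:
    """Toy model of one NAND flash block (hypothetical, not a real driver API)."""

    def __init__(self, pages_per_block=64, max_erases=10_000):
        # None means "erased": the page may be programmed exactly once.
        self.pages = [None] * pages_per_block
        self.erase_count = 0
        self.max_erases = max_erases  # assumed wear limit (notes: 10k ~ 100k)

    def erase(self):
        """Erase the whole block; this is what wears the block out."""
        if self.erase_count >= self.max_erases:
            raise RuntimeError("block worn out")
        self.pages = [None] * len(self.pages)
        self.erase_count += 1

    def program(self, page_no, data):
        """Program one erased page; a second program without erase fails."""
        if self.pages[page_no] is not None:
            raise ValueError("page already programmed; erase the block first")
        self.pages[page_no] = data

    def read(self, page_no):
        return self.pages[page_no]

    def overwrite(self, page_no, data):
        """Overwrite = save live pages, erase the block, program them again."""
        live = list(self.pages)
        live[page_no] = data
        self.erase()
        for i, contents in enumerate(live):
            if contents is not None:
                self.program(i, contents)
```

Note how a one-page overwrite costs a full block erase plus a re-program of every live page, which is why the SSD controller has to map the block-level read/write API onto these operations carefully.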

CS537 12/2

Today
    File Systems implementation

Static on-disk structures
    ----------------------------------
    | (1) |  inode  |      Data      |
    ----------------------------------
    (1) superblock, inode bitmap, data bitmap

Performance Problem
    Example: new file creation (1 block)
        creat("/foo")
                              Reads   Writes
            inode bitmap        x       x
            data bitmap         x       x
            inode (foo)         x       x
            inode (root dir)    x       x
            data (root dir)     x       x
            data (foo)                  x
    lots of small, random I/O (Bad)

Solutions
    caching (reads)
    write buffering (writes)
        doesn’t remove writes in general

Dynamic: Log-structured File System
    Disk
        ---------------------------------------
        |write|write|write|....
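
A minimal sketch of the write-buffering and log idea, assuming the point of the log is to collect many small updates in memory and write them out together, sequentially, at the end of the disk. The class `LogStructuredDisk`, its field names, and the segment size are hypothetical; real log-structured file systems also need an inode map and cleaning, which are left out here.

```python
class LogStructuredDisk:
    """Toy append-only disk: small writes are buffered, then flushed as one segment."""

    def __init__(self, segment_blocks=4):
        self.disk = []                 # the log: blocks are only ever appended
        self.buffer = []               # pending (name, data) updates in memory
        self.segment_blocks = segment_blocks
        self.location = {}             # name -> index of the latest copy on disk
        self.io_count = 0              # number of sequential segment writes issued

    def write(self, name, data):
        """Buffer an update; flush once a full segment has accumulated."""
        self.buffer.append((name, data))
        if len(self.buffer) >= self.segment_blocks:
            self.flush()

    def flush(self):
        """Write all buffered updates as one sequential segment."""
        if not self.buffer:
            return
        for name, data in self.buffer:
            self.location[name] = len(self.disk)  # newest copy wins
            self.disk.append(data)
        self.buffer = []
        self.io_count += 1

    def read(self, name):
        """Return the newest version: check the buffer first, then the log."""
        for buffered_name, data in reversed(self.buffer):
            if buffered_name == name:
                return data
        return self.disk[self.location[name]]
```

With a segment size of 6, the six writes of the creat("/foo") example would reach the disk as a single sequential I/O instead of six small random ones.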

CS537 12/7

Today
    Crashes
        What? Why for FS?

Crash (Unexpected interruption)
    Power loss
    OS bug
    hard reset

Example
    file append -> file exists, add block to it
    File System Image
        ---------------------------------------------------
        |S|IB|DB|    inodes    |       Data blocks        |
        ---------------------------------------------------
        S: Super block
        IB: Inode bitmap
        DB: Data bitmap
    Need to change: DB, Inode, Data block
    Possible write orderings (time ------>; (n) marks a crash point)
        D  (1)  I  (2)  DB
        D  (3)  DB (4)  I
        I  (5)  DB (6)  D
        I       D       DB
        DB      I       D
        DB      D       I
    Result of crash at
        (1) lose data
        (2) inconsistency
        (3) lose data
        (4) inconsistency (Space leak)
        (5) inconsistency
        (6) consistent (garbage)

Solution
    Lazy: File system checker
    Eager: Write-ahead logging (journaling)

File System Checker (fsck)
    run checker after reboot (crash)
    scan entire file system -> SLOW
    find inconsistencies
    fix them

Write-ahead Logging (WAL) [Journaling]
    Idea:
        Before update, record some info
        Then do update
        If crash, use info to recover
        --------------------------------------------
        |S|Journal|IB|DB|   inodes   | Data blocks |
        --------------------------------------------
    Protocol for update
        Assumptions:
            512-byte sector write is atomic (all or none)
            if many writes are issued, they can happen in any order
        Example: File append (update DB, I, D)
            (memory) |
            --------------------------------------------------
            (disk)   | Journal: Tb|DB|I|D|Te
            Tb: transaction begin
            Te: transaction end
        Step 1 (split)
            1a: write Tb, contents (wait to complete)
            1b: write Te (512 bytes)
        Step 2
            update in place (checkpointing)
    Problem: too slow (1/2 speed for data-intensive workloads, since data is written twice)
    Metadata-only Journaling
        write data only once
        how/when to write data
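
The journaling protocol above (1a: Tb and contents, 1b: Te, then checkpoint) can be sketched as follows. This is an illustrative model only: `JournaledDisk`, its method names, and the `crash_before_commit` flag are hypothetical, and a Python list stands in for the on-disk journal. Recovery here replays only transactions whose Te record reached the journal.

```python
class JournaledDisk:
    """Toy write-ahead log: journal first, then update in place."""

    def __init__(self):
        self.blocks = {}    # in-place locations: name -> contents
        self.journal = []   # journal records, in the order they were written

    def journal_write(self, txid, updates, crash_before_commit=False):
        """Step 1: log the transaction. Returns True if it committed."""
        # Step 1a: write Tb and the new contents, then wait for completion.
        self.journal.append(("Tb", txid))
        for name, data in updates.items():
            self.journal.append(("data", txid, name, data))
        if crash_before_commit:
            return False    # simulated crash: Te never reaches the journal
        # Step 1b: a single atomic sector write of Te commits the transaction.
        self.journal.append(("Te", txid))
        return True

    def checkpoint(self, updates):
        """Step 2: update the blocks in place."""
        self.blocks.update(updates)

    def recover(self):
        """After a crash: replay committed transactions, drop incomplete ones."""
        pending = {}
        for record in self.journal:
            kind, txid = record[0], record[1]
            if kind == "Tb":
                pending[txid] = {}
            elif kind == "data":
                pending[txid][record[2]] = record[3]
            elif kind == "Te":
                self.blocks.update(pending.pop(txid))
        self.journal = []   # journal space can be reused after recovery
```

A transaction that crashed between 1a and 1b has no Te, so recovery ignores it and the file system stays in its old, consistent state.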

CS537 12/9

Distributed File Systems
    Sun Network File System (NFS)
        Server crash recovery
        design of network protocol

Distributed Systems
    Client/Server
        One Server
        Replicated Servers
        Many servers
    Different Than “local” System?
        machine crash
        network loses packets
        performance
            latency, bandwidth
        resource sharing
            policies

NFS Basics
    Protocol
        from protocol to FS API
        idempotency: key to failure handling
        performance: caching

Server Crashes: How to Handle?
    lead to unavailability
    key idea: when there is a problem => retry

File Handle
    3 parts: <volume #, inode #, generation #>
    volume: which fs?...
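
A small sketch of why idempotency makes "retry on any problem" safe. The names `FileHandle`, `FlakyServer`, and `read_with_retry` are hypothetical, and a raised `TimeoutError` stands in for a lost packet or a crashed server. The read request carries the file handle plus an explicit offset and count, so repeating it gives the same answer no matter how many earlier copies were lost.

```python
from collections import namedtuple

# The three parts of a file handle, as listed in the notes.
FileHandle = namedtuple("FileHandle", ["volume", "inode", "generation"])


class FlakyServer:
    """Stand-in server that fails to answer the first `drops` requests."""

    def __init__(self, files, drops=0):
        self.files = files          # FileHandle -> bytes
        self.drops = drops
        self.requests_seen = 0

    def read(self, fh, offset, count):
        self.requests_seen += 1
        if self.drops > 0:
            self.drops -= 1
            raise TimeoutError("no reply")   # lost packet or server down
        return self.files[fh][offset:offset + count]


def read_with_retry(server, fh, offset, count, max_tries=5):
    """Client side: on a timeout, simply send the same request again."""
    for _ in range(max_tries):
        try:
            return server.read(fh, offset, count)
        except TimeoutError:
            continue
    raise TimeoutError("server unavailable")
```

The client does not need to know whether the request, the reply, or the server itself was lost: the recovery action is the same in every case.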

CS537 9/16

CPU Virtualization
    Mechanisms: How
    Policies: Scheduling

How to “Time-Share”
    Problem: What if process P wants to do something restricted?
        See prev Note
    Problem: Process may run for a long time
        Q: How does OS regain control of CPU?
        A: timer interrupt
            @Boot: set this up
            interrupt the CPU every X milliseconds
            Why not shorter?
                A: Context switch overhead, cache misses
    OS responsibility
        for each process: track current state
            Running
            Ready (Not running, but could be)
    Problem: What if Process does something “Slow”?...
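
The timer-interrupt mechanism can be illustrated with a tiny simulation. This is a sketch under assumptions: the function `time_share` is hypothetical, processes are picked in simple round-robin order (the notes do not specify a policy here), and context-switch cost is ignored. Each loop iteration moves one process from Ready to Running until it finishes or the timer fires after `slice_ms`.

```python
from collections import deque


def time_share(work, slice_ms):
    """Simulate time-sharing one CPU.

    work: dict mapping process name -> milliseconds of CPU time it needs.
    Returns a trace of (name, ms_run) entries, one per time the process ran.
    """
    ready = deque(work.items())        # queue of Ready processes
    trace = []
    while ready:
        name, remaining = ready.popleft()      # Ready -> Running
        ran = min(slice_ms, remaining)
        trace.append((name, ran))
        if remaining > ran:
            # Timer interrupt: OS regains control, Running -> Ready.
            ready.append((name, remaining - ran))
    return trace
```

For example, `time_share({"A": 25, "B": 10}, 10)` yields `[("A", 10), ("B", 10), ("A", 10), ("A", 5)]`. A smaller slice would interleave the processes more finely, but in a real system each extra switch adds context-switch overhead and cache misses.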