Today
- Crashes
- What?
- Why for FS?
Crash (Unexpected interruption)
- Power loss
- OS bug
- hard reset
- Example
- file append -> file exists add block to it
- File System Image
---------------------------------------------------
|S|IB|DB|      inodes      |     Data blocks      |
---------------------------------------------------
S: Super block
IB: Inode bitmap
DB: Data bitmap
- Need to change: DB, Inode, Data block
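- Possible write orderings (time ------>; D: data block, I: inode, DB: data bitmap; (n) marks a crash right after that write):
  D(1)   I(2)   DB
  D(3)   DB(4)  I
  I(5)   DB(6)  D
  I      D      DB
  DB     I      D
  DB     D      I
- Outcome of a crash at each marked point:
- (1) lose data
- (2) inconsistency
- (3) lose data
- (4) inconsistency (Space leak)
- (5) inconsistency
- (6) consistent (garbage)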
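The crash cases above can be enumerated mechanically. The following is a minimal sketch, not part of the original notes: the `OUTCOME` table and the `crash_outcomes` helper are hypothetical names, and the labels simply restate the outcomes listed above (the `{DB}`-only case is not numbered in the notes; it is labeled a space leak here by analogy with case (4)).

```python
from itertools import permutations

# Outcome for each set of writes that reached the disk before the crash.
# D = data block, I = inode, DB = data bitmap (labels follow the notes).
OUTCOME = {
    frozenset(): "no change",
    frozenset({"D"}): "lose data",
    frozenset({"I"}): "inconsistency",
    frozenset({"DB"}): "inconsistency (space leak)",  # assumption, see lead-in
    frozenset({"D", "I"}): "inconsistency",
    frozenset({"D", "DB"}): "inconsistency (space leak)",
    frozenset({"I", "DB"}): "consistent (garbage)",
    frozenset({"D", "I", "DB"}): "append complete",
}


def crash_outcomes(order):
    """Return [(writes_completed, outcome)] for a crash after 1 or 2 writes."""
    results = []
    for k in (1, 2):
        done = order[:k]
        results.append((done, OUTCOME[frozenset(done)]))
    return results


if __name__ == "__main__":
    for order in permutations(("D", "I", "DB")):
        for done, outcome in crash_outcomes(order):
            print(" -> ".join(order), "| crash after", ", ".join(done), "|", outcome)
```

Only the set of completed writes matters for the outcome, which is why crash points (1) and (3) give the same result.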
Solution
- Lazy: File system checker
- Eager: Write-ahead logging (journaling)
File System Checker (fsck)
- run checker after reboot (crash)
- scan entire file system -> SLOW
- find inconsistencies
- fix them
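One check a file system checker might perform is comparing the data bitmap against the blocks that inodes actually reference. The sketch below is a toy model under stated assumptions: `fsck`, the `inodes` dict (inode number -> list of block numbers), and the list-of-bools bitmap are hypothetical, and the repair policy (trust the inodes, rewrite the bitmap) is one possible choice, not necessarily what a real checker does in every case.

```python
def fsck(inodes, data_bitmap):
    """Scan every inode, find bitmap inconsistencies, and return a fixed bitmap.

    inodes: dict mapping inode number -> list of data block numbers (assumed in range)
    data_bitmap: list of bool, True = block marked in use
    Returns (problems, repaired_bitmap).
    """
    # Scan the entire file system: collect every block any inode points to.
    referenced = set()
    for blocks in inodes.values():
        referenced.update(blocks)

    problems = []
    repaired = list(data_bitmap)
    for b in range(len(data_bitmap)):
        if b in referenced and not data_bitmap[b]:
            problems.append(f"block {b} in use but marked free")
            repaired[b] = True   # fix: mark it allocated
        elif b not in referenced and data_bitmap[b]:
            problems.append(f"block {b} marked used but unreferenced (space leak)")
            repaired[b] = False  # fix: reclaim the block
    return problems, repaired


if __name__ == "__main__":
    problems, bitmap = fsck({1: [0, 2]}, [True, True, False, False])
    for p in problems:
        print(p)
    print(bitmap)
```

The loop over every inode and every bitmap entry is what makes this approach slow: the cost grows with the size of the file system, not with the amount of work in flight at the crash.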
Write-ahead Logging (WAL) [Journaling]
- Idea:
- Before update, record some info
- Then do update
- If crash use info to recover
--------------------------------------------
|S|Journal|IB|DB| inodes |  Data blocks    |
--------------------------------------------
- Protocol for update
- Assumptions:
- a 512-byte sector write is atomic (all or none)
- if many writes are issued, they may complete in any order
- Example: File append (update DB, I, D)
(memory) |
--------------------------------------------------
(disk)   | Journal: Tb|DB|I|D|Te
Tb: transaction begin
Te: transaction end
- Split step 1 (the journal write) in two, so Te cannot reach the disk before the contents
- 1a write Tb, contents (wait to complete)
- 1b write Te (512 bytes)
- Step 2
- update in place (checkpointing)
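The protocol above (1a, 1b, then checkpoint) plus recovery can be sketched with an in-memory stand-in for the disk. Everything here is a hypothetical illustration: the `Disk` class, the record tuples, and the `crash_before_te` flag are invented for the example, and a real journal would also bound its size and free space after checkpointing.

```python
class Disk:
    """Toy disk: in-place blocks plus a journal region (both survive a crash)."""

    def __init__(self):
        self.blocks = {}    # final locations: address -> contents
        self.journal = []   # journal records, in the order they reached disk


def journal_write(disk, txid, updates, crash_before_te=False):
    # Step 1a: write Tb and the block contents, then wait for completion.
    disk.journal.append(("Tb", txid))
    for addr, data in updates.items():
        disk.journal.append(("block", txid, addr, data))
    if crash_before_te:
        return False  # simulated crash: Te never reaches the disk
    # Step 1b: write Te, assumed to fit in one atomic 512-byte sector.
    disk.journal.append(("Te", txid))
    return True


def checkpoint(disk, updates):
    # Step 2: update the blocks in place.
    disk.blocks.update(updates)


def recover(disk):
    # After a crash: replay only transactions that have a Te record.
    committed = {rec[1] for rec in disk.journal if rec[0] == "Te"}
    for rec in disk.journal:
        if rec[0] == "block" and rec[1] in committed:
            _, _, addr, data = rec
            disk.blocks[addr] = data
    disk.journal.clear()  # uncommitted transactions are simply dropped


if __name__ == "__main__":
    append = {"DB": "bitmap'", "I": "inode'", "D": "new data"}

    d1 = Disk()
    journal_write(d1, 1, append)         # crash after Te, before checkpoint
    recover(d1)
    print(d1.blocks)                     # update is replayed from the journal

    d2 = Disk()
    journal_write(d2, 1, append, crash_before_te=True)
    recover(d2)
    print(d2.blocks)                     # update is discarded, disk unchanged
```

Either way the on-disk state after recovery is consistent: the append happened entirely or not at all.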
- Assumptions:
- Problem:
- too slow: every data block is written twice (journal, then in place), so about 1/2 speed for data-intensive workloads
- Metadata-only Journaling
- write data only once
- how/when to write data
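A rough count of block writes shows where the factor of two comes from and what metadata-only journaling saves. This is a back-of-the-envelope sketch with assumed costs: one block each for Tb, Te, the data bitmap, and the inode per transaction; `blocks_written` is a hypothetical helper, and it counts writes only, ignoring seeks and ordering.

```python
def blocks_written(n_data, journal_data):
    """Count block writes for appending n_data blocks in one transaction.

    Assumes (hypothetically) 1 bitmap block, 1 inode block, and 1 block
    each for the Tb and Te records.
    """
    metadata = 2                   # DB + I
    journal = 2 + metadata         # Tb + Te + journaled metadata
    if journal_data:
        journal += n_data          # data journaling: data goes to the journal too
    in_place = metadata + n_data   # final locations: metadata + data
    return journal + in_place


if __name__ == "__main__":
    for n in (1, 10, 1000):
        full = blocks_written(n, journal_data=True)
        meta = blocks_written(n, journal_data=False)
        print(n, full, meta, round(full / meta, 2))
```

Under these assumptions the counts are 2n + 6 versus n + 6, so the ratio approaches 2 as the number of data blocks grows.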