Bigtable
Wide applicability, scalability, high performance, high availability Data Model Rows Every read or write of data under a single row key is atomic Rows sorted in lexicographic order tablet: row range Columns Grouped into columns family Column key = family:qualifier Access control on family level Timestamp Store multiple versions in one cell Implementation Master assigning tablets to tablet servers detect the addition and expiration of tablet servers load balancing garbage collection Tablet location Read chubby file -> get location of the root tablet Root tablet: contains all Metadata tablet location METADATA table Row key: (tablet's table id, end row) Data: location of data tablet, other metadata Tablet Assignment Master Tablet servers create file and lock in chubby Master detect tablet alive by asking tablet server If timeout or server report lost lock Master try lock server file If lock success => chubby up, server down => delete server file server commit suicide If can’t reach chubby => master kill itself On master boot Master scan chubby and ask every server to know tablet <=> server mapping Servers split table => commit to METADATA table => notify master Tablet Serving memtable => new updates SSTable => old updates Optimizations Locality groups One SSTable for each locality group (column family) Compression Cache Bloom filter reduce SSTable access by asking if data is in that SSTable Commit-log all servers append commit log to same physical file