- Wide applicability, scalability, high performance, high availability
Data Model
Rows #
- Every read or write of data under a single row key is atomic
- Rows sorted in lexicographic order
- tablet: row range
Columns #
- Grouped into column families
- Column key = family:qualifier
- Access control on family level
Timestamp #
- Store multiple versions in one cell
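The data model above can be pictured as a sorted map from (row key, column key, timestamp) to a value. The sketch below is only an illustration of that idea: the `Table` class, its method names, and the `max_versions` parameter are hypothetical and not Bigtable's API.

```python
from collections import defaultdict


class Table:
    """Toy in-memory table: (row, 'family:qualifier', timestamp) -> value."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions  # assumed per-cell version limit
        # row key -> column key -> list of (timestamp, value), newest first
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, timestamp, value):
        versions = self.rows[row][column]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]  # keep only the newest versions

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        for ts, value in self.rows.get(row, {}).get(column, []):
            if timestamp is None or ts <= timestamp:
                return value
        return None

    def scan(self, start_row, end_row):
        """Yield row keys in lexicographic order within [start_row, end_row)."""
        for row in sorted(self.rows):
            if start_row <= row < end_row:
                yield row


table = Table()
table.put("com.cnn.www", "contents:", 1, "<html>v1")
table.put("com.cnn.www", "contents:", 2, "<html>v2")
table.put("com.example", "anchor:cnnsi.com", 1, "CNN")
latest = table.get("com.cnn.www", "contents:")
older = table.get("com.cnn.www", "contents:", timestamp=1)
```

Because rows are kept in lexicographic order, a contiguous row range such as the one passed to `scan` corresponds to what the notes call a tablet.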
Implementation
Master #
- assigning tablets to tablet servers
- detect the addition and expiration of tablet servers
- load balancing
- garbage collection
Tablet location #

- Read chubby file -> get location of the root tablet
- Root tablet: contains the locations of all METADATA tablets
- METADATA table
- Row key: (tablet's table id, end row)
- Data: location of data tablet, other metadata
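Because METADATA rows are keyed by (table id, end row) and kept sorted, finding the tablet for a given row amounts to finding the first entry whose end row is not smaller than that row. A minimal sketch, assuming an in-memory index; `MetadataIndex`, the server names, and the `"\uffff"` sentinel for the last tablet are all hypothetical:

```python
import bisect


class MetadataIndex:
    """Maps (table_id, end_row) -> tablet location, sorted by key."""

    def __init__(self, entries):
        # entries: iterable of ((table_id, end_row), location)
        self.entries = sorted(entries, key=lambda e: e[0])
        self.keys = [key for key, _ in self.entries]

    def locate(self, table_id, row):
        # First tablet of this table whose end row is >= the requested row.
        i = bisect.bisect_left(self.keys, (table_id, row))
        if i < len(self.keys) and self.keys[i][0] == table_id:
            return self.entries[i][1]
        return None  # unknown table


index = MetadataIndex([
    (("users", "g"), "ts1"),
    (("users", "p"), "ts2"),
    (("users", "\uffff"), "ts3"),  # assumed sentinel: last tablet of the table
    (("web", "\uffff"), "ts4"),
])
```

With this index, `index.locate("users", "h")` falls into the tablet that ends at `"p"`, so it resolves to `"ts2"`.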
Tablet Assignment #
- Master
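- Each tablet server creates a uniquely named file in Chubby and holds an exclusive lock on it
- Master detects whether a tablet server is alive by periodically asking it for the status of its lock
- If the request times out or the server reports that it lost its lock
- Master tries to acquire the lock on that server's file
- If the lock succeeds => Chubby is up, server is down or cut off from Chubby => master deletes the server file and reassigns its tablets
- The tablet server kills itself once its file no longer exists
- If the master can't reach Chubby => master kills itself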
- On master boot
- Master scans Chubby for live servers and asks each one which tablets it serves, to rebuild the tablet <=> server mapping
- Tablet servers split tablets => commit the split to the METADATA table => notify master
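The liveness check described above can be sketched as a single decision function. This is only an illustration under stated assumptions: `FakeChubby` is a hypothetical in-memory stand-in for a lock service, and `check_server` with its return strings is an invented name, not part of Bigtable or Chubby.

```python
class FakeChubby:
    """Hypothetical stand-in for a lock service: file name -> lock holder."""

    def __init__(self):
        self.files = {}
        self.reachable = True

    def create_and_lock(self, name, holder):
        self.files[name] = holder

    def try_lock(self, name, holder):
        if not self.reachable:
            raise ConnectionError("lock service unreachable")
        if name in self.files and self.files[name] is None:
            self.files[name] = holder
            return True
        return False

    def release(self, name):
        if name in self.files:
            self.files[name] = None

    def delete(self, name):
        self.files.pop(name, None)


def check_server(chubby, server_file, server_responds_ok, unassigned, tablets):
    """One round of the master's check for a single tablet server."""
    if server_responds_ok:
        return "alive"
    try:
        got_lock = chubby.try_lock(server_file, "master")
    except ConnectionError:
        return "master_exit"    # master cannot reach the lock service
    if not got_lock:
        return "retry"          # server still holds its lock; ask again later
    chubby.delete(server_file)  # the server can never serve again
    unassigned.extend(tablets)  # its tablets return to the unassigned pool
    return "reassigned"
```

The key point is that the master only reassigns tablets after it has taken the server's lock itself, which shows that the lock service is up and the server no longer holds the lock.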
Tablet Serving #

- memtable => new updates
- SSTable => old updates
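A rough sketch of how the memtable and SSTables interact on reads and writes. The `Tablet` class, the `memtable_limit` threshold, and the list used as a commit log are simplifying assumptions for illustration; real SSTables are on-disk files with block indexes, not tuples.

```python
class Tablet:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # stand-in for the on-disk redo log
        self.memtable = {}     # recent updates, kept in memory
        self.sstables = []     # immutable sorted runs, newest last
        self.memtable_limit = memtable_limit  # assumed flush threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # log first, then apply
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self):
        # Freeze the memtable into a sorted, immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        # Newest data wins: memtable first, then SSTables newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None
```

Reads merge both sources, so a key rewritten after a flush is served from the memtable while untouched keys come from an older SSTable.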
Optimizations #
- Locality groups
- One SSTable per locality group in each tablet (a locality group is a client-defined set of column families)
- Compression
- Cache
- Bloom filter
- reduces SSTable accesses by first asking whether an SSTable might contain the requested data
- Commit-log
- each tablet server appends the mutations of all its tablets to a single physical log file
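To make the Bloom filter bullet concrete, here is a minimal sketch of the data structure. The sizes, the use of SHA-256 with a per-hash prefix, and the key format are illustrative assumptions, not what Bigtable uses internally.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


sstable_filter = BloomFilter()
sstable_filter.add("com.cnn.www/contents:")
```

A read would consult one such filter per SSTable and skip any SSTable for which `might_contain` returns `False`. A `True` answer can be a false positive, so the SSTable must still be checked in that case.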