Strong Consistency Guarantees
Interface
- val = query(objId, opts)
- update(objId, newVal, opts)
Protocol

- Reply: sent by the tail
- Query: processed at the tail
- Update: directed to the head, then chained down to the tail
    - computed at the head; the resulting state is sent along the chain to the tail
    - so update functions can be non-deterministic
- Hist_objID = Hist_objID of the tail
- Pending_objID = requests received but not yet processed by the tail
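A minimal sketch of the message flow above, with invented names: updates enter at the head and propagate node by node, queries go straight to the tail.

```python
# Minimal sketch of chain-replication message flow (all names hypothetical).
# Updates enter at the head and propagate node by node to the tail; queries
# go straight to the tail, which always answers from fully-replicated state.

class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}        # obj_id -> value (Hist in the notes)
        self.successor = None  # next node toward the tail

    def handle_update(self, obj_id, new_val):
        # The head computes the new state once; downstream nodes just apply
        # it, which is why non-deterministic update functions stay consistent.
        self.store[obj_id] = new_val
        if self.successor is not None:
            return self.successor.handle_update(obj_id, new_val)
        return ("ack", obj_id, new_val)   # the tail sends the reply

    def handle_query(self, obj_id):
        return self.store.get(obj_id)     # only ever invoked on the tail

# Build a 3-node chain: head -> mid -> tail.
head, mid, tail = Node("head"), Node("mid"), Node("tail")
head.successor, mid.successor = mid, tail

head.handle_update("x", 42)   # update enters at the head
print(tail.handle_query("x")) # query served at the tail -> 42
```

Because every update flows through the full chain before the tail replies, a query at the tail can never observe a value that later disappears.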
Failure
- Master service
    - never fails (replicated via [[Paxos]])
    - reconfigures the chain on failure
vs. Primary/Backup
Downtime
- Chain
    - Head failure
        - queries uninterrupted
        - updates unavailable for 2 message delays
    - Middle failure
        - queries and updates uninterrupted
        - ongoing updates delayed by up to 4 message delays
        - not visible to clients
    - Tail failure
        - queries and updates unavailable for 2 message delays
- Primary/Backup
    - Primary failure
        - queries and updates unavailable for 5 message delays
            - master broadcasts to all backups
            - each backup responds with the number of updates it has processed
            - master broadcasts the identity of the new primary
            - new primary forwards all missing updates to the other backups
            - master notifies clients
    - Backup failure
        - queries uninterrupted
        - updates delayed by at most 1 message delay
            - master notifies the primary that no ACK is needed from the failed backup
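The primary-failure steps above can be sketched as follows; names and the list-of-updates representation are invented for illustration.

```python
# Hedged sketch of the primary/backup failover steps (names invented).
# The master picks the most up-to-date backup as the new primary; that
# primary then forwards any updates the other backups are missing.

def fail_over(backups):
    """backups: dict of name -> ordered list of updates applied so far."""
    # Master asks every backup how many updates it has processed.
    counts = {name: len(updates) for name, updates in backups.items()}
    # The backup with the most updates becomes the new primary.
    new_primary = max(counts, key=counts.get)
    # New primary forwards all missing updates to the others.
    log = backups[new_primary]
    for updates in backups.values():
        updates.extend(log[len(updates):])
    # The master would now notify clients of the new primary.
    return new_primary

backups = {"b1": ["u1", "u2"], "b2": ["u1", "u2", "u3"], "b3": ["u1"]}
print(fail_over(backups))  # "b2" has the longest log
# afterwards every backup holds ["u1", "u2", "u3"]
```

The back-and-forth (ask, respond, announce, catch up, notify) is what accounts for the 5 message delays in the comparison above.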
In-class
Primary-Backup Replication
- Solves
    - ordering of operations
    - non-determinism
- Leaves open: failure handling
- Write: sent to all backups
- Read: served only by the primary
    - reads wait for concurrent writes to complete
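A toy sketch of those rules, with invented names; the synchronous calls stand in for network round trips:

```python
# Sketch of the primary/backup rules above (hypothetical names): writes go
# to every backup before being acknowledged; reads are served only by the
# primary and must wait out any write still in flight, so they never return
# state a backup has not yet seen.

class Primary:
    def __init__(self, backups):
        self.store = {}
        self.backups = backups          # each backup is a plain dict here
        self.in_flight = 0              # writes not yet fully acknowledged

    def write(self, key, val):
        self.in_flight += 1
        for b in self.backups:          # send to all backups
            b[key] = val                # synchronous stand-in for an ack
        self.store[key] = val
        self.in_flight -= 1             # all backups acked

    def read(self, key):
        # In a real system this would block; here we just assert the rule.
        assert self.in_flight == 0, "reads queue behind concurrent writes"
        return self.store.get(key)

p = Primary([{}, {}])
p.write("x", 1)
print(p.read("x"))  # 1
```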
Chain-Replication
Goal
- high availability
- high throughput
- strong consistency
- handles fail-stop failures
- low mean-time-to-repair (fast recovery)
- easy to add nodes
Non-goals
- handling slow or misbehaving (Byzantine) nodes
- wide-area networks (WAN)
Basic Operations
- val = query(obj-id): idempotent
- update(obj-id, newVal): not idempotent
    - can handle more complex update functions
    - What to do on a lost reply?
        - query the object to check whether the update was applied
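Since update is not idempotent, blindly retrying can apply it twice; a sketch of the query-before-retry rule above, with an append-style update and invented names:

```python
# Sketch of the "query on a lost reply" rule (all names hypothetical):
# update appends, so it is not idempotent; after a lost ack the client
# queries the object before resending, to avoid a duplicate.

store = {"log": []}

def update(obj_id, entry):
    store[obj_id].append(entry)        # not idempotent: repeating duplicates

def query(obj_id):
    return list(store[obj_id])

def client_write(obj_id, entry, reply_lost=False):
    update(obj_id, entry)
    if reply_lost:
        # Ack never arrived: check whether the update was already applied
        # before blindly resending it.
        if entry not in query(obj_id):
            update(obj_id, entry)

client_write("log", "e1")
client_write("log", "e2", reply_lost=True)   # lost ack, but no duplicate
print(store["log"])  # ['e1', 'e2']
```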
Properties
- Tail: serves queries
- Head: applies updates (sends the state diff down the chain)
- every node stores a full copy of the data
- strong consistency guarantees
- extra work for clients
    - track both head and tail
    - retry if no ack arrives
Latency
|       | primary-backup                  | chain                             |
|-------|---------------------------------|-----------------------------------|
| write | max(latency of each backup)     | sum of latencies along the chain  |
| read  | queued behind concurrent writes | low (served directly at the tail) |
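A worked instance of the write row, with made-up per-node latencies: the primary broadcasts in parallel, so its write latency is the slowest backup, while the chain forwards hop by hop, so latencies add up.

```python
# Worked example of the latency table (made-up per-node latencies, in ms).

latencies = [5, 20, 8]                  # one entry per backup / chain hop

primary_backup_write = max(latencies)   # parallel broadcast to all backups
chain_write = sum(latencies)            # sequential forwarding down the chain

print(primary_backup_write)  # 20
print(chain_write)           # 33
```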
Failure
- Who detects it?
    - separate master server
- How?
    - heartbeats/pings
- Result
    - re-arrange the chain to account for the failed server
    - notify clients of the new head/tail
- Cases
    - head, middle, or tail failure
Head Node Fails
- Master notifies S that it is the new head
- Master notifies clients that S is the new head
- Updates seen only by the old head are lost (safe: they were never acked)
Tail Node Fails
- Master notifies S that it is the new tail
- Master notifies clients
Inner Node
- S- is notified that S+ is its new successor
- naively, S- would need to resend every past update
- Optimization
    - sent list: add every update forwarded downstream
    - acked: remove the update from the list once the tail acks it
Steps for Protocol
- Master notifies S+ and asks for the latest id in its history
- S+ responds with id = x
- Master informs S- that S+ is its successor and that id = x
- S- sends the updates after x to S+
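The sent-list optimization and the recovery steps above fit together as in this sketch; all names and the integer update ids are invented:

```python
# Sketch of inner-node recovery (hypothetical names). Each node keeps a
# sent list of updates forwarded downstream but not yet acked by the tail;
# when its successor changes, S- resends only the sent-list entries newer
# than the latest id (x) in the new successor S+'s history.

class ChainNode:
    def __init__(self):
        self.history = []      # ids of updates applied, in order
        self.sent = []         # ids forwarded downstream, not yet acked

    def apply(self, update_id):
        self.history.append(update_id)
        self.sent.append(update_id)       # forwarded toward the tail

    def on_ack(self, update_id):
        self.sent.remove(update_id)       # tail has it; safe to drop

    def latest_id(self):
        return self.history[-1] if self.history else -1

def reconnect(s_minus, s_plus):
    # Master asks S+ for its latest id x, then tells S- to resend the rest.
    x = s_plus.latest_id()
    return [u for u in s_minus.sent if u > x]   # updates S+ is missing

a, c = ChainNode(), ChainNode()
for i in range(5):
    a.apply(i)
c.history = [0, 1, 2]            # S+ saw only the first three updates
a.on_ack(0); a.on_ack(1)         # tail acked 0 and 1 earlier
print(reconnect(a, c))           # [3, 4]
```

Without the sent list, S- would have to replay its entire history; with it, only the in-flight suffix crosses the wire.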
Add New Nodes
- When can T+ become the new tail?
    - sent(T) + hist(T+) = hist(T)
- What should T do when it receives queries meanwhile?
    - deny the request, or
    - forward it to T+
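The handoff condition above can be checked mechanically; a sketch with invented names, treating histories and the sent list as sets of update ids:

```python
# Sketch of the tail-handoff condition (invented names): T+ may become the
# new tail once everything in T's history has either already reached T+ or
# is still in T's sent list on its way there.

def can_become_tail(hist_t, hist_t_plus, sent_t):
    # sent(T) + hist(T+) = hist(T): every update T knows about is accounted for.
    return set(hist_t) == set(hist_t_plus) | set(sent_t)

hist_t = [1, 2, 3, 4]
print(can_become_tail(hist_t, [1, 2], [3, 4]))  # True: the rest is in flight
print(can_become_tail(hist_t, [1, 2], [4]))     # False: update 3 unaccounted for
```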
Variant: Weak Chain
- Idea: read from any node
- Benefits?
    - better read throughput
    - scales with the number of nodes
    - good performance when the workload is read-heavy
- Challenges?
    - weak consistency
- To maintain strong consistency
    - check the sent list: have the object's updates been processed by the tail?
    - if not
        - deny the request and have the client retry
        - re-route the query to the tail
        - save old values and serve those (expensive)
        - wait until the updates are acked
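A sketch of the dirty-read check above, showing the re-route option; the node/tail stores and sent-list layout are invented:

```python
# Sketch of the strong-consistency check for the weak-chain variant
# (hypothetical names): any node may serve reads, but if the object still
# has updates in the node's sent list (not yet processed by the tail), the
# node must not answer from its own, possibly-uncommitted copy.

def read_at(node_store, sent_list, obj_id, tail_store):
    if any(pending_obj == obj_id for pending_obj, _ in sent_list):
        # Dirty: deny (client retries) or re-route the query to the tail.
        return tail_store[obj_id]          # re-route option shown here
    return node_store[obj_id]              # clean: answer locally

node = {"x": 1, "y": 9}                    # mid-chain replica
tail = {"x": 1, "y": 7}                    # tail has not yet applied y = 9
sent = [("y", 9)]                          # y's update is still in flight
print(read_at(node, sent, "x", tail))      # 1  (clean, served locally)
print(read_at(node, sent, "y", tail))      # 7  (dirty, routed to the tail)
```

Objects with no in-flight updates get the read-throughput benefit; only dirty objects pay the detour to the tail.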
Best Suited for
- low latency
- all nodes operating well
- homogeneous performance