Sustain ⌊(n − 1)/3⌋ failures (n = 3f + 1 replicas tolerate f faults)

  • f replicas may not respond, so must make progress after n − f replies
  • the f non-responders might be correct but slow
    • so up to f of the n − f collected replies could be from faulty replicas
  • need the non-faulty replies (n − 2f) to outnumber the faulty ones (f), so n ≥ 3f + 1
    • e.g., n = 4, f = 1: proceed after 3 replies; at worst 1 of them is faulty, and the 2 correct ones outnumber it
  • Provides safety and liveness
  • Safety:
    • all non-faulty replicas agree on a total order for the execution of requests, despite failures

System Model

  • Network
    • drop, delay, out of order, duplicate
  • Faulty nodes may behave arbitrarily (Byzantine)
  • Independent node failures
  • D(m) = digest of m
  • ⟨m⟩σ_i = message m signed by node i (see the sketch after this list)
  • Adversary
    • can
      • coordinate faulty nodes
      • delay correct nodes
    • cannot
      • delay correct nodes indefinitely
      • break cryptography (forge signatures or find digest collisions), e.g., no quantum computers
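
A minimal sketch of these primitives in Python (illustrative only: D(m) as a SHA-256 digest, and MACs standing in for signatures, as later PBFT variants do; the key table and all names are hypothetical):

```python
import hashlib
import hmac

def digest(m: bytes) -> str:
    """D(m): collision-resistant digest of message m."""
    return hashlib.sha256(m).hexdigest()

# Hypothetical pairwise secret keys; a real deployment would use
# public-key signatures or per-session keys.
KEYS = {("c", "p"): b"shared-secret-c-p"}

def sign(m: bytes, sender: str, receiver: str) -> str:
    """⟨m⟩σ_i as a MAC: authenticate m from sender to receiver."""
    return hmac.new(KEYS[(sender, receiver)], m, hashlib.sha256).hexdigest()

def verify(m: bytes, tag: str, sender: str, receiver: str) -> bool:
    """Receiver-side check that the tag matches."""
    return hmac.compare_digest(sign(m, sender, receiver), tag)
```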

Algorithm

Terms

  • replicas move through a succession of views
  • the primary of a view is p = v mod |R|
    • p: primary replica
    • v: view number
    • |R|: total number of replicas
  • f: max # of replicas that may be faulty
  • Replica: any of the servers
  • Primary (one per view), Backups (all the other replicas)

Process

  1. Client sends request to the primary
  2. Primary multicasts it to all backups
  3. Replicas execute the request and reply to the client
  4. Client waits for f+1 identical replies

Client

  • Client c sends ⟨REQUEST, o, t, c⟩σ_c to the primary
    • t: timestamp
    • o: operation to be executed
  • Primary multicasts it to the backups
  • Replicas reply ⟨REPLY, v, t, c, i, r⟩σ_i
    • r: result
    • i: replica number
    • v: view number
  • Client waits for f+1 replies with the same t and r (see the sketch after this list)
  • If timeout
    • Client broadcasts the request to all replicas
    • If a replica has already processed it
      • re-send the reply (replicas remember the last reply sent to each client)
    • If not processed & not the primary
      • relay the request to the primary; if the primary stays silent, backups eventually suspect it (view change)
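
A minimal sketch of the client's reply collection, assuming a `Reply` record with the fields above (all names hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reply:
    v: int   # view number
    t: int   # client timestamp echoed back
    c: str   # client id
    i: int   # replica number
    r: str   # result

def accept_result(replies: list[Reply], t: int, f: int) -> str | None:
    """Return a result once f+1 replicas with distinct ids report the
    same (t, r); otherwise None (keep waiting, retransmit on timeout)."""
    voters: dict[str, set[int]] = {}
    for rep in replies:
        if rep.t == t:
            voters.setdefault(rep.r, set()).add(rep.i)
    for result, ids in voters.items():
        if len(ids) >= f + 1:    # at least one voter is non-faulty
            return result
    return None
```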

Normal Case

Replica state

  • message log
  • current view

Three-phase atomic broadcast

Pre-prepare
  • Primary multicasts to all backups and writes to its log ⟨⟨PRE-PREPARE, v, n, d⟩σ_p, m⟩
    • m: the client’s full request, piggybacked
      • not part of the signed content; d stands in for it, so large requests can be sent separately
    • d: m’s digest D(m)
    • n: sequence number assigned by the primary
  • Backup accepts if (see the sketch after this list)
    • signature matches and d = D(m)
    • it is in view v
    • it has not accepted a PRE-PREPARE for the same v, n with a different digest
    • h < n < H
      • prevents a faulty primary from exhausting the seq-number space by choosing a huge one
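
A sketch of the backup's acceptance check under the conditions above (field names hypothetical; `digest` as in the earlier sketch):

```python
def accept_preprepare(state, v: int, n: int, d: str, m: bytes, sig_ok: bool) -> bool:
    """Backup-side checks before logging a PRE-PREPARE for (v, n, d, m)."""
    if not sig_ok or d != digest(m):
        return False                  # bad signature or digest mismatch
    if v != state.view:               # backup must be in view v
        return False
    if state.preprepares.get((v, n), d) != d:
        return False                  # (v, n) already bound to a different digest
    return state.h < n < state.H      # seq number within the water marks
```
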
Prepare
  • If the backup accepts, it multicasts and writes to its log
  • ⟨PREPARE, v, n, d, i⟩σ_i
  • Replica accepts a PREPARE if
    • signature matches
    • it is in view v
    • h < n < H
  • prepared(m, v, n, i) = true iff all of the following are in i’s log
    • the request m
    • a PRE-PREPARE for m in view v with seq num n
    • 2f matching PREPAREs from different backups
Commit
  • when prepared(m, v, n, i) becomes true
  • multicast ⟨COMMIT, v, n, D(m), i⟩σ_i
  • replicas accept a COMMIT and insert it into the log (same conditions as for PREPARE)
  • execute m after accepting 2f+1 matching COMMITs (see the sketch after this section)
Invariant
  • if prepared(m, v, n, i) is true at a non-faulty i, then prepared(m′, v, n, j) is false at every non-faulty j for any m′ ≠ m
    • no two different requests can prepare with the same seq num in the same view
  • if a request is executed locally, at least f+1 non-faulty replicas have prepared it, so the decision survives view changes
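
A sketch of the two quorum predicates, assuming the log stores accepted messages as tuples (all names hypothetical):

```python
def prepared(log, m_digest: str, v: int, n: int, f: int) -> bool:
    """prepared(m, v, n, i): the matching PRE-PREPARE is logged, plus
    2f matching PREPAREs from distinct backups."""
    if log.preprepares.get((v, n)) != m_digest:
        return False
    senders = {i for (vv, nn, d, i) in log.prepares
               if (vv, nn, d) == (v, n, m_digest)}
    return len(senders) >= 2 * f

def committed_local(log, m_digest: str, v: int, n: int, f: int) -> bool:
    """Execute only once prepared holds and 2f+1 matching COMMITs
    (possibly including this replica's own) are logged."""
    if not prepared(log, m_digest, v, n, f):
        return False
    senders = {i for (vv, nn, d, i) in log.commits
               if (vv, nn, d) == (v, n, m_digest)}
    return len(senders) >= 2 * f + 1
```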

Checkpoint

  • multicast ⟨CHECKPOINT, n, d, i⟩σ_i
    • every k executions (i.e., whenever n mod k = 0)
    • d: digest of the state at checkpoint n
    • a checkpoint is stable once 2f+1 matching CHECKPOINT msgs are collected (its proof); older log entries can then be discarded
    • then update the low/high water marks h, H (see the sketch below)
    • h = seq num of the last stable checkpoint
    • H = h + K for a big-enough constant K (e.g., a small multiple of k)
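
A sketch of checkpoint stabilization and water-mark advancement under the assumptions above (names, including the truncation helper, hypothetical):

```python
def on_checkpoint(state, n: int, d: str, i: int, f: int, K: int) -> None:
    """Collect CHECKPOINT votes; once 2f+1 replicas agree on (n, d),
    the checkpoint is stable and the log can be truncated."""
    state.ckpt_votes.setdefault((n, d), set()).add(i)
    if len(state.ckpt_votes[(n, d)]) >= 2 * f + 1 and n > state.h:
        state.h = n                      # last stable checkpoint
        state.H = n + K                  # new high water mark
        state.discard_log_before(n)      # hypothetical truncation helper
```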

View-Change

  • A backup starts a timer when it receives a request it has not executed yet; on timeout it suspects the primary
  • Multicasts ⟨VIEW-CHANGE, v+1, n, C, P, i⟩σ_i
    • n: seq num of i’s last stable checkpoint
    • C: 2f+1 valid CHECKPOINT msgs proving that checkpoint
    • P: for each request prepared at i but not yet executed, its PRE-PREPARE plus 2f valid PREPARE msgs
  • When the primary of view v+1 receives 2f valid VIEW-CHANGE msgs from other replicas (2f+1 in total with its own), it multicasts ⟨NEW-VIEW, v+1, V, O⟩σ_p (see the sketch below)
    • V: the valid VIEW-CHANGE msgs
    • O: PRE-PREPAREs re-issued by the new primary for the requests proven in the P sets
      • gaps in the seq numbers are filled with PRE-PREPAREs for a null request
  • A backup enters the new view once it receives a valid NEW-VIEW msg (and then runs prepare/commit for everything in O)
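
A sketch of how the new primary might assemble O from the VIEW-CHANGE messages (names hypothetical; `vc.prepared` stands for the (n, d) pairs proven by the P sets):

```python
NULL_DIGEST = "null-request"   # digest of the agreed no-op request

def build_new_view_O(view_changes: list, new_v: int) -> list[tuple[int, int, str]]:
    """O: (view, seq, digest) triples re-issued as PRE-PREPAREs in view
    new_v; seq-number gaps are filled with the null request."""
    min_s = max(vc.n for vc in view_changes)    # latest stable checkpoint in V
    by_seq = {n: d for vc in view_changes
              for (n, d) in vc.prepared if n > min_s}
    max_s = max(by_seq, default=min_s)          # highest prepared seq num
    return [(new_v, n, by_seq.get(n, NULL_DIGEST))
            for n in range(min_s + 1, max_s + 1)]
```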

Non-Determinism

  • Primary selects the non-deterministic values (e.g., timestamps) and piggybacks them on the request (sketch below)
  • Or let the backups propose values and combine 2f+1 of them deterministically
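
A sketch of the first option, assuming the non-deterministic value is a wall-clock timestamp and a clock-skew bound (names and the bound are hypothetical):

```python
import time

def primary_pick_timestamp(last_ts: float) -> float:
    """Primary chooses the non-deterministic value and piggybacks it;
    forcing monotonicity keeps replica state machines deterministic."""
    return max(time.time(), last_ts + 1e-6)

def backup_validate_timestamp(ts: float, last_ts: float, skew: float = 1.0) -> bool:
    """Backups accept the primary's choice only if it is increasing and
    within an assumed skew bound of their own clocks."""
    return ts > last_ts and abs(ts - time.time()) <= skew
```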

In Class

Can’t trust a single node

  • Ask all nodes?
    • Liveness: faulty nodes may never reply
    • Latency: waiting for every node is slow
    • Primary can spoof other servers’ responses
      • Solution: digital signatures

Need a primary to order requests

  • What if the primary is faulty?
    • Primary lies to backups
      • Sol: signatures
    • Primary ignores requests
      • Sol: timeout ⇒ client broadcasts to all replicas
  • Can faulty replicas prevent progress?
    • No: quorums need only 2f+1 of the 3f+1 replicas, so f faulty or slow replicas cannot block progress
    • any two quorums of 2f+1 intersect in ≥ f+1 replicas, i.e., in at least one non-faulty replica
  • How to handle the primary sending different ops to different backups? (the prepare phase: two conflicting requests cannot both collect 2f matching PREPAREs)
  • How to handle the primary lying to clients? (client waits for f+1 matching replies)

See also: distributed systems - Why is the commit phase in PBFT necessary? - Computer Science Stack Exchange