RPC

Remote Procedure Calls

- Suspend the caller, pass parameters across the network
- Challenges
  - How to handle machine and communication failures
  - How to pass address arguments without a shared address space?
  - How to integrate into existing programs
  - How a caller determines the location and identity of the callee
  - Protocol to transfer data and control
  - Data integrity and security
- Alternative solutions
  - Message passing, remote fork
    - Same basic idea
  - Remote shared address space
    - Perhaps too much overhead?
- Structure
  - Programmers write the interface
  - Write a server program that exports the interface
  - Write a client program that imports the interface
  - Compiler generates the stubs
- Binding
  - Naming
    - What machine does the client want?...
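The stub structure in these notes can be sketched roughly as follows. This is a minimal illustration, not a real RPC system: the `Calculator` interface, the JSON wire format, and the in-process `transport` callable (standing in for the network) are all assumptions made for the example. It shows the client stub marshalling arguments by value, since there is no shared address space, and the server stub unmarshalling them and calling the exported procedure.

```python
import json


class Calculator:
    """Hypothetical interface exported by the server."""

    def add(self, a, b):
        return a + b


def server_stub(impl, request_bytes):
    """Unmarshal a request, call the local procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    try:
        result = getattr(impl, request["proc"])(*request["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:  # report failures to the caller instead of crashing
        reply = {"ok": False, "error": str(exc)}
    return json.dumps(reply).encode("utf-8")


class ClientStub:
    """Makes a remote call look like a local one to the caller."""

    def __init__(self, transport):
        # transport: callable taking request bytes and returning reply bytes.
        # Assumption: a real system would send these bytes over the network.
        self._transport = transport

    def __getattr__(self, proc):
        def call(*args):
            # Arguments are copied by value into the message.
            request = json.dumps({"proc": proc, "args": list(args)})
            reply_bytes = self._transport(request.encode("utf-8"))
            reply = json.loads(reply_bytes.decode("utf-8"))
            if not reply["ok"]:
                raise RuntimeError(reply["error"])
            return reply["result"]

        return call


# "Binding" here is just wiring the client stub to one server instance.
calc = ClientStub(lambda req: server_stub(Calculator(), req))
print(calc.add(2, 3))  # prints 5
```

The caller blocks inside `call` until the reply comes back, which mirrors the "suspend caller" step; failure handling, naming, and security are left out of the sketch.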

February 24, 2022

Two-phase commit

Two-phase Commit

- Presumed Nothing (PrN)
  - Protocol
    - Coordinator sends PREPARE to notify cohorts that the transaction has terminated
    - Each cohort sends COMMIT-VOTE or ABORT-VOTE
    - Coordinator collects the responses
    - Coordinator sends the result, ABORT or COMMIT
  - Cohort Activity
    - Abort?
      - Client marks the first action of a transaction; a cohort marks the transaction active when it sees that first action. If the transaction terminates and is not active -> ABORT
      - Client and cohort record the number of actions; client sends its count to the coordinator. If the counts do not match -> ABORT
    - Force Write
      - Before COMMIT-VOTE (because the cohort needs to ask for the outcome if it crashed)
      - Before ACK (because the coordinator might forget the outcome after the ACK)
  - Protocol Database
    - Transactions
    - Cohorts & state: committed, aborted, active
    - ACK, not ACK
  - Coordinator Activity
    - Coordinator forces the result to disk before sending it
    - Write END after all cohorts ACK (not forced)
      - No need to restore
  - Conclusion
    - Coordinator: 2 log writes, 1 forced; 2 messages
    - Cohort: 2 log writes, all forced; 2 messages
- Presumed Abort (PrA)
  - No log if the transaction aborted
  - No ACK for ABORT
  - Message count same as PrN
- Presumed Commit (PrC)
  - Coordinator force-writes when the transaction is initiated
    - Removed when complete
  - Cohort forces the prepare record, but does a non-forced write before commit, because forgetting = commit
  - Coordinator (when committing): 2 log writes, 2 forced; 2 messages
  - Cohort: 2 log writes, 1 forced; 1 message
- Read-only Transaction
  - Coordinator sends PREPARE
    - PrA: no log writes, because no data was force-written; forget = abort
    - PrC: 2 writes, 1 forced
      - 1 when initiated
      - 1 to mark it deleted
  - Cohorts send READ-ONLY-VOTE
    - No log writes
- PrC Optimization
  - Maintain IN
    - Store permanently
    - Might not have initiated
      - Assume abort
  - $IN = REC - (COM \cap REC)$
    - COM: committed transactions
    - REC: $REC = \{tid \mid tid_l < tid < tid_h\}$
    - tid_stable: highest committed tid
  - Recovering IN
    - tid_h
      - Method 1: set tid_h = tid_stable + fixed amount
      - Method 2: set tid_h dynamically, but log it to disk
    - tid_l
      - If it is the oldest transaction: increment tid_l and log to disk
    - $COM \cap REC$
      - Scan from tid_h to tid_l and check whether each transaction committed
      - Store in a bit vector or list
- NPrC Protocol
  - Abort
    - Coordinator sends PREPARE
    - Cohorts send COMMIT-VOTE; one sends ABORT-VOTE
    - Coordinator sends ABORT
    - Cohort sends ACK
    - Coordinator possibly writes the increased tid_l
    - 4 msgs, 1 soft write
  - Update
    - Coordinator sends PREPARE
    - Cohort sends COMMIT-VOTE
    - Coordinator sends COMMIT
    - Coordinator writes the increased tid_l & commit
    - 3 msgs, 1 forced write
  - Read Only
    - Coordinator sends PREPARE
    - Cohort sends READ-ONLY-VOTE
    - 2 msgs
  - Recalcitrant Transaction
    - Transaction takes too long (or a cohort crashes)
    - Fall back to PrC for that transaction
      - Write init info to disk
  - Garbage Collect
    - Maintain all cohorts
    - Send ABORT to all
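To make the Presumed Nothing accounting from the start of these notes concrete, here is a small in-memory sketch of one round of the protocol. It is a toy model under stated assumptions, not an implementation of any real system: the `Log`, `Cohort`, and `run_prn` names are hypothetical, message passing is replaced by method calls, and a cohort that votes ABORT-VOTE is assumed to have aborted locally, so it is not sent the result.

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    records: list = field(default_factory=list)  # (record, forced) pairs

    def write(self, record, forced):
        self.records.append((record, forced))

    def forced_count(self):
        return sum(1 for _, forced in self.records if forced)


@dataclass
class Cohort:
    vote_commit: bool = True
    log: Log = field(default_factory=Log)
    sent: int = 0  # messages this cohort sends

    def on_prepare(self):
        if self.vote_commit:
            # Forced before COMMIT-VOTE: after a crash the cohort must
            # know it has to ask the coordinator for the outcome.
            self.log.write("PREPARED", forced=True)
        self.sent += 1
        return "COMMIT-VOTE" if self.vote_commit else "ABORT-VOTE"

    def on_result(self, result):
        # Forced before ACK: the coordinator may forget the outcome
        # once every cohort has acknowledged it.
        self.log.write(result, forced=True)
        self.sent += 1
        return "ACK"


def run_prn(cohorts):
    """Run one round; return (result, coordinator log, coordinator messages)."""
    log, sent, votes = Log(), 0, []
    for cohort in cohorts:
        sent += 1  # PREPARE
        votes.append(cohort.on_prepare())
    result = "COMMIT" if all(v == "COMMIT-VOTE" for v in votes) else "ABORT"
    log.write(result, forced=True)  # force the outcome before sending it
    acks = []
    for cohort, vote in zip(cohorts, votes):
        if vote == "COMMIT-VOTE":  # assumption: ABORT voters already aborted
            sent += 1  # COMMIT or ABORT
            acks.append(cohort.on_result(result))
    if all(a == "ACK" for a in acks):
        log.write("END", forced=False)  # END record is not forced
    return result, log, sent


cohorts = [Cohort(), Cohort()]
result, coord_log, coord_sent = run_prn(cohorts)
print(result)  # prints COMMIT
```

In the all-commit case this reproduces the counts in the PrN conclusion: the coordinator makes 2 log writes (1 forced) and sends 2 messages per cohort, and each cohort makes 2 forced log writes and sends 2 messages.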

February 24, 2022

CS739 Intro

- Why build distributed systems?
  - Performance: a single computer can’t handle the load
    - throughput, latency
  - Cost/performance
    - commodity components
  - Elasticity
    - incremental scalability
  - Fault tolerance
    - availability
    - reliability: don’t do something wrong
  - Data sharing
- Why study distributed systems?
  - Important
  - Practical
  - Challenging / interesting
- Why is it challenging?
  - Faults
    - fail-stop, crash
    - slow nodes, misbehaving nodes
    - nodes disagree: who to trust?
  - Interperoleat: lack of global state
    - File A has different content
    - How many jobs on node B?...

January 25, 2022