Remote Procedure Calls
- Suspend caller, pass parameters across network
Challenges
- How to handle machine & communication failure
- How to pass (pointer) arguments without a shared address space?
- Integrate into existing programs
- How a caller determines the location and identity of the callee
- Protocol to transfer data & control
- Data integrity & security
Alternative Solutions
- Message passing, Remote fork
- Same basic idea
- Remote shared address space
- Perhaps too much overhead?
Structure
- Programmers write an interface
- Write a server program that exports the interface
- Write a client program that imports the interface
- Compiler generates the stubs
Binding
Naming
- What kind of machine does the client want?
- type, e.g. Mail Server
- instance, e.g. Mail Server #1
Locating Address
- Use distributed database
- compile-time binding? too early
- broadcast? interferes with innocent machines, doesn't work across different networks
- Database
- individual entry - stores an address
- group entry - stores the instances with the same type
- Usage
- Server exports: updates the database & remembers what it exported
- Client: looks up the database, communicates with the instance, and remembers binding info
- Misc
- Server stores nothing about clients (OK if a client crashes)
- Binding breaks if the server restarts (the unique identifier changes)
- Clients can only call exported interfaces
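The export/import flow above can be sketched as follows; this is a minimal in-memory stand-in for the Grapevine database, and all names (`export`, `import_`, `database`) are illustrative, not the paper's actual API:

```python
# Minimal sketch of the binding flow, assuming an in-memory dict
# standing in for the distributed database; names are illustrative.
import itertools

database = {}             # interface name -> (server_addr, unique_id)
_uid = itertools.count(1)

def export(interface, server_addr):
    """Server side: register the interface under a fresh unique id."""
    uid = next(_uid)      # regenerated on every (re)export / restart
    database[interface] = (server_addr, uid)
    return uid

def import_(interface):
    """Client side: look up the database and remember the binding."""
    if interface not in database:
        raise LookupError("no exporter for " + interface)
    return database[interface]    # (server_addr, unique_id)

uid = export("MailServer", "machine-1")
addr, bound_uid = import_("MailServer")

# If the server restarts, it re-exports with a new unique id, so the
# client's remembered bound_uid no longer matches: the binding breaks.
export("MailServer", "machine-1")
_, new_uid = import_("MailServer")
assert bound_uid != new_uid
```

This mirrors the note that the server keeps no per-client state: only clients remember bindings, so a server restart is detected by the stale unique id rather than by any server-side bookkeeping.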
Transport
- Home-brewed protocol
- Existing protocols are designed for bulk/large data transfer
- Want to minimize per-call startup & tear-down
- No handshake
- Simple
- Send request (with unique ID) & result
- Call identifier
- [calling machine identifier, calling process id, sequence number]
- callee maintains the latest seq number for each activity ([calling machine id, process id])
- if the incoming packet's seq number is larger, execute; else ignore
- If the call takes longer
- client retransmits & requires an explicit ACK
- client periodically sends probes to check the server is still alive
- If large arguments
- All but the last argument packet require an ACK
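The large-argument rule above (every packet but the last needs an explicit ACK, since the result itself acknowledges the final one) can be sketched as a sender-side fragmenter; packet size and field names are illustrative:

```python
# Sketch: split call arguments into packets, marking which ones the
# sender must wait for an ACK on (all but the last; the call's result
# acknowledges the final packet). max_payload is illustrative.
def fragment(args: bytes, max_payload: int = 4):
    packets = []
    for i in range(0, len(args), max_payload):
        chunk = args[i:i + max_payload]
        is_last = i + max_payload >= len(args)
        packets.append({"data": chunk, "needs_ack": not is_last})
    return packets

pkts = fragment(b"0123456789")   # 10 bytes -> 3 packets
```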
Exception
- Interface must define possible exceptions
Process
- Maintain a pool of processes on the server and don't kill them between calls (avoids per-call process-creation cost)
Questions
What were the goals for RPCs?
- Make distributed computation easy
- easy as local procedure call
- Secure (encrypt data)
- Efficiency, low latency (small msgs)
5x network speed
- Semantics similar to local procedure call
- local: exactly once
- rpc: error and “at most one”
- Handle failures (when server crashes)
- General - can use for all communication
What are two alternatives to RPCs for communication in distributed systems? How well do they meet the goals of RPCs?
- low-level msgs - TCP/IP, UDP/IP
- + good performance
- - not easy to use
- - doesn't hide failures
- + general
- Distributed Shared Memory (DSM)
- Address Space -> Pagetables -> physical mem
- - doesn’t handle failures well
- - unpredictable performance
- + ease of use
- - general
| | Performance | Ease | General | Handles Failures |
|---|---|---|---|---|
| Raw Msgs | + | No | + | Not easily |
| DSM | Not predictable | + | No | No |
| RPC | + (not great for large msgs, WAN) | + | ? (size of msgs, ptrs, global vars, concurrency) | yes |
What is the basic RPC architecture? What are the responsibilities of the stub layer? of the RPCRuntime layer? What are the advantages of automatic stub generation? How could pointer arguments be handled?
- see image above
- user client:
rc = foo(5);
- user stub:

      int foo(int arg) {
          int rc;
          Message *m = msg_Create();
          pack_int(m, arg);
          send_msg(dest, m);
          Message *m2 = wait_response();
          unpack_int(m2, &rc);
          return rc;
      }
- Advantages: Auto generated -> ease
bar(int *ptr)
- can’t pass to server
- ptr doesn’t make sense on server
- Solution
- Disallow ptrs - no generality
- Call by copy/restore (implemented in stub)
- deref locally
- send actual data
- server access actual data, modify, send back
- local: restore (copy server sent back)
- nested ptrs (linked lists, trees) are hard to support :-1:
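The copy/restore scheme above can be sketched in a few lines; here a one-element list stands in for an `int*`, and the network hop is simulated by a direct call (all names are illustrative):

```python
# Sketch of call-by-copy/restore for a pointer argument: the stub
# dereferences locally, ships the value, and copies the server's
# modified value back. A one-element list stands in for an int*.
def server_bar(value):
    # Server works on a copy of the pointed-to data, not the pointer.
    return value + 1

def client_stub_bar(cell):
    copied = cell[0]                # deref locally
    result = server_bar(copied)     # "send actual data" (simulated)
    cell[0] = result                # restore: copy back server's value

x = [41]
client_stub_bar(x)
# x[0] is now 42: the caller observes the server's modification,
# as if the pointer had been followed remotely.
```

Copy/restore matches call-by-reference only when nothing else aliases the pointed-to data during the call, which is one reason nested structures are hard.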
How are services identified? What state must be tracked on each server? What is the role of the dispatcher and why is it useful? Why does the server need to track a unique id for each exported interface? Why does the client send a table index instead of the name of the desired service?
- Lookup in dist-replicated db
- Server d: Export(‘KVStore’, ‘dispatcher’)
- Grapevine:
- is d a valid exporter for KVStore?
- add d to the KVStore group
- d's dispatcher:
- case 0: foo(); break
- case 1: bar(); break
- state
- export table
- unique id: detect crash
- client c:

      import("KVStore")   // returns (tbl_index, unique_service_id)

- client later sends tbl_index instead of the name, for performance
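The dispatch path above can be sketched as follows; the unique id check and table layout follow the notes, but every concrete name (`dispatch`, `export_table`, `UNIQUE_ID`) is illustrative:

```python
# Sketch of server-side dispatch: the client sends a table index and
# the interface's unique id instead of the service name; the server
# checks the id (detecting restarts) and indexes the export table.
def foo():
    return "foo result"

def bar():
    return "bar result"

UNIQUE_ID = 7                       # regenerated on every server restart
export_table = [("KVStore", UNIQUE_ID, [foo, bar])]

def dispatch(tbl_index, unique_id, proc_index):
    name, uid, procs = export_table[tbl_index]
    if uid != unique_id:            # stale binding: server restarted
        raise RuntimeError("binding broken: client must re-import")
    return procs[proc_index]()      # case 0: foo, case 1: bar

r = dispatch(0, 7, 1)               # calls bar
```

Sending the index rather than the string avoids a name lookup on every call, and the unique id turns a silent server restart into an explicit error the client can handle.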
What were the specific goals of the transport layer? What assumptions do they make?
- Goals
- Performance: fast response time
- Low load on server - (handle many clients)
- no handshaking, not much state
- Assumption: no large msgs, local area network
An alternative to exactly-once and at-most-once semantics is “at-least-once”. When are “at-least-once” semantics appropriate? Why are at-least-once semantics not appropriate for RPC?
- Semantics:
- ideal: strong: exactly once
- good: at-most-once (w/ notification if not exactly once)
- OK for idempotent operations
- no side effects, same result each time (e.g. read)
- Bad for non-idempotent operations
- e.g. append
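The read/append distinction above can be made concrete: under at-least-once, a retransmitted duplicate re-executes the call, which is harmless for a read but corrupts state for an append. A minimal sketch (illustrative names):

```python
# Sketch: duplicate delivery under at-least-once semantics.
# read is idempotent; append is not.
store = {"log": "a"}

def read(key):
    return store[key]

def append(key, s):
    store[key] += s
    return store[key]

# The same call delivered twice (original + retransmission):
read("log")
r = read("log")                     # still "a" -- harmless

append("log", "b")
a = append("log", "b")              # "abb" -- executed twice, state corrupted
```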
How do RPCs provide at-most-once semantics? How does the client know the call packet was received by the server? How does the server know the result was received by the client? Why is a call_id needed in the result packet? Why is the call_id needed in the call packet? What exactly is kept in the call_id? What state is kept on the server for active connections? What happens when a packet with a call_id > last seq_num arrives? when call_id == last seq_num? when call_id < last seq_num?
- Goal: provide at-most-once
- General Idea: detect duplicated calls, discard and return previous result
- Client sends to server: (call_id, procedure, args)
- procedure: (table index, unique id, entry pt (dispatcher))
- How does client know server received call packet?
- send ack
- optimization: get result packet promptly from server
- How does server know client received the result?
- send ack
- optimization: client sends next call packet
- only 1 outstanding call
- send ack if after certain period of time
- Why call_id in result? - client knows the result is for the most recent call
- Why call_id in request? - to throw away duplicates
- call_id: (seq_number, activity)
- activity: (machine_id, process_id)
- State:
- Call Table - per activity: (activity, last seq num, last result)
- call_id > last_seq -> new msg, execute
- call_id == last_seq -> resend result
- call_id < last_seq -> discard
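The three call-table cases above can be sketched as a server-side handler; the table layout and names are illustrative, but the comparison logic follows the notes directly:

```python
# Sketch of the server's call table giving at-most-once semantics:
# per activity it keeps (last_seq, last_result) and compares each
# incoming call_id's seq number against it.
call_table = {}   # activity -> (last_seq, last_result)

def handle(activity, seq, procedure, *args):
    last_seq, last_result = call_table.get(activity, (0, None))
    if seq > last_seq:                  # new call: execute it
        result = procedure(*args)
        call_table[activity] = (seq, result)
        return result
    if seq == last_seq:                 # duplicate of latest: resend result
        return last_result
    return None                         # old duplicate: discard

calls = []
def add(a, b):
    calls.append((a, b))
    return a + b

act = ("machine-1", 42)                 # activity = (machine_id, process_id)
r1 = handle(act, 1, add, 2, 3)          # seq 1 > 0: executes
r2 = handle(act, 1, add, 2, 3)          # duplicate: cached result, no re-execution
r3 = handle(act, 0, add, 9, 9)          # stale seq: discarded
```

Note that only one result per activity is cached, which is why the client is limited to one outstanding call at a time.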
How much state must be tracked on the server? on the client? What happens if the server crashes? What happens if the client crashes?
- Server
- ExportTable: unique id per exported interface
- Call Table: entry per activity
- discard entry when confident won’t see replay
- Client
- last seq number (to generate call_id)
- per machine, per process
- What if server fails?
- Client time out (no reply)
- asks for an ACK but doesn't get one
- return error
- Possibility: RPC wasn’t executed or executed once
- server reboot, regenerate export table
- client send another request, unique id doesn’t match
- client notice error, handle as needed
- What if client crashes?
- Server
- lots of work associated with RPC
- server doesn’t know client crashed
- waste work
- Client
- Client reboots, increase counter (seq number)
- server know requests are new
How are calls with large arguments handled? What do they assume about their workload and environment?
- Large args are sent as multiple packets (all but the last ACKed), but not optimized; they assume mostly small msgs on a local area network
Conclusions?
- RPC is still popular - e.g. gRPC
- understanding the right workloads matters
- RPC today typically runs on TCP/IP or UDP/IP