Remote Procedure Calls
- Suspend caller, pass parameters across network
Challenges
- How to handle machine & communication failure
- How to pass (pointer) arguments without a shared address space?
- Integrate into existing programs
- How a caller determines the location and identity of the callee
- Protocol to transfer data & control
- Data integrity & security
Alternative Solutions
- Message passing, Remote fork
- Same basic idea
- Remote shared address space
- Perhaps too much overhead?
Structure
- Programmers write an interface
- Write a server program that exports the interface
- Write a client program that imports the interface
- Compiler generates the stubs
Binding
Naming
- What kind of machine does the client want?
- type, e.g. Mail Server
- instance, e.g. Mail Server #1
Locating Address
- Use distributed database
- compile-time binding? too early
- broadcast? interferes with innocent machines, doesn't work across different networks
- Database
- individual entry - stores an address
- group entry - stores the instances with the same type
- Usage
- Server exports: updates the database & remembers what it exported
- Client: looks up the database, communicates with the instance, and remembers binding info
- Misc
- Server stores nothing about clients (OK if a client crashes)
- Binding breaks if the server restarts (the unique identifier changes)
- Clients can only call exported interfaces
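The export/import flow above can be sketched as follows; this is a minimal in-memory stand-in for the Grapevine database, and all names (`export`, `import_`, `database`) are illustrative, not the paper's actual API:

```python
# Minimal sketch of the binding flow, assuming an in-memory dict
# standing in for the distributed database; names are illustrative.
import itertools

database = {}             # interface name -> (server_addr, unique_id)
_uid = itertools.count(1)

def export(interface, server_addr):
    """Server side: register the interface under a fresh unique id."""
    uid = next(_uid)      # regenerated on every (re)export / restart
    database[interface] = (server_addr, uid)
    return uid

def import_(interface):
    """Client side: look up the database and remember the binding."""
    if interface not in database:
        raise LookupError("no exporter for " + interface)
    return database[interface]    # (server_addr, unique_id)

uid = export("MailServer", "machine-1")
addr, bound_uid = import_("MailServer")

# If the server restarts, it re-exports with a new unique id, so the
# client's remembered bound_uid no longer matches: the binding breaks.
export("MailServer", "machine-1")
_, new_uid = import_("MailServer")
assert bound_uid != new_uid
```

This mirrors the note that the server keeps no per-client state: only clients remember bindings, so a server restart is detected by the stale unique id rather than by any server-side bookkeeping.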
Transport
- Home-brewed protocol
- Existing protocols are designed for bulk/large data transfer
- Want to minimize per-call startup & tear-down
- No handshake
- Simple
- Send request (with unique ID) & result
- Call identifier
- [calling machine identifier, calling process id, sequence number]
- callee maintains the latest seq number for each activity ([calling machine id, process id])
- if the incoming packet's seq number is larger, execute; else ignore
- If the call takes longer
- client retransmits & requires an explicit ACK
- client periodically sends probes to check the server is still alive
- If large arguments
- All but the last argument packet require an ACK
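The large-argument rule above (every packet but the last needs an explicit ACK, since the result itself acknowledges the final one) can be sketched as a sender-side fragmenter; packet size and field names are illustrative:

```python
# Sketch: split call arguments into packets, marking which ones the
# sender must wait for an ACK on (all but the last; the call's result
# acknowledges the final packet). max_payload is illustrative.
def fragment(args: bytes, max_payload: int = 4):
    packets = []
    for i in range(0, len(args), max_payload):
        chunk = args[i:i + max_payload]
        is_last = i + max_payload >= len(args)
        packets.append({"data": chunk, "needs_ack": not is_last})
    return packets

pkts = fragment(b"0123456789")   # 10 bytes -> 3 packets
```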
Exception
- Interface must define possible exceptions
Process
- Maintain a pool of processes on the server and don't kill them between calls (avoids per-call process-creation cost)
Questions
What were the goals for RPCs?
- Make distributed computation easy
- easy as local procedure call
- Secure (encrypt data)
- Efficiency, low latency (small msgs)
5x network speed
- Semantics similar to local procedure call
- local: exactly once
- rpc: error and “at most one”
- Handle failures (when server crashes)
- General - can use for all communication
What are two alternatives to RPCs for communication in distributed systems? How well do they meet the goals of RPCs?
- low-level msgs - TCP/IP, UDP/IP
- + good performance
- - not easy to use
- - doesn't hide failures
- + general
- Distributed Shared Memory (DSM)
- Address Space -> Pagetables -> physical mem
- - doesn’t handle failures well
- - unpredictable performance
- + ease of use
- - general
| | Performance | Ease | General | Handles Failures |
|---|---|---|---|---|
| Raw Msgs | + | No | + | Not easily |
| DSM | Not predictable | + | No | No |
| RPC | + (not great for large msgs, WAN) | + | ? (size of msgs, ptrs, global vars, concurrency) | yes |
What is the basic RPC architecture? What are the responsibilities of the stub layer? of the RPCRuntime layer? What are the advantages of automatic stub generation? How could pointer arguments be handled?
- see image above
- user client:
rc = foo(5);
- user stub:

      int foo(int arg) {
          int rc;
          Message *m = msg_Create();
          pack_int(m, arg);
          send_msg(dest, m);
          Message *m2 = wait_response();
          unpack_int(m2, &rc);
          return rc;
      }
- Advantages: Auto generated -> ease
bar(int *ptr)
- can’t pass to server
- ptr doesn’t make sense on server
- Solution
- Disallow ptrs - no generality
- Call by copy/restore (implemented in stub)
- deref locally
- send actual data
- server access actual data, modify, send back
- local: restore (copy server sent back)
- nested ptrs (linked lists, trees) are hard to support :-1:
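The copy/restore scheme above can be sketched in a few lines; here a one-element list stands in for an `int*`, and the network hop is simulated by a direct call (all names are illustrative):

```python
# Sketch of call-by-copy/restore for a pointer argument: the stub
# dereferences locally, ships the value, and copies the server's
# modified value back. A one-element list stands in for an int*.
def server_bar(value):
    # Server works on a copy of the pointed-to data, not the pointer.
    return value + 1

def client_stub_bar(cell):
    copied = cell[0]                # deref locally
    result = server_bar(copied)     # "send actual data" (simulated)
    cell[0] = result                # restore: copy back server's value

x = [41]
client_stub_bar(x)
# x[0] is now 42: the caller observes the server's modification,
# as if the pointer had been followed remotely.
```

Copy/restore matches call-by-reference only when nothing else aliases the pointed-to data during the call, which is one reason nested structures are hard.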
How are services identified? What state must be tracked on each server? What is the role of the dispatcher and why is it useful? Why does the server need to track a unique id for each exported interface? Why does the client send a table index instead of the name of the desired service?
- Lookup in dist-replicated db
- Server d: Export(‘KVStore’, ‘dispatcher’)
- Grapevine:
- is d a valid exporter for KVStore?
- add d to the KVStore group
- d's dispatcher:
- case 0: foo(); break
- case 1: bar(); break
- state
- export table
- unique id: detect crash
- client c:

      import("KVStore")   // returns (tbl_index, unique_service_id)

- client later sends tbl_index instead of the name, for performance
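The dispatch path above can be sketched as follows; the unique id check and table layout follow the notes, but every concrete name (`dispatch`, `export_table`, `UNIQUE_ID`) is illustrative:

```python
# Sketch of server-side dispatch: the client sends a table index and
# the interface's unique id instead of the service name; the server
# checks the id (detecting restarts) and indexes the export table.
def foo():
    return "foo result"

def bar():
    return "bar result"

UNIQUE_ID = 7                       # regenerated on every server restart
export_table = [("KVStore", UNIQUE_ID, [foo, bar])]

def dispatch(tbl_index, unique_id, proc_index):
    name, uid, procs = export_table[tbl_index]
    if uid != unique_id:            # stale binding: server restarted
        raise RuntimeError("binding broken: client must re-import")
    return procs[proc_index]()      # case 0: foo, case 1: bar

r = dispatch(0, 7, 1)               # calls bar
```

Sending the index rather than the string avoids a name lookup on every call, and the unique id turns a silent server restart into an explicit error the client can handle.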
What were the specific goals of the transport layer? What assumptions do they make?
- Goals
- Performance: fast response time
- Low load on server - (handle many clients)
- no handshaking, not much state
- Assumption: no large msgs, local area network
An alternative to exactly-once and at-most-once semantics is “at-least-once”. When are “at-least-once” semantics appropriate? Why are at-least-once semantics not appropriate for RPC?
- Semantics:
- ideal: strong: exactly once
- good: at-most-once (w/ notification if not exactly once)
- OK for idempotent operations
- no side effects, same result each time (e.g. read)
- Bad for non-idempotent operations
- e.g. append
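The read/append distinction above can be made concrete: under at-least-once, a retransmitted duplicate re-executes the call, which is harmless for a read but corrupts state for an append. A minimal sketch (illustrative names):

```python
# Sketch: duplicate delivery under at-least-once semantics.
# read is idempotent; append is not.
store = {"log": "a"}

def read(key):
    return store[key]

def append(key, s):
    store[key] += s
    return store[key]

# The same call delivered twice (original + retransmission):
read("log")
r = read("log")                     # still "a" -- harmless

append("log", "b")
a = append("log", "b")              # "abb" -- executed twice, state corrupted
```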
How do RPCs provide at-most-once semantics? How does the client know the call packet was received by the server? How does the server know the result was received by the client? Why is a call_id needed in the result packet? Why is the call_id needed in the call packet? What exactly is kept in the call_id? What state is kept on the server for active connections? What happens when a packet with a call_id > last seq_num arrives? when call_id == last seq_num? when call_id < last seq_num?
- Goal: provide at-most-once
- General Idea: detect duplicated calls, discard and return previous result
- Client sends to server: (call_id, procedure, args)
- procedure: (table index, unique id, entry pt (dispatcher))
- How does client know server received call packet?
- send ack
- optimization: get result packet promptly from server
- How does server know client received the result?
- send ack
- optimization: client sends next call packet
- only 1 outstanding call
- send ack if after certain period of time
- Why call_id in result? - client knows the result is for the most recent call
- Why call_id in request? - to throw away duplicates
- call_id: (seq_number, activity)
- activity: (machine_id, process_id)
- State:
- Call Table - per activity: (activity, last seq num, last result)
- call_id > last_seq -> new msg, execute
- call_id == last_seq -> resend result
- call_id < last_seq -> discard
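The three call-table cases above can be sketched as a server-side handler; the table layout and names are illustrative, but the comparison logic follows the notes directly:

```python
# Sketch of the server's call table giving at-most-once semantics:
# per activity it keeps (last_seq, last_result) and compares each
# incoming call_id's seq number against it.
call_table = {}   # activity -> (last_seq, last_result)

def handle(activity, seq, procedure, *args):
    last_seq, last_result = call_table.get(activity, (0, None))
    if seq > last_seq:                  # new call: execute it
        result = procedure(*args)
        call_table[activity] = (seq, result)
        return result
    if seq == last_seq:                 # duplicate of latest: resend result
        return last_result
    return None                         # old duplicate: discard

calls = []
def add(a, b):
    calls.append((a, b))
    return a + b

act = ("machine-1", 42)                 # activity = (machine_id, process_id)
r1 = handle(act, 1, add, 2, 3)          # seq 1 > 0: executes
r2 = handle(act, 1, add, 2, 3)          # duplicate: cached result, no re-execution
r3 = handle(act, 0, add, 9, 9)          # stale seq: discarded
```

Note that only one result per activity is cached, which is why the client is limited to one outstanding call at a time.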
How much state must be tracked on the server? on the client? What happens if the server crashes? What happens if the client crashes?
- Server
- ExportTable: unique id per exported interface
- Call Table: entry per activity
- discard entry when confident won’t see replay
- Client
- last seq number (to generate call_id)
- per machine, per process
- What if server fails?
- Client time out (no reply)
- asks for an ACK but doesn't get one
- return error
- Possibility: RPC wasn’t executed or executed once
- server reboot, regenerate export table
- client send another request, unique id doesn’t match
- client notice error, handle as needed
- What if client crashes?
- Server
- lots of work associated with RPC
- server doesn’t know client crashed
- waste work
- Client
- Client reboots, increase counter (seq number)
- server know requests are new
How are calls with large arguments handled? What do they assume about their workload and environment?
- Large args are sent as multiple packets (all but the last ACKed), but not optimized; they assume mostly small msgs on a local area network
Conclusions?
- RPC is still popular - e.g. gRPC
- understanding the right workloads matters
- RPC today typically runs on TCP/IP or UDP/IP