Remote Procedure Calls

  • Suspend caller, pass parameters across network

Challenges

  • How to handle machine & communication failures
  • How to pass arguments that contain addresses (pointers) without a shared address space
  • How to integrate into existing programs
  • How the caller determines the location and identity of the callee
  • Protocol to transfer data & control
  • Data integrity & security

Alternative Solutions

  • Message passing, Remote fork
    • Same basic idea
  • Remote shared address space
    • Perhaps too much overhead?

Structure

  • Programmer writes the interface
  • Write a server program that exports the interface
  • Write a client program that imports the interface
  • Compiler generates the stubs

Binding

Naming

  • What kind of machine does the client want?
    • type e.g. Mail Server
    • instance e.g. Mail Server #1

Locating Address

  • Use distributed database
    • compile-time binding? too early
    • broadcast? interferes with innocent machines, does not work across different networks
  • Database
    • individual - stores the address of one instance
    • group - stores the instances that share a type
  • Usage
    • Server exports: updates the database & remembers what it exported
    • Client imports: looks up the database, contacts the instance and remembers the binding information
  • Misc
    • Server does not store anything about clients (OK if a client crashes)
    • Binding breaks if the server restarts (the unique identifier changes)
    • Clients can only call exported interfaces
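
The export/import flow above can be sketched as a small in-memory table. This is only an illustration, not the distributed database the notes describe: the names `db_export`, `db_import`, `struct binding` and the fixed-size array are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_BINDINGS 16
#define NAME_LEN 32

/* One database entry: an instance of an interface type (hypothetical layout). */
struct binding {
    char type[NAME_LEN];      /* e.g. "MailServer" */
    char instance[NAME_LEN];  /* e.g. "MailServer#1" */
    uint32_t address;         /* network address of the exporting machine */
    uint32_t unique_id;       /* changes each time the server re-exports */
};

static struct binding db[MAX_BINDINGS];
static size_t db_len = 0;

/* Copy a name, always leaving a terminating NUL. */
static void copy_name(char *dst, const char *src)
{
    strncpy(dst, src, NAME_LEN - 1);
    dst[NAME_LEN - 1] = '\0';
}

/* Server side: publish or refresh an instance. Returns 0, or -1 if full. */
int db_export(const char *type, const char *instance,
              uint32_t address, uint32_t unique_id)
{
    struct binding *b = NULL;
    for (size_t i = 0; i < db_len; i++)
        if (strcmp(db[i].instance, instance) == 0)
            b = &db[i];               /* re-export overwrites the old entry */
    if (b == NULL) {
        if (db_len == MAX_BINDINGS)
            return -1;
        b = &db[db_len++];
    }
    copy_name(b->type, type);
    copy_name(b->instance, instance);
    b->address = address;
    b->unique_id = unique_id;
    return 0;
}

/* Client side: look up by type; instance == NULL means "any instance". */
const struct binding *db_import(const char *type, const char *instance)
{
    for (size_t i = 0; i < db_len; i++) {
        if (strcmp(db[i].type, type) != 0)
            continue;
        if (instance == NULL || strcmp(db[i].instance, instance) == 0)
            return &db[i];
    }
    return NULL;                      /* nothing of that type is exported */
}
```

A client that only cares about the type would call `db_import("MailServer", NULL)` and remember the returned address and unique id for later calls.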

Transport

  • Home-brewed protocol
    • Existing protocols are designed for bulk transfer (e.g. large files)
    • Want to minimize connection setup & tear-down cost
    • No handshake
  • Simple
    • Send request (with unique ID) & result
  • Call identifier
    • [calling machine identifier, calling process id, sequence number]
    • callee maintains the latest seq number for each activity ([calling machine id, process id])
    • if the incoming packet's seq number is larger, act on it; otherwise ignore it
  • If the call takes a long time
    • caller retransmits & requests an explicit ACK
    • caller periodically sends probes
  • If large arguments
    • All but the last argument packet require an ACK

Exception

  • Interface must define possible exceptions

Process

  • Spawn many processes on the server and keep them alive (no create/kill per call)

Questions

  1. What were the goals for RPCs?

    • Make distributed computation easy
      • easy as local procedure call
    • Secure (encrypt data)
    • Efficiency, low latency (small msgs)
      • goal: within about 5x of the raw network transmission time
    • Semantics similar to local procedure call
      • local: exactly once
      • rpc: on error, “at most once”
    • Handle failures (when server crashes)
    • General - can use for all communication
  2. What are two alternatives to RPCs for communication in distributed systems? How well do they meet the goals of RPCs?

    • low-level msgs - TCP/IP, UDP/IP
      • + good performance
      • - not easy to use
      • - doesn’t hide failures
      • + General
    • Distributed Shared Memory (DSM)
      • Address Space -> Pagetables -> physical mem
      • - doesn’t handle failures well
      • - unpredictable performance
      • + ease of use
      • - not general
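    |          | Performance                       | Ease | General                                         | Handle Failures |
    |----------|-----------------------------------|------|-------------------------------------------------|-----------------|
    | Raw Msgs | +                                 | No   | +                                               | Not easily      |
    | DSM      | Not predictable                   | +    | No                                              | No              |
    | RPC      | + (not great for large msgs, WAN) | +    | ? (size of msgs, ptr, global vars, concurrency) | Yes             |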
  3. What is the basic RPC architecture? What are the responsibilities of the stub layer? of the RPCRuntime layer? What are the advantages of automatic stub generation? How could pointer arguments be handled?

    • see image above
    • user client: rc = foo(5);
    • user stub:
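      /* msg_t, dest and the helper functions are assumed to come from the RPC runtime */
      int foo(int arg) {
          int rc;
          msg_t *m = msg_Create();        /* build the request message */
          pack_int(m, arg);               /* marshal the argument */
          send_msg(dest, m);              /* hand the message to RPCRuntime */
          msg_t *m2 = wait_response();    /* caller is suspended here */
          unpack_int(m2, &rc);            /* unmarshal the result */
          return rc;
      }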
      
    • Advantages: stubs are auto generated, so the programmer does not hand-write marshalling code -> ease of use
    • bar(int *ptr)
      • can’t pass to server
      • ptr doesn’t make sense on server
      • Solution
        1. Disallow ptrs - loses generality
        2. Call by copy/restore (implemented in stub)
          • client stub dereferences the pointer locally
          • sends the actual data
          • server accesses the data, modifies it, sends it back
          • client stub restores (copies what the server sent back into the original location)
          • nested ptrs (linked lists, trees) are not handled well
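
A minimal sketch of call by copy/restore for `bar(int *ptr)`, under the assumption that the argument is a single 32-bit integer. The network is replaced by a direct call to a server stub so the example stays self-contained. `bar_impl`, `bar_server_stub` and the big-endian `pack_int`/`unpack_int` helpers are hypothetical names, not part of any real RPC library.

```c
#include <stdint.h>

/* Marshal a 32-bit integer in big-endian (network) byte order. */
static void pack_int(uint8_t *buf, int32_t v)
{
    uint32_t u = (uint32_t)v;
    buf[0] = (uint8_t)(u >> 24);
    buf[1] = (uint8_t)(u >> 16);
    buf[2] = (uint8_t)(u >> 8);
    buf[3] = (uint8_t)u;
}

static int32_t unpack_int(const uint8_t *buf)
{
    uint32_t u = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                 ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    return (int32_t)u;
}

/* The real procedure on the server: it sees a pointer into server memory. */
static void bar_impl(int32_t *ptr)
{
    *ptr += 1;
}

/* Server stub: unmarshal into a private copy, call, marshal the copy back. */
static void bar_server_stub(const uint8_t *request, uint8_t *reply)
{
    int32_t local = unpack_int(request);
    bar_impl(&local);
    pack_int(reply, local);
}

/* Client stub: same signature as the local procedure. */
void bar(int32_t *ptr)
{
    uint8_t request[4], reply[4];
    pack_int(request, *ptr);          /* copy: dereference locally, send the value */
    bar_server_stub(request, reply);  /* stands in for send_msg + wait_response */
    *ptr = unpack_int(reply);         /* restore: write the server's value back */
}
```

The caller still writes `bar(&x)` as if the procedure were local. A linked list or tree would need the stub to walk and flatten the whole structure, which is why nested pointers are the hard case.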
  4. How are services identified? What state must be tracked on each server? What is the role of the dispatcher and why is it useful? Why does the server need to track a unique id for each exported interface? Why does the client send a table index instead of the name of the desired service?

    • Lookup in dist-replicated db
    • Server d: Export(‘KVStore’, ‘dispatcher’)
    • Grapevine:
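      • is d a valid exporter for KVStore?
      • if so, add d
    • d’s dispatcher
      • case 0: foo(); break;
      • case 1: bar(); break;
    • state
      • export table
      • unique id per exported interface: lets clients detect a server crash/restart
    • client c:
      • import("KVStore");
      • returns (tbl_index, unique_Service_id)
      • tbl_index for performance (server indexes its export table directly instead of looking up the name)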
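
The server-side state described above might look roughly like the following. It is a sketch under assumptions: the export table is a fixed array, the unique id comes from a simple counter, and `server_restart` only models the loss of the table. All function and type names here are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

enum rpc_status { RPC_OK = 0, RPC_BAD_BINDING = -1, RPC_BAD_PROC = -2 };

typedef int (*dispatcher_fn)(int proc, int arg, int *result);

struct export_entry {
    const char *name;         /* interface name, e.g. "KVStore" */
    uint32_t unique_id;       /* fresh value for every export */
    dispatcher_fn dispatcher; /* entry point for this interface */
};

#define EXPORT_MAX 8
static struct export_entry export_table[EXPORT_MAX];
static size_t export_count = 0;
static uint32_t next_unique_id = 1;  /* assumption: never reused across restarts */

static int foo(int arg) { return arg + 1; }
static int bar(int arg) { return arg * 2; }

/* Dispatcher: maps a procedure number to the procedure inside the interface. */
int kv_dispatcher(int proc, int arg, int *result)
{
    switch (proc) {
    case 0: *result = foo(arg); return RPC_OK;
    case 1: *result = bar(arg); return RPC_OK;
    default: return RPC_BAD_PROC;
    }
}

/* Export: returns the table index (sent by clients later), or -1 if full. */
int export_interface(const char *name, dispatcher_fn d, uint32_t *unique_id)
{
    if (export_count == EXPORT_MAX)
        return -1;
    export_table[export_count].name = name;
    export_table[export_count].unique_id = next_unique_id++;
    export_table[export_count].dispatcher = d;
    *unique_id = export_table[export_count].unique_id;
    return (int)export_count++;
}

/* Models a crash + reboot: the export table is lost and must be rebuilt. */
void server_restart(void)
{
    export_count = 0;
}

/* Incoming call: O(1) index into the table, then verify the unique id. */
int handle_call(int tbl_index, uint32_t unique_id, int proc, int arg, int *result)
{
    if (tbl_index < 0 || (size_t)tbl_index >= export_count)
        return RPC_BAD_BINDING;
    const struct export_entry *e = &export_table[tbl_index];
    if (e->unique_id != unique_id)
        return RPC_BAD_BINDING;       /* binding predates a restart */
    return e->dispatcher(proc, arg, result);
}
```

Because the client sends the index rather than the name, the server avoids a string lookup on every call, and the unique id check is what turns a stale binding into an explicit error.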
  5. What were the specific goals of the transport layer? What assumptions do they make?

    • Goals
      • Performance: fast response time
      • Low load on server - (handle many clients)
      • no handshaking, not much state
    • Assumption: no large msgs, local area network
  6. An alternative to exactly-once and at-most-once semantics is “at-least-once”. When are “at-least-once” semantics appropriate? Why are at-least-once semantics not appropriate for RPC?

    • Semantics:
      • ideal: strong: exactly once
      • good: at-most-once (w/ notification if not exactly once)
    • At-least-once is OK for idempotent operations
      • repeating them has no additional side effects and gives the same result each time (e.g. read)
    • Bad for non-idempotent operations
      • e.g. append (a retried call appends twice)
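
To make the difference concrete, the sketch below delivers the same request several times, as an at-least-once transport might after lost replies, and executes it each time. The `struct store`, `op_put`, `op_append` and `deliver` names are assumptions for illustration. An overwrite (`op_put`) stands in for an idempotent operation alongside the read mentioned above.

```c
#include <stddef.h>

#define LOG_MAX 8

/* Hypothetical server state: one value plus an append-only log. */
struct store {
    int value;
    int log[LOG_MAX];
    size_t log_len;
};

typedef void (*op_fn)(struct store *s, int x);

/* Idempotent: executing it again leaves the state unchanged. */
void op_put(struct store *s, int x)
{
    s->value = x;
}

/* Not idempotent: every execution grows the log. */
void op_append(struct store *s, int x)
{
    if (s->log_len < LOG_MAX)
        s->log[s->log_len++] = x;
}

/* At-least-once delivery: the server runs the op once per transmission. */
void deliver(struct store *s, op_fn op, int x, int transmissions)
{
    for (int i = 0; i < transmissions; i++)
        op(s, x);
}
```

After three transmissions of `op_put(7)` the value is simply 7, but three transmissions of `op_append(7)` leave three entries in the log. That duplication is what makes at-least-once a poor fit for general procedure calls.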
  7. How do RPCs provide at-most-once semantics? How does the client know the call packet was received by the server? How does the server know the result was received by the client? Why is a call_id needed in the result packet? Why is the call_id needed in the call packet? What exactly is kept in the call_id? What state is kept on the server for active connections? What happens when a packet with a call_id > last seq_num arrives? when call_id == last seq_num? when call_id < last seq_num?

    • Goal: provide at-most-once
    • General idea: detect duplicated calls, discard them and return the previous result
    • Client sends to server (call_id, procedure, args)
    • procedure: (table index, unique id, entry pt (dispatcher))
    • How does the client know the server received the call packet?
      • server sends an ack
      • optimization: a result packet that arrives promptly serves as the ack
    • How does the server know the client received the result?
      • client sends an ack
      • optimization: the client's next call packet serves as the ack
        • works because there is only 1 outstanding call per activity
        • client sends an explicit ack only if it makes no new call within a certain period
    • Why call_id in result?
      • client knows result is for most recent call
    • Why call_id in request?
      • throw away duplicates
    • call_id: (seq_number, activity)
      • activity: (machine_id, process)
    • State:
      • Call Table - per activity
        • (activity, last seq num, last result)
    • call_id > last_seq -> new msg, execute
    • call_id == last_seq -> resend result
    • call_id < last_seq -> discard
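
The three cases above can be sketched as a per-activity call table. This is an illustrative sketch, not the paper's implementation: the fixed-size table, the convention that sequence numbers start at 1, and the names `handle_call_packet`, `procedure` and `executions` are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

struct activity { uint32_t machine_id; uint32_t process_id; };

struct call_entry {
    int in_use;
    struct activity act;
    uint32_t last_seq;   /* 0 = nothing seen yet; real seq numbers start at 1 */
    int last_result;     /* saved so a duplicate can be answered without re-running */
};

enum action { EXECUTED, RESENT_RESULT, DISCARDED, TABLE_FULL };

#define CALL_TABLE_MAX 16
static struct call_entry call_table[CALL_TABLE_MAX];
static int executions = 0;           /* counts real executions, for demonstration */

static int procedure(int arg)
{
    executions++;
    return arg * 10;
}

/* Find the entry for an activity, creating one in a free slot if needed. */
static struct call_entry *lookup(struct activity a)
{
    struct call_entry *free_slot = NULL;
    for (size_t i = 0; i < CALL_TABLE_MAX; i++) {
        struct call_entry *e = &call_table[i];
        if (e->in_use && e->act.machine_id == a.machine_id &&
            e->act.process_id == a.process_id)
            return e;
        if (!e->in_use && free_slot == NULL)
            free_slot = e;
    }
    if (free_slot != NULL) {
        free_slot->in_use = 1;
        free_slot->act = a;
        free_slot->last_seq = 0;
        free_slot->last_result = 0;
    }
    return free_slot;
}

enum action handle_call_packet(struct activity a, uint32_t seq, int arg, int *result)
{
    if (seq == 0)
        return DISCARDED;             /* 0 is reserved for "nothing seen yet" */
    struct call_entry *e = lookup(a);
    if (e == NULL)
        return TABLE_FULL;
    if (seq > e->last_seq) {          /* new call: execute and remember the result */
        e->last_result = procedure(arg);
        e->last_seq = seq;
        *result = e->last_result;
        return EXECUTED;
    }
    if (seq == e->last_seq) {         /* duplicate of the latest call: resend */
        *result = e->last_result;
        return RESENT_RESULT;
    }
    return DISCARDED;                 /* older than the latest call: stale */
}
```

A retransmitted call packet never increments `executions`, which is the at-most-once property. The cost is one table entry per activity, which the server can drop once it is confident no replay will arrive.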
  8. How much state must be tracked on the server? on the client? What happens if the server crashes? What happens if the client crashes?

    • Server
      • ExportTable: unique id per exported interface
      • Call Table: entry per activity
        • discard entry when confident won’t see replay
    • Client
      • last seq (to generate call id)
        • per machine per process
    • What if server fails?
      • Client times out (no reply)
      • Client asks for an ack but doesn’t get one
      • Runtime returns an error to the caller
        • Possibility: the RPC was not executed, or was executed once
      • Server reboots, regenerates its export table (with new unique ids)
      • Client sends another request, unique id doesn’t match
      • Client notices the error, handles it as needed
    • What if client crashes?
      • Server
        • may have done lots of work for the RPC
        • server doesn’t know the client crashed
        • the work is wasted
      • Client
        • Client reboots, increases its counter (seq number)
        • server knows the requests are new
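
One possible way for a rebooted client to guarantee that its counter has increased is to seed the sequence number from a clock reading at boot. The sketch below illustrates that idea only; it is not necessarily the scheme the original system used. It assumes a hypothetical tick counter that never goes backwards and a client that makes fewer than 256 calls per tick on average, and it ignores 32-bit wraparound.

```c
#include <stdint.h>

#define CALLS_PER_TICK_SHIFT 8  /* assumption: room for 256 calls per clock tick */

struct client {
    uint32_t last_seq;          /* last sequence number handed out */
};

/* At boot, jump past anything a previous incarnation could have used. */
void client_boot(struct client *c, uint32_t clock_ticks)
{
    c->last_seq = clock_ticks << CALLS_PER_TICK_SHIFT;
}

/* Sequence number for the next call. */
uint32_t client_next_seq(struct client *c)
{
    return ++c->last_seq;
}

/* Server-side test from the call table: only larger numbers are new calls. */
int server_sees_new_call(uint32_t server_last_seq, uint32_t seq)
{
    return seq > server_last_seq;
}
```

If the client instead restarted its counter from zero, its first calls after a reboot would look like stale duplicates and the server would discard them.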
  9. How are calls with large arguments handled? What do they assume about their workload and environment?

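    • Large args are supported but not optimized (every packet but the last is acked); assumes most calls carry small arguments and results over a local area network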
  10. Conclusions?

    • RPC is still popular - e.g. gRPC
      • lesson: understand the workload you are designing for
    • RPC on TCP/IP or UDP/IP