The Role of Distributed State

Distributed State

Information retained in one place that describes something, or is determined by something, somewhere else in the system.

Pros

  • Performance (Cache)
  • Coherency (sequence numbers to detect duplicate or out-of-order messages)
  • Reliability (recover from a cached copy if the central copy dies)

Cons

  • Consistency
    • Detect stale data on use (DNS cache)
    • Prevent inconsistency (Direct to a single copy when updating)
    • Tolerate inconsistency
  • Crash sensitivity
    • A crash on one machine can bring down the whole system
  • Time & Space Overheads
    • Mainly due to maintaining consistency
    • Space: same data on many machines
  • Complexity
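
The "detect stale data on use" strategy can be sketched as a DNS-style cache in which each entry carries a time-to-live. This is a minimal illustration, not how any particular resolver works: `TTLCache`, `fetch`, and the injected `clock` are hypothetical names chosen for the example, and the fake clock only makes the run deterministic.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Cache whose entries expire; staleness is checked when an entry is used."""

    def __init__(self, fetch: Callable[[str], str], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical call to the authoritative copy
        self._ttl = ttl          # seconds an entry may be trusted
        self._clock = clock      # injectable so the example is deterministic
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]                        # still within its TTL
        value = self._fetch(key)                   # stale or missing: refetch
        self._entries[key] = (value, now + self._ttl)
        return value


# Usage with a fake clock and a fake authoritative store.
now = [0.0]
store = {"example.org": "10.0.0.1"}
cache = TTLCache(fetch=lambda k: store[k], ttl=30.0, clock=lambda: now[0])

first = cache.get("example.org")        # miss: fetched from the store
store["example.org"] = "10.0.0.2"       # the authoritative copy changes
within_ttl = cache.get("example.org")   # old value: inconsistency is tolerated
now[0] = 31.0
after_ttl = cache.get("example.org")    # TTL expired: refetched
```

The TTL bounds how long the cached copy may disagree with the source, at the cost of periodic refetches, which is the time overhead listed above.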

NFS

  • Idempotent
  • State almost exclusively on clients
  • Client State
    • File identifiers
    • File data (read)
    • File attributes (lookup)
    • Name translations (lookup, name -> File identifiers)
  • Pros
    • Handles server crashes with ease (clients only notice a delay)
    • Simplicity
  • Cons
    • Performance
      • Changes must be written to disk before a write returns
    • Consistency
      • The server cannot notify other clients when one client modifies a file. Polling helps, but still leaves a window of inconsistency
      • Write-through-on-close causes even more performance issues
    • Semantic difficulties
      • Some operations are impossible to implement without violating statelessness or idempotency
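
To illustrate why idempotency matters for a stateless server, here is a small sketch contrasting a write at an absolute offset with an append. `StatelessFileServer` and its `append` method are hypothetical and exist only for the comparison; NFS itself has no append operation.

```python
class StatelessFileServer:
    """Toy server: every write names the file handle and an absolute offset."""

    def __init__(self) -> None:
        self.files = {}  # fh -> bytearray

    def write(self, fh: int, offset: int, data: bytes) -> int:
        buf = self.files.setdefault(fh, bytearray())
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # pad a hole with zeros
        buf[offset:offset + len(data)] = data         # same result on every retry
        return len(data)

    def append(self, fh: int, data: bytes) -> int:
        # NOT idempotent: the result depends on how many times it runs.
        self.files.setdefault(fh, bytearray()).extend(data)
        return len(data)


server = StatelessFileServer()
for _ in range(2):  # the client retransmits after a lost reply
    server.write(fh=1, offset=0, data=b"hello")
    server.append(fh=2, data=b"hello")

idempotent_result = bytes(server.files[1])
duplicated_result = bytes(server.files[2])
```

Retrying the offset-based write leaves the file unchanged, while retrying the append duplicates the data. This is why the client, not the server, tracks the file offset.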

NFSv2

  • Virtual File System / Vnode
  • Goals
    • Machine & Operating System Independence
    • Crash Recovery
    • Transparent Access
    • Unix Semantics
    • Performance
  • Stateless
  • Root file handle retrieved from mount
  • fhandle: (inode number, inode generation number, filesystem id)
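
The role of the generation number in the fhandle can be sketched as follows. If a file is deleted and its inode is reused, an old handle must not silently refer to the new file. `InodeTable` and `StaleHandle` are hypothetical names for this illustration, and real file handles are opaque to the client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FHandle:
    fsid: int        # filesystem id
    inode: int       # inode number
    generation: int  # bumped each time the inode is reused


class StaleHandle(Exception):
    """Raised when a handle refers to a file that no longer exists."""


class InodeTable:
    """Hypothetical server-side table for one exported filesystem."""

    def __init__(self, fsid: int) -> None:
        self.fsid = fsid
        self.generation = {}   # inode -> generation of its latest use
        self.in_use = set()

    def allocate(self, inode: int) -> FHandle:
        self.generation[inode] = self.generation.get(inode, 0) + 1
        self.in_use.add(inode)
        return FHandle(self.fsid, inode, self.generation[inode])

    def free(self, inode: int) -> None:
        self.in_use.discard(inode)

    def resolve(self, fh: FHandle) -> int:
        current = (fh.fsid == self.fsid and fh.inode in self.in_use
                   and self.generation[fh.inode] == fh.generation)
        if not current:
            raise StaleHandle(fh)
        return fh.inode


table = InodeTable(fsid=1)
old = table.allocate(7)   # a client holds this handle
table.free(7)             # the file is deleted ...
new = table.allocate(7)   # ... and inode 7 is reused for a new file

try:
    table.resolve(old)
    old_is_stale = False
except StaleHandle:
    old_is_stale = True
```

Because the handle itself carries everything needed to validate it, the server keeps no per-client record of which handles it has given out.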

Issues

  • Filesystem Naming
    • A remote filesystem can be mounted on top of another remote filesystem, or even on top of itself -> confusion
    • LOOKUP will not cross a mount point
  • Credentials
    • The whole network must use the same uid/gid assignments -> YP (Yellow Pages)
    • The server can map client root to root or to nobody
  • Lock
    • Not included in NFS (use another RPC service)
    • Two clients writing to the same remote file may get intermixed data on long writes
  • UNIX Open File Semantics
    • Some programs open a file and immediately delete it, but keep reading/writing it
      • The client replaces the remove call with a rename and removes the file later
    • Local: permission is checked only on open; NFS: permission is checked on every request
      • Save credentials at open time
    • In NFS, a read will fail if another client deletes the file
  • Time skew
    • Clocks on the client and server can be out of sync
  • Performance
    • File, attribute, and name caches; increased UDP packet size

NFSv3

V2 Problems

  • Max file size 4GB
    • Fixed: 64bit size/offset
  • Poor sequential write performance
    • Added asynchronous WRITE plus COMMIT
  • Too many GETATTR calls
    • All operations return attributes
      • Pre/Post operation attr
    • READDIRPLUS - read directory with attributes
  • Lack of consistency guarantee
    • Not fixed
    • close-to-open consistency (flush on close, revalidate on open)
  • Some requests are non-idempotent (e.g. REMOVE)
    • Not fixed in protocol
    • Some implementations use a reply cache to work around this
  • Inaccessible file
    • return NFS3ERR_JUKEBOX
  • Read/Write Performance
    • Removed the 8KB transfer size limit
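
The reply cache mentioned above for non-idempotent requests such as REMOVE can be sketched like this. It is a simplified model under stated assumptions: `ReplyCacheServer`, the `(client, xid)` key, and the string replies are illustrative, not the layout used by any real server.

```python
class ReplyCacheServer:
    """Toy server that remembers replies to recent requests, in memory only."""

    def __init__(self, names) -> None:
        self.names = set(names)
        self.replies = {}  # (client, xid) -> reply; lost if the server crashes

    def remove(self, client: str, xid: int, name: str) -> str:
        key = (client, xid)
        if key in self.replies:
            return self.replies[key]   # retransmission: repeat the first reply
        if name in self.names:
            self.names.remove(name)
            reply = "OK"
        else:
            reply = "NOENT"
        self.replies[key] = reply
        return reply


server = ReplyCacheServer(["a.txt"])
first = server.remove("c1", xid=42, name="a.txt")   # executes the remove
retry = server.remove("c1", xid=42, name="a.txt")   # same xid: cached reply
fresh = server.remove("c1", xid=43, name="a.txt")   # new request: file is gone
```

Without the cache, the retransmitted request would return an error even though the original remove succeeded. Since the cache lives only in memory, a server crash between executing the request and resending the reply still exposes the problem.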

Async Write

  • Write verifier:
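    • A unique number that changes if the server crashes and reboots
    • The client checks that the WRITE and COMMIT responses contain the same write verifier; a mismatch means the server rebooted and uncommitted writes must be resent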
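
The verifier check can be sketched as a toy client and server. All class and method names here are hypothetical, and the model ignores byte ranges and partial commits. The point is only that the client keeps each write buffered until a COMMIT returns the same verifier that the WRITE returned.

```python
class Server:
    def __init__(self) -> None:
        self.verifier = 1    # changes on every reboot
        self.unstable = {}   # offset -> data, held in memory only
        self.disk = {}       # offset -> data, survives a crash

    def write(self, offset: int, data: bytes) -> int:
        self.unstable[offset] = data   # asynchronous write: not yet on disk
        return self.verifier

    def commit(self) -> int:
        self.disk.update(self.unstable)
        self.unstable.clear()
        return self.verifier

    def crash_and_reboot(self) -> None:
        self.unstable.clear()          # buffered writes are lost
        self.verifier += 1


class Client:
    def __init__(self, server: Server) -> None:
        self.server = server
        self.pending = {}  # offset -> (data, verifier from the WRITE reply)

    def write(self, offset: int, data: bytes) -> None:
        self.pending[offset] = (data, self.server.write(offset, data))

    def commit(self) -> None:
        while self.pending:
            verifier = self.server.commit()
            # Writes acknowledged under a different verifier may have been lost.
            lost = {off: d for off, (d, v) in self.pending.items() if v != verifier}
            self.pending.clear()
            for off, d in lost.items():
                self.write(off, d)     # resend, then commit again


server = Server()
client = Client(server)
client.write(0, b"a")
server.crash_and_reboot()      # the first write is lost from server memory
client.write(4096, b"b")
client.commit()                # detects the mismatch and resends offset 0
```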

NFSv4

  • Stateful
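    • OPEN (replaces CREATE for regular files), CLOSE
      • Check file access during OPEN (instead of LOOKUP + ACCESS)
      • MKDIR and RMDIR are gone: CREATE makes non-regular objects such as directories, and REMOVE deletes both files and directories
      • Windows needs an atomic open with a share reservation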
    • Locking
      • Use leases
      • If the server reboots, it waits for one lease interval before granting any new lock
      • Sequence number
  • Operation Coalescing
    • COMPOUND - groups multiple operations into a single request
      • not atomic
    • Removes the separate READDIRPLUS request
  • Security
    • Mandate strong RPC security
    • Support access control compatible with UNIX and Windows
    • Not SSL (SSL does not support UDP)
    • NFSv3 negotiates security during mount, but the mount protocol itself is not secure
  • Delegation
    • A server cedes control of file updates and locking state to a client
      • Client callback when delegation revoked
      • No delegation is granted if the server cannot make callback RPCs to the client
  • Others
    • pre-post attributes replaced by a special data structure change_info
    • Server presents a single seamless view of all the exported filesystems
    • Multi-component LOOKUP

Data Structures

  • Filehandles
    • set CURFH during COMPOUND
    • volatile filehandles
      • filehandles can expire
  • Client ID
    • Client presents its identity and a unique 64-bit object (verifier) to the server
    • Server returns a 64-bit clientid
    • If either the client or the server crashes, the client needs a new clientid
    • If the client crashes, the server frees all of its locks
  • State ID
    • Issued when a client requests a lock
    • Client sends clientid, lock_owner_id
    • Server responds with a stateid and associates the stateid with the lock owner info
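
The relationship between clientid, verifier, and stateid can be sketched as below. `LockServer`, `set_clientid`, and the whole-file locks are simplifications assumed for this example. The real protocol uses byte-range locks and a confirmation step that are omitted here.

```python
import itertools


class LockServer:
    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.clients = {}   # client name -> (verifier, clientid)
        self.locks = {}     # stateid -> (clientid, lock_owner, filename)

    def set_clientid(self, name: str, verifier: int) -> int:
        known = self.clients.get(name)
        if known and known[0] == verifier:
            return known[1]                     # same incarnation of the client
        if known:
            # A different verifier means the client rebooted: free its old locks.
            old_id = known[1]
            self.locks = {s: l for s, l in self.locks.items() if l[0] != old_id}
        clientid = next(self._ids)
        self.clients[name] = (verifier, clientid)
        return clientid

    def lock(self, clientid: int, lock_owner: str, filename: str):
        if any(l[2] == filename for l in self.locks.values()):
            return None                         # conflicting lock: denied
        stateid = next(self._ids)
        self.locks[stateid] = (clientid, lock_owner, filename)
        return stateid


server = LockServer()
a_id = server.set_clientid("hostA", verifier=111)
b_id = server.set_clientid("hostB", verifier=999)
a_state = server.lock(a_id, "pid-1", "f")
denied = server.lock(b_id, "pid-9", "f")        # A still holds the lock

new_a_id = server.set_clientid("hostA", verifier=222)  # hostA rebooted
granted = server.lock(b_id, "pid-9", "f")       # A's old lock was freed
```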

Questions

The Role of Distributed State

  1. Introduction: What is state in a computer system? What is distributed state? What are the three benefits Ousterhout gives of distributed state?
    • State: Includes all observable properties of a program and its environment
    • Distributed State: Information retained in one place that describes something remote
    • Benefits
      • Performance - faster to access local state
      • Coherency - each party knows the relevant state locally, no need to guess
      • Reliability - state is replicated
    • Challenges
      • Consistency
        • Detect stale data - treat it as a hint
        • Prevent inconsistency (read-only)
        • Tolerate the problem
      • Crash sensitivity
        • Loss of essential data
      • Time + space overheads
      • Complexity
  2. What are the four reasons given for why distributed state can be bad?
  3. In NFSv2, servers are stateless; what does it mean to be stateless? Is it okay to keep anything in memory? What must be included in requests to the server given that it is stateless?
    • Server does not keep any essential information in volatile storage
    • Anything in RAM? Only non-essential data (performance, cache)
    • Each request completely describes the operation
  4. Most NFSv2 operations are idempotent; what does it mean for an operation to be idempotent? Are all NFSv2 operations idempotent?
    • Idempotent: executing the operation multiple times has the same effect as executing it once
    • How can a client do an append?
      • User program calls f = open()
      • NFS client does a lookup and gets back fh
      • NFS client keeps (fd, fh, offset)
      • User program calls append()
      • NFS client sends write(fh, offset, content)
    • Non-idempotent
      • mkdir - a retry returns a different return code
      • Solution: reply cache
  5. What is the main advantage of having stateless NFS servers? What does a client need to do when a server crashes? What does a server need to do when a client crashes?
    • Advantages: deals with server crashes easily
    • Server crash: Client does nothing special, it just retries operations until the server responds
    • Client crash: Server also does nothing special
  6. Why does the stateless NFSv2 server cause performance problems for client write requests? How do some NFS servers fix this problem?
    • Server cannot reply to a write until the data is persistent
    • Some servers add NVRAM (fast, persistent storage)
  7. To obtain respectable performance, NFS clients may cache data. Why does the stateless server cause problems with data consistency? How does a client find out whether data it is caching is stale? Why does this approach cause performance problems too? When does a client write back modified data to the server?
    • Issues
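      • Scenario: 3 clients and 1 server
      • All clients read the file and cache it
      • C_1 writes to the file
      • Two issues
        • C_1 has not yet flushed the new data to the server
        • C_1 closes and flushes, but C_2 still reads from its stale cache
    • Fundamental problem
      • Multiple copies exist, and a client has no way of knowing that the data was changed elsewhere
    • Clients track the timestamp of cached data
      • A client must call getattr() to compare timestamps
    • Performance problems:
      • Scalability: every client polls the server with getattr()
      • Clients check only every 3 secs, which leaves a window of staleness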
  8. Lock?
    • Separate protocol
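
The polling scheme from question 7 can be sketched as a client that trusts its cached data for a fixed attribute timeout and otherwise compares modification times via getattr. The classes below are a toy model under stated assumptions: the 3-second timeout is taken from the notes, and the injected clock exists only to make the run deterministic.

```python
class Server:
    def __init__(self) -> None:
        self.data, self.mtime, self.getattr_calls = b"v1", 1, 0

    def getattr(self) -> int:
        self.getattr_calls += 1
        return self.mtime

    def read(self) -> bytes:
        return self.data

    def write(self, data: bytes) -> None:
        self.data = data
        self.mtime += 1


class CachingClient:
    ATTR_TIMEOUT = 3.0  # seconds between revalidations

    def __init__(self, server: Server, clock) -> None:
        self.server, self.clock = server, clock
        self.data = None
        self.mtime = None
        self.checked_at = None

    def read(self) -> bytes:
        now = self.clock()
        if self.data is not None and now - self.checked_at < self.ATTR_TIMEOUT:
            return self.data                    # trusted without asking the server
        mtime = self.server.getattr()           # poll the server
        if self.data is None or mtime != self.mtime:
            self.data, self.mtime = self.server.read(), mtime
        self.checked_at = now
        return self.data


t = [0.0]
server = Server()
c2 = CachingClient(server, clock=lambda: t[0])

first = c2.read()        # t=0: fetches v1
server.write(b"v2")      # another client writes through to the server
t[0] = 1.0
stale = c2.read()        # within the timeout: still v1
t[0] = 4.0
fresh = c2.read()        # timeout passed: getattr reveals the change
```

A shorter timeout narrows the stale window but multiplies getattr traffic across all clients, which is the scalability problem noted above.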

NFS Version 3: Design and Implementation

  1. What were the three cited problems with NFSv2? We’ll focus on problems having to do with distributed state… What was their real market incentive?
    • File sizes > 4GB: 64-bit file sizes and offsets
    • Performance problem with forcing data to disk (addressed with async writes)
    • Cache consistency and the overhead of GETATTR calls
  2. Why was keeping the server stateless still a goal? Combining stateless servers and non-idempotent requests is difficult; how did previous NFSv2 implementations deal with non-idempotent operations? With this technique, what happens if the reply to a non-idempotent operation is lost? What happens if the server crashes before it sends the reply? NFSv3 will continue to encourage this same implementation technique.
    • Quick development
      • Implementers already know how to deal with stateless servers
    • How did v2 handle non-idempotent ops?
      • Reply cache
    • Does not handle a server crash (the server loses its cache)
  3. How does NFSv3 improve performance of writes to the server? How does this optimization complicate the client if the server crashes? How does a client now know that the server has crashed? (Detail: Is it possible to have worse consistency semantics with this optimization?)
    • Improvement
      • Client sends many writes and the server buffers them
      • Client closes the file and sends a commit
    • Server crash?
      • Client buffers all writes until the commit succeeds
    • How can the client know the server crashed?
      • Each write response from the server carries a write verifier
      • On commit, the server also returns a write verifier
      • Client checks whether both verifiers are the same
  4. NFSv2 suffered from the problem of too many calls to GETATTR being sent to the server. How does NFSv3 improve performance by reducing the number of calls to get attributes?
    • Attributes are returned with the reply to every operation
  5. When are getattr results useless?
    • V2:
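      • Client writes and the server responds with attributes
      • Client finds that the timestamp is newer
      • Should the client invalidate its cache or keep it? It cannot tell whether only its own write changed the file
    • V3: returns pre- and post-operation attributes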
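
The use of pre- and post-operation attributes can be sketched as a single decision function. This is an illustrative simplification that compares only modification times, and `after_write` is a hypothetical name.

```python
def after_write(cached_mtime: int, pre_mtime: int, post_mtime: int):
    """Return (cache_still_valid, mtime to remember) after the client's own write."""
    if pre_mtime == cached_mtime:
        # The file was unchanged until our write, so the newer timestamp is ours.
        return True, post_mtime
    # Someone else modified the file before our write: cached data is suspect.
    return False, post_mtime


only_our_write = after_write(cached_mtime=10, pre_mtime=10, post_mtime=11)
concurrent_writer = after_write(cached_mtime=10, pre_mtime=12, post_mtime=13)
```

With only the post-operation timestamp, as in V2, both cases look the same to the client, so it cannot safely keep its cache.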

The NFS Version 4 Protocol

  1. What is the significance of the quote “Old Marley was as dead as a door-nail”?
    • no longer stateless
  2. Why does NFSv4 introduce the COMPOUND procedure? What are its semantics? Does it introduce any complexities?
    • Operations run sequentially and stop at the first error
    • Not atomic with respect to requests from other clients
    • Client needs to figure out which operation failed and recover
  3. Why does NFSv4 introduce OPEN and CLOSE operations? What does an exchange between Client A wishing to open file X for reading and writing look like with the server? What operations can Client A now keep local? What happens when Client B wishes to open the file for reading? How much state does the server now track?
    • Reasons
      • Implement file locking - adhere to Windows semantics
      • Better consistency
      • Improved performance
    • Client A opens with exclusive rights to the file (delegation)
      • Client A can keep operations local
    • Client B?
      • If B asks for a delegation -> rejected
      • Server revokes Client A’s delegation
      • Client A sends all modified data back to the server
  4. Adding state to the server complicates crash recovery. What happens now if client A crashes while it has the delegation for the open? How can the server give the delegation to another client? What are some of the problems with this solution?
    • Client crash: server waits for the lease to expire
  5. What happens now if the server crashes; that is, how can the server avoid simultaneously giving client B the delegation?
    • Server crash: it has lost its state and does not know about the delegation
    • Another client asks for a delegation? The server waits for the lease interval to expire
  6. Why are synchronization operations like lock/unlock needed for NFS? How does the lock protocol work? Leases are used for locks as well; what is different about leases for locks compared to delegations? What happens when client A holding a lock reboots? What if the server crashes while client A is holding a lock? What happens when client A tries to refresh its lease?
    • Normal:
      • Client sends info about itself and a verifier; server returns a client id
    • Client crash:
      • Client gets a new verifier and asks for a client id
      • Server notices the different verifier and frees the previous locks
    • Server crash (loses knowledge of all locks)
      • Client refreshes its lock
      • Server doesn’t know the clientid, so the client reacquires its locks
    • Delegation vs Lock
      • Different timeouts
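
Lease expiry and the post-reboot waiting period can be sketched as follows. `LeaseServer`, the 90-second lease, and the `reclaim` flag are assumptions for this illustration. The sketch models a server reboot by constructing a fresh server with no lock state.

```python
class LeaseServer:
    LEASE = 90.0  # assumed lease interval in seconds

    def __init__(self, clock, boot_time: float) -> None:
        self.clock = clock
        self.boot_time = boot_time
        self.locks = {}  # filename -> (client, lease expiry)

    def in_grace(self) -> bool:
        # After a reboot the server remembers no locks, so for one lease
        # interval it only lets previous holders reclaim what they held.
        return self.clock() - self.boot_time < self.LEASE

    def lock(self, client: str, filename: str, reclaim: bool = False) -> bool:
        now = self.clock()
        if self.in_grace() != reclaim:
            return False   # new locks wait out the grace period; late reclaims fail
        holder = self.locks.get(filename)
        if holder and holder[0] != client and holder[1] > now:
            return False   # another client holds an unexpired lease
        self.locks[filename] = (client, now + self.LEASE)
        return True


t = [0.0]
clock = lambda: t[0]

server = LeaseServer(clock, boot_time=-1000.0)       # long-running server
a_locks = server.lock("A", "f")
b_denied = server.lock("B", "f")

t[0] = 10.0
server = LeaseServer(clock, boot_time=10.0)          # crash and reboot: state lost
b_during_grace = server.lock("B", "f")               # new lock refused in grace
a_reclaims = server.lock("A", "f", reclaim=True)     # previous holder reclaims

t[0] = 101.0                                         # A never renews, as if it crashed
b_after_expiry = server.lock("B", "f")               # A's lease ran out at t=100
```

The same expiry rule covers the client-crash case: the server never needs to detect the crash, it only waits for the lease to lapse.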