The Role of Distributed State
Distributed state
Information retained in one place that describes something, or is determined by something, somewhere else in the system.
Pros
- Performance (e.g. caching)
- Coherency (e.g. sequence numbers to detect duplicate or out-of-order messages)
- Reliability (recover from a cached copy if the central copy dies)
Cons
- Consistency
  - Detect stale data on use (e.g. DNS cache)
  - Prevent inconsistency (direct accesses to a single copy when updating)
  - Tolerate inconsistency
- Crash sensitivity
  - A crash on one machine can crash the whole system
- Time & space overheads
  - Time: mainly due to maintaining consistency
  - Space: the same data is stored on many machines
- Complexity
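As an illustration of the "detect stale data on use" strategy listed above, here is a minimal sketch of a cache whose entries expire after a time-to-live, roughly in the spirit of a DNS cache. The class name `TTLCache`, the `fetch` callback, and the injectable `clock` are assumptions made for this example, not part of any real resolver.

```python
import time


class TTLCache:
    """Cache that checks each entry for staleness at the moment it is used."""

    def __init__(self, ttl_seconds, fetch, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.fetch = fetch      # hypothetical callable that asks the authoritative copy
        self.clock = clock      # injectable so the behaviour can be tested
        self.entries = {}       # key -> (value, time the value was cached)

    def get(self, key):
        hit = self.entries.get(key)
        if hit is not None:
            value, cached_at = hit
            if self.clock() - cached_at < self.ttl:
                return value    # still fresh: no remote access needed
        # Missing or stale: go back to the authoritative copy.
        value = self.fetch(key)
        self.entries[key] = (value, self.clock())
        return value
```

The trade-off is the one named in the list: a longer TTL gives better performance but a larger window in which stale data may be returned.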
NFS
- Idempotent operations
- State kept almost exclusively on clients
- Client state
  - File identifiers
  - File data (`read`)
  - File attributes (`lookup`)
  - Name translations (`lookup`, name -> file identifier)
- Pros
  - Handles server crashes with ease (clients only notice a delay)
  - Simplicity
- Cons
  - Performance
    - Changes have to be written to disk before `write` returns
  - Consistency
    - The server cannot notify other clients when one client modifies a file. Polling helps, but still leaves a window of inconsistency
    - Write-through-on-close causes even more performance issues
  - Semantic difficulties
    - Some operations are impossible without violating statelessness and idempotency
NFSv2
- Virtual File System / vnode interface
- Goals
  - Machine & operating system independence
  - Crash recovery
  - Transparent access
  - Unix semantics
  - Performance
- Stateless
  - Root file handle retrieved from `mount`
  - `fhandle`: (inode number, inode generation number, filesystem id)
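To make the role of the inode generation number in the `fhandle` concrete, the sketch below models a server that bumps the generation whenever an inode number is reused, so that an old handle can be recognised as stale. `ToyServer`, `allocate`, `resolve`, and `StaleFileHandle` are hypothetical names for this illustration and do not reflect the actual server code.

```python
from collections import namedtuple

# The three fields named in the notes above.
FHandle = namedtuple("FHandle", ["inode", "generation", "fsid"])


class StaleFileHandle(Exception):
    """Raised when a handle refers to an inode that has since been reused."""


class ToyServer:
    def __init__(self, fsid):
        self.fsid = fsid
        self.generations = {}   # inode number -> current generation number

    def allocate(self, inode):
        # Every (re)use of an inode number gets a new generation.
        gen = self.generations.get(inode, 0) + 1
        self.generations[inode] = gen
        return FHandle(inode, gen, self.fsid)

    def resolve(self, fh):
        # The handle is self-describing, so the server needs no per-client state.
        if fh.fsid != self.fsid or self.generations.get(fh.inode) != fh.generation:
            raise StaleFileHandle(fh)
        return fh.inode
```

Because the handle carries everything needed to validate it, a client can keep using it across server restarts without the server remembering who holds which handle.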
Issues
- Filesystem naming
  - A remote filesystem can be mounted on top of another remote filesystem, or even on top of itself, which causes confusion
  - `lookup` will not cross mount points
- Credentials
  - The whole network needs to use the same uid/gid space -> YP (Yellow Pages)
  - Client root can be mapped to root or to nobody
- Locking
  - Not included in NFS (use another RPC service)
  - Two clients writing to the same remote file may get intermixed data on long writes
- UNIX open file semantics
  - Some programs open a file, immediately delete it, and still read/write it
    - Replace the `remove` call with a `rename`, and remove the renamed file later
  - Local: permission is checked only on `open`; NFS: permission is checked on every request
    - Save credentials at open time
  - In NFS, `read` will fail if another client deletes the file
- Time skew
  - Clocks on the client and the server can be out of sync
- Performance
  - File, attribute, and name caches; increased UDP packet size
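The "rename instead of remove" workaround for open-then-delete programs, described in the list above, can be sketched as follows. This is a toy model: the server directory is a plain `dict`, and `ToyClient` and the `.nfs-hidden-N` naming scheme are assumptions for illustration only.

```python
import itertools


class ToyClient:
    """Client-side handling of 'remove while still open' (hypothetical names)."""

    def __init__(self, server_dir):
        self.dir = server_dir           # dict: name -> contents, stands in for the server
        self.open_names = {}            # fd -> current name on the server
        self.hidden = set()             # names created by rename-instead-of-remove
        self._fds = itertools.count(3)
        self._tmp = itertools.count(1)

    def open(self, name):
        fd = next(self._fds)
        self.open_names[fd] = name
        return fd

    def remove(self, name):
        fds = [fd for fd, n in self.open_names.items() if n == name]
        if not fds:
            del self.dir[name]          # nobody has it open: a real remove
            return
        temp = f".nfs-hidden-{next(self._tmp)}"
        self.dir[temp] = self.dir.pop(name)     # rename on the server
        for fd in fds:
            self.open_names[fd] = temp
        self.hidden.add(temp)

    def read(self, fd):
        return self.dir[self.open_names[fd]]

    def close(self, fd):
        name = self.open_names.pop(fd)
        # Remove the hidden file once the last local opener is gone.
        if name in self.hidden and name not in self.open_names.values():
            self.hidden.discard(name)
            del self.dir[name]
```

Note that this only protects against deletes issued by the same client; as the list says, a `read` can still fail if a different client removes the file.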
NFSv3
V2 Problems
- Max file size of 4GB
  - Fixed: 64-bit sizes and offsets
- Bad sequential write performance
  - Implemented asynchronous `WRITE` plus `COMMIT`
- Too many `GETATTR` calls
  - All operations return attributes
    - Pre/post-operation attributes
  - `READDIRPLUS`: read a directory together with attributes
- Lack of consistency guarantees
  - Not fixed
  - Close-to-open consistency (flush on close, revalidate on open)
- Some requests are non-idempotent (e.g. `REMOVE`)
  - Not fixed in the protocol
  - Some implementations use a reply cache to solve this
- Inaccessible files
  - Return `NFS3ERR_JUKEBOX`
- Read/write performance
  - Remove the 8KB limit
Async Write
- Write verifier
  - A unique number that changes if the server crashes
  - The client checks that the `WRITE` and `COMMIT` responses contain the same write verifier
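The verifier check above can be sketched with a toy server that buffers writes in memory and a client that keeps its data until `COMMIT` confirms it. The class names, the integer verifier, and the `reboot` method are assumptions for this example; a real verifier is an opaque value chosen by the server.

```python
import itertools


class ToyServer:
    _boot_ids = itertools.count(1)

    def __init__(self):
        self.stable = {}        # offset -> data that reached "disk"
        self.reboot()

    def reboot(self):
        self.verifier = next(self._boot_ids)   # changes on every (re)boot
        self.unstable = {}                     # buffered writes are lost in a crash

    def write(self, offset, data):
        self.unstable[offset] = data           # asynchronous: not yet on disk
        return self.verifier

    def commit(self):
        self.stable.update(self.unstable)
        self.unstable.clear()
        return self.verifier


class ToyClient:
    def __init__(self, server):
        self.server = server
        self.pending = {}       # offset -> (data, verifier seen in the WRITE reply)

    def write(self, offset, data):
        self.pending[offset] = (data, self.server.write(offset, data))

    def commit(self):
        v = self.server.commit()
        lost = {off: d for off, (d, wv) in self.pending.items() if wv != v}
        if lost:
            # Verifier mismatch: the server rebooted in between, so resend.
            for off, d in lost.items():
                self.write(off, d)
            return self.commit()
        self.pending.clear()    # everything is on stable storage now
```

The cost of the faster writes is visible in `pending`: the client must hold on to every uncommitted write in case it has to be replayed.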
NFSv4
- Stateful
  - `OPEN` (replaces the normal `CREATE`), `CLOSE`
    - File access is checked during `OPEN` (instead of `LOOKUP` + `ACCESS`)
  - `MKDIR`, `RMDIR` replaced by the generic `CREATE` (for non-regular objects such as directories) and `REMOVE`
  - Windows needs atomic `CREATE` + `LOCK` (share reservation)
- Locking
  - Uses leases
  - If the server reboots, it waits for a lease interval before allowing any new lock
  - Sequence numbers
- Operation coalescing
  - `COMPOUND`: groups operations into a single request
    - Not atomic
  - Removes the `READDIRPLUS` request
- Security
  - Mandates strong RPC security
  - Supports access control compatible with UNIX and Windows
  - Not SSL (does not support UDP)
  - NFSv3 negotiates security during `mount`, but `mount` itself is not secure
- Delegation
  - A server cedes control of file updates and locking state to a client
  - The server calls back the client when the delegation is revoked
  - No delegation if the server cannot make RPCs to the client
- Others
  - Pre/post attributes replaced by a special data structure, `change_info`
  - The server presents a single seamless view of all the exported filesystems
  - Multi-component `LOOKUP`
Data Structures
- Filehandles
  - `CURFH` is set during `COMPOUND`
  - Volatile filehandles
    - Filehandles can expire
- Client ID
  - The client presents itself and a unique 64-bit object to the server
  - The server returns a 64-bit `clientid`
  - If the client or the server crashes, the client needs a new `clientid`
  - If the client crashes, the server frees all of its locks
- State ID
  - When a client requests a lock
    - The client sends `clientid`, `lock_owner_id`
    - The server responds with a `stateid` and associates the `stateid` with the lock owner info
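The relationship between the client's unique object, the `clientid`, and the `stateid` can be sketched as two small tables on the server. This is a simplified model under stated assumptions: `ToyStateServer`, `set_clientid`, and `lock` are hypothetical names, and plain integers stand in for the 64-bit values.

```python
import itertools


class ToyStateServer:
    def __init__(self):
        self._ids = itertools.count(1)
        self.clients = {}   # client name -> (unique object from the client, clientid)
        self.states = {}    # stateid -> (clientid, lock_owner_id)

    def set_clientid(self, name, verifier):
        known = self.clients.get(name)
        if known and known[0] == verifier:
            return known[1]             # same client incarnation: same clientid
        if known:
            # Different unique object: the client restarted, so free its locks.
            old_id = known[1]
            self.states = {s: o for s, o in self.states.items() if o[0] != old_id}
        clientid = next(self._ids)
        self.clients[name] = (verifier, clientid)
        return clientid

    def lock(self, clientid, lock_owner_id):
        stateid = next(self._ids)
        # The stateid is shorthand for "this lock owner on this client".
        self.states[stateid] = (clientid, lock_owner_id)
        return stateid
```

Later requests only need to carry the `stateid`; the server looks up the owner information it recorded when the lock was granted.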
Questions
The Role of Distributed State
- Introduction: What is state in a computer system? What is distributed state? What are the three benefits Ousterhout gives of distributed state?
  - State: includes all observable properties of a program and its environment
  - Distributed state: information retained in one place that describes something remote
  - Benefits
    - Performance: local state is faster to access
    - Coherency: with the state known locally, there is no need to guess
    - Reliability: state is replicated
- What are the four reasons given for why distributed state can be bad?
  - Consistency
    - Detect stale data (treat it as a hint)
    - Prevent inconsistency (e.g. read-only data)
    - Tolerate the problem
  - Crash sensitivity
    - Essential data can be lost
  - Time + space overheads
  - Complexity
- In NFSv2, servers are stateless; what does it mean to be stateless? Is it okay to keep anything in memory? What must be included in requests to the server given that it is stateless?
  - The server does not keep any essential information in volatile storage
  - Anything in RAM? Only non-essential data (performance, cache)
  - Each request completely describes the operation
- Most NFSv2 operations are idempotent; what does it mean for an operation to be idempotent? Are all NFSv2 operations idempotent?
  - Idempotent: executing the operation multiple times has the same effect as executing it once
  - How can a client do `append`?
    - User program: `f = open()`
    - NFS client: `lookup` & return `fh`
    - NFS client keeps `(fd, fh, offset)`
    - User program: `append()`
    - NFS client sends `write(fh, offset, content)`
  - Non-idempotent: `mkdir`
    - The return code differs when the request is replayed
    - Solution: replay cache
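The append steps above can be sketched as follows: the client remembers `(fd, fh, offset)` and turns each append into a `write` at an explicit offset, so a retransmitted request lands on the same bytes. `ToyServer`, `ToyClient`, the `size` call, and the `retries` parameter are assumptions for illustration; integer file handles stand in for real ones.

```python
class ToyServer:
    def __init__(self, files):
        # files: name -> initial bytes; handles are small integers here.
        self.names = {name: fh for fh, name in enumerate(files, start=1)}
        self.data = {self.names[n]: bytearray(body) for n, body in files.items()}

    def lookup(self, name):
        return self.names[name]

    def size(self, fh):
        return len(self.data[fh])

    def write(self, fh, offset, content):
        buf = self.data[fh]
        if offset > len(buf):
            buf.extend(b"\0" * (offset - len(buf)))   # pad any hole with zeros
        buf[offset:offset + len(content)] = content   # same result if repeated


class ToyClient:
    def __init__(self, server):
        self.server = server
        self.open_files = {}    # fd -> [fh, offset], kept only on the client
        self.next_fd = 3

    def open(self, name):
        fh = self.server.lookup(name)
        fd, self.next_fd = self.next_fd, self.next_fd + 1
        self.open_files[fd] = [fh, self.server.size(fh)]   # start at end of file
        return fd

    def append(self, fd, content, retries=1):
        fh, offset = self.open_files[fd]
        for _ in range(retries):    # simulated retransmissions are harmless
            self.server.write(fh, offset, content)
        self.open_files[fd][1] = offset + len(content)
```

The server never learns that an append happened; it only sees positioned writes, which is what keeps the operation idempotent.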
- What is the main advantage of having stateless NFS servers? What does a client need to do when a server crashes? What does a server need to do when a client crashes?
  - Advantage: server crashes are easy to deal with
  - Server crash: the client does nothing special; it just retries its operations
  - Client crash: the server also does nothing special
- Why does the stateless NFSv2 server cause performance problems for client write requests? How do some NFS servers fix this problem?
  - The server can’t return from `write` until the data is persisted
  - Fix: add NVRAM (faster, persistent storage)
- To obtain respectable performance, NFS clients may cache data. Why does the stateless server cause problems with data consistency? How does a client find out whether data it is caching is stale? Why does this approach cause performance problems too? When does a client write back modified data to the server?
  - Issues
    - 3 clients and 1 server
    - All clients read and cache the file
    - C_1 writes
    - Two issues
      - C_1 doesn’t flush the new data to the server
      - C_1 closes and flushes, but C_2 re-reads its own cache
  - Fundamental problem
    - Multiple copies, with no way of knowing that the data was changed
  - The client tracks the timestamp of its cached data
    - The client needs `getattr()`
  - Performance problems
    - Scalability problem
    - Check only every 3 secs
- Lock?
  - Separate protocol
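The timestamp-based staleness check discussed above (poll with `getattr()`, but only every 3 seconds) can be sketched like this. `ToyServer`, `CachingClient`, and the logical `mtime` counter are assumptions for this example, not the real client implementation.

```python
class ToyServer:
    def __init__(self):
        self.files = {}             # fh -> (data, mtime)
        self.getattr_calls = 0
        self._mtime = 0

    def write(self, fh, data):
        self._mtime += 1            # logical modification time
        self.files[fh] = (data, self._mtime)

    def getattr(self, fh):
        self.getattr_calls += 1
        return self.files[fh][1]

    def read(self, fh):
        return self.files[fh]       # (data, mtime)


class CachingClient:
    def __init__(self, server, clock, poll_interval=3.0):
        self.server = server
        self.clock = clock
        self.poll = poll_interval
        self.cache = {}             # fh -> (data, mtime, time of last check)

    def read(self, fh):
        now = self.clock()
        entry = self.cache.get(fh)
        if entry:
            data, mtime, checked = entry
            if now - checked < self.poll:
                return data         # trusted without asking: may be stale
            if self.server.getattr(fh) == mtime:
                self.cache[fh] = (data, mtime, now)
                return data         # validated against the server
        data, mtime = self.server.read(fh)
        self.cache[fh] = (data, mtime, now)
        return data
```

A shorter poll interval narrows the window of inconsistency but multiplies `getattr` traffic across all clients, which is the scalability problem noted above.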
NFS Version 3: Design and Implementation
- What were the three cited problems with NFSv2? We’ll focus on problems having to do with distributed state… What was their real market incentive?
  - File sizes: files > 4GB need 64-bit file sizes
  - Performance problem of forcing data to disk (addressed by async writes)
  - Cache consistency + overhead of `getattr` calls
- Why was keeping the server stateless still a goal? Combining stateless servers and non-idempotent requests is difficult; how did previous NFSv2 implementations deal with non-idempotent operations? With this technique, what happens if the reply to a non-idempotent operation is lost? What happens if the server crashes before it sends the reply? NFSv3 will continue to encourage this same implementation technique.
  - Quick development
  - Already know how to deal with statelessness
  - How did v2 handle non-idempotent ops?
    - Replay cache
    - Does not handle a server crash (the server loses its cache)
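The replay cache and its weakness on a server crash can be sketched as follows. This is a toy model: `ToyServer`, the `(client, xid)` key, and the string status codes are assumptions for illustration.

```python
class ToyServer:
    def __init__(self):
        self.dirs = set()           # stands in for directories on disk
        self.reply_cache = {}       # (client, xid) -> reply, volatile memory only

    def mkdir(self, client, xid, path):
        key = (client, xid)
        if key in self.reply_cache:
            # Retransmission of a request already executed: replay the old reply.
            return self.reply_cache[key]
        reply = "EEXIST" if path in self.dirs else "OK"
        self.dirs.add(path)
        self.reply_cache[key] = reply
        return reply

    def crash_and_reboot(self):
        # Disk contents survive; the cache does not.
        self.reply_cache.clear()
```

If the first reply is lost and the client retransmits with the same `xid`, it still sees `OK`. If the server crashed in between, the retransmission is executed again and the client sees `EEXIST` for a directory it created itself.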
- How does NFSv3 improve performance of writes to the server? How does this optimization complicate the client if the server crashes? How does a client now know that the server has crashed? (Detail: Is it possible to have worse consistency semantics with this optimization?)
  - Improvement
    - The client sends many `write` requests and the server buffers them
    - The client closes the file and sends `commit`
  - Server crash?
    - The client buffers all writes until the commit succeeds
  - How can the client know that the server crashed?
    - The server’s response to `write` returns a write verifier
    - On `commit`, the server also returns a write verifier
    - The client checks whether both verifiers are the same
- NFSv2 suffered from the problem of too many calls to `GETATTR` being sent to the server. How does NFSv3 improve performance by reducing the number of calls to get attributes?
  - Return all `attr` for all ops
- When are getattr results useless?
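  - V2:
    - The client writes and the server responds with attributes
    - The client finds that the timestamp is new
    - Should the client invalidate its cache or keep it? It cannot tell whether the change came from its own write
  - V3: return pre- and post-operation attributes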
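The way pre- and post-operation attributes resolve the "invalidate or keep?" question can be sketched as a small decision function. The function name and the use of a bare modification time as the only attribute are simplifying assumptions for this example.

```python
def handle_write_reply(cached_mtime, pre_mtime, post_mtime):
    """Decide what to do with cached data after our own write.

    Returns (keep_cache, new_cached_mtime).
    """
    if pre_mtime == cached_mtime:
        # The file was unchanged between our cached view and our write,
        # so the new timestamp is explained entirely by our own write.
        return True, post_mtime
    # Someone else modified the file before our write reached the server.
    return False, post_mtime
```

With only the post-operation timestamp (the V2 situation), the two cases are indistinguishable, so a cautious client has to throw away its cache after every write.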
The NFS Version 4 Protocol
- What is the significance of the quote “Old Marley was as dead as a door-nail”?
  - NFS is no longer stateless
- Why does NFSv4 introduce the `COMPOUND` procedure? What are its semantics? Does it introduce any complexities?
  - Operations run sequentially and stop on the first error
  - Not atomic relative to other requests
  - The client needs to figure out which operation failed
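The `COMPOUND` semantics above (sequential evaluation, stop on the first error, a current filehandle threaded through the operations) can be sketched with a toy evaluator. The nested-`dict` filesystem, the short status strings, and the three supported operations are assumptions for illustration, not the real protocol encoding.

```python
def run_compound(ops, root):
    """Evaluate ops in order; return a list of (op name, status, value).

    root is a nested dict: directories are dicts, files are bytes.
    """
    curfh = None            # the "current filehandle" shared by the ops
    results = []
    for name, arg in ops:
        status, value = "OK", None
        if name == "PUTROOTFH":
            curfh = root
        elif name == "LOOKUP":
            if isinstance(curfh, dict) and arg in curfh:
                curfh = curfh[arg]
            else:
                status = "NOENT"
        elif name == "READ":
            if isinstance(curfh, bytes):
                value = curfh
            else:
                status = "INVAL"
        else:
            status = "NOTSUPP"
        results.append((name, status, value))
        if status != "OK":
            break           # later ops are never attempted
    return results
```

Operations that ran before the failure are not undone, so the client has to inspect the result list to learn how far the request got.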
- Why does NFSv4 introduce `OPEN` and `CLOSE` operations? What does an exchange between Client A wishing to open file X for reading and writing look like with the server? What operations can Client A now keep local? What happens when Client B wishes to open the file for reading? How much state does the server now track?
  - Reasons
    - Implement file locking and adhere to Windows semantics
    - Better consistency
    - Improved performance
  - Client A opens the file with exclusive rights to it
    - The client can keep operations local
  - Client B?
    - If B asks for a delegation -> rejected
    - The server revokes Client A’s delegation
    - Client A sends all its data back to the server
- Adding state to the server complicates crash recovery. What happens now if client A crashes while it has the delegation for the open? How can the server give the delegation to another client? What are some of the problems with this solution?
  - Client crash: the server waits for the lease to expire
- What happens now if the server crashes; that is, how can the server avoid simultaneously giving client B the delegation?
  - Server crash: the state is gone, so the server doesn’t know about the delegation
  - Another client asks for a delegation? The server waits for the lease interval to expire
- Why are synchronization operations like `lock`/`unlock` needed for NFS? How does the lock protocol work? Leases are used for locks as well; what is different about leases for locks compared to delegations? What happens when client A holding a lock reboots? What if the server crashes while client A is holding a lock? What happens when client A tries to refresh its lease?
  - Normal:
    - The client sends info about itself & a verifier; the server returns a client id
  - Client crash:
    - The client gets a new verifier and asks for a client id
    - The server notices the different verifier and frees the previous locks
  - Server crash (loses knowledge of all locks):
    - The client tries to refresh its lock
    - The server doesn’t know the clientid, so the client reacquires its locks
  - Delegation vs. lock
    - Different timeouts
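Lease expiry after a client crash and the wait after a server reboot can be sketched with a toy lock server. `LeaseServer`, the `reclaim` flag, the status strings, and the 30-second lease are assumptions for this example. For brevity the sketch does not verify that a reclaiming client really held the lock before the crash, and lease renewal is left out.

```python
class LeaseServer:
    def __init__(self, clock, lease=30.0):
        self.clock = clock
        self.lease = lease
        self.locks = {}                 # path -> (clientid, lease expiry time)
        self.grace_until = clock()      # assumption: no wait on the very first boot

    def reboot(self):
        self.locks = {}                 # all lock state is lost in the crash
        # Old holders get one lease interval to reclaim before new locks are granted.
        self.grace_until = self.clock() + self.lease

    def lock(self, clientid, path, reclaim=False):
        now = self.clock()
        if now < self.grace_until and not reclaim:
            return "GRACE"
        holder = self.locks.get(path)
        if holder and holder[0] != clientid and holder[1] > now:
            return "DENIED"             # someone else holds an unexpired lease
        self.locks[path] = (clientid, now + self.lease)
        return "OK"
```

A crashed client simply stops renewing, so its lock becomes available once the lease runs out. After a server reboot, the wait gives surviving clients time to reacquire what they held before anyone else can take it.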