LBFS + Lease
Cross-file similarities
- Files often contain a number of segments in common with other files or previous versions of the same file
- Divide files into chunks and index the chunks

Design
- Only close-to-open consistency
- Cache
- Chunk
  - Fingerprint every overlapping 48-byte region
  - If the last 13 bits of a region's fingerprint equal a magic value, place a breakpoint
  - Enforce min/max chunk size: 2K/64K
- Chunk database
  - Key: first 64 bits of the SHA-1 hash
  - Value: (file, offset, count), used only as a hint
- READ
  - Client -> Server: GETHASH
  - Server -> Client: a vector of hashes
  - Client requests the missing data
- WRITE

Implementation Notes

Motivation
- File system for low-bandwidth networks
- Existing solutions
  - Local copy
    - work on a local copy, copy to the server when done
    - manual, mistakes, conflicts
  - Work remotely
    - ssh to the remote machine
- Goal: minimize bandwidth
- Related technique: compression

Workload Assumptions
- Make small changes to files, similar versions e....
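
The chunking rule and chunk database from the Design notes (48-byte regions, 13-bit magic value, 2K/64K bounds, 64-bit SHA-1 prefix as key) can be sketched roughly as follows. This is a minimal illustration, not the LBFS implementation: it swaps in a simple polynomial rolling hash where LBFS uses Rabin fingerprints, and `MAGIC`, `BASE`, `MOD`, `chunk_boundaries`, `chunk_key`, and `index_file` are hypothetical names and values chosen for the example.

```python
import hashlib

WINDOW = 48                 # size of each overlapping region, in bytes
MASK = (1 << 13) - 1        # examine the low 13 bits of the fingerprint
MAGIC = 0x78                # assumed magic value; the notes do not give one
MIN_CHUNK = 2 * 1024        # 2K minimum chunk size
MAX_CHUNK = 64 * 1024       # 64K maximum chunk size
BASE = 257                  # assumed rolling-hash parameters (not Rabin)
MOD = (1 << 61) - 1


def chunk_boundaries(data: bytes) -> list[int]:
    """Return the exclusive end offset of every chunk in data."""
    ends = []
    start = 0                               # start of the current chunk
    h = 0                                   # hash of the last WINDOW bytes
    top = pow(BASE, WINDOW - 1, MOD)        # weight of the oldest byte
    for i, b in enumerate(data):
        if i >= WINDOW:
            # Slide the window: drop the byte that falls out of it.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        size = i + 1 - start
        at_break = i + 1 >= WINDOW and (h & MASK) == MAGIC
        # A breakpoint only counts once the chunk has reached MIN_CHUNK;
        # a cut is forced when the chunk reaches MAX_CHUNK.
        if (at_break and size >= MIN_CHUNK) or size >= MAX_CHUNK:
            ends.append(i + 1)
            start = i + 1
    if start < len(data):
        ends.append(len(data))              # trailing partial chunk
    return ends


def chunk_key(chunk: bytes) -> bytes:
    """First 64 bits (8 bytes) of the chunk's SHA-1 hash."""
    return hashlib.sha1(chunk).digest()[:8]


def index_file(db: dict, name: str, data: bytes) -> None:
    """Record (file, offset, count) for each chunk, keyed by chunk_key.

    Entries are hints only: a reader should recompute the hash of the
    referenced bytes before trusting a lookup.
    """
    start = 0
    for end in chunk_boundaries(data):
        db[chunk_key(data[start:end])] = (name, start, end - start)
        start = end
```

Because breakpoints depend only on the contents of the 48-byte window, an insertion or deletion tends to change only the chunks near the edit; chunks further along keep the same boundaries and hashes, which is what lets the READ exchange skip data the client already has.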