Nomadic File Systems
Download
Report
Transcript Nomadic File Systems
Nomadic File Systems
Uri Moszkowicz
05/02/02
Nomadic File Systems
Overview
Coda
Bayou
Previous papers
Andrew File System
Epidemic protocols
Coda
General idea
Ability to continue work when file servers
are inaccessible through use of caching
Transparent operation through optimistic
conflict resolution
Coda
Coda
Venus States
Hoarding – Hoard data anticipating disconnect
Emulation – Use of local cache
Reintegration – Updating local cache
Coda
Hoarding priorities
Recent usage
Hoard profiles – user specified
Children and descendants may be included
(ie directories)
Hoard walking
Since priority based on recent usage, every once
in a while need to update file system to reflect
priorities
10 min intervals chosen?
Coda
Emulation
Coda is faithful; all updates accepted
At reintegration, validity is checked by
servers
All writes are logged to be replayed at
reintegration
Logs may use up all disk space; not
gracefully handled (writes disabled)
Coda
Reintegration
Updates propagated to servers and vice versa
(one volume at a time)
4 stages
Log parsed, files locked
Validation: conflict detection, security, integrity
Fetching: shadow files are transferred (as needed)
Commit: locks released and changes finalized
Failures must be manually examined from logs
Coda
Performance
Reintegration time
Disk size
Likelihood of conflicts
Significantly affected by hardware (~1990)
and application
Bayou Anti-entropy
Address Coda limitation
Mobile replicas cannot reconcile amongst
themselves in Coda
Also
Incremental progress (not volume)
Efficient storage management
Light-weight replica creation
Bayou Anti-entropy
Simple anti-entropy protocol
One-way operation between pairs
Propagation of write operations
Accept-order maintained
Replicas can be ordered in any topology
Amount of data propagated proportional to
update activity at replicas, not size of data
being replicated
Bayou Anti-entropy
Basic operation
version vector sent from one replica to another
Receiving server traverses write log and requests
all writes it has not yet seen
Write log
Prune write log entries when stable ie all replicas
have seen it
Re-execute subsequent writes when learn of
earlier write
Each write log entry maintains a primary replica to
track its stability or encode stability in messages
Bayou Anti-entropy
Bayou Anti-entropy
Tuple Store
Database (SQL) obtained by executing writes in order; cache
for read requests
Two views: full and committed
Write Log - Ordered list of all writes ever received
Undo Log - Facilitates reordering of write log
“O” vector – for each vector timestamp of latest write
that has been discarded (avoid re-accepting earlier
writes because later one omitted)
“C” vector – tracks committed writes
“F” vector – tracks full writes
“C”+”F” used to determine what to needs update
Bayou Anti-entropy
Conflict Detection & Resolution
Application specified and included in write
Dependency checks (detection)
write/write + read/write
Merge procedures (resolution)
Failures resolved later by user through application
Bayou Anti-entropy
Bayou Anti-entropy
Consistency
Servers can receive writes from other
servers & clients, but writes are applied
immediately when received
Non-determinism must be removed since
servers can be in different states
Environment information (ie system clock)
Exceeding resource limits
Bayou Anti-entropy
Security
Early version relied on central
authentication server
No servers?
Possible invalid updates maintained on replicas
until connected
Later version relies on cryptography
Bayou Anti-entropy
Protocol extensions
Transportation of write log through
portable media
Causal order through logical clocks & total
order of eventual consistency
Light-weight server creation and deletion
(not specified at volume creation time)
Bayou Anti-entropy
Disadvantages
Security
Large network traffic during checkpoint
synchronization
Every replica must know about every other
replica: complexity grows with size of
network
Bayou Anti-entropy
Policies
When to reconcile
Choosing which replicas to reconcile with
Reachable, updatedness, primary status
Deciding when to truncate write-log
Periodic, Manual, System Triggered
Available HD & RAM space, network usage
Selecting server to create replicas
Connection bandwidth, completeness of write-logs
Bayou Anti-entropy
Performance
~1000 servers
Hierarchy schemes to expand network size
Size(tentative writes) = 10*size(committed
writes) due to access control certificate
Linear increase in anti-entropy time with
number of replicas
Bayou Anti-entropy
Examples
Meeting room scheduler
Users select several possible meeting times
Upon connection, meeting times resolved
based on chosen times or prompted for new
time if irresolvable conflict
Tentative meeting times exposed to other users
Bayou Anti-entropy
Examples
Bibliographic database
Users enter bibliographic entries
At resolution, dependency check can access
database to see if two entries are the same
(just typed differently), merge, or prompt user
User may opt to be disconnected to avoid
expensive cellular fees