rocks - cs.wisc.edu - University of Wisconsin

Reliable Sockets:
A Foundation for Mobile
Communications
Victor C. Zandy
Computer Sciences Department
University of Wisconsin-Madison
©2001 Victor C. Zandy
Paradyn/Condor Week (March 2001, Madison WI)
Motivation
• Network communication is unreliable
  • Modems disconnect spontaneously
  • Computers run on batteries
• Many IP addresses are not static
  • Assignment by DHCP
  • Mobile computers move across networks
• Applications do not respond well to these failures
Reliable Sockets (Rocks)
• Sockets that tolerate
  • IP address changes
  • Link failures
  • Extended periods of disconnection
• Automatically detect failures and recover
• No loss of in-flight data
• Applications are oblivious to failures
Rocks are General Purpose
• Rocks can be used for
  • UDP and TCP (and everything over them)
  • Connected sockets and listening sockets
• Interoperate with plain sockets
• Transparent, user-level, and portable
Applications
• Remote shells
  • Mail, editor
  • Long-running builds
• Remote GUI-based applications
  • Office apps
• Mobile and reliable UDP
  • Streaming video and audio
• Process migration
  • Checkpoint Condor jobs with open sockets
  • Migrate desktop applications
Related Work
• Emphasize mobility, not reliability
• No extended periods of disconnection
• Lack mechanisms for failure detection and automatic reconnection
• Based on kernel modifications
• Must be root to install
• Unportable
• Protocol internals
• Mobile IP, TCP Migrate, MSOCKS
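TCP Sockets
[Figure: Host A (IP 128.1.2.3). The application calls the sockets API; beneath it, in the kernel, a TCP socket on port 10000 holds a send buffer and a receive buffer and connects to the network.]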
TCP Data Flow
[Figure, four animation frames: Host A (IP 128.1.2.3, port 10000) and Host B (IP 144.0.1.1, port 22), each with send and receive buffers below the sockets API.]
1. The application on Host A writes bytes 1 2 3 4 5; bytes 1 2 3 are in Host A's send buffer.
2. Bytes 1 2 3 arrive in Host B's receive buffer; bytes 4 5 are in Host A's send buffer.
3. The bytes that have been written but not yet read by the application on Host B are labeled in-flight data.
4. The application on Host B reads bytes 1 2 3.
Socket Failures
• New IP address
  • Host movement
  • Lease expiry
  • Process migration
• Disconnection
  • Host suspension
  • Link failure
[Figure: Host A (IP 128.1.2.3, port 10000) and Host B (IP 144.0.1.1, port 22), each with send and receive buffers below the sockets API.]
Effect on Applications
• Sockets API calls fail
• In-flight data is lost
[Figure: the write on Host A (IP 128.1.2.3, port 10000) and the read on Host B (IP 144.0.1.1, port 22) both fail.]
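As a rough illustration of the first point, the sketch below uses a connected socket pair on one machine to stand in for Host A and Host B, and closes one end to stand in for the failure. Both choices are assumptions made for the example: a real link failure or address change would typically surface more slowly. The helper name `send_after_peer_close` is hypothetical, and the sketch assumes a Unix-like system.

```python
import socket


def send_after_peer_close() -> bool:
    """Return True if send() fails once the peer end has gone away."""
    # A connected pair standing in for Host A and Host B (assumption).
    a, b = socket.socketpair()
    try:
        b.close()                 # simulate losing the peer
        try:
            a.send(b"4 5")        # the application keeps writing
        except OSError:           # e.g. BrokenPipeError or ConnectionResetError
            return True
        return False
    finally:
        a.close()


failed = send_after_peer_close()
```

An ordinary application sees only the exception; nothing in the sockets API tells it which bytes the peer application actually received.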
What Rocks Do
• Detect socket failure
• Hide failure from the application
• Automatically reconnect
• Recover in-flight data
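Reliable Sockets
[Figure: Host A (IP 128.1.2.3). The application calls the sockets API provided by the rocks library, where a rock holds an in-flight buffer. The rocks library in turn calls the sockets API of the kernel TCP socket (port 10000, send and receive buffers), which connects to the network.]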
Rock Data Flow
• Count bytes read.
• Copy data.
• Count bytes sent.
[Figure: Host A (IP 128.1.2.3, port 10000) and Host B (IP 144.0.1.1, port 22). On each host the rock, with its in-flight buffer, sits between the application's sockets API calls (read, write) and the kernel sockets API.]
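The bookkeeping on this slide can be sketched as a small class. This is a minimal sketch, not the rocks implementation: the class name `InFlightBuffer`, its method names, and the fixed `capacity` bound (for instance, the combined size of the kernel send and receive buffers) are assumptions made for illustration.

```python
class InFlightBuffer:
    """Per-rock record of recent bytes written plus byte counters (hypothetical)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity   # assumed bound on bytes that can be in flight
        self.data = bytearray()    # copies of the most recent bytes sent
        self.bytes_sent = 0        # total bytes the application has written
        self.bytes_read = 0        # total bytes the application has read

    def on_write(self, payload: bytes) -> None:
        """Copy data and count bytes sent."""
        self.data += payload
        self.bytes_sent += len(payload)
        excess = len(self.data) - self.capacity
        if excess > 0:
            del self.data[:excess]  # keep only the newest `capacity` bytes

    def on_read(self, nbytes: int) -> None:
        """Count bytes read."""
        self.bytes_read += nbytes


buf = InFlightBuffer(capacity=4)
buf.on_write(b"123")
buf.on_write(b"45")
buf.on_read(3)
```

After these calls the buffer holds the last four bytes written, while the counters record five bytes sent and three read.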
Response to Failure
[Figure, animation: Host A (IP 128.1.2.3, port 10000) and Host B (IP 144.0.1.1, port 22), each with a rock and its in-flight buffer between the application and the kernel sockets API. The connection between the hosts fails.]
• Each rock detects the failure within seconds.
• Each rock suspends:
  • Close TCP socket
  • Block application
  • Attempt to reconnect
• New IP address: Host A moves to IP 207.10.0.1.
Recovery
[Figure, animation: a new TCP connection joins Host A (now IP 207.10.0.1, port 30001) and Host B (IP 144.0.1.1, port 22). Each rock still holds its in-flight buffer.]
• Authenticate.
• Retransmit in-flight data not received by remote application.
• Then resume the rock.
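The retransmission step can be sketched as simple arithmetic on the byte counters. This assumes that, after reconnecting, each rock learns how many bytes its peer has received; that exchange, and the function name `bytes_to_retransmit`, are assumptions made for the example rather than details given in the slides.

```python
def bytes_to_retransmit(in_flight: bytes, bytes_sent: int,
                        peer_bytes_received: int) -> bytes:
    """Return the tail of the in-flight buffer that the peer never received.

    in_flight           -- copy of the most recent bytes written locally
    bytes_sent          -- total bytes written locally so far
    peer_bytes_received -- total reported by the peer (assumed exchange)
    """
    missing = bytes_sent - peer_bytes_received
    if missing < 0 or missing > len(in_flight):
        raise ValueError("counters are inconsistent with the in-flight buffer")
    return in_flight[len(in_flight) - missing:]


# Host A wrote bytes 1 to 5; Host B reports receiving only the first three.
resend = bytes_to_retransmit(b"12345", bytes_sent=5, peer_bytes_received=3)
```

Here `resend` holds the last two bytes, matching the "4 5" left in Host A's send buffer in the TCP Data Flow slides.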
Reconnection
[Figure: Host A (128.1.2.3) connected to Host B (144.0.1.1).]
• Connection end moves to new IP address: Host A changes from 128.1.2.3 to 207.10.0.1.
• Each end attempts to reconnect to its peer at its last known address. The attempt to the old address of the end that moved does not complete.
• As long as one end does not move, they eventually reconnect.
• They cannot reconnect if both ends move (Host A to 207.10.0.1, Host B to 101.8.7.1): neither connection completes.
• Network proxy: each host asks the proxy where its peer is ("Where is A?", "Where is B?").
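The connecting half of this procedure might look like the following sketch. The function names, the fixed retry count, and the delay are hypothetical; a real rock would also accept incoming connections from its peer and could consult the proxy, neither of which is shown.

```python
import socket
import time
from typing import Any, Callable, Optional, Tuple

Address = Tuple[str, int]


def tcp_connect(addr: Address) -> socket.socket:
    """Default connector: a plain TCP connection with a short timeout."""
    return socket.create_connection(addr, timeout=2.0)


def reconnect(last_known: Address, attempts: int, delay: float,
              connect: Callable[[Address], Any] = tcp_connect) -> Optional[Any]:
    """Retry the peer's last known address; return a connection or None."""
    for _ in range(attempts):
        try:
            return connect(last_known)
        except OSError:
            time.sleep(delay)   # peer unreachable for now; try again later
    return None
```

Passing the connector as a parameter keeps the retry logic separate from the transport, so the loop can be exercised without a network.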
Expanded Rocks API
• API allows rocks-aware applications to control rocks behavior
• Fine control of reconnection
  • Notification when rock is suspended
  • Manual control of reconnection addresses
  • Notification when rock is resumed
Expanded Rocks API
• New socket options
  • Extended getsockopt and setsockopt
• Policies
  • Which ports are excluded?
• Parameters
  • Reconnection timeout
  • Sensitivity to connection failures
Performance
             Sockets    Rocks      Slowdown
10MB FTP     27.38 s    26.48 s    1x
Connect      5.6 ms     20.4 ms    4x
• Reconnection latency
  • 1-2 seconds to reconnect
  • Usually less than time to acquire DHCP lease
• Suspended rocks have negligible overhead
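The slowdown column appears to be the ratio of the rocks time to the plain sockets time, rounded to a whole number. A quick check using the figures from the table (the variable names are only for illustration):

```python
# Measurements taken from the performance table.
ftp_sockets_s, ftp_rocks_s = 27.38, 26.48
connect_sockets_ms, connect_rocks_ms = 5.6, 20.4

# Slowdown = time with rocks / time with plain sockets.
ftp_slowdown = ftp_rocks_s / ftp_sockets_s                 # about 0.97
connect_slowdown = connect_rocks_ms / connect_sockets_ms   # about 3.64

print(f"10MB FTP: {ftp_slowdown:.2f}x, Connect: {connect_slowdown:.2f}x")
```

Rounded, these give the 1x and 4x shown in the table.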
Conclusion
• Rocks make sockets completely reliable
  • Protect from link failures and IP address changes
  • Use with any application
• Our release is ready for download
  • Ready for remote shells and remote GUIs
  • http://www.cs.wisc.edu/~zandy/rocks
• See the demo on Wednesday!
Detecting Failures
• Users expect quick response to failures.
• Heartbeat:
  • Periodically send heartbeat to peer
  • Watch for too many missed heartbeats
• Sockets API Errors:
  • Too slow to rely upon
  • Not reported for idle connections
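The heartbeat check can be sketched as a counter of missed intervals. The class name `HeartbeatMonitor`, the interval, and the missed-heartbeat threshold are assumptions made for the example; the slides do not give the values rocks uses. Timestamps are passed in explicitly so the logic is easy to follow and to test.

```python
class HeartbeatMonitor:
    """Declare a failure after too many heartbeats from the peer are missed."""

    def __init__(self, interval: float, max_missed: int) -> None:
        self.interval = interval      # seconds between expected heartbeats
        self.max_missed = max_missed  # missed heartbeats tolerated
        self.last_heard = 0.0         # time the last heartbeat arrived

    def heard(self, now: float) -> None:
        """Record that a heartbeat arrived at time `now`."""
        self.last_heard = now

    def missed(self, now: float) -> int:
        """Number of whole intervals elapsed since the last heartbeat."""
        return int((now - self.last_heard) // self.interval)

    def failed(self, now: float) -> bool:
        """True once more than `max_missed` heartbeats have been missed."""
        return self.missed(now) > self.max_missed


monitor = HeartbeatMonitor(interval=1.0, max_missed=3)
monitor.heard(now=10.0)
still_ok = not monitor.failed(now=12.5)   # two intervals missed so far
detected = monitor.failed(now=14.5)       # four intervals missed
```

With a one-second interval and a threshold of three, a silent peer is flagged after about four seconds, in line with detecting failures within seconds.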
Detecting Failures
• The TCP keep-alive probe is inadequate
  • It waits two hours to send its first probe
  • User cannot change its period