
Sockets
The building blocks of Internet communications
Motivation
• We need to present programs with an abstraction
of the network
– naming, rendezvous, service type and quality
• The abstraction implies an API (Application
Programming Interface)
• What happens if we let everyone define their own
API?
– tower of Babel
• Independently developed programs can’t talk to one another
• Code can’t be shared
• Models can’t be understood and developed upon
Overview
• What is a socket
• Sockets undressed
• Writing a server
• Writing a client
• Problems with sockets
• Sockets and Java
Sockets
• A socket is the "thing" that a program talks to
when it talks to the network.
– network endpoint
– looks like a "file"
• Sockets can be
– sent upon, received from, created, bound to addresses,
listened on, and connected to.
• Sockets are a programming interface, NOT a
protocol.
– TCP/IP is a protocol.
– We can talk TCP/IP by way of the sockets interface.
Two types of sockets
• Sockets are a communication endpoint
[Figure: sockets on green-host and red-host]
• "Stream" sockets provide reliable, sequenced, two-way communication.
– telnet, www, ftp
• "Datagram" sockets provide unreliable communication.
– routing, mbone, bootp, some RPC services
Why use unreliable services?
• An unreliable service can have a substantially
lower implementation cost.
– no connection state
– US mail vs. the telephone
– "best efforts" means never having to say you're sorry.
• Often more effective to devise own reliability
protocol on top of an unreliable protocol.
– do nothing
– retransmit
– find an alternate route
All you need to know about networking
• Key idea is Data Encapsulation
– every network packet contains a header and some data
– the header is protocol specific
– the data is what we are really interested in sending
– protocols are built by layering
[Figure: nested packet layers Ethernet, IP, UDP, TFTP, File data, with the annotation "Sockets work mostly at this level"]
An ethernet packet contains some data, which is an IP packet,
which contains some data, which is a UDP packet, which contains
some data, which is a TFTP packet, which contains some data,
which is our file.
Why use sockets?
• Sockets abstract the network into something easier
to think about.
• The OS provides a socket interface
– API for passing data over a socket.
• The socket implementation takes care to
encapsulate our data inside the appropriate packets
• The UDP or TCP layer sends data between socket
endpoints.
• The IP layer is responsible for routing the data to
the appropriate machine
Sockets Undressed
• Although sockets are easier to use than the
network, they still have a lot of grunge inside
them.
– addresses, ports, byte ordering, synchronization,
• For the most part, we'll ignore this grunge when
we start looking at the network from a Java
perspective.
• But, to really understand how they work, we need
to take a quick peek with our C hats on.
Addresses
• Necessary to associate a socket with an internet
address.
struct in_addr {
    unsigned long s_addr;        /* 4 bytes (a 32-bit type on modern systems) */
};
struct sockaddr_in {
    short sin_family;
    unsigned short int sin_port;
    struct in_addr sin_addr;
    unsigned char sin_zero[8];
};
When we want to tell the socket API about network addresses
(endpoints), we fill in one of these structures.
Byte Ordering
• Different computers use different representations
for basic data types
– big endian or little endian
– the issue is which byte in a word is the "high order"
byte
– totally arbitrary and uninteresting
• When different machines communicate, different
interpretations can cause real problems.
• Solution is to legislate a standard network byte
ordering.
Network byte ordering
• All data "seen" by the sockets layer
and passed on to the network must
be in network order
• Programmer's responsibility to
convert
– shorts (2 byte values) and longs (4
byte values)
– htons, htonl, ntohs, ntohl
• Extremely error prone
struct sockaddr_in s;
/* WRONG */
s.sin_port = 23;
/* RIGHT */
s.sin_port = htons(23);
Specifying Addresses
• My machine is bershad-pc.cs.washington.edu
– IP address is 128.95.4.109
• I want to talk to the ftp server there.
– FTP listens at port 21 in the TCP domain.
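/* Pray that we get this right! */
struct sockaddr_in s;
unsigned char bytes4[4];
s.sin_family = AF_INET;
s.sin_port = htons(21);
bytes4[0] = 128; bytes4[1] = 95; bytes4[2] = 4; bytes4[3] = 109;
/* bytes4 is already in network (high-order byte first) order,
   so copy it as-is; applying htonl here would scramble it
   on a little endian machine */
memcpy(&s.sin_addr.s_addr, bytes4, 4);
bzero(&s.sin_zero, 8);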
Some helper functions
• inet_addr("128.95.4.109")
– returns the in_addr (in network order)
• char *inet_ntoa(ina.sin_addr)
– returns the ascii address
• Nevertheless, this stuff is extremely hard to get
right.
Using Sockets
• A socket is an OS resource
– represents some context within the operating system
kernel
– looks much like a file descriptor
• Before we can use one, we've got to make one.
#include <sys/types.h>
#include <sys/socket.h>
int s;                   /* our socket */
int domain = AF_INET;    /* in which communication "domain" */
int type = SOCK_STREAM;  /* datagram or stream? */
int protocol = 0;        /* specific protocol, usually 0 */
s = socket(domain, type, protocol);
Binding an address to a socket
• The socket() call returns an unbound endpoint.
– it has no "internetness" to it.
• We may need to bind the endpoint to a particular
internet (IP, port) address
#include <sys/types.h>
#include <sys/socket.h>
struct sockaddr_in my_addr;
int s = socket(AF_INET, SOCK_STREAM, 0);
my_addr.sin_family = AF_INET;
my_addr.sin_port = htons(1234);
my_addr.sin_addr.s_addr = inet_addr("128.95.4.109");
bzero(&(my_addr.sin_zero), 8);
bind(s, (struct sockaddr*)&my_addr, sizeof(my_addr));
...
More on binding
• We can let the OS decide where we really are
– my_addr.sin_port = htons(0);
– my_addr.sin_addr.s_addr = htonl(INADDR_ANY);
• Ports below 1024 are "reserved"
– must be a superuser in order to associate a local socket with a
"small" port.
– this is a very weak form of network security
– assumption is that messages sent from a small port come from
a privileged process
• would you trust your bank account to this??
• Binding only necessary if you care about your address
– OS can implicitly bind for you in some cases
Connecting to a socket
• The connect(s, struct sockaddr *server, int
addrlen) call lets us connect to a remote socket.
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <strings.h>
int main(void)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in dest;
    dest.sin_family = AF_INET;
    dest.sin_port = htons(789);
    dest.sin_addr.s_addr = inet_addr("128.95.4.109");
    bzero(dest.sin_zero, 8);   /* zero the padding, not the address */
    if (connect(s, (struct sockaddr*)&dest, sizeof(dest)) == -1)
        return 1;              /* connect returns -1 if we fail */
    return 0;
}
Listening on a socket
• A connect() request initiates communication with
a peer.
• We can only connect to a socket that someone is
listening on.
s = socket(...)
bind(s, (struct sockaddr*)&my_addr, sizeof(my_addr));
if (listen(s, 5) == -1) error();
/* s is now a listening socket: the OS will queue up to 5
   callers who "connect" to the address for s.
   listen() itself returns right away. */
Listening leads to acceptance
• listen() tells the OS to start queueing callers; it returns immediately.
• But, we are under no obligation to accept
• The accept() call waits for a caller and says "ok, let's start talking."
– communication happens over a NEW socket
• the original socket is used for future connections
s = socket(...)
bind(s, (struct sockaddr*)&my_addr, sizeof(my_addr));
if (listen(s, 5) == -1) error();
for (;;) {
    struct sockaddr_in peer_addr;
    socklen_t peer_addrlen = sizeof(peer_addr);
    /* blocks until someone somewhere has "connected" to the
       address for s.
    */
    int new_socket = accept(s, (struct sockaddr*)&peer_addr, &peer_addrlen);
    /* start communicating on new_socket; peer_addr names the caller */
}
Stepping back
Red side (server):
s = socket();
bind(s, red_addr);
listen(s);
new_socket = accept(s, &peer);
/* peer = blue_addr */
/* ready to xmit/recv over new_socket */

Blue side (client):
s = socket();
bind(s, blue_addr);
connect(s, red_addr);
/* ready to xmit/recv over s */
Sending and Receiving
• Once connected, we send and receive data in the
same way on both sides
Blue side (client):
char *blueInfo = "I'm so blue";
int len;
/* connect up... */
len = send(s, blueInfo, strlen(blueInfo), 0);

Red side (server):
char peerInfo[32];
int len;
/* bind, listen, accept */
...
len = recv(new_socket, peerInfo, 32, 0);

The final argument (0) is for flags that we rarely use.
Datagrams are unconnected
• Connection only required for stream sockets.
– Key idea is that once connected, all data flows through
same pair of endpoints.
• With datagrams, each message sent must carry its
own addressing information
– simpler operating system state
– more burden for the programmer
• Two calls provided
– sendto(int sockfd, char *msg, int len, int flags, sockaddr *to, int tolen);
– recvfrom(int sockfd, char *msg, int len, int flags, sockaddr *from, int *fromlen);
Shutting down
• close(s) will shut down a socket.
– no more sends or receives
• more graceful shutdown services are provided
– shutdown(s, 0)
• no more receives
– shutdown(s, 1)
• no more sends
– shutdown(s, 2)
• no more receives or sends (but, unlike close(s), the descriptor itself still has to be closed)
Who's Out There?
• getpeername(s, &peer, &peerlen)
– returns the sockaddr_in of the other end of a connected
socket.
• struct hostent *gethostbyname("bershad-pc");
– returns the internet address of a named host.
– relies on the Domain Name System (DNS) server.
• may involve hidden network communication
Writing a Server
• Sockets provide the basis for client/server
communication.
– the server listens at an internet address (host addr,
port)
– the client connects to the port
– the client sends a request
– the server sends a response
– the server shuts the connection down.
A Simple Web Server
#include <stdio.h>
#include <strings.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
void process_request(int client_socket);
int main(void)
{
    int sockfd, client_socket;
    struct sockaddr_in my_addr;
    struct sockaddr_in peer_addr;
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    my_addr.sin_family = AF_INET;
    my_addr.sin_port = htons(8080);
    my_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    bzero(my_addr.sin_zero, 8);
    bind(sockfd, (struct sockaddr *)&my_addr, sizeof(my_addr));
    listen(sockfd, 5);
    for (;;) {
        socklen_t peer_size = sizeof(struct sockaddr_in);
        client_socket = accept(sockfd,
                               (struct sockaddr*)&peer_addr,
                               &peer_size);
        process_request(client_socket);
    }
}
Processing the request
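int readFile(char *url, char *data);   /* helper not shown on these slides */
void process_request(int client_socket)
{
    char reqBuf[64];
    char *url;
    char urlData[4096];
    int len;
    len = recv(client_socket, reqBuf, 63, 0);
    if (len < 0) len = 0;
    reqBuf[len] = '\0';                /* recv does not terminate the string */
    if (strncmp(reqBuf, "GET", 3) == 0) {
        url = reqBuf + 4;
        len = readFile(url, urlData);
        send(client_socket, urlData, len, 0);
    } else send(client_socket, "Error", 5, 0);
    close(client_socket);
}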
The Client Side
#include <stdio.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
/* invoked as urlget host object */
int main(int argc, char **argv)
{
    int sockfd;
    struct sockaddr_in peer_addr;
    struct hostent *he;
    char msg[4096];
    char urlData[4096];
    int len;
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    /* build the request in our own buffer; a string literal can't be appended to */
    snprintf(msg, sizeof(msg), "GET %s", argv[2]);
    peer_addr.sin_family = AF_INET;
    peer_addr.sin_port = htons(8080);
    he = gethostbyname(argv[1]);
    peer_addr.sin_addr = *((struct in_addr*)he->h_addr);
    bzero(peer_addr.sin_zero, 8);
    connect(sockfd, (struct sockaddr*)&peer_addr, sizeof(peer_addr));
    send(sockfd, msg, strlen(msg), 0);
    len = recv(sockfd, urlData, 4095, 0);
    if (len < 0) len = 0;
    urlData[len] = '\0';               /* terminate before printing with %s */
    printf("Received %s\n", urlData);
    close(sockfd);
    return 0;
}
Concurrency
• In our example, the server can't accept any
requests while it is processing for the client.
– suppose it takes 100 ms to handle a request.
• not unreasonable if we have to go to disk
– at most, server can handle 10 requests per second.
• would you buy such a web server??
• The solution is to allow the server to process
requests concurrently
– Simplest way is to have the server create a new "copy"
of itself to handle each new request.
Concurrent requests
• The fork() system call duplicates the server
process
– what's the bet here?
s = accept();
fork();
if parent
    continue            /* this process continues to accept new requests */
else
    processRequest(s)   /* while this process handles the specific request on s */
Problems with Sockets
• Even simple programs are difficult to write.
– byte ordering, connection management, msg
construction and deconstruction
– lots of reliance on weird C idioms
• Difficult to distinguish server interface from
implementation
– connection protocol is "part" of the interface, but at a
different level than what goes into the message
• Network oriented
– difficult to "optimize" communication for more
efficient channels
Summary
• Sockets are the OS's way to present the network to
applications.
• They are powerful but clunky.
• We will see ways to create even more powerful,
but less clunky distributed programming
interfaces.