Transcript Server

1. Networks and the Internet
2. Network programming
3. Web services
A Client-Server Transaction
1. Client process sends request
2. Server process handles request, manipulating its resource as needed
3. Server process sends response
4. Client process handles response
Note: clients and servers are processes running on hosts
(can be the same or different hosts)

Most network applications are based on the client-server
model:




A server process and one or more client processes
Server manages some resource
Server provides service by manipulating resource for clients
Server activated by request from client (vending machine analogy)
Hardware Organization of a Network Host
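[Figure: the CPU chip (register file, ALU) connects over the system bus to the I/O bridge, which connects over the memory bus to main memory. An I/O bus links the I/O bridge to the USB controller (mouse, keyboard), the graphics adapter (monitor), the disk controller (disk), the network adapter (network), and the expansion slots.]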
Computer Networks

A network is a hierarchical system of boxes and wires
organized by geographical proximity
 SAN (System Area Network) spans cluster or machine room
Switched Ethernet, Quadrics QSW, …
 LAN (Local Area Network) spans a building or campus
 Ethernet is most prominent example
 WAN (Wide Area Network) spans country or world
 Typically high-speed point-to-point phone lines


An internetwork (internet) is an interconnected set of
networks
 The Global IP Internet (uppercase “I”) is the most famous example
of an internet (lowercase “i”)

Let’s see how an internet is built from the ground up
Lowest Level: Ethernet Segment
[Figure: several hosts, each connected by a 100 Mb/s link to a port on a hub]

Ethernet segment consists of a collection of hosts connected
by wires (twisted pairs) to a hub

Spans room or floor in a building

Operation
- Each Ethernet adapter has a unique 48-bit address (MAC address)
  - E.g., 00:16:ea:e3:54:e6
- Hosts send bits to any other host in chunks called frames
- Hub slavishly copies each bit from each port to every other port
  - Every host sees every bit
- Note: Hubs are largely obsolete. Bridges (switches) became cheap enough to replace them, so frames are no longer broadcast to every host

Next Level: Bridged Ethernet Segment
[Figure: several hubs, each with attached hosts (hosts A, B, and C are labeled), connect over 100 Mb/s links to two bridges (X and Y), which are linked to each other at 1 Gb/s.]

Spans building or campus

Bridges cleverly learn which hosts are reachable from which
ports and then selectively copy frames from port to port
Conceptual View of LANs

For simplicity, hubs, bridges, and wires are often shown as a
collection of hosts attached to a single wire:
host
host ... host
Next Level: internets


Multiple incompatible LANs can be physically connected by
specialized computers called routers
The connected networks are called an internet
[Figure: two LANs, each with several hosts, are joined by a chain of routers that are connected to one another through WANs.]
LAN 1 and LAN 2 might be completely different, totally incompatible
(e.g., Ethernet and Wifi, 802.11*, T1-links, DSL, …)
Logical Structure of an internet
[Figure: two hosts attached to an irregular mesh of interconnected routers]

Ad hoc interconnection of networks
 No particular topology
 Vastly different router & link capacities

Send packets from source to destination by hopping through
networks
 Router forms bridge from one network to another
 Different packets may take different routes
The Notion of an internet Protocol

How is it possible to send bits across incompatible LANs
and WANs?

Solution:
 protocol software running on each host and router
 smooths out the differences between the different networks

Implements an internet protocol (i.e., set of rules)
 governs how hosts and routers should cooperate when they
transfer data from network to network
 TCP/IP is the protocol for the global IP Internet
What Does an internet Protocol Do?

Provides a naming scheme
 An internet protocol defines a uniform format for host addresses
 Each host (and router) is assigned at least one of these internet
addresses that uniquely identifies it

Provides a delivery mechanism
 An internet protocol defines a standard transfer unit (packet)
 Packet consists of header and payload


Header: contains info such as packet size, source and destination
addresses
Payload: contains data bits sent from source host
Transferring Data Over an internet
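[Figure: a client on Host A (LAN1) sends data to a server on Host B (LAN2) through a router, in eight numbered steps. Protocol software on Host A wraps the data in an internet packet (PH + data) and then in a LAN1 frame (FH1 + PH + data), which the LAN1 adapter sends. The router receives the LAN1 frame on its LAN1 adapter; its protocol software replaces FH1 with FH2 to produce a LAN2 frame (FH2 + PH + data), which it sends from its LAN2 adapter. Protocol software on Host B strips the headers and delivers the data to the server.]
PH: Internet packet header
FH: LAN frame header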
Other Issues

We are glossing over a number of important questions:
 What if different networks have different maximum frame sizes?
(segmentation)
 How do routers know where to forward frames?
 How are routers informed when the network topology changes?
 What if packets get lost?

These (and other) questions are addressed by the area of
systems known as computer networking
Global IP Internet

Most famous example of an internet

Based on the TCP/IP protocol family
- IP (Internet Protocol):
  - Provides basic naming scheme and unreliable delivery capability of packets (datagrams) from host to host
- UDP (User Datagram Protocol)
  - Uses IP to provide unreliable datagram delivery from process to process
- TCP (Transmission Control Protocol)
  - Uses IP to provide reliable byte streams from process to process over connections


Accessed via a mix of Unix file I/O and functions from the
sockets interface
Hardware and Software Organization
of an Internet Application
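[Figure: on both the Internet client host and the Internet server host, the application (client or server, user code) calls the TCP/IP stack (kernel code) through the sockets interface (system calls). TCP/IP drives the network adapter (hardware and firmware) through the hardware interface (interrupts). The two network adapters are connected through the global IP Internet.]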
A Programmer’s View of the Internet

Hosts are mapped to a set of 32-bit IP addresses
 140.192.36.43

The set of IP addresses is mapped to a set of identifiers
called Internet domain names
 140.192.36.43 is mapped to cdmlinux.cdm.depaul.edu
IP Addresses

32-bit IP addresses are stored in an IP address struct
 IP addresses are always stored in memory in network byte order
(big-endian byte order)
 True in general for any integer transferred in a packet header from one
machine to another.
 E.g., the port number used to identify an Internet connection.
/* Internet address structure */
struct in_addr {
unsigned int s_addr; /* network byte order (big-endian) */
};
Useful network byte-order conversion functions (“l” = 32 bits, “s” = 16 bits)
htonl: convert uint32_t from host to network byte order
htons: convert uint16_t from host to network byte order
ntohl: convert uint32_t from network to host byte order
ntohs: convert uint16_t from network to host byte order
Dotted Decimal Notation

By convention, each byte in a 32-bit IP address is represented
by its decimal value and separated by a period


IP address: 0x8002C2F2 = 128.2.194.242
Functions for converting between binary IP addresses and
dotted decimal strings:
 inet_aton: dotted decimal string → IP address in network byte order
 inet_ntoa: IP address in network byte order → dotted decimal string
 “n” denotes network representation
 “a” denotes application representation
Internet Domain Names
[Figure: the domain name hierarchy drawn as a tree. Below the unnamed root are the first-level domain names (.net, .edu, .gov, .com). Second-level domain names include smith, depaul, berkeley, and amazon. Lower levels include cti, ece, www (207.171.166.252), cstsis, reed (140.192.32.110), and ctilinux3 (140.192.36.43).]
Domain Name System (DNS)

The Internet maintains a mapping between IP addresses and
domain names in a huge worldwide distributed database called
DNS
 Conceptually, programmers can view the DNS database as a collection of
millions of host entry structures:
/* DNS host entry structure */
struct hostent {
    char  *h_name;       /* official domain name of host */
    char **h_aliases;    /* null-terminated array of domain names */
    int    h_addrtype;   /* host address type (AF_INET) */
    int    h_length;     /* length of an address, in bytes */
    char **h_addr_list;  /* null-terminated array of in_addr structs */
};
Functions for retrieving host entries from DNS:
 gethostbyname: query key is a DNS domain name.
 gethostbyaddr: query key is an IP address.
Properties of DNS Host Entries



Each host entry is an equivalence class of domain names and
IP addresses
Each host has a locally defined domain name localhost
which always maps to the loopback address 127.0.0.1
Different kinds of mappings are possible:
 Simple case: one-to-one mapping between domain name and IP address:

reed.cs.depaul.edu maps to 140.192.32.110
 Multiple domain names mapped to the same IP address:

eecs.mit.edu and cs.mit.edu both map to 18.62.1.6
 Multiple domain names mapped to multiple IP addresses:

google.com maps to multiple IP addresses
A Program That Queries DNS
#include "csapp.h"

/* argv[1] is a domain name or dotted decimal IP addr */
int main(int argc, char **argv) {
    char **pp;
    struct in_addr addr;
    struct hostent *hostp;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <domain name or dotted decimal>\n", argv[0]);
        exit(1);
    }

    /* A valid dotted decimal string is looked up by address, anything else by name */
    if (inet_aton(argv[1], &addr) != 0)
        hostp = Gethostbyaddr((const char *)&addr, sizeof(addr), AF_INET);
    else
        hostp = Gethostbyname(argv[1]);

    printf("official hostname: %s\n", hostp->h_name);
    for (pp = hostp->h_aliases; *pp != NULL; pp++)
        printf("alias: %s\n", *pp);
    for (pp = hostp->h_addr_list; *pp != NULL; pp++) {
        addr.s_addr = ((struct in_addr *)*pp)->s_addr;
        printf("address: %s\n", inet_ntoa(addr));
    }
    return 0;
}
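`gethostbyname` and `gethostbyaddr` are marked obsolete in current POSIX systems. As a hedged alternative, not the version used in these slides, the same lookup can be sketched with `getaddrinfo`. The helper name `resolve_first_ipv4` is hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve name (domain name or dotted decimal) to its first IPv4 address,
   written to out as a dotted-decimal string. Returns 0 on success. */
static int resolve_first_ipv4(const char *name, char *out, socklen_t outlen)
{
    struct addrinfo hints, *res;
    assert(name != NULL && out != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;          /* IPv4 only, to match the slides */
    hints.ai_socktype = SOCK_STREAM;    /* avoid one result per socket type */

    if (getaddrinfo(name, NULL, &hints, &res) != 0)
        return -1;
    struct sockaddr_in *sin = (struct sockaddr_in *)res->ai_addr;
    const char *ok = inet_ntop(AF_INET, &sin->sin_addr, out, outlen);
    freeaddrinfo(res);                  /* the result list is heap-allocated */
    return ok != NULL ? 0 : -1;
}

int main(int argc, char **argv)
{
    char ip[INET_ADDRSTRLEN];
    const char *name = argc > 1 ? argv[1] : "127.0.0.1";
    if (resolve_first_ipv4(name, ip, sizeof ip) != 0) {
        fprintf(stderr, "cannot resolve %s\n", name);
        return 1;
    }
    printf("address: %s\n", ip);
    return 0;
}
```

Unlike `gethostbyname`, `getaddrinfo` is reentrant and returns a linked list that the caller walks and then frees.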
Using DNS Program
$ ./hostinfo reed.cs.depaul.edu
official hostname: reed.cti.depaul.edu
alias: reed.cs.depaul.edu
address: 140.192.39.29
$ ./hostinfo 140.192.39.29
official hostname: reed.cti.depaul.edu
address: 140.192.39.29
$ ./hostinfo www.google.com
official hostname: www.google.com
address: 74.125.225.20
address: 74.125.225.17
address: 74.125.225.18
address: 74.125.225.19
address: 74.125.225.16
Querying DIG

Domain Information Groper (dig) provides a scriptable
command line interface to DNS
$ dig +short reed.cs.depaul.edu
reed.cti.depaul.edu.
140.192.39.29
$ dig +short -x 140.192.39.29
ipdstdwkr.cstcis.cti.depaul.edu.
reed.cti.depaul.edu.
$ dig +short www.google.com
74.125.225.19
74.125.225.16
74.125.225.20
74.125.225.17
74.125.225.18
Internet Connections

Clients and servers communicate by sending streams of bytes
over connections:
 Point-to-point, full-duplex (2-way communication), and reliable.

A socket is an endpoint of a connection
 Socket address is an IPaddress:port pair

A port is a 16-bit integer that identifies a process:
 Ephemeral port: Assigned automatically on client when client makes a
connection request
 Well-known port: Associated with some service provided by a server
(e.g., port 80 is associated with Web servers)

A connection is uniquely identified by the socket addresses
of its endpoints (socket pair)
 (cliaddr:cliport, servaddr:servport)
Putting it all Together:
Anatomy of an Internet Connection
[Figure: a client on host 128.2.194.242, with client socket address 128.2.194.242:51213, is connected to a server (port 80) on host 208.216.181.15, with server socket address 208.216.181.15:80.]
Connection socket pair: (128.2.194.242:51213, 208.216.181.15:80)
Clients

Examples of client programs
 Web browsers, ftp, telnet, ssh

How does a client find the server?
 The IP address in the server socket address identifies the host
(more precisely, an adapter on the host)
 The (well-known) port in the server socket address identifies the
service, and thus implicitly identifies the server process that performs
that service.
- Examples of well-known ports
  - Port 7: Echo server
  - Port 23: Telnet server
  - Port 25: Mail server
  - Port 80: Web server
Using Ports to Identify Services
[Figure: server host 128.2.194.242 runs a Web server (port 80) and an echo server (port 7). The kernel delivers a client's service request for 128.2.194.242:80 to the Web server, and a service request for 128.2.194.242:7 to the echo server.]
Servers

Servers are long-running processes (daemons)
 Created at boot-time (typically) by the init process (process 1)
 Run continuously until the machine is turned off

Each server waits for requests to arrive on a well-known port
associated with a particular service





Port 7: echo server
Port 23: telnet server
Port 25: mail server
Port 80: HTTP server
A machine that runs a server process is also often referred to
as a “server”
Server Examples

Web server (port 80)
- Resource: files/compute cycles (CGI programs)
- Service: retrieves files and runs CGI programs on behalf of the client

FTP server (20, 21)
- Resource: files
- Service: stores and retrieves files

Telnet server (23)
- Resource: terminal
- Service: proxies a terminal on the server machine

Mail server (25)
- Resource: email “spool” file
- Service: stores mail messages in spool file

See /etc/services for a comprehensive list of the port mappings on a Linux machine
Sockets Interface

Created in the early 1980s as part of the original Berkeley
distribution of Unix that contained an early version of the
Internet protocols

Provides a user-level interface to the network

Underlying basis for all Internet applications

Based on client/server programming model
Sockets

What is a socket?
 To the kernel, a socket is an endpoint of communication
 To an application, a socket is a file descriptor that lets the
application read/write from/to the network
 Remember: All Unix I/O devices, including networks, are
modeled as files

Clients and servers communicate with each other by
reading from and writing to socket descriptors
[Figure: the client's descriptor clientfd is connected to the server's descriptor serverfd]
The main distinction between regular file I/O and socket
I/O is how the application “opens” the socket descriptors
Example: Echo Client and Server
On Client
On Server
$ ./echoserveri 28888
$ ./echoclient localhost 28888
server connected to
localhost.localdomain (127.0.0.1)
type: hello there
server received 12 bytes
echo: hello there
type: ^D
Connection closed
Overview of the Sockets Interface
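[Figure: call sequence for a client/server session.
Client: open_clientfd (socket, connect), then rio_writen, rio_readlineb, close.
Server: open_listenfd (socket, bind, listen), then accept, rio_readlineb, rio_writen, rio_readlineb (which sees EOF after the client closes), close, then await a connection request from the next client.
The client's connect issues the connection request that the server's accept is waiting for.]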
Socket Address Structures

Generic socket address:
 For address arguments to connect, bind, and accept
 Necessary only because C did not have generic (void *) pointers
when the sockets interface was designed
struct sockaddr {
    unsigned short sa_family;   /* protocol family */
    char           sa_data[14]; /* address data */
};
Layout: sa_family, followed by 14 family-specific bytes
Socket Address Structures

Internet-specific socket address:
 Must cast (sockaddr_in *) to (sockaddr *) for connect,
bind, and accept
struct sockaddr_in {
    unsigned short sin_family;  /* address family (always AF_INET) */
    unsigned short sin_port;    /* port num in network byte order */
    struct in_addr sin_addr;    /* IP addr in network byte order */
    unsigned char  sin_zero[8]; /* pad to sizeof(struct sockaddr) */
};
Layout: sin_family (AF_INET) overlays sa_family; sin_port, sin_addr, and 8 zero bytes of padding overlay the family-specific bytes
Echo Client Main Routine
#include "csapp.h"

/* usage: ./echoclient host port */
int main(int argc, char **argv)
{
    int clientfd, port;
    char *host, buf[MAXLINE];
    rio_t rio;

    host = argv[1]; port = atoi(argv[2]);
    clientfd = Open_clientfd(host, port);
    Rio_readinitb(&rio, clientfd);
    printf("type:"); fflush(stdout);
    while (Fgets(buf, MAXLINE, stdin) != NULL) {   /* read input line */
        Rio_writen(clientfd, buf, strlen(buf));    /* send line to server */
        Rio_readlineb(&rio, buf, MAXLINE);         /* receive line from server */
        printf("echo:");
        Fputs(buf, stdout);                        /* print server response */
        printf("type:"); fflush(stdout);
    }
    Close(clientfd);
    exit(0);
}
Echo Client: open_clientfd
/* Opens a connection from the client to the server at hostname:port */
int open_clientfd(char *hostname, int port) {
    int clientfd;
    struct hostent *hp;
    struct sockaddr_in serveraddr;

    /* Create socket */
    if ((clientfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1; /* check errno for cause of error */

    /* Create address: fill in the server's IP address and port */
    if ((hp = gethostbyname(hostname)) == NULL)
        return -2; /* check h_errno for cause of error */
    bzero((char *) &serveraddr, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    bcopy((char *)hp->h_addr_list[0],
          (char *)&serveraddr.sin_addr.s_addr, hp->h_length);
    serveraddr.sin_port = htons(port);

    /* Establish a connection with the server */
    if (connect(clientfd, (SA *) &serveraddr,
                sizeof(serveraddr)) < 0)
        return -1;
    return clientfd;
}
Echo Client: open_clientfd
(socket)

socket creates a socket descriptor on the client
 Just allocates & initializes some internal data structures
 AF_INET: indicates that the socket is associated with Internet protocols
 SOCK_STREAM: selects a reliable byte stream connection
 provided by TCP
int clientfd;
/* socket descriptor */
if ((clientfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
return -1; /* check errno for cause of error */
... <more>
Echo Client: open_clientfd
(gethostbyname)

The client then builds the server’s Internet address
int clientfd;
/* socket descriptor */
struct hostent *hp;
/* DNS host entry */
struct sockaddr_in serveraddr; /* server’s IP address */
...
/* fill in the server's IP address and port */
if ((hp = gethostbyname(hostname)) == NULL)
return -2; /* check h_errno for cause of error */
bzero((char *) &serveraddr, sizeof(serveraddr));
serveraddr.sin_family = AF_INET;
serveraddr.sin_port = htons(port);
bcopy((char *)hp->h_addr_list[0],
(char *)&serveraddr.sin_addr.s_addr, hp->h_length);
Echo Client: open_clientfd
(connect)

Finally the client creates a connection with the server
 Client process suspends (blocks) until the connection is created
 After resuming, the client is ready to begin exchanging messages with the
server via Unix I/O calls on descriptor clientfd
int clientfd;
/* socket descriptor */
struct sockaddr_in serveraddr;
/* server address */
typedef struct sockaddr SA;
/* generic sockaddr */
...
/* Establish a connection with the server */
if (connect(clientfd, (SA *)&serveraddr, sizeof(serveraddr)) < 0)
return -1;
return clientfd;
}
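The three steps above (create socket, build address, connect) can also be written with `getaddrinfo`, which replaces the obsolete `gethostbyname` and works for both IPv4 and IPv6. This is a sketch under that assumption, not the slides' implementation. The name `open_clientfd_gai` is hypothetical, and the port is passed as a string because that is what `getaddrinfo` expects.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Connect to host:port. Returns a connected descriptor, or -1 on failure. */
static int open_clientfd_gai(const char *host, const char *port)
{
    struct addrinfo hints, *list, *p;
    int fd = -1;
    assert(host != NULL && port != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;        /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;    /* TCP byte stream */
    if (getaddrinfo(host, port, &hints, &list) != 0)
        return -1;

    /* Try each candidate address until one connects. */
    for (p = list; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break;                      /* success: keep this descriptor */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(list);
    return fd;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <host> <port>\n", argv[0]);
        return 1;
    }
    int fd = open_clientfd_gai(argv[1], argv[2]);
    if (fd < 0) {
        fprintf(stderr, "could not connect to %s:%s\n", argv[1], argv[2]);
        return 1;
    }
    printf("connected to %s:%s\n", argv[1], argv[2]);
    close(fd);
    return 0;
}
```

For example, with the echo server from this section running on port 28888, `./a.out localhost 28888` should report a successful connection.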
Echo Server: Main Routine
int main(int argc, char **argv) {
int listenfd, connfd, port, clientlen;
struct sockaddr_in clientaddr;
struct hostent *hp;
char *haddrp;
unsigned short client_port;
port = atoi(argv[1]); /* the server listens on a port passed
on the command line */
listenfd = open_listenfd(port);
while (1) {
clientlen = sizeof(clientaddr);
connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
hp = Gethostbyaddr((const char *)&clientaddr.sin_addr.s_addr,
sizeof(clientaddr.sin_addr.s_addr), AF_INET);
haddrp = inet_ntoa(clientaddr.sin_addr);
client_port = ntohs(clientaddr.sin_port);
printf("server connected to %s (%s), port %u\n",
hp->h_name, haddrp, client_port);
echo(connfd);
Close(connfd);
}
}
Office Telephone Analogy for Server




Socket: Buy a phone
Bind: Tell the local administrator what number you want to use
Listen: Plug the phone in
Accept: Answer the phone when it rings
Echo Server: open_listenfd
int open_listenfd(int port)
{
int listenfd, optval=1;
struct sockaddr_in serveraddr;
/* Create a socket descriptor */
if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
return -1;
/* Eliminates "Address already in use" error from bind. */
if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
(const void *)&optval , sizeof(int)) < 0)
return -1;
... <more>
Echo Server: open_listenfd (cont.)
...
/* Listenfd will be an endpoint for all requests to port
on any IP address for this host */
bzero((char *) &serveraddr, sizeof(serveraddr));
serveraddr.sin_family = AF_INET;
serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
serveraddr.sin_port = htons((unsigned short)port);
if (bind(listenfd, (SA *)&serveraddr, sizeof(serveraddr)) < 0)
return -1;
/* Make it a listening socket ready to accept
connection requests */
if (listen(listenfd, LISTENQ) < 0)
return -1;
return listenfd;
}
Echo Server: open_listenfd
(socket)

socket creates a socket descriptor on the server
 AF_INET: indicates that the socket is associated with Internet protocols
 SOCK_STREAM: selects a reliable byte stream connection (TCP)
int listenfd; /* listening socket descriptor */
/* Create a socket descriptor */
if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
return -1;
Echo Server: open_listenfd
(setsockopt)

The socket can be given some attributes
...
/* Eliminates "Address already in use" error from bind(). */
if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
(const void *)&optval , sizeof(int)) < 0)
return -1;

Handy trick that allows us to rerun the server immediately
after we kill it
- Otherwise bind() fails with “Address already in use” until the old connections on the port time out (about 15 seconds on some systems, a minute or more on others)

Strongly suggest you do this for all your servers to simplify
debugging
Echo Server: open_listenfd
(initialize socket address)


Initialize socket with server port number
Accept connection from any IP address
struct sockaddr_in serveraddr; /* server's socket addr */
...
/* listenfd will be an endpoint for all requests to port
on any IP address for this host */
bzero((char *) &serveraddr, sizeof(serveraddr));
serveraddr.sin_family = AF_INET;
serveraddr.sin_port = htons((unsigned short)port);
serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);

IP addr and port stored in network (big-endian) byte order
[Figure: sockaddr_in layout holding AF_INET in sin_family, the port in sin_port, INADDR_ANY in sin_addr, and 8 zero bytes of padding]
Echo Server: open_listenfd
(bind)

bind associates the socket with the socket address we just
created
int listenfd;
/* listening socket */
struct sockaddr_in serveraddr; /* server’s socket addr */
...
/* listenfd will be an endpoint for all requests to port
on any IP address for this host */
if (bind(listenfd, (SA *)&serveraddr, sizeof(serveraddr)) < 0)
return -1;
Echo Server: open_listenfd
(listen)


listen indicates that this socket will accept connection
(connect) requests from clients
LISTENQ is a constant indicating how many pending connection
requests are allowed to queue
int listenfd; /* listening socket */
...
/* Make it a listening socket ready to accept connection requests */
if (listen(listenfd, LISTENQ) < 0)
return -1;
return listenfd;
}

We’re finally ready to enter the main server loop that
accepts and processes client connection requests.
Echo Server: Main Loop

The server loops endlessly, waiting for connection
requests, then reading input from the client, and echoing
the input back to the client.
main() {
/* create and configure the listening socket */
while(1) {
/* Accept(): wait for a connection request */
/* echo(): read and echo input lines from client til EOF */
/* Close(): close the connection */
}
}
Echo Server: accept

accept() blocks waiting for a connection request
int listenfd; /* listening descriptor */
int connfd;
/* connected descriptor */
struct sockaddr_in clientaddr;
int clientlen;
clientlen = sizeof(clientaddr);
connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);


accept returns a connected descriptor (connfd) with
the same properties as the listening descriptor
(listenfd)

Returns when the connection between client and server is created
and ready for I/O transfers

All I/O with the client will be done via the connected socket
accept also fills in client’s IP address
Echo Server: accept Illustrated
1. Server blocks in accept, waiting for connection request on listening descriptor listenfd (descriptor 3 in the figure)
2. Client makes connection request by calling and blocking in connect
3. Server returns connfd (descriptor 4 in the figure) from accept. Client returns from connect. Connection is now established between clientfd and connfd
Connected vs. Listening Descriptors

Listening descriptor
 End point for client connection requests
 Created once and exists for lifetime of the server

Connected descriptor
 End point of the connection between client and server
 A new descriptor is created each time the server accepts a
connection request from a client
 Exists only as long as it takes to service client

Why the distinction?
 Allows for concurrent servers that can communicate over many
client connections simultaneously
 E.g., Each time we receive a new request, we fork a child to
handle the request
Echo Server: Identifying the Client

The server can determine the domain name, IP address,
and port of the client
struct hostent *hp; /* pointer to DNS host entry */
char *haddrp;
/* pointer to dotted decimal string */
unsigned short client_port;
hp = Gethostbyaddr((const char *)&clientaddr.sin_addr.s_addr,
sizeof(clientaddr.sin_addr.s_addr), AF_INET);
haddrp = inet_ntoa(clientaddr.sin_addr);
client_port = ntohs(clientaddr.sin_port);
printf("server connected to %s (%s), port %u\n",
hp->h_name, haddrp, client_port);
Echo Server: echo

The server uses RIO to read and echo text lines until EOF
(end-of-file) is encountered.
 EOF notification caused by client calling close(clientfd)
void echo(int connfd)
{
    size_t n;
    char buf[MAXLINE];
    rio_t rio;

    Rio_readinitb(&rio, connfd);
    while ((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0) {
        upper_case(buf);   /* helper not shown in this section */
        Rio_writen(connfd, buf, n);
        printf("server received %d bytes\n", (int) n);  /* n is a size_t, so cast for %d */
    }
}
Testing Servers Using telnet

The telnet program is invaluable for testing servers
that transmit ASCII strings over Internet connections
 Our simple echo server
 Web servers
 Mail servers

Usage:
 unix> telnet <host> <portnumber>
 Creates a connection with a server running on <host> and
listening on port <portnumber>
Testing the Echo Server With telnet
Use separate SSH sessions for the server and for telnet:
$ ./echoserveri 28888
$ telnet localhost 28888
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Hello
hello
For More Information

W. Richard Stevens, “Unix Network Programming:
Networking APIs: Sockets and XTI”, Volume 1, Second
Edition, Prentice Hall, 1998
 THE network programming bible

Unix Man Pages
 Good for detailed information about specific functions
Web Services
Web History

1989:
- Tim Berners-Lee (CERN) writes internal proposal to develop a distributed hypertext system.
  - Connects “a web of notes with links.”
  - Intended to help CERN physicists in large projects share and manage information

1990:
- Tim Berners-Lee writes a graphical browser for NeXT machines.
Web History (cont)

1992
- NCSA server released
- 26 WWW servers worldwide

1993
- Marc Andreessen releases first version of NCSA Mosaic browser
- Mosaic versions released for Windows, Mac, and Unix.
- Web (port 80) traffic at 1% of NSFNET backbone traffic.
- Over 200 WWW servers worldwide.

1994
- Andreessen and colleagues leave NCSA to form “Mosaic Communications Corp” (predecessor to Netscape).
Internet Hosts
Web Servers
HTTP request

Clients and servers
communicate using the
HyperText Transfer Protocol
(HTTP)
 Client and server establish TCP
connection
 Client requests content
 Server responds with
requested content
 Client and server close
connection (eventually)

Current version is HTTP/1.1
 RFC 2616, June, 1999.
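[Figure: the Web client (browser) sends an HTTP request to the Web server, which returns an HTTP response (content). Protocol stack: HTTP carries Web content, TCP carries streams, IP carries datagrams.]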
http://www.w3.org/Protocols/rfc2616/rfc2616.html
Web Content

Web servers return content to clients
 content: a sequence of bytes with an associated MIME (Multipurpose
Internet Mail Extensions) type

Example MIME types
- text/html: HTML document
- text/plain: Unformatted text
- application/postscript: PostScript document
- image/gif: Binary image encoded in GIF format
- image/jpeg: Binary image encoded in JPEG format
Static and Dynamic Content

The content returned in HTTP responses can be either
static or dynamic.
 Static content: content stored in files and retrieved in response to
an HTTP request
 Examples: HTML files, images, audio clips.
 Request identifies content file
 Dynamic content: content produced on-the-fly in response to an
HTTP request
 Example: content produced by a program executed by the
server on behalf of the client.
 Request identifies file containing executable code

Bottom line: All Web content is associated with a file that
is managed by the server.
URLs


Each file managed by a server has a unique name called a URL
(Uniform Resource Locator)
URLs for static content:
 http://reed.cs.depaul.edu:80/index.html
 http://reed.cs.depaul.edu/index.html
 http://reed.cs.depaul.edu


Identifies a file called index.html, managed by a Web server at
reed.cs.depaul.edu that is listening on port 80.
URLs for dynamic content:
 http://riely373.cdm.depaul.edu:8000/cgi-bin/adder?15000&213

Identifies an executable file called adder, managed by a Web server
at riely373.cdm.depaul.edu that is listening on port 8000,
that should be called with two argument strings: 15000 and 213.
How Clients and Servers Use URLs


Example URL: http://www.depaul.edu:80/index.html
Clients use prefix (http://www.depaul.edu:80) to infer:
 What kind of server to contact (Web server)
 Where the server is (www.depaul.edu)
 What port it is listening on (80)

Servers use suffix (/index.html) to:
 Determine if request is for static or dynamic content.
No hard and fast rules for this.
 Convention: executables reside in cgi-bin directory
 Find file on file system.
 Initial “/” in suffix denotes home directory for requested content.
 Minimal suffix is “/”, which all servers expand to some default
home page (e.g., index.html).

Anatomy of an HTTP Transaction
$ telnet reed.cs.depaul.edu 80
Trying 140.192.39.42...
Connected to reed.cti.depaul.edu.
Escape character is '^]'.
GET / HTTP/1.1
host: reed.cs.depaul.edu
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Accept-Ranges: bytes
ETag: W/"2285-1357855910000"
Last-Modified: Thu, 10 Jan 2013 22:11:50 GMT
Content-Type: text/html
Content-Length: 2285
Date: Mon, 04 Mar 2013 04:01:00 GMT
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8"
...
HTTP Requests


HTTP request is a request line, followed by zero or more
request headers
Request line: <method> <uri> <version>
 <version> is HTTP version of request (HTTP/1.0 or
HTTP/1.1)
 <uri> is typically URL for proxies, URL suffix for servers.
 A URL is a type of URI (Uniform Resource Identifier)
 See http://www.ietf.org/rfc/rfc2396.txt
 <method> is either GET, POST, OPTIONS, HEAD, PUT,
DELETE, or TRACE.
HTTP Requests (cont)

HTTP methods:
- GET: Retrieve static or dynamic content
  - Arguments for dynamic content are in URI
  - Workhorse method (99% of requests)
- POST: Retrieve dynamic content
  - Arguments for dynamic content are in the request body
- OPTIONS: Get server or file attributes
- HEAD: Like GET but no data in response body
- PUT: Write a file to the server!
- DELETE: Delete a file on the server!
- TRACE: Echo request in response body
  - Useful for debugging.

Request headers: <header name>: <header data>
- Provide additional information to the server.
HTTP Versions

Major differences between HTTP/1.1 and HTTP/1.0
 HTTP/1.0 uses a new connection for each transaction.
 HTTP/1.1 also supports persistent connections
multiple transactions over the same connection
 Connection: Keep-Alive
 HTTP/1.1 requires HOST header
 Host: www.depaul.edu
 Makes it possible to host multiple websites at single Internet
host
 HTTP/1.1 supports chunked encoding (described later)
 Transfer-Encoding: chunked
 HTTP/1.1 adds additional support for caching

HTTP Responses



HTTP response is a response line followed by zero or more
response headers.
Response line:
<version> <status code> <status msg>
 <version> is HTTP version of the response.
 <status code> is numeric status.
 <status msg> is corresponding English text.





 200 OK: Request was handled without error
 301 Moved: Provide alternate URL
 403 Forbidden: Server lacks permission to access file
 404 Not found: Server couldn’t find the file
Response headers: <header name>: <header data>
 Provide additional information about response
 Content-Type: MIME type of content in response body.
 Content-Length: Length of content in response body.
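The response line <version> <status code> <status msg> can be parsed in much the same way as the request line. The following is a sketch under assumed names (struct status_line, parse_status_line) and assumed buffer sizes, not code from the text.

```c
#include <stdio.h>

/* Hypothetical container for the response-line fields. */
struct status_line {
    char version[16];
    int  code;
    char msg[64];
};

/* Parse "<version> <status code> <status msg>".
 * The message may contain spaces ("Not found"), so it is read up to the
 * end of the line. Returns 0 if at least the version and code were read,
 * -1 otherwise. */
int parse_status_line(const char *line, struct status_line *out)
{
    out->msg[0] = '\0';
    int n = sscanf(line, "%15s %3d %63[^\r\n]",
                   out->version, &out->code, out->msg);
    return n >= 2 ? 0 : -1;
}
```

A client would typically branch on the numeric code and show the message text only to a human reader.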
GET Request From Chrome Browser
URI is just the suffix, not the entire URL
GET / HTTP/1.1\r\n
Host: reed.cs.depaul.edu\r\n
Connection: keep-alive\r\n
Accept:
text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.22
(KHTML, like Gecko) Chrome/25.0.1364.97 Safari/537.22\r\n
Accept-Encoding: gzip,deflate,sdch\r\n
Accept-Language: en-US,en;q=0.8\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3\r\n
Cookie:__utma=114012434.756988690.1360702406.1360702406.1360874291.2;
__utmz=114012434.1360874291.2.2.utmcsr=cdm.depaul.edu|utmccn=(referral
)|utmcmd=referral|utmcct=/academics/Pages/bs%20computerscience%20stand
ard.aspx\r\n
\r\n
GET Response From Apache Server
HTTP/1.1 200 OK\r\n
Server: Apache-Coyote/1.1\r\n
Accept-Ranges: bytes\r\n
ETag: W/"2285-1357855910000"\r\n
Last-Modified: Thu, 10 Jan 2013 22:11:50 GMT\r\n
Content-Type: text/html\r\n
Content-Length: 2285\r\n
Date: Mon, 04 Mar 2013 04:58:40 GMT\r\n
\r\n
<html>\n
<head>\n
...
Tiny Web Server

Tiny Web server described in text
 Tiny is a sequential Web server.
 Serves static and dynamic content to real browsers.
text files, HTML files, GIF and JPEG images.
 226 lines of commented C code.
 Not as complete or robust as a real web server

Tiny Operation


Read request from client
Split into method / uri / version
 If not GET, then return error

If URI contains “cgi-bin” then serve dynamic content
 Fork process to execute program

Otherwise serve static content
 Copy file to output
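The decision steps above can be sketched as a single function. This is an illustration only: the enum and the name classify_request are assumptions, and the real tiny.c interleaves these checks with reading and parsing the request.

```c
#include <string.h>

/* Hypothetical result codes for the dispatch decision. */
enum action { REJECT_METHOD, SERVE_STATIC, SERVE_DYNAMIC };

/* Decide how to handle a request, following the steps listed above:
 * anything other than GET is an error, a URI containing "cgi-bin" is
 * dynamic content, and everything else is a static file. */
enum action classify_request(const char *method, const char *uri)
{
    if (strcmp(method, "GET") != 0)
        return REJECT_METHOD;
    if (strstr(uri, "cgi-bin") != NULL)
        return SERVE_DYNAMIC;
    return SERVE_STATIC;
}
```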
Tiny Serving Static Content
From tiny.c
/* Send response headers to client */
get_filetype(filename, filetype);
sprintf(buf, "HTTP/1.0 200 OK\r\n");
sprintf(buf, "%sServer: Tiny Web Server\r\n", buf);
sprintf(buf, "%sContent-length: %d\r\n", buf, filesize);
sprintf(buf, "%sContent-type: %s\r\n\r\n",
buf, filetype);
Rio_writen(fd, buf, strlen(buf));
/* Send response body to client */
srcfd = Open(filename, O_RDONLY, 0);
srcp = Mmap(0, filesize, PROT_READ, MAP_PRIVATE, srcfd, 0);
Close(srcfd);
Rio_writen(fd, srcp, filesize);
Munmap(srcp, filesize);




Serve file specified by filename
Use file metadata to compose header
“Read” file via mmap
Write to output
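One caution about the excerpt above: it passes buf to sprintf as both the destination and a source argument, which standard C leaves undefined when the objects overlap. A sketch of an alternative that produces the same header bytes in one bounded call is shown below; build_static_headers is a hypothetical name, not a function from tiny.c.

```c
#include <stdio.h>

/* Compose the static-content response headers in a single snprintf call.
 * filetype is the MIME type (e.g. "text/html"), filesize the body length
 * in bytes. Returns the header length, or -1 if buf is too small. */
int build_static_headers(char *buf, size_t size,
                         const char *filetype, long filesize)
{
    int n = snprintf(buf, size,
                     "HTTP/1.0 200 OK\r\n"
                     "Server: Tiny Web Server\r\n"
                     "Content-length: %ld\r\n"
                     "Content-type: %s\r\n\r\n",
                     filesize, filetype);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The returned length can be passed straight to the write call, so no separate strlen is needed.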
Serving Dynamic Content


Client sends request to server.
If request URI contains the string “/cgi-bin”, then the server assumes that the request is for dynamic content.
[Figure: Client sends "GET /cgi-bin/env.pl HTTP/1.1" to Server]
Serving Dynamic Content (cont)

The server creates a child process and runs the program identified by the URI in that process.
[Figure: Server uses fork/exec to start env.pl]
Serving Dynamic Content (cont)


The child runs and generates the dynamic content.
The server captures the output of the child and forwards it without modification to the client.
[Figure: env.pl sends Content to Server; Server forwards Content to Client]
Issues in Serving Dynamic Content





How does the client pass program arguments to the server?
How does the server pass these arguments to the child?
How does the server pass other info relevant to the request to the child?
How does the server capture the content produced by the child?
These issues are addressed by the Common Gateway Interface (CGI) specification.
[Figure: Client sends Request to Server; Server creates env.pl; env.pl returns Content to Server; Server returns Content to Client]
CGI

Because the children are written according to the CGI
spec, they are often called CGI programs.

Because many CGI programs are written in Perl, they are
often called CGI scripts.

However, CGI really defines a simple standard for
transferring information between the client (browser),
the server, and the child process.
The cdmlinux addition portal
[Figure: browser screenshot; the input URL is annotated with its host, port, CGI program, and args, shown above the output page]
Serving Dynamic Content With GET



Question: How does the client pass arguments to the server?
Answer: The arguments are appended to the URI
 Can be encoded directly in a URL typed into a browser or a URL in an HTML link
 http://cdmlinux.cdm.depaul.edu/cgi-bin/adder?n1=4&n2=7
 adder is the CGI program on the server that will do the addition
 Argument list starts with “?”
 Arguments separated by “&”
 Spaces represented by “+” or “%20”
URI often generated by an HTML form
<FORM METHOD=GET ACTION="cgi-bin/adder">
<p>X <INPUT NAME="n1">
<p>Y <INPUT NAME="n2">
<p><INPUT TYPE=submit>
</FORM>
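The encoding rules above ("?" starts the argument list, "+" or "%20" stands for a space) can be undone on the server side with two small helpers. Both names, split_query and url_decode, are hypothetical and work in place on a writable copy of the URI.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Cut the URI at the first '?'. Returns a pointer to the argument list,
 * or NULL if the URI has no arguments. The URI is modified in place so
 * that it holds only the part before the '?'. */
char *split_query(char *uri)
{
    char *q = strchr(uri, '?');
    if (q == NULL)
        return NULL;
    *q = '\0';
    return q + 1;
}

/* Decode '+' and %XX escapes in place. Malformed escapes are copied
 * through unchanged. */
void url_decode(char *s)
{
    char *out = s;
    while (*s != '\0') {
        if (*s == '+') {
            *out++ = ' ';
            s++;
        } else if (*s == '%' && isxdigit((unsigned char)s[1])
                             && isxdigit((unsigned char)s[2])) {
            char hex[3] = { s[1], s[2], '\0' };
            *out++ = (char)strtol(hex, NULL, 16);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}
```

For the URL above, split_query leaves "/cgi-bin/adder" in the URI buffer and returns "n1=4&n2=7".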
Serving Dynamic Content With GET

URL:
 cgi-bin/adder?4&7

Result displayed on browser:
Welcome to THE Internet addition portal.
The answer is: 4+7=11
Thanks for visiting!
Serving Dynamic Content With GET


Question: How does the server pass these arguments to
the child?
Answer: In environment variable QUERY_STRING
 A single string containing everything after the “?”
 For adder: QUERY_STRING = “4&7”
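On the child's side, reading the arguments might look like the sketch below. It assumes the "4&7" format shown above; the names parse_adder_args and adder_args_from_env are hypothetical and are not taken from adder.c.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse a query string of the form "<int>&<int>", e.g. "4&7".
 * Returns 0 on success, -1 if qs is NULL or does not match. */
int parse_adder_args(const char *qs, int *n1, int *n2)
{
    char extra;
    if (qs == NULL)
        return -1;
    /* The trailing %c catches junk after the second number. */
    return sscanf(qs, "%d&%d%c", n1, n2, &extra) == 2 ? 0 : -1;
}

/* What a CGI program would call: fetch QUERY_STRING from the
 * environment set up by the server, then parse it. */
int adder_args_from_env(int *n1, int *n2)
{
    return parse_adder_args(getenv("QUERY_STRING"), n1, n2);
}
```

If the server did not set QUERY_STRING, getenv returns NULL and the helper reports failure instead of dereferencing it.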
Additional CGI Environment Variables

General
 SERVER_SOFTWARE
 SERVER_NAME
 GATEWAY_INTERFACE (CGI version)

Request-specific
 SERVER_PORT
 REQUEST_METHOD (GET, POST, etc.)
 QUERY_STRING (contains GET args)
 REMOTE_HOST (domain name of client)
 REMOTE_ADDR (IP address of client)
 CONTENT_TYPE (for POST, type of data in message body, e.g., text/html)
 CONTENT_LENGTH (length in bytes)
Even More CGI Environment Variables

In addition, the value of each request header <type> received from the client is placed in the environment variable HTTP_<type>.
 Examples (the name is uppercased and any “-” is changed to “_”):
  HTTP_ACCEPT
  HTTP_HOST
  HTTP_USER_AGENT
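The renaming rule can be written as a short helper. This is a sketch with a hypothetical name, header_to_env_name; a server would call something like it once per request header before running the child.

```c
#include <ctype.h>
#include <stdio.h>

/* Map a request header name to its CGI environment variable name:
 * prefix "HTTP_", uppercase every letter, change '-' to '_'.
 * Returns 0 on success, -1 if out is too small. */
int header_to_env_name(const char *header, char *out, size_t size)
{
    int n = snprintf(out, size, "HTTP_%s", header);
    if (n < 0 || (size_t)n >= size)
        return -1;
    for (char *p = out + 5; *p != '\0'; p++)   /* skip the "HTTP_" prefix */
        *p = (*p == '-') ? '_' : (char)toupper((unsigned char)*p);
    return 0;
}
```

For example, the header name "User-Agent" maps to HTTP_USER_AGENT.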
Serving Dynamic Content With GET


Question: How does the server capture the content produced by the child?
Answer: The child generates its output on stdout. The server uses dup2 to redirect the child's stdout to the connected socket.
 Notice that only the child knows the type and size of the content. Thus the child (not the server) must generate the corresponding headers.
From adder.c
/* Make the response body */
sprintf(content, "Welcome to add.com: ");
sprintf(content, "%sTHE Internet addition portal.\r\n<p>",
content);
sprintf(content, "%sThe answer is: %s\r\n<p>",
content, msg);
sprintf(content, "%sThanks for visiting!\r\n", content);
/* Generate the HTTP response */
printf("Content-length: %u\r\n", (unsigned) strlen(content));
printf("Content-type: text/html\r\n\r\n");
printf("%s", content);
Serving Dynamic Content With GET
$ telnet perko406.cdm.depaul.edu 8000
Trying 140.192.39.11...
Connected to perko406.cdm.depaul.edu.
Escape character is '^]'.
HTTP request sent by client:
GET /cgi-bin/adder?4&7 HTTP/1.0
HTTP response generated by the server:
HTTP/1.0 200 OK
Server: Tiny Web Server
HTTP response generated by the CGI program:
Content-length: 97
Content-type: text/html
Welcome to THE Internet addition portal.
<p>The answer is: 4 + 7 = 11
<p>Thanks for visiting!
Connection closed by foreign host.
$
Tiny Serving Dynamic Content
From tiny.c
/* Return first part of HTTP response */
sprintf(buf, "HTTP/1.0 200 OK\r\n");
Rio_writen(fd, buf, strlen(buf));
sprintf(buf, "Server: Tiny Web Server\r\n");
Rio_writen(fd, buf, strlen(buf));
if (Fork() == 0) { /* child */
/* Real server would set all CGI vars here */
setenv("QUERY_STRING", cgiargs, 1);
Dup2(fd, STDOUT_FILENO); /* Redirect stdout to client */
Execve(filename, emptylist, environ);/* Run CGI prog */
}
Wait(NULL); /* Parent waits for and reaps child */
 Fork child to execute CGI program
 Change stdout to be connection to client
 Execute CGI program with execve
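The fork, dup2, exec sequence can be tried without a network connection by redirecting the child's stdout into a pipe instead of a socket. The sketch below makes that substitution; run_and_capture is a hypothetical helper, and it uses the plain system calls rather than the capitalized error-checking wrappers seen in tiny.c.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program argv[0] with its stdout redirected into a pipe, and
 * collect up to size-1 bytes of its output in buf (NUL-terminated).
 * Returns the number of bytes captured, or -1 on error. */
ssize_t run_and_capture(char *const argv[], char *buf, size_t size)
{
    int pipefd[2];
    if (size == 0 || pipe(pipefd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (pid == 0) {                        /* child */
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);    /* stdout now goes to the pipe */
        close(pipefd[1]);
        execv(argv[0], argv);              /* only returns on failure */
        _exit(127);
    }

    close(pipefd[1]);                      /* parent keeps only the read end */
    size_t total = 0;
    ssize_t n;
    while (total < size - 1 &&
           (n = read(pipefd[0], buf + total, size - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(pipefd[0]);
    waitpid(pid, NULL, 0);                 /* reap the child */
    return (ssize_t)total;
}
```

In Tiny the target of dup2 is the connected socket, so the child's output goes directly to the client and the parent never reads it; the pipe here only makes the redirection observable locally.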
Proxies

A proxy is an intermediary between a client and an origin
server.
 To the client, the proxy acts like a server.
 To the server, the proxy acts like a client.
[Figure: 1. Client request (Client to Proxy); 2. Proxy request (Proxy to Origin Server); 3. Server response (Origin Server to Proxy); 4. Proxy response (Proxy to Client)]
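Because the request line sent to a proxy typically carries a full URL rather than just the suffix, a proxy has to split that URL to find out which origin server to contact. The sketch below shows one way to do this for plain http URLs; the struct, the name split_proxy_url, the buffer sizes, and the default port of 80 are assumptions for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of where to forward a request. */
struct target {
    char host[256];
    int  port;
    char path[1024];
};

/* Split "http://host[:port][/path]" into host, port, and path.
 * The port defaults to 80 and the path to "/".
 * Returns 0 on success, -1 on a malformed or oversized URL. */
int split_proxy_url(const char *url, struct target *t)
{
    static const char prefix[] = "http://";
    size_t plen = sizeof prefix - 1;
    if (strncmp(url, prefix, plen) != 0)
        return -1;

    const char *p = url + plen;
    size_t hostlen = strcspn(p, ":/");     /* host ends at ':' or '/' */
    if (hostlen == 0 || hostlen >= sizeof t->host)
        return -1;
    memcpy(t->host, p, hostlen);
    t->host[hostlen] = '\0';
    p += hostlen;

    t->port = 80;
    if (*p == ':') {
        char *end;
        long port = strtol(p + 1, &end, 10);
        if (end == p + 1 || port <= 0 || port > 65535)
            return -1;
        t->port = (int)port;
        p = end;
    }

    if (*p == '\0') {
        strcpy(t->path, "/");
        return 0;
    }
    if (*p != '/' || strlen(p) >= sizeof t->path)
        return -1;
    strcpy(t->path, p);
    return 0;
}
```

The proxy would then connect to host and port, and send its own request using only the path as the URI.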
Why Proxies?

Can perform useful functions as requests and responses pass by
 Examples: Caching, logging, anonymization, filtering, transcoding
[Figure: Clients A and B each request foo.html from the Proxy cache over a fast, inexpensive local network; the Proxy requests foo.html from the Origin Server over the slower, more expensive global network]
For More Information

Study the Tiny Web server described in your text
 Tiny is a sequential Web server.
 Serves static and dynamic content to real browsers.
text files, HTML files, GIF and JPEG images.
 220 lines of commented C code.
 Also comes with an implementation of the CGI script for the add.com
addition portal.


See the HTTP/1.1 standard:
 http://www.w3.org/Protocols/rfc2616/rfc2616.html