Transcript Lecture

Computer Systems
Iterative server
University of Amsterdam
Arnoud Visser
Computer Systems – Iterative server
Web Servers
[Figure: The Web client (browser) sends an HTTP request to the Web server; the Web server returns an HTTP response (content).]
• Clients and servers communicate using the
HyperText Transfer Protocol (HTTP)
– Client and server establish TCP connection
– Client requests content
– Server responds with requested content
– Client and server close connection (usually)
Web Content
• Web servers return content to clients
– content: a sequence of bytes with an associated
MIME (Multipurpose Internet Mail Extensions) type
• Example MIME types
– text/html: HTML document
– text/plain: Unformatted text
– application/postscript: PostScript document
– image/gif: Binary image encoded in GIF format
– image/jpeg: Binary image encoded in JPEG format
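A server has to pick one of these types for each file it returns. The following is a minimal sketch of one way to do that, based on the file name suffix. The function name mime_type_for, the suffix choices, and the text/plain default are assumptions made for this illustration, not part of the lecture.

```c
#include <string.h>

/* Hypothetical helper: map a file name to one of the MIME types
 * listed above by looking at its suffix. */
const char *mime_type_for(const char *filename)
{
    const char *dot = strrchr(filename, '.');   /* last '.' starts the suffix */

    if (dot == NULL)
        return "text/plain";                    /* assumed default */
    if (strcmp(dot, ".html") == 0)
        return "text/html";
    if (strcmp(dot, ".txt") == 0)
        return "text/plain";
    if (strcmp(dot, ".ps") == 0)
        return "application/postscript";
    if (strcmp(dot, ".gif") == 0)
        return "image/gif";
    if (strcmp(dot, ".jpg") == 0 || strcmp(dot, ".jpeg") == 0)
        return "image/jpeg";
    return "text/plain";                        /* assumed default */
}
```

A server would send the returned string in the Content-Type response header.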
Static and Dynamic Content
• The content returned in HTTP responses can be
either static or dynamic.
– Static content: content stored in files and retrieved in
response to an HTTP request
• Examples: HTML files, images, audio clips.
– Dynamic content: content produced on-the-fly in
response to an HTTP request
• Example: content produced by a program executed by the
server on behalf of the client.
• Bottom line: All Web content is associated with
a file that is managed by the server.
URLs
• Each file managed by a server has a unique name called
a URL (Uniform Resource Locator)
• URLs for static content:
– http://www.cs.cmu.edu:80/index.html
– http://www.cs.cmu.edu/index.html
– http://www.cs.cmu.edu
• Identifies a file called index.html, managed by a Web server at
www.cs.cmu.edu that is listening on port 80.
• URLs for dynamic content:
– http://brooks.science.uva.nl:8008/cgibin/adder?33&9
• Identifies an executable file called adder, managed by a Web server
running on brooks that is listening on port 8008, that should be called
with two argument strings: 33 and 9.
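The decomposition described above (host, port, file, arguments) can be sketched in C. This is an illustration only: struct url_parts and split_url are hypothetical names, only http:// URLs are handled, and the defaults (port 80, path "/") follow the static-content examples above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct for this sketch: the pieces of an http:// URL. */
struct url_parts {
    char host[256];
    int  port;          /* 80 when the URL names no port */
    char path[1024];    /* "/" when the URL names no file */
    char args[1024];    /* text after '?', or "" */
};

/* Returns 0 on success, -1 if the URL is not a usable http:// URL. */
int split_url(const char *url, struct url_parts *out)
{
    const char *p, *host_end, *path_start, *q;
    size_t len;

    if (strncmp(url, "http://", 7) != 0)
        return -1;
    p = url + 7;

    /* The host ends at the first ':' (a port follows) or '/' (a path follows). */
    host_end = p + strcspn(p, ":/");
    len = (size_t)(host_end - p);
    if (len == 0 || len >= sizeof(out->host))
        return -1;
    memcpy(out->host, p, len);
    out->host[len] = '\0';

    out->port = 80;                         /* default Web server port */
    if (*host_end == ':')
        out->port = atoi(host_end + 1);     /* atoi stops at the next '/' */

    path_start = strchr(host_end, '/');
    if (path_start == NULL)
        path_start = "/";                   /* no file named: minimal path */

    q = strchr(path_start, '?');            /* '?' starts the argument list */
    if (q != NULL) {
        snprintf(out->path, sizeof(out->path), "%.*s",
                 (int)(q - path_start), path_start);
        snprintf(out->args, sizeof(out->args), "%s", q + 1);
    } else {
        snprintf(out->path, sizeof(out->path), "%s", path_start);
        out->args[0] = '\0';
    }
    return 0;
}
```

For the dynamic-content URL above, this yields host brooks.science.uva.nl, port 8008, path /cgibin/adder, and args 33&9.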
How Clients and Servers Use URLs
• Example URL: http://www.aol.com:80/index.html
• Clients use prefix (http://www.aol.com:80) to
infer:
– What kind of server to contact (Web server)
– Where the server is (www.aol.com)
– What port it is listening on (80)
• Servers use suffix (/index.html) to:
– Determine if request is for static or dynamic content.
• No hard and fast rules for this.
• Convention: executables reside in cgi-bin directory
– Find file on file system.
• Initial “/” in suffix denotes home directory for requested content.
• Minimal suffix is “/”, which all servers expand to some default
home page (e.g., index.html).
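The server-side handling of the suffix can be sketched as follows. This is one possible reading of the conventions above, not a fixed rule: classify_suffix is a hypothetical name, "cgi-bin" in the suffix is taken to mean dynamic content, the home directory is assumed to be the server's current directory, and the default page is assumed to be index.html.

```c
#include <stdio.h>
#include <string.h>

/* Sketch: returns 1 for static content, 0 for dynamic content.
 * filename receives the path relative to the current directory;
 * cgiargs receives the argument string ("" if there is none).
 * Both buffers are assumed to hold at least size bytes. */
int classify_suffix(const char *suffix, char *filename, char *cgiargs,
                    size_t size)
{
    const char *q;
    size_t n = strlen(suffix);

    if (strstr(suffix, "cgi-bin") == NULL) {        /* static content */
        cgiargs[0] = '\0';
        if (n > 0 && suffix[n - 1] == '/')          /* expand to default page */
            snprintf(filename, size, ".%sindex.html", suffix);
        else
            snprintf(filename, size, ".%s", suffix);
        return 1;
    }

    /* Dynamic content: split the program name from its arguments at '?'. */
    q = strchr(suffix, '?');
    if (q != NULL) {
        snprintf(cgiargs, size, "%s", q + 1);
        snprintf(filename, size, ".%.*s", (int)(q - suffix), suffix);
    } else {
        cgiargs[0] = '\0';
        snprintf(filename, size, ".%s", suffix);
    }
    return 0;
}
```

With this sketch, the suffix "/" maps to ./index.html, and "/cgi-bin/adder?33&9" maps to the program ./cgi-bin/adder with arguments 33&9.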
Clients
• Examples of client programs
– Web browsers, ftp, telnet, ssh
• How does a client find the server?
– The IP address in the server socket address identifies the
host (more precisely, an adapter on the host)
– The (well-known) port in the server socket address
identifies the service, and thus implicitly identifies the
server process that performs that service.
– Examples of well-known ports
• Port 7: Echo server
• Port 23: Telnet server
• Port 25: Mail server
• Port 80: Web server
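In C, the IP address and port together form the server socket address. The sketch below fills in an IPv4 socket address using the standard inet_pton and htons calls. The helper name make_server_addr is an assumption made for this illustration.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Fill in an IPv4 server socket address from a dotted-decimal IP
 * address and a port number. Returns 0 on success, -1 if the
 * address string is not a valid IPv4 address. */
int make_server_addr(const char *ip, unsigned short port,
                     struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);       /* port in network byte order */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;                      /* identifies the host (adapter) */
    return 0;
}
```

For example, make_server_addr("128.2.194.242", 80, &addr) describes a Web server on that host, and the same call with port 7 describes the echo server on the same host.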
Using Ports to Identify Services
[Figure: Server host 128.2.194.242 runs a Web server (port 80) and an echo server (port 7). A client's service request for 128.2.194.242:80 is delivered by the kernel to the Web server.]
[Figure: Same server host. A client's service request for 128.2.194.242:7 is delivered by the kernel to the echo server.]
Sockets Interface
• Created in the early 80’s as part of the original Berkeley
distribution of Unix that contained an early version of
the Internet protocols.
• Provides a user-level interface to the network.
• Underlying basis for all Internet applications.
• Based on client/server programming model.
Sockets
• What is a socket?
– To the kernel, a socket is an endpoint of IP communication.
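[Figure: An Internet client host and an Internet server host, connected through the Global IP Internet. On each host: user code (Client or Server), kernel code (TCP/IP), and hardware (network adapter). The sockets interface (system calls) lies between user code and kernel code; the hardware interface (interrupts) lies between kernel code and the network adapter.]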
Sockets
• What is a socket?
– To an application, a socket is a file descriptor that lets the
application read/write from/to the network.
• Remember: All Unix I/O devices, including networks, are modeled as
files.
• Clients and servers communicate with each other by reading
from and writing to socket descriptors.
• The main distinction between regular file I/O and
socket I/O is how the application “opens” the socket
descriptors.
A server listens to a port
[Figure: Three stages of connection setup between Client (clientfd) and Server (listenfd(3), then connfd(4)).]
1. Server blocks in accept, waiting for connection request on
listening descriptor listenfd.
2. Client makes connection request by calling and blocking in connect.
3. Server returns connfd from accept. Client returns from connect.
Connection is now established between clientfd and connfd.
System calls of the Sockets Interface
Client                                     Server
socket                                     socket
                                           bind
                                           listen
(open_clientfd = socket, connect)          (open_listenfd = socket, bind, listen)
connect        --- connection request -->  accept
rio_writen     ------------------------->  rio_readlineb
rio_readlineb  <-------------------------  rio_writen
close          ---------- EOF ---------->  rio_readlineb
                                           close
                                           Await connection request from next client (back to accept)
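The server-side sequence socket, bind, listen can be sketched without csapp.h as below. This is an illustration under stated assumptions, not the csapp.h implementation: the name open_listenfd_sketch, the SO_REUSEADDR option, and the backlog of 1024 are choices made for this example.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Sketch of the server-side sequence socket -> bind -> listen.
 * Returns a listening descriptor, or -1 on error. */
int open_listenfd_sketch(unsigned short port)
{
    int listenfd, optval = 1;
    struct sockaddr_in serveraddr;

    if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;

    /* Assumed convenience: allow quick restarts on the same port. */
    if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
                   &optval, sizeof(optval)) < 0) {
        close(listenfd);
        return -1;
    }

    /* bind: tie the descriptor to <any local address, port>. */
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
    serveraddr.sin_port = htons(port);
    if (bind(listenfd, (struct sockaddr *)&serveraddr,
             sizeof(serveraddr)) < 0) {
        close(listenfd);
        return -1;
    }

    /* listen: mark the descriptor as one that accepts connection requests. */
    if (listen(listenfd, 1024) < 0) {   /* backlog value is an assumption */
        close(listenfd);
        return -1;
    }
    return listenfd;
}
```

The returned descriptor is what the server later passes to accept. Passing port 0 asks the kernel to choose a free port, which is convenient for trying the function out.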
Echo Client Main Routine
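#include "csapp.h"

/* usage: ./echoclient host port */
int main(int argc, char **argv)
{
    int clientfd, port;
    char *host, buf[MAXLINE];
    rio_t rio;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <host> <port>\n", argv[0]);
        exit(0);
    }
    host = argv[1];
    port = atoi(argv[2]);

    clientfd = Open_clientfd(host, port);       /* connect to the server */
    Rio_readinitb(&rio, clientfd);              /* attach a read buffer to the socket */

    while (Fgets(buf, MAXLINE, stdin) != NULL) {
        Rio_writen(clientfd, buf, strlen(buf)); /* send one text line */
        Rio_readlineb(&rio, buf, MAXLINE);      /* read the echoed line */
        Fputs(buf, stdout);
    }
    Close(clientfd);                            /* the server sees EOF */
    exit(0);
}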
Echo Server: Main Routine
#include "csapp.h"

void echo(int connfd);

int main(int argc, char **argv)
{
    int listenfd, connfd, port;
    socklen_t clientlen;
    struct sockaddr_in clientaddr;
    struct hostent *hp;
    char *haddrp;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(0);
    }
    port = atoi(argv[1]);   /* the server listens on a port passed
                               on the command line */

    listenfd = Open_listenfd(port);
    while (1) {
        clientlen = sizeof(clientaddr);
        connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
        hp = Gethostbyaddr((const char *)&clientaddr.sin_addr.s_addr,
                           sizeof(clientaddr.sin_addr.s_addr), AF_INET);
        haddrp = inet_ntoa(clientaddr.sin_addr);
        printf("server connected to %s (%s)\n", hp->h_name, haddrp);
        echo(connfd);       /* serve this client until it closes */
        Close(connfd);
    }
}
Echo Server: echo
void echo(int connfd)
{
    size_t n;
    char buf[MAXLINE];
    rio_t rio;

    Rio_readinitb(&rio, connfd);
    while ((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0) {  /* 0 means EOF */
        printf("server received %zu bytes\n", n);
        Rio_writen(connfd, buf, n);                         /* echo the line back */
    }
}
• The server uses RIO to read and echo text lines until
EOF (end-of-file) is encountered.
– EOF notification caused by client calling
close(clientfd).
– IMPORTANT: EOF is a condition, not a particular data byte.
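To make the EOF condition concrete, here is a sketch of the same echo loop written without csapp.h, handling short counts by hand. The names echo_until_eof and echo_selftest are hypothetical. The self-check uses a local socketpair instead of a real network connection, and it uses shutdown rather than close on the "client" side so that the echoed reply can still be read afterwards.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Echo everything read from fd back to fd until EOF.
 * Returns the number of bytes echoed, or -1 on error. */
ssize_t echo_until_eof(int fd)
{
    char buf[4096];
    ssize_t n, off, w, total = 0;

    for (;;) {
        n = read(fd, buf, sizeof(buf));
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            return total;               /* EOF: a condition, not a byte in buf */

        /* write may accept fewer bytes than requested, so loop */
        for (off = 0; off < n; off += w) {
            w = write(fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno != EINTR)
                    return -1;
                w = 0;                  /* interrupted: retry this write */
            }
        }
        total += n;
    }
}

/* Self-check on a pair of connected local sockets (no network needed).
 * Copies the echoed bytes into reply and returns their count, or -1. */
ssize_t echo_selftest(char *reply, size_t size)
{
    int sv[2];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    /* The "client" end sends one line, then signals it has no more to send. */
    if (write(sv[0], "123\n", 4) == 4 && shutdown(sv[0], SHUT_WR) == 0 &&
        echo_until_eof(sv[1]) == 4)
        n = read(sv[0], reply, size);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

The loop ends only when read returns 0, which happens once the peer has stopped sending and all buffered data has been consumed.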
Unix I/O vs. Standard I/O vs. RIO
• Standard I/O and RIO are implemented using
low-level Unix I/O.
[Figure: A C application program calls standard I/O functions (fopen, fdopen, fread, fwrite, fscanf, fprintf, sscanf, sprintf, fgets, fputs, fflush, fseek, fclose) and RIO functions (rio_readn, rio_writen, rio_readinitb, rio_readlineb, rio_readnb). Both layers are built on the Unix I/O functions (open, read, write, lseek, stat, close), which are accessed via system calls.]
• Which ones should you use in your programs?
Pros and Cons of Standard I/O
• Pros:
– Buffering increases efficiency by decreasing the
number of read and write system calls.
– Short counts are handled automatically.
• Cons:
– Provides no function for accessing file metadata
– Standard I/O is not appropriate for input and output
on network sockets
– There are poorly documented restrictions on streams
that interact badly with restrictions on sockets
Pros and Cons of Standard I/O
• Restrictions on (full-duplex) streams:
– Restriction 1: input function cannot follow output function
without intervening call to fflush, fseek, fsetpos, or
rewind.
• Latter three functions all use lseek to change file position.
– Restriction 2: output function cannot follow an input function
without intervening call to fseek, fsetpos, or rewind.
• Restriction on sockets:
– You are not allowed to change the file position of a socket.
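The socket restriction can be observed directly: lseek on a socket fails with errno set to ESPIPE. The sketch below checks this on a local socketpair. The helper names is_seekable and socket_seek_errno are assumptions made for this illustration.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Returns 1 if the file position of fd can be queried/changed, 0 otherwise. */
int is_seekable(int fd)
{
    return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
}

/* Try lseek on a local socket. Returns the errno that lseek reported,
 * 0 if the seek unexpectedly succeeded, or -1 if the socket pair
 * could not be created. */
int socket_seek_errno(void)
{
    int sv[2], err = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    errno = 0;
    if (!is_seekable(sv[0]))
        err = errno;        /* save before close can overwrite it */
    close(sv[0]);
    close(sv[1]);
    return err;
}
```

This is why the fseek, fsetpos, and rewind calls needed to satisfy the stream restrictions cannot be used on a socket.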
Choosing I/O Functions
• General rule: Use the highest-level I/O functions you
can.
– Many C programmers are able to do all of their work using
the standard I/O functions.
• When to use standard I/O?
– When working with disk or terminal files.
• When to use raw Unix I/O?
– When you need to fetch file metadata.
– In rare cases when you need absolute highest performance.
• When to use RIO?
– When you are reading and writing network sockets or pipes.
– Never use standard I/O or raw Unix I/O on sockets or pipes.
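The reason raw Unix I/O is awkward on sockets and pipes is the short count: read may return fewer bytes than requested even though more will arrive later. A minimal sketch of a robust read, modeled on the idea behind rio_readn, is shown below. The name readn_sketch is hypothetical, and this is not claimed to be the csapp.h source.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>

/* Read up to n bytes from fd, retrying after short counts and signal
 * interruptions. Returns the number of bytes read (less than n only
 * if EOF was reached), or -1 on error. */
ssize_t readn_sketch(int fd, void *usrbuf, size_t n)
{
    size_t nleft = n;
    ssize_t nread;
    char *bufp = usrbuf;

    while (nleft > 0) {
        nread = read(fd, bufp, nleft);
        if (nread < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;              /* real error */
        }
        if (nread == 0)
            break;                  /* EOF */
        nleft -= (size_t)nread;     /* short count: keep reading */
        bufp += nread;
    }
    return (ssize_t)(n - nleft);
}
```

A caller that asks for 10 bytes from a pipe holding only 6 bytes, whose write end has been closed, gets 6 back. A following call returns 0, the EOF condition.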
Testing the Echo Server With telnet
bass> echoserver 5000
server established connection with KITTYHAWK.CMCL (128.2.194.242)
server received 5 bytes: 123
server established connection with KITTYHAWK.CMCL (128.2.194.242)
server received 8 bytes: 456789
kittyhawk> telnet bass 5000
Trying 128.2.222.85...
Connected to BASS.CMCL.CS.CMU.EDU.
Escape character is '^]'.
123
123
Connection closed by foreign host.
kittyhawk> telnet bass 5000
Trying 128.2.222.85...
Connected to BASS.CMCL.CS.CMU.EDU.
Escape character is '^]'.
456789
456789
Connection closed by foreign host.
kittyhawk>
Running the Echo Client and Server
bass> echoserver 5000
server established connection with KITTYHAWK.CMCL (128.2.194.242)
server received 4 bytes: 123
server established connection with KITTYHAWK.CMCL (128.2.194.242)
server received 7 bytes: 456789
...
kittyhawk> echoclient bass 5000
Please enter msg: 123
Echo from server: 123
kittyhawk> echoclient bass 5000
Please enter msg: 456789
Echo from server: 456789
kittyhawk>
Connected vs. Listening Descriptors
• Listening descriptor
– End point for client connection requests.
– Created once and exists for lifetime of the server.
• Connected descriptor
– End point of the connection between client and server.
– A new descriptor is created each time the server accepts a
connection request from a client.
– Exists only as long as it takes to service client.
• Why the distinction?
– Allows for concurrent servers that can communicate over
many client connections simultaneously.
• E.g., Each time we receive a new request, we fork a child to handle the
request.
Anatomy of an HTTP Transaction
unix> telnet www.aol.com 80               Client: open connection to server
Trying 205.188.146.23...                  Telnet prints 3 lines to the terminal
Connected to aol.com.
Escape character is '^]'.
GET / HTTP/1.1                            Client: request line
host: www.aol.com                         Client: required HTTP/1.1 HOST header
                                          Client: empty line terminates headers.
HTTP/1.0 200 OK                           Server: response line
MIME-Version: 1.0                         Server: followed by five response headers
Date: Mon, 08 Jan 2001 04:59:42 GMT
Server: NaviServer/2.0 AOLserver/2.3.3
Content-Type: text/html                   Server: expect HTML in the response body
Content-Length: 42092                     Server: expect 42,092 bytes in the resp body
                                          Server: empty line (“\r\n”) terminates hdrs
<html>                                    Server: first HTML line in response body
...                                       Server: 766 lines of HTML not shown.
</html>                                   Server: last HTML line in response body
Connection closed by foreign host.        Server: closes connection
unix>                                     Client: closes connection and terminates
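The two ends of this exchange can be sketched as string handling in C: the client builds the request line and Host header, and later reads the status code from the response line. The function names build_get_request and parse_status_code are hypothetical, and no network I/O is performed here.

```c
#include <stdio.h>

/* Build a minimal HTTP/1.1 GET request: request line, Host header,
 * and the empty line that terminates the headers.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t size,
                      const char *host, const char *path)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", path, host);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}

/* Extract the status code from a response line such as
 * "HTTP/1.0 200 OK". Returns the code, or -1 if the line
 * does not have that shape. */
int parse_status_code(const char *line)
{
    int major, minor, code;

    if (sscanf(line, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}
```

Header names are case-insensitive, so "Host:" here is equivalent to the "host:" typed in the telnet session.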
The add.com Experience
[Figure: Browser screenshot. The input URL is annotated with its host, port, CGI program, and args; below it is the output page.]
Serving Dynamic Content With GET
• Question: How does the client pass arguments to
the server?
• Answer: The arguments are appended to the URI
• Can be encoded directly in a URL typed to a
browser or a URL in an HTML link
– http://add.com/cgi-bin/adder?1&2
– adder is the CGI program on the server that will do the
addition.
– argument list starts with “?”
– arguments separated by “&”
– spaces represented by “+” or “%20”
• Can also be generated by an HTML form
– <form method=get action="http://add.com/cgi-bin/postadder">
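On the server side, a program such as adder receives the text after the “?” and has to split it at “&”. The sketch below shows that step for two integer arguments. The function name add_args is hypothetical. In a real CGI program the argument string would typically come from the QUERY_STRING environment variable, which is not shown here.

```c
#include <stdlib.h>
#include <string.h>

/* Split an argument string of the form "n1&n2" at '&' and add the
 * two integers. Stores the result in *sum and returns 0, or returns
 * -1 if there is no '&' separator. */
int add_args(const char *args, long *sum)
{
    const char *amp = strchr(args, '&');

    if (amp == NULL)
        return -1;
    /* strtol stops at the first non-digit, so the first call
     * reads n1 and stops at '&'. */
    *sum = strtol(args, NULL, 10) + strtol(amp + 1, NULL, 10);
    return 0;
}
```

For the URL http://add.com/cgi-bin/adder?1&2, the argument string is "1&2" and the sum is 3.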
Assignment
• Adder.com: Make and start your own tiny webserver
http://staff.science.uva.nl/~arnoud/onderwijs/CS/conc/tiny.tar.gz
• Problem 11.10: Access the adder code with a form
<form method=get action="http://add.com/cgi-bin/postadder">
See http://staff.science.uva.nl/~arnoud/onderwijs/CS/Evaluation.html