CSSE 593 Internet Applications & Services


Dr. Yingwu Zhu

What is an internet?
◦ Network of networks

What is the Internet?
◦ A global internet based on the IP protocol

To what does “Internet technology” refer?
◦ Architecture, protocols and services

Services accessed over the net

Users, people who use the applications
◦ Everyone (mom and pop, kids)
◦ get something done (hopefully useful)

Service Designers
◦ You: protocol design and implementation
◦ Scale, performance, cost, incremental deployment

Service Providers/middleware
◦ Administrators and ISPs
◦ Management, revenue, deployment

Market/business models for the Internet
◦ Consumer to consumer (eBay), business to
consumer (Amazon, Orbitz), business to business (IBM,
Ariba), consumer to business (HotJobs, Monster)



Simple client/server abstraction
Client sends a request, and server sends a
response
Informational, transactional in nature

Web 1.0
◦ Users as readers of content (read-only)

Web 2.0
◦ Users create content (group communication)
◦ Read-write
◦ E.g. social network sites, blogs, wikis, YouTube

Web 3.0 ?
◦ Read, write, execute (in context, personalization), programs
◦ using semantic web, microformats, natural language
search, data mining, machine learning, recommendation
agents, and artificial intelligence technologies to improve
user experience

Web 4.0 ?
◦ Every living/non-living object connected?
Top 15 countries by Internet population (comScore, as of Dec. 2008; one billion in total):
1. China: 179.7 million
2. United States: 163.3 million
3. Japan: 60.0 million
4. Germany: 37.0 million
5. United Kingdom: 36.7 million
6. France: 34.0 million
7. India: 32.1 million
8. Russia: 29.0 million
9. Brazil: 27.7 million
10. South Korea: 27.3 million
11. Canada: 21.8 million
12. Italy: 20.8 million
13. Spain: 17.9 million
14. Mexico: 12.5 million
15. Netherlands: 11.8 million


46% of Internet users watch an online
video once a week (as of Sept’06)
8% of Internet users downloaded a movie
during the 3Q06 using P2P apps
◦ 60% adult content, 20% TV content, rest is
movies, clips, etc

YouTube stats (March’06)
◦ 50% of users are younger than 20 years old
◦ 60% of all videos watched online
◦ 65,000 new videos uploaded daily
◦ Total viewing time: about 10,000 years!
◦ YouTube consumed as much bandwidth in 2006 as the whole Internet did in 2000






Almost all users do the basics (email, Web
browsing)
50% of users pay bills online
25% online job hunting
8% upload videos
5% publish blogs
4% date online

The explosive growth in video apps &
downloads strains the network’s capacity
◦ YouTube today (January 2007) consumes as much
bandwidth as the entire Internet consumed in the
year 2000
◦ P2P video accounts for 30-40% total traffic in 2007
◦ Predicted: Internet video could soon consume 10
times the Internet current yearly traffic

BitTorrent accounts for as much as 40% of
all worldwide internet traffic (Dec. 2006)

millions of connected computing devices: hosts, end-systems
◦ PCs, workstations, servers
◦ PDAs, phones, toasters
◦ running network apps

communication links
◦ fiber, copper, radio, satellite

routers: forward packets (chunks) of data thru network

[Figure: server, workstation and mobile hosts connected through routers to a local ISP, a regional ISP and a company network]

protocols: control sending, receiving of msgs
◦ e.g., TCP, IP, HTTP, FTP, PPP

Internet: “network of networks”
◦ loosely hierarchical
◦ public Internet versus private intranet

Internet standards
◦ RFC: Request for comments
◦ IETF: Internet Engineering Task Force

communication infrastructure enables distributed applications:
◦ WWW, email, games, e-commerce, databases, voting
◦ more?

communication services provided:
◦ connectionless
◦ connection-oriented

Network users: Does the network support the
users’ applications
◦ Reliability
◦ Error free service
◦ Speed of data transfer

Network designers: Cost efficient network
design
◦ Good utilization of network resources
◦ Cost of building the network
◦ Types of services to be supported

Network providers: Network administration
and customer service
◦ Maximize Revenue
◦ Minimize Operations Expenses
◦ Survivability and Resiliency (Why?)
human protocols:
◦ “what’s the time?”
◦ “I have a question”
◦ introductions
… specific msgs sent
… specific actions taken when msgs received, or other events

network protocols:
◦ machines rather than humans
◦ all communication activity in Internet governed by protocols

protocols define format, order of msgs sent and received among network entities, and actions taken on msg transmission, receipt
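a human protocol and a computer network protocol:

human protocol (messages in time order):
◦ Hi
◦ Hi
◦ Got the time?
◦ 2:00

computer network protocol (messages in time order):
◦ TCP connection req.
◦ TCP connection reply.
◦ Get http://gaia.cs.umass.edu/index.htm
◦ <file>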

Building blocks of a network architecture

Each protocol object has two different
interfaces
◦ service interface: defines operations on this
protocol
◦ peer-to-peer interface: defines messages
exchanged with peer

Term “protocol” is overloaded
◦ specification of peer-to-peer interface
◦ module that implements this interface

end systems (hosts):
◦ run application programs
◦ e.g., WWW, email
◦ at “edge of network”

client/server model
◦ client host requests, receives
service from server
◦ e.g., WWW client (browser)/
server; email client/server

peer-peer model:
◦ host interaction symmetric
◦ e.g.: teleconferencing,
Gnutella, Kazaa
Goal: data transfer between end sys.

handshaking: setup (prepare for) data transfer ahead of time
◦ Hello, hello back human protocol
◦ set up “state” in two communicating hosts

TCP - Transmission Control Protocol
◦ Internet’s connection-oriented service

TCP service [RFC 793]

reliable, in-order byte-stream data transfer
◦ loss: acknowledgements and retransmissions

flow control:
◦ sender won’t overwhelm receiver

congestion control:
◦ senders “slow down sending rate” when network congested
Goal: data transfer between end systems
◦ same as before!

UDP - User Datagram Protocol [RFC 768]: Internet’s connectionless service
◦ unreliable data transfer
◦ no flow control
◦ no congestion control

Apps using TCP:
◦ HTTP (WWW), FTP (file transfer), Telnet (remote login), SMTP (email)

Apps using UDP:
◦ streaming media, teleconferencing, Internet telephony


mesh of interconnected routers

the fundamental question: how is data transferred through net?
◦ circuit switching: dedicated circuit per call: telephone net
◦ packet-switching: data sent thru net in discrete “chunks”
End-end resources reserved for “call”
◦ link bandwidth, switch capacity
◦ dedicated resources: no sharing
◦ circuit-like (guaranteed) performance
◦ call setup required


Must share (multiplex) network resources among multiple users.

Common Multiplexing Strategies
◦ Time-Division Multiplexing (TDM)
◦ Frequency-Division Multiplexing (FDM): each flow gets its own frequency band (a share of the bandwidth)

Multiplexing multiple logical flows over a single physical link.
network resources (e.g., bandwidth) divided into “pieces”
◦ pieces allocated to calls
◦ resource piece idle if not used by owning call (no sharing)

dividing link bandwidth into “pieces”
◦ frequency division
◦ time division
each end-end data stream divided into packets
◦ user A, B packets share network resources
◦ each packet uses full link bandwidth
◦ resources used as needed (in contrast to circuit switching: no bandwidth division into “pieces”, no dedicated allocation, no resource reservation)

resource contention:
◦ aggregate resource demand can exceed amount available
◦ congestion: packets queue, wait for link use

store and forward: packets move one hop at a time
◦ transmit over link
◦ wait turn at next link
[Figure: statistical multiplexing (on-demand sharing). Hosts A through E; a 10 Mbps Ethernet, a 1.5 Mbps link and a 45 Mbps link; queue of packets waiting for output link]
Packet-switching:
store and forward behavior
Packet switching allows more users to use network!

1 Mbps link, N users

each user:
◦ 100 Kbps when “active”
◦ active 10% of time

circuit-switching:
◦ 10 users

packet switching:
◦ with 35 users, probability that more than 10 are active at the same time is less than .004
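The 35-user figure can be checked with a short calculation. The sketch below assumes, as the slide implicitly does, that users are active independently of one another, so the number of active users follows a binomial distribution; the function name is chosen here for illustration.

```python
from math import comb


def prob_more_than(k, n, p):
    """P(X > k) for X ~ Binomial(n, p): more than k of n users active at once."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1, n + 1))


# Numbers from the slide: 35 users, each active 10% of the time, on a
# 1 Mbps link that can carry 10 simultaneous 100 Kbps users.
p_overload = prob_more_than(k=10, n=35, p=0.1)
print(f"P(more than 10 of 35 users active) = {p_overload:.6f}")
```

Under that independence assumption the result comes out at roughly 0.0004, comfortably inside the .004 bound quoted on the slide, which is why 35 packet-switched users can share a link that supports only 10 circuit-switched users.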
Is packet switching a “slam dunk winner?”

Great for bursty data
◦ resource sharing
◦ no call setup

Excessive congestion: packet delay and loss
◦ protocols needed for reliable data transfer, congestion control

Q: How to provide circuit-like behavior?
◦ bandwidth guarantees needed for audio/video apps
◦ still an unsolved problem!

Goal: move packets among routers from source to destination
◦ we’ll study several path selection algorithms

datagram network:
◦ destination address determines next hop
◦ routes may change during session
◦ analogy: driving, asking directions

virtual circuit network:
◦ each packet carries tag (virtual circuit ID), tag determines next hop
◦ fixed path determined at call setup time, remains fixed thru call
◦ routers maintain per-call state
◦ e.g., ATM
Q: How to connect end systems to edge router?
◦ residential access nets
◦ institutional access networks (school, company)
◦ mobile access networks

Keep in mind:
◦ bandwidth (bits per second) of access network?
◦ shared or dedicated?



Dialup via modem
◦ up to 56Kbps direct access to router (conceptually)

ISDN: integrated services digital network: 128Kbps all-digital connect to router

ADSL: asymmetric digital subscriber line
◦ up to 1 Mbps home-to-router
◦ up to 8 Mbps router-to-home

HFC: hybrid fiber coax
◦ asymmetric: up to 10 Mbps
downstream, 1 Mbps
upstream

network of cable and
fiber attaches homes to
ISP router
◦ shared access to router
among home
◦ issues: congestion,
dimensioning

deployment: available via
cable companies, e.g.,
MediaOne, Comcast



company/univ local area network (LAN) connects end system to edge router

Ethernet:
◦ shared or dedicated cable connects end system and router
◦ 10 Mbps, 100 Mbps, Gigabit Ethernet

deployment: institutions, home LANs soon


shared wireless access network connects end system to router

wireless LANs:
◦ radio spectrum replaces wire
◦ e.g., Lucent Wavelan 10 Mbps

wider-area wireless access
◦ CDPD: wireless access to ISP router via cellular network (base stations)

[Figure: mobile hosts reach a router through a base station]
packets experience delay on end-to-end path
◦ four sources of delay at each hop: nodal processing, queueing, transmission, propagation

nodal processing:
◦ check bit errors
◦ determine output link

queueing
◦ time waiting at output link for transmission
◦ depends on congestion level of router
Propagation delay:
◦ d = length of physical link
◦ s = propagation speed in medium (~2x10^8 m/sec)
◦ propagation delay = d/s

Transmission delay:
◦ R = link bandwidth (bps)
◦ L = packet length (bits)
◦ time to send bits into link = L/R

Note: s and R are very different quantities!
Latency (delay)
◦ Time it takes to send message from point A to point B
◦ Example: 24 milliseconds (ms)
◦ Sometimes interested in round-trip time (RTT)
◦ Components of latency
Latency = Propagation + Transmit + Queue + Proc.
Propagation = Distance / SpeedOfLight
Transmit = Size / Bandwidth

Propagation delay
◦ The propagation delay over a link is the time it takes a bit to travel from one end of the link to the other
◦ = d/s

Transmission delay
◦ It is the amount of time it takes to push the packet onto the link
◦ = L/R (packet length L over link bandwidth R)

Total latency over the link
◦ = transmission delay + propagation delay
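To make the two delay terms concrete, here is a minimal sketch of the formulas above. The function names and the example values (a 1500-byte packet, a 10 Mbps link, 1000 km of cable) are assumptions chosen for illustration, not figures from the slides; queueing and processing delay are left out.

```python
def propagation_delay(d_m, s_mps=2e8):
    """Seconds for one bit to cross a link of length d_m metres at speed s_mps."""
    return d_m / s_mps


def transmission_delay(l_bits, r_bps):
    """Seconds needed to push l_bits onto a link of bandwidth r_bps."""
    return l_bits / r_bps


def link_latency(l_bits, r_bps, d_m, s_mps=2e8):
    """Total latency over one link, ignoring queueing and processing."""
    return transmission_delay(l_bits, r_bps) + propagation_delay(d_m, s_mps)


# Hypothetical example: 1500-byte packet, 10 Mbps link, 1000 km long.
packet_bits = 1500 * 8
latency = link_latency(packet_bits, r_bps=10e6, d_m=1_000_000)
print(f"transmission: {transmission_delay(packet_bits, 10e6) * 1000:.2f} ms")
print(f"propagation:  {propagation_delay(1_000_000) * 1000:.2f} ms")
print(f"total:        {latency * 1000:.2f} ms")
```

With these values propagation dominates (5 ms against 1.2 ms), which illustrates why s and R must not be confused: a faster link shortens only the transmission term.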

Delay x Bandwidth Product
◦ e.g., 100ms RTT and 45Mbps Bandwidth = 560KB of data

We have to view the network as a buffer.
This may have interesting consequences:
◦ How much data did the sender transmit before a
response can be received?
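The 560KB figure follows directly from multiplying delay by bandwidth and converting bits to bytes. A minimal sketch, with a function name chosen here for illustration:

```python
def delay_bandwidth_product_bytes(delay_s, bandwidth_bps):
    """Bytes that fit 'in the pipe': delay (s) x bandwidth (bit/s), over 8 bits per byte."""
    return delay_s * bandwidth_bps / 8


# Values from the slide: 100 ms RTT, 45 Mbps.
in_flight = delay_bandwidth_product_bytes(0.100, 45e6)
print(f"{in_flight:,.0f} bytes in flight")
```

This gives 562,500 bytes, which the slide rounds to 560KB. It is the amount of data a sender can have transmitted before the first response can possibly arrive.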

application: supporting network applications
◦ ftp, smtp, http

transport: host-host data transfer
◦ tcp, udp

network: routing of datagrams from source to destination
◦ ip, routing protocols

link: data transfer between neighboring network elements
◦ ppp, ethernet

physical: bits “on the wire”
Dealing with complex systems:

explicit structure allows identification, relationship of complex system’s pieces
◦ layered reference model for discussion

modularization eases maintenance, updating of system
◦ change of implementation of layer’s service transparent to rest of system
◦ e.g., change in gate procedure doesn’t affect rest of system

layering considered harmful?
Each layer:
◦ distributed
◦ “entities” implement layer functions at each node
◦ entities perform actions, exchange messages with peers

[Figure: protocol stacks (application, transport, network, link, physical) at the end systems; network, link, physical only at an intermediate node]
E.g.: transport
◦ take data from app
◦ add addressing, reliability check info to form “datagram”
◦ send datagram to peer
◦ wait for peer to ack receipt
◦ analogy: post office

[Figure: transport entities in two end-system protocol stacks exchange data and ack]
Each layer takes data from above
◦ adds header information to create new data unit
◦ passes new data unit to layer below

Data unit at each layer, at both source and destination (M = message; Ht, Hn, Hl = transport, network, link headers):
application: M (message)
transport: Ht M (segment)
network: Hn Ht M (datagram)
link: Hl Hn Ht M (frame)
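The header nesting can be sketched in a few lines. This is a toy model only: the literal byte strings `Ht|`, `Hn|` and `Hl|` stand in for real transport, network and link headers, and the function names are made up for this example.

```python
# Placeholder headers, outermost last when encapsulating.
TRANSPORT_HDR, NETWORK_HDR, LINK_HDR = b"Ht|", b"Hn|", b"Hl|"


def encapsulate(message: bytes) -> bytes:
    """Source side: each layer prepends its header to the unit from above."""
    segment = TRANSPORT_HDR + message   # transport layer: segment
    datagram = NETWORK_HDR + segment    # network layer: datagram
    frame = LINK_HDR + datagram         # link layer: frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Destination side: each layer strips its own header, outermost first."""
    data = frame
    for header in (LINK_HDR, NETWORK_HDR, TRANSPORT_HDR):
        if not data.startswith(header):
            raise ValueError(f"expected header {header!r}")
        data = data[len(header):]
    return data


frame = encapsulate(b"M")
print(frame)               # the frame as sent on the link
print(decapsulate(frame))  # the original message recovered at the destination
```

Each layer touches only its own header, which is the point of the layering: the link layer never needs to interpret Hn or Ht.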

The combination of data from the next higher layer and control information is referred to as a protocol data unit (PDU).
◦ Control information in the transport layer may include: Destination Service Access Point (DSAP), sequence number, error-detection code


roughly hierarchical

national/international backbone providers (NBPs)
◦ e.g. BBN/GTE, Sprint, AT&T, IBM, UUNet
◦ interconnect (peer) with each other privately, or at public Network Access Points (NAPs)

regional ISPs
◦ connect into NBPs

local ISP, company
◦ connect into regional ISPs

[Figure: local ISPs connect to regional ISPs, which connect to NBP A and NBP B; the NBPs meet at NAPs]
[Figure: e.g. BBN/GTE US backbone network]
Application: communicating, distributed processes
◦ running in network hosts in “user space”
◦ exchange messages to implement app
◦ e.g., email, file transfer, the Web

Application-layer protocols
◦ one “piece” of an app
◦ define messages exchanged by apps and actions taken
◦ use services provided by lower layer protocols
Typical network app has two pieces: client and server

Client:
◦ initiates contact with server (“speaks first”)
◦ typically requests service from server
◦ for Web, client is implemented in browser; for e-mail, in mail reader, e.g., Outlook

Server:
◦ provides requested service to client
◦ e.g., Web server sends requested Web page, mail server delivers e-mail

[Figure: client protocol stack sends request to server protocol stack; server sends reply]
Data loss
◦ some apps (e.g., audio) can tolerate some loss
◦ other apps (e.g., file transfer, telnet) require 100% reliable data transfer

Bandwidth
◦ some apps (e.g., multimedia) require minimum amount of bandwidth to be “effective”
◦ other apps (“elastic apps”) make use of whatever bandwidth they get

Timing
◦ some apps (e.g., Internet telephony, interactive games) require low delay to be “effective”
Application             Data loss       Bandwidth          Time Sensitive
file transfer           no loss         elastic            no
e-mail                  no loss         elastic            no
Web documents           loss-tolerant   elastic            no
real-time audio/video   loss-tolerant   audio: 5Kb-1Mb     yes, 100’s msec
                                        video: 10Kb-5Mb
stored audio/video      loss-tolerant   same as above      yes, few secs
interactive games       loss-tolerant   few Kbps up        yes, 100’s msec
financial apps          no loss         elastic            yes and no
TCP service:
◦ connection-oriented: setup required between client, server
◦ reliable transport between sending and receiving process
◦ flow control: sender won’t overwhelm receiver
◦ congestion control: throttle sender when network overloaded
◦ does not provide: timing, minimum bandwidth guarantees

UDP service:
◦ unreliable data transfer between sending and receiving process
◦ does not provide: connection setup, reliability, flow control, congestion control, timing, or bandwidth guarantee

Q: why bother? Why is there a UDP?
Application              Application layer protocol        Underlying transport protocol
e-mail                   smtp [RFC 821]                    TCP
remote terminal access   telnet [RFC 854]                  TCP
Web                      http [RFC 2068]                   TCP
file transfer            ftp [RFC 959]                     TCP
streaming multimedia     proprietary (e.g. RealNetworks)   TCP or UDP
remote file server       NFS                               TCP or UDP
Internet telephony       proprietary (e.g., Vocaltec)      typically UDP
http: hypertext transfer protocol
◦ Web’s application layer protocol
◦ client/server model
◦ client: browser that requests, receives, “displays” Web objects
◦ server: Web server sends objects in response to requests
◦ http1.0: RFC 1945
◦ http1.1: RFC 2068

[Figure: PC running Explorer and Mac running Navigator as clients of a server running NCSA Web server]
http: TCP transport service:
◦ client initiates TCP connection (creates socket) to server, port 80
◦ server accepts TCP connection from client
◦ http messages (application-layer protocol messages) exchanged between browser (http client) and Web server (http server)
◦ TCP connection closed

http is “stateless”
◦ server maintains no information about past client requests

aside: Protocols that maintain “state” are complex!
◦ past history (state) must be maintained
◦ if server/client crashes, their views of “state” may be inconsistent, must be reconciled

HTTP is the protocol that supports
communication between web browsers and
web servers.

A “Web Server” is an HTTP server

Most clients/servers today speak version 1.1,
but 1.0 is also in use.
Suppose user enters URL www.someSchool.edu/someDepartment/home.index (contains text, references to 10 jpeg images)

Steps, in time order:
1a. http client initiates TCP connection to http server (process) at www.someSchool.edu. Port 80 is default for http server.
1b. http server at host www.someSchool.edu waiting for TCP connection at port 80. “accepts” connection, notifying client
2. http client sends http request message (containing URL) into TCP connection socket
3. http server receives request message, forms response message containing requested object (someDepartment/home.index), sends message into socket
4. http server closes TCP connection.
5. http client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg objects
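The request message in step 2 is plain text. The sketch below builds one for the example URL without opening any connection; the helper name is made up here, and including a `Host` header is an assumption of this example (it is optional in HTTP/1.0 and required in HTTP/1.1).

```python
def build_get_request(url: str, version: str = "1.0") -> str:
    """Format a minimal HTTP GET request for a URL given as host/path."""
    host, _, path = url.partition("/")
    request_line = f"GET /{path} HTTP/{version}"
    headers = [f"Host: {host}"]
    # A blank line ends the header section; a GET carries no entity body.
    return "\r\n".join([request_line, *headers, "", ""])


request = build_get_request("www.someSchool.edu/someDepartment/home.index")
print(request)
```

A client would write this string into the TCP socket opened to port 80 in step 1a, then read the response until the server closes the connection in step 4.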
Non-persistent
◦ HTTP/1.0
◦ server parses request, responds, and closes TCP connection
◦ 2 RTTs to fetch each object
◦ Each object transfer suffers from slow start
But most 1.0 browsers use parallel TCP connections.

Persistent
◦ default for HTTP/1.1
◦ on same TCP connection: server parses request, responds, parses new request, ...
◦ Client sends requests for all referenced objects as soon as it receives base HTML.
◦ Fewer RTTs and less slow start.
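A rough count of round trips shows the size of the difference. This is a simplified model with assumptions stated loudly: object transmission time and slow start are ignored, the non-persistent client uses one connection at a time (no parallel connections), and the persistent client pipelines all follow-up requests so they cost one further RTT in total. The function names are illustrative.

```python
def rtts_non_persistent(n_referenced: int) -> int:
    """2 RTTs (TCP setup + request/response) per object, base page included."""
    return 2 * (1 + n_referenced)


def rtts_persistent_pipelined(n_referenced: int) -> int:
    """1 RTT for TCP setup, 1 for the base HTML, 1 more for all referenced objects."""
    rtts = 2
    if n_referenced > 0:
        rtts += 1
    return rtts


# The earlier example page references 10 jpeg images.
print("non-persistent:", rtts_non_persistent(10), "RTTs")
print("persistent, pipelined:", rtts_persistent_pipelined(10), "RTTs")
```

Under this model the example page with 10 images costs 22 RTTs non-persistently and 3 RTTs over one persistent, pipelined connection; parallel connections in 1.0 browsers narrow the gap in practice.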
Entity body is empty for “GET”, but not for “POST”
Goal: satisfy client request without involving origin server

user sets browser: Web accesses via web cache

client sends all http requests to web cache
◦ if object at web cache, web cache immediately returns object in http response
◦ else requests object from origin server, then returns http response to client

[Figure: clients send requests to a proxy server (web cache), which contacts origin servers when needed]
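The hit/miss logic of a web cache fits in a small class. This is a sketch under simplifying assumptions: objects never expire, the cache is unbounded, and `fake_origin` is a hypothetical stand-in for a real request to the origin server.

```python
class WebCache:
    """Toy proxy cache: serve from the local store, else fetch from origin."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: url -> object
        self._store = {}
        self.origin_requests = 0

    def get(self, url):
        if url in self._store:           # hit: answer immediately
            return self._store[url]
        self.origin_requests += 1        # miss: go to the origin server
        obj = self._fetch(url)
        self._store[url] = obj
        return obj


def fake_origin(url):
    """Hypothetical origin server used only for this illustration."""
    return f"<contents of {url}>"


cache = WebCache(fake_origin)
first = cache.get("www.someSchool.edu/someDepartment/home.index")
second = cache.get("www.someSchool.edu/someDepartment/home.index")
print(cache.origin_requests)  # number of requests that reached the origin
```

Only the first request reaches the origin; the second is answered from the cache, which is where the response-time and traffic savings come from.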
Assume: cache is “close” to client (e.g., in same network)
◦ smaller response time: cache “closer” to client
◦ decrease traffic to distant servers
◦ link out of institutional/local ISP network often bottleneck

[Figure: institutional network (10 Mbps LAN, institutional cache) connected by a 1.5 Mbps access link to the public Internet and origin servers]


Hierarchy of caches to serve a larger population
ICP (Internet cache protocol) to coordinate
web caches
People: many identifiers:
◦ SSN, name, Passport #

Internet hosts, routers:
◦ IP address (32 bit) - used for addressing datagrams
◦ “name”, e.g., gaia.cs.umass.edu - used by humans

Q: map between IP addresses and name ?

Domain Name System:
◦ distributed database implemented in hierarchy of many name servers
◦ application-layer protocol: host, routers, name servers communicate to resolve names (address/name translation)
◦ note: core Internet function implemented as application-layer protocol
◦ complexity at network’s “edge”
Why not centralize DNS?
◦ single point of failure
◦ traffic volume
◦ distant centralized database
◦ Maintenance
◦ DoS attacks?
doesn’t scale!

no server has all name-to-IP address mappings

local name servers:
◦ each ISP, company has local (default) name server
◦ host DNS query first goes to local name server

authoritative name server:
◦ for a host: stores that host’s IP address, name
◦ can perform name/address translation for that host’s name





root name server:
◦ contacted by local name server that can not resolve name
◦ contacts authoritative name server if name mapping not known
◦ gets mapping
◦ returns mapping to local name server

13 root DNS servers worldwide: replication for security and reliability

Top-level DNS server: org, edu, com, jp, cn, fr, uk
host surf.eurecom.fr wants IP address of gaia.cs.umass.edu
1. Contacts its local DNS server, dns.eurecom.fr
2. dns.eurecom.fr contacts root name server, if necessary
3. root name server contacts authoritative name server, dns.umass.edu, if necessary

[Figure: requesting host surf.eurecom.fr, local name server dns.eurecom.fr, root name server, authoritative name server dns.umass.edu, and target gaia.cs.umass.edu; messages numbered 1 to 6]
Root name server:
◦ may not know authoritative name server
◦ may know intermediate name server: who to contact to find authoritative name server

[Figure: requesting host surf.eurecom.fr, local name server dns.eurecom.fr, root name server, intermediate name server dns.umass.edu, authoritative name server dns.cs.umass.edu, and target gaia.cs.umass.edu; messages numbered 1 to 8]
recursive query:
◦ puts burden of name resolution on contacted name server
◦ heavy load?

iterated query:
◦ contacted server replies with name of server to contact
◦ “I don’t know this name, but ask this server”

[Figure: iterated query among requesting host surf.eurecom.fr, local name server dns.eurecom.fr, root name server, intermediate name server dns.umass.edu and authoritative name server dns.cs.umass.edu for gaia.cs.umass.edu; messages numbered 1 to 8]
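An iterated query can be simulated with a few in-memory tables. Everything here is a toy: the `SERVERS` table, the `query` and `resolve_iteratively` helpers, and the address `192.0.2.10` (a documentation-range placeholder, not the real address of gaia.cs.umass.edu) are assumptions made for illustration. No network traffic is involved.

```python
# Each toy server either knows the answer or refers the asker to another server.
SERVERS = {
    "root": {"answers": {}, "referrals": {"umass.edu": "dns.umass.edu"}},
    "dns.umass.edu": {"answers": {}, "referrals": {"cs.umass.edu": "dns.cs.umass.edu"}},
    "dns.cs.umass.edu": {"answers": {"gaia.cs.umass.edu": "192.0.2.10"}, "referrals": {}},
}


def query(server, name):
    """One request/reply exchange with a single name server."""
    table = SERVERS[server]
    if name in table["answers"]:
        return ("answer", table["answers"][name])
    for domain, next_server in table["referrals"].items():
        if name == domain or name.endswith("." + domain):
            return ("referral", next_server)  # "ask this server instead"
    return ("unknown", None)


def resolve_iteratively(name, start="root", max_hops=10):
    """The asker itself follows each referral until some server answers."""
    server, path = start, []
    for _ in range(max_hops):
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind != "referral":
            raise LookupError(f"{server} cannot help with {name}")
        server = value
    raise LookupError("too many referrals")


address, path = resolve_iteratively("gaia.cs.umass.edu")
print(address, "via", " -> ".join(path))
```

The work of chasing referrals stays with the asker (here, the role played by the local name server), so no contacted server carries the burden that a recursive query would place on it.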


once (any) name server learns mapping, it caches mapping
◦ cache entries timeout (disappear) after some time

update/notify mechanisms under design by IETF
◦ RFC 2136
◦ http://www.ietf.org/html.charters/dnsind-charter.html
DNS: distributed db storing resource records (RR)
RR format: (name, value, type, ttl)

Type=A
◦ name is hostname
◦ value is IP address

Type=NS
◦ name is domain (e.g. foo.com)
◦ value is hostname of authoritative name server for this domain

Type=CNAME
◦ name is an alias name for some “canonical” (the real) name
◦ value is canonical name

Type=MX
◦ value is hostname of mailserver associated with name
For a particular hostname

If a DNS server is authoritative, it contains
◦ a Type A record for the hostname

Otherwise
◦ maybe a Type A record for the hostname in cache
◦ a Type NS record for the domain of the hostname
◦ a Type A record for the DNS server for that domain

Example for host gaia.cs.umass.edu:
◦ (umass.edu, dns.umass.edu, NS)
◦ (dns.umass.edu, 128.119.40.111, A)
DNS protocol: query and reply messages, both with same message format

msg header
◦ identification: 16 bit # for query, reply to query uses same #
◦ flags: query or reply; recursion desired; recursion available; reply is authoritative

rest of message
◦ name, type fields for a query
◦ RRs in response to query
◦ records for authoritative servers
◦ additional “helpful” info that may be used

Try nslookup?
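The header fields listed above can be packed into bytes with the standard `struct` module. The slide names only the 16-bit identification and the four flags; the bit positions and the four 16-bit count fields used below follow the header layout in RFC 1035, and the helper names are made up for this sketch.

```python
import struct

# Flag bits within the 16-bit flags word (positions per RFC 1035).
QR_REPLY = 0x8000             # set in a reply, clear in a query
AUTHORITATIVE = 0x0400        # reply is authoritative
RECURSION_DESIRED = 0x0100
RECURSION_AVAILABLE = 0x0080


def pack_header(ident, flags, qdcount=0, ancount=0, nscount=0, arcount=0):
    """12-byte DNS header: identification, flags, then four section counts."""
    return struct.pack("!HHHHHH", ident, flags, qdcount, ancount, nscount, arcount)


def unpack_header(data):
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", data[:12])
    return {
        "id": ident,
        "is_reply": bool(flags & QR_REPLY),
        "authoritative": bool(flags & AUTHORITATIVE),
        "recursion_desired": bool(flags & RECURSION_DESIRED),
        "recursion_available": bool(flags & RECURSION_AVAILABLE),
        "counts": (qd, an, ns, ar),
    }


query_header = pack_header(0x1234, RECURSION_DESIRED, qdcount=1)
reply_header = pack_header(0x1234, QR_REPLY | AUTHORITATIVE | RECURSION_DESIRED, 1, 1)
print(query_header.hex())
print(unpack_header(reply_header))
```

The reply reuses the query's identification number, which is how the asker matches replies to outstanding queries.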


You set up a company: mynet.com

Step 1: register your domain name with a registrar
◦ Provide name and IP address mapping
◦ Primary authoritative DNS server: dns1.mynet.com, 212.212.212.1
◦ Optional: secondary DNS server: dns.mynet.com, 212.212.212.2
◦ Registrar will insert type NS and A records for you
◦ (mynet.com, dns1.mynet.com, NS)
◦ (dns1.mynet.com, 212.212.212.1, A)

Step 2: insert records into your DNS server
◦ For web server: (www.mynet.com, 212.212.212.3, A)
◦ For mail server: (mynet.com, mail.mynet.com, MX) and (mail.mynet.com, 212.212.212.4, A)
◦ Then, others can access your web server and send emails
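The records from the two steps can be modelled as (name, value, type) tuples and looked up the way a resolver would use them. This is a sketch only: the list and function names are invented here, and, following the RR format given earlier (an MX value is a mail server hostname), the mail entry is written as an MX record for the domain plus an A record for the mail host.

```python
# Records the registrar inserts (step 1), as (name, value, type).
REGISTRAR_RECORDS = [
    ("mynet.com", "dns1.mynet.com", "NS"),
    ("dns1.mynet.com", "212.212.212.1", "A"),
]

# Records inserted into the company's own DNS server (step 2).
COMPANY_RECORDS = [
    ("www.mynet.com", "212.212.212.3", "A"),
    ("mynet.com", "mail.mynet.com", "MX"),
    ("mail.mynet.com", "212.212.212.4", "A"),
]


def lookup(records, name, rtype):
    """Return the values of all records matching name and type."""
    return [value for (n, value, t) in records if n == name and t == rtype]


def authoritative_server_address(domain):
    """Follow the NS record, then the A record for that name server."""
    ns_host = lookup(REGISTRAR_RECORDS, domain, "NS")[0]
    return lookup(REGISTRAR_RECORDS, ns_host, "A")[0]


def mail_server_address(domain):
    """Follow the MX record, then the A record for the mail host."""
    mx_host = lookup(COMPANY_RECORDS, domain, "MX")[0]
    return lookup(COMPANY_RECORDS, mx_host, "A")[0]


print(authoritative_server_address("mynet.com"))
print(lookup(COMPANY_RECORDS, "www.mynet.com", "A")[0])
print(mail_server_address("mynet.com"))
```

A resolver first learns from the registrar's NS and A records which server to ask about mynet.com, then gets the web or mail address from that server's own records.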