i206: Lecture 21:
Networking, WWW, and Internet
Protocols
Marti Hearst
Spring 2012
1
[Figure: course concept map. Topics run from bits and bytes, binary numbers, number systems, and Boolean logic (truth tables, DeMorgan’s Law) through gates and circuits, data representation, storage, and compression, machine instructions, the CPU, the memory hierarchy, and the operating system, up to data structures, algorithms and analysis, formal models, design principles, application programs, inter-process communication, network standards and protocols (TCP/IP), security and cryptography (RSA), and distributed systems.]
2
Topics
• Network abstractions
• Network architecture
• How the WWW works, end to end
– Illustrated with the example of a web search engine
3
Network as Communication
Channel
Source: Coulouris, Dollimore and Kindberg
4
Network Cloud
[Figure: a client and a server connected through a network cloud.]
5
Network: Routers & Links
[Figure: a network of routers and links connecting hosts or local networks; nodes are labeled A to E and links are numbered 1 to 6.]
Source: Coulouris, Dollimore and Kindberg
6
Network: More Details
[Figure: network detail, grouped into four regions.
Customer premises: client, mobile client, corporate LAN, firewall, router, and analog, xDSL, and cable modems.
Telephone network: local loop, local ingress and egress switches, tandem switch, local exchange carrier (LEC), and inter-exchange carrier (IXC) long-distance network.
Internet service providers: ISP, cable ISP (with headend), wireless ISP, remote ISP, point of presence, DNS, and routers.
Internet backbones: backbone providers 1 and 2, routers, an exchange point, and a content provider's server.]
7
Network Utilities
• Run from Terminal in Unix/Mac
– Ping: measures round-trip time on an IP network from the
originating host to the destination computer
– Traceroute: displays the route (path) and
measures transit delays of packets across an IP
network
• Sends a sequence of Internet Control Message
Protocol (ICMP) echo request packets addressed to a
destination host.
$ ping www.ischool.berkeley.edu
PING www.ischool.berkeley.edu (128.32.78.21): 56 data bytes
64 bytes from 128.32.78.21: icmp_seq=0 ttl=61 time=0.846 ms
64 bytes from 128.32.78.21: icmp_seq=1 ttl=61 time=0.915 ms
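The round-trip times in ping output like this can be pulled out by a program. A minimal Python sketch, assuming output in the format shown on this slide; the function name parse_ping_times is made up for illustration.

```python
import re

# Sample lines copied from the ping output on this slide.
SAMPLE = """\
64 bytes from 128.32.78.21: icmp_seq=0 ttl=61 time=0.846 ms
64 bytes from 128.32.78.21: icmp_seq=1 ttl=61 time=0.915 ms
"""

def parse_ping_times(output):
    """Return the round-trip times (in ms) found in ping output."""
    return [float(t) for t in re.findall(r"time=([\d.]+)\s*ms", output)]

times = parse_ping_times(SAMPLE)
average_ms = sum(times) / len(times)   # mean round-trip time
print(times, average_ms)
```

Other ping implementations may format their lines differently, so the pattern would need adjusting.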
8
TraceRoute
$ traceroute www.ischool.berkeley.edu
traceroute to www.ischool.berkeley.edu (128.32.78.21), 64 hops max, 52 byte packets
1 g2-11.inr-270-doecev.berkeley.edu (128.32.226.1) 0.681 ms 0.362 ms 0.495 ms
2 g3-3.inr-202-reccev.berkeley.edu (128.32.255.34) 0.437 ms 0.540 ms 0.476 ms
3 t5-5.inr-211-srb.berkeley.edu (128.32.255.127) 0.626 ms 0.648 ms 1.163 ms
4 www (128.32.78.21) 0.930 ms 1.220 ms 1.085 ms
$ traceroute www.google.com
traceroute: Warning: www.google.com has multiple addresses; using 74.125.224.83
traceroute to www.l.google.com (74.125.224.83), 64 hops max, 52 byte packets
1 g2-11.inr-270-doecev.berkeley.edu (128.32.226.1) 0.673 ms 0.431 ms 0.427 ms
2 g3-3.inr-201-sut.berkeley.edu (128.32.255.32) 0.482 ms 0.505 ms 0.510 ms
3 xe-0-1-0.inr-001-sut.berkeley.edu (128.32.0.64) 0.597 ms 0.450 ms 0.355 ms
4 dc-svl-agg1--ucb-10ge.cenic.net (137.164.50.18) 10.662 ms 7.790 ms 6.443 ms
5 dc-svl-core1--svl-agg1-10ge.cenic.net (137.164.47.121) 3.623 ms 3.477 ms 3.133 ms
6 dc-svl-px1--svl-core1-10ge-2.cenic.net (137.164.46.13) 4.791 ms 3.045 ms 2.955 ms
7 137.164.131.61 (137.164.131.61) 3.582 ms 3.415 ms 3.637 ms
8 137.164.130.94 (137.164.130.94) 8.095 ms 58.649 ms 7.700 ms
9 216.239.49.250 (216.239.49.250) 4.307 ms 4.829 ms 4.534 ms
10 64.233.174.19 (64.233.174.19) 4.943 ms 4.812 ms 5.091 ms
11 nuq04s07-in-f19.1e100.net (74.125.224.83) 4.528 ms 4.510 ms 4.802 ms
9
Network Types
              Range         Bandwidth (Mbps)   Latency (ms)
LAN           1-2 km        10-1000            1-10
WAN           worldwide     0.010-600          100-500
MAN           2-50 km       1-150              10
Wireless LAN  0.15-1.5 km   2-11               5-20
Wireless WAN  worldwide     0.010-2            100-500
Internet      worldwide     0.010-2            100-500
(circa 2000)
Source: Coulouris, Dollimore and Kindberg
• An internet: a set of interconnected networks
• The Internet: the global internetwork based upon the Internet
Protocol (IP)
10
Network Building Blocks
• Transmission media
– Copper (coax, twisted pair), optical fiber, free
space (wireless)
• Signals
– Electrical currents, light, RF (radio-frequency),
microwave
• Hardware devices
– End hosts, network interfaces
– Routers, switches, hubs, bridges, repeaters
• Software components
– Communication protocol stack
11
Network Architecture
12
Network Architecture
• Networking can be quite complex and requires a high
degree of cooperation between the involved parties.
• Cooperation is achieved by forcing parties to adhere to
a set of rules and conventions (protocol).
• The complexity of the communication task is reduced by
using multiple protocol layers:
• Each layer is implemented independently.
• Each layer is responsible for a specific subtask.
• Layers are grouped in a hierarchy.
• A structured set of protocols is called a network
architecture, protocol architecture, or protocol suite.
13
TCP/IP Model
[Figure: Host A and Host B each have application, transport, network, and link layers; Router 1 and Router 2 have only network and link layers. The application and transport layers are marked end-to-end, and the lower layers are marked point-to-point.]
14
TCP/IP Model
(ping)
[Figure: the same layer diagram annotated for ping, with Host A as the client and Host B as the server, connected through Router 1 and Router 2.]
15
Message Flow
[Figure: a message travels down the layers on Host A (application, transport, network, link), across the network and link layers of Router 1 and Router 2, and up the layers on Host B.]
16
Encapsulation
[Figure: data passed down through the application, transport, network, and link layers on Host A, through the network and link layers of Router 1 and Router 2, and back up the layers on Host B.]
17
Encapsulation
Example: Sending an HTTP message using TCP/IP over Ethernet
HTTP message
TCP segment: TCP header (with port) + HTTP message
IP datagram/packet: IP header + TCP segment
Ethernet frame: Ethernet header + IP datagram/packet
Adapted from Coulouris, Dollimore and Kindberg
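The nesting in this example can be sketched in a few lines of Python. This is only an illustration: the bracketed headers are placeholders rather than real TCP, IP, or Ethernet header formats, and the function names are made up.

```python
# Toy encapsulation: each layer prepends its own header to whatever the
# layer above handed it. Header contents are placeholders, not real formats.

def encapsulate(payload: bytes, header: bytes) -> bytes:
    return header + payload

def decapsulate(unit: bytes, header: bytes) -> bytes:
    if not unit.startswith(header):
        raise ValueError("unexpected header")
    return unit[len(header):]

TCP_HDR, IP_HDR, ETH_HDR = b"[TCP]", b"[IP]", b"[ETH]"

http_message = b"GET /i206/s12/index.html HTTP/1.1\r\n\r\n"
tcp_segment = encapsulate(http_message, TCP_HDR)
ip_packet = encapsulate(tcp_segment, IP_HDR)
ethernet_frame = encapsulate(ip_packet, ETH_HDR)

# The receiving host strips the headers in the reverse order.
received = decapsulate(decapsulate(decapsulate(ethernet_frame, ETH_HDR), IP_HDR), TCP_HDR)
print(ethernet_frame)
```

Each layer only looks at its own header, which is what lets the layers be implemented independently.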
18
ISO layer model
ISO/OSI Reference Model
• Application (layer 7): specific to application need
• Presentation (layer 6): conversion of data representation
• Session (layer 5): access mgt, synchronization
• Transport (layer 4): end-to-end delivery, congestion and flow control
• Network (layer 3): addressing, routing
• Data Link (layer 2): framing, error detection
• Physical (layer 1): bits (0/1), voltages, frequencies, wires, pins, …
19
Layered Protocol Architecture
[Figure: the ISO/OSI Reference Model next to the TCP/IP Model. The OSI application, presentation, and session layers (layers 7, 6, 5) correspond to the TCP/IP application layer, labeled Software. The Socket API sits between the application layer and the transport layer. The transport and network layers (layers 4 and 3) are labeled Operating System. The OSI data link and physical layers (layers 2 and 1) correspond to the TCP/IP link and physical layers, labeled Hardware.]
20
The “IP Hourglass”
Application Layer: HTTP, FTP, SSH, SMTP, your Python program, ...
Transport Layer: TCP, UDP
Network Layer: IP (a single protocol)
Data Link Layer: Ethernet, WiFi, SONET
Physical Layer: coax, twisted pair, fiber, wireless, pigeons, ...
21
Ensuring Reliability
• Layering:
– Hourglass: many different applications and
underlying network technologies, but
Internet Protocol establishes universal
addressing scheme
– Envelope/Encapsulation: layer-specific
functionalities; isolation between layers
• Reliable communication over unreliable
network
– IP provides “best-effort” packet delivery
service
– TCP supports retransmission of lost packets
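The idea of reliable delivery over a best-effort service can be sketched with a simulated lossy channel. This is a simplified stop-and-wait scheme written for illustration, not how TCP is implemented (real TCP uses sequence numbers, acknowledgements, timers, and sliding windows); the function names, loss rate, and seed are made up.

```python
import random

def lossy_send(packet, loss_rate, rng):
    """Best-effort delivery: return the packet, or None if it was lost."""
    return None if rng.random() < loss_rate else packet

def reliable_send(packets, loss_rate=0.3, seed=206, max_tries=50):
    """Resend each packet until it gets through, then move to the next."""
    rng = random.Random(seed)
    delivered, attempts = [], 0
    for packet in packets:
        for _ in range(max_tries):
            attempts += 1
            if lossy_send(packet, loss_rate, rng) is not None:
                delivered.append(packet)
                break
        else:
            raise RuntimeError("gave up on packet %r" % (packet,))
    return delivered, attempts

message = ["How", "does", "the", "WWW", "work?"]
delivered, attempts = reliable_send(message)
print(delivered, attempts)
```

The receiver ends up with every packet in order, at the cost of some extra transmissions.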
22
Internet vs. WWW
• Internet and Web are not synonymous
• Internet is a global communication network
connecting millions of computers.
• World Wide Web (WWW) is one component of
the Internet, along with e-mail, chat, etc.
• Now we’ll talk about both.
23
How Does the WWW Work?
• Let’s say Oski received email with the
address for the i206 web page, or
saw it on a flyer.
• He goes to a networked computer,
and launches a web browser.
• He then types the address, known as
a URL, into the address bar of the
browser.
• What happens next?
(URL stands for Uniform Resource Locator)
24
How Does the WWW Work?
• Say Marti has written some web
pages for her class on her PC.
• She copied the pages to a directory on a
computer on her local network at the
iSchool. The computer’s name is herald.
• This computer is connected to the
Internet and runs a program called
Apache. This allows herald to act
as a web server.
[Figure: web server]
25
How Does the WWW Work?
• How does the computer at Oski’s desk
figure out where the i206 web pages are?
• In order for him to use the WWW, Oski’s
computer must be connected, via his ISP, to
another machine acting as a server.
• This machine is in turn connected to other
computers, some of which are routers.
• Routers figure out how to move
information from one part of the network
to another.
• There are many different possible routes.
[Figure: the iSchool network]
26
How Does the WWW Work?
• How do Oski’s server and the routers know
how to find the right server?
• First, the host name in the URL has to be translated into a
number known as an IP address.
• Oski’s server connects to a Domain Name
System (DNS) server that knows how to do the
translation.
[Figure: DNS server]
27
Domain Name Syntax
• Domain names are read right to left, from
general to more specific locations
• For example, www.xyz.com can be interpreted
as follows:
• com — commercial site top-level domain
• xyz — registered company domain name
• www — host name (it is a convention to name web
server hosts “www” which stands for “world wide
web”)
Slide adapted from CIW foundations
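Reading a domain name right to left is easy to mimic in code. A small Python sketch; the function name read_domain is made up for illustration.

```python
def read_domain(name):
    """Split a domain name into its labels, most general first."""
    return list(reversed(name.split(".")))

labels = read_domain("www.xyz.com")
top_level, registered, host = labels
print(labels)
```

The same function works on longer names such as courses.ischool.berkeley.edu, where the first label returned is edu.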
28
Typical Domain Name
www.xyz.com
• www: server (host) name
• xyz: registered company domain name
• com: domain category (top-level domain)
Domain names are part of URLs, used in web pages.
Slide adapted from CIW foundations
29
Top-Level Domains
• com, biz, cc — commercial or company sites
• edu — educational institutions, typically universities
• org — organizations; originally meant for clubs,
associations and nonprofit groups
• mil — U.S. military
• gov — U.S. civilian government
• net — network sites, including ISPs
• int — international organizations (rarely used)
Many other top level domains are available
Slide adapted from CIW foundations
30
Converting Domain Names
• Domain names are for humans to read.
• The Internet actually uses numbers called IP
addresses to describe network addresses.
• The Domain Name System (DNS) resolves easily
recognizable names into IP addresses
• For example:
– 12.42.192.73 = www.xyz.com
• A domain name and its IP address refer to the
same Web server.
Slide adapted from CIW foundations
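The translation can be pictured as a table lookup. The sketch below uses a toy table holding only the example pair from this slide; it stands in for DNS and does not contact any server. In real programs, Python's standard socket.gethostbyname(name) asks the system resolver to do the lookup.

```python
# A toy name table standing in for DNS. The single entry is the example
# from the slide, not a real record.
NAME_TO_IP = {"www.xyz.com": "12.42.192.73"}

def resolve(name, table=NAME_TO_IP):
    """Forward lookup: domain name -> IP address (None if unknown)."""
    return table.get(name.lower().rstrip("."))

def reverse_lookup(ip, table=NAME_TO_IP):
    """Reverse lookup: IP address -> domain name (None if unknown)."""
    for name, address in table.items():
        if address == ip:
            return name
    return None

print(resolve("www.xyz.com"), reverse_lookup("12.42.192.73"))
```

Names are lowercased before the lookup because domain names are not case-sensitive.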
31
Internet Addresses
• The Internet is a network on which each computer must
have a unique address.
• The Internet uses IP addresses; for example, herald’s
IP address is 128.32.226.90
• Internet Protocol version 4 (IPv4) – supports 32-bit
dotted quad IP address format
– Four sets of numbers, each set ranging from 0 to 255
– UC Berkeley’s LAN addresses range from
128.32.0.0 to 128.32.255.255
– Other addresses in the iSchool LAN include
128.32.226.49
• Using this setup, there are approximately 4 billion
possible unique IP addresses
• Router software knows how to use the IP addresses to
find the target computer.
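The numbers on this slide can be checked with Python's standard ipaddress module. A short sketch; writing the campus range as 128.32.0.0/16 is an added notation (CIDR) that covers exactly 128.32.0.0 to 128.32.255.255.

```python
import ipaddress

herald = ipaddress.IPv4Address("128.32.226.90")
campus = ipaddress.IPv4Network("128.32.0.0/16")   # 128.32.0.0 to 128.32.255.255

as_integer = int(herald)        # the same address as a single 32-bit number
total_addresses = 2 ** 32       # every possible 32-bit value
print(as_integer, herald in campus, campus.num_addresses, total_addresses)
```

The total, 2 to the power 32, is 4,294,967,296, which is the "approximately 4 billion" quoted above.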
32
How the Internet Works
• Network Protocols:
– Protocol – an agreed-upon format for
transmitting data between two
devices
• Like a secret handshake
– The Internet protocol suite is TCP/IP
– The WWW protocol is HTTP
• Network Packets:
– Typically a message is broken up into smaller pieces and
re-assembled at the receiving end.
– These pieces of information, surrounded by address
information, are called packets
Slide adapted from CIW foundations
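Breaking a message into packets and putting it back together can be sketched as follows. The dictionary layout and function names are made up for illustration and are much simpler than a real packet; the two addresses are the iSchool examples from the previous slide.

```python
import random

def to_packets(message, size, src, dst):
    """Cut a message into numbered pieces wrapped with address information."""
    return [
        {"src": src, "dst": dst, "seq": i // size, "data": message[i:i + size]}
        for i in range(0, len(message), size)
    ]

def reassemble(packets):
    """Put the pieces back in order, whatever order they arrived in."""
    return "".join(p["data"] for p in sorted(packets, key=lambda p: p["seq"]))

message = "How does the WWW work?"
packets = to_packets(message, 8, "128.32.226.49", "128.32.226.90")
random.shuffle(packets)          # packets may arrive out of order
print(reassemble(packets))
```

The sequence number is what lets the receiver restore the original order.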
33
IP Packet Format (v4)
Field length in bits; each row is 32 bits wide (bit 0 to bit 31)
Header:
  Version (4) | Hdr Len (4) | TOS (8) | Total Length in bytes (16)
  Identification (16) | Flags (3) | Fragment Offset (13)
  Time to Live (8) | Protocol (8) | Header Checksum (16)
  Source IP Address (32)
  Destination IP Address (32)
  Options (if any)
Data:
  Data (variable length)
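The header layout can be packed and unpacked with Python's standard struct module. A sketch with made-up field values and no options; the checksum is left at zero rather than computed, so this is not a valid packet to put on a wire.

```python
import socket
import struct

# 20-byte IPv4 header without options, in network (big-endian) byte order:
# version + header length, TOS, total length, identification,
# flags + fragment offset, TTL, protocol, checksum, source, destination.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")

header = IPV4_HEADER.pack(
    (4 << 4) | 5,      # version 4, header length 5 words (5 * 4 = 20 bytes)
    0,                 # TOS
    20 + 100,          # total length: header plus 100 data bytes (made up)
    1,                 # identification (made up)
    0,                 # flags and fragment offset: not fragmented
    61,                # time to live
    6,                 # protocol number 6 = TCP
    0,                 # checksum left as 0 in this sketch (not computed)
    socket.inet_aton("128.32.226.49"),
    socket.inet_aton("128.32.226.90"),
)

fields = IPV4_HEADER.unpack(header)
version, header_words = fields[0] >> 4, fields[0] & 0x0F
flags, fragment_offset = fields[4] >> 13, fields[4] & 0x1FFF
source, destination = socket.inet_ntoa(fields[8]), socket.inet_ntoa(fields[9])
print(len(header), version, header_words, source, destination)
```

Fields narrower than a byte, such as the 4-bit version, have to be separated with shifts and masks after unpacking.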
34
How Does the WWW Work?
• What happens now that the request for
information from Oski’s browser has been
received by the web server herald at
www.ischool.berkeley.edu?
• The web server processes the URL to figure
out which page on the server is
requested.
• It then sends all the information from
that page back to the requesting address.
[Figure: the iSchool network]
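How a server might turn the path in a URL into a file in its directories can be sketched as below. This is an illustration, not Apache's actual logic: the document root /var/www/html is a hypothetical directory, and serving index.html for a path that ends in a slash is an assumed convention.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/html"   # hypothetical directory holding the pages

def locate_page(request_path, root=DOCUMENT_ROOT):
    """Map the path part of a URL to a file path under the server's root."""
    # Normalize away "." and ".." segments so a request cannot climb
    # above the document root.
    cleaned = posixpath.normpath("/" + request_path.lstrip("/"))
    # Assumed convention: a path ending in "/" means that directory's index page.
    if request_path.endswith("/"):
        cleaned = posixpath.join(cleaned, "index.html")
    return root + cleaned

print(locate_page("/i206/s12/index.html"))
```

A real server would then read that file from disk and send its contents back to the requesting address.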
35
Reading a URL
http://courses.ischool.berkeley.edu/i206/s12/index.html
http://       = HyperText Transfer Protocol
courses       = service name (often is www)
.ischool      = host name
.berkeley     = primary domain name
.edu/         = top level domain
i206/         = directory name
s12/          = directory name
index.html    = file name of web page
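The same breakdown can be done with Python's standard urllib.parse module. A short sketch; the variable names are chosen for illustration.

```python
from urllib.parse import urlsplit

url = "http://courses.ischool.berkeley.edu/i206/s12/index.html"
parts = urlsplit(url)

scheme = parts.scheme                      # the protocol
labels = parts.hostname.split(".")         # service, host, domain, top-level domain
*directories, filename = parts.path.strip("/").split("/")
print(scheme, labels, directories, filename)
```

Everything before the first single slash identifies the server; everything after it is interpreted by that server.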
36
Web Pages and HTML
• So what do we see at
http://courses.ischool.berkeley.edu/i206/s12/index.html?
37
Web Pages and HTML
• What does HTML look like?
38
HTML
• HyperText Markup Language
– Uses <tags> which mark up the text and tell
the browser how to display the content.
– A closing tag, written with a forward slash (e.g. </b>),
marks the end of the command but is sometimes optional
• Examples
– This is <b> boldface text </b>.
– <p> indicates a paragraph break
– <h1> This is a large heading </h1>
– <h3> This is a smaller heading </h3>
39
HTML Hyperlinks
• Hyperlink is the most important:
<a href=http://www.berkeley.edu/map/maps/BC23.html> 100
Genetics & Plant Biology Bldg </a>
– The text between the tags (green on the slide), “100 Genetics & Plant
Biology Bldg”, is called anchor text
• It’s the text you see on the link
– The address after href= (pink on the slide) is the URL that the link will take you to if you
click on it. The http:// at the front indicates the HTTP (Web)
protocol.
– The <a href= …> … </a> is the command that indicates the
enclosed information is a hyperlink, and that the text
between the tags is the anchor text.
• A hyperlink can be clicked on by a person OR followed by
a computer program.
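A program that follows hyperlinks first has to find them. The sketch below uses Python's standard html.parser module to pull the URL and anchor text out of the example link; the class name LinkExtractor is made up, and the href value is quoted here, which is the more common way to write it.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect (url, anchor text) pairs from <a href=...> ... </a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None      # URL of the link currently being read, if any
        self._text = []        # pieces of anchor text seen so far

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

page = ('<a href="http://www.berkeley.edu/map/maps/BC23.html">'
        ' 100 Genetics & Plant Biology Bldg </a>')
extractor = LinkExtractor()
extractor.feed(page)
extractor.close()
print(extractor.links)
```

A web crawler for a search engine repeats this step on every page it fetches and then requests the URLs it finds.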
40
HTTP
• HTTP is the protocol used by the WWW
• When a user clicks on a hyperlink in their web
browser, this sends an HTTP command to the
Web server named in the URL
• This command usually is to “GET” the contents
of the web page and return them to the user’s
browser.
• It is a very simple protocol
– It relies on the TCP/IP functionality
41
HTTP Request: Example
This information is received by the web server at
www.ischool.berkeley.edu:
Request line:   GET /i141/s07/index.html HTTP/1.1<CRLF>
Request header: Host: courses.ischool.berkeley.edu<CRLF>
Blank line:     <CRLF>
Because HTTP is built on TCP/IP, the web server
knows which IP address to send the contents of
the web page back to.
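The request on this slide is just a few lines of text ending in CRLF pairs, so it is easy to build by hand. A minimal Python sketch; the function name build_get_request is made up, and real browsers send many more headers.

```python
CRLF = "\r\n"

def build_get_request(host, path):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    request_line = "GET " + path + " HTTP/1.1" + CRLF
    host_header = "Host: " + host + CRLF
    # The final CRLF is the blank line that ends the request headers.
    return (request_line + host_header + CRLF).encode("ascii")

request = build_get_request("courses.ischool.berkeley.edu", "/i141/s07/index.html")
print(request)
```

To actually send it, a program would open a TCP connection to the server (port 80 for HTTP), write these bytes, and read the response from the same connection.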
42
How Does the WWW Work?
• When Oski typed in the URL for the i206
home page, this was turned into an HTTP
request and routed to the web server in
Berkeley.
• The web server then decomposed the URL
and figured out which web page in its
directories was being asked for.
• The server then sends the HTML contents
of the page back to Oski’s IP address.
• Oski’s browser receives these HTML contents
and renders the page in graphical form.
• If he clicks on a hyperlink in that page, a
similar sequence of events occurs.
[Figure: the iSchool network]
43