Network and Communications

Networking and Communications
David W. Hankins
10/28/2008
About your Speaker

Left college early to work for ISP's.

Yeah, don't do that.

Now I'm continuing my education.

Operated IP networks from small dial access to large backbones.

Wrote software for Skycache/Cidera.

Now a Software Engineer at Internet Systems Consortium, Inc. (ISC)

Working on the ISC DHCP software project.



Mostly a maintainer, but also wrote DHCPv6 (DHCP for IPv6) software.
Author, RFC 5071 (don't bother reading it).
Off-road and video game enthusiast.
About ISC

Internet Systems Consortium, Inc.




Headquartered in Redwood City, CA
501(c)(3) Nonprofit Corporation
http://www.isc.org/
Mission:



To develop and maintain production quality Open Source software, such
as BIND and DHCP
Enhance the stability of the global DNS through reliable F-root
nameserver operations and ongoing operation of a DNS crisis
coordination center, ISC's OARC for DNS
Further protocol development efforts, particularly in the areas of DNS
evolution and facilitating the transition to IPv6.
Overview

The OSI model is actually dead, but you still need to know it.

We'll talk about the historic to current progression of Ethernet.

'Network' means 'Internet' these days, so I will focus there.
The OSI Model

OSI defined 7 “Layers” in OSI standard networking, so that different
technologies could be used at each layer, and the lower and higher
layers needed no knowledge of each other.


This is kind of like “Modular Programming” for networks.
The 7 OSI Layers (plus Evi Nemeth's extensions):

1: Physical – ex: “Twisted Pair”

2: Link – ex: “Ethernet”

3: Network – ex: “IPv4”

4: Transport – ex: “TCP”

5: Session

6: Presentation

7: Application

8: Financial

9: Political
OSI is Dead

Long live OSI!


There are still some common uses of these terms.

In Internet Protocols, the phrase was coined:

“IP over Everything. Everything over IP.”
The idea in the Internet's childhood was to provide a simple framing
format that could be used, and communicate with hosts everywhere,
regardless of the lower layer link protocols (nor how proprietary they
were). IP over everything.
It was recognized that once you did this, the great bastions of
proprietary networks (the telco companies) would soon find
themselves subverted; IP could bridge them all.
Ex: Voice over UDP over IP over PPP over SSH over TCP over IP
over DNS over UDP over IP over Ethernet transmitted on twisted pair.
Which one is layer 3?
Ethernet Framing



Ethernet Network Interface Cards (NICs) were assigned
(theoretically) unique addresses by their manufacturers.

The first 24 bits are Organizationally Unique Identifiers, so for one manufacturer
that field is fixed, and they assign the remaining bits.

That didn't work out so well.
The NIC would filter out any packet it received that was not directed
to its own address, so the OS would not have to discard it.
Exception: broadcast address.
The 64-octet minimum length was to enforce Ethernet timing
parameters; at 10Mbps, 512 bits take 51.2us to send, in which time a
signal covers roughly 10km of cable.
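
The arithmetic behind that timing figure can be sketched in a few lines. The propagation speed of 2e8 m/s (about two thirds of the speed of light) is an assumed round number for copper cable; it is not stated on the slide.

```python
# Slot-time arithmetic for 10 Mbps Ethernet.
MIN_FRAME_BITS = 64 * 8        # 64-octet minimum frame = 512 bits
LINE_RATE_BPS = 10_000_000     # 10 Mbps

# ASSUMPTION: signals travel through the cable at roughly 2e8 m/s.
PROPAGATION_M_PER_S = 2e8

slot_time_s = MIN_FRAME_BITS / LINE_RATE_BPS
distance_m = slot_time_s * PROPAGATION_M_PER_S

print(f"slot time: {slot_time_s * 1e6:.1f} us")
print(f"distance covered in one slot: {distance_m / 1000:.2f} km")
```

A collision has to make the round trip back to the sender before the sender finishes transmitting, so the usable cable span is well under this figure.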
Ethernet: Thicknet


Ethernet L1's follow a kind of naming convention. In the term
“XXXBase-PHY”, X is the speed of the link (in megabits per second),
and PHY is a conventional suffix to describe the link media. For
example, both 10Base-T and 100Base-T exist.
When I started working at ISP's, something called “Thicknet”
(“10Base-5”) was just getting phased out in favor of “Thinnet”
(10Base-2).

Thicknet used a coaxial cable, a center conductor wire made of copper sheathed
in an insulator (wax), with a braided second conductor completely surrounding it,
so that the shield and the center pin shared the same axis. A plastic sheath
enclosed the whole thing.

The center wire carried signal, the sheath made a Faraday cage.

At points along the plastic sheath, lines marked the points in the wire where
Ethernet signals' standing waves would stand. Hosts could be attached to these
points, with two points sticking through the sheath and insulator like vampires.
Ethernet: Thinnet




Note that although the -5 suffix in Thicknet was used to describe its
maximum distance between cable endpoints (500 meters), and the -2
suffix used in Thinnet similarly marked the intent for it to survive 200
meters, 10Base-2 was later clarified to survive no more than 185
meters.
Thinnet used a coaxial cable (RG-58, smaller than Thicknet's RG-8),
but rather than vampiric connections, the coax cables used BNC
connectors on the end, and hosts in the network were connected on
“T” connections.
Each end of the coax would be terminated with a 50 Ohm resistor.
This reduced the amplitude of signals reflected back from the cable's
ends.
This was certainly better than Thicknet, but it still had problems.
Ethernet: Do the Twist!

The 10Base-T standard described the use of Category 3 cable.
Category 3 cable is a performance standard (rated to 16 MHz), generally
met by using twisted pairs of 24 AWG unshielded cables.

To meet the 16 MHz standard, you might use more twists, or different gauge wire.
Similarly to 10Base-5 and 10Base-2, where the center conductor and
sheath served as the transmit/receive and Ground, twisted pair cable
usually carries 4 pairs of conductors, where each pair is twisted
around itself.
The twist in the pair averages out noise imparted on the line from
other signal sources; receivers look at the voltage difference between
the pair, not at the absolute voltage of either. By twisting the pair, the
signal and ground are affected near identically, so the difference is
unaffected by noise.
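
A toy illustration of why the difference survives the noise. The voltage levels and the noise range are made up for the example, and it assumes both wires of the pair pick up exactly the same noise, which is the idealized case.

```python
import random

def receive(wire_a, wire_b):
    """A receiver that only looks at the difference between the two wires."""
    return [a - b for a, b in zip(wire_a, wire_b)]

bits = [1, 0, 1, 1, 0]
# Illustrative levels only: +1 V for a 1, -1 V for a 0, reference wire at 0 V.
wire_a = [1.0 if bit else -1.0 for bit in bits]
wire_b = [0.0 for _ in bits]

# ASSUMPTION: the twist makes both wires pick up identical noise.
noise = [random.uniform(-5.0, 5.0) for _ in bits]
noisy_a = [v + n for v, n in zip(wire_a, noise)]
noisy_b = [v + n for v, n in zip(wire_b, noise)]

recovered = [1 if d > 0 else 0 for d in receive(noisy_a, noisy_b)]
print(recovered == bits)
```

The noise here is five times larger than the signal, yet the receiver still recovers every bit because it never looks at either wire's absolute voltage.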
Ethernet: RJ45

The new 10Base-T cables used RJ45 connectors on the end, plastic
connectors with a retaining lever and 8 conductors (for the 4 pair).


With 4 pair to choose from, Ethernet was now free to use two whole
pair for transmit and receive.

The orange and green pair.

Pin order: W-O, O, W-G, B, W-B, G, W-Br, Br
(W = white, O = orange, G = green, B = blue, Br = brown)
Ethernet Hubs would present female RJ45's, and provided a bus
between all the transmit and receive pins. This meant that although
you had separate physical channels for transmit and receive, you still
had to handle collisions, where two nodes transmit at the same time.
This meant that, realistically, 10Base Ethernet could not reach 10Mbps.
Ethernet: Duplexing




Throughout the 10Base-T and well into the 100Base-TX eras,
networks slowly migrated away from Ethernet 'Hubs' towards
Ethernet 'switches' (also referred to as 'bridges').
A switch's primary difference is that it has its own Ethernet receive
and transmit chips, which receive packets from one ingress interface,
examine the Ethernet header destination address, and select only
one egress port to transmit the Ethernet packet on (presuming it is
not already busy).
Switches slowly build a “Forwarding Database” by observing the packets
they receive, and recording the source address and the port each
packet was received on.
This enabled Ethernet hosts to transmit in “Full Duplex”, at full line
speed, receiving and transmitting in parallel. No collisions.
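
The learn-then-forward behaviour can be sketched as below. The class and method names are hypothetical, and flooding to every other port when the destination is unknown is the usual behaviour rather than something stated on the slide.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningSwitch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.fdb = {}  # forwarding database: source MAC -> port it was seen on

    def handle_frame(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn: remember which port this source address lives behind.
        self.fdb[src_mac] = in_port
        # Forward: a known unicast destination goes out exactly one port.
        if dst_mac != BROADCAST and dst_mac in self.fdb:
            out = self.fdb[dst_mac]
            return [] if out == in_port else [out]
        # ASSUMPTION: unknown or broadcast destinations are flooded to
        # every port except the one the frame arrived on.
        return [p for p in self.ports if p != in_port]

switch = LearningSwitch(ports=[1, 2, 3])
print(switch.handle_frame(1, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02"))
print(switch.handle_frame(2, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01"))
```

The first frame is flooded because nothing has been learned yet; the reply goes out a single port because the first frame taught the switch where host 01 lives.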
A day in the Life of your Laptop




On your OS's desktop, there is some widget that lets you pick
Wireless-Ethernet networks, distinguished by ESSID. This is just
Ethernet over 802.11(mumble), some form of spread-spectrum
microwave, with some quirks.
One quirk is that your NIC will associate with an access point, which
in essence establishes a connection for your NIC with the AP's
Ethernet broadcast domain. From here down, everything looks like
Ethernet packets.
These Ethernet packets will probably be carrying a number of things,
but we're only interested in IP (Internet Protocol) and ARP (Address
Resolution Protocol) packets.
ARP provides a way for hosts to find the Ethernet MAC address for a
given IP address, if it is on the same broadcast domain.
Getting Configured




You can't talk to other folks on the Internet unless you have an IP
address (so they can send replies).
So the first thing your laptop will do once it is associated is to start its
Dynamic Host Configuration Protocol (DHCP) client. The DHCP
client uses a finite state machine to retain a consistent and correct
configuration given changes in administrative policy (IP renumbering,
changes in service addresses).
The client's initial DHCP packets are transmitted to the broadcast
address. Any servers will reply to the client's unicast MAC address
(with some complicated exceptions), offering it an address
configuration as well as service locators.
The client selects a configuration and requests it.
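
A heavily reduced sketch of the client's finite state machine. The state names follow RFC 2131; the event names are hypothetical labels invented for this example, and several states and transitions from the RFC are left out.

```python
# (state, event) -> next state. A reduced subset of the RFC 2131 client states.
# Event names are hypothetical labels, not protocol fields.
TRANSITIONS = {
    ("INIT", "send_discover"): "SELECTING",
    ("SELECTING", "pick_offer_send_request"): "REQUESTING",
    ("REQUESTING", "recv_ack"): "BOUND",
    ("REQUESTING", "recv_nak"): "INIT",
    ("BOUND", "t1_expired"): "RENEWING",
    ("RENEWING", "recv_ack"): "BOUND",
    ("RENEWING", "recv_nak"): "INIT",
    ("RENEWING", "t2_expired"): "REBINDING",
    ("REBINDING", "recv_ack"): "BOUND",
    ("REBINDING", "lease_expired"): "INIT",
}

def step(state, event):
    """Advance one transition; events that do not apply leave the state alone."""
    return TRANSITIONS.get((state, event), state)

def run(events, state="INIT"):
    for event in events:
        state = step(state, event)
    return state

print(run(["send_discover", "pick_offer_send_request", "recv_ack"]))
```

Because every state has a path back to INIT, a renumbered network eventually pushes the client through a fresh discovery and onto the new configuration.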
IP Addresses





IP version 4 addresses are 32-bit fields, usually represented by
octets: 10.1.2.3
An IP “subnet” is a specific region of the 32-bit space. 10.1.2.0/24,
for example, is all those addresses where the first 24 bits are
“10.1.2”, and the remaining 8 bits are of any value.
The IP packet header has lots of fields; let's just say it at least has a
source and destination IP address.
To forward a packet from your laptop to another on the same network,
your laptop notices “10.1.2.4” is inside “10.1.2.0/24”, uses ARP to
find its MAC address, and unicasts to it directly.
When you want to talk to someone on another network, it gets
complicated.
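
The "is it inside my subnet" test is short enough to show with Python's standard ipaddress module. The address 192.0.2.10 is just a documentation address picked for the example.

```python
import ipaddress

subnet = ipaddress.ip_network("10.1.2.0/24")

def is_on_link(destination):
    """True if the destination falls inside our own subnet."""
    return ipaddress.ip_address(destination) in subnet

print(is_on_link("10.1.2.4"))    # same subnet: ARP and send directly
print(is_on_link("192.0.2.10"))  # elsewhere: needs a router

# The same check by hand: compare only the first 24 of the 32 bits.
host_bits = 32 - subnet.prefixlen
same_prefix = (int(ipaddress.ip_address("10.1.2.4")) >> host_bits) == (
    int(subnet.network_address) >> host_bits
)
print(same_prefix)
```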
IP Routing Basics




One of the things you got from DHCP was a “routers” option. This
lists a number (usually one) of IP addresses inside your subnet to
which you should direct packets to get to the rest of the world. These
are generally referred to as “default routers.”
Any route is a pair of values: a prefix, and a destination to forward
that prefix to.
The default route is simply the 0.0.0.0/0 prefix directed to the listed
default routers' IP addresses. Your laptop also carries a 10.1.2.0/24
route in its table, pointing to your NIC.
To route a packet outward, you start with the most specific route(s) in
your table, and work down to the least specific. The route matches if
the destination IP address is within the subnet. If no route matches,
you emit an ICMP error.
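
A minimal sketch of that most-specific-first lookup over the laptop's two routes. The router address 10.1.2.1 is an assumption for the example; a real table would use a trie rather than sorting on every lookup.

```python
import ipaddress

# (prefix, where to forward it). 10.1.2.1 is a hypothetical default router.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "default router 10.1.2.1"),
    (ipaddress.ip_network("10.1.2.0/24"), "directly attached NIC"),
]

def lookup(destination, routes=ROUTES):
    dest = ipaddress.ip_address(destination)
    # Try the most specific (longest) prefixes first.
    by_length = sorted(routes, key=lambda route: route[0].prefixlen, reverse=True)
    for prefix, next_hop in by_length:
        if dest in prefix:
            return next_hop
    return None  # no route matched: the caller would emit an ICMP error

print(lookup("10.1.2.4"))
print(lookup("192.0.2.80"))
```

With a default route installed every destination matches something, so the no-route case only shows up on a table that lacks one.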
DNS




IP addresses are hard to read and type, so we use the Domain Name
System to map names to resources.
Now that your laptop is on the network, you start up your web
browser, and try for http://www.isc.org/.
“www.isc.org” is not an IP address, it is a domain name. To find
www.isc.org's IP address, your laptop performs a recursive DNS
query against a nameserver (whose IP address was provided by
DHCP). The recursive nameserver is an “assist service” extended by
your network administrator.
It recursively follows DNS delegations from the root nameservers (the
silent dot after org), the GTLD nameservers (org), and finally their
delegation to ISC's own nameservers (isc.org), which reply with the
“A record” of www.isc.org. All these nameservers are referred to as
'Authoritative' nameservers.
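
A toy model of the delegation walk, with no real DNS traffic involved. The zone data is hypothetical, and 192.0.2.80 is a documentation address standing in for the real answer.

```python
# Hypothetical zone data: each zone either delegates a child zone or answers.
ZONES = {
    ".": {"delegations": ["org."], "a_records": {}},
    "org.": {"delegations": ["isc.org."], "a_records": {}},
    "isc.org.": {"delegations": [], "a_records": {"www.isc.org.": "192.0.2.80"}},
}

def resolve(name):
    """Walk delegations from the root; return (address, zones visited)."""
    zone = "."
    path = [zone]
    while True:
        data = ZONES[zone]
        if name in data["a_records"]:
            return data["a_records"][name], path
        # Follow the delegation whose zone name is a suffix of the query name.
        for child in data["delegations"]:
            if name == child or name.endswith("." + child):
                zone = child
                path.append(zone)
                break
        else:
            return None, path  # nobody to ask next

print(resolve("www.isc.org."))
```

Note the trailing dot on every name: that is the "silent dot" for the root, written out.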
UDP and TCP

DHCP and DNS are protocols that run over UDP (over IP over...),
although DNS can also be carried by TCP.



They are really just UDP payload data; the UDP port essentially directs the
packet to a buffer to be sent to an application (or discarded).
HTTP, to reach http://www.isc.org/, wants to open a TCP connection
to the HTTP port 80 on www.isc.org. Now that your laptop has this IP
address in hand, it can do that.
TCP and UDP differ mainly in that if a UDP packet is lost, no one
cares (the application must retransmit). TCP's whole purpose,
however, is to make sure a stream of data written in on one end
reliably reaches the other end, as fast as reasonably possible. It
tracks RTT's, and schedules retransmissions. It negotiates and then
uses a 'window' of data to completely use the network between the
nodes. The application is unaware of all this.
IP Routing Again




You understand Routing Basics, but how does your default router
know how to direct packets destined to ISC?
It has its own routing table. It might be simple, like yours, using
another default route. At some point, it will reach a router that is
running “default-less”, that has loaded the full table of routes for the
whole Internet.
Networks advertise routes for their own address space to their
customers, peers and transit providers using the Border Gateway
Protocol. These peers then (selectively) extend the announcement to
their own customers, peers and transit providers. This goes around
the world.
A BGP route is again just a prefix, with a destination address, and an
AS-list to perform loop detection. When an IP packet matches a BGP
destination route, the destination address is picked up and looked up
again in the routing table.
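
Loop detection with the AS-list can be sketched as follows. The AS numbers and addresses come from ranges reserved for documentation, and the dictionary layout and function names are hypothetical.

```python
LOCAL_AS = 64500  # hypothetical AS number from the documentation range

def accept(route, local_as=LOCAL_AS):
    """Reject any announcement whose AS-list already contains our own AS."""
    return local_as not in route["as_path"]

def readvertise(route, local_as=LOCAL_AS):
    """Prepend our AS to the AS-list before passing the route to a neighbour."""
    return {**route, "as_path": [local_as] + route["as_path"]}

route = {
    "prefix": "192.0.2.0/24",
    "next_hop": "198.51.100.1",
    "as_path": [64501, 64502],
}

print(accept(route))               # never passed through us: fine
print(accept(readvertise(route)))  # came back around the world: a loop
```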
The IGP




BGP was first laid down to carry those border routes; the routes
external to the network, and to advertise least-specifics for the local
network's address space.
Recursive lookups on its table must eventually be found in the
internal network, as a directly connected route on the current router.
So internal or directly attached networks are commonly advertised to
all the routers in the network using IS-IS or OSPF – IGP Routing
protocols. These protocols have different designs and limitations –
they converge faster and support load balancing, but do not scale to
large numbers of routes.
Many networks have grown so large that they offload portions of the
IGP into BGP itself; this practice is called “iBGP”, although the
protocol is identical.
Route Caching




When a route is finally found, it is worthwhile not to repeat all that
recursive effort. So, the router will insert an entry in a cache.
There are many caching approaches, the most common is Flow
Caching, where the IP packet's source and destination IP address
and TCP/UDP ports are combined together to form a unique key in
the cache (usually a hash table).
This has a tendency to provide very stable RTTs, an advantage
over other load-balance related caching techniques (like round robin).
In earlier router architectures, the cache lived on the line card holding
the interfaces. Today, modern routers prepopulate the cache from
routing information.
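
A sketch of flow caching with a hash table. The link names are hypothetical, and the "slow path" is only a stand-in for the full recursive route lookup; it hashes the key to pick between two assumed equal-cost links.

```python
import hashlib

NEXT_HOPS = ["link-a", "link-b"]  # hypothetical equal-cost paths
flow_cache = {}                   # flow key -> chosen next hop

def flow_key(src_ip, dst_ip, src_port, dst_port):
    """Combine addresses and TCP/UDP ports into one cache key."""
    return (src_ip, dst_ip, src_port, dst_port)

def slow_path_lookup(key):
    # Stand-in for the real route lookup: pick a link from a hash of the key.
    digest = hashlib.sha256(repr(key).encode()).digest()
    return NEXT_HOPS[digest[0] % len(NEXT_HOPS)]

def forward(src_ip, dst_ip, src_port, dst_port):
    key = flow_key(src_ip, dst_ip, src_port, dst_port)
    if key not in flow_cache:
        flow_cache[key] = slow_path_lookup(key)  # only the first packet pays
    return flow_cache[key]

print(forward("10.1.2.3", "192.0.2.80", 49152, 80))
```

Every packet of one TCP connection takes the same link, which is where the stable RTT comes from; round robin would spray the same connection across both.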
Internet Growth


This graph only measures the number of PTR records registered in
the DNS. The actual number of IP addresses in use is probably
much higher!
There are only 2^32 addresses, but don't worry: IPv6!
Someday...
Keeping up with Growth





The main currency of Internet connectivity is bandwidth. Ports for
users are just capex. Bandwidth charges are forever.
One trouble is that network interfaces classically delivered by Telco
companies come with fixed monthly costs associated with their
maximum line rate capacity.
T1: 1.5Mbps, T3: 45Mbps, OC-3: 155Mbps...
This creates a “stepladder effect” in ISP operations. Your userbase
grows, your T1 is getting full. You have to double your expenses on a
second T1 to support a fraction of a percent more customers.
The ISP profit game was about matching your growth and expense
curves optimally, timing your buildouts in advance of your growth.
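
The stepladder is easy to see in numbers. The monthly price is a made-up figure; only the 1.5Mbps line rate comes from the slide.

```python
import math

T1_MBPS = 1.5           # line rate from the slide
T1_MONTHLY_COST = 1000  # hypothetical price, any currency

def monthly_cost(demand_mbps):
    """Cost of enough whole T1 lines to carry the demand (at least one)."""
    lines = max(1, math.ceil(demand_mbps / T1_MBPS))
    return lines * T1_MONTHLY_COST

for demand in (1.0, 1.5, 1.51, 3.0, 3.01):
    print(f"{demand:5.2f} Mbps -> {monthly_cost(demand)}")
```

Going from 1.5 to 1.51 Mbps of demand doubles the bill, which is the whole problem.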
All your Profits are Belong...





Maybe you've spotted the flaw in this little plan. No matter how much
an ISP grows, it just gives all its profits to the Telco! Customers pay
just enough for more bandwidth.
This continued for a long time, until recent years (~2000) with the
advent of fibre based long-haul and metropolitan networks.
This allowed ISP's to get their own “dark” fibre (or a lambda on a
shared fibre using WDM). No one can tell you how fast your
networking goes over light. They just carry your light around.
So the Telco's just bought the ISP's instead.
Still, today we bill more in terms of 95%ile of bandwidth used, and not
so much in terms of connect rate.
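
One common way to compute a 95th percentile bill is sketched below: sort the usage samples, throw away the top 5%, and bill on the highest one left. The sampling interval and the exact rounding rule vary by provider, so treat both as assumptions.

```python
def percentile_95(samples_mbps):
    """Highest sample left after discarding the top 5% (list must be non-empty)."""
    ordered = sorted(samples_mbps)
    # Integer form of ceil(0.95 * n) - 1, avoiding floating point rounding.
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]

# 100 samples: mostly quiet, with a short burst to 100 Mbps.
samples = [10] * 95 + [100] * 5
print(percentile_95(samples))
```

Short bursts are free under this scheme; only sustained use moves the bill.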
Some notes on Telco Lines




Your home phone is still (unfortunately) analog. The Telco does a
Digital-To-Analog conversion, pushing pulses towards your home
carrying the audio signal (driving your speaker). When you speak, an
Analog-To-Digital conversion places your mic signal into a digital
stream that is call-routed outwards.
This digital stream is 56Kbps of audio, carried in a 64Kbps timeslice
(8Kbps are used for control).
These timeslices are MUXed together to form telecommunications
backbone lines. A T1 is 24 of these “DS0”'s. A DS-3 is 28 T1's
MUXed together. Call routing protocols connect and assign
timeslices throughout the network.
ISPs first started by renting fixed allocations in these networks (a
whole or fractional T1 or T3 at a time).
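
The timeslice numbers multiply out as follows. The 8000 samples per second, the 8 bits per sample, and the T1 framing overhead are standard telephony figures added here for the arithmetic; they are not on the slide.

```python
SAMPLES_PER_SECOND = 8000  # standard telephony sampling rate
BITS_PER_SAMPLE = 8

ds0_bps = SAMPLES_PER_SECOND * BITS_PER_SAMPLE  # one timeslice
audio_bps = SAMPLES_PER_SECOND * 7              # 7 of 8 bits left for audio
control_bps = ds0_bps - audio_bps

t1_payload_bps = 24 * ds0_bps                   # 24 DS0s per T1
T1_FRAMING_BPS = 8000                           # one framing bit per frame
t1_line_bps = t1_payload_bps + T1_FRAMING_BPS

print(ds0_bps, audio_bps, control_bps)
print(t1_payload_bps, t1_line_bps)
```

That last figure, 1.544Mbps, is the "1.5Mbps" a T1 is usually quoted at.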
PSTN Implementation



The stream of digital audio bits that comprise the timeslices inside
each DS0, carried in whatever level of hierarchical switching, is
transmitted between nodes as block waves, in its digital form using
Alternate Mark Inversion (AMI).

A 1 is sent as a positive or negative voltage, alternating polarity with each 1.
Neutral is a zero. The transmitter thus produces an equal number of positive and
negative voltage pulses.

This makes for little if any “DC Bias” in the DS lines.
One end transmits, using its own internal clock. The far end synchs
up to the signal as it is received, and transmits in the other direction
using this signal to provide clocking.
In order to maintain clock synchronization, for DS1 a minimum of 1 bit
per every eight must be set high. In the attempt to send 8 binary
zeroes, the transmitting node would indicate a bipolar violation. Later
we used B8ZS and B3ZS.
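
A sketch of plain AMI encoding, plus a check for the long runs of zeroes that starve the far end of clocking. The function names are hypothetical, and this does not implement the B8ZS or B3ZS substitutions.

```python
def ami_encode(bits):
    """Encode bits as +1 / -1 / 0 line levels, alternating polarity on each 1."""
    levels = []
    polarity = 1
    for bit in bits:
        if bit:
            levels.append(polarity)
            polarity = -polarity
        else:
            levels.append(0)
    return levels

def has_long_zero_run(bits, limit=8):
    """True if `limit` or more consecutive zeroes appear in the stream."""
    run = 0
    for bit in bits:
        run = 0 if bit else run + 1
        if run >= limit:
            return True
    return False

line = ami_encode([1, 0, 1, 1, 0, 1])
print(line, sum(line))  # the levels sum to zero: no DC bias
print(has_long_zero_run([1] + [0] * 8))
```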
ISDN – The Upgrade that Wasn't



The Integrated Services Digital Network could essentially be
understood as a “Version 2” of the PSTN.

Basic Rate Interfaces were two pair instead of one and delivered two 64Kbps
digital channels (in AMI form just as before) and one 16Kbps D-channel.

PRI were carried by DS1, and had 23 64Kbps channels and one 64Kbps D-channel.

You could make digital voice or data “calls” over any of these channels (even
opening multiple channels to one destination).
Because it was digital, and also provided a means for unadulterated
data transmission, Telco companies saw it as a “value added
service.”
So it cost extra to get one. Digital voice calls were free just like on a
normal phone, but data connections were costly.
Data over Voice

So, what you do is you modify your modem's DSP firmware.

When you get a voice call, check to see if it passes HDLC framing.


If it doesn't pass, start modem negotiations. If it does pass, push the
raw digital bits through to your processor for digital PPP negotiation.
The telco in Washington actually tried to get a law passed to make
this illegal.
Just how Fast


Well, OC-768 is running these days at about 40Gbps.
At IETF 71, Philadelphia, Comcast delivered transit for the event via a
single 100 Gigabit Ethernet line (the prototype).


Because it was the prototype, the IETF network didn't actually have equipment to
receive it. So the interface towards IETF was actually 10 individual 10Gig-E's.
The Ethernet and SONET specifications often do what they can to
just surpass each other. The simple reason is that SONET facilitates
ATM (Telco profit) whereas Ethernet facilitates IP (ISP profit).

The last DWDM system I saw put 64 lambdas on a single fibre.

So let's say, at 100Gbps per lambda, 6.4 Tbps is approachable.
References

I admit that I lifted the Ethernet framing picture from Wikipedia. It
appears to be under an appropriate license.

http://en.wikipedia.org/wiki/Ethernet_II_framing

The Internet Growth image is from ISC's 2008 domain survey,
available on our website (but please do ask before including it in
publications; we'll probably say yes, but we like to hear from you
anyway).

http://www.isc.org/ops/ds/
The rest is all from memory. I make no claims any of it is accurate.