The Domain Name System (DNS)

Download Report

Transcript The Domain Name System (DNS)

The Domain Name System (DNS)
J. C. Allen
March 14, 2004
History of DNS
●
Mid- to late-1960s:
–
U. S. Department of Defense Advanced Research
Projects Agency (ARPA)
–
1968. RFQ DAHC15 69 Q 0002, dated July 29, 1968
(“ARPANET”)
–
1969. ARPANET was commissioned:
●
●
●
IMPs (Bolt Beranek and Newman)
Network topology (Network Analysis Corp.)
Network measurement system (UCLA)
History (Continued)
●
The goal of the ARPANET: Connect important
research centers in the United States, allowing
government contractors to share computing
resources to promote research and joint
development
History (Continued)
●
Early 1980s
–
TCP/IP developed
–
Quickly became de facto standard
●
●
Included in BSD Unix
BSD Unix free to universities
–
ARPANET rapidly grew from hundreds of hosts to
tens of thousands
–
In time, ARPANET became the backbone of the
Internet
The Rapid Growth of the Internet
Created a Problem
●
1970s
–
ARPANET small with only a few hundred hosts
–
All routing information (name-address mappings)
contained in single file: HOSTS.TXT
●
●
–
HOSTS.TXT maintained by SRI NIC (“the NIC”)
HOSTS.TXT distributed from a single host: SRI-NIC
Network administrators
●
●
Update (email)
Download most current version (FTP)
Rapid Growth (Continued)
●
As hosts were added to the network, this scheme
proved to be unworkable
–
Size of HOSTS.TXT grew with the size of
ARPANET
–
Traffic generated by update process increased with
each additional host. Each additional host:
●
●
was another host updating from SRI-NIC
required another line in HOSTS.TXT
Rapid Growth (Continued)
●
1980s
–
ARPANET moved to TCP/IP
–
Size of the network exploded
–
Immediate problems:
●
●
●
●
Network traffic/load on SRI-NIC
Bandwidth consumed distributing a new version
proportional to the square of the number of hosts
Name collision. The NIC could guarantee unique
addresses, but had no authority over host names
HOSTS.TXT obsolete by the time it reached the most
remote hosts
Rapid Growth (Continued)
●
●
HOSTS.TXT update process not scalable
ARPANET chartered search for replacement.
System selected should:
–
Allow local administration of data
–
Make name-address mappings available globally
–
Use a hierarchical namespace to ensure names are
unique
Rapid Growth (Continued)
●
●
1984. Paul Mockapetris released RFCs 882 and
883, which describe the DNS
RFCs 882 and 883 later superseded by RFCs
1034 and 1035
The Design Goals of the DNS (RFC
1034)
●
The primary goal is a consistent name space
which will be used for referring to resources. In
order to avoid the problems caused by ad hoc
encodings, names should not be required to
contain network identifiers, addresses, routes, or
similar information as part of the name.
Design Goals (Continued)
●
●
The sheer size of the database and frequency of
updates suggest that it must be maintained in a
distributed manner, with local caching to improve
performance.
Where there tradeoffs between the cost of
acquiring data, the speed of updates, and the
accuracy of caches, the source of the data should
control the tradeoff.
Design Goals (Continued)
●
●
The costs of implementing such a facility dictate
that it be generally useful, and not restricted to a
single application.
Because we want the name space to be useful in
dissimilar networks and applications, we provide
the ability to use the same name space with
different protocol families or management.
Design Goals (Continued)
●
●
We want name server transactions to be
independent of the communications system that
carries them.
The system should be useful across a wide
spectrum of host capabilities. Both personal
computers and large timeshared hosts should be
able to use the system, though perhaps in
different ways.
Definitions
●
Domain Name Space. The domain name space is
represented as an inverse tree representing all
possible domains. Each node and leaf of the
domain name space provides a label for a set of
information, which may be empty. Query
operations attempt to extract specific types of
information from a particular set.
Definitions (Continued)
●
●
Node. A node is the point at which the tree
branches.
Domain Name. Each node has a text label. The
full domain name of a given node is the list of
labels on the path from that node to the root node.
Domain names are always read from the node
toward the root node. A domain name uniquely
identifies a single node in the domain name
space.
Definitions (Continued)
●
Fully Qualified Domain Name (FQDN). The
NULL (zero-length) domain name is reserved for
the root node. Therefore, a domain name which
ends in a dot actually ends in a dot and the root
node's label. This is interpreted as an indicator
that the domain name is absolute and
unambiguously specifies the node's location in
the domain name space. An absolute domain
name is referred to as a fully qualified domain
name.
Definitions (Continued)
●
●
Domain. A domain is simply a subtree of the
domain name space. The domain name of a
domain is the same as the domain name of the
node at the top of the domain.
Subdomain. A subdomain is a subtree of a
subtree. A domain is a subdomain of another
domain if the root of the subdomain is contained
within the domain.
Definitions (Continued)
●
Resource Records.
–
owner. The owner of a resource record is the domain
name where the resource record is found.
–
class. Resource records are divided into classes.
Each class corresponds to a protocol family or
instance of a protocol. For example:
●
●
"IN" describes internets based on TCP/IP.
"CH" describes internets based on the Chaosnet protocol.
Definitions (Continued)
●
Each record type of a given class is defined by a particular
syntax. All resource records of that class must adhere to
the syntax.
–
type. Different classes define different record types.
The type specifies the type of resource in the resource
record. Some types are common to more than one
class.
–
TTL (time-to-live). Time to live describes the amount
of time a resource record can be cached before it
should be discarded.
Definitions (Continued)
–
●
RDATA. The type and sometimes class dependent
data that describes the resource.
Examples:
–
localhost.cnu.edu IN A 127.0.0.1
–
lh IN CNAME localhost.cnu.edu
Definitions (Continued)
●
●
Top-Level Domain (TLD) or Generic Top-Level
Domain (gTLD). A TLD is a child of the root
node.
Delegation. Delegation is the process by which
responsibility for a subdomain is assigned to
another organization.
Definitions (Continued)
●
Name Servers. Name server, in this sense, is
used to refer to a server application that stores
information about the domain name space's
structure and set information. Name servers have
complete information about a subset of the
domain name space, called a zone, and pointers to
other name servers that lead to information from
any part of the domain name space.
Definitions (Continued)
●
●
Primary Master. A primary master reads the data
for the zone from a file on its host. A primary
master is authoritative.
Secondary Master. A secondary master retrieves
zone data from another master name server that is
authoritative at some interval, which may be
selected. This is frequently, but not necessarily,
the primary master. A secondary master is
authoritative.
Definitions (Continued)
●
Zone. A zone is a smaller unit than a domain, and
may be, but is not necessarily, a subdomain. A
domain is broken up into zones to make it more
manageable by delegation.
Definitions (Continued)
●
Zone Transfer. The process by which a secondary
master contacts its master server and, if
necessary, transfers the zone data files. Changes
are made to the zone data file on the primary
master. Secondary masters periodically check for
changes and obtain new copies of the zone data
files when necessary.
Definitions (Continued)
●
Zone Data Files ("data files" or "database files").
Zone data files are files from which the primary
master name servers load their zone data.
Secondary masters back up zone data to data
files. If killed, secondary masters check their
backup first. If the zone data is current, there is
no need to transfer files from the primary master.
Zone data files contain the resource records that
describe a zone.
Definitions (Continued)
●
Resolver. Resolvers are programs that access
name servers. A resolver must be able to access
at least one name server and use that name
server's information to answer a query directly, or
pursue the query using referrals to other name
servers. Resolvers provide a service by querying
name servers, interpreting the response, and
returning information to the application that
requested it.
Definitions (Continued)
●
●
Resolution. Resolution is the process by which
name servers satisfy client requests.
Root Name Server. A root name server knows
where the authoritative name servers for each
TLD are. There are thirteen (13) root name
servers spread across different parts of the
network.
Definitions (Continued)
●
Recursion. Recursion, or recursive resolution, is
the name for the process used by name servers
which receive recursive queries. The name server
takes on the task of locating the requested
resource. If it does not already know the answer,
it queries other name servers, following any
referrals it receives, until an authoritative answer
to the query is received. Recursion is the
simplest mode for the client.
Definitions (Continued)
●
Iteration. Iteration, or iterative resolution, is the
name for the process used by name servers which
receive iterative queries. The name server returns
the best answer that it already knows to the
requester using only information available
locally. The response to an iterative query
contains an error, the answer, or a referral to some
other server closer to the answer. Iteration is the
simplest mode for the name server.
Definitions (Continued)
●
Caching. Caching refers to the process by which
the information returned by DNS queries is
locally replicated, eliminating the need to requery each time the same information is
requested.
Features
●
DNS uses query-response to locate resources in
the domain name space
–
Applications use resolvers (stub vs. “smart”)
–
Resolvers construct queries and send them to name
servers
–
Response from name server either:
●
●
●
provides the information requested,
refers the requester to another name server, or
signals some error has occurred
Features (Continued)
●
DNS queries and responses have a standard
format:
–
header containing required information
–
four sections containing specific parameters:
●
●
●
●
Question (query name)
Answer (answer the question)
Authority (describe other authoritative servers)
Additional (may be helpful)
Features (Continued)
●
●
●
Standard query (SQUERY) specifies:
–
target domain name (QNAME)
–
type (QTYPE)
–
class (QCLASS)
Resolver constructs query, sends to name server
Name server searches for matching resource
records
Features (Continued)
●
If no exact match, name server
–
returns a pointer to a name server “closer” to the
answer, or
–
information that may be useful,
–
depending on type of query: recursive or iterative
–
Recursion if resolver and name server agree to its use,
iteration otherwise.
Features (Continued)
●
Example query:
+---------------------------------------------------+
Header
| OPCODE=SQUERY
|
+---------------------------------------------------+
Question
| QNAME=SRI-NIC.ARPA., QCLASS=IN, QTYPE=A
|
+---------------------------------------------------+
Answer
| <empty>
|
+---------------------------------------------------+
Authority | <empty>
|
+---------------------------------------------------+
Additional | <empty>
|
+---------------------------------------------------+
Features (Continued)
●
Example response:
+---------------------------------------------------+
Header
| OPCODE=SQUERY, RESPONSE, AA
|
+---------------------------------------------------+
Question
| QNAME=SRI-NIC.ARPA., QCLASS=IN, QTYPE=A
|
+---------------------------------------------------+
Answer
| SRI-NIC.ARPA. 86400 IN A 26.0.0.73
|
|
86400 IN A 10.0.0.51
|
+---------------------------------------------------+
Authority | <empty>
|
+---------------------------------------------------+
Additional | <empty>
|
+---------------------------------------------------+
Features (Continued)
●
Another example response:
+---------------------------------------------------+
Header
| OPCODE=SQUERY,RESPONSE
|
+---------------------------------------------------+
Question
| QNAME=SRI-NIC.ARPA., QCLASS=IN, QTYPE=A
|
+---------------------------------------------------+
Answer
| SRI-NIC.ARPA. 1777 IN A 10.0.0.51
|
|
1777 IN A 26.0.0.73
|
+---------------------------------------------------+
Authority | <empty>
|
+---------------------------------------------------+
Additional | <empty>
|
+---------------------------------------------------+
Structure
●
The DNS has three major components:
–
The domain name space and resource records
describe the inverse tree structure of the name space
and any associated data. Each node of the domain
tree describes a set of data which is a subset of the
whole tree.
Structure (Continued)
–
Name servers hold information about the domain
tree's structure and set information. A name server
may maintain a cache of structure or set information
about any part of the domain tree. A name server has
complete information about some subset of the
domain name space for which it is authoritative,
called a zone, and pointers to other name servers that
have information about any other part of the domain
tree.
Structure (Continued)
–
Resolvers are programs or library routines that
provide a service to user programs. Resolvers attempt
to extract information from name servers by sending
queries to a name server and handling the response.
Queries attempt to extract specific types of
information from a particular set of data. A query
provides the domain name of interest and describes
the type of information that is being requested.
Implementation
●
Implementation details are beyond the scope of
this presentation. In general (using BIND):
–
The zone administrator creates the zone data files
which describe the zone. These include a file
containing the resource records which provide
information on specific types of information which
are available in the zone, a file which describes the
loopback address hosts use to direct traffic to
themselves, and a file which contains the locations of
the name servers for the root zone and which is
obtained from InterNIC via anonymous FTP.
Implementation (Continued)
–
Next, the administrator configures BIND. The BIND
configuration file, called named.boot (older versions
of BIND) or named.conf, provides the locations of the
zone data files on the name server and the values of
any options to be passed to BIND.
–
Finally, the name server is started with the
/usr/sbin/named command.
Implementation (Continued)
–
The process for implementing a secondary master
name server is similar. Again, specific details in the
zone data files are changed to indicate that the name
server is a secondary master. When the secondary
master is started it reads zone data from these files.
However, it receives all updates from the primary
master server as described previously via a zone
transfer.
Significance
●
A distributed system is defined as one in which
components at networked computers
communicate and coordinate their actions only by
passing messages. This is clearly the case with
the DNS.
–
Name servers and resolvers pass messages back and
forth in the form of query and response to coordinate
the location of resources on a global network.
Significance (Continued)
●
In addition, the DNS has the following
characteristics:
–
Data available through client-server interaction.
–
Data replicated to provide global access to data
–
Data cached to minimize network traffic and load.
–
Data distributed so that the name server most likely to
know its location is authoritative.
–
DNS is scalable.
–
DNS is transparent
Summary
The DNS is a distributed database structured as an inverse tree. Each
node in the tree describes a set of data. The data is requested by user
programs. The user program uses a resolver to construct the query and send
it to a name server. A name server provides a response to client (resolver)
requests. Through client-server interaction, the location of every resource
on the Internet may be determined. The decentralized nature of the DNS
reduces network traffic and server load and increases ease of administration.
Applications exist to implement DNS which are dependent on the
underlying hardware and operating system.
References
1.
2.
3.
4.
5.
P. Albitz and C. Liu, DNS and BIND, Fourth Ed. (O'Reilly &
Associates, Inc., Sebastopol, CA, 2001)
P. Mockapetris, RFC-1034 Domain Names - Concepts and
Facilities (Information Sciences Institute, November 1987)
P. Mockapetris, RFC-1035 Domain Names - Implementation and
Specification (Information Sciences Institute, November 1987)
G. Coulouris, J. Dollimore, and T. Kindberg, Distributed Systems,
Concepts and Design, Third Ed. (Pearson Education Ltd., Harlow,
2001)
B. Leiner et al., A Brief History of the Internet, version 3.32
(Internet Society, www.isoc.org/internet/history/brief.shtml, 2003)