relational databases - Northern Illinois University

Download Report

Transcript relational databases - Northern Illinois University

Relational Databases
In this lesson you will learn:
•
concept of relational databases
•
database keys
•
relational database operations
•
relates vs. joins
•
indexing geography
•
the U.S. FIPS code system
•
SQL language
What is a relational database?
Relational database
Defn: A collection of tables, which can be connected to each other by
attributes whose values can uniquely identify a record in a table.
Kang-tsung Chang, Introduction to Geographic Information
Systems (2nd edn), New York: McGraw, Hill, pg 14.
You
Your place
of residence
Your
hometown
name
street address
city
street address
city
state
education
building type
population
career goals
square footage
year founded
age
# bedrooms
nickname
….
….
….
Flat vs. relational databases
person
residence
ID
name
age
gender
address
type
Sq ft
#baths
city
state
pop
0001
J. Smith
87
M
104 N. Main
SFR
1800
1.5
Alma
IL
9832
0002
J. Smith
89
F
104 N. Main
SFR
1800
1.5
Alma
IL
9832
0003
J. Smith
57
F
840 Washburn
SFR
2450
2
Alma
IL
9832
0004
J. Smith
20
M
840 Washburn
SFR
2450
2
Alma
IL
9832
….
…
…
…
…
…
…
…
…
…
…
Place of
residence
Persons
name
address
gender
J. Smith
104 N. Main
F
J. Smith
B. Doe
E. Mackey
…
community
104 N. Main
M
F
F
….
Communities
street address
city
building
type
13 N 2nd, unit B
Alma
MFR
840 Washburn
Alma
SFR
217 S. Main
Alma
SFR
104 N. Main
Alma
SFR
…
…
….
city
state
Acacia
IL
Addison
IL
Albion
IL
Alma
IL
…
…
population
9,832
The “Key”
Persons
name
address
gender
J. Smith
104 N. Main
F
W. Smith
109 N. Main
M
B. Doe
F
E. Mackey
F
…
….
Place of
residence
street address
city
building
type
13 N 2nd, unit B
Alma
MFR
840 Washburn
Alma
SFR
217 S. Main
Alma
SFR
104 N. Main
Alma
SFR
…
…
….
Communities
city
state
Acacia
IL
Addison
IL
Albion
IL
Alma
IL
…
…
population
9,832
Properties of “Keys”
Place of
residence
Persons
name
address
gender
J. Smith
104 N. Main
F
J. Smith
104 N. Main
M
B. Doe
F
E. Mackey
F
…
….
Communities
street address
city
building
type
13 N 2nd, unit B
Alma
MFR
840 Washburn
Alma
SFR
217 S. Main
Alma
SFR
104 N. Main
Alma
SFR
…
…
….
city
state
Acacia
IL
Addison
IL
Albion
IL
Alma
IL
…
…
Employment history
Persons
population
9,832
Medical records
name
address
gender
employee
employer
#years
patient
physician
insurance
Ja. Smith
104 N. Main
F
Ja. Smith
Alma City
13
Ja. Smith
D. Brown
Red Shield
Jo. Smith
104 N. Main
M
Jo. Smith
Alma Library
12
Jo. Smith
Eggleson
Etne
B. Doe
F
Bea Doe
Mom’s Grill
3.5
B. Doe
D. Brown
Mutual
E. Mackey
F
E. Mackey
AAA Towing
27
M.Morris
K. Chang
Mutual
…
….
…
…
….
…
…
Relational database operations
Single table operations:
• edit
• sort
• query (find, filter, subset)
• data statistics & graphs
Connected table operations
• edit
• sort
• query (find, filter, subset)
• reports (data statistics & graphs)
Relational database operations
Connection operations
1. Relates – links attribute data of secondary tables to primary table,
maintaining data table independence. Output table consists of no
more rows than records in primary table.
2. Joins – physically appends attribute data of secondary tables to
primary table; may be temporary or permanent. Each match creates
new record.
Combination integrity
Combinatorial relations
1:1
one-to-one; each instance of a match between the primary key
and the foreign key produces a new record. Integrity conflict
occurs when several foreign keys match one primary key.
Combination integrity is retained when primary and foreign
keys are unique to single records.
1:N
one-to-many; each primary key is unique, however secondary
tables may contain several records associated with the same
foreign key. Combination integrity is retained when the primary
key is unique to single records; conflict occurs when the
primary key is not unique.
M:N
many-to-many; primary and foreign keys are not necessarily
unique. No combination conflict.
A 1:1 combination
Spatial data:
counties of Illinois
Attribute data:
1989-98 county-level mortality rates
Name
Birth
Trauma
Automobile
accidents
Guns
Poison
Adams
1.00
14.30
0.60
1.50
Alexander
1.00
34.00
14.60
0.00
Bond
Bond
0.60
25.30
5.80
0.60
004
Boone
Boone
0.60
22.40
0.30
0.90
005
Brown
Brown
0.00
20.80
0.00
0.00
006
Bureau
Bureau
0.60
23.80
1.10
2.20
007
Calhoun
Calhoun
1.90
31.20
3.90
0.00
008
Carroll
Carroll
0.60
25.00
0.60
1.20
…
…
…
…
…
ID
County_name
001
Adams
002
Alexander
003
…
A 1: many combination
Attribute data:
Historic tornado records, by state
Spatial data:
states of the United States
ID
State
State
Year
Event#
…
…
…
…
Illinois
1950
1
017
Illinois
…
1950
2
018
Indiana
…
1951
1
019
Iowa
…
…
…
…
…
…
020
Kansas
Illinois
1995
8
021
Kentucky
Indiana
1950
1
…
…
..
022
Louisiana
…
…
…
023
Maine
…
…
…
Indiana
1995
12
Iowa
1950
1
…
…
…
…
…
…
…
…
…
…
From_LAT
From_LON
…
…
Indexing Geography
WA
Washington
MT
Montana
OR
ME
ND
Maine
North Dakota
Oregon
MN
VTNH
Minnesota
ID
Vermont
WI
SD
Idaho
South Dakota
WY
NY
MI
Wisconsin
NV
CA
Utah
CO
Illinois
KS
MO
Kansas
Missouri
IN
MD DE
DC
Ohio
Delaware
WV
Maryland
West Virginia
VA
Indiana
KY
Virginia
Kentucky
NC
TN
AZ
Arizona
OK
NM
AL
MS
Mississippi Alabama
TX
SC
South Carolina
Arkansas
New Mexico
North Carolina
Tennessee
AR
Oklahoma
GA
Georgia
LA
Texas
Louisiana
FL
Florida
0
200
400
600
800
NJ
Pennsylvania
New Jersey
OH
IL
Colorado
California
Iowa
Nebraska
UT
Nevada
PA
IA
NE
MA
RI
Massachusetts
CT
New York
Rhode Island
Connecticut
Michigan
Wyoming
New Hampshire
1000 km
Elementary geographic indexes
Grid index map of the San Francisco Zoo.
Image courtesy of Northern California GIS,
http://www.nocalgis.org
U.S. FIPS codes
Federal information processing standards codes (FIPS codes) are a standardized set of
numeric or alphabetic codes issued by the National Institute of Standards and Technology
(NIST) to ensure uniform identification of geographic entities through all federal government
agencies. The entities covered include: states and statistically equivalent entities, counties and
statistically equivalent entities, named populated and related location entities (such as, places
and county subdivisions), and American Indian and Alaska Native areas.
FIPS publications include:
FIPS PUB 5-2, Codes for the Identification of the States, the District of Columbia and the
Outlying Areas of the United States, and Associated Areas
FIPS PUB 6-4, Counties and Equivalent Entities of the U. S., Its Possessions, and
Associated Areas
FIPS PUB 6-5, Lookup Counties and Equivalent Entities of the U.S., Puerto Rico and Island
Areas
FIPS PUB 8-6, Metropolitan Areas (Including MSAs, CMSAs, PMSAs, and NECMAs)
FIPS PUB 9-1, Congressional Districts of the U.S.
FIPS PUB 55-3, Codes for Named Populated Places, Primary County Divisions, and
Other Locational Entities of the United States, Puerto Rico, and the Outlying Areas.
from the U.S. Census Bureau,
http://www.census.gov/geo/www/fips/fips.html
FIPS code geography
State
FIPS
Alpha
State
FIPS
Alpha
Code
Code
Code
Code
Alabama
01
AL
Montana
30
MT
Alaska
02
AK
Nebraska
31
NE
Arizona
04
AZ
Nevada
32
NV
Arkansas
05
AR
New Hampshire
33
NH
California
06
CA
New Jersey
34
NJ
Colorado
08
CO
New Mexico
35
NM
Connecticut
09
CT
New York
36
NY
Delaware
10
DE
North Carolina
37
NC
District of Columbia
11
DC
North Dakota
38
ND
Florida
12
FL
Ohio
39
OH
Georgia
13
GA
Oklahoma
40
OK
Hawaii
15
HI
Oregon
41
OR
Idaho
16
ID
Pennsylvania
42
PA
Illinois
17
IL
Rhode Island
44
RI
Indiana
18
IN
South Carolina
45
SC
Iowa
19
IA
South Dakota
46
SD
Kansas
20
KS
Tennessee
47
TN
Kentucky
21
KY
Texas
48
TX
Louisiana
22
LA
Utah
49
UT
Maine
23
ME
Vermont
50
VT
Maryland
24
MD
Virginia
51
VA
Massachusetts
25
MA
Washington
53
WA
Michigan
26
MI
West Virginia
54
WV
Minnesota
27
MN
Wisconsin
55
WI
Mississippi
28
MS
Wyoming
56
WY
Missouri
29
MO
FIPS code geography
Counties & Equivalent Entities
UTAH
CODE
001
003
005
007
009
011
013
015
017
019
(UT) (49)
NAME
Beaver
Box Elder
Cache
Carbon
Daggett
Davis
Duchesne
Emery
Garfield
Grand
(P)MSA - LEVEL - CMSA - POSTAL - STATE - COUNTY - PLACE SMAS
1600 A 14 ..........................................................
Chicago, Illinois
IL 17
031
Cook County
037
De Kalb County
043
Du Page County
063
Grundy County
089
Kane County
093
Kendall County
097
Lake County
111
McHenry County
197
Will County
CODE
021
023
025
027
029
031
033
035
037
039
NAME
Iron
Juab
Kane
Millard
Morgan
Piute
Rich
Salt Lake
San Juan
Sanpete
CODE
041
043
045
047
049
051
053
055
057
NAME
Sevier
Summit
Tooele
Uintah
Utah
Wasatch
Washington
Wayne
Weber
Metropolitan Areas (including MSAs,
CMSAs, PMSAs, and NECMAs)
Who uses FIPS?
State
FIPS
Entity
FIPS
State
Alpha
#
Counties
Geographic
Entity Name
County
FIPS
County
Name
17
00893
IL
1
Allin (Township of)
113
McLean
17
00906
IL
1
Allison (Township of)
101
Lawrence
17
00912
IL
1
Allright
023
Clark
17
00919
IL
1
Alma
121
Marion
17
00932
IL
1
Alma (Township of)
121
Marion
17
00937
IL
1
Almora
089
Kane
17
00945
IL
1
Alonzo
075
Iroquois
17
00958
IL
1
Alorton
163
St. Clair
17
00971
IL
1
Alpha
073
Henry
Portion of the Census geography FIPS table for Illinois. source: USGS,
FIPS55 Data Base, http://geonames.usgs.gov/fips55.html
Other specialized geographic indexes
Census Bureau Geographic Hierarchy **
** Legal and statistical
entities for which the
Census Bureau
tabulates data during
the decennial census.
source: U.S. Bureau of the
Census
http://www.census.gov/geo/
www/geodiagram.html
U.S. Census geography
Land parcel geography
SQL
Structured Query Language (SQL): provides a standard set of logical,
mathematical and text operators, commands, functions, and syntax rules for
externally accessing, interacting with and operating on database tables.
SQL is an applications-driven programming language used to:
•build & edit
•restructure
•relate & join
•sort
•query & segment
individual and connected database tables
Basic syntax of SQL
1. Function calls
a. parentheses: inner-most first
middle(st_name, find(st_name, “ “, 2)+1,2)
b. left-to-right
left(name,2) & “.” & right(year, 2)
2. Mathematical, logic or text operators
a. parentheses: inner-to-outer
(st_sufx=“Ave”) AND ( (st=“Monroe”) OR (st=“Broad “) )
b. unary operators (-, NOT)
(st_sufx=“Blvd”) AND NOT(st=“Washington”)
c. ^, ×/÷, +/-
yield × 1.10 + 0.1 × max_yld
d. left to right
hours × wage + overtime × OT_rate – FERPA
1
Legend
st_name
left(….., …)
OR, ×
AND, +
3, 2, …
operand, database attribute
function with two arguments
operator executed first
operator executed second
order of operators
3
2
4
SQL examples
SQL examples
The GeoDatabase
What you have learned
In this lesson you learned:
• A relational database is a system of independent data tables that can be connected to each other by
unique attribute values.
• Data tables are connected by means of primary and foreign keys; a key can be any type of data field.
• Relational databases support the full range of edit, sort, query, and data exploration operations; sorts and
queries are implemented through relational tables created by sort or query rules.
• Data tables are related by one of two types of connections: relate and join. A relate links a primary table
with a secondary table, creating an output table with no more rows than there are records in the primary
table. A join appends secondary table columns to the primary table and is always a 1-to-1 connection.
• Combination, or referential integrity is attained when a join or relate produces a single unambiguous
outcome; combination conflict arises when the relate or join produces multiple connections meeting
match conditions.
• A 1-to-1 connection assumes that each primary key is unique and matches with only one record having a
matching foreign key; a 1-to-many connection requires a unique primary key, but allows matches to many
records with the same foreign key
• Attribute data tables are typically connected to spatial data through a geographic index, serving as a key.
FIPS geographic codes index geographic units from the level of states, down to counties, county divisions,
metropolitan areas, and incorporated places. The Census Bureau indexes census geography down to the
level of tracts, block groups, and census blocks.
• SQL (Structured Query Language) is the standard programming language for relational databases used
to edit, restructure, create, join or relate, sort, query, filter, and subset individual and connected tables.
• The GeoDatabase is a relational database with capabilities for storing and accessing spatial data and
attribute data in a single, common database.