lecture6_old - University of Vermont

Download Report

Transcript lecture6_old - University of Vermont

Introduction to GIS
Lecture 6
Important Database Concepts
Lecture slides by Austin Troy, University of Vermont, except where noted
Introduction to GIS
Three Classic Database Models
• Hierarchical
• Network
• Relational -Arc View and Arc Info use this model
Introduction to GIS
Data Types
An individual data point, or datum (singular of data), can be of
any number of types, including:
•number (must be continuous); below are subclasses:
Currency (has specific decimal behaviors)
Byte (0 to 255)
Date (recognizes order in dates)
Integer (no decimals)
Long (integers, with a greater range)
Single precision (with decimals)
Double precision (like single, with a greater range)
•string (text: numbers can be represented as text, but they have no
numerical properties)
•Boolean (yes, no)
•Object (holds data objects)
Introduction to GIS
Data Types--precision
Byte data type has 256 possible values. Why?
A byte is 8 “switches” or bits. Each switch has an on and an off
position. This results in 28 or 256 possibilities
A unit of data in byte mode uses one byte, also known as 8 bit
More advanced data types use more memory—for instance,
double precision data can use up to 128 bits per value
Integer general uses 16 bit
The more bits, the more possible values can be stored and the
higher the level of precision
Introduction to GIS
Hierarchical Database Model
A one-to-many method for storing data in a database that
looks like a family tree with one root and a number of branches or
subdivisions. Problem: linkages in the tables must be known before
Groovy 70s TV
Action shows
Dukes of
Hazzard
Tom
Wopat
Eric
Estrada
CHIPs
Larry
Wilcox
Sitcoms
Drama
Dallas
Larry
Hagman
Fantasy
Island
Ricardo
Montalban
Love Boat
Gavin
McLeod
Three’s
company
Suzanne
Somers
John
Ritter
Introduction to GIS
Networked Database Model
A database design for storing information by linking all
records that are related with a list of “pointers.” Problem: linkages in
the tables must be known before. Not adaptable to change.
Action shows
Dukes of
Hazzard
CHIPs
NBC
Sitcoms
Drama
Dallas
CBS
Fantasy
Island
Love Boat
ABC
Three’s
company
Introduction to GIS
Relational (Tabular) Database Model
A design used in database systems in which relationships are
created between one or more flat files or tables based on the idea
that each pair of tables has a field in common, or “key”. In a
relational database, the records are generally different in each table
The advantages: each table can be prepared and maintained
separately, tables can remain separate until a query requires
connecting, or relating them, relationships can be one to one, one to
many or many to one
Introduction to GIS
Data Tables (flat files)
Records are the unit that the data
are specific to
fields
Fields, or columns, are attribute
categories
Cells are where individual values
of a record for a field are stored
Headings: are
the labels for
the columns
records
cells
Name
Phone
Address
Student ID
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
Introduction to GIS
Data key
Is a field that is common to two or more flat files; allows a
query to be done across multiple tables or allows two tables to
be joined
Flat file: professor info
Name
Phone
Address
faculty ID
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
Flat file: course info
Course
name
Course
number
enrollment
faculty ID
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
Introduction to GIS
Join Tables
Based on the values of a field that can be found in both tables
The name of the field does not have to be the same
The data type has to be the same
In this case we
have a one to
Key A B
one join; here
1
Key C
1 4
the key is unique
2 5
1 10
2
Key A B C
2 20
3
1
3 6
1 4 10
3 50
2 5 10
2
JOIN
3
3
6
50
Introduction to GIS
Join Tables
Key
1
A
B
4
1
1
2
2
3
In this case we have a
one to many join; here
the key is not unique
Key C
5
1
2
6
JOIN
10
20
Key
1
A
B
C
4
10
1
1
2
5
10
2
3
6
20
Introduction to GIS
 Relational (Tabular) Database Model: 70s TV example
Now we can have various flat files (tables) with different record
types and with various attributes specific to each record
Actor
Year
born*
Sideburn
length
John Ritter
1948
slight
Eric Estrada
1949
moderate
Larry
Wilcox
1953
Tom Wopat
1950
Table 1- specific to actors
Show
Lead
actor
Co-star
Network
*
slight
Three’s
Company
John
Ritter
Suzanne
Somers
CBS
major
CHIPs
Eric
Larry Wilcox CBS
Estrada
Dukes
Tom
Wopat
Table 2- specific to shows
John
Schneider
*entirely guessed at- I am not responsible for mistaken TV trivia
NBC
Introduction to GIS
 Relational (Tabular) Database Model
This allows queries that go across tables, like which CBS lead actors
were born before 1951? Answer: John Ritter and Larry Wilcox
Actor
Year
born*
Sideburn
length
John Ritter
1948
slight
Eric Estrada
1949
moderate
Larry
Wilcox
1953
Tom Wopat
1950
It does this by combining information
from the two tables, using common
key fields
Show
Lead
actor
Co-star
Network
*
slight
Three’s
Company
John
Ritter
Suzanne
Somers
CBS
major
CHIPs
Eric
Larry Wilcox ABC
Estrada
Dukes
Tom
Wopat
John
Schneider
*entirely guessed at- I am not responsible for mistaken TV trivia
CBS
Introduction to GIS
 Relational (Tabular) Database Model
Object-relational databases can contain other objects as well, like
images, video clips, executable files, sounds, links
Actor
Year
born*
Sideburn
length
John Ritter
1948
slight
Eric Estrada
1949
moderate
Larry
Wilcox
1953
slight
Tom Wopat
1950
major
Picture
Introduction to GIS
 Relational Database: another
example: property lot info
Parcel Street
ID
address
zoning
11
15 Maple
St.
Residential-1
12
85 Brooks
Ave
Commercial-2
13
74 Windam
Ct.
Residential 4
One-to-one
relationship
Owner Parcel ID
occupation
J.
Smith
13
lawyer
R.
Jones
11
dentist
T.
Flores
12
Real estate
developer
Introduction to GIS
 Relational database: one to many
relationship
Parcel Street
ID
address
zoning
11
15 Maple
St.
Residential-1
12
85 Brooks
Ave
Commercial-2
74 Windam
Ct.
Residential 4
13
In this case, several
people co-own the
same lot, so no longer
one lot, one person
One-to-many
relationship
Owner
Parcel ID occupation
J. Smith
13
lawyer
R. Jones 11
dentist
12
J.
McCann
financier
T.
Flores
12
Real estate
developer
Introduction to GIS
Parcel
ID
Assuming each owner owned several parcels, we
would structure the database differently
One-to-many
relationship
Street
zoning
address
Properties owned by T. Flores
11
15 Maple St.
Residential-1
12
85 Brooks Ave
Commercial-2
13
74 Windam Ct.
Residential 4
Owner
occupation
# properties
owned
J. Smith
lawyer
2
R. Jones
dentist
5
J. McCann financier
T. Flores
Real estate
developer
Owner
Parcel Date of
transaction
ID
Flores
13
4-15-00
Flores
15
4-17-01
Flores
19
3-12-99
2
3
Note: this table includes data
pertinent only to Flores’
ownership of these properties
Introduction to GIS
Example
Here’s an example of a chart showing the relationships between
flat files in a sample relational database for food suppliers* in
Microsoft Access
* This comes from an MS ACCESS sample database
Introduction to GIS
* This comes from an MS ACCESS sample database
Introduction to GIS
A real time RDBMS allows for realtime linking and
embedding of tables based on common fields
Here we see
all the
orders for
product ID
3; there is
no need to
include
product ID
in that subtable
* This comes from an MS ACCESS sample database
Introduction to GIS
Queries
•This is how we ask questions of the data
•To ask queries, we use mathematical operators, like =, >, <
•To ask queries on multiple criteria, we use logical operators, like
AND and OR
•Arc GIS uses the same query structure as SQL
•Queries can be on numbers, dates, text, or any other data type
•Mathematical operators don’t work for text queries, because text
has no numeric behavior
•if “5” and “8” stored as text values, rather than numbers, then
“5”+”8” will equal “58,” not 13. For text, + means
Concatenation, not addition
•Queries can simply select records or perform more advanced
operations with those selections, such as make new tables, or
summarize values by averages
Introduction to GIS
Queries in Arc GIS
•Arc GIS queries only select (highlight) records
•When a record is selected, so is its
corresponding feature
•To summarize selected values, use the
“statistics” function or “summarize” tool
•To create new values based on a query, use the
“calculate” tool.