Introduction to GIS - University of Vermont

Download Report

Transcript Introduction to GIS - University of Vermont

Introduction to GIS
Lecture 6
Important Database Concepts
Lecture slides by Austin Troy, University of Vermont, © 2007
©2007 by Austin Troy. All rights reserved
Introduction to GIS
How is Data Stored?
•People use number system with base 10
•Each digit corresponds to 10 to some power
•Hence a number with 3 digits has 103 or 1000
possibilities
•Why are computer values so often in multiples of eight?
•Because computers use a base 8 system of storing
numbers and values
•A byte is 8 “on-off switches” or bits
•Each switch/bit represents a binary number; one byte is
28 or 256 possibilities
©2007 by Austin Troy. All rights reserved
Introduction to GIS
How do binary numbers translate to real numbers?
•Switch combinations determine base ten number based
on the formula:
•N10= 2b-1+2b-2+…2b-b
•Where b= number of bits storing the number
•Hence the binary number
11111111 = 27*1+ 26 *1 + 25 *1 + 24 *1 + 23 *1 + 22
*1 + 21 *1 + 20 *1 = 255
•And the binary number
11111110 = 27*1+ 26 *1 + 25 *1 + 24 *1 + 23 *1 + 22
*1 + 21 *1 + 20 *0 = 254
2
10
2
10
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Number of Possible Values is fn of number of
bits
•Number of possible values for a unit of data is an
exponential function of the number of switches
•28=256
eight bit data
•216=65,536
sixteen bit data
•232= 4,294,967,296
thirty two bit data
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Number of bits determines data types
Examples of Integer data types
Byte: 28 (0 to 255)
Short Integer: 216 (ranges from –32,767 to
+32,767 without decimals, the sixteenth bit
determines sign)
Long Integer: 232 (+/-2.147483e+09 )
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Floating point data types
•In this case the number can have a decimal, but the
number of places is variable
•With this type of number the number of bits
determines not just the number of possible magnitudes
but also the level of precision of the decimal,
represented as number of decimal places.
•Fewer bits in FP numbers can lead to rounding errors
•Two types of FP number
Single Precision: Often 232
Double Precision: Usually double the bits of single
precision (i.e. 264)
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Other Data Types
Currency (type of number with specific
behaviors)
Date (recognizes order in dates)
String (text)
When numbers are represented as text they
have no numerical properties (e.g. zip codes)
Boolean (yes, no)
Object (e.g. pictures, bits of code, behaviors,
multi-media, programs)
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Three Classic Database Models
• Hierarchical
• Network
• Relational -Arc View and Arc Info use this model
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Hierarchical Database Model
A one-to-many method for storing data in a database that
looks like a family tree with one root and a number of branches or
subdivisions. Problem: linkages in the tables must be known before
Groovy 70s TV
Action shows
Dukes of
Hazzard
Tom
Wopat
Eric
Estrada
CHIPs
Larry
Wilcox
Sitcoms
Drama
Dallas
Larry
Hagman
Fantasy
Island
Ricardo
Montalban
©2007 by Austin Troy. All rights reserved
Starsky and
Hutch
Love Boat
Gavin
McLeod
Ted
Lange
David
Soul
Introduction to GIS
Hierarchical Database Model
•Example where this model works well:
•plant and animal taxonomies
•Soil classification
•Works when: classes are totally mutually exclusive
•Problem with this model:
•Does not work when have entities that belong to several
classes or do not have mutual exclusivity
•Think about the problems with Windows Explorer
•Example: classifying your music collection
•You may create classes like rock, jazz, classical, Latin,
with folders for artists nested within
•However, an artist may do rock and Latin and jazz on the
same album, or one song may be a combination
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Networked Database Model
A database design for storing information by linking all
records that are related with a list of “pointers.” Problem: linkages in
the tables must be known before. Not adaptable to change.
Action shows
Dukes of
Hazzard
CHIPs
NBC
Sitcoms
Drama
Dallas
CBS
Love Boat
Fantasy
Island
ABC
©2007 by Austin Troy. All rights reserved
Three’s
company
Introduction to GIS
Relational (Tabular) Database Model
A design used in database systems in which relationships are
created between one or more flat files or tables based on the idea
that each pair of tables has a field in common, or “key”. In a
relational database, the records are generally different in each table
The advantages: each table can be prepared and maintained
separately, tables can remain separate until a query requires
connecting, or relating them, relationships can be one to one, one to
many or many to one
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Data Tables (flat files)
Records are the unit that the data
are specific to
fields
Fields, or columns, are attribute
categories
Cells are where individual values
of a record for a field are stored
Headings: are
the labels for
the columns
records
cells
Name
Phone
Address
Student ID
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Data key
Is a field that is common to two or more flat files; allows a
query to be done across multiple tables or allows two tables to
be joined
Flat file: professor info
Name
Phone
Address
faculty ID
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
Flat file: course info
Course
name
Course
number
enrollment
faculty ID
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Join Tables
Based on the values of a field that can be found in both tables
The name of the field does not have to be the same
The data type has to be the same
In this case we
have a one to
Key A B
one join; here
1
Key C
1 4
the key is unique
2 5
1 10
2
Key A B C
2 20
3
1
3 6
1 4 10
3 50
2 5 10
2
JOIN
©2007 by Austin Troy. All rights reserved
3
3
6
50
Introduction to GIS
Join Tables
Key
1
A
B
4
1
1
2
2
3
In this case we have a
one to many join; here
the key is not unique
Key C
5
1
2
6
10
20
JOIN
©2007 by Austin Troy. All rights reserved
Key
1
A
B
C
4
10
1
1
2
5
10
2
3
6
20
Introduction to GIS
 Relational (Tabular) Database Model: 70s TV example
Now we can have various flat files (tables) with different record
types and with various attributes specific to each record
Table 1- specific to actors
Actor
Year
born*
Sideburn
length
David Sole
1948
serious
Eric Estrada
1949
moderate
Larry
Wilcox
1953
slight
Starsky
David
and Hutch Soul
Tom Wopat
1950
major
CHIPs
Eric
Larry Wilcox CBS
Estrada
Dukes
Tom
Wopat
Show
Table 2- specific to shows
Lead
actor
Co-star
Network
*
Paul Michael
Glaser
ABC
John
Schneider
*entirely guessed at- I am not responsible for mistaken TV trivia
©2007 by Austin Troy. All rights reserved
NBC
Introduction to GIS
 Relational (Tabular) Database Model
This allows queries that go across tables, like which CBS lead actors
were born before 1951? Answer: Tom Wopat and David Sole
Actor
Year
born*
Sideburn
length
David Sole
1948
serious
Eric Estrada
1949
moderate
Larry
Wilcox
1953
slight
Starsky
David
and Hutch Soul
Tom Wopat
1950
major
CHIPs
Eric
Larry Wilcox ABC
Estrada
Dukes
Tom
Wopat
It does this by combining information
from the two tables, using common
key fields
Show
Lead
actor
Co-star
Network
*
Paul Michael
Glaser
CBS
John
Schneider
*entirely guessed at- I am not responsible for mistaken TV trivia
©2007 by Austin Troy. All rights reserved
CBS
Introduction to GIS
 Relational (Tabular) Database Model
Object-relational databases can contain other objects as well, like
images, video clips, executable files, sounds, links
Actor
Year
born*
Sideburn
length
David Soul
1948
serious
Eric Estrada
1949
moderate
Larry
Wilcox
1953
slight
Tom Wopat
1950
major
Picture
©2007 by Austin Troy. All rights reserved
Introduction to GIS
 Relational Database: another
example: property lot info
Parcel Street
ID
address
zoning
11
15 Maple
St.
Residential-1
12
85 Brooks
Ave
Commercial-2
13
74 Windam
Ct.
Residential 4
One-to-one
relationship
Owner Parcel ID
occupation
J.
Smith
13
lawyer
R.
Jones
11
dentist
T.
Flores
12
Real estate
developer
©2007 by Austin Troy. All rights reserved
Introduction to GIS
 Relational database: one to many
relationship
Parcel Street
ID
address
zoning
11
15 Maple
St.
Residential-1
12
85 Brooks
Ave
Commercial-2
74 Windam
Ct.
Residential 4
13
In this case, several
people co-own the
same lot, so no longer
one lot, one person
One-to-many
relationship
Owner
Parcel ID occupation
J. Smith
13
lawyer
R. Jones 11
dentist
12
J.
McCann
financier
T.
Flores
12
©2007 by Austin Troy. All rights reserved
Real estate
developer
Introduction to GIS
Parcel
ID
Assuming each owner owned several parcels, we
would structure the database differently
One-to-many
relationship
Street
zoning
address
Properties owned by T. Flores
11
15 Maple St.
Residential-1
12
85 Brooks Ave
Commercial-2
13
74 Windam Ct.
Residential 4
Owner
occupation
# properties
owned
J. Smith
lawyer
2
R. Jones
dentist
5
J. McCann financier
T. Flores
Real estate
developer
Owner
Parcel Date of
transaction
ID
Flores
13
4-15-00
Flores
15
4-17-01
Flores
19
3-12-99
2
3
©2007 by Austin Troy. All rights reserved
Note: this table includes data
pertinent only to Flores’
ownership of these properties
Introduction to GIS
Example
Here’s an example of a chart showing the relationships between
flat files in a sample relational database for food suppliers* in
Microsoft Access
* This comes from an MS ACCESS sample database
©2007 by Austin Troy. All rights reserved
Introduction to GIS
* This comes from an MS ACCESS sample database
©2007 by Austin Troy. All rights reserved
Introduction to GIS
A real time RDBMS allows for realtime linking and
embedding of tables based on common fields
Here we see
all the
orders for
product ID
3; there is
no need to
include
product ID
in that subtable
* This comes from an MS ACCESS sample database
©2007 by Austin Troy. All rights reserved