Introduction to GIS - University of Vermont
Download
Report
Transcript Introduction to GIS - University of Vermont
Introduction to GIS
Lecture 6
Database Concepts
By Brian Voigt
University of Vermont
Thanks are due to Dr Troy, upon whose lecture much of this material is based.
©2007 by Austin Troy. All rights reserved
©2007 by Austin Troy. All rights reserved
Fundamentals of GIS
How is Data Stored?
People
• typically use a decimal (base 10) number system
• each digit corresponds to 10 to some power
• a number with 3 digits has 103 or 1000 possibilities
Computers
• use a base 2 system … a bit (Binary digIT) has 2
possible values
• a byte is 8 “on-off switches” or bits
• each switch/bit represents a binary number
• one byte is 28 or 256 possibilities
Lecture slides by Austin Troy & Brian Voigt, University of Vermont, © 2011
Introduction to GIS
How is Data Stored?
Example: 12,345 (base 10)
1*10,000 + 2*1,000 + 3*100 + 4*10 + 5*1
1*104 + 2*103 + 3*102 + 4*101 + 5*100
©2007 by Austin Troy. All rights reserved
Introduction to GIS
How do binary numbers translate to real numbers?
• Switch combinations determine decimal number based
on the formula:
• N10= 2b-1+2b-2+…2b-b
• Where b= number of bits storing the number
• Hence the binary number
11111111 = 27*1+ 26 *1 + 25 *1 + 24 *1 + 23 *1 +
22 *1 + 21 *1 + 20 *1 = 255
2
10
11111110 = 27*1+ 26 *1 + 25 *1 + 24 *1 + 23 *1 +
22 1 + 21 *1 + 20 *0 = 254
2
10
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Number of Possible Values
• Number of possible values for a unit of data is an
exponential function of the number of switches
• 28 = 256
• 216 = 65,536
• 232 = 4,294,967,296
eight bit data
sixteen bit data
thirty two bit data
©2007 by Austin Troy. All rights reserved
Introduction to GIS
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Number of bits determines data types
Examples of Integer data types
Byte: 28 (0 to 255)
Short Integer: 216 (ranges from –32,767 to
+32,767 without decimals, the sixteenth bit
determines sign)
Long Integer: 232 (+/-2.147483648e+09 )
Floating point: precision can go to either side of
decimal place
Single precision: 32 bit
Double precision: 64 bit
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Other Data Types
Currency
Date (recognizes order in dates)
String (text)
When numbers are represented as text they
have no numerical properties (e.g. zip codes)
Boolean (TRUE/FALSE ; yes/no)
Object (“BLOB”, e.g., image file)
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Three Classic Database Models
• Hierarchical
• Network
• Relational - ArcView and ArcInfo use this model
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Hierarchical Database Model
A one-to-many method for storing data in a database that
looks like a family tree with one root and a number of branches or
subdivisions. Problem: table linkages must be known in advance
Groovy 70s TV
Action shows
Dukes of
Hazzard
Tom
Wopat
Eric
Estrada
CHIPs
Larry
Wilcox
Sitcoms
Drama
Dallas
Larry
Hagman
Fantasy
Island
Ricardo
Montalban
©2007 by Austin Troy. All rights reserved
Starsky and
Hutch
Love Boat
Gavin
McLeod
Ted
Lange
David
Soul
Introduction to GIS
Hierarchical Database Model
Example where this
model works well:
• Taxonomies
• Soil classification
Works when: classes
are totally mutually
exclusive / exhaustive
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Hierarchical Database Model
Problem:
• Does not work when there are entities that belong to
several classes or do not have mutual exclusivity
• Think about the problems with Windows Explorer
• Example: classifying your music collection
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Networked Database Model
A database design for storing information by linking all
records that are related with a list of “pointers.” Problem: linkages in
the tables must be known before. Not adaptable to change.
Action shows
Dukes of
Hazzard
CHIPs
NBC
Sitcoms
Drama
Dallas
CBS
Love Boat
Fantasy
Island
ABC
©2007 by Austin Troy. All rights reserved
Three’s
company
Introduction to GIS
Relational (Tabular) Database Model
A design used in database systems in which relationships are
created between one or more flat files or tables based on the idea
that each pair of tables has a field in common, or “key”. In a
relational database, the records are generally different in each table
The advantages: each table can be prepared and maintained
separately, tables can remain separate until a query requires
connecting, or relating them, relationships can be one to one, one to
many, or many to one
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Data Tables (flat files)
• Headings are the labels for the columns
• Fields, or columns, are attribute categories
• Records, or rows, are the unit that the data are specific to
• Cells are where individual values of a record for a field are stored
FIELD
RECORD
Name
Phone
Address
Student ID
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
©2007 by Austin Troy. All rights reserved
CELL
Introduction to GIS
Data key
The key is a field that is common to two or more flat files;
allows a query to be done across multiple tables or allows two
tables to be joined
Flat file: professor info
Name
Phone
Address
faculty ID
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
Flat file: course info
Course
name
Course
number
enrollment
faculty ID
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
***
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Joining Tables
•
Based on the values of a field that can be found in both tables
The name of the field does not have to be the same
The data type has to be the same
In this case we have a one to one join; here the key is unique
•
•
•
Key
1
2
3
A
B
Table 2
1
2
4
ID
C
5
1
10
2
20
3
50
3
6
Table 1
JOIN
©2007 by Austin Troy. All rights reserved
Joined Table
Key
1
A
B
C
4
10
2
1
2
5
10
3
3
6
50
Introduction to GIS
Join Tables
In this case we have a
one to many join; here
the key is not unique
Key
1
A
B
4
1
1
2
2
3
Key C
5
1
2
6
10
20
JOIN
©2007 by Austin Troy. All rights reserved
Key
1
A
B
C
4
10
1
1
2
5
10
2
3
6
20
Introduction to GIS
Relational (Tabular) Database Model: 70s TV example
Now we can have various flat files (tables) with different record
types and with various attributes specific to each record
Table 1- specific to actors
Actor
Year
born*
Sideburn
length
David Soul
1948
serious
Eric Estrada
1949
moderate
Larry
Wilcox
1953
slight
Starsky
David
and Hutch Soul
Tom Wopat
1950
major
CHIPs
Eric
Larry Wilcox CBS
Estrada
Dukes
Tom
Wopat
Show
Table 2- specific to shows
Lead
actor
Co-star
Network
*
Paul Michael
Glaser
ABC
John
Schneider
*entirely guessed at- I am not responsible for mistaken TV trivia
©2007 by Austin Troy. All rights reserved
NBC
Introduction to GIS
Relational (Tabular) Database Model
This allows queries that go across tables, like which CBS lead actors
were born before 1951? Answer: Tom Wopat and David Sole
Actor
Year
born*
Sideburn
length
David Sole
1948
serious
Eric Estrada
1949
moderate
Larry
Wilcox
1953
slight
Starsky
David
and Hutch Soul
Tom Wopat
1950
major
CHIPs
Eric
Larry Wilcox ABC
Estrada
Dukes
Tom
Wopat
It does this by combining information
from the two tables, using common
key fields
Show
Lead
actor
Co-star
Network
*
Paul Michael
Glaser
CBS
John
Schneider
*entirely guessed at- I am not responsible for mistaken TV trivia
©2007 by Austin Troy. All rights reserved
CBS
Introduction to GIS
Relational (Tabular) Database Model
Object-relational databases can contain other objects as well, like
images, video clips, executable files, sounds, links
Actor
Year
born*
Sideburn
length
David Soul
1948
serious
Eric Estrada
1949
moderate
Larry
Wilcox
1953
slight
Tom Wopat
1950
major
Picture
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Relational Database: another
example: property lot info
Parcel Street
ID
address
zoning
11
15 Maple
St.
Residential-1
12
85 Brooks
Ave
Commercial-2
13
74 Windam
Ct.
Residential 4
One-to-one
relationship
Owner Parcel ID
occupation
J.
Smith
13
lawyer
R.
Jones
11
dentist
T.
Flores
12
Real estate
developer
©2007 by Austin Troy. All rights reserved
Introduction to GIS
Relational database: one to many
relationship
Parcel Street
ID
address
zoning
11
15 Maple
St.
Residential-1
12
85 Brooks
Ave
Commercial-2
74 Windam
Ct.
Residential 4
13
In this case, several
people co-own the
same lot, so no longer
one lot, one person
One-to-many
relationship
Owner
Parcel ID occupation
J. Smith
13
lawyer
R. Jones 11
dentist
12
J.
McCann
financier
T.
Flores
12
©2007 by Austin Troy. All rights reserved
Real estate
developer
Introduction to GIS
Parcel
ID
Assuming each owner owned several parcels, we
would structure the database differently
One-to-many
relationship
Street
zoning
address
Properties owned by T. Flores
11
15 Maple St.
Residential-1
12
85 Brooks Ave
Commercial-2
13
74 Windam Ct.
Residential 4
Owner
occupation
# properties
owned
J. Smith
lawyer
2
R. Jones
dentist
5
J. McCann financier
T. Flores
Real estate
developer
Owner
Parcel Date of
transaction
ID
Flores
13
4-15-00
Flores
15
4-17-01
Flores
19
3-12-99
2
3
©2007 by Austin Troy. All rights reserved
Note: this table includes data
pertinent only to Flores’
ownership of these properties
Introduction to GIS
Example
Here’s an example of a chart showing the relationships between
flat files in a sample relational database for food suppliers* in
Microsoft Access
* This comes from an MS ACCESS sample database
©2007 by Austin Troy. All rights reserved
Introduction to GIS
* This comes from an MS ACCESS sample database
©2007 by Austin Troy. All rights reserved
Introduction to GIS
A real time RDBMS allows for realtime linking and
embedding of tables based on common fields
Here we see
all the
orders for
product ID
3; there is
no need to
include
product ID
in that subtable
* This comes from an MS ACCESS sample database
©2007 by Austin Troy. All rights reserved
DEMO
Working with other number systems
©2007 by Austin Troy. All rights reserved