Transcript Lec16

Spatial Databases
ENVE/CE 424/524
Definitions
• Database – an integrated set of data on a particular subject
• Spatial database – database containing geographic data of a particular subject for a particular area
• Database Management System (DBMS) – software to create, maintain and access databases
Geographic Information System
• Data load
• Editing
• Visualization
• Mapping
• Analysis
Database Management System
• Storage
• Indexing
• Security
• Query
Data
GIS: old and new
GIS used to be monolithic systems
all-in-one, proprietary applications that stored, queried, and visualized data
New systems follow more of a tool-box approach
modularized applications that interoperate
Who can benefit from spatial data management?
Army Commander: Has there been any significant enemy troop movement
in the past week?
Insurance Risk Manager: Which houses are most likely to be affected in the
next great flood on the Mississippi?
Medical Doctor: Based on this patient’s MRI, have we treated somebody
with a similar condition?
Molecular Biologist: Is the topology of the amino acid biosynthesis gene in
the genome found in any other sequence feature map in the database?
Astronomer: Find all blue galaxies within 2 arcmin of quasars.
Three classes of users for spatial databases
Major database managers: specialized products for enterprise management
GIS users: analysis of data
Internet users: more generalized requirements
Advantages of Databases over Files
• Avoids redundancy and duplication
• Reduces data maintenance costs
• Applications are separated from the data
– Applications persist over time
– Support multiple concurrent applications
• Better data sharing
• Security and standards can be defined and enforced
Disadvantages of Databases over Files
• Expense
• Complexity
• Performance – especially with complex data types
• Integration with other systems can be difficult
Types of DBMS Model
• Hierarchical
• Network
• Relational – RDBMS
• Object-oriented – OODBMS
• Object-relational – ORDBMS
Characteristics of DBMS
• Data model support for multiple data types
– e.g., MS Access: Text, Memo, Number, Date/Time, Currency, AutoNumber, Yes/No, OLE Object, Hyperlink, Lookup Wizard
• Load data from files, databases and other applications
• Index for rapid retrieval
• Query language – SQL
• Security – controlled access to data
– Multi-level groups
• Controlled update using a transaction manager
• Backup and recovery
Relational DBMS
• Data stored as tuples (tup-el), conceptualized as tables
• Table – data about a class of objects
– Two-dimensional list (array)
– Rows = objects
– Columns = object states (properties, attributes)
Table = object class
Column = property
Row = object
Object classes with geometry are called feature classes
Relational DBMS
• Most popular type of DBMS
– Over 95% of data in DBMS is in RDBMS
• Commercial systems
– IBM DB2
– Informix
– Microsoft Access
– Microsoft SQL Server
– Oracle
– Sybase
Spatial Database Example
Land parcel with boundary id: 1050
Relational Database Example
Four tables needed in the land parcel relational database
Relational database example #2
Relation Rules (Codd, 1970)
• Only one value in each cell (intersection of row and
column)
• All values in a column are about the same subject
• Each row is unique
• No significance in column sequence
• No significance in row sequence
SQL
• Structured (Standard) Query Language (pronounced "sequel")
• Developed by IBM in the 1970s
• Now standard for accessing relational databases
• Three types of usage
– Stand alone queries
– High level programming
– Embedded in other applications (ArcGIS)
Types of SQL Statements
• Data Definition Language (DDL)
– Create, alter and delete data structures (tables, indexes)
– CREATE TABLE, CREATE INDEX
• Data Manipulation Language (DML)
– Retrieve and manipulate data
– SELECT, UPDATE, DELETE, INSERT
• Data Control Language (DCL)
– Control security of data
– GRANT, CREATE USER, DROP USER
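The DDL and DML statement types can be tried out with a few lines of Python. This is a minimal sketch using the sqlite3 module from the Python standard library; the parcel table, its columns, and the sample rows are hypothetical and are not the tables from the lecture's land parcel example. SQLite has no user accounts, so DCL statements such as GRANT are not shown.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the structure that will hold the data.
cur.execute("""
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,
        owner     TEXT NOT NULL,
        area_m2   REAL
    )
""")
cur.execute("CREATE INDEX idx_parcel_owner ON parcel (owner)")

# DML: insert, update, delete, and retrieve rows.
cur.executemany(
    "INSERT INTO parcel (parcel_id, owner, area_m2) VALUES (?, ?, ?)",
    [(1050, "Smith", 820.0), (1051, "Jones", 640.5), (1052, "Smith", 910.0)],
)
cur.execute("UPDATE parcel SET area_m2 = ? WHERE parcel_id = ?", (650.0, 1051))
cur.execute("DELETE FROM parcel WHERE parcel_id = ?", (1052,))
conn.commit()

rows = cur.execute(
    "SELECT parcel_id, owner, area_m2 FROM parcel ORDER BY parcel_id"
).fetchall()
print(rows)
conn.close()
```

Each row of the result is one object (a parcel) and each column is one of its properties, matching the table model described earlier.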
Spatial Types – OGC Simple Features
Data Model: A set of constructs for representing
objects and processes in a digital environment
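[Figure: OGC Simple Features geometry class hierarchy; legend: composed, type, relationship]
• Geometry (references a SpatialReferenceSystem)
– Point
– Curve: LineString (Line, LinearRing)
– Surface: Polygon
– GeometryCollection: MultiPoint, MultiCurve (MultiLineString), MultiSurface (MultiPolygon)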
Spatial Relations
• Equals – are the geometries the same?
• Disjoint – do the geometries share no common point?
• Intersects – do the geometries intersect?
• Touches – do the geometries intersect at their boundaries only?
• Crosses – do the geometries overlap (the geometries can be of different dimensions, e.g. lines and polygons)?
• Within – is one geometry within another?
• Contains – does one geometry completely contain another?
• Overlaps – do the geometries overlap (the geometries must be of the same dimension)?
• Relate – are there intersections between the interior, boundary or exterior of the geometries?
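Several of these relations can be illustrated with a simplified sketch. The code below assumes every geometry is an axis-aligned rectangle, which is far narrower than the OGC model; the Rect class, the function names, and the sample rectangles are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; assumes xmin < xmax and ymin < ymax."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def equals(a, b):
    return a == b


def disjoint(a, b):
    # No point in common: one rectangle lies entirely to one side of the other.
    return (a.xmax < b.xmin or b.xmax < a.xmin or
            a.ymax < b.ymin or b.ymax < a.ymin)


def intersects(a, b):
    return not disjoint(a, b)


def interiors_intersect(a, b):
    # Strict inequalities exclude rectangles that only share boundary points.
    return (a.xmin < b.xmax and b.xmin < a.xmax and
            a.ymin < b.ymax and b.ymin < a.ymax)


def touches(a, b):
    # Boundaries meet, interiors do not.
    return intersects(a, b) and not interiors_intersect(a, b)


def contains(a, b):
    # Every point of b lies in a (boundaries may coincide).
    return (a.xmin <= b.xmin and b.xmax <= a.xmax and
            a.ymin <= b.ymin and b.ymax <= a.ymax)


def within(a, b):
    return contains(b, a)


def overlaps(a, b):
    # Interiors meet, but neither rectangle contains the other.
    return (interiors_intersect(a, b)
            and not contains(a, b) and not contains(b, a))


parcel = Rect(0, 0, 10, 10)
house = Rect(2, 2, 4, 4)        # inside the parcel
neighbor = Rect(10, 0, 20, 10)  # shares the edge x = 10
shifted = Rect(5, 5, 15, 15)    # partly covers the parcel
far = Rect(30, 30, 40, 40)

print(contains(parcel, house), touches(parcel, neighbor),
      overlaps(parcel, shifted), disjoint(parcel, far))
```

Crosses and Relate are left out because they only become interesting for mixed dimensions (lines against polygons) and for the full interior/boundary/exterior comparison.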
Contains Relation
Touches Relation
Spatial Methods
• Distance – determines shortest distance between any two points in two
geometries
• Buffer – returns a geometry that represents all the points whose
distance from the geometry is less than or equal to a user-defined
distance
• ConvexHull – returns a geometry representing the smallest polygon that
can enclose another geometry without any concave areas
• Intersection – returns a geometry that contains just the points common
to both input geometries
• Union – returns a geometry that contains all the points in both input
geometries
• Difference – returns a geometry containing the points that are in the
first geometry but not in the second
• SymDifference – returns a geometry containing the points that are in
either of the input geometries, but not in both
Convex Hull and Difference Methods
Convex Hull
Difference
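As a rough illustration of these two methods, the sketch below computes a convex hull of a point set with Andrew's monotone chain algorithm (one common choice; the slides do not name an algorithm) and shows Difference and SymDifference on geometries approximated as sets of unit grid cells. The function names, the rasterized representation, and the sample coordinates are assumptions made for the example.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would create a right turn or a straight line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def cells(xmin, ymin, xmax, ymax):
    """Approximate a rectangle by the set of unit grid cells it covers."""
    return {(x, y) for x in range(xmin, xmax) for y in range(ymin, ymax)}


hull = convex_hull([(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)])

a = cells(0, 0, 4, 4)
b = cells(2, 2, 6, 6)
difference = a - b       # cells in a but not in b
sym_difference = a ^ b   # cells in exactly one of a and b

print(hull, len(difference), len(sym_difference))
```

The interior points (2, 2) and (1, 3) do not appear in the hull, which is the "no concave areas" property in the definition of ConvexHull.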
Indexing
• Used to locate rows quickly
• Like a book index, it is a special representation of the content that
adds order and makes finding items faster
• RDBMS use simple 1-d indexing
• Spatial DBMS needs 2-d, hierarchical indexing
– Grid
– Quadtree
– R-tree
• Multi-level queries are often used for performance (a coarse test on the minimum bounding rectangle, MBR, followed by an exact test on the geometry)
Grid Index (multi-level)
• Overlay uniform grid
• Assign objects a grid id
• Multi-level grids are used for variable-sized objects within a database
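The grid idea can be sketched in a few lines. The example below is a single-level grid only (the multi-level variant is not attempted), objects are represented by bounding boxes given as (xmin, ymin, xmax, ymax), and the GridIndex class and its method names are hypothetical.

```python
from collections import defaultdict


def boxes_intersect(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (col, row) -> ids of objects in that cell
        self.boxes = {}                # id -> bounding box

    def _cell_range(self, box):
        xmin, ymin, xmax, ymax = box
        s = self.cell_size
        return (range(int(xmin // s), int(xmax // s) + 1),
                range(int(ymin // s), int(ymax // s) + 1))

    def insert(self, obj_id, box):
        # Assign the object to every grid cell its box touches.
        self.boxes[obj_id] = box
        cols, rows = self._cell_range(box)
        for cx in cols:
            for cy in rows:
                self.cells[(cx, cy)].add(obj_id)

    def query(self, box):
        # Look only at the cells covered by the query window ...
        cols, rows = self._cell_range(box)
        candidates = set()
        for cx in cols:
            for cy in rows:
                candidates |= self.cells.get((cx, cy), set())
        # ... then confirm each candidate against the window itself.
        return sorted(i for i in candidates
                      if boxes_intersect(self.boxes[i], box))


index = GridIndex(cell_size=10)
index.insert("a", (1, 1, 3, 3))
index.insert("b", (12, 12, 18, 18))
index.insert("c", (8, 8, 11, 11))
index.insert("d", (25, 25, 28, 28))

result = index.query((0, 0, 9, 9))
print(result)
```

Objects "b" and "d" are never examined for this query because they are not registered in the only cell the window covers.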
Point and Region Quadtree Indexing
Based on recursive division of space.
Point Quadtree
Region Quadtree
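A minimal sketch of the point quadtree, where each stored point divides the remaining space into four quadrants, might look as follows. The region quadtree (equal-sized quadrants of a fixed extent) is not shown. The class and method names are hypothetical.

```python
class Node:
    def __init__(self, point):
        self.point = point
        # Children by quadrant: 0 = SW, 1 = SE, 2 = NW, 3 = NE.
        self.children = [None, None, None, None]


class PointQuadtree:
    def __init__(self):
        self.root = None

    @staticmethod
    def _quadrant(node_pt, pt):
        east = pt[0] >= node_pt[0]
        north = pt[1] >= node_pt[1]
        return (2 if north else 0) + (1 if east else 0)

    def insert(self, pt):
        if self.root is None:
            self.root = Node(pt)
            return
        node = self.root
        while True:
            q = self._quadrant(node.point, pt)
            if node.children[q] is None:
                node.children[q] = Node(pt)
                return
            node = node.children[q]

    def range_query(self, xmin, ymin, xmax, ymax):
        found = []
        stack = [self.root] if self.root is not None else []
        while stack:
            node = stack.pop()
            x, y = node.point
            if xmin <= x <= xmax and ymin <= y <= ymax:
                found.append(node.point)
            # Descend only into quadrants the query window can reach.
            west, east = xmin < x, xmax >= x
            south, north = ymin < y, ymax >= y
            reachable = [south and west, south and east,
                         north and west, north and east]
            for q, ok in enumerate(reachable):
                if ok and node.children[q] is not None:
                    stack.append(node.children[q])
        return sorted(found)


tree = PointQuadtree()
for p in [(50, 50), (20, 70), (80, 20), (60, 60), (55, 52), (10, 10)]:
    tree.insert(p)

hits = tree.range_query(40, 40, 70, 70)
print(hits)
```

The shape of a point quadtree depends on insertion order, so this sketch makes no attempt to keep the tree balanced.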
R-tree
Uses the minimum bounding rectangle (MBR), also called the minimum bounding box (MBB), of each object
A new object is added to the MBR that would need to expand the least to accommodate it
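The least-expansion rule can be sketched on its own, without the rest of the tree (node splitting, multiple levels, and deletion are not covered). Boxes are (xmin, ymin, xmax, ymax) tuples, the function names are hypothetical, and ties are broken by the smaller area, which is one common convention.

```python
def area(box):
    xmin, ymin, xmax, ymax = box
    return (xmax - xmin) * (ymax - ymin)


def union(a, b):
    """Smallest box that encloses both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def enlargement(mbr, box):
    """Extra area the MBR needs in order to also cover box."""
    return area(union(mbr, box)) - area(mbr)


def choose_mbr(mbrs, box):
    """Index of the MBR that expands the least; ties go to the smaller MBR."""
    return min(range(len(mbrs)),
               key=lambda i: (enlargement(mbrs[i], box), area(mbrs[i])))


mbrs = [(0, 0, 10, 10), (20, 20, 30, 30)]
new_object = (11, 4, 12, 5)

chosen = choose_mbr(mbrs, new_object)
mbrs[chosen] = union(mbrs[chosen], new_object)  # grow the chosen MBR
print(chosen, mbrs[chosen])
```

Here the first MBR grows by 20 area units while the second would grow by 394, so the new object goes into the first.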
Minimum Bounding Rectangle
[Figure: minimum bounding rectangle drawn around a study area]
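One way an MBR is typically used is as a cheap first filter before an exact geometric test. The sketch below assumes a simple polygon given as a list of (x, y) vertices and uses a ray-casting point-in-polygon test for the exact step; all function names and coordinates are hypothetical.

```python
def mbr(polygon):
    """Minimum bounding rectangle of a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def point_in_mbr(pt, box):
    return box[0] <= pt[0] <= box[2] and box[1] <= pt[1] <= box[3]


def point_in_polygon(pt, polygon):
    """Ray casting: count edge crossings of a ray going in +x from pt."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def points_in_polygon(points, polygon):
    box = mbr(polygon)
    # Step 1: coarse filter on the MBR (four comparisons per point).
    candidates = [p for p in points if point_in_mbr(p, box)]
    # Step 2: exact test only for the surviving candidates.
    return [p for p in candidates if point_in_polygon(p, polygon)]


study_area = [(0, 0), (10, 0), (0, 10)]  # a triangle
points = [(1, 1), (8, 8), (20, 20), (3, 3)]

selected = points_in_polygon(points, study_area)
print(mbr(study_area), selected)
```

The point (8, 8) passes the MBR filter but fails the exact test, which is why the second step is still needed.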
Order Dependence of a Query
Query: Select all households within 3 km of a store that have an income greater than $100,000
1. Select all households with an income greater than $100,000; from this selected set, select all households within 3 km of a store
2. Select all households within 3 km of a store; from this selected set, select all households with an income greater than $100,000
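The two orderings return the same households but do different amounts of work. The sketch below counts how many households need the distance test under each ordering; the household records, the single store location, and the function names are made up for illustration, and which ordering is cheaper in practice depends on the data and on the indexes available.

```python
import math

# Hypothetical data: (id, x_km, y_km, income) and store locations in km.
HOUSEHOLDS = [
    (1, 1.0, 1.0, 120_000),
    (2, 2.0, 2.0, 80_000),
    (3, 5.0, 5.0, 150_000),
    (4, 0.5, 2.0, 200_000),
    (5, 9.0, 9.0, 50_000),
]
STORES = [(0.0, 0.0)]


def near_store(x, y, stores, radius_km=3.0):
    return any(math.hypot(x - sx, y - sy) <= radius_km for sx, sy in stores)


def income_then_distance(households, stores):
    """Ordering 1: attribute filter first, spatial filter second."""
    rich = [h for h in households if h[3] > 100_000]
    hits = [h[0] for h in rich if near_store(h[1], h[2], stores)]
    return hits, len(rich)  # distance test ran on the rich subset only


def distance_then_income(households, stores):
    """Ordering 2: spatial filter first, attribute filter second."""
    near = [h for h in households if near_store(h[1], h[2], stores)]
    hits = [h[0] for h in near if h[3] > 100_000]
    return hits, len(households)  # distance test ran on every household


ids_1, distance_tests_1 = income_then_distance(HOUSEHOLDS, STORES)
ids_2, distance_tests_2 = distance_then_income(HOUSEHOLDS, STORES)
print(ids_1, distance_tests_1, ids_2, distance_tests_2)
```

With this sample data, ordering 1 runs the distance test on three households and ordering 2 on all five, while both select households 1 and 4.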
Distributed Databases
www.midcarb.org
References
Longley et al., Geographic Information Systems and Science, 2001, Chapter 11
Guenther, Environmental Information Systems, 1998, Chapter 3
Final Few Weeks
Lecture: April 15, Metadata and Interoperability
Lab: April 17 (next Thursday), project/problem set work
I’ll spend a few minutes with each of you to get an update on your
progress.
• Article review due April 17
Lab: April 22, project lab session.
Lecture: April 24, GIS in decision-making
Project Presentation: May 8