Spatial and Geographic Databases
Course outline
Fundamentals of GIS - Overview
Spatial and Geographic Data(bases)
Why Study GIS? What is GIS? What’s in a GIS?
GIS vs. Other Systems - How GIS differs from Related Systems
GIS System-Architecture and Components
GIS Spatial Data Model
GIS Spatial and Attribute Data
How a GIS Organizes Spatial Data?
Raster and Vector data Model - Spaghetti & Topologic Vector Data Model
Representing Surfaces – DEM, TIN, Contour (isolines) Lines
File Formats for Raster and Vector data models
Spatial Database Management?
Querying Data & Indexing of Spatial Data
Sources of Geographic Data
Appendix - GIS Software Packages, GIS File Formats
Fundamentals of GIS - Overview
Spatial and Geographic Data(bases)
Spatial databases store information related to spatial locations, and
support efficient storage, indexing and querying of spatial data.
“Special index structures are important for accessing spatial data, and for processing spatial join queries.”
Examples of geographic/spatial data
 map data for vehicle navigation
 distribution network information for power, telephones, water supply, and sewage
Vehicle navigation systems store information about roads and services for the use of
drivers:
 Spatial data: e.g., road/restaurant/gas-station coordinates
 Non-spatial data: e.g., one-way streets, speed limits, traffic congestion
Global Positioning System (GPS) unit
 utilizes information broadcast from GPS satellites to find the current location of user with an
accuracy of tens of meters.
 increasingly used in vehicle navigation systems as well as utility maintenance applications.
Geographic databases store geographic information (e.g., maps):
often called geographic information systems or GIS.
Why Study GIS?
80% of local government activities estimated to be geographically based
plats, zoning, public works (streets, water supply, sewers), garbage collection, land ownership
and valuation, public safety (fire and police)
A significant portion of state government has a geographical component
natural resource management, highways and transportation
Businesses use GIS for a very wide array of applications
 retail site selection & customer analysis
 logistics: vehicle tracking & routing
 natural resource exploration (petroleum, etc.)
 precision agriculture, civil engineering and construction
Military and defense
 Battlefield management
 Satellite imagery interpretation
Scientific research employs GIS
 geography, geology, botany
 anthropology, sociology, economics, political science
 Epidemiology, criminology
Knowledge Base for GIS
GIS: the convergence of technological fields and traditional disciplines.
Computer Science/MIS: graphics, visualization, database, system administration, security
Geography and related: cartography, geodesy, photogrammetry, landforms, spatial statistics
Application Area: public admin., planning, geology, mineral exploration, forestry, site selection, marketing, civil engineering, criminal justice, surveying
Applied GIS - Examples
Urban Planning,
Management & Policy
 Zoning, subdivision planning
 Land acquisition, Economic development
 Code enforcement
 Housing renovation programs
 Emergency response
 Crime analysis, Tax assessment
Environmental Sciences
 Monitoring environmental risk
 Modeling storm-water runoff
 Management of watersheds, floodplains, wetlands, forests,
aquifers
 Environmental Impact Analysis
 Hazardous or toxic facility siting
Groundwater modeling and contamination tracking
Political Science
 Redistricting
 Analysis of election results
 Predictive modeling
Civil Engineering/Utility
 Locating underground facilities
 Designing alignment for freeways, transit
 Coordination of infrastructure maintenance
Business
 Demographic Analysis
 Market Penetration/ Share Analysis
 Site Selection
Education Administration
 Attendance Area Maintenance
 Enrollment Projections
 School Bus Routing
Real Estate
 Neighborhood land prices
 Traffic Impact Analysis
 Determination of Highest and Best Use
Health Care
 Epidemiology
 Needs Analysis
 Service Inventory
What is GIS?
Defining Geographic Information Systems (GIS)
A powerful set of tools for
collecting, storing, retrieving, transforming, and displaying spatial data from the
real world. (Burrough, 1986)
A computerized database management system for
the capture, storage, retrieval, analysis and display of spatial (locationally
defined) data. (NCGIA, 1987)
A decision support system involving the integration of spatially referenced
data in a problem solving environment. (Cowen, 1988)
…intuitive description
 A map with a database behind it.
 A virtual representation of the real world and its infrastructure.
 A consistent “as-built” of the real world, natural and manmade
Which is
queried to support on-going operations
summarized to support strategic decision making and policy formulation
analyzed to support scientific inquiry
GIS overview…
What’s in a GIS?
GIS has three main components:
 a Database Management System;
 a Spatial Analytical Toolkit;
 a Mapping Package.
GIS technology integrates common database operations (such as query and
statistical analysis) with the unique visualisation and geographic analysis
benefits offered by maps.
[Figure: GIS as the combination of three components]
Database Management System: attributes
Spatial Analysis Toolkit: route finding, buffering, polygon overlay
Mapping Package: points, lines, areas; features; layers
GIS vs. Other Systems
How GIS differs from Related Systems
DBMS - typical MIS data base contains implicit but not explicit locational information
 city, county, zip code, etc. but no geographical coordinates
 is 100 N. High around the corner or across town from 200 E Main?
Automated mapping (AM) - primarily two-dimensional display devices
 thematic mapping (choropleth,etc such as SAS/GRAPH, DIDS, business mapping software)
unable to relate different geographical layers (e.g zip codes and counties)
 automated cartography--graphical design oriented; limited database ability
Facility management (FM) systems - lack spatial analysis tools
CAD/CAM (computer aided design/drafting) - primarily 3-D graphic creation
(engineering design) & display systems
 don’t reference via geographic location (CAD sees the world as a 3D cube, GIS as a 3D sphere)
 limited (if any) database ability (especially for non-spatial data)
Scientific visualization systems - sophisticated multi-dimensional graphics, but:
 lack database support
 lack two-dimensional spatial analysis tools
GIS as a computer system
GIS System-Architecture and Components
Sub-system definition of GIS
Data input subsystem
collects and processes spatial data from various sources,
allows user to import, create, and edit spatial and tabular data
Data storage and retrieval subsystem
provides storage, retrieval, updating and editing capabilities.
Data manipulation & analysis subsystem
provides tools to examine characteristics
of the data and model building capabilities
(classification, modeling functions).
A reporting subsystem
provides tools for designing/displaying
maps, graphics, text, and tabular reports
GIS main functions
Data acquisition (spatial and non-spatial) & processing
(data management)
Data storage (store data more efficiently)
Data querying & analysis (spatial & statistical): the most important tool
Data output - Visualization
[Figure: GIS components: Data-Input, Query-Input, Geographic Database (“Data-Storage”), Transformation and Analysis, Output: Display and Reporting]
GIS engineering…
We have to deal with:
GIS Adapted Spatial Data Models, formats
Data acquisition, collecting, importing, creating, and editing
Asking about the more powerful data management tool(s)
GIS Querying
Data manipulation
Spatial & statistical analysis
A reporting & Visualization
End-user interface
Web-based front-end…
GIS Spatial Data Model
Allowing
Data acquisition, collecting, importing, creating,
and editing
Asking about the
More powerful Spatial Data Management tools
The GIS Data Model
GIS Spatial Data Model – an overview
allows the geographic features in real world locations to be digitally
represented and stored in a database
so that they
can be abstractly presented in map (analog) form,
and can also be worked with and manipulated to address some
problem
[Figure: from the real world to a map]
GIS Spatial and Attribute Data
GIS includes
Spatial data (specifies location; where).
Spatial data can be stored in or obtained from shapefiles (digital files) or map images
Coordinate system
Latitude (φ) and longitude (λ)
A planar coordinate system is defined by
a pair of orthogonal (x,y) axes drawn through
an origin (φ₀, λ₀)
Descriptive data (attribute; specifies characteristics at that location: what, how much, when).
Descriptive data can be stored in a database table or obtained from reports, statistical
outcomes, etc.
GIS systems traditionally maintain spatial and attribute data
separately, then “join” them for display or analysis.
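The separate-then-join arrangement can be sketched in a few lines of Python. The feature IDs, coordinates and attribute values below are invented for illustration; only the idea of linking the two stores through a shared key comes from the slide.

```python
# Spatial data: feature ID -> geometry (here a point as (x, y)).
# All IDs and values are hypothetical.
spatial = {
    101: (5.0, 6.0),
    102: (19.0, 17.0),
}

# Attribute data: feature ID -> descriptive record (what, how much, when).
attributes = {
    101: {"type": "well", "depth_m": 40},
    102: {"type": "well", "depth_m": 75},
}


def join_layers(spatial, attributes):
    """Join geometry and attributes on the shared feature ID."""
    joined = {}
    for fid, geom in spatial.items():
        record = dict(attributes.get(fid, {}))  # copy so the inputs stay untouched
        record["geometry"] = geom
        joined[fid] = record
    return joined


features = join_layers(spatial, attributes)
print(features[101])
```

A feature with geometry but no attribute row still appears in the result, carrying only its geometry.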
The GIS Data Model
How a GIS Organizes Spatial Data?
A GIS can model the world in a layered way.
In the ‘layered’ approach, data is held (organized) in thematic layers,
i.e. all water features will be contained in a
Hydrology layer, etc.
In this approach a GIS stores information about the
world as a collection of thematic map layers
 Layers are integrated using explicit location on the
earth’s surface, they can be linked together by
geography.
 The thematic layer approach allows us to organise
the complexity of the real world into a simple
representation to help facilitate an understanding of
natural relationships.
[Figure: GIS data as a stack of thematic layers: Administrative Boundaries, Zoning, Buildings, Parcels, Hydrology, Streets, Digital Orthophoto, …]
The GIS Data Model
An example
Here we have three layers or themes:
 roads,
 hydrology (water),
 topography (land elevation)
They can be related because precise geographic
coordinates are recorded for each theme.
Layers consist of two data types
 Spatial data which describes location (where)
 Attribute data specifying what, how much, when
Layers may be represented in two ways:
 in vector format as points, lines and polygons
 in raster (or image) format as pixels
All geographic data has 4 properties:
projection, scale, accuracy and resolution
[Figure: stacked roads, hydrology and topography layers]
Projection, Scale, Accuracy and Resolution
the key properties of spatial data
Projection: the method by which the curved 3-D surface of the earth is represented
by X,Y coordinates on a 2-D flat map/screen; distortion is inevitable
Scale: the ratio of distance on a map to the equivalent distance on the ground
in theory GIS is scale independent but in practice there is an implicit range of scales
for data output in any project
Accuracy: how well does the database info match the real world
 Positional: how close are features to their real world location?
 Consistency: do feature characteristics in database match those in real world
is a road in the database a road in the real world?
 Completeness: are all real world instances of features present in the database?
are all roads included?
Resolution: the size of the smallest feature able to be recognized
for raster data, it is the pixel size
Representing Data with
Raster and Vector Models
Raster Model (Location-based)
 area is covered by grid with (usually) equal-sized, square cells
 attributes are recorded by assigning each cell a single value based on the majority feature
(attribute) in the cell, such as land use type.
 Image data is a special case of raster data in which the “attribute” is a reflectance value from
the electromagnetic spectrum (cells in image data are often called pixels, for “picture elements”)
Vector Model (Object-based)
The fundamental concept of vector GIS is that all geographic features in the real
world can be represented, depending on their shape, as one of:
 points or dots (nodes): trees, poles, fire plugs, airports, cities
 lines (arcs): streams, streets, sewers,
 areas (polygons): land parcels, cities, counties, forest, rock type
[Figure: a point, a line and a polygon]
Vector format often used to represent map data.
 Roads can be considered as two-dimensional and represented by lines and curves.
 rivers, may be represented either as complex curves or as complex polygons, depending on
whether their width is relevant.
 Features such as regions and lakes can be depicted as polygons.
The Raster and Vector data model
An example of Real World containing:
Trees, River, Houses
Real World
Digital representation
Raster Representation
Vector Representation
Vector Data Model
Spaghetti Vector Data Model
Each point, line, or polygon is stored as a record in a file that consists of
that entity’s ID and a list of coordinates that define geometry.
ID    Coordinates
p1    (5,6)
p2    (19,17)
l1    (2,1), (11,8), (13,20)
l2    (13,2), (18,1), (20,5)
a1    (2,12), (8,13), (5,19)
a2    (14,7), (20,8), (19,12), (16,11)
p1, p2: points; l1, l2: lines; a1, a2: polygons
[Figure: the six features plotted on x,y axes with origin 0,0]
Advantages
simple
efficient for display and plotting
Disadvantages
inefficient for most types of spatial analysis
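A minimal Python sketch of the spaghetti structure, using the IDs and coordinates from the slide's table. The helper names `line_length` and `polygon_area` are illustrative, not part of any GIS package. Per-feature measurements are easy because each record is self-contained.

```python
import math

# Spaghetti model: each feature is just an ID plus a coordinate list.
features = {
    "p1": [(5, 6)],
    "p2": [(19, 17)],
    "l1": [(2, 1), (11, 8), (13, 20)],
    "l2": [(13, 2), (18, 1), (20, 5)],
    "a1": [(2, 12), (8, 13), (5, 19)],
    "a2": [(14, 7), (20, 8), (19, 12), (16, 11)],
}


def line_length(coords):
    """Sum of straight-line segment lengths along a polyline."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(coords):
    """Shoelace formula; the ring is closed implicitly."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(coords, coords[1:] + coords[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(polygon_area(features["a1"]))  # 19.5
```

Nothing in these records says which features share a boundary or meet at a point, so questions such as "which polygons are adjacent?" require comparing coordinates across every pair of features. That is the inefficiency the slide refers to.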
Vector Data Model
Topologic Vector Data Model
Composed of points, lines, and polygons
Node: a point at the intersection of three or more lines
In addition to coordinate locations, the topologic relationships among
geometric features are explicitly recorded
[Figure: nodes n1 and n2 joined by arcs a1 to a4, enclosing polygons A, B and C]
Arc Topology
Arc   Start   End   Left   Right
a1    n1      n2    -      A
a2    n1      n2    A      B
a3    n1      n2    C      -
a4    n2      n1    C      B
(- = no polygon on that side)
Arc Coordinate Data
Arc   StartXY   IntermediateXY                 EndXY
a1    4,5       (4,8), (8,8), (8,1), (4,1)     4,3
a2    4,5       (6,7), (6,3)                   4,3
a3    4,5       (1,3)                          4,3
a4    4,3                                      4,5
Node Topology
Node   Arcs
n1     a4, a2, a1, a3
n2     a2, a4, a3, a1
Polygon Topology
ID   Arcs
A    a1, a2
B    a2, a4
C    a3, a4
Planar Enforcement:
No two individual features can overlap.
There are no ‘holes’ or ‘islands’ that are not themselves features.
Every feature is represented as a record in the attribute table.
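To illustrate why explicit topology helps analysis, here is a hedged Python sketch built on the slide's arc table. Using `None` for the side of an arc that has no polygon is an assumption about the blank cells in the table, and the function names are hypothetical.

```python
# Arc records: start node, end node, polygon on the left and on the right
# when travelling from start to end. None = no polygon on that side (assumed).
arcs = {
    "a1": {"start": "n1", "end": "n2", "left": None, "right": "A"},
    "a2": {"start": "n1", "end": "n2", "left": "A", "right": "B"},
    "a3": {"start": "n1", "end": "n2", "left": "C", "right": None},
    "a4": {"start": "n2", "end": "n1", "left": "C", "right": "B"},
}


def neighbours(polygon):
    """Polygons sharing an arc with `polygon`, read straight from the table."""
    result = set()
    for arc in arcs.values():
        sides = {arc["left"], arc["right"]}
        if polygon in sides:
            result |= sides - {polygon, None}
    return result


def boundary(polygon):
    """Arcs bounding `polygon`; reproduces the polygon topology table."""
    return sorted(k for k, a in arcs.items() if polygon in (a["left"], a["right"]))


print(neighbours("B"))  # A and C, in either order
print(boundary("A"))    # ['a1', 'a2']
```

Adjacency is answered by a table lookup with no coordinate comparison at all, in contrast to the spaghetti model.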
Raster, Vector Data Model
Vector layers
Examples
Land Parcels layer: polygons
Street Network layer: lines
Raster (image) Layer:
Digital Ortho Photograph Layer
Overlay based on common geographic location
[Scale bar: 0, 1500, 3000 feet]
Example
Digital Ortho photo: combines the visual properties
of a photograph with the positional accuracy of
a map, in computer readable form.
Projection: State Plane, North Central Texas Zone, NAD 83
Resolution: 0.5 meters
Accuracy: 1.0 meters
Scale: see scale bar
Representing Surfaces – an overview
 Surfaces involve a third elevation value (z) in addition to the x,y
horizontal values
 Surfaces are complex to represent since there are an infinite
number of potential points to model
 Three (or four) alternative digital terrain model
approaches available:
 Raster-based digital elevation model - Regular spaced set of elevation points (z-values)
 Vector based triangulated irregular networks - Irregular triangles with elevations at
the three corners
 Vector-based contour lines - Lines joining points of equal elevation, at a specified
interval
 Massed points and breaklines
 The raw data from which one of the other three is derived
 Massed points: Any set of regular or irregularly spaced point elevations
 Breaklines: point elevations along a line of significant change in slope (valley floor, ridge
crest)
Representing Surfaces – DEM
Digital Elevation Model (DEM or MNT)
A sampled array of elevations (z) that are at regularly spaced
intervals in the x and y directions.
Two approaches for determining the surface z value of a location
between sample points.
 In a lattice, each mesh point represents a value on the
surface only at the center of the grid cell. The z-value is
approximated by interpolation between adjacent sample
points; it does not imply an area of constant value.
 A surface grid considers each sample as a square cell
with a constant surface value.
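The two readings of the same elevation array can be contrasted in a short Python sketch. Bilinear interpolation is used here as one possible interpolation scheme for the lattice; the slide does not name a specific method, and the 2 x 2 array of elevations is made up.

```python
def lattice_value(dem, x, y):
    """Lattice reading: bilinear interpolation between surrounding mesh points.

    dem[row][col] is the elevation at integer position (col, row);
    x and y are non-negative fractional column and row positions.
    """
    c0, r0 = int(x), int(y)
    c1 = min(c0 + 1, len(dem[0]) - 1)
    r1 = min(r0 + 1, len(dem) - 1)
    fx, fy = x - c0, y - r0
    top = dem[r0][c0] * (1 - fx) + dem[r0][c1] * fx
    bottom = dem[r1][c0] * (1 - fx) + dem[r1][c1] * fx
    return top * (1 - fy) + bottom * fy


def surface_grid_value(dem, x, y):
    """Surface-grid reading: each sample is a square cell of constant value."""
    col = min(int(x + 0.5), len(dem[0]) - 1)
    row = min(int(y + 0.5), len(dem) - 1)
    return dem[row][col]


# Hypothetical elevations in metres.
dem = [
    [100.0, 110.0],
    [120.0, 140.0],
]

print(lattice_value(dem, 0.5, 0.5))       # 117.5
print(surface_grid_value(dem, 0.4, 0.4))  # 100.0
```

At the sample points themselves both readings agree; they differ only in between, where the lattice varies smoothly and the surface grid steps from one cell value to the next.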
Advantages
 Simple conceptual model
 Data cheap to obtain
 Easy to relate to other raster data
 Irregularly spaced set of points can be converted to regular
spacing by interpolation
Disadvantages
 Does not conform to variability of the terrain
 Linear features not well represented
[Figure: real-world terrain, the corresponding DEM array of elevation values (e.g. 2124, 2011, 2012, 2022, 1230, 2123, …), and its visualization]
Representing Surfaces – TIN
Triangulated Irregular Network (TIN)
a set of adjacent, nonoverlapping triangles computed
from irregularly spaced points,
with x, y horizontal coordinates
and z vertical elevations.
Representing Surfaces – TIN
Triangulated Irregular Network (TIN) Surface
Elevation points (nodes) chosen
based on relief complexity, and
then their 3-D location (x,y,z)
determined.
Points
Node #   X     Y      Z
1        0     999    1456
2        525   1437   1437
3        631   886    1423
etc
Elevation points connected to form
a set of triangular polygons; these
then represented in a vector
structure.
Polygons
Polygon   Node #s   Topology
A         1,2,4     B,D
B         2,3,4     A,E,C
C         3,4,5     B,F,G
D         1,4,6     A,H
etc
[Figure: TIN with nodes 1 to 6 and triangles A to H]
Attribute data associated via
relational DBMS (e.g. slope,
aspect, soils, etc.)
Attribute Info. Database
Polygon   Var 1   Var 2
A         1473    15
B         1490    100
C         1533    150
D         1486    270
etc.
 Advantages over raster:
fewer points
captures discontinuities (e.g. ridges)
slope and aspect easily recorded
 Disadvantages:
Analysis involving comparison with other layers
difficult
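Within one TIN triangle the surface is a plane through its three nodes, so an elevation at any interior location can be interpolated. The Python sketch below uses barycentric weights. It borrows the coordinates of nodes 1, 2 and 3 from the points table purely as sample numbers; treating those three nodes as one triangle is an assumption for illustration, since the slide's polygons use node 4, whose coordinates are not listed.

```python
def barycentric_z(p, a, b, c):
    """Elevation at p = (x, y) on the plane through 3-D points a, b, c."""
    x, y = p
    ax, ay, az = a
    bx, by, bz = b
    cx, cy, cz = c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (x - cx) + (cx - bx) * (y - cy)) / det
    wb = ((cy - ay) * (x - cx) + (ax - cx) * (y - cy)) / det
    wc = 1.0 - wa - wb
    return wa * az + wb * bz + wc * cz


# Node coordinates (x, y, z) taken from the points table.
n1 = (0, 999, 1456)
n2 = (525, 1437, 1437)
n3 = (631, 886, 1423)

centroid = ((n1[0] + n2[0] + n3[0]) / 3, (n1[1] + n2[1] + n3[1]) / 3)
print(barycentric_z(centroid, n1, n2, n3))  # mean of the three node elevations
```

At a node the function returns that node's elevation; at the centroid it returns the average of the three.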
Representing Surfaces – Isolines
Contour (isolines) Lines
Contour lines, or isolines, of constant elevation at a specified
interval,
Advantages
 Familiar to many people
 Easy to obtain mental picture of surface
 Close lines = steep slope
 Uphill V = stream
 Downhill V or bulge = ridge
 Circle = hill top or basin
Disadvantages
 Poor for computer representation: no formal digital model
 Must convert to raster or TIN for analysis
 Contour generation from point data requires sophisticated interpolation
routines, often with specialized software such as Surfer from Golden
Software, Inc., or ArcGIS Spatial Analyst extension
Storing data issues…
File Formats for Raster Spatial Data
The generic raster data model is actually implemented in
several different computer file formats:
 GRID is ESRI’s proprietary format for storing and processing raster data
 Standard industry formats for image data such as JPEG, TIFF and MrSid formats can
be used to display raster data, but not for analysis (must convert to GRID)
 Georeferencing information required to display images with mapped vector data
(mapping raster to vector…) - Requires an accompanying “world” file which
provides locational information
Image     Image File   World File
TIFF      image.tif    image.tfw
Bitmap    image.bmp    image.bpw
BIL       image.bil    image.blw
JPEG      image.jpg    image.jgw
Although not commonly encountered, a “geotiff” is a single file which incorporates
both the image and the “world” information.
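As a sketch of what the locational information in a world file does: the file holds six numbers, one per line, in the order A, D, B, E, C, F, which define an affine transform from pixel (column, row) to map (x, y). The coefficient values below are invented for a hypothetical `image.tfw`, and the function names are illustrative.

```python
def parse_world_file(text):
    """Read the six coefficients of a world file (one number per line)."""
    a, d, b, e, c, f = (float(line) for line in text.split())
    return a, d, b, e, c, f


def pixel_to_map(col, row, coeffs):
    """Map coordinates of the centre of pixel (col, row)."""
    a, d, b, e, c, f = coeffs
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical image.tfw: 0.5 m pixels, no rotation, made-up origin.
# Line 4 (E) is negative because row numbers increase downwards.
tfw = """0.5
0.0
0.0
-0.5
2400000.25
7000000.75
"""

coeffs = parse_world_file(tfw)
print(pixel_to_map(10, 20, coeffs))  # (2400005.25, 6999990.75)
```

Lines 5 and 6 (C and F) give the map position of the centre of the upper-left pixel, which is why pixel (0, 0) maps straight onto them.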
Storing data issues…
File Formats for Vector Spatial Data
Generic models above are implemented by software vendors in
specific computer file formats
Coverage: vector data format introduced with ArcInfo in 1981
 multiple physical files (12 or so) in a folder
 proprietary no published specs & ArcInfo required for changes
Shape ‘file’: vector data format introduced with ArcView in 1993
 comprises several (at least 3) physical disk files (with extension of .shp,
.shx, .dbf), all of which must be present
 openly published specs so other vendors can create shape files
Geodatabase: new format introduced with ArcGIS 8.0 in 2000
 Multiple layers saved in a single .mdb (MS Access-like) file
 Proprietary, “next generation” spatial data file format
Shapefiles are the simplest and most commonly used format
Building a GIS Information
Spatial Database Management?
How do we incorporate spatial data into a computer application system?
Usually by using a relational Data Base Management System (DBMS)
GIS systems traditionally maintain spatial and attribute data
separately, then “join” them for display or analysis.
 The spatial data can be stored in vector or raster format
 Vector format represents data as a series of (X,Y) coordinates
 Raster format represents data as a matrix of columns and rows (pixels, cells)
 Accuracy
 Vector data are more accurate and take less storage, but take a long time to capture, e.g.
digitization
 Raster data are less accurate and take more storage, but take a short time to capture, e.g.
scanning
Hybrid vs. Integrated Approaches
Hybrid Approach:
stores spatial data and attribute data in different data models (typically
relational data model for attribute data and proprietary data structure for
spatial data).
Integrated Approach:
stores spatial and attribute data using the same data model (typically using
the relational data model in a single RDBMS).
[Figure: diagrams (1) and (2): a SPATIAL OBJECT with ATTRIBUTE and SPATIAL parts stored via a DBMS in the GIS Database]
Spatial Database Management System
Spatial Database Management System (SDBMS) provides the
capabilities of a traditional database management system (DBMS)
while allowing special storage and handling of spatial data.
SDBMS:
 Works with an underlying DBMS
 Allows spatial data models and types
 Supports querying language specific to spatial data types
 Provides handling of spatial data and operations
SDBMS Three-layer Structure
 SDBMS works with a spatial application at the
front end and a DBMS at the back end
 SDBMS has three layers:
 Interface to spatial application
 Core spatial functionality
 Interface to DBMS
[Figure: SDBMS three-layer structure, top to bottom]
Spatial application
Interface to spatial application
Core Spatial Functionality: Taxonomy, Data types, Operations, Query language, Algorithms, Access methods
Interface to DBMS
DBMS
Querying Data
A GIS must provide tools for finding specific features based
on their location or attributes.
Queries, which are often created as logical statements or expressions, are used to select
features on the map and their records in the database.
A common GIS query is to determine what exists at a particular location.
The user knows where the features of interest are, but wants to know what
characteristics are associated with them.
This can be accomplished with a GIS because geographic features on the map
display are linked to their attributes stored in the database.
Another type of GIS query is to determine which location or locations satisfy certain
conditions. In this case, the user knows what characteristics are important and
wants to find out where the features are that have these characteristics.
Querying Data…
Spatial Queries
How many elderly in Richardson live further than
10 minutes at rush hour from ambulance service?
 Nearness queries request objects that lie near a specified location.
 Nearest neighbor queries, given a point or an object, find the nearest object that
satisfies given conditions.
 Region queries deal with spatial regions. e.g., ask for objects that lie partially or
fully inside a specified region, that compute intersections or unions of regions.
 Spatial join of two spatial relations with the location playing the role of join
attribute.
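The first three query types can be sketched over a handful of point features in Python. The station names, coordinates and the `open_24h` attribute are hypothetical, and each function is a plain linear scan rather than an indexed lookup.

```python
import math

# Hypothetical point features: name -> ((x, y), non-spatial attributes).
stations = {
    "north": ((2.0, 9.0), {"open_24h": True}),
    "centre": ((5.0, 5.0), {"open_24h": False}),
    "south": ((6.0, 1.0), {"open_24h": True}),
}


def nearness(point, radius):
    """Nearness query: all objects within `radius` of `point`."""
    return sorted(n for n, (xy, _) in stations.items()
                  if math.dist(point, xy) <= radius)


def nearest_neighbour(point, condition=lambda attrs: True):
    """Nearest object to `point` among those satisfying `condition`."""
    candidates = [(math.dist(point, xy), n)
                  for n, (xy, attrs) in stations.items() if condition(attrs)]
    return min(candidates)[1] if candidates else None


def region(xmin, ymin, xmax, ymax):
    """Region query: objects lying inside an axis-aligned rectangle."""
    return sorted(n for n, ((x, y), _) in stations.items()
                  if xmin <= x <= xmax and ymin <= y <= ymax)


print(nearness((5, 4), 2.0))                                  # ['centre']
print(nearest_neighbour((5, 4), lambda a: a["open_24h"]))     # south
print(region(0, 0, 10, 6))                                    # ['centre', 'south']
```

The second call mixes a spatial condition (nearest) with a non-spatial one (open 24 hours). Because every function visits every object, the cost grows with the size of the data set, which is what spatial index structures are meant to avoid.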
Graphical user interface – constituting the front-end
 Spatial data is typically queried using a graphical query language; results are
also displayed in a graphical manner.
 Extensions of SQL with abstract data types, such as lines, polygons and bit
maps, have been proposed to interface with back-end.
 Queries can use spatial conditions (e.g. contains or overlaps).
 Queries can mix spatial and nonspatial conditions
Querying Data…
Spatial Query Language
Number of specialized adaptations of SQL
 Spatial query language
 Temporal query language (TSQL2)
 Object query language (OQL)
 Object oriented structured query language (O2SQL)
Spatial query language provides tools and structures specifically for
working with spatial data
SQL3 provides 2D geospatial types and functions
Spatial Query Language Operations - Three types of queries:
 Basic operations on all data types (e.g. IsEmpty, Envelope, Boundary)
 Topological/set operators (e.g. Disjoint, Touch, Contains)
 Spatial analysis (e.g. Distance, Intersection, SymmDiff)
Querying Data…
Example Spatial Query
County (Name, State, Population, Shape);
River (Name, Source, length, Shape);
Find all the counties that border on Contra Costa county
SELECT C1.Name
FROM County C1, County C2
WHERE Touch(C1.Shape, C2.Shape) = 1 AND C2.Name = 'Contra Costa';
Find all the counties through which the Merced river runs
SELECT C.Name, R.Name
FROM County C, River R
WHERE Intersect(C.Shape, R.Shape) = 1 AND R.Name = 'Merced';
Analysis
 How could the County and River entities be implemented?
 How could the above Touch and Intersect operations be implemented?
One approach: object-oriented facilities may provide more adequate support.
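One way an Intersect test between a county polygon and a river polyline could be written is sketched below in Python. It is not the implementation behind any particular spatial SQL dialect. It assumes a simple polygon without holes, given as a list of vertices, and returns 1 or 0 to mirror the `= 1` comparison in the queries. The shapes in the usage lines are invented.

```python
def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1 left, -1 right, 0 collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def on_segment(p, q, r):
    """True if r, already known to be collinear with p-q, lies between p and q."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(a, b, c, d):
    """True if segment a-b and segment c-d share at least one point."""
    o1, o2 = orient(a, b, c), orient(a, b, d)
    o3, o4 = orient(c, d, a), orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True
    return ((o1 == 0 and on_segment(a, b, c)) or (o2 == 0 and on_segment(a, b, d))
            or (o3 == 0 and on_segment(c, d, a)) or (o4 == 0 and on_segment(c, d, b)))


def point_in_polygon(pt, ring):
    """Ray casting; points exactly on the boundary are not treated specially."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def intersect(polygon, polyline):
    """1 if the polyline crosses the boundary or lies inside the polygon, else 0."""
    edges = list(zip(polygon, polygon[1:] + polygon[:1]))
    for s, t in zip(polyline, polyline[1:]):
        if any(segments_intersect(s, t, e1, e2) for e1, e2 in edges):
            return 1
    return 1 if point_in_polygon(polyline[0], polygon) else 0


county = [(0, 0), (4, 0), (4, 4), (0, 4)]      # hypothetical county shape
print(intersect(county, [(-1, 2), (5, 2)]))    # 1: river crosses the county
print(intersect(county, [(5, 5), (6, 6)]))     # 0: river stays outside
```

In an object-oriented design these functions would become methods of the Shape type, so that County and River rows simply carry Shape-valued columns.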
Querying Data…
Analysis
[Figure: analysis example linking a data table and a photographic image]
More examples…
Highway construction requires
spatial intersection analysis
Dam construction
requires 3D analysis
Radar location
3D study area - risk analysis
Providing “Building permits” requires
spatial analysis – flooding risk
analysis
Spatial indexing-an overview
Indexing of Spatial Data
 k-d tree - early structure used for indexing in multiple dimensions.
 Each level of a k-d tree partitions the space into two.
 choose one dimension for partitioning at the root level of the tree.
 choose another dimensions for partitioning in nodes at the next level and so
on, cycling through the dimensions.
 In each node, approximately half of the points stored in the subtree fall on one side and half on the other.
 Partitioning stops when a node has less than a given maximum
number of points.
 The k-d-B tree extends the k-d tree to allow multiple child nodes for
each internal node; well-suited for secondary storage.
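The partitioning rule described above can be sketched as a small in-memory k-d tree in Python. This is a simplified illustration, not the k-d-B tree: nodes are plain dictionaries, the split value is the median coordinate, and the sample points are made up.

```python
def build_kdtree(points, depth=0, max_points=1, k=2):
    """Recursively split `points`, cycling through the k dimensions."""
    if len(points) <= max_points:
        return {"leaf": True, "points": points}
    axis = depth % k  # root splits on dimension 0, next level on 1, and so on
    points = sorted(points, key=lambda p: p[axis])
    split = points[len(points) // 2][axis]  # median: about half on each side
    left = [p for p in points if p[axis] < split]
    right = [p for p in points if p[axis] >= split]
    if not left:
        # Lower half shares one coordinate; stop here to keep the sketch simple.
        return {"leaf": True, "points": points}
    return {
        "leaf": False,
        "axis": axis,
        "split": split,
        "left": build_kdtree(left, depth + 1, max_points, k),
        "right": build_kdtree(right, depth + 1, max_points, k),
    }


def contains(tree, point):
    """Follow one root-to-leaf path to look up a point."""
    while not tree["leaf"]:
        side = "left" if point[tree["axis"]] < tree["split"] else "right"
        tree = tree[side]
    return point in tree["points"]


pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(pts)
print(tree["axis"], tree["split"])  # 0 7
print(contains(tree, (4, 7)))       # True
```

A lookup inspects only one branch per level, so its cost depends on the depth of the tree rather than on the number of points.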
Spatial indexing-an overview
Division of Space by k-d Trees and Quadtrees
 Each line in the figure (other than the outside box)
corresponds to a node in the k-d tree
 the maximum number of points in a leaf node has
been set to 1.
 The numbering of the lines in the figure indicates
the level of the tree at which the corresponding node
appears.
 Each node of a quadtree is associated with a
rectangular region of space; the top node is
associated with the entire target space.
 Each non-leaf node divides its region into four
equal sized quadrants
 correspondingly each such node has four child
nodes corresponding to the four quadrants and so on
 Leaf nodes have between zero and some fixed
maximum number of points (set to 1 in example).
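A minimal Python sketch of such a quadtree for points follows. The class and method names are illustrative, the region is assumed to be square, and the points are assumed to be distinct (two identical points would keep splitting forever with a maximum of 1).

```python
class Quadtree:
    """Point quadtree over a square region, split into four equal quadrants."""

    def __init__(self, x, y, size, max_points=1):
        self.x, self.y, self.size = x, y, size  # lower-left corner, side length
        self.max_points = max_points
        self.points = []
        self.children = None  # None while this node is a leaf

    def insert(self, p):
        if self.children is None:
            self.points.append(p)
            if len(self.points) > self.max_points:
                self._split()
            return
        self._child_for(p).insert(p)

    def _split(self):
        h = self.size / 2
        self.children = [
            Quadtree(self.x, self.y, h, self.max_points),          # SW
            Quadtree(self.x + h, self.y, h, self.max_points),      # SE
            Quadtree(self.x, self.y + h, h, self.max_points),      # NW
            Quadtree(self.x + h, self.y + h, h, self.max_points),  # NE
        ]
        pts, self.points = self.points, []
        for p in pts:  # push the stored points down into the new quadrants
            self._child_for(p).insert(p)

    def _child_for(self, p):
        h = self.size / 2
        col = 1 if p[0] >= self.x + h else 0
        row = 1 if p[1] >= self.y + h else 0
        return self.children[row * 2 + col]

    def depth(self):
        if self.children is None:
            return 0
        return 1 + max(c.depth() for c in self.children)


qt = Quadtree(0, 0, 8)
for p in [(1, 1), (6, 6), (5, 7)]:
    qt.insert(p)
print(qt.depth())  # 2
```

The quadrant boundaries depend only on the region, not on the points, so crowded areas end up with deeper subtrees while empty quadrants stay as single leaves.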
Spatial indexing-an overview
Quadtrees (Cont.)
 PR quadtree: stores points; space is divided based on regions,
rather than on the actual set of points stored.
 Region quadtrees store array (raster) information.
 A node is a leaf node if all the array values in the region that it covers are
the same. Otherwise, it is subdivided further into four children of equal area,
and is therefore an internal node.
 Each node corresponds to a sub-array of values.
 The sub-arrays corresponding to leaves either contain just a single array
element, or have multiple array elements, all of which have the same value.
 Extensions of k-d trees and PR quadtrees have been proposed to
index line segments and polygons
 Require splitting segments/polygons into pieces at partitioning boundaries
 Same segment/polygon may be represented at several leaf nodes
R-Trees
Supported in many modern database systems,
along with variants like R+-trees and R*-trees.
 R-trees are an N-dimensional extension of B+-trees, useful for indexing sets of rectangles and other
polygons.
Basic idea: generalize the notion of a one-dimensional interval associated with each B+ -tree
node to an N-dimensional interval, that is, an N-dimensional rectangle.
 A rectangular bounding box is associated with each tree node.
 Bounding box of a leaf node is a minimum sized rectangle that contains all the rectangles/polygons associated
with the leaf node.
 The bounding box associated with a non-leaf node contains the bounding box associated with all its children.
 Bounding box of a node serves as its key in its parent node (if any)
 Bounding boxes of children of a node are allowed to overlap
 A polygon is stored only in one node, and the bounding
box of the node must contain the polygon
We will consider only the case N = 2
A set of rectangles (solid line) and the
bounding boxes (dashed line) of the nodes
of an R-tree for the rectangles.
The R-tree is shown on the right.
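The bounding-box idea and the way it prunes a search can be sketched in Python for N = 2. This is deliberately partial: the tree is assembled by hand from hypothetical rectangles, and the insertion and node-splitting logic of a real R-tree is omitted.

```python
def bbox_union(boxes):
    """Smallest rectangle (xmin, ymin, xmax, ymax) containing all `boxes`."""
    xs0, ys0, xs1, ys1 = zip(*boxes)
    return (min(xs0), min(ys0), max(xs1), max(ys1))


def overlaps(a, b):
    """True if rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def make_node(children, leaf):
    """children: list of (bbox, payload); payload is an object id or a child node."""
    return {"leaf": leaf, "children": children,
            "bbox": bbox_union([b for b, _ in children])}


def search(node, query):
    """Ids of all stored rectangles whose bounding box overlaps `query`."""
    hits = []
    for bbox, payload in node["children"]:
        if overlaps(bbox, query):  # subtrees whose box misses the query are skipped
            if node["leaf"]:
                hits.append(payload)
            else:
                hits.extend(search(payload, query))
    return hits


# Hand-built two-level tree over four hypothetical rectangles.
leaf1 = make_node([((0, 0, 2, 2), "A"), ((1, 1, 3, 4), "B")], leaf=True)
leaf2 = make_node([((6, 5, 8, 7), "C"), ((7, 1, 9, 3), "D")], leaf=True)
root = make_node([(leaf1["bbox"], leaf1), (leaf2["bbox"], leaf2)], leaf=False)

print(search(root, (2.5, 3, 6.5, 6)))  # ['B', 'C']
```

Because the bounding boxes of sibling nodes may overlap, a search can descend into more than one child, as it does here; a query that misses a node's box never visits anything below it.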
Sources of Geographic Data
Land surveys and GPS
Aerial photography and photogrammetry
Satellite remote sensing
U. S. Census
Existing paper maps
Sources of Geographic Data
Land Surveys
 Specifies boundaries, rights-of-way, and other legal descriptions
 Surveyors use optical and electronic instruments to measure precise
control point locations established by geodesists
 High quality data, but takes a lot of time
GPS
 Global Positioning Systems
 Earth-orbiting satellites broadcast precisely timed radio signals
 GPS receivers determine positions on the ground by calculating
distances from three or more satellite transmitters
 More expensive than traditional optical and electronic methods
Sources of Geographic Data
Aerial Photography and Photogrammetry
 Field concerned with producing geographic data from aerial
photographs
 Orthophoto: aerial photograph in which scale variations
have been rectified.
Satellite Remote Sensing
 Technologies and procedures used to measure and
record energy emitted by the Sun, and reflected or
emitted by the Earth
 Help predict the behavior of Earth's environmental systems
Census
Primary sources of demographic attribute data
Valued by businesses, direct mail, and trade area analyses, and by social
scientists who seek to understand the behavior of social systems
Building an end-user map…
Land Surveys
inspection
GPS
Census
Aerial/Satellite Photography; Orthophoto
Satellite Remote Sensing
Existing Paper Maps
Appendix
GIS Software Packages
And GIS File Formats
Software for GIS: The Main Players
ESRI, Inc., Redlands, CA
 clear market leader with about a third of the market
 originated commercial GIS with their ArcInfo product in 1981
 privately owned by Jack Dangermond, a legend in the field
 Strong in gov., education, utilities and business logistics
MapInfo, Troy N.Y.
 Aggressive newcomer in early 1990s, but now well-established.
 Strong presence in business, especially site selection & marketing, and telecom
ESRI and MapInfo are the main two “pure GIS” companies.
Intergraph (Huntsville, AL)
 origins in proprietary CAD hardware/software
 Older UNIX-based MGE (Modular GIS Environment) evolved from CAD
 Current GeoMedia was the first true MS Windows-based GIS
 strong in design, public works, and FM (facilities management), but weakening
Bentley Systems (Exton, PA)
 MicroStation GeoGraphics, originally developed with Intergraph, is now their exclusive and main product.
 Strong in engineering; advertises itself as “geoengineering”
Autodesk (San Rafael, CA)
 Began as PC-based CAD, but now the dominant CAD supplier
 First GIS product AutoCAD Map introduced in 1996
 Primarily small business/small city customer base
ESRI Product Line-up: ArcGIS client products (Fall 2007)
ArcReader (“adobe acrobat” for maps) & ArcExplorer (spatial data viewer)
Free viewers for geographic data.
ArcGIS 9.x Desktop: two primary modules (MS only)
 ArcMap: for data display, map production, spatial analysis, data editing
 ArcCatalog: for data management and preview
 ArcToolbox, for specialized data conversions and analyses, available as a window in both
 Available capabilities within these modules are “tiered” in three levels
 ArcView: viewing, map production, spatial analysis, basic editing:
 ArcEditor: ArcView, plus specialized editing:
 ArcInfo: ArcView & ArcEditor plus special analyses and conversions:
 Extensions: for special apps.: Spatial Analyst, 3D Analyst, Geostatistics, Business Analyst, etc.
 ArcObjects: to build specialized capabilities within ArcMap or ArcCatalog using VB for Applications
ArcGIS Workstation (for UNIX and MS) - the old command line ArcInfo 7.1
ArcGIS Engine (MS NT/2000/XP)
 Set of embeddable GIS components (ArcObjects software objects) for use in building custom applications
 Runs under Windows, Unix and Linux, with support for Java, C++, COM and .NET
 Replaces MapObjects which were based upon a previous generation of GIS objects
Notes:
ArcView 3.3 the only GUI option for UNIX.
ArcGIS 8 released 2000 to integrate two previous standalone products: ArcView and ArcInfo
ArcGIS 9 released 2004 providing the full capability that should have been in ArcGIS 8!!!
--full support for all data types (coverages, shapefiles, geodatabases)
--full support for all previous geoprocessing analyses
--Modelbuilder for scripting and repetitive processing
--ArcEngine for building custom applications
ESRI Product Line-up: ArcGIS server products (Fall 2007)
 ArcGIS Server: three tiers of capability
 Data services: ArcSDE (Spatial Database Engine)
 middleware to support spatial data storage in standard DBMS on server
 Supports all major industry databases: Oracle, SQL Server, IBM DB2, Informix
 Map services: ArcIMS (Internet Map Server)
 Provides maps and simple query to a user without a desktop GIS
 Accessed via web interface
 Analytic services:
 Permits the creation of server-based specialized GIS applications
 Provides full range of GIS capabilities to a user without a desktop GIS
 Accessed via web interface
 (prior to 9.2 these were sold as three separate products)
 ArcGIS On-line Services
 On-line services made available on the Internet with a subscription
 Normally charged on a “per transaction” basis, but can be flat fee
 built and operated by ESRI (or others), usually based on ArcGIS Server
ESRI ArcGIS System
[Diagram, source: ESRI with modifications. Components shown:]
 Clients: ArcExplorer and browser (via Internet); ArcView, ArcEditor and ArcInfo, each with ArcMap, ArcCatalog and ArcToolbox (consistent interface, increasing capability); ArcPad (handheld/wireless); ArcGIS Workstation (command-line ArcInfo)
 Application development & customization: ArcEngine/ArcObjects
 Services: ArcServer (full GIS analysis), ArcIMS (map display & query), ArcSDE (database storage/access)
 Data: files (personal geodatabase, shapefiles, coverages, grids, TINs, etc.) and databases (multi-user geodatabases in Oracle, SQL Server, IBM DB2, etc.)
Future Generic GIS Internet Enterprise
[Diagram: applications and browsers connect over the Web to a web server and broker, which link to services (built on .Net, SOAP/XML, Java API) and databases at distributed sites (Dallas, Delhi, Durban).]
Vendor Implementation of GIS Data Structures: file formats
 Raster, vector, TIN, etc. are generic models for representing spatial information in digital form
 GIS vendors implement these models in file formats or structures which may be
 Proprietary: usable only with that vendor’s software (e.g. ESRI coverage)
 Published: specifications available for use by any vendor (e.g. ESRI shapefile, or the military VPF format)
 Transfer formats: intended only for transfer of data
 between different vendors’ systems (e.g. AutoCAD .dxf format, or SDTS)
 between different users of the same vendor’s software (e.g. ESRI’s E00 format for coverages)
 One GIS vendor may be able to read another vendor’s file format:
 By translation, whereby the format is converted externally to the vendor’s own format
 Usually requires user to carry out conversion prior to use of data
 On-the-fly, whereby conversion is accomplished internally and “automatically”
 No user action needed, but usually no ability to change data
 Natively, or transparently, which normally implies
 No special user action needed
 ability to read and write (change or edit) the data
Common GIS & CAD File Formats
ESRI
 Coverages (vector--proprietary)
 E00 (“E-zero-zero”) for coverage exchange between ESRI users
 Shapefiles (vector--published) .shp
 Geodatabase (proprietary) .gdb
  based on current object-oriented software technology
 GRID (raster)
AutoCAD
 AutoCAD .DWG (native)
 AutoCAD .DXF for digital file exchange
Intergraph/Bentley
 Bentley MicroStation .DGN
 Intergraph/Bentley .MGE
Spatial Data Transfer Standard (SDTS)
– US federal standard for transfer of data
– Federal agencies legally required to conform
– embraces the philosophy of self-contained transfers, i.e. spatial data, attributes, georeferencing, data quality report, data dictionary, and other supporting metadata are all included
– Not widely adopted because of competitive pressures, complexity, and the perceived disutility derived from that philosophy
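As a rough illustration of the format list above, the sketch below maps a file extension to the vendor and format named on the slide. The function name `identify` and the table `FORMATS` are hypothetical, the `.e00` extension is an assumption (the slide names the format, not the extension), and only formats listed above are included.

```python
from pathlib import Path

# Extension -> (vendor, description), taken from the slide's format list.
# ".e00" is assumed to be the extension used for E00 export files.
FORMATS = {
    ".shp": ("ESRI", "shapefile (vector, published)"),
    ".e00": ("ESRI", "E00 coverage export (transfer format)"),
    ".gdb": ("ESRI", "geodatabase (proprietary)"),
    ".dwg": ("AutoCAD", "native drawing format"),
    ".dxf": ("AutoCAD", "digital file exchange (transfer format)"),
    ".dgn": ("Bentley", "MicroStation .DGN"),
}


def identify(path):
    """Return (vendor, description) for a known extension, else None."""
    return FORMATS.get(Path(path).suffix.lower())


if __name__ == "__main__":
    for name in ("roads.SHP", "site_plan.dxf", "notes.txt"):
        print(name, "->", identify(name))
```

An extension is only a hint; a coverage, for example, is a folder of files rather than a single file, so it cannot be recognized this way.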
ESRI Vector File Formats: “Georelational”
Shape ‘file’: native GIS data structure for a vector layer in ArcView
 not fully topological
 limited info about relationship of features one to another
 draw faster
 not as good for some fancy spatial analyses
 is a ‘logical’ file which comprises several (at least 3) physical disk files, all of which must be present for AV to read the theme
  layer.shp (geometric shape described by XY coords)
  layer.shx (indices to improve performance)
  layer.dbf (contains associated attribute data)
  layer.sbn layer.sbx
 not really a database, although ArcView presents files to user via relational concepts
 openly published specs so other vendors can develop shape files and read them
Coverage: native GIS data structure for a vector layer in ArcInfo
 fully topological
 better suited for large data sets
 better suited for fancy spatial analyses
 comprises multiple physical files (12 or so) per coverage
 each coverage saved in a separate folder named same as the coverage
 physical file set differs depending on type of coverage (point, line, polygon)
 coverage folders stored in a “workspace” directory with an info folder for tracking
  attribute tables stored there also
 ARC/INFO required to make changes
 proprietary: no published specs.
E00 Export Files: format for export of coverages to other ESRI users
 IMPORT71 utility in ArcView Start Menu can read E00 files and convert them back to coverages
 Must convert to shapefile or AutoCAD .dxf format to transfer to a non-ESRI GIS system
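Because the shapefile specification is openly published, the "logical file" idea can be checked with a few lines of code. The sketch below is a minimal illustration, not a full reader: it reports which of the three mandatory physical files (.shp, .shx, .dbf) are missing, and decodes the fixed 100-byte header of the .shp file as laid out in the published ESRI specification. The function names `missing_parts` and `read_shp_header` are hypothetical, and only the basic 2D shape types are named.

```python
import struct
from pathlib import Path

REQUIRED_PARTS = (".shp", ".shx", ".dbf")  # the three mandatory physical files
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}


def missing_parts(shp_path):
    """Return the mandatory extensions that are absent next to the .shp path."""
    shp = Path(shp_path)
    return [ext for ext in REQUIRED_PARTS if not shp.with_suffix(ext).exists()]


def read_shp_header(shp_path):
    """Decode the fixed 100-byte header at the start of a .shp file."""
    with open(shp_path, "rb") as f:
        header = f.read(100)
    if len(header) < 100:
        raise ValueError("file too short to hold a shapefile header")
    # File code and file length are big-endian; the remaining fields are little-endian.
    (file_code,) = struct.unpack(">i", header[0:4])
    (length_words,) = struct.unpack(">i", header[24:28])
    version, shape_type = struct.unpack("<ii", header[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    if file_code != 9994:
        raise ValueError("not a shapefile (file code is not 9994)")
    return {
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_type, f"type {shape_type}"),
        "bbox": (xmin, ymin, xmax, ymax),
        "size_bytes": length_words * 2,  # length is stored in 16-bit words
    }
```

Calling `missing_parts("layer.shp")` before opening a layer gives a quick explanation when software refuses to load it; the optional .sbn/.sbx files are deliberately not checked.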