Database Management Issues of interest to Address Databases

Download Report

Transcript Database Management Issues of interest to Address Databases

Database Management Issues
of interest to Address
Databases
Database Overview Agenda
•
•
•
•
•
•
Database Components
Example Data Types
Table Indexes
Domains
Joins and Views
Foreign and Primary
Keys
Database Components
A database is the sum of all information you have obtained.
Database
Table 3
Table 1
col1
col2
col3
col4
Record/Row 1
Table 2
Table 3
Record/Row 2
Table 4
Record/Row3
Table 5
Record/Row4
Table 6
Table 7
Table 8
col5
Columns/
Fields
Sample Column Data Types
Character
Stores a maximum of 240 ASCII characters.
Integer
Stores an integer in a range -2,147,483,648 to 2,147,483,647
Smallint
Stores an integer value in the range -32,768 to 32,767
Double
Stores a real value in double precision floating point format
Real
Stores a real number value as a single precision floating point
Decimal
Stores a fixed point decimal number with a optional precision
and scale
Timestamp
Stores a timestamp with ‘yyyy-mm-dd:hh:mm:ss’ format
Table Indexes
• A table index contains information from a
specified table and column
• The index allows you to sort information by
column and place this information in a table
• Indexes can be placed on columns that are
frequently used in queries and have few
repeating values
• Indexes help to improve performance on queries
• A unique index can be created on a column that
will have unique values for each record
Domains and DB Integrity
• A domain allows you to check the validity of an entry into a
column in a database table against a corresponding set of
allowable values for that column
• Two types of domains exist
– Range domain -- used with numeric data and consists of one
or more inclusive minimum-maximum ranges
– List domain -- used with character data and consist of a set of
character strings
• Domains are stored in a series of domain tables
Relational Join
• A Join is a linkage between two tables in the database
• Columns from each table with like data types are used to
establish the join relationship
• There must be one identical value in the joined columns in
each table to complete the union
parcel_id
mslink
mapid
parcel_no
county_name
area_sqft
owner
parcel
parcel_id
assessed_value
zone_class
school_district
land_use
Database Views
• A view is a window that allows you to
analyze selected columns of joined tables.
• A view can be defined using either a single
join or multiple join relationships (ie using
several DB columns).
• Views are used for Query, Analysis and
reporting of Database Values.
• Makes huge DB Tables more user-friendly.
Primary Keys
Just Call them the “Record ID’s”
in a Database Table...
General Guidelines
• Should be numeric
• Must be Unique
• Do not change
• The shorter the better
• Automatically Generated
is best...
Column Names
*
Foreign Keys
Essentially the “Linkage Columns”
between Database Tables...
• Should/Must have a matching column in
another table with, at least some,
matching values.
• Require extensive planning during
Database Development Phase.
• Should be unique and numeric, but
don’t have to be….
What’s a Cartographic Feature
• Something from the real world represented in your
digital map: streets, streams, houses, trees, etc.
• A graphic element that contains a pointer to a
record in the Feature table
CAD Graphics Table
mslink
1
fname fcode ftype
road rd1
line
table
22
category fweight
3
2
flevel
fcolor
fstyle
29
0
0
Digital Map
digcmd
other...
Database Linkages on
CAD Graphics
“Old” DB Table ID
CAD File Graphics
record ID in “Old” Table
DMRS 8000 0004 0005 0000
DMRS 8000 0022 0014 0000
feature link
attribute link
The Database software will interpret the
“old linkage code” to determine what table the
graphic elements “points” to.
Relational Databases (example)
Feature Table
mslink
1
fname fcode ftype
road rd1
line
table
category fweight
22
3
flevel
fcolor
fstyle
29
0
0
2
digcmd
other...
Category Table
mslink
cname
3
trans
indexname indexlevel
CAD File Graphics
Maps Table
mslink
mapname
category
12
road1.dgn
3
roads Table
mslink
mapid
1
12
rd_name num_l
test
2
traffic
county
10100
morgan
MSCATALOG
Tablename
feature
roads
enitynum
4
22
nextocc
7
2
Joining Tables (DBMS)
Street Name Table
1:2
Sides Table
1: Many
Segment Table
Foreign Keys in each
table are used to complete
the Join Relationships from
Table to Table.
Joining Tables (example)
Segment Table in Graphics
Street Name Table
1:1
Segment Table
Master Address File
Address Database Design
Issues
• Determine your “Audience” and their
needs.
• What’s your Geographic Extent?
• What Partnerships should be established?
• Establish Standards Early!
• Re-evaluate those Standards Regularly
Address Tables
• Parse the entire Address Record
– Always easier for “us average” DB users to
“merge” columns, rather than “split” them.
• Pay close attention to Primary Keys and
Foreign Keys during the Design and
Testing Phase.
• Conduct a Pilot Study for the entire
database structure before going “live”.
Address Tables (continued)
• Use Domains to control user input at
EVERY opportunity!
– List Domain (valid Street Names)
– Range Domains (valid numeric ranges)
• If gathering new addresses from more
than one source, collect them in “dummy”
tables before the DB “gatekeeper” cleans
them up and dumps them into the “master
database”.
Address Data Entry
Specific data entry recommendations include:
1) Zip code entry first, with automatic fill of State and (optionally)
locality data.
2) Support on-line entry with help screens, pop-up valid values access,
and immediate edits.
3) Secondary unit data entry separate from street address (optionally
before street for emphasis).
4) Addresses entered with manual overrides of edits should be flagged
for future review.
5) Allow search for Zip code given City and State (optional).
Recommended Address Edits
Several types and levels of edits may be practical, depending on
circumstances and business purpose.
1) Check entered data for valid abbreviations. (Abbreviation standards
used by the USPS are included in Appendix B.)
2) Compare entered location(City) and State to Zipcode (based on
GCS or equivalent table information).
3) Check Zipcode for validity (based on GCS or equivalent table
information).
4) Compare entered address against valid addresses: Against an
existing database containing addresses (within the enterprise)
Recommended Address Edits
(continued)
5) Verify and correct the standard use of state code, standard spelling
for city; and presence of standard street type.
6) Inspect Street numbers that seem to represent ranges of addresses,
such as street numbers in a range or the use of terms such as
"scattered sites". (This only applies for those applications that
receive addresses representing, for example, blocks of apartments).
7) Identify and correct building name substitutions for street addresses
to the extent possible. Using COTS software modules, against a
postal-service database of 140 million valid addresses.
8) If County Code is missing, generate County Code.
Recommended Address Edits
(continued)
9) Identify where range of latitude or longitude is more than 5 miles.
Inspect and correct.
(This is a way to measure if the geocoding center is of a Zip code,
rather than to a specific street address. This is unnecessary if the
geocoding level is specified in a code, as is recommended).
10) Identify and delete official verbiage. For example: "Township of",
"The Commonwealth of", "The Great State of".
11) Comma Check. The USPS recommends not using commas or other
dividers within addresses, except the hyphen in Zip+4. The USPS
further recommends all capital letters, to aid machine readability.
Recommended Address Edits
(continued)
12) Enforce Business Rules.
For example, it may be a rule that P.O. Box numbers (and
equivalent) may not substitute for Street names (and equivalent) if
the address is for a property in which the enterprise holds an interest
(as opposed to the mailing address of an individual or organization).
Database Loading Tools
useful for enhancing your
GBF data.
• Bulk Update
– Attributes
• Area Loader
– Polygons
• Length Loader
– Lines
• Point Loader
– X, Y coords. from DB
• Label Loader
– from graphics to the
database!
Third-Party Database
“Scrubbers”
• Clean up un-parsed Address Databases.
• Remove duplicate records or misspellings.
• Can even Geocode database records for
you.
• Some provide CASS certified services for
Address clean-up.
Web Sites of Interest
• http://www.nonprofitmailers.org/vendors/page6old.htm
• http://www.census.gov/geo/www/tiger/vendors.html
• http://www.nena.org/ads/prodvend.htm