Transcript 01/24/01

Geog 357:
Data models and DBMS
Geographic Decision Making
Ways of storing digital data
• File structures
– simple
– ordered sequential
– indexed
• Data models
• Databases
– hierarchical
– network
– relational
File structures
• Basic terms
– record
• data items related to a single logical entity (e.g. a
student record) (row in a table)
– field
• a place for a data item in a record (first name field in
a student record) (column in a table)
– file
• a sequence of records of the same type (the table)
File structures
A file: “STUDENT” (each row is a record; each column is a field)
ID   Last    First   Grade
3    Smith   Jane    A
1    Wood    Bob     C
2    Kent    Chuck   B
4    Boone   Dan     B
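The three terms can be made concrete with a short sketch. The Python names used here (StudentRecord, STUDENT, field_names) are illustrative choices, not part of the lecture; only the data values come from the slide.

```python
from dataclasses import dataclass, fields


@dataclass
class StudentRecord:
    """One record: the data items for a single student (a row in the table)."""
    id: int      # each attribute is a field (a column in the table)
    last: str
    first: str
    grade: str


# The file: a sequence of records of the same type, kept in entry order.
STUDENT = [
    StudentRecord(3, "Smith", "Jane", "A"),
    StudentRecord(1, "Wood", "Bob", "C"),
    StudentRecord(2, "Kent", "Chuck", "B"),
    StudentRecord(4, "Boone", "Dan", "B"),
]

field_names = [f.name for f in fields(StudentRecord)]
print(field_names)   # the fields shared by every record
print(STUDENT[0])    # the first record in the file
```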
File structures
• Simple list
– list of entries in which the order of entry into the list determines the order of the list
ID   Last    First   Grade
3    Smith   Jane    A
1    Wood    Bob     C
2    Kent    Chuck   B
4    Boone   Dan     B
File structures
• Search of a simple list entails going through each record until the search is satisfied (linear search), which is inefficient
ID   Last    First   Grade
3    Smith   Jane    A
1    Wood    Bob     C
2    Kent    Chuck   B
4    Boone   Dan     B
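A minimal sketch of a linear search over the simple list may help show why it is inefficient. The function name linear_search and the dictionary layout are assumptions made for illustration; the comparison counter is there only to make the cost visible.

```python
# The STUDENT file as a simple list, in order of entry.
STUDENT = [
    {"id": 3, "last": "Smith", "first": "Jane", "grade": "A"},
    {"id": 1, "last": "Wood", "first": "Bob", "grade": "C"},
    {"id": 2, "last": "Kent", "first": "Chuck", "grade": "B"},
    {"id": 4, "last": "Boone", "first": "Dan", "grade": "B"},
]


def linear_search(records, field, value):
    """Check records one by one; return (record or None, comparisons made)."""
    comparisons = 0
    for record in records:
        comparisons += 1
        if record[field] == value:
            return record, comparisons
    return None, comparisons


record, comparisons = linear_search(STUDENT, "last", "Boone")
print(record["id"], comparisons)   # Boone is the last entry, so all 4 records are checked
```

In the worst case (the last record, or no match at all) every record is read, so the work grows in proportion to the size of the file.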
File structures
• Ordered sequential files
– list of entries ordered in some way (e.g. numerically or alphabetically)
ID   Last    First   Grade
1    Wood    Bob     C
2    Kent    Chuck   B
3    Smith   Jane    A
4    Boone   Dan     B
File structures
• Search of an ordered sequential list can use a binary search method but only for the ordered field
ID   Last    First   Grade
1    Wood    Bob     C
2    Kent    Chuck   B
3    Smith   Jane    A
4    Boone   Dan     B
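A binary search on the ordered field (ID) can be sketched as follows. The function name binary_search and the comparison counter are illustrative assumptions; the records are those of the ordered file above.

```python
# The STUDENT file ordered by the ID field.
ORDERED = [
    {"id": 1, "last": "Wood", "first": "Bob", "grade": "C"},
    {"id": 2, "last": "Kent", "first": "Chuck", "grade": "B"},
    {"id": 3, "last": "Smith", "first": "Jane", "grade": "A"},
    {"id": 4, "last": "Boone", "first": "Dan", "grade": "B"},
]


def binary_search(records, target_id):
    """Halve the search range each step; return (record or None, comparisons)."""
    low, high = 0, len(records) - 1
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1
        mid_id = records[mid]["id"]
        if mid_id == target_id:
            return records[mid], comparisons
        if mid_id < target_id:
            low = mid + 1    # target can only be in the upper half
        else:
            high = mid - 1   # target can only be in the lower half
    return None, comparisons


record, comparisons = binary_search(ORDERED, 3)
print(record["last"], comparisons)   # Smith, found after 2 comparisons
```

The same method cannot be applied to the Last field of this file, because the last names (Wood, Kent, Smith, Boone) are not in alphabetical order.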
File structures
• Indexes provide a reference to records based on an index field, which is ordered
Index on Last:
Last    Pointer
Boone   * (points to ID 4)
Kent    * (points to ID 2)
Smith   * (points to ID 3)
Wood    * (points to ID 1)
Data file:
ID   Last    First   Grade
1    Wood    Bob     C
2    Kent    Chuck   B
3    Smith   Jane    A
4    Boone   Dan     B
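One way to sketch an index on the Last field is shown below. It assumes the pointer is simply the record's position in the data file; the names index, index_keys, and lookup_by_last are hypothetical. The data file stays ordered by ID, while the index is ordered by last name and can therefore be searched with a binary search.

```python
from bisect import bisect_left

# Data file, ordered by ID.
STUDENT = [
    {"id": 1, "last": "Wood", "first": "Bob", "grade": "C"},
    {"id": 2, "last": "Kent", "first": "Chuck", "grade": "B"},
    {"id": 3, "last": "Smith", "first": "Jane", "grade": "A"},
    {"id": 4, "last": "Boone", "first": "Dan", "grade": "B"},
]

# Index: (last name, pointer) pairs sorted by last name.
# Assumption: the pointer is the record's position in STUDENT.
index = sorted((rec["last"], pos) for pos, rec in enumerate(STUDENT))
index_keys = [key for key, _ in index]


def lookup_by_last(last):
    """Binary-search the ordered index, then follow the pointer to the record."""
    i = bisect_left(index_keys, last)
    if i < len(index_keys) and index_keys[i] == last:
        return STUDENT[index[i][1]]
    return None


print(index)                          # ordered by last name, not by ID
print(lookup_by_last("Boone")["id"])  # 4
```

The data file itself is never reordered; a second index on another field could be added in the same way.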
Data models
• A data model is a particular way of
conceptually organizing multiple data files
in a database
– hierarchical
– network
– relational
Hierarchical data model
Parent-child relationship (one-to-one or one-to-many) among data
[Diagram nodes: Grade, Class, Student, Instructor, ID, Department]
Hierarchical data model
• Advantages
easy to search
can add new branches easily
• Disadvantages
must establish the types of search prior to
development of the hierarchical structure
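The trade-off can be illustrated with a nested structure. This is only a sketch under stated assumptions: the nesting order (department, instructor, class, student, grade), the instructor names, and the class "Geog 101" with its grades are all hypothetical.

```python
# Hypothetical hierarchy: department -> instructor -> class -> student -> grade.
tree = {
    "Geography": {
        "Instructor A": {
            "Geog 357": {"Smith": "A", "Wood": "C"},
        },
        "Instructor B": {
            "Geog 101": {"Smith": "B", "Kent": "B"},
        },
    },
}


def grade(dept, instructor, course, student):
    """A search the hierarchy was designed for: follow one path from the root."""
    return tree[dept][instructor][course][student]


def classes_of(student):
    """A search the hierarchy was not designed for: every branch must be visited."""
    found = []
    for instructors in tree.values():
        for courses in instructors.values():
            for course, students in courses.items():
                if student in students:
                    found.append(course)
    return found


print(grade("Geography", "Instructor A", "Geog 357", "Smith"))  # A
print(classes_of("Smith"))
```

Adding a new branch is a single assignment into the tree, but a question that cuts across the hierarchy, such as listing all classes for one student, requires a full traversal.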
Network data model
One-to-one, one-to-many, many-to-one, or many-to-many relationships possible
[Diagram nodes: Grade, Class, Student, Instructor, ID, Department]
Network data model
Advantages
flexible, fast, efficient
Disadvantages
complex
restructuring can be difficult because all the
pointers must be changed
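A rough sketch of the network idea: records hold pointers to related records, so many-to-many relationships can be followed in either direction. The Node class, the link and related helpers, and the class "Geog 101" are assumptions for illustration only.

```python
class Node:
    """A record that holds pointers (links) to related records."""

    def __init__(self, kind, name):
        self.kind = kind
        self.name = name
        self.links = []


def link(a, b):
    """Connect two records in both directions."""
    a.links.append(b)
    b.links.append(a)


def related(node, kind):
    """Follow the pointers from a record to its neighbours of one kind."""
    return [n.name for n in node.links if n.kind == kind]


smith = Node("student", "Smith")
kent = Node("student", "Kent")
geog357 = Node("class", "Geog 357")
geog101 = Node("class", "Geog 101")   # hypothetical second class

# Many-to-many: a student takes several classes, a class has several students.
link(smith, geog357)
link(smith, geog101)
link(kent, geog357)

print(related(smith, "class"))      # classes taken by Smith
print(related(geog357, "student"))  # students in Geog 357
```

Lookups are fast because they only follow pointers, but removing or reorganizing a record means finding and updating every pointer that refers to it.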
Data models
• Hierarchical and network data models have
generally been replaced by the relational
data model
• Relational databases (and their derivatives)
dominate the (non-GIS) database market:
Oracle, Informix
Databases
• A database is a collection of data files that is
structured (organized) to facilitate data
storage, manipulation, and retrieval.
• A database management system (DBMS) is
a software package that performs these
database functions
Why Databases?
• Shift from computation to information
– Focus on the way to structure information
• Datasets increasing in diversity and volume
– Digital libraries, interactive video, e-commerce
– ... need for DBMS exploding
• DBMS encompasses most of information technology
– OS, languages, theory, multimedia, logic, web
Database - Definition
• A very large, integrated collection of data.
• A shared collection of logically related data
designed to meet the information needs of
an organization
• Models real-world enterprise
– Entities (e.g., students, courses)
– Relationships (e.g., Madonna is taking CS564)
Database - Definition
• Three key elements of database definition:
– Shared
– Interrelated
– Predefined applications
• Side notes:
– Database is NOT the real world
• Database is an abstraction
– Database ≠ Information
• Data become information only when they are used to provide
answers to queries
Database Management System
(DBMS)
• DBMS: A software system that enables users
to define, create, and maintain the database
and which provides controlled access to this
database.
• Provide a layer between user application
programs and the data
– Data Definition Language (DDL)
– Data Manipulation Language (DML)
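The split between DDL and DML can be sketched with SQL run through Python's standard sqlite3 module. SQLite is used here only because it ships with Python; the table layout is an assumption based on the STUDENT example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database

# DDL: define the structure of the data.
conn.execute("""
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,
        last  TEXT NOT NULL,
        first TEXT NOT NULL,
        grade TEXT
    )
""")

# DML: insert, change, and retrieve the data itself.
conn.executemany(
    "INSERT INTO student (id, last, first, grade) VALUES (?, ?, ?, ?)",
    [
        (3, "Smith", "Jane", "A"),
        (1, "Wood", "Bob", "C"),
        (2, "Kent", "Chuck", "B"),
        (4, "Boone", "Dan", "B"),
    ],
)
conn.execute("UPDATE student SET grade = ? WHERE id = ?", ("B", 1))

b_students = [
    row[0]
    for row in conn.execute(
        "SELECT last FROM student WHERE grade = ? ORDER BY last", ("B",)
    )
]
print(b_students)   # ['Boone', 'Kent', 'Wood']
```

The application states what it wants (students with grade B, ordered by last name) and the DBMS decides how to find it, which is the layer between programs and data described above.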
File-based Processing
Problems with File-based
Systems
• Same data is stored in multiple places.
Inconsistencies!
• We need to write special programs for each
user question
• Data can be corrupted if the system crashes
while a change is being made.
• It is hard for user programs to share data or
to evolve.
Database Management
System (DBMS)
Advantages of Database
Approach
• Control of data redundancy
– Have a central repository of all data and their
descriptions
– Same information stored only once
• Data Integrity
• Controlled access to database
• Data independence
• Concurrent Access
• Crash recovery
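Two of these advantages, data integrity and recovery from a failed change, can be sketched with Python's standard sqlite3 module. This is an illustration under assumptions: the CHECK rule on grades is hypothetical, and a rejected statement stands in for a failure partway through a change rather than an actual system crash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,
        last  TEXT NOT NULL,
        grade TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F'))
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'Wood', 'C')")
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE student SET grade = 'B' WHERE id = 1")
        # Integrity rule: 'Z' is not a valid grade, so the DBMS rejects it.
        conn.execute("INSERT INTO student VALUES (2, 'Kent', 'Z')")
except sqlite3.IntegrityError as exc:
    print("change rejected:", exc)

rows = conn.execute("SELECT id, last, grade FROM student ORDER BY id").fetchall()
print(rows)   # [(1, 'Wood', 'C')]
```

Because the two statements form one transaction, the earlier update is undone when the later insert fails, so the database is never left in a half-changed state.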
Disadvantages of DBMS
• Complexity
• Cost of DBMS software, hardware and data
conversion
• Performance
• Higher impact of a failure
When NOT to use DBMS?
• No data sharing
• Small scale
• Real-time constraints
Roles in the Database
Environment
• Data Administrator (DA)
• Database Administrator (DBA)
• Database Designers (Logical and Physical)
• Application Programmers
• End Users (naive and sophisticated)
Summary
• Databases are collections of inter-related data.
• A DBMS is used to maintain and query large datasets.
• Benefits include recovery from system crashes,
concurrent access, quick application
development, data integrity and security.
• The advantages and disadvantages of DBMSs.
• The personnel involved in the DBMS
environment
• Database management is one of the broadest,
most important areas in IST.