IST210: Data Models and DBMS

Download Report

Transcript IST210: Data Models and DBMS

IST 210
Databases and DBMSs
Todd S. Bacastow
January 2005
1
IST 210
Evolution
Ways of storing data
IST 210


Files (1960) (ancient times)
Databases




Hierarchical (1970)
Network (1970)
Relational (1980)
Object (1990)
IST 210
File terms

Record


Field


data items related to a single logical entity (e.g.
a student’s information) or row in a table
a place for a data item in a record (first name
field in a student record) or a column in a table
File

a sequence of records of the same type (the
table)
IST 210
File structures
A file: “STUDENT”
record
field
ID
Last
First
Grade
3
Smith
Jane
A
1
Wood
Bob
C
2
Kent
Chuck
B
4
Boone
Dan
B
IST 210
File organization

Serial


new records appended
Sequential

records ordered in file, usually according to
a numeric key
IST 210

File structures
Serial

list of entries in
which the order
of entry into the
list determines
the order of the
list
ID
3
Last
Smith
First
Jane
Grade
A
1
Wood
Bob
C
2
Kent
Chuck
B
4
Boone
Dan
B
File structures
IST 210

Search of a simple
list entails going
through each record
until the query is
satisfied (linear
search), which is
inefficient
ID
3
Last
Smith
First Grade
Jane
A
1
Wood
Bob
C
2
Kent
Chuck
B
4
Boone
Dan
B
File structures
IST 210

Sequential

list of entries
ordered in some
way (e.g.
numerically or
alphabetically)
ID
Last
First
Grade
1
Wood
Bob
C
2
Kent
Chuck
B
3
Smith
Jane
A
4
Boone
Dan
B
File structures
IST 210

Search of an
ordered
sequential list can
use a search
method
ID
Last
First
Grade
1
Wood
Bob
C
2
Kent
Chuck
B
3
Smith
Jane
A
4
Boone
Dan
B
IST 210

File structures
Indexes provide a reference to records based on an index
field, which is ordered
Last
Pointer
ID
Last
First
Grade
Boone
*
1
Wood
Bob
C
Kent
*
2
Kent
Chuck
B
Smith
*
3
Smith
Jane
A
Wood
*
4
Boone
Dan
B
Problems with files
IST 210

Redundancy


Inconsistency


data is updated in one
application’s files, but not in
another’s
Maintenance problems


number of files grows with
applications, and data is
duplicated
changes to data structures
mean changes to many
programs
Difficulty combining data

business needs may mean
users want data from
different applications
IST 210
Other ways to organize

Data model


A data model is a particular way of conceptually organizing
multiple data files in a database
Other common models
 Hierarchical
 Network
 Relational
 Object
IST 210
Network model
IST 210
Network data model
Class
Relationships:
• one-to-one
• one-to-many
• many-to-one
• many-to-many
Student
Grade
Instructor
ID
Department
Network data model
IST 210

Advantages


flexible, fast, efficient
Disadvantages


Complex
Restructuring can be difficult because of
changing all the pointers
IST 210
Hierarchical database model
IST 210

Hierarchical data model
Class
Parent-child
relationship:


one-to-one
one-to-many
Student
Grade
Instructor
ID
Department
IST 210
Hierarchical data model

Advantages



easy to search
add new branches easily
Disadvantages

Must establish the types of search prior to
development of the hierarchical structure
IST 210
Summary


Hierarchical and network data models have
generally been replaced by the relational data
model
Relational databases dominate the database
market





Oracle
Informix
SQL Server
DB2
……..
Relational database model
IST 210

Stores both


Data about real
world objects
(entities) in
tables
Relationships
between the
tables
IST 210

Relational database
Fields (columns) in the
table store attributes.


Tuples (or records or
rows) in the table store
information.


Each attribute has a
specific domain.
Each tuple is a unique
instance of an object.
Tables are composed of
a set of tuples.

A table is also called a
relation.
IST 210
Terms

Table


Column


Set of all possible values for a specific column.
Row


A specific place for one type of data relating to one type of
real world objects.
Domain


A collection of relevant data relating to one type of real
world objects.
Collection of data describing one real world object.
Primary Key

Columns, which are part of the row and uniquely identify
any one row.
IST 210


Records
Each record represents a logical entity (e.g. a
student)
Each field represents an attribute of the
logical entity
Student
ID
1
Last
Wood
First
Bob
Grade Class
C IST357
2
Kent
Chuck
B
IST115
3
Smith
Jane
A
IST357
4
Boone
Dan
B
IST357
IST 210

Keys
Each table has a primary key, one field (or a
combination of fields) that has a unique value for
each and every record in the table
ID is the primary key
in this table (two
students may share
either a last or first
name)
ID Last
1
Wood
Student
First Grade Class
Bob
C IST357
2
Kent
Chuck
B
IST115
3
Smith
Jane
A
IST357
4
Boone
Dan
B
IST357
Relating tables
IST 210
Tables can be related (joined) together based on their keys
The idea is to decompose into separate tables with no
redundancy and to provide a capability to reassemble with no
information loss


Student
ID
Last
First
Grade
Class
1
Wood
Bob
C
IST357
2
Kent
Chuck
B
IST115
3
Smith
Jane
A
IST357
4
Boone
Dan
B
IST357
Class
Name
#Stud
Instructor
IST357 48
Jones
IST115 120
Brower
IST20
120
Fountain
IST 210
Relating tables
Primary key
Foreign key
Primary key
Student
ID
Last
First
Grade
Class
1
Wood
Bob
C
IST357
2
Kent
Chuck
B
IST115
3
Smith
Jane
A
IST357
4
Boone
Dan
B
IST357
Class
Name
#Stud Instructor
IST357 48
Jones
IST115 120
Brower
IST20
120
Mennis
IST 210
Relating tables
Student
ID
Last
First
Grade
Class
1
Wood
Bob
C
IST357
2
Kent
Chuck
B
IST115
3
Smith
Jane
A
IST357
4
Boone
Dan
B
IST357
Class
Name
IST20
#Stud Instructor
120
Brower
IST115 120
Jones
IST357 48
Jones
Instructor
Name
Office
Jones
Brower
332
517
IST 210
DBMS Schema

Ultimately data in databases is stored in
files, but their structure is hidden
External Schema
The view on data used by
application programs.
Conceptual Schema
The logical model of data that is
separate from how it is used.
Internal Schema
The physical storage of data in
files and indexes.
IST 210
RDBMS Features







Data Definition Language
(DDL)
Data Manipulation Language
(DML)
Integrity Constraints
Transaction Management
Concurrency
Security
Tuning of Storage
IST 210
Data integrity and validation (Constraints)
Relationships
IST 210


Link between
entities.
A relationship may
define constraints.

E.G, a person can
only have one SSN.
IST 210
Advantages of RDBMS





Eliminate unnecessary duplication of data
Enforce data integrity through constraints
Changes to conceptual schema need not
affect external schema
Changes to internal schema need not affect
the conceptual schema
Many tools are available to manage the
database
IST 210
Disadvantages of RDBMS

To store objects (e.g., drawings) in a
relational database, the objects have to be
‘flattened’ into tables



e.g., a digital representation of a parcel must be
separated from the behaviour of other parcels
Complex objects have to be taken apart and
the parts stored in different tables
When retrieved from the database, the object
has to be reassembled from the parts in
different tables
IST 210
Other Types of DBMS

Object DBMS



store objects as objects
designed to handle complex nested objects
for graphical and multimedia applications
Object-relational DBMS

hybrid databases that can store data in
tables but can also store objects in tables
Object DBMS
IST 210


ODBMS have the
advantage that
objects can be
stored directly
Object databases
are closely linked
to programming
languages with
ways of navigating
through the
database
IST 210
Summary
Common
word
Textbook
word
Alternate
word
Object word
Table
Relation
File or Data set
(old)
Object class
Column
Attribute
Field
Object field
Domain
Domain
Range of
possible values
Datatype,
subtype
Row
Tuple
Record
Object instance
Primary key
Primary key
Key of the
record
Object identifier