General Concepts

Download Report

Transcript General Concepts

CS 430
Database Theory
Winter 2005
Lecture 2: General Concepts
1
Example of a Database

Picture of a Database


See Figure 1.2 of Text Book
Two Views of the Data

See Figure 1.4 of Text Book
2
Data Models
A Model for Describing Data

Examples:




Relational
Legacy: Network, Hierarchical
Object, Object-Relational
Entity/Relationship (ER)


Typically used for “data modeling”
Typically includes:


Definition: What kind of structures can be defined
Basic Data Manipulation

Except for ER Model
3
Relational Data Model

Data is organized into Tables





Rows = Records
Columns = Fields
Tables are related by data values
Data related by shared data values
Data manipulation:



Insert a record
Update and delete records
Select records a collection of fields from one or more
records
4
Network and Hierarchical Models

Data organized into Programming Language records


Programming language = COBOL!
Records organized into Sets

Each set has an owner

Hierarchical: Record is member of one set



Methods provided to link records
Network: Record can be member of multiple sets
Data Manipulation:



Traverse sets
Insert, Modify, Delete records
Change set memberships
5
Object and Object/Relational
Data Models

Object:

C++ (typical) style objects stored in database



Manipulated by manipulating records in memory
Objects can have methods as well as data
Object/Relational



Relational at core
Rows treated as objects
Supports object concepts such as inheritance,
methods, etc.
6
Entity/Relationship (ER)




Used for data modeling
Data organized into Entities which have
attributes
Entities are connected via named
relationships
Tools available which can convert from ER
model to Relational Schema
7
Data Modeling

Data modeling builds data models


Yes, we have duplicated terminology
Data modeling:

Develop a model for the data that will be stored
and manipulated by a database applications
8
Categories of Data Models

Conceptual or High-Level




Logical or Representational



Model the data as seen by the user
Typically incomplete
Answers question: Do we have all the data requirements?
Model the data as seen by the application developer
Should have all the Entities, Attributes and Relationships
(ER) or Tables, Columns, DataTypes (Relational) identified
Physical or Low-Level


The data as seen by the database administrator
All the physical information available

Where data is stored, indices, etc.
9
Schema, States

The database schema is the description of
the database


As the database is modified it moves from
state to state


The schema is the Intension of the database
The DBMS is responsible for guaranteeing that
each state is consistent (in accord with the
schema)
A database state is the Extension of the
schema
10
Three Schema Architecture


See figure 2.2 in Text Book
Internal Schema


Conceptual Schema


The actual schema of the database
How the data is organized
External Schemas

Individual user and/or application views of the
data
11
Three Schema Architecture
and Data Independence

Logical Data Independence


Can change Conceptual Schema without changing External
Schemas
Always true to some extent:




Add new Records or Tables
Add new columns? new Relationships?
Change representation of data items (e.g. numbers)?
Physical Data Independence


Can change internal Schema without changing Conceptual
Schema
Also always true to some extent


Where records are stored on disk
Add new access paths?
12
Data Languages

Data Definition Language (DDL)


Data Manipulation Language (DML)



Query the database and change the state of the database
Can be non-procedural (SQL) or procedural (OO)
View Definition Language (VDL)


Define the schema, create the catalog
Define external views
Storage Definition Language (SDL)

How data is stored on disk
13
Data Languages (continued)


An ordinary programming language (e.g. C, Java) is
a host language
DML embedded in a host language is a data
sublanguage


May require a preprocessor
May be an Application Programming Interface (API)


Queries, e.g. may be Character Strings
DML outside of a host language is a “Query
Language”

Even if it can update the database
14
Three Schema Architecture
Bottom Line


Nice idea
Reality:


The three levels are not cleanly separated
Applications (external) are reasonably well insulated from
internal changes


Except, of course, for performance
Example:SQL



There is no separate VDL, this is part of DDL
DDL can include physical information,e.g. indices
Conceptual schema is simply table definitions with the
physical portion removed
15
Database Components


See Figure 2.3, Text Book
Applications can use either precompiled
interface or “ad hoc” interface


Ad hoc interface is the same one used by
Interactive Query tools
Stored Data Manager can include Buffer
Management

Or this may be left to Operating System
16
Database Utilities

“Bulk loading” of data into the database



Backup and Restore



May include incremental backup
Knows DBMS structure, so can save consistent picture of
the database
Reorganization


Convert from an external text format for data to the DBMS
internal format
Integrated with the DBMS for efficiency
Schema changes and/or performance improvement
Performance monitoring

Help to improve database performance
17
Application Development

Data Dictionary



Host language API



For almost any host language
ODBC (Open Database Connectivity)
Host language precompilers


Description of all the data
All the ancillary information, e.g. units of measure
Convert embedded DML (typically) into host language API
Application Development Environments

Can be extremely sophisticated
18
Two-Tier and Three-Tier Database
Applications
Two-tier
Three-tier
Client
Client
Application
and/or
Web Server
Database
Database
19
Classification of DBMS


Data Model
Single or Multi-User



General or special purpose
Centralized or Distributed


Multi-User typical
Distributed: Homogeneous or Federated
Transaction processing (OLTP) versus
Decision Support (OLAP)
20