CS 430 Database Theory
Download
Report
Transcript CS 430 Database Theory
CS 430
Database Theory
Winter 2005
Lecture 1: Introduction
1
What’s a Database
“Collection” of “related” “data”
Contains data about some aspect of the “real
world”
Refers to a Universe of Discourse (UoD)
A “logically coherent” collection of data
Has a “specific purpose”
2
Typical Characteristics of Databases
(1 of 3)
“Large”
Typically bigger than a spreadsheet
May be very large
Example, IRS Tax return database:
About 200M returns per year, 5 year retention
About 1K-10K bytes per return (guess)
About 1 – 10 Terabytes (without overhead)
Shared
More than single user and single application
3
Typical Characteristics (2 of 3)
Structured
More than a simple flat table
Self describing
Contains Metadata (data about data) describing
the data contained in the database
Metadata maintained separately from applications
that use and manipulate the data
Has a Catalog which is a “database” of the
Metadata
4
Typical Characteristics (3 of 3)
Supports multiple views of the data
Different users and applications can view the data
differently
ACID properties
Atomicity – Atomic transactions (updates are all or
nothing)
Consistency – Enforces integrity constraints
Isolation – Transactions are isolated from each
other
Durability – Data from completed transactions is
never lost
5
A Little History of Databases (1 of 3)
Mid to late 1960s - first databases
Applications
Multiple data models
Maintain parts data for Lunar Lander
Airline reservations
Hierarchical, Network, Inverted File System
Early, mid 1970s - Relational data model
Edgar Codd – Father of Relation database
Basis for SQL (Structured or Standard Query
Language)
6
History (2 of 3)
1979 – Oracle Version 2
Late 1980s – IBM DB2 Version 1
Initial version (marketing decision)
Incomplete and slow
Used to define the SQL standard
Late 1980s – Object Oriented databases
Created to manage data for “non-traditional”
applications
7
History (3 of 3)
1990s – Object Relational Databases
Pioneered by Michael Stonebraker
Today
Dominant technology: Relational DBMS (RDBMS)
Oracle, MS SQL Server, IBM DB2, …
MySQL, PostgreSQL, …
OO capabilities being added to RDBMS
New: Object-Relational Mapping Software
Try to handle “impedance mismatch” between RDBMS
and OO programming languages
8
Database Applications (1 of 2)
Traditional
Business applications
Traditional data types
Numbers, strings, dates
Data warehousing
Personnel, accounting, ...
Student and Course data
Large “historical” databases for analytic support
Manufacturing Control
Real-time issues
9
Database Applications (2 of 2)
Non-traditional
Image and Video
GIS (Geographic Information Systems)
Engineering
Time Series
CAD (Computer Aided Drafting or Design)
Stock market data
Full text search
Environmental and Remote Sensing
10
Data Base Management System (DBMS)
Software that manages and or facilitates
Data definition
Data construction
E.g. loading data into the database
Data manipulation
E.g. creating and maintaining the catalog
Applications retrieving and updating the database
Data sharing
ACID properties
11
DBMS In Context
Users/Programmers
Application Programs
External Queries
Database System
DBMS Software
Query Processing
Application Program Interface
Access/Update Stored Data
Metadata
Catalog
The Data
Elmasri and Navathe, Figure 1.1, Page 6
12
Database People (Actors) (1 of 2)
Data Administrator
Database Administrator
Configure DBMS, manage data storage, DBMS
performance tuning
Database Designer
Responsible for correctness of the data
Design the database
All three of these may be same person or
group of people
13
Database People (2 of 2)
Application Analysts and Developers
Responsible for analyzing, designing, building,
and maintaining database applications
End Users
Use the database to accomplish useful work
14
Why use a DBMS? (1 of 2)
Manage redundancy
Access Control
If the same data is stored multiple times (often
enough, without periodic reconciliation) it is
guaranteed to be inconsistent
Not all the users can view and/or update all the
data
Persistent storage of program data
Rather than having to implement your own DBMS
internal to your application
15
Why a DBMS? (2 of 2)
Efficiency
Enforce integrity constraints
Defined and enforced once
Share data
DBMS vendors have done a lot of work to make their
products work efficiently
Mixed blessing (see “Why not to use a DBMS?”)
Among multiple applications, GUIs, users
ACID Properties
Difficult to implement correctly
16
Why not to use a DBMS?
Learning curve
“It takes four years to learn to be an Oracle DBA”
Overhead costs (time and space)
Generality
Concurrency and transactions
Multiple application and user access
Complex data structures
Rule of thumb: Using an RDBMS doubles the space
required for the data (e.g. versus a text file)
17
Course Administration
Course web site
Textbook
http://faculty.cs.wwu.edu/reedyc/CS_430_Winter_2005
Email: [email protected]
Elmasri, Navathe, Fundamentals of Database Systems,
Fourth Edition
Assignments
Use MySQL
Most convenient form of access?
Get hands dirty:
Design a database
Create database and load the data
Write a database application
18
Course Outline (1 of 2)
Introduction to Databases
Introduction to Data Modeling
Chapter 3 (partial)
Relation Data Model, Algebra, and Calculus
Chapters 1 and 2
Chapters 5, 6
Functional Dependencies and Normalization
Chapters 10 and 11 (partial)
19
Course Outline (2 of 2)
SQL Database Programming
Entity-Relationship Modeling
More of chapters 3, 4, and 7
Overview: What’s inside a DBMS?
Chapters 8 and 9
CS530, Chapters 13-19
Overview of additional topics
Object-Oriented and Object Relational DBMSs
XML in Databases
20