Database Systems

Download Report

Transcript Database Systems

Introduction to Databases
Lecture One
Relational and Object-Oriented Database Concepts
Introduction to Database Design and Techniques
1
Examples of Database Applications
Purchases from the supermarket
 Purchases using your credit card
 Booking a holiday at the travel agents
 Using the local library
 Taking out insurance
 Using the Internet
 Studying at university

2
File-Based Systems

Collection of application programs that
perform services for the end users (e.g.
reports).

Each program defines and manages its
own data.
3
Limitations of File-Based Approach

Separation and isolation of data
» Each program maintains its own set of data.
» Users of one program may be unaware of
potentially useful data held by other programs.

Duplication of data
» Same data is held by different programs.
» Wasted space and potentially different values
and/or different formats for the same item.
4
Database Approach

Arose because:
» Definition of data was embedded in application
programs, rather than being stored separately and
independently.
» No control over access and manipulation of data
beyond that imposed by application programs.

Result:
» the database and Database Management System
(DBMS).
5
Database



Shared collection of logically related data (and a
description of this data), designed to meet the
information needs of an organization.
System catalog (metadata) provides description
of data to enable program–data independence.
Logically related data comprises entities,
attributes, and relationships of an organization’s
information.
6
Database Approach

Controlled access to database may include:
» A security system.
» An integrity system.
» A concurrency control system.
» A recovery control system.
» A user-accessible catalog.

A view mechanism.
» Provides users with only the data they want or need to
use.
7
Views
Allows each user to have his or her own
view of the database.
 A view is essentially some subset of the
database.

8
Views

Benefits include:
» Reduce complexity;
» Provide a level of security;
» Provide a mechanism to customize the
appearance of the database;
» Present a consistent, unchanging picture of the
structure of the database, even if the underlying
database is changed.
9
History of Database Systems

First-generation
» Hierarchical and Network

Second generation
» Relational

Third generation
» Object-Oriented
10
The DBMS Marketplace
Relational DBMS companies – Oracle,
Sybase – are among the largest software
companies in the world.
 IBM offers its relational DB2 system. With
IMS, a non-relational system, IBM is by some
accounts the largest DBMS vendor in the
world.
 Microsoft offers SQL-Server, plus Microsoft
Access for the cheap DBMS on the desktop

11
Terminology





Database: persistent collection of data
Database Management System (DBMS): software
that controls access to the database
Database Administrator (DBA): person who controls
database
Data Model: general structure of the data in the
database
Data Language: commands used to define the data
model and give users access to the database
12
Utility of Databases
Data has value independent of use
 Organized approach to data management
 Eliminate redundancy in data
 Share data
 Archive data
 Security of data
 Integrity of data

13
DB Terms and Techniques



Database access is a key feature of current
enterprise computing
Relational DB: tables
To link/merge tables and extract/write information:
Structured Query Language (SQL) – language of all
modern databases (but many dialects)


SQL is transparent; operates with statements like
SELECT, INSERT, DELETE, etc.
SQL provides its result sets in table format
14
DB and the Internet

One vs. multiple user access

Internet browsers make it easy to access database
programs (compared with traditional client/server
programs)
15
Relational Database Model

Database
» Database is a collection of tables (relations)
» Data are stored in tables

Tables
» Each table has a name
» Each table has a set of columns (fields) and rows of data
(records)
» Each table has a fixed number of columns
» Each table has an arbitrary number of rows


Based on set theory
SQL (Structured Query Language)
» DBMS independent language
16
Database Columns (Fields)

Columns
»
»
»
»
Each column has a name
Columns are accessed by name
No standard column ordering
Data in a column belongs to a particular domain
– Columns are the “attributes” of the dataset
– Each value in a column is from the same domain
– Each value in a column is of the same data type
17
Database Rows (Records)

Rows
»
»
»
»
»
»
»
Each row entry is either a simple value or empty ("null")
Rows are sets of values for the columns (attribute values)
Primary key: a set of columns that uniquely identifies each row
Each row must be unique given the primary key (no duplicates)
Rows are referenced by the primary key
Row order cannot be determined by the user
Does not make sense to say “the fourth row” like it does in a
“paper” table or spreadsheet
18
Data Types

Each row value is an instance of a primitive data type
»
»
»
»

Integer
Real (e.g., number, currency
Character (e.g., text, hyperlink, yes/no)
Date/Time
No complex types in standard DBMS (matrix, drawing)
» MS Access will allow drawings and some objects
» Object oriented databases may allow objects and structures

Non existent value is “null”
19
Database Design


Database design deals with how to design a database
Importance of Good Design
» Poor design results in unwanted data redundancy
» Poor design generates errors leading to bad decisions

Practical Approach
» Focus on principles and concepts of database design
» Importance of logical design
20
Database Design Goals

Create a balanced design which is good for all users
Based on a set of assumptions about the world being
modeled
Determine the data to be stored
Determine the relations among the data
Determine the operations to be performed

Specify the structure of the tables




21
Database Design Process
1.
2.
3.
4.
5.
6.
7.
Identify all the objects, entities, and attributes
Identify all the dependencies, draw a dependency
diagram
Design tables to represent the data items and
dependencies
Verify the design
Implement the database
Design the queries
Test and revise
22
Identify All Objects and Entities


Determine the objects of your Database
For each object, describe each entity to be stored
» example: better to store first name and last name separately

Determine the data type for each item
» text, currency, date, etc.

Determine the range of allowable values for each item
»
»
»
»
»
»
non-negative?
greater than zero?
decimal points?
any of the 50 state abbreviations
zip code between 00000 and 99999
phone number
23
Turn Data Items into Attributes

Each attribute should have:
» a meaningful name
» a description of what the attribute means or what
kind of data make up the attribute
» a domain
– the data type of the attribute
– the range or a list of allowable values of the attribute
24
Identify All the Dependencies

Assume a set of relationships between data items
» a model of the world
» may have to make assumptions
» these assumptions should be listed clearly

Turn these relationships into dependencies
» single-valued : there is one and only one value of ‘x’ for every value of ‘y’
– a person Y receives a grade X for a course in a semester
– a person Y has a birth date X
» multi-valued : there are zero (or one) or more values of ‘x’ for every value of ‘y’
– a student Y enrolls in one or more classes (X) each semester
– a person Y has zero or more sisters

Draw a dependency diagram
25
Single-Valued (One-to-One) Dependencies

Draw a single-headed arrow for single-valued
dependencies
a person has one and only one birth date
PERSON
BIRTHDATE
a student has one and only one final grade for a course
STUDENT
FINAL
COURSE
GRADE
26
Multi-Valued (One-to-Many) Dependencies

Draw a double-headed arrow between
multi-valued dependencies
a student can enroll in one or more classes
STUDENT
CLASSES
a person has zero or more sisters
PERSON
SISTERS
27
Independent vs. Dependent Attributes

Some attributes are independent
» E.g., in a business – client relationship, your client’s phone
number does not depend on when you are scheduled to
meet him
» your client still exists whether or not you have an
appointment with him

Some attributes are dependent
» the length of a side rails on a bridge is dependent on the
structure of the bridge
» the side rails of a bridge would not exist if the bridge itself
was not there
28
Dependent vs. Independent
Attribute Representation

Start a new bubble around an independent attribute
» properties of that attribute are attached to the new bubble
» properties that are dependent on other attributes are attached to the
old bubble
» Each appointment is with one or more clients. Each appointment
with one or more clients has a time. Each client has a single phone
number.
APPOINTMENT
PHONE
NUMBER
CLIENT
TIME
29
Design the Tables
Draw a dependency diagram
 Each dependency statement is a part of the
diagram
 Each statement is a single path through the
diagram
 Tables are formed by traversing the dependency
diagram

30
Traversing the Dependency Diagram


Choose an attribute at the end of a path
Follow the chain of arrows upwards
» each multi-valued dependency on the path becomes a
primary key for the table
» combine all single-valued attributes at first level up into a
single table
» all attributes on the path should be included in the table
» stop when you reach a bubble that has no arrows coming
into it
» each path becomes a separate table


Mark off your traversed path
Repeat until all paths have been traversed
31
Verify the design

Inspect your tables
» are all of the data included?
Do you have too many tables? too few?
 If your design does not appear correct

» go back to step 1
» you must repeat all steps of process in order
» do not try to “rearrange” dependency diagram to
give you the tables you think you should have
32
Common Database Design Mistakes





Assuming the order of rows and columns is known
» this is not a spreadsheet!
» do not assume sorted order unless you explicitly sort
Guessing the design, not following the process
Storing what you can compute (when the value will change) – transitive
dependency
» e.g., do not store age if you are already storing birth date
Represent multi-valued dependencies in fixed size sets
» if you know that there are exactly X number of something, create X singlevalued dependencies, otherwise use multi-valued dependency
Adding a key when a unique value exists
» adding an ID number for each person when you are already storing their
social security number
33
Results

If you follow the process correctly
» you will not have redundant data
» you will not lose unrelated data when you delete
values
Databases with these characteristics are called
3NF (Third Normal Form) databases
 Normalization to be discussed later in the course

34
Program
SQL
Design
SQL
Design
Tools:
Database Design
SQL (queries)
Programming
Program
Goal: Build a Business Application
Best:
Spend your time
on design and SQL.
Worst:
Compensate for poor design
and limited SQL with programming.
35
Application Development
tasks
Feasibility
Identify scope, costs, and schedule
Analysis
Gather information from users
Design
Define tables, relationships, forms, reports
Development
Create forms, reports, and help; test
Implementation
Transfer data, install, train, review
time
36
DBMS Features/Components

Database engine
» Storage
» Retrieval
» Update
Query Processor
 Data dictionary
 Utilities
 Security

Report writer
 Forms generator
(input screens)
 Application generator
 Communications
 Programming
Interface

37
DBMS Engine, Security, Utilities
Product
OrderItemID Description
887
Dog food
OrderID
ODate
Customer 946 Cat food
9874
3-3-97
CustomerID Name
9888
3-9-97
1195
Jones
2355
Rojas
Product
Customer
ItemID
Integer, Unique
CustomerID
Description Integer,
Text, 100
Unique
char
Name
Text, 50 char
Data
Tables
Database
Engine
Data
Dictionary
User Identification
Access Rights
Security
Concurrency and
Lock Manager
Backup and
Recovery
Utilities
Administration
38
Database Tables (MS Access)
39
Database Tables (Oracle)
40
DBMS Report Writer
All Data
Database Engine
Data Dictionary
Query Processor
Report Writer
Report
Format
and Query
41
Report Writer (Oracle)
42
DBMS Input Forms
All Data
Database Engine
Data Dictionary
Query Processor
Form Builder
Input
Form
Design
43
DBMS Components
All Data
Communication
Network
Database Engine
Data Dictionary
Security
3GL
Connector
Query Processor
Form
Report
Builder
Writer
Application
Generator
Program
44
Relational Database
Customer(CustomerID, Name, …
Order(OrderID, CustomerID, OrderDate, …
ItemsOrdered(OrderID, ItemID, Quantity, …
Items(ItemID, Description, Price, …
45
Object-Oriented DBMS
Order
OrderID
CustomerID
…
NewOrder
DeleteOrder
…
OrderItem
OrderID
ItemID
…
OrderItem
DropOrderItem
…
Customer
CustomerID
Name
…
Add Customer
Drop Customer
Change Address
Item
ItemID
Description
…
New Item
Sell Item
Buy Item …
Government
Customer
Commercial
ContactName
Customer
ContactPhone
ContactName
Discount, …
ContactPhone
…
NewContact
NewContact
46
Objects


Object Definition
» Object Name
» Properties
» Methods
Inheritance
» Combine into one
table.
» Use multiple tables
and link by primary
key.
– More efficient.
– Need to add rows
to many tables.
Class name
Properties
Methods
Customer
CustomerID
Address
Phone
AddCustomer
DropCustomer
Inheritance
Commercial
Government
Contact
VolumeDiscount
Contact
BalanceDue
ComputeDiscount
BillLateFees
AddCustomer
47
Objects in a Relational Database



Separate inherited classes.
Link by primary key.
Adding a new customer
requires new rows in each
table.
Customer
CustomerID
Address
Phone
CommercialCustomer
CustomerID
Contact
VolumeDiscount
GovernmentCustomer
CustomerID
Contact
BalanceDue
48
OO Difficulties: Methods
IBM Server
Unix Server
Database Object
Personal Computer
Database Object
Customer
Method:
Add New Customer
Program code
Application
Customer
Name
Address
Phone
How can a method run
on different
computers?
Different processors
use different code.
Possibility: Java
49
End of Lecture
50