Distributed Database Design

Download Report

Transcript Distributed Database Design

Distributed Database Design
COSC 5040
Week One
Outline
Introduction
Course overview
Database systems concepts
Relational database model
Structured query language (SQL)
Webster University
Distributed Database Design
Jiangping Wang
Database System Concept
Data
Known facts
Database
A collection of related data
Database Management System (DBMS)
A software system to facilitate the defining,
constructing, manipulating, and sharing of a
computerized database
Database System
The DBMS software together with the data itself
Sometimes, the applications are also included
Webster University
Distributed Database Design
Jiangping Wang
Typical DBMS Functionality
Define a database
Construct and load the database
Manipulating the database
Querying, generating reports, insertions,
deletions and modifications
Concurrent processing and sharing
Protection or security
Presentation and visualization
Webster University
Distributed Database Design
Jiangping Wang
Database System Environment
Webster University
Distributed Database Design
Jiangping Wang
Example of a Database
Figure 1.2 A database that stores student and course information.
Webster University
Distributed Database Design
Jiangping Wang
Example of a Database
Webster University
Distributed Database Design
Jiangping Wang
Database Manipulation
Database manipulation involves
querying and updating
P. 9
Examples of queries
Examples of updates
Webster University
Distributed Database Design
Jiangping Wang
Database Approach Characteristics
Self-describing nature of a database system
Meta-data
Insulation between programs and data, data
abstraction
Program-data independence
Support of multiple views of the data
Virtual data
Sharing of data and multi-user transaction
processing
Concurrency control
Webster University
Distributed Database Design
Jiangping Wang
Database Users
Actors on the scene
Database administrators (DBA)
Authorizing access to the database
Acquiring software, and hardware resources
Controlling and monitoring efficiency of operations
Database designers
Define content, structure, constraints, and functions or
transactions
Communicate with the end-users
End-users
Queries, reports
Update the database content
Actors behind the scene
Webster University
Distributed Database Design
Jiangping Wang
Database Users
Webster University
Distributed Database Design
Jiangping Wang
Advantages of Database Approach
Controlling redundancy
Restricting unauthorized access
Providing persistent storage
Providing storage structures for efficient query
processing
Providing backup and recovery
Providing multiple interfaces
Representing complex relationships among data
Enforcing integrity constraints
Drawing inferences and actions
Webster University
Distributed Database Design
Jiangping Wang
Historical Development
Early database applications
Hierarchical model
Network model
Relational model based systems
Object-oriented applications: OODBs
and ORDBMSs
Web and e-commerce applications
Database for new applications
Webster University
Distributed Database Design
Jiangping Wang
Data Models
Data model
Data abstraction
A collection of concepts that can be used to
describe the structure of a database
Entities, attributes, relationships
Data types, constraints
Categories of data models
Conceptual (high-level, semantic) data models
Implementation (representational) data models
Physical (low-level, internal) data models
Webster University
Distributed Database Design
Jiangping Wang
Schemas and Instances
Database schema
Description of a database
Schema diagram
Diagrammatic display of a database
schema
Database state
Actual data in the database at a particular
moment in time
Current set of occurrences or instances
Webster University
Distributed Database Design
Jiangping Wang
Schema Diagram
Webster University
Distributed Database Design
Jiangping Wang
Three-Schema Architecture
Webster University
Distributed Database Design
Jiangping Wang
Data Independence
Logical data independence
The capacity to change the conceptual
schema without having to change the
external schemas and their application
programs
Physical data independence
The capacity to change the internal
schema without having to change the
conceptual schema
Webster University
Distributed Database Design
Jiangping Wang
DBMS Languages
Structured query language (SQL)
Data definition language (DDL)
To specify database conceptual schema
Data manipulation language (DML)
To specify database retrievals and updates
DBMS Interfaces
Stand-alone query language interfaces
Programmer interfaces for embedding DML
in programming languages
Webster University
Distributed Database Design
Jiangping Wang
Database System Utilities
To perform certain functions such as:
Loading data stored in files into a database
Data conversion tools
Backing up the database periodically
Reorganizing database file structures
Report generation utilities
Performance monitoring utilities
Sorting, user monitoring, data compression
Data dictionary
Webster University
Distributed Database Design
Jiangping Wang
Client-Server Architectures
Centralized architecture
Client-server architecture
Client
Provide appropriate interfaces and a clientversion of the system to access and utilize the
server resources
Server
Provides services to clients
Database server provides database query and
transaction services to clients
Webster University
Distributed Database Design
Jiangping Wang
Three Tier Client-Server Architecture
Webster University
Distributed Database Design
Jiangping Wang
Classification of DBMS
Based on data model
Relational
Network
Hierarchical
Object-oriented
Object-relational
Other classifications
Single-user vs. multi-user
Centralized vs. distributed
Webster University
Distributed Database Design
Jiangping Wang
Relational Model Concepts
The relational model is based on the
concept of a relation
A relation is a mathematical concept
based on the ideas of sets
Relation: A table of values
Contains a set of rows and columns
Webster University
Distributed Database Design
Jiangping Wang
Example of a Relation
Webster University
Distributed Database Design
Jiangping Wang
Definitions
The schema, or description of a relation
R (A1, A2, .....An)
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
A tuple is an ordered set of values
Each value is derived from an
appropriate domain
A domain is a set of atomic values
Data type or format
An attribute designates the role played
by the domain
Webster University
Distributed Database Design
Jiangping Wang
Definitions
The relation is formed over a subset of the
Cartesian product of the sets
Each set has values from a domain
That domain is used in a specific role which is
the attribute name
Given R(A1, A2, .........., An)
r(R)  dom (A1) X dom (A2) X ....X dom(An)
R: schema of the relation
r of R: a specific "value" or population of R
Webster University
Distributed Database Design
Jiangping Wang
Example
Let R(A1, A2)
Let dom(A1) = {0,1}
Let dom(A2) = {a,b,c}
Then, for example:
r(R) = {<0,a> , <0,b> , <1,c> }
is one possible “state” or “population” or
“extension” r of the relation R, defined
over domains D1 and D2
It has three tuples
Webster University
Distributed Database Design
Jiangping Wang
Definition Comparison
Informal Terms
Formal Terms
Table
Relation
Column
Attribute
Row
Tuple
Values in a column
Domain
Table Definition
Schema of a Relation
Populated Table
State of the Relation
Webster University
Distributed Database Design
Jiangping Wang
Characteristics of Relations
Ordering of tuples in a relation r(R)
The tuples are not considered to be ordered
Ordering of values within each tuple
The attributes in R(A1, A2, ..., An) and the values
in t=<v1, v2, ..., vn> are ordered
Values in a tuple
All values are considered atomic (indivisible)
A special null value is used to represent values
that are unknown or inapplicable to certain tuples
Webster University
Distributed Database Design
Jiangping Wang
Relational Integrity Constraints
Constraints are conditions that must
hold on all valid relation instances
Types of constraints
Domain constraints
Key constraints
Entity integrity constraints
Referential integrity constraints
Webster University
Distributed Database Design
Jiangping Wang
Key Constraints
Uniqueness
A set of attributes of R such that no two
tuples in any valid relation instance r(R)
will have the same value
Minimal
Removal of any attribute results in a set of
attributes that is not a key
If a relation has several candidate keys, one
is chosen to be the primary key
The primary key value is used to uniquely
identify each tuple in a relation
Webster University
Distributed Database Design
Jiangping Wang
Foreign Key
A set of attributes in one relation that
references the primary key in another
relation
Same domain(s)
Value of foreign key either occurs as a
value of primary key or is null
Webster University
Distributed Database Design
Jiangping Wang
Entity and Referential Integrity
Entity integrity
constraint
No primary key value
can be null
Referential integrity
constraint
Foreign key value
can be either an
existing primary key
value or a null value
Webster University
Distributed Database Design
Jiangping Wang
Update Operations
Update operations
Insert a tuple (p. 76)
Delete a tuple (p. 77)
Update a tuple (p. 78)
Maintain integrity constraints
Child insert restrict
Child update restrict
Parent update restrict
Parent delete restrict
Webster University
Distributed Database Design
Jiangping Wang
Relational Database Schema
Webster University
Distributed Database Design
Jiangping Wang
Webster University
Distributed Database Design
Jiangping Wang
Exercise 3.16
Consider the following relations for a database that keeps track
of student enrollment in courses and the books adopted for each
course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Specify the foreign keys for this schema, stating any
assumptions you make.
Webster University
Distributed Database Design
Jiangping Wang
SQL
Structured query language (SQL)
SQL-86 or SQL1
SQL-92 or SQL2
SQL-99 or SQL3
Comprehensive database language
Data definition (DDL)
Data manipulation (DML)
Query
Update
Webster University
Distributed Database Design
Jiangping Wang
Data Definition Language (DDL)
Used to CREATE, DROP, and ALTER the
descriptions of the tables (relations) of
a database
Data types
Numeric
Character string
Boolean
Data/time
Webster University
Distributed Database Design
Jiangping Wang
CREATE TABLE
Specifies its name, its attributes and
their data types
A constraint NOT NULL may be
specified
CREATE TABLE DEPARTMENT
( DNAME VARCHAR(10) NOT NULL,
DNUMBER INTEGER NOT NULL,
MGRSSN CHAR(9),
MGRSTARTDATE CHAR(9));
Webster University
Distributed Database Design
Jiangping Wang
CREATE TABLE
Use the CREATE TABLE command for
specifying
Primary key attributes
Secondary keys, and
Referential integrity constraints (foreign keys)
CREATE TABLE DEPT
( DNAME VARCHAR(10) NOT NULL,
DNUMBER INTEGER NOT NULL,
MGRSSN CHAR(9),
MGRSTARTDATE CHAR(9),
PRIMARY KEY (DNUMBER),
UNIQUE (DNAME),
FOREIGN KEY (MGRSSN) REFERENCES EMP );
Webster University
Distributed Database Design
Jiangping Wang
DROP TABLE and ALTER TABLE
Remove a relation (base table) and its
definition
DROP TABLE DEPENDENT;
Add an attribute to one of the base
relations
ALTER TABLE EMPLOYEE ADD JOB VARCHAR(12);
Webster University
Distributed Database Design
Jiangping Wang
Retrieval Queries in SQL
One basic statement for retrieving
information from a database
SELECT statement
Basic form is a SELECT-FROM-WHERE
block
SELECT <attribute list>
FROM <table list>
WHERE <condition>
Webster University
Distributed Database Design
Jiangping Wang
Simple SQL Queries
Query 0:
Retrieve the birthdate and address of the
employee whose name is 'John B. Smith'
SELECT BDATE, ADDRESS
FROM
EMPLOYEE
WHERE FNAME='John'
AND MINIT='B'
AND LNAME='Smith';
Webster University
Distributed Database Design
Jiangping Wang
Simple SQL Queries
Query 1:
Retrieve the name and address of all
employees who work for the 'Research'
department
SELECT
FROM
WHERE
FNAME, LNAME, ADDRESS
EMPLOYEE, DEPARTMENT
DNAME='Research'
AND DNUMBER=DNO;
DNAME='Research' is a selection condition
DNUMBER=DNO is a join condition
Webster University
Distributed Database Design
Jiangping Wang
Simple SQL Queries
Query 2:
For every project located in 'Stafford', list the
project number, the controlling department
number, and the department manager's last
name, address, and birthdate
SELECT PNUMBER, DNUM, LNAME, ADDRESS, BDATE
FROM
PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN
AND PLOCATION='Stafford';
There are two join conditions
DNUM=DNUMBER relates a project to its controlling
department
MGRSSN=SSN relates the controlling department to
the employee who manages that department
Webster University
Distributed Database Design
Jiangping Wang
Aliases
A query that refers to the same name must
qualify the attribute name with the relation name
Some queries need to refer to the same relation
twice
Query 8:
For each employee, retrieve the employee's name,
and the name of his or her immediate supervisor
SELECT
FROM
WHERE
Webster University
E.FNAME, E.LNAME, S.FNAME, S.LNAME
EMPLOYEE E, EMPLOYEE S
E.SUPERSSN=S.SSN;
Distributed Database Design
Jiangping Wang
Unspecified Where-Clause
Query 9:
Retrieve the SSN values for all employees
SELECT SSN
FROM
EMPLOYEE;
Query 10:
Retrieve the SSN and department name values for
all employees
SELECT SSN, DNAME
FROM
EMPLOYEE, DEPARTMENT;
Resulting CARTESIAN PRODUCT
Webster University
Distributed Database Design
Jiangping Wang
Use of Asterisk *
Q1C:
SELECT
FROM
WHERE
*
EMPLOYEE
DNO=5;
Q1D:
SELECT
FROM
WHERE
*
EMPLOYEE, DEPARTMENT
DNAME='Research' AND
DNO=DNUMBER;
To retrieve all the attribute values
Webster University
Distributed Database Design
Jiangping Wang
Use Of Distinct
To eliminate duplicate tuples in a query
result, the keyword DISTINCT is used
Q11:
SELECT
FROM
SALARY
EMPLOYEE;
Q11A:
SELECT
FROM
Webster University
DISTINCT SALARY
EMPLOYEE;
Distributed Database Design
Jiangping Wang
Set Operations
UNION, MINUS and INTERSECT operations
Query 4:
Make a list of all project numbers for projects that
involve an employee whose last name is 'Smith' as
a worker or as a manager of the department that
controls the project
(SELECT
FROM
WHERE
UNION
(SELECT
FROM
WHERE
Webster University
PNAME
PROJECT, DEPARTMENT, EMPLOYEE
DNUM=DNUMBER AND MGRSSN=SSN
AND LNAME='Smith')
PNAME
PROJECT, WORKS_ON, EMPLOYEE
PNUMBER=PNO AND ESSN=SSN
AND LNAME='Smith');
Distributed Database Design
Jiangping Wang
Substring Matching
Query 12:
Retrieve all employees whose address
is in Houston, Texas
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE ADDRESS LIKE '%Houston, TX’;
Webster University
Distributed Database Design
Jiangping Wang
Arithmetic Operations
Query 13:
Show the resulting salaries if every
employee on the ‘ProductX’ project is
given a 10 percent raise
SELECT FNAME, LNAME, 1.1*SALARY AS INCREASED_SAL
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE SSN=ESSN AND PNO=PNUMBER AND PNAME=‘ProductX’;
Webster University
Distributed Database Design
Jiangping Wang
Ordering of Query Results
Query 15:
Retrieve a list of employees and the
projects they are working on, ordered by
department and, within each department,
ordered alphabetically by last name, first
name
SELECT DNAME, LNAME, FNAME, PNAME
FROM DEPARTMENT, EMPLOYEE, WORKS_ON, PROJECT
WHERE DNUMBER=DNO AND SSN=ESSN AND PNO=PNUMBER
ORDER BY DNAME, LNAME, FNAME;
Webster University
Distributed Database Design
Jiangping Wang
Specifying Updates in SQL
There are three SQL commands to
modify the database
INSERT
DELETE, and
UPDATE
Webster University
Distributed Database Design
Jiangping Wang
INSERT
U1:
INSERT INTO EMPLOYEE
VALUES ('Richard', 'K', 'Marini', '653298653',
‘1962-12-30', '98 Oak Forest,Katy,TX', 'M',
37000, '987654321', 4);
U1A:
INSERT INTO EMPLOYEE (FNAME, LNAME, SSN)
VALUES ('Richard', 'Marini', '653298653');
Webster University
Distributed Database Design
Jiangping Wang
DELETE
U4:
DELETE FROM EMPLOYEE WHERE LNAME='Brown';
DELETE FROM EMPLOYEE WHERE SSN='123456789';
DELETE FROM EMPLOYEE WHERE DNO=5;
DELETE FROM EMPLOYEE;
Webster University
Distributed Database Design
Jiangping Wang
UPDATE
U5:
Change the location and controlling department number of
project number 10 to 'Bellaire' and 5, respectively
UPDATE
SET
WHERE
PROJECT
PLOCATION = 'Bellaire', DNUM = 5
PNUMBER=10;
U6:
Give all employees in the 'Research' department a 10% raise
in salary
UPDATE EMPLOYEE
SET
SALARY = SALARY * 1.1
WHERE DNO IN = 5;
Webster University
Distributed Database Design
Jiangping Wang
Reading and Homework
Readings
Chapter 1, 2, 3, and 4
Week one homework
Webster University
Distributed Database Design
Jiangping Wang