Software Engineering Syllabus

Download Report

Transcript Software Engineering Syllabus

Chapter 9
Database Systems
Introduction to CS
1st Semester, 2012 Sanghyun Park
Outline








Database Fundamentals
Data Models
Relational Model
Object-Oriented Databases
Maintaining Database Integrity
Traditional File Structures
Data Mining
Social Impact of Database Technology
(skip)
(skip)
(skip)
Database Fundamentals






A database is a collection of data that is ________, not
necessarily physically, _________
A database management system (_______) defines,
creates, and maintains a database
The DBMS also allows users _________ access to data
in the database relying on schemas and subschemas
A ________ is a description of the entire database
structure used by the DBMS
A __________ is a description of only that portion of the
database ________ to a particular user’s needs
Databases have evolved as a means of _________
data storage systems
File vs. Database Organization (1/2)


_________ data
___________ problem
File vs. Database Organization (2/2)
Layered Approach to
Database Implementation

A typical database system consists of ____ layers –
an _________ layer and a database management layer

Advantages of this dichotomy



Allows for the construction and use of _________ tools
Provides a means for controlling access to the databases
Achieves data _____________
Database Models

A database model defines the _______ design of data

The model also describes the ____________ between
different parts of data

In the history of database design, three models have
been in use:



Hierarchical model
Network model
Relational model
Hierarchical Model


Data are organized as a ____
Each entity has only ___ parent but can have several
children
Network Model


Entities are organized in a _____, where some
entities can be accessed through _______ paths
There is __ hierarchy
Relational Model (1/2)


Data are organized in two-dimensional ______
called _________
There is no hierarchical or network structure imposed on
the data
Relational Model (2/2)

Each column in a relation is called an _________

The total number of attributes for a relation is called the
______ of the relation

Each row in a relation is called a _____

The total number of rows in a relation is called the
__________ of the relation

The cardinality of a relation changes when tuples are
added or deleted; this makes the database _________
Issues of Relational Design (1/4)



Lack of efficiency due to __________
Possibility of information ____ when deleting a tuple
The source of the problems is that we have combined
more than one concept into a ______ relation
Issues of Relational Design (2/4)
Issues of Relational Design (3/4)
ASSIGNMENT relation
Job Id
Start Date
Term Date
S25X
F5
S25Z
3-1-1999
10-1-2001
5-1-2001
4-30-2001
*
*

This database contains information about employees in
EMPLOYEE relation, about available jobs in JOB
relation, and about job history in ASSIGNMENT relation

Additional information is implicitly available by ________
the information from _________ relations
Issues of Relational Design (4/4)

Finding the department in which employee 23Y34 has
worked
Relational Operations: SELECT
Relational Operations: PROJECT
Relational Operations: JOIN (1/3)
Relational Operations: JOIN (2/3)
Relational Operations: JOIN (3/3)
Issues of Implementation





The data in a database are actually stored in a ____
storage system
A DBMS allows the application software to be written in
terms of a ______________
It is the duty of the ______ to accept commands in terms
of the relational _____ and convert them into actions
relative to the actual storage structure
The simplest way for a DBMS to implement a relation is
to store it as a __________ file
To provide rapid access to entries, the DBMS would
store the relation as an _______ file or utilize ________
techniques
Structured Query Language (1/2)

The language called ____ (Structured Query Language)
is used extensively by application software programmers
for manipulating databases

A query involving a combination of SELECT, PROJECT,
and JOIN operations can be expressed as a _____ SQL
statement

We should read a SQL statement as a description of the
information ______ rather than a _________ of activities
to be performed  not procedural but __________
Structured Query Language (2/2)

SELECT statement
select
from
where

INSERT statement
insert
values

into EMPLOYEE
(‘42Z12’, ‘Sue A. Burt’, ‘33 Fair St.’, ‘444661111’)
DELETE statement
delete
where

EMPLOYEE.Name, ASSIGNMENT.StartDate
EMPLOYEE, ASSIGNMENT
EMPLOYEE.EmplId = ASSIGNMENT.EmplId
from EMPLOYEE
Name = ‘G. Jerry Smith’
UPDATE statement
update
set
where
EMPLOYEE
Address = ‘1812 Napoleon Ave.’
Name = ‘Joe E. Baker’
Maintaining Database Integrity (1/2)

___________ is a sequence of operations that must all
happen together (e.g., money transfer between accounts)

Before a transaction is allowed to alter the database,
the alteration to be performed is first recorded in the ___

The point at which all the steps in a transaction have
been recorded in the log is called the ____________

At this point DBMS becomes ________ to the transaction

If problems arise before a transaction has reached its
commit point, the log can be used to _______ (undo) the
activities actually performed by the transaction
Maintaining Database Integrity (2/2)

Simultaneous access problems



Incorrect summary problem
Lost update problem
______ is used for preventing others from accessing
data being used by a transaction


Shared lock is used when ________ data
Exclusive lock is used when _________ data
Traditional File Structures

A ___ is an external collection of related data treated as
a unit

The primary purpose of a file is to _____ data

Files are stored in what are known as ________ or
_________ storage devices such as disk and tape

For our purposes, a file is a collection of data _______
with each record consisting of one or more _____

The _____ method determines how records can be
retrieved: sequentially or randomly
Taxonomy of File Structures
Sequential Files
While Not EOF
{
Read the next record
Process the record
}
Applications of Sequential Files

The sequential file is used in applications that need to
access ___ records from beginning to end

For example, if personal information about each
employee is stored in a file, we can use sequential
access to retrieve each record at the end of the month
to print the paychecks

The sequential file is not efficient for _______ access
Indexed Files

To access a record in a file randomly, we need to know
the _______ of the record

An indexed file is made of a data file, which is a
sequential file, and an _____

The index itself is a very ____ file with only two fields:
the ___ of the sequential file and the ________ of the
corresponding record on the disk
Mapping in an Index File
Logical View of an Indexed File
Hashed Files

In an indexed file, the index _____ the key to the
address; A hashed file uses a ________ to accomplish
this mapping

The hashed file eliminates the need for an ____ file
(index)

However, we will see that hashed files have their own
problems
Modulo Division Hashing Method
Collision (1/2)

The population of keys is greater than the ______ of
records in the data file

Because there are many keys for each address in the
file, there is a possibility that more than one key will
hash to the _____ address in the file

We call the set of keys that hash to the same address
________

A _______ is the event that occurs when a hashing
algorithm produces an address for an insertion key, and
that address is already occupied
Collision (2/2)
Collision Resolution: Open Addressing
Collision Resolution:
Linked List Resolution
Collision Resolution: Bucket Hashing