Flat Files and Relational Databases Presentation
Download
Report
Transcript Flat Files and Relational Databases Presentation
Databases
Flat Files
&
Relational Databases
Learning Objectives
Describe flat files and databases.
Explain the advantages that using a
relational database gives over flat files.
Flat Files
A Flat File
All the data is held in a single file which is
two dimensional.
Rows for records and columns for fields.
E.g. A stock flat file
Field Name
Data Type
Description
String
Cost Price
Currency
Selling Price
Currency
Number in Stock Integer
Reorder Level
Integer
Supplier Name
String
Supplier Address String
A Set / Series of Flat Files
Flat files which have at least one common
field which is duplicated but not linked.
E.g. A Set / Series of Flat Files
Purchasing
Department
Sales
Department
Accounts
Department
Programs to place
orders when stocks
are low
Programs to
record orders
from customers
Programs to
record accounts
of customers
File containing Stock Code,
Description, Re-order level,
Cost Price, Sale Price, Supplier
name and address, etc
File containing Stock Code,
Description, Number sold,
Sale Price, Customer name
and address, etc.
File containing Customer
name and address, amount
owing, dates of orders, etc.
Limitations of Flat Files
Separation and isolation of data
Which customers have bought parts
produced by a particular supplier?
Find the parts supplied by a particular supplier
from one file and then use a second file to find
which customers have bought those parts.
Problem compounded if data is needed
from more than two files.
Duplication of data
Supplier details duplicated if a supplier supplies
more than one part.
Customer details held in two different files.
Data entered more than once, therefore time,
space and money are wasted.
Loss of data integrity:
What happens if a customer changes his address?
The Sales Department may update their files but the
Accounts Department may not do this at the same time.
If the Order Department order some parts and there is
an increase in price.
The Order Department increases the Cost and Sale prices
but the Accounts Department do not, there is now a
discrepancy.
Data dependence
If there is a need to change data formats,
whole programs have to be changed.
Different applications may hold the data in
different forms, again causing a problem.
If an extra field is needed in a file, all
applications using that file have to be
changed, even if they do not use that new
item of data.
Incompatibility of files
If one department writes its applications in
COBOL and another in C.
Departments may find it difficult to read each
other’s files.
Fixed queries and the proliferation
of application programs
Each time a new query is needed, a new
program has to be written.
Difficult if the data needed to answer the
query is in more than one file.
Compounded if some files are incompatible.
Relational Databases
Note:
The term database on its own means a relational database.
Database Management System
(DBMS)
Relational Databases use DBMS to
provide a structure at three different
levels:
Internal Level
Conceptual Level
External Level
Internal Level
The relational database physically stores on
some form of storage medium all data in one file
structure as a physical table with a data
dictionary.
Physical table:
Contains all fields for all types of users (e.g. departments).
Data dictionary:
Holds information about the database necessary to enable it
to work in different views for different types of users e.g.
departments.
Structures of logical / virtual tables or views, Relationships,
Fields (names, data types, validation, descriptions etc…).
Conceptual Level
A manager uses a data description/definition language
(DDL) to:
1. Create two or more
logical/virtual tables
or views and define
one or more common
linked fields called
relationships.
2. Set the data types
and write descriptions
of each field.
3. Set validation.
http://databases.about.com/od/sql/a/sqlfundamentals_2.htm
External Level
Users use a Data Manipulation Language
(DML) to store, access, change, delete and
search for data.
Data Manipulation Language (DML)
INSERT
To add a new employee to a “personal_info” table:
INSERT INTO personal_info values('bart','simpson',12345,$45000)
SELECT
Select everything from a “personal_info” table”:
SELECT * FROM personal_info
Extract a list of the last names of all employees in the company:
SELECT last_name FROM personal_info
Who has a salary value greater than $50,000?
SELECT * FROM personal_info WHERE salary > $50000
UPDATE
Gives all employees a 3% cost-of-living increase in their salary:
UPDATE personal_info SET salary = salary * 1.03
An employee (ID = 12345) is due a $5,000 raise.
UPDATE personal_info SET salary = salary + $5000 WHERE employee_id =
12345
DELETE
An employee (ID = 12345) has been laid off.
DELETE FROM personal_info WHERE employee_id = 12345
http://databases.about.com/od/sql/a/sqlfundamentals_3.htm
http://en.wikipedia.org/wiki/Data_manipulation_language
Advantages of Relational
Databases
Through the use of DBMS:
Different logical/virtual tables or user views of
data can be created.
Views:
A virtual or logical table composed of the result set of a
query.
Unlike ordinary tables a view is not part of the physical
schema: it is a dynamic, virtual table computed or
collated from data in the database.
Easier to access the data because data is
now accessible through queries using a Data
Manipulation Language (DML).
Advantages of Relational
Databases
All data can be accessed through this one
relational database but different tables can
be suited to different requirements (shows
only relevant information).
Less duplication of data held because data
is not duplicated across different files,
consequently there is less danger of data
integrity being compromised.
Less chance of one copy of data altered when
another is not.
Advantages of Relational
Databases
Data manipulation / input can be achieved
more quickly as there is only one copy of
each piece of data.
Flat files need to be compatible, this
problem does not arise with a database.
Vocabulary for relational databases
Fields / columns = attributes
Records / rows = tuples
Tables = relations
Table
description / shorthand notation / design:
Example table:
Employee (EMP) table for a company.
Note that an NINumber is a unique number given by the UK government to every
person who works in the UK, for tax purposes.
EmpID
NINumber
Name
Address
Table: EMP (EmpID, NINumber, Name, Address)
The words in brackets separated by a , = Fields / Columns / Attributes
Keys
Primary Keys
Primary Key:
Unique attribute (or set of attributes) used to
identify the record or tuple.
E.g. EMP (EmpID, NINumber, Name, Address)
EmpID & NINumber could be primary keys as they are
unique for each employee. However, it would be
standard to choose EmpID to be the primary key.
Note that it is conventional to underline the primary
key and I will do so on the next few slides.
Foreign Key
An non-primary key attribute/field in one
table which is also the primary key in
another table and so links the two tables
together.
E.g.
CINEMA (CID, Cname, Loc, MID)
CID = Cinema ID, Cname = Cinema name, Loc = Location, MID = Manager ID
&
MANAGER (MID, Mname)
Mname = Manager name
MID occurs in CINEMA and is the primary key in
MANAGER. When discussing CINEMA we say MID is
the foreign key.
Secondary Keys
A different attribute (other than the primary key) that allows
the data (the same record as the primary key or other groups of
related records – so not necessarily unique) to be accessed in
a different way.
E.g. EMP (EmpID, NINumber, FName, LName, Address, Dept)
NINumber could be a secondary key to allow access to the
Employee’s record in a different way e.g. if EmpID is not known.
Dept could also be a secondary key as it allows access to all
employees in a particular department.
The setup of one or more secondary keys allows these fields to
be “indexed” so that searching by these fields is faster than ones
that are not set to be possible secondary keys.
For example, it is very unlikely that we would search the EMP table
above by FName only, so we would not set FName to be a
secondary key (even though it would still be possible to search by FName,
it just would be slower than searching by a secondary key such as
NINumber or Dept).
However, Secondary keys can be a combination of fields e.g.
FName+LName to allow a record to be accessed if neither the
EmpID or the NINumber is known (but note that it is possible that two or
Plenary
What is meant by a flat file?
Plenary
All the data is held in a single file which is
two dimensional.
Rows for records and columns for fields.
Plenary
Describe the advantages of using a
relational database over flat files.
Plenary
All data can be accessed through this one
relational database but different tables can be
suited to different requirements (shows only
relevant information).
Less duplication of data held because data is
not duplicated across different files,
consequently there is less danger of data
integrity being compromised.
Less chance of one copy of data altered when
another is not.
Plenary
Data manipulation / input can be achieved
more quickly as there is only one copy of
each piece of data.
Flat files need to be compatible, this
problem does not arise with a database.
Through the use of DBMS:
Different user views of data can be created
Easier to access the data because data is
now accessible through relations / queries.
Plenary
What is meant by terms primary key,
secondary key and foreign key?
Primary Keys
Primary Key:
Unique attribute (or set of attributes) used to
identify the record or tuple.
E.g. EMP(EmpID, NINumber, Name, Address)
EmpID & NINumber could be primary keys as they are
unique for each employee.
Secondary Keys
Secondary
A different attribute that allows the data to be
accessed in a different order.
E.g. EMP(EmpID, NINumber, Name, Address)
If EmpID is the primary key then NINumber is the
secondary key.
Foreign Keys
Foreign
The primary key of another file / table /
relation that is used to link files / tables /
relations together.
E.g.
CINEMA(CID, Cname, Loc, MID)
&
MANAGER(MID, MName)
MID occurs in CINEMA and is the primary key in
MANAGER. When discussing CINEMA we say MID
is the foreign key.