Week 01 - Pravin Shetty > Resume

Download Report

Transcript Week 01 - Pravin Shetty > Resume

Welcome to CSE3180
‘Principles of Database Systems’
Semester 1, 2005
CSE3180 Semester 1, 2005 Week 1 / 1
Introduction to CSE3180
This unit covers many aspects associated with database,
and in this series of lectures, those related to Relational
Data Base (but not exclusively)
I’m : Rod Simpson
My Office is : Room C 4.46 (for Semester 1)
My Phone contact number is : (03)990 32352
My email is [email protected]
CSE3180 Semester 1, 2005 Week 1 / 2
Which stands for
School of Computer Science and Software Engineering
Faculty of Information Technology
Materials for this unit are located at
www.csse.monash.edu.au/courseware/cse3180
CSE3180 Semester 1, 2005 Week 1 / 3
No relationship
CSE3180 Semester 1, 2005 Week 1 / 4
University Notice
Friday 11th March is the last day to add a Semester 1 or Full
Year on-campus unit without the risk of a $73.00 late
enrolment penalty
CSE3180 Semester 1, 2005 Week 1 / 5
Principles of Database Systems
The examination will be closed book.
It will consist of 3 sections:
A:
A number of multi choice questions
B:
A number of short answer questions
C:
Four essay type questions (choice of 6)
CSE3180 Semester 1, 2005 Week 1 / 6
Principles of Database Systems
You should have a copy of the unit outline, which includes
– The recommended text (Hoffer, Prescott and McFadden,
Edition 7 or 8, whichever is available in the Bookshop)
– The Project
– The Certificate of Work Performed if working in a group
– The Tutorial Guide
CSE3180 Semester 1, 2005 Week 1 / 7
Timetable
• The lecture will be held in B2.15 from 10.00am to midday on
Thursdays
• Tutorials will be held on Thursday
in Laboratory K 1.07 from midday to 2.00pm
The Laboratories are booked for 2 hours so that if you wish
you can continue with the unit exercises or the project
There are 16 terminals in each room
CSE3180 Semester 1, 2005 Week 1 / 8
Exclusions
The exclusions to this unit are
– Bachelor of Computing
– Bachelor of Computer Science
– Masters degrees
– Graduate Diplomas in Computing ...
CSE3180 Semester 1, 2005 Week 1 / 9
PRINCIPLES OF DATABASE SYSTEMS
Lecture 1
CSE3180 Semester 1, 2005 Week 1 / 10
Introduction
Part 3
The notes for this unit can be found on the Monash Web Page
at this address:
http://www.csse.monash.edu.au/courseware/cse3180
The Unit Code is CSE3180
The notes are the PowerPoint format and can be viewed on
the File Server software.
CSE3180 Semester 1, 2005 Week 1 / 11
Lecture Objectives
This lecture will cover:
– some thoughts on data storage and retrieval constraints
– what is the form of ‘data’
– some definitions of ‘data’, ‘information’, ‘audit trail’
– what is a data base - who would want one
– some of the functions of the database management
software
– different models (commercial)
– the relational model
– advantages and disadvantages of database
– some practical aspects of your project
CSE3180 Semester 1, 2005 Week 1 / 12
Introduction
Course Outline
As you will see from the notes, the recommended text is
Hoffer, Prescott and McFadden, ‘Database Management’.
The examples and the exercises at the end of most
chapters are well worth a read.
There is a project, and it is expected that you will work in
groups - your tutor will arrange this in the first tutorial.
The assignment support software is either Oracle or MS
Access. If you wish to use some other DBMS, (such as
SQLServer or MySQL) you will need to come to some
amicable arrangement with your tutor
CSE3180 Semester 1, 2005 Week 1 / 13
Introduction
Features of database in the lectures such as
–
–
–
–
–
–
Recovery
Security
Consistency
Concurrency
Database Management architecture
Background processes
will be based on Oracle’s version 9i
CSE3180 Semester 1, 2005 Week 1 / 14
Database
In designing a database there is the need to understand
what requirements are made of data analyses, who needs
these analyses, when and in what form.
In all this there is an expectation that the results of ‘queries’,
which is the same as saying data supported by the
database is accurate, timely and complete.
In these lectures, you will see how these requirements can
be built into a database - as you will do with your database
model.
And that is what it is - a model which accurately reflects data
as it occurs and is processed in the ‘real world’.
CSE3180 Semester 1, 2005 Week 1 / 15
Some Thoughts on Data Storage
• A major benefit of Computing is the ability to STORE and
RETRIEVE large amounts of data
• However, there are a number of processes and other
considerations which need to be worked together to
maximise this benefit
• Some very early items are
– What data ?
– What are its sources ?
– What are the volumes / frequency ?
– How long is to be stored and why this period ?
– In what FORM is it to be stored ?
CSE3180 Semester 1, 2005 Week 1 / 16
Some Thoughts on Data Retrieval
• Who is going to ‘access’ (retrieve ?) this data
• How often ?
• From where ?
• Why is data to be accessed - for what purpose ?
CSE3180 Semester 1, 2005 Week 1 / 17
Some Thoughts on Data Retrieval
• How is it to be accessed ? Voice inquiry, remote, by formal
request, normal processing schedule, randomly, whenever
the ‘need’ arises ?
• Is the data to be freely available ?
– Are there some limitations on access ?
– How are these access limitations managed ?
• What value is inherent in the data ?
CSE3180 Semester 1, 2005 Week 1 / 18
Some General Thoughts
• What time base or volume spread is to be represented by
the data ?
• What levels of accuracy are to be expected ?
• Is data to be available 7 days a week, 24 hours/day ?
• What response time is expected ? Minimum ? / Tolerable ?
And just what does that mean in real value terms ?
• How is new or altered data to be directed to existing data ?
CSE3180 Semester 1, 2005 Week 1 / 19
Some General Thoughts
• How is input access to be controlled ?
• When and why is data deleted - who authorises such
deletions ?
• What does the ‘data’ consist of - characters, objects, audio
visual, TV, audio, animation ?
• What is the optimum method of storage (organisation) ?
CSE3180 Semester 1, 2005 Week 1 / 20
Some Advanced Commercial
Applications
You have probably heard of these ‘advanced’ applications
– Consumer (or Customer) Resource Management
– Supply Chain Management
But have you heard of
– Business Intelligence
– Business Process Intelligence
– Business Process Management
– Business Activity Monitoring
These are some of the new, current applications
CSE3180 Semester 1, 2005 Week 1 / 21
Some Definitions
A General Definition:
DATA - raw (unprocessed or partly processed) facts which
represent the state of entities (things) which have
occurred
INFORMATION - data which has been processed into a form
USEFUL TO THE USER
What is Information to one user may be Data to another user.
CSE3180 Semester 1, 2005 Week 1 / 22
Where are the Mars landers ?
Which of these is Mars ?
Possible influence of ABC Channel 2 ?
CSE3180 Semester 1, 2005 Week 1 / 23
Audit Trail
General Definition:
‘The presence of data processing media and procedures
which allow any and / or all transaction(s) to be traced
through ALL STAGES of processing’
This infers that the following devices / techniques are in place:
1. A logging device which ‘traps’ all transactions
2. Some way of tagging each transaction so that it can be
identified
3. Some way of retrieving the required transaction(s)
4. Some way of archiving - what is the required period ?
5. Control procedures and processes to ensure integrity
CSE3180 Semester 1, 2005 Week 1 / 24
Database
A Database is a shared collection of Inter-Related data
designed to meet the needs of multiple types of users and
applications.
This implies that multiple USER VIEWS can be defined
Data stored is independent of the programs which use it
Data is structured to provide a basis for future applications
DATABASE = Stored Collection of Related Data
May be physically distributed
CSE3180 Semester 1, 2005 Week 1 / 25
Database Management Software
A DBMS is SOFTWARE which provides access to the database
in an integrated and controlled manner
A DBMS must contain :
1. Data Definition and Structure capabilities
2. Data Manipulation capabilities
CSE3180 Semester 1, 2005 Week 1 / 26
Data Definition and Manipulation
Data Definition Language (DDL)
used to describe data at the database level
Schema level - complete database description
Sub-Schema level - user views (restricted)
Data Manipulation Language (DML)
Provides for these
Create
capabilities
Update
Delete
Modify
Report
Insert
Retrieve (extract)
Drop
Calculation
CSE3180 Semester 1, 2005 Week 1 / 27
The Many Faces of Database
Databases can be:
1. Transaction Intensive
2. Decision Support
3. Mixed-Load
4. Small databases
5. Very Large Database
(VLDB)
6. Non Traditional
7. Mobile
ATM’s Checkouts
Browsing for trends
Combination of both
Few thousand records
Many millions or trillions
of records (Banks)
- Weather bureau, flight plans
Computer Aided Design data
- Able to ‘move around’
-
CSE3180 Semester 1, 2005 Week 1 / 28
DBMS Requirements
Querying Capabilities
Data Displays (Presentation)
Data entry
Data Validation
Data Deletion
Committing Procedures (of changes)
AND Data Integrity, Security, Consistency and Concurrency
Capabilities
CSE3180 Semester 1, 2005 Week 1 / 29
The Many Faces of Database
• They can be:
Data Warehouses (which you will meet in Lecture 8)
Data Marts (or martlets)
• How is a database size measured ?
There are a number of ‘measurements’
Raw data size
Total database size
Total usable disk space size (which includes media
protection such as mirroring)
CSE3180 Semester 1, 2005 Week 1 / 30
The Many Faces of Database
Hardware
Database
Raw Data
Total Disk
HP9000
Oracle
100GB
643GB
Digital 8400
Oracle
100GB
361GB
IBM SP2
DB2/6000
100GB
377GB
NCR5100
Teradata
100GB
880GB
NCR5100
Teradata
1,000GB
3,280GB
CSE3180 Semester 1, 2005 Week 1 / 31
How ‘Big’ can a Database be ?
•
•
Try this : The Land Registry in England has a DB2
database whose ‘size’ is 18.3 TeraBytes
This includes : User data
Summaries
Aggregates
Indexes
It does not include Free Space and Redundancy areas
CSE3180 Semester 1, 2005 Week 1 / 32
Important Database Features
•
•
•
•
•
Data Integrity
Data Independence
Referential Integrity - Relational Database Model
Concurrency Control - Multiple Users
Consistency
- multi users
- distributed database
- replicated database
- partitioned database
- mobile database
• Recovery from failure (Transaction and Media)
• Security
CSE3180 Semester 1, 2005 Week 1 / 33
Data Base Models - Hierarchical
owner / parent
owner
child / parent
member
child
child / parent
CSE3180 Semester 1, 2005 Week 1 / 34
Data Base Models - Network
set of
data
owner
member
owner
member
set of data
Note: Only linked sets can be accessed
CSE3180 Semester 1, 2005 Week 1 / 35
Data Base Models - Network
set of
data
owner
member
set of data
Note: Only linked sets can be accessed
CSE3180 Semester 1, 2005 Week 1 / 36
Data Base Models - Relational
table
A
table
B
table
C
table
D
table
E
Any table(s) can be joined to any other table(s), provided there
is a means of effecting the join
Primary key / Foreign key concept. Data redundancy
No fixed linkages
CSE3180 Semester 1, 2005 Week 1 / 37
Edgar F. Codd
•
•
•
•
Born 1932, UK.
Died April 18, 2003.
BA/MA (Maths) Oxford University
PhD University of Michigan.
Dr. E. F. Codd

The Relational Oath:
“I promise to use the key, the whole key and
nothing but the key, so help me Codd”
CSE3180 Semester 1, 2005 Week 1 / 38
A Primary Key - What’s that ?
• Hoffer, Prescott and McFadden define a Primary Key as :
An attribute (or combination of attributes) which uniquely
identifies each row in a relation. (table)
• Richard T. Watson has this to say:
The primary key definition block specifies a set of column
values comprising the primary key. Once a Primary Key is
defined, the system enforces its uniqueness by checking
that the Primary Key of any new row does not already exist
in the table.
CSE3180 Semester 1, 2005 Week 1 / 39
And - A Foreign Key ??
•
Hoffer, Prescott and McFadden’s definition:
An attribute (or attributes) in a relation (table) of a database
which serves as the Primary Key of another relation (table)
in the same database.
• Richard T. Watson says:
An attribute (or attributes) that is a Primary Key in the same
table, or another table. It is the method of recording relations
in a relational database.
And, both the Primary and Foreign Key(s) should be drawn
from the same Domain.
CSE3180 Semester 1, 2005 Week 1 / 40
2 relations (tables)
EMPNUM
NAME
Date of Birth
DEPTNUM
3
JONES
16-05-1956
605
Referencing
7
SMITH
23-09-1965
432
Table
11
ADAMS
11-08-1972
201
15
NGUYEN
23-10-1964
314
18
PHAN
16-11-1976
201
23
SMITH
19-09-1974
314
Relation (Table) Name : EMP
Relation Schema: EMP(empnum,name,date of birth,deptnum)
DEPTNUM
201
314
432
605
DEPTNAME
Production
Finance
Information Systems
Administration
Referenced
Table
Relation (Table) Name : DEPT
Relation Schema: DEPT(deptnum, deptname) CSE3180 Semester 1, 2005 Week 1 / 41
Relational Database
Data is represented in ROW and COLUMN form (matrix)
(tuple)
(attribute)
Collections of related data ---> TABLES (relations)
1 or more tables
----> DATA BASE
ATTRIBUTES are generally static
ROWS are DYNAMIC and Time-Varying
The number of Attributes = DEGREE of a table
The number of Rows
= CARDINALITY of a table
CSE3180 Semester 1, 2005 Week 1 / 42
Some RDB Considerations
•
•
•
•
•
•
•
Data is held in tables
No order of data in tables - row or attribute
Concept of Foreign Key - Primary Key relationship
Data Typing - including nulls
Query Access - insert, update, delete, retrieval
Indexing on candidate (and Primary) keys
Integrity Constraints
Attribute value ranges
Referential Integrity (Foreign Key - Primary Key)
Entity Integrity
User Defined Integrity
• Set retention constraints
CSE3180 Semester 1, 2005 Week 1 / 43
Some RDB Considerations
• Domain constraints
• User defined ‘Rules’ e.g. quantities and values must not be
negative; pricing rate must not be zero
• Recovery procedures
• No explicit linkages between tables
• Linking or embedding database operations in a procedural
language (Cobol, C ..)
• Databases may be distributed across similar or different
DBMS’s
• Security features
CSE3180 Semester 1, 2005 Week 1 / 44
Data Description Language
Used to describe data at the Database level
Structure: Attributes
Schema : Complete description of the database using DDL
SubSchema : Describes data in the database as it is
‘known’ to individual programs(processes) or users
The segment of logical data record(s) required is
commonly known as a VIEW
CSE3180 Semester 1, 2005 Week 1 / 45
Data Manipulation Language
Language (commands and syntax) used to cause transfers
of data from the Database and the Operating Environment
and vice versa
Variety of Languages - Cobol, C, Java, C++
and SQL as in Access, DB2, dBASEV, Informix, Oracle,
VisualDataBase, SQLServer, MySQL
Windows versions provide Icons and Menu options which
are translated by the DBMS software to Database
manipulation commands
Typical commands: get, put, replace, seek, update,delete,
insert, drop, find, modify
CSE3180 Semester 1, 2005 Week 1 / 46
A Typical Database Model
Users - keyboard direct
Users
Menu
Options
Programs written in
Cobol, C, C++, Pascal
Java etc.
P
r
o
g
r
a
m
I
n
t
e
r
f
a
c
e
D
B
M
S
Database
Database Query
Access Language
CSE3180 Semester 1, 2005 Week 1 / 47
Advantages of Database
• Reduced Data Redundancy
• Data Integrity
• Data Independence
• Data Security
• Data Consistency
• Easier use of Data via DBMS Tools (Query languages,
4GL's)
CSE3180 Semester 1, 2005 Week 1 / 48
Disadvantages of Database
• Complexity
• Expense
• Vulnerability
• Size of - disk storage, processor memory
• Training Costs
• Compatibility
• Technology Lock In
CSE3180 Semester 1, 2005 Week 1 / 49
More on DBMS
Capabilities Required:
– the DBMS must provide a natural interface of user data
– the interface must be independent of any physical
storage structures
– different users with different views must be able to
access the same database
– database changes must be possible without affecting
programs which do not use the changes
(Physical and Logical Independence)
CSE3180 Semester 1, 2005 Week 1 / 50
More on DBMS
Other Requirements : Provision of Operational Facilities to
ensure:
– multi user access control
– remote terminal access
– restrictions on user access
– recovery from system faults
– database distribution
CSE3180 Semester 1, 2005 Week 1 / 51
CSE3180 Semester 1, 2005 Week 1 / 52
Your Project
• Intended to bring together many of the aspects required in
DATABASE DESIGN
• It is NOT a complete systems design
• There is no transaction processing required
• The database MUST contain data
• It must exhibit constraints and other forms of data integrity
CSE3180 Semester 1, 2005 Week 1 / 53
Data Models
Model of
system
in client’s
mind
Entity model of
client’s model
Table model
of entity model
Server
Tables on disk
CSE3180 Semester 1, 2005 Week 1 / 54
Communicating with a RDBMS
Using SQL
SQL statement
is entered
SQL> SELECT loc
FROM dept;
Statement is sent to
database
Database
Data is displayed
LOC
------------CLAYTON
CAULFIELD
PENINSULA
GIPPSLAND
CSE3180 Semester 1, 2005 Week 1 / 55
Your Project
• There are 3 deliverables - normal design practice
– Don’t overlook the User’s interest / Contribution / Costs
• In the assignment, YOU are the user
– You set the ground rules, also known as Business Rules
– You need to document these Rules
– Don’t make the project too onerous i.e. make the design
simple.
– That way you will create a working model in the 13
weeks we have
CSE3180 Semester 1, 2005 Week 1 / 56
Your Project
• Deliverable 1 - The Project Plan
Due 17th March
• Deliverable 2 - User Requirements Statement
Due 21st April
• Deliverable 3 - The working model
Due 26th May
CSE3180 Semester 1, 2005 Week 1 / 57
Your Project
• One of the group members will need to be the Co-ordinator
- do this co-operatively.
• Also, some person in the group will need to address the
question of documentation !!!
• Use of Microsoft Project, Word and possibly Excel (or some
spreadsheet) would be an advantage.
CSE3180 Semester 1, 2005 Week 1 / 58
Your Project
• Deliverable 1 - Project Plan:
– This sets the tempo for the assignment
– Make sure you have the MAJOR components included
– Gannt charts change as events occur - or don’t occur
• Continue with the development of the Project - don’t waste
time in just preparing the Gannt chart
• By Thursday 17th March, you should have some workable
ideas about the rest of the assignment
CSE3180 Semester 1, 2005 Week 1 / 59
Your Project
• One member will need to act as the Reviewer for
– Project plan completeness
– Validity of the Entities
– Validity of the Entity Relationships
– Normalisation / Primary key(s) / Foreign key(s)
– User requirements statement (this includes suggested
report and Query profiles)
CSE3180 Semester 1, 2005 Week 1 / 60
Your Project
Review (continued)
– Data structures
– Constraints
– Test strategy
– Test data
– Testing results
– Operational model
CSE3180 Semester 1, 2005 Week 1 / 61
Your Project
• Make rough outlines of the Entities
• Make rough outlines of the Data Structures - these fill out
the Entities
• Make rough outlines of the Reports, Queries
• Make notes of Rules, and the Constraints you consider will
help the Rules to control data in the database
CSE3180 Semester 1, 2005 Week 1 / 62
Your Project
• Review the E-R diagram critically - it’s best if one person of
the group becomes the ‘Quality Control’.
• Each member of the group MUST know what the assignment
is about and how it is being developed.
• Prepare the ‘final’ technical paper by Thursday 8th May
CSE3180 Semester 1, 2005 Week 1 / 63
Your Project
• You will probably be using MS-Access to develop the
physical database.
Or, you could use Oracle.
• Become familiar with the Table Design feature
– Include Primary Keys, Required Fields, Value Ranges or
Exclusions
– ‘Numbers’ are ??
– Integers (no decimals)
– Numerics with decimals
– Text references only
• Use data types to constrain data
• Use the ‘required’ property
CSE3180 Semester 1, 2005 Week 1 / 64
Your Project
• If you are using lists of controlled data, investigate the
Combo Box or List Box features
• When using Queries, don’t make them complex
• Reports are the standard report function
• Forms (a software development) can be easily developed
using normal function of the database dialog box
• You can make an active form using standard facilities
CSE3180 Semester 1, 2005 Week 1 / 65
Your Project
• There should be no need to delve into the Applications
Development features of Access. However, if you have
some expertise, and time, do so by all means
• Make sure you keep a backup of your database
CSE3180 Semester 1, 2005 Week 1 / 66
Building Evacuation
If the Building Evacuation Alarm System Activates
–
–
–
–
–
–
Collect your belongings
Move out of the room using the Exits
Use the Stairs - NOT the Lifts or Escalators
Follow the directions of FLOOR WARDENS - if present
Move to the Lawn outside K Block (common)
Wait for further instructions (if during the evening use
your discretion)
– TREAT EVERY EVACUATION ALERT AS REAL
CSE3180 Semester 1, 2005 Week 1 / 67
Timetable
• The lecture will be held in B2.15 from 10.00am to midday to
3.00pm on Thursdays
• Tutorials will be held on Thursdays in Laboratory K1.07
from 1.00pm to 3.00pm.
The Laboratories are booked for 2 hours so that if you wish
you can continue with the unit exercises or the project
There are 16 terminals in each room
CSE3180 Semester 1, 2005 Week 1 / 68
That should be enough for today
The unit materials are available at
http:///www.monash.edu.au/courseware/cse3180
CSE3180 Semester 1, 2005 Week 1 / 69