IMS1907 Database Systems - Monash University, Victoria, School of

Download Report

Transcript IMS1907 Database Systems - Monash University, Victoria, School of

IMS1907 Database Systems
Summer Semester 2004/2005
Lecture 13.2
Unit Review
Basic Concepts
Central concepts for understanding database systems
– Database
– Data
– Information
– Data vs Information
– Metadata
– DBMS
Monash University 2004
2
Basic Concepts
File processing systems vs DBMS
– data sharing
– speed of access and retrieval
– security
– integrity, quality, consistency of data
– data independence
– maintenance, productivity
– multiple users, complex data
– backup and recovery
Monash University 2004
3
Basic Concepts
The database system environment
– DBMS
– database
– metadata (repository)
– application software
– CASE tools
– user interfaces
– users, developers, administrators
Monash University 2004
4
Types of Database Systems
Personal Databases
Workgroup Databases
Department Databases
Enterprise Databases
Internet, Intranet, and Extranet Databases
Data warehouses
Monash University 2004
5
Costs and Risks of Database Systems
Need for new, specialised personnel
Installation cost and complexity
Maintenance cost and complexity
Conversion costs from legacy systems
Critical need for explicit backup and recovery
Organisational conflict and change
Monash University 2004
6
Relational DBMS Software
Relational database systems organise the database as
groups of related tables
– table or relation
– record
– field
– primary key
– secondary key
– foreign key
– table structure
Forms, reports, queries
Monash University 2004
7
Database System Development
Database development requires a focus on the information
needs of a business
Information Engineering (IE) is a popular, data-oriented
methodology used to develop database systems
– data are modelled in the organisational context, not in the
usage, processing or technology context
– business context changes slowly  stable databases
– top-down planning
Monash University 2004
8
Database System Development
Database systems planning
– the three steps in the IE Planning phase
• identify strategic planning factors
• identify corporate planning objects
• develop an enterprise model
Enterprise data model
– needed for top-down plans and bottom-up requests
– organisation-wide perspective
Monash University 2004
9
Database Development and the SDLC
Initiation
Enterprise modelling
Conceptual data modelling
Analysis
Logical database design
Design
Implementation
Review
Maintenance
Physical database design
and definition
Database implementation
Database review
Database maintenance
Monash University 2004
10
Database Development and the SDLC
Enterprise modelling
– the organisational perspective
Conceptual data modelling
– scope identification, ER modelling
Logical database design
– transform conceptual model into logical data model
– start to specify logic for maintaining and querying
database
– populate repository
Monash University 2004
11
Database Development and the SDLC
Physical database design and definition
– define database for specific DBMS used
– organisation of data, database processing programs
Database implementation
– install database and processing programs
– develop procedures, load data, turn on!
Database maintenance
– tune and fix the database, keep it running and evolving
Packaged data models – universal, industry specific
Monash University 2004
12
ER Modelling
Data modelling
Business rules
ER modelling
– entities or ‘things of interest’
– relationships
– properties or attributes
– rules and constraints affecting integrity of entities
Monash University 2004
13
ER Modelling
Entities
– strong, weak
Relationships
Associative entities
Attributes
– multi-valued, derived, composite
Degree
– unary, binary, ternary, n-ary
Monash University 2004
14
ER Modelling
Cardinality
– one-to-one, one-to-many, many-to-many
Cardinality constraints
– optional, mandatory
Time dependent data
Entity types and sub-types
ER quality issues
Monash University 2004
15
Relational Database Theory
Detailed data modelling
Relational database theory
– considers data structure, manipulation, integrity
Relation
Primary key
Composite key
Foreign key
Integrity constraints
– domain constraints, entity integrity, referential integrity
Monash University 2004
16
Relational Database Theory
A well-structured relation
– is robust, stable and flexible
– contains a minimum amount of redundancy
– allows users to insert, modify, and delete rows in a table
without errors or inconsistencies
Three types of anomaly are possible
- insertion
- deletion
- modification
Monash University 2004
17
Normalisation
Representing entities and relationships as relations
Normalisation is a process for converting complex data
structures into simple, stable data structures in the form of
relations
Functional dependency
Accomplished in stages, each of which corresponds to a
“normal form”
Monash University 2004
18
Normalisation
First normal form (1NF)
– identify PK, identify and remove repeating groups
Second normal form (2NF)
– remove partial dependencies
Third normal form (3NF)
– remove transitive dependencies
Merging relations
Data structure diagrams (DSD)
Monash University 2004
19
Structured Query Language (SQL)
Has become de facto language for creating and querying
relational databases
Benefits and disadvantages of SQL
The SQL environment
– catalog
– schema
– data definition language (DDL)
– data manipulation language (DML)
– data control language (DCL)
– data types
Monash University 2004
20
Structured Query Language (SQL)
DDL
– CREATE statements
• database, table, view, ….
• assigning constraints
– DROP statements
• database, table, view, ….
– ALTER statements
• database, table, view, column, ….
Monash University 2004
21
Structured Query Language (SQL)
DML
– INSERT, LOAD DATA statements to populate tables
– SHOW, DESCRIBE statements to view structures
– retrieving data – queries
• SELECT …. FROM …. WHERE ….
– aggregate operators
• COUNT, SUM, AVG, MIN, MAX, DISTINCT
• GROUP BY
• ordering query results with ORDER BY
Monash University 2004
22
Structured Query Language (SQL)
DML
– matching patterns with LIKE
– joining tables
– sub-queries
– outer joins using LEFT JOIN
– query format
– How joins are processed
• Cartesian product
Monash University 2004
23
Database Systems Architecture
Views
Schema
ANSI/SPARC three-schema architecture standard
– external schema
• user views
– conceptual schema
• single, coherent definition of enterprise data
– internal schema
• physical storage structures
Monash University 2004
24
Database Systems Architecture
Data independence
– logical
– physical
Network architecture
– client–server tiered architecture
– distributed databases
Monash University 2004
25
Database Systems Performance Issues
The ultimate measures of database performance are
– response time to queries
– the speed of updates
We also need to consider
– data accessibility, security, integrity
– usability
– recoverability
Physical database design translates conceptual and
external schemas into physical designs aimed at storing
data in a way that provides adequate performance
Monash University 2004
26
Physical Database Design
Guided by the nature of the data and its intended use
Tuning the database is often performed during operation but
good performance starts with a strong physical design
Critical decisions during physical design
– choice of storage format – data type
– grouping of attributes into physical records
– arranging similarly structured records in secondary
memory
– indexes, clusters, architectures
– strategies for query handling based on indexes, records
Monash University 2004
27
Physical Database Design
Data volume and usage analysis – workloads
Choice of data types
Designing physical records
– page size, blocking factor
Denormalisation
– combining attributes into a single table
– partitioning a table into several physical records
Physical file organisation
– sequential, indexed, hashed
Clusters, indexes
Improving file access - RAID
Monash University 2004
28
Database Systems Performance
Choosing an appropriate database architectures
– hierarchical database model
– network database model
– relational database model
– object-oriented database model
– multidimensional database model
Optimising query performance
– good query design
Monash University 2004
29
Information Resource Management
Data is viewed as a corporate asset
As with any asset, management is essential to exploit the
resource to the maximum benefit
Effective management of data provides support for
operations and decision making at all organisational levels
The roles of data administration and database
administration have evolved to meet the complex task of
– achieving effective management of data resources
– leveraging those resources to the greatest advantage
Monash University 2004
30
Information Resource Management
There are three major roles in information resource
management
– data administration
• planning, analysis
– database administration
• physical design and operational use
– application development
• systems design and implementation
Monash University 2004
31
Information Resource Management
Ineffective data administration leads to poor data utilisation
New technologies and trends are driving the evolution of the
roles of data administrator and database administrator
Roles of the
– data administrator
– database administrator
Evolving roles of the DA and DBA
Monash University 2004
32
Final Exam
3 hour exam, 10 minute reading time
Ten questions
– 1 question consisting of ten short answer questions
(10 x 1 mark)
– 6 short to medium length questions (1 x 5 marks, 6 x
10 marks)
– ER modelling (10 marks)
– normalisation (15 marks)
– Attempt all questions!
Monash University 2004
33
Exam Strategy
Know the date, time and location of your exam – it’s your
responsibility!
Know your seat number
Make sure you have your student ID card with you
Get to the exam early
Ensure you have adequate writing materials with you
No text books or notes allowed
Relax – there’s really not much to worry about
– whatever you have to do (within reason) to help you
relax is ok
Monash University 2004
34
Exam Strategy
3 hour, 10 minute reading time, 100 marks
– you’ve got 180 minutes to earn 100 marks!
Convert marks to minutes  1.8 minutes/mark
Calculate time available for each question
It is a guide to the amount of effort I expect you to spend on
each question
Once the available time for a question is up, stop writing!
If you finish a question within the available time, return to
any incomplete answers
Make sure you understand the questions!
Monash University 2004
35
Study Strategy
Give yourself sufficient time for revision
– don’t wait till the day before the exam to start studying
Study all topics covered in lectures
Re-read lecture notes, your notes, text books, tutorial notes
Do all exercises especially revision exercises
Attempt previous exams
Consult tutors or lecturer before exam
Get plenty of sleep, drink lots of water, eat green
vegetables
– it’s not quite time to party yet!
Monash University 2004
36