Lecture Note 1

Download Report

Transcript Lecture Note 1

ITEC 3220M
Using and Designing Database Systems
Instructor: Prof. Z.Yang
Course Website:
http://people.math.yorku.ca/~zyang/itec3
220m.htm
Office: TEL 3049
Course Objective
• Examine databases, trends in database
management systems and their application in a
wide range of organizational areas
• Provide an overview of database processing, both
historical and discussion of recent trends in
database management
• Provide the student with exposure to a range of
tools, including a relational DBMS as well as an
object-oriented DBMS
2
Textbooks
•A bundle consisting of
– Database Systems Design, Implementation, &
Management, Seventh Edition - Peter Rob &
Carlos Coronel
– A Guide to SQL, Seventh Edition - Philip J. Pratt
3
Marking Scheme
• Final exam (closed book) - 50%
Midterm (closed book) - 35%
Assignments (3 assignments) - 15%
• Lecture notes will be made available at:
http://people.yorku.ca/~zyang/itec3220m.
htm
4
Schedule
• Week 1 Database concepts and the relational
database model (Chapter 1, 2 & 3)
• Week 2 Entity relationship model (Chapter 4)
• Week 3 Normalization (Chapter 5)
• Week 4 SQL (Chapter 7 & A guide to SQL)
• Week 5 SQL + lab (Chapter 7 & A guide to SQL)
• Week 6 Advanced SQL + lab (Chapter 8 & A guide to
SQL)
5
Schedule (Cont’d)
• Week 7 Midterm
• Week 8 Database design & case study (Chapter 9)
• Week 9 Transaction management and concurrent
control (Chapter 10)
• Week 10 Transaction management and concurrent
control (Cont’d) and data warehousing (chapter 10
&13)
• Week 11 Objected-Oriented database (Appendix G)
• Week 12 TBA and review for final exam
6
Introduction
Database Systems and
Data Models
Basic Definition
• Data: raw facts
– Constitute building blocks of information
• Information: is produced by processing data and
reveals meaning of data
– Good, timely, relevant information key to decision
making
– Good decision making key to organizational survival
• Database: shared, integrated computer structure
housing:
– End user data
– Metadata
8
An Example
• Converting data to information
9
An Example (Cont’d)
• Metadata
10
What is a Database Management System
(DBMS)
• A collection of programs that manages the
database structure and controls access to the
data stored in the database
– Possible to share data among multiple
applications or users
• Example: bank and its ATM machines
– Makes data management more efficient and
effective
• End users have better access to more and bettermanaged data
11
DBMS Manages Interaction
12
File and File System
• Terminology
– Data
• Raw Facts
– Field
• Group of characters with specific meaning
– Record
• Logically connected fields that describe a person,
place, or thing
– File
• Collection of related records
13
Example
14
Disadvantages of File Processing
•
Data Dependence
– Change in file’s data characteristics
requires modification of data access
programs
– Lengthy development time
– Excessive program maintenance
•
Structural Dependence
– Change in file structure requires
modification of related programs
15
Example
16
Disadvantages of File Processing
(Cont’d)
•
Data Redundancy
–
–
Different and conflicting versions of same data
Results of uncontrolled data redundancy
• Data anomalies
– Modification
– Insertion
– Deletion
• Data inconsistency
– Lack of data integrity
17
Solution: Database Approach
• Database consists of logically related data
stored in a single repository
• Advantages of database approach
– Structural and data independence
– Minimal data redundancy
•Reduces inconsistency, data anomalies
– Improves data sharing and data quality
– Stores data structures, relationships, and
access paths
18
Database vs. File Systems
19
Database System Environment
• Hardware: all the system's physical
devices
• Software
– Operating system software
– DBMS software
– Application programs and utility software
• People
• Procedures
• Data
20
Database Models
• Collection of logical constructs used to
represent data structure and
relationships within the database
– Conceptual models: logical nature of data
representation
– Implementation models: emphasis on how
the data are represented in the database
21
Database Models: Historic
Overview
• Flat files - 1960s - 1980s
• Hierarchical – 1970s - 1990s
• Network – 1970s - 1990s
• Relational – 1980s - present
• Object-oriented – 1990s - present
• Object-relational – 1990s - present
• Data warehousing – 1980s - present
• Web-enabled – 1990s - present
22
Hierarchical Database Model
• Logically represented by an upside down tree
– Each parent can have many children
– Each child has only one parent
23
Hierarchical Database Model (Cont’d)
• Advantages
–
–
–
–
Conceptual simplicity
Database security and integrity
Data independence
Efficiency
• Disadvantages
–
–
–
–
–
Complex implementation
Difficult to manage and lack of standards
Lacks structural independence
Application programming and use complexity
Implementation limitations
24
Network Database Model
• Each record can have multiple parents
– Composed of sets
– Each set has owner record and member record
– Member may have several owners
25
Network Database Model (Cont’d)
• Advantages
–
–
–
–
–
–
Conceptual simplicity
Handles more relationship types
Data access flexibility
Promotes database integrity
Data independence
Conformance to standards
• Disadvantages
– System complexity
– Lack of structural independence
26
Relational Database Model
• Perceived by user as a collection of tables for
data storage
• Tables are a series of row/column intersections
• Tables related by sharing common entity
characteristic(s)
27
Relational Database Model
(Cont’d)
28
Relational Database Model
(Cont’d)
• Schema for the table
– Graphical representation
AGENT
AGENT_C
ODE
AGENT_LN AGENT_FN
AME
AME
AGENT_INI AGENT_AREA AGENT_PH
TIAL
CODE
ONE
– Text description
AGENT(AGENT_CODE, AGENT_LNAME,
AGENT_FNAME, AGENT_INITIAL,
AGENT_AREACODE, AGETN_PHONE)
29
Relational Database Model
(Cont’d)
• Advantages
– Structural independence
– Improved conceptual simplicity
– Easier database design, implementation, management,
and use
– Ad hoc query capability with SQL
– Powerful database management system
• Disadvantages
– Substantial hardware and system software overhead
– Poor design and implementation is made easy
– May promote “islands of information” problems
30
Object-Oriented Database Model
• Objects or abstractions of real-world
entities are stored
– Attributes describe properties
– Collection of similar objects is a class
•Methods represent real world actions of classes
•Classes are organized in a class hierarchy
– Inheritance is ability of object to inherit
attributes and methods of classes above it
31
OO Data Model
• Advantages
–
–
–
–
Adds semantic content
Visual presentation includes semantic content
Database integrity
Both structural and data independence
• Disadvantages
–
–
–
–
Lack of OODM
Complex navigational data access
Steep learning curve
High system overhead slows transactions
32
Costs and Risks of the
Database Approach
• Up-front costs:
– Installation Management Cost and
Complexity
– Conversion Costs
• Ongoing Costs
– Requires New, Specialized Personnel
– Need for Explicit Backup and Recovery
• Organizational Conflict
33
Review
• Basic concepts: data, information,
database, DBMS, file, conceptual model,
implementation model, etc
• Why database and its importance, cost
and risk
• Different database models
– definition
– advantage
– disadvantage
34
Chapter 3
The Relational Database Model
In this chapter, you will learn:
• Basic components of the relational
database model
– Entities and their attributes
– Relationships among entities
• Relational algebra
• Relationship in relational database
• Data redundancy
36
Basic Definition
• Entities and Attributes
– Entity is a person, place, event, or thing
about which data is collected
– Attributes are characteristics of the entity
• Tables
– Holds related entities or entity set
– Also called relations
– Comprised of rows and columns
37
Table Characteristics
•
•
•
•
•
•
•
•
Two-dimensional structure with rows and columns
Rows (tuples) represent single entity
Columns represent attributes
Row/column intersection represents single value
Tables must have an attribute to uniquely identify each row
Column values all have same data format
Each column has range of values called attribute domain
Order of the rows and columns is immaterial to the DBMS
38
Example Tables
39
Terminology for Relational
Database
Table-Oriented
Set-oriented
Record-Oriented
Table
Relation
Record type
Row
Tuple
Record
Column
Attribute
Field
40
Key
• Consists of one or more attributes that
determine other attributes
• Primary key (PK) is an attribute (or a
combination of attributes) that uniquely
identifies any given entity (row)
• Key’s role is based on determination
– If you know the value of attribute A, you can look up
(determine) the value of attribute B
41
Keys (Cont’d)
• Composite key
– Composed of more than one attribute
• Key attribute
– Any attribute that is part of a key
• Superkey
– Any key that uniquely identifies each entity
• Candidate key
– A superkey without redundancies
42
Keys (Cont’d)
• Foreign key (FK)
– An attribute whose values match primary key
values in the related table
• Referential integrity
– FK contains a value that refers to an existing valid
tuple (row) in another relation
• Secondary key
– Key used strictly for data retrieval purposes
43
Simple Relational Database
44
Controlled Redundancy
• Makes the relational database work
• Tables within the database share common
attributes that enable us to link tables
together
• Multiple occurrences of values in a table are
not redundant when they are required to make
the relationship work
• Redundancy is unnecessary duplication of
data
45
Integrity Rules
46
Integrity Rules (cont’d)
47
Exercises
Table name: TRUCK
Table name: BASE
Table name:
TYPE
48
Exercises (Cont’d)
• For each table, identify the primary key and
the foreign keys.
• Do the tables exhibit entity integrity? Explain
• So the tables exhibit referential integrity?
Explain
• Identify the TRUCK table’s candidate key (s).
• For each table, identify a super key and a
secondary key
49