International Computer Institute, Izmir, Turkey Inroduction

Download Report

Transcript International Computer Institute, Izmir, Turkey Inroduction

International Computer Institute, Izmir, Turkey
Lecture-1
Asst.Prof.Dr.İlker Kocabaş
UBİ502 at
http://ube.ege.edu.tr/~ikocabas/teaching/ubi502/index.html
Course Objectives
1. To understand the fundamentals of database systems
 Data models
 Database design
 Normalisation
2. To understand the languages and facilities provided by database
systems
 Query languages including SQL
 Integrity and security
 Transactions
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.2 of 67
Modifications & additions by Cengiz Güngör
With first homework
 Form teams of 2-3 students
 Plan the project with your team
 Review the course material
 Work on the similar problems
 Discuss your project topic
• Database / Table Models / Applicaton
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.3 of 67
Modifications & additions by Cengiz Güngör
Sources of Help
 Principle: you are responsible for your own learning;
the staff are just there to facilitate
1. Your team is your study group
 Help each other, except homeworks.
 But be sure that each team-member understands the material!
 Explaining a concept or technique is a good way to cement it
2. Read the textbook, review the lecture slides
3. See one of the tutors during their office hours
 Details on the course homepage
4. If none of the above work, please post a message to :
 İlker Kocabaş [email protected]
 Sercan Demirci [email protected]
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.4 of 67
Modifications & additions by Cengiz Güngör
Today: Introduction, and the
Entity-Relationship Model
 Introduction
 What is a database management system?
 Why study databases? Why not use file systems?
 The three-level architecture
 Schemas and instances
 Overview
 Data models, E-R model, Relational model
 Data Definition Language, Data Manipulation Language
 SQL
 Transaction Management, Storage Management
 User types, database administrator
 System Structure
 Entity-Relationship Model
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.5 of 67
Modifications & additions by Cengiz Güngör
Introduction
Practice and Theory
 Practice
 Theory
 Tables, columns, rows,
keys
 Relational model:
relations, attributes,
tuples
 SQL
 Relational algebra,
equivalences
 Application structure
 Logical & physical
database design
 Transactions
 Functional
dependencies,
normalization
 Security
 Schedules, serializability
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.7 of 67
Modifications & additions by Cengiz Güngör
What is a Database Management System?
 Collection of data
 Interrelated data
 Relevant to some endeavour
 Software to access the data
 Convenient
 Efficient
 History




1950s-60s: magnetic tape and punched cards
1960s-70s: hard disks, random access, file systems
1970s-80s: relational model becoming competitive
1980s-90s: relational model dominant, object-oriented
databases
 1990s-00s: web databases and XML
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.8 of 67
Modifications & additions by Cengiz Güngör
Why study databases?
 They touch every aspect of our lives
 Applications:









Banking: all transactions
Airlines: reservations, schedules
Universities: registration, course enrolment, grades
Sales: customers, products, purchases
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
Telecommunications: subscribers, usage, routing
Computer accounts: privileges, quotas, usage
Records: climate, stock market, library holdings
 Explosion of unstructured data on the web:
 Large document collections
 Image databases, streaming media
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.9 of 67
Modifications & additions by Cengiz Güngör
Why not use file systems?
 Data redundancy and inconsistency
 Multiple file formats
 Duplication of information in different files
 Difficulty in accessing data
 Need to write a new program to carry out each new task
 Data isolation
 Multiple files and formats
 Integrity problems
 Integrity constraints (e.g. account balance > 0) become part
of program code
 Hard to add new constraints or change existing ones
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.10 of 67
Modifications & additions by Cengiz Güngör
Why not use file systems? (cont)
 Maintenance problems
 When we add a new field, all existing applications must be
modified to ignore it
 Atomicity of updates
 Failures may leave database in an inconsistent state with partial
updates carried out
 E.g. transfer of funds from one account to another should either
complete or not happen at all
 Concurrent access by multiple users
 Concurrent accessed needed for performance
 Uncontrolled concurrent accesses can lead to inconsistencies
• E.g. two people reading a balance and updating it at the same time
 Security problems
 Database systems offer solutions to all the above problems
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.11 of 67
Modifications & additions by Cengiz Güngör
The three-level architecture
 Physical level: how a record is stored on disk
 Logical level: describes data stored in database, and the
relationships among the data.
type customer = record
name : string;
street : string;
city : integer;
end;
 View level: application-specific selections and arrangements of
the data
 hide details of data types
 Views can also hide information for security reasons
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.12 of 67
Modifications & additions by Cengiz Güngör
The three-level architecture (cont)
An architecture for a database system
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.13 of 67
Modifications & additions by Cengiz Güngör
Schemas vs Instances
 Schema
 the logical structure of the database
 e.g., the database consists of information about a set of customers and
accounts and the relationship between them
 Analogous to type information of a variable in a program
 Instance
 the actual content of the database at a particular point in time
 Analogous to the value of a variable
 Physical Data Independence
 the ability to modify the physical schema without changing the logical
schema
 Applications depend on the logical schema
 Database engines take care of efficient storage and query processing
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.14 of 67
Modifications & additions by Cengiz Güngör
Overview
Data Models
 A collection of tools for describing
 data
 data relationships
 data semantics
 data constraints
 Entity-Relationship model
 Relational model
 Other models:
 object-oriented model
 semi-structured data models (XML)
 Older models: network model and hierarchical model
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.16 of 67
Modifications & additions by Cengiz Güngör
Entity-Relationship Model
Example of schema in the entity-relationship model
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.17 of 67
Modifications & additions by Cengiz Güngör
Relational Model
Attributes
 Example of tabular data in the relational model
Customerid
customername
192-83-7465
Johnson
019-28-3746
Smith
192-83-7465
Johnson
321-12-3123
Jones
019-28-3746
Smith
customerstreet
accountnumber
Alma
Palo Alto
A-101
North
Rye
A-215
Alma
Palo Alto
A-201
Main
Harrison
A-217
North
Rye
A-201
UBI 502
Database Management Systems
customercity
©Silberschatz, Korth and Sudarshan
1.18 of 67
Modifications & additions by Cengiz Güngör
A Sample Relational Database
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.19 of 67
Modifications & additions by Cengiz Güngör
Data Definition Language (DDL)
 Specification notation for defining the database schema
 E.g.
create table account (
account-number char(10),
balance
integer)
 DDL compiler generates a set of tables stored in a data
dictionary:
 Database schema
 Specification of storage structures and access methods
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.20 of 67
Modifications & additions by Cengiz Güngör
Data Manipulation Language (DML)
 Language for accessing and manipulating the data organized by
the appropriate data model
 DML also known as query language
 Two classes of languages
 Procedural – user specifies what data is required and how to
get those data
 Nonprocedural – user specifies what data is required without
specifying how to get those data
 SQL is the most widely used query language
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.21 of 67
Modifications & additions by Cengiz Güngör
SQL
 SQL: widely used non-procedural language
 E.g. find the name of the customer with customer-id 192-83-7465
select customer.customer-name
from customer
where customer.customer-id = ‘192-83-7465’
 E.g. find the balances of all accounts held by the customer with
customer-id 192-83-7465
select account.balance
from depositor, account
where depositor.customer-id = ‘192-83-7465’ and
depositor.account-number = account.accountnumber
 Application programs generally access databases through:
 Language extensions to allow embedded SQL (e.g. PHP)
 Application program interface (e.g. ODBC) which allow SQL queries
to be sent to a database
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.22 of 67
Modifications & additions by Cengiz Güngör
Database Users
 Users are differentiated by the way they expect to interact with
the system
 Application programmers – interact with system through DML
calls
 Sophisticated users – form requests in a database query
language
 Specialized users – write specialized database applications that
do not fit into the traditional data processing framework
 Naïve users – invoke one of the permanent application programs
that have been written previously
 E.g. people accessing database over the web, bank tellers,
clerical staff
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.23 of 67
Modifications & additions by Cengiz Güngör
Database Administrator
 Coordinates all the activities of the database system; the
database administrator has a good understanding of the
enterprise’s information resources and needs.
 Database administrator's duties include:
 Schema definition
 Storage structure and access method definition
 Schema and physical organization modification
 Granting user authority to access the database
 Specifying integrity constraints
 Acting as liaison with users
 Monitoring performance and responding to changes in
requirements
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.24 of 67
Modifications & additions by Cengiz Güngör
Transaction Management
 A transaction is a collection of operations that performs a single
logical function in a database application
 E.g. transfer funds from one account to another
 Transaction-management component ensures that the database
remains in a consistent state despite system failures
 Concurrency-control manager controls the interaction among the
concurrent transactions, to ensure the consistency of the
database.
 E.g. simultaneous withdrawals
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.25 of 67
Modifications & additions by Cengiz Güngör
Storage Management
 Storage manager is a program module that provides the
interface between the low-level data stored in the database and
the application programs and queries submitted to the system.
 The storage manager is responsible to the following tasks:
 interaction with the file manager
 efficient storing, retrieving and updating of data
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.26 of 67
Modifications & additions by Cengiz Güngör
Overall System Structure
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.27 of 67
Modifications & additions by Cengiz Güngör
Application Architectures
 Two-level architecture: E.g. client programs using ODBC/JDBC to
communicate with a database
 Three-level architecture: E.g. web-based applications, and
applications built using “middleware”
 Your projects will use the three-tier architecture
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.28 of 67
Modifications & additions by Cengiz Güngör
Entity-Relationship Model
Entity-Relationship Model
 Entity Sets
 Relationship Sets
 Design Issues
 Mapping Constraints
 Keys
 E-R Diagram
 Extended E-R Features
 Design of an E-R Database Schema
 Reduction of an E-R Schema to Tables
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.30 of 67
Modifications & additions by Cengiz Güngör
Entity Sets
 A database can be modeled as:
 a collection of entities,
 relationship among entities.
 An entity is an object that exists and is distinguishable from other
objects.
 E.g. specific person, company, event, plant
 Entities have attributes
 E.g: people have names and addresses
 An entity set is a set of entities of the same type that share the
same properties.
 Example: set of all persons, companies, courses, books
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.31 of 67
Modifications & additions by Cengiz Güngör
Entity Sets customer and loan
customer-id customer- customer- customername street
city
UBI 502
Database Management Systems
loan- amount
number
©Silberschatz, Korth and Sudarshan
1.32 of 67
Modifications & additions by Cengiz Güngör
Attributes
 An entity is represented by a set of attributes, that is descriptive
properties possessed by all members of an entity set.
Example:
customer = (customer-id, customer-name,
customer-street, customer-city)
loan = (loan-number, amount)
 Domain – the set of permitted values for each attribute
 Attribute types:
 Simple and composite attributes.
 Single-valued and multi-valued attributes
• E.g. multivalued attribute: phone-numbers
 Derived attributes
• Can be computed from other attributes
• E.g. age, given date of birth
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.33 of 67
Modifications & additions by Cengiz Güngör
Composite Attributes
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.34 of 67
Modifications & additions by Cengiz Güngör
Relationship Sets
 A relationship is an association among several entities
Example:
Hayes
customer entity
depositor
relationship set
A-102
account entity
 A relationship set is a relation over n  2 entity sets Ei :
{(e1, e2, … en) | ei  Ei}
Example:
(Hayes, A-102)  depositor
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.35 of 67
Modifications & additions by Cengiz Güngör
Relationship Set borrower
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.36 of 67
Modifications & additions by Cengiz Güngör
Relationship Sets (Cont.)
 An attribute can also be property of a relationship set.
 For instance, the depositor relationship set between entity sets
customer and account may have the attribute access-date
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.37 of 67
Modifications & additions by Cengiz Güngör
Degree of a Relationship Set
 The number of entity sets that participate in a relationship set
 Relationship sets that involve two entity sets are binary (or degree
two). Generally, most relationship sets in a database system are
binary.
 Relationship sets may involve more than two entity sets.
E.g. Suppose employees of a bank may have jobs
(responsibilities) at multiple branches, with different jobs at
different branches. Then there is a ternary relationship set
between entity sets employee, job and branch
 Relationships between more than two entity sets are rare. Most
relationships are binary. (More on this later.)
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.38 of 67
Modifications & additions by Cengiz Güngör
Mapping Cardinalities
 Express the number of entities to which another entity can be
associated via a relationship set.
 Most useful in describing binary relationship sets.
 For a binary relationship set the mapping cardinality must be
one of the following types:
 One to one
 One to many
 Many to one
 Many to many
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.39 of 67
Modifications & additions by Cengiz Güngör
Mapping Cardinalities
One to one
One to many
Note: Some elements in A and B may not be mapped to any
elements in the other set
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.40 of 67
Modifications & additions by Cengiz Güngör
Mapping Cardinalities
Many to one
Many to many
Note: Some elements in A and B may not be mapped to any
elements in the other set
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.41 of 67
Modifications & additions by Cengiz Güngör
Mapping Cardinalities affect ER Design
 Can make access-date an attribute of account, instead of a
relationship attribute, if each account can have only one customer
 I.e., the relationship from account to customer is many to one,
or equivalently, customer to account is one to many
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.42 of 67
Modifications & additions by Cengiz Güngör
E-R Diagrams
 Rectangles represent entity sets.
 Diamonds represent relationship sets.
 Lines link attributes to entity sets and entity sets to relationship sets.
 Ellipses represent attributes
 Double ellipses represent multivalued attributes.
 Dashed ellipses denote derived attributes.
 Underline indicates primary key attributes (will study later)
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.43 of 67
Modifications & additions by Cengiz Güngör
E-R Diagram With Composite, Multivalued, and
Derived Attributes
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.44 of 67
Modifications & additions by Cengiz Güngör
Relationship Sets with Attributes
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.45 of 67
Modifications & additions by Cengiz Güngör
Roles
 Entity sets of a relationship need not be distinct
 The labels “manager” and “worker” are called roles; they specify how
employee entities interact via the works-for relationship set.
 Roles are indicated in E-R diagrams by labelling the lines that connect
diamonds to rectangles.
 Role labels are optional, and are used to clarify semantics of the
relationship
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.46 of 67
Modifications & additions by Cengiz Güngör
Cardinality Constraints
 We express cardinality constraints by drawing either a directed
line (), signifying “one,” or an undirected line (—), signifying
“many,” between the relationship set and the entity set.
 E.g.: One-to-one relationship:
 A customer is associated with at most one loan via the
relationship borrower
 A loan is associated with at most one customer via borrower
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.47 of 67
Modifications & additions by Cengiz Güngör
One-To-Many Relationship
 In the one-to-many relationship a loan is associated with at most
one customer via borrower, a customer is associated with
several (including 0) loans via borrower
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.48 of 67
Modifications & additions by Cengiz Güngör
Many-To-One Relationships
 In a many-to-one relationship a loan is associated with several
(including 0) customers via borrower, a customer is associated
with at most one loan via borrower
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.49 of 67
Modifications & additions by Cengiz Güngör
Many-To-Many Relationship
 A customer is associated with several (possibly 0) loans
via borrower
 A loan is associated with several (possibly 0) customers
via borrower
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.50 of 67
Modifications & additions by Cengiz Güngör
Participation of an Entity Set in a
Relationship Set
 Total participation (indicated by double line): every entity in the entity
set participates in at least one relationship in the relationship set
 E.g. participation of loan in borrower is total
 every loan must have a customer associated to it via borrower
 Partial participation: some entities may not participate in any
relationship in the relationship set
 E.g. participation of customer in borrower is partial
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.51 of 67
Modifications & additions by Cengiz Güngör
Alternative Notation for Cardinality
Limits
 Cardinality limits can also express participation constraints
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.52 of 67
Modifications & additions by Cengiz Güngör
Keys
 A super key of an entity set is a set of one or more attributes
whose values uniquely determine each entity.
 A candidate key of an entity set is a minimal super key
 Customer-id is candidate key of customer
 account-number is candidate key of account
 Although several candidate keys may exist, one of the
candidate keys is selected to be the primary key.
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.53 of 67
Modifications & additions by Cengiz Güngör
Keys for Relationship Sets
 The combination of primary keys of the participating entity sets
forms a super key of a relationship set.
 (customer-id, account-number) is the super key of depositor
 NOTE: this means a pair of entity sets can have at most one
relationship in a particular relationship set.
• E.g. if we wish to track all access-dates to each account by each
customer, we cannot assume a relationship for each access.
We can use a multivalued attribute though
 Must consider the mapping cardinality of the relationship set
when deciding the what are the candidate keys
 Need to consider semantics of relationship set in selecting the
primary key in case of more than one candidate key
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.54 of 67
Modifications & additions by Cengiz Güngör
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.55 of 67
Modifications & additions by Cengiz Güngör
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.56 of 67
Modifications & additions by Cengiz Güngör
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.57 of 67
Modifications & additions by Cengiz Güngör
Not NULL
UBI 502
Database Management Systems
Non-negative
©Silberschatz, Korth and Sudarshan
1.58 of 67
Modifications & additions by Cengiz Güngör
E-R Diagram with a Ternary Relationship
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.59 of 67
Modifications & additions by Cengiz Güngör
Cardinality Constraints on Ternary
Relationship
 We allow at most one arrow out of a ternary (or greater degree)
relationship to indicate a cardinality constraint
 E.g. an arrow from works-on to job indicates each employee works
on at most one job at any branch.
 If there is more than one arrow, there are two ways of defining the
meaning.
 E.g a ternary relationship R between A, B and C with arrows to B
and C could mean
 1. each A entity is associated with a unique entity from B and C or
 2. each pair of entities from (A, B) is associated with a unique C
entity, and each pair (A, C) is associated with a unique B
 Each alternative has been used in different formalisms
 To avoid confusion we outlaw more than one arrow
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.60 of 67
Modifications & additions by Cengiz Güngör
Binary vs Non-Binary Relationships
 Some relationships that appear to be non-binary may be better
represented using binary relationships
 E.g. A ternary relationship parents, relating a child to his/her
father and mother, is best replaced by two binary relationships,
father and mother
• Using two binary relationships allows partial information (e.g. only
mother being know)
 But there are some relationships that are naturally non-binary
• E.g. works-on
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.61 of 67
Modifications & additions by Cengiz Güngör
Converting Non-Binary Relationships to
Binary Form
 In general, any non-binary relationship can be represented using binary
relationships by creating an artificial entity set.
 Replace R between entity sets A, B and C by an entity set E, and three
relationship sets:
1. RA, relating E and A
3. RC, relating E and C
2.RB, relating E and B
 Create a special identifying attribute for E
 Add any attributes of R to E
 For each relationship (ai , bi , ci) in R, create
1. a new entity ei in the entity set E
3. add (ei , bi ) to RB
UBI 502
Database Management Systems
2. add (ei , ai ) to RA
4. add (ei , ci ) to RC
©Silberschatz, Korth and Sudarshan
1.62 of 67
Modifications & additions by Cengiz Güngör
Converting Non-Binary Relationships
(Cont.)
 Also need to translate constraints
 Translating all constraints may not be possible
 There may be instances in the translated schema that
cannot correspond to any instance of R
• Exercise: add constraints to the relationships RA, RB and RC to
ensure that a newly created entity corresponds to exactly one entity
in each of entity sets A, B and C
 We can avoid creating an identifying attribute by making E a weak
entity set (described shortly) identified by the three relationship
sets
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.63 of 67
Modifications & additions by Cengiz Güngör
Design Issues
 Use of entity sets vs. attributes
Choice mainly depends on the structure of the enterprise being
modeled, and on the semantics associated with the attribute in
question.
 Use of entity sets vs. relationship sets
Possible guideline is to designate a relationship set to describe an
action that occurs between entities
 Binary versus n-ary relationship sets
Although it is possible to replace any nonbinary (n-ary, for n > 2)
relationship set by a number of distinct binary relationship sets, a nary relationship set shows more clearly that several entities
participate in a single relationship.
 Placement of relationship attributes
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.64 of 67
Modifications & additions by Cengiz Güngör
Summary of Symbols Used in E-R
Notation
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.65 of 67
Modifications & additions by Cengiz Güngör
Summary of Symbols (Cont.)
UBI 502
Database Management Systems
©Silberschatz, Korth and Sudarshan
1.66 of 67
Modifications & additions by Cengiz Güngör