database fundamentals

Download Report

Transcript database fundamentals

DATABASE
FUNDAMENTALS
OBJECTIVES
•
•
•
•
•
•
•
BASIC CONCEPTS
ARCHITECTURE OF DBMS
RDBMS
SQL
DATABASE DESIGN
DATA PROTECTION
CONCURRENCY
BASIC CONCEPTS
•
•
•
•
•
DATA
ROLE OF DATA IN BUSINESS
DATA VS INFORMATION
DATABASE
DATABASE MANAGEMENT
USING FILE CONCEPT
• WELL EVOLVED DBMS
Before Databases:
– Each application suite had independent master files.
– Duplication of data could lead to inconsistencies
– Common master files had integrity and security
problems.
Data structuring techniques to exploit random
access disks made data manipulation techniques
complicated.
– subroutines (a step towards DBMS) gave general
routines to manipulate data.
– To use subroutines required low-level data knowledge.
Database Approach
•Information in the database is subdivided into two concepts:
–Schema
–Data
• Schema is the concept of how information relates to other
pieces of information, and how information should be
grouped
• Data is the concept of the actual information users want to
store in the database. You can only store data in structures
which the schema provides, so we must ensure that the
schema is correct.
BENEFITS OF
DBMS – APPORACH
•
•
•
•
•
•
REDUCED REDUNDANCY
REDUCED INCONSISTENCY
DATA-SHARING
ENFORCEMENT OF STANDARDS
SECURITY RESTRICTIONS
DATA-INTEGRITY
DBMS KEEPS THESE FOLKS
HAPPY!!
• END USERS
• APPLICATION
PROGRAMMERS
• DBAs
Database Architecture
•
•
•
DBMSs do not all confirm to the same
architecture.
The three-level architecture forms the basis of
modern database architectures
The architecture for DBMSs is divided into three
general levels:
–
–
–
1.external
2.conceptual
3.internal
1.the external level :
concerned with the way individual users see the
data
2.the conceptual level :
can be regarded as a community user view -a
formal description of data of interest to the
organisation, independent of any storage
considerations.
3.the internal level :
concerned with the way in which the data is
actually stored
CLIENT SERVER
ARCHITECTURE
MODELS
–HIERARCHICAL
–NETWORK
–RDBMS
–ODBMS
–ORDBMS
RELATIONAL DATABASE
MANAGEMENT SYSTEMS
ORIGIN IN MATHEMATICS (SET THEORY)
RELATION (table)
TUPLE (rows)
DOMAIN
ATTRIBUTES
DOMAIN
Name_dom
DOMAIN
Num_dom
DOMAIN
Age_dom
PROPERTIES OF RELATIONS
•
•
•
•
•
NO DUPLICATE TUPLES
UNORDERED TUPLES
UNORDERED ATTRIBUTES
ATOMIC VALUES FOR ATTRIBUTES
SET OF OPERATIONS THAT CAN BE
PERFORMED ON RELATIONS:
– BASIC SET OPERATORS
– ADVANCED OPERATORS
• CLOSURE PROPERTY OF RELATIONS
KINDS OF RELATIONS
• BASE RELATION
• QUERY RESULTS
• VIEWS
DATA INTEGRITY
• Database reflects reality
• Data in the database makes sense
•
•
•
•
UNIQUE KEY
CANDIDATE KEY
PRIMARY KEY
FOREIGN KEY
INTEGRITY RULES
•
•
•
•
ENTITY INTEGRITY
REFRENTIAL INTEGRITY
ATTRIBUTE INTEGRITY
BUSINESS RULE INTEGRITY
SQL LANGUAGE
CODD’S RULES
1.
2.
3.
4.
The information rule
The guaranteed access rule
Systematic treatment of null values
Active online catalog based on the
relational model
5. The comprehensive data sublanguage
rule
6. The view updating rule
7. High-level insert, update, and delete
8. Physical data independence
9. Logical data independence
10. Integrity independence
11. Distribution independence
12. The nonsubversion rule
DATABASE DESIGN
• TOP-DOWN APPROACH
– E-R DIAGRAMS
• BOTTOM-UP APPROACH
– NORMALIZATION
E-R DIAGRAMS
•A rectangle to denote and entity or an entity set.
•A diamond to denote a relationship between two
entities
•An oval to denote attributes
• A line which links attributes to an
entity or entity set and entity sets to
relationships
• Types of relationships
– 1:1
– 1:M
– M:N
Example: emp-dept database
•1 employee – 1 department
•1 department – many employees
•Relationship between emp and dept: many
to 1
Designing tables from E-R
Diagram
• For 1:1 relations
• For 1:M relations
• For M:N relations
NORMALIZATION
• Transforming data from a problem into
relations while ensuring data integrity and
eliminating data redundancy.
– Data integrity : consistent and satisfies data
constraint rules
– Data redundancy: if data can be found in two places
in a single database (direct redundancy) or
calculated using data from different parts of the
database (indirect redundancy) then redundancy
exists.
• Normalisation should remove redundancy, but
not at the expense of data integrity.
First NF
• ALL THE UNDERLYING DOMAINS
CONTAIN SCALAR VALUES ONLY
With the relation in its flattened form, strange
anomalies appear in the system. Redundant
data is the main cause of insertion, deletion,
and updating anomalies.
– Insertion anomaly –at subject is now in the primary
key, we cannot add a student until they have at least one
subject. Remember, no part of a primary key can be
NULL.
– Update anomaly –changing the name of a student
means finding all rows of the database where that
student exists and changing each one separately.
– Deletion anomaly-for example deleting all database
subject information also deletes student 960145.
Second NF
• Relation is in 1NF
• Every non-key attribute is irreducibly
dependent on the primary key
Third NF
• The non-key attributes are:
– Mutually independent
– Irreducibly dependent on the primary key
Transitive functional dependencies arise:
–when one non-key attribute is functionally
dependent on another non-key attribute:
• FD: non-key attribute -> non-key attribute
–and when there is redundancy in the database
By definition transitive functional dependency
can only occur if there is more than one
non-key field, so we can say that a relation
in 2NF with zero or one non-key field must
automatically be in 3NF.
Project(project_no, manager, address)
manager -> address
FIRST NF
• A relation is in 1NF if it contains no repeating groups
• To convert an unnormalisedrelation to 1NF either:
–Flatten the table and change the primary key, or
–Decompose the relation into smaller relations, one for the repeating
groups and one for the non-repeating groups.
•Remember to put the primary key from the original relation into both
new relations.
• This option is liable to give the best results.
• R(a,b,(c,d)) becomes
• R(a,b)
• R1(a,c,d)
ND
2
NF
• A relation is in 2NF if it contains no repeating groups and
no partial key functional dependencies
–Rule: A relation in 1NF with a single key field must be in
2NF
–To convert a relation with partial functional dependencies
to 2NF.
create a set of new relations:
•One relation for the attributes that are fully dependent upon
the key.
•One relation for each part of the key that has partially
dependent attributes R(a,b,c,d) and a->c becomesR(a,b,d)
and R1(a,c)
RD
3
NF
A relation is in 3NF if it contains no repeating
groups, no partial functional dependencies, and no
transitive functional dependencies
To convert a relation with transitive functional
dependencies to 3NF, remove the attributes
involved in the transitive dependency and put
them in a new relation
R(a,b,c,d)c -> d
Becomes R(a,b,c) R1(c,d)
DENORMALIZATION
DATA ORGANIZATION
• LOGICAL
• PHYSICAL
PHYSICAL STORAGE
OF
DATA
• Volatile storage
• Nonvolatile storage
• Stable storage
FAILURE
• TRANSACTION FAILURE
• SYSTEM FAILURE
• MEDIA FAILURE
RECOVERY
• TRANSACTION RECOVERY
– ACID properties are to be maintained
– Different approaches:
• Log-based recovery
• Shadow paging
• SYSTEM RECOVERY
• MEDIA RECOVERY
CONCURRENCY
CONCURRENCY CONTROL
• TWO-PHASE LOCKING TECHNIQUES
• TIMESTAMPING