02 Lecture 1
Download
Report
Transcript 02 Lecture 1
Lecture 1
BIS4435 – Database Management for
Decision Support
Dr. Nawaz Khan
School of Computing Science
E-mail: [email protected]
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
BIS4435 – Data Management for Decision Support 1
Coursework
Discussion Topics
Each week you will have a discussion topic (DQ)
DT will be available on Fridays. Reply to the DQ by Tuesdays
Review and comment on your mates’ responses (at least one).
Lecture 1
DQ requirements
Initial response to DQ should be 250 words.
Review and comments on your mates’ initial response should be
150 words.
Appropriate references at the end of the text and proper citation of
references through out the text (avoid quotes). Harvard style ref.
Print your initial post, response to your mates’ initial response and
any comment on your post that is made by your mates. Preserve it
in order to attach it with your CW portfolio.
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
BIS4435 - Online Database Systems
2
More on Coursework: Self Reflection
Make sure you have one/one and half pages self reflection
on both the coursework.
It should contain:
Lecture 1
What you have achieved so far
What you find the most difficult thing
What you find the easiest thing
What challenges you face to do the coursework
How have you overcome the challenges
What motivated you
What are the new concepts you come across and how it
helps to achieve the learning outcomes
What you could have done better
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
3
Introduction
Copy coursework from someone ??
Lecture 1
See University Plagiarism policy in students’ handbook
Don’t forget to put references (follow specific guidelines for
referencing)
Acknowledge properly, since collusions are not accepted
Make sure you have cited your references
Examination: followed by the University regulations
Contents of the module: 3 main parts
Theoretical aspects of relational database
Large Database: Data Warehouse and Data Mining
Decision Support System
General diagram of the module: see next slide
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
4
Introduction
DB Back-End
Lecture 1
Theoretical Aspects
of Relational DB
Example Relational DB:
Oracle
Decision support
System
Uncertainty, probability and Linear model,
feed forward network, self-organising
map,
Expert system and knowledge
engineering
Large Data
Management
Data Strategy and Data
Warehousing
Intelligent Database and
Data Mining
Fuzzy logic, genetic algorithm and
hybrid intelligent
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
5
Reading Materials
Lecture 1
Connolly, T.M., and Begg, C.E., Database Systems: A
Practical Approach to Design, Implementation and
Management, Addison-Wesley, 4th Edition, ISBN:
0321210255
Belavkin, R., Blundell, B., Cairns, P., Huyck, C.,
Mitchell, I., Stockman, T. (2005). Management Support
Systems, Middlesex University Press, ISBN: 1-89825368-4
Module Learning Units at OasisPlus
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
6
Introduction
File-based approach
Shared file approach
Database approach: 3-schema architecture
Lecture 1
External schema
Conceptual schema
Internal schema
DBMS components
Characteristics of the database approach
Brief history of database systems and applications
Extending database capabilities for new applications
Reading suggestion:
Connolly, T.M., and Begg, C.E., Database Systems: A Practical Approach to Design,
Implementation and Management, Addison-Wesley, 4th Edition, ISBN:
0321210255(chapters 1)
Global campus materials on OASIS: http://oasis.mdx.ac.uk/ (unit 1)
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
7
Introduction
File-based approach
Lecture 1
Data is stored in one or more separate computer files
Data is then processed by computer programs applications
Problems:
Data redundancy
Data inconsistency
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
8
Introduction
Files
Customer
File
Applications
Stock
File
Customer
Orders
Order
File
Customer
File
Lecture 1
Stock
File
Customer
Invoicing
Files
Applications
Customer
File
Customer
Orders
Stock
File
Customer
Invoicing
Order
File
Order
File
Purchase
Orders
Stock
File
Purchase
Orders
Supplier
File
Supplier
File
Stock
Control
Stock
File
Stock
Control
Order
File
File-based approach
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
Shared file approach
9
Introduction
Shared file approach
Lecture 1
Data (files) is shared between different applications
Data redundancy problem is alleviated
Data inconsistency problem across different versions of the
same file is solved
Other problems:
Rigid data structure: If applications have to share files, the file
structure that suits one application might not suit another
Physical data dependency: If the structure of the data file
needs to be changed in some way, this alteration will need to
be reflected in all application programs that use that data file
No support of concurrency control: While a data file is being
processed by one application, the file will not be available for
other applications or for ad hoc queries
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
10
Introduction
Database approach
Lecture 1
DataBase Management System (DBMS): a generalpurpose software system that facilitates the processes of
defining, constructing, manipulating, and sharing databases
among various users and applications.
Database: a collection of related data managed by a DBMS
Data: known facts that can be recorded and that have
implicit meaning
Database system = the database + DBMS software
DBMS provides facilities for querying, data security, and
integrity and concurrent control
Database application - set of programs that use DBMS to
perform a particular business function
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
11
Introduction
Database approach
Lecture 1
Advantage - physical
and logical data
independence achieved by hierarchy of
levels of data
specification
External
Schema 1
External
Schema N
External
Schema 2
Interface between
conceptual schema and
external schemas
Conceptual
Schema
Interface between
conceptual schema
and internal schema
Internal
Schema
Database
physically stored
in files on disks
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
12
Introduction
Database approach
External schema:
Lecture 1
Describes database as it is seen by user and applications reflects a simplified model of the world
Allows applications to see as much of data as they require,
while excluding unrelated data items
Interfaces with conceptual schema
May be modified or created without altering physical storage
of data, modification reflected in the interface
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
13
Introduction
Database approach
Conceptual schema:
Lecture 1
Describes the universe of interest to the users of the database
system - data required
Concerned with data rather than storage or access,
concentrates on describing entities, data types, relationships,
user operations, and constraints
Interfaces with external and internal schema
Logical data independence is the capability to change the
conceptual schema without having to change external
schemas or application programs
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
14
Introduction
Database approach
Internal schema:
Lecture 1
Definition of the way in which data is physically stored
Interface with conceptual schema identifies how an item in the
conceptual schema is stored and accessed
Physical data independence is the capability to change the
internal schema without having to change the conceptual
schema. Hence, the external schemas need not to be
changed as well
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
15
Introduction
File-based approach
Shared file approach
Database approach: 3-schema architecture
Lecture 1
External schema
Conceptual schema
Internal schema
DBMS components
Characteristics of the database approach
Brief history of database systems and applications
Extending database capabilities for new applications
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
16
Introduction
DBMS components
Lecture 1
DBMS engine - central component
User interface - languages & interfaces
Data dictionary - data structure, user & access right, rules
Performance management - query optimisation & DBMS
reorganisation
Data integrity - intra-record, referential integrity &
concurrency
Backup and recovery - log
Application development - CASE tool
Security management - protect and control access to
database & data dictionary
See the reading suggestion for more details
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
17
Introduction
Characteristics of the database approach
Self-describing nature of a database systems
Insulation between programs and data, and data abstraction
Lecture 1
Program-data independence + Program-operation
independence = Data abstraction
A data model is a type of data abstraction
Support of multiple views of the data
Sharing of data and multi-user transaction processing
Other advantages of using the DBMS approach
Controlling redundancy
Restricting unauthorized access
Providing persistent storage for program objects
Providing storage structures for efficient query processing
Providing backup and recovery
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
18
Introduction
Characteristics of the database approach
Other advantages of using the DBMS approach (cont.)
Lecture 1
Providing multiple user interfaces
Representing complex relationships among data
Enforcing integrity constraints
Permitting inference and actions using rules: deductive
database systems
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
19
Introduction
Extending database capabilities for new applications
Lecture 1
Example applications: storage and retrieval of images,
videos, data mining (large amounts of data need to be stored
and analyzed), spatial databases, time series applications,
…
More complex data structures than relational representation
New data types except for the basic numeric and character
string types
New operations and query languages for new data types
New storage and retrieval methods
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
20
Distributed Database
Lecture 1
A distributed database management systems employ a
number of computer workstations at different sites. These
workstations are part of a local network system. The
workstations contain a set of hardware and software that
allow them to be an integral part of this network and the
DDBMS must rely on these network components for its
data exchange. The workstations are needed to be
attached to each other through a communication media
that allow the sites to interact and to carry data
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
21
Introduction
Summary:
Lecture 1
Introduction to the module, BIS4435File-based approach
Shared file approach
Database approach
DBMS components
Characteristics of the database approach
Brief history of database systems and applications
Distributed Database
Reading suggestion: do not forget !!
Next week: Relational Data Model
Reading suggestion:
[1]: Chapter 1 (The relational data model and relational database
constraints)
[3]: Unit 1: Global Campus Unit
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
22
Lecture 1
BIS4435 – Data Management for
Decision Support
Lecture 2: Relational Data Model
Dr. Nawaz Khan
School of Computing Science
E-mail: [email protected]
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
BIS4435
23
Introduction
Unit 1: Introduction to the Module: Introduction to the Database
Unit 2: Fundamentals of Relational and Object Model:
Reading Suggestion:
Lecture 1
Unit 1:
Connolly, T.M., and Begg, C.E., Database Systems: A Practical
Approach to Design, Implementation and Management, Addison-Wesley,
4th Edition, ISBN: 0321210255(chapters 1)
Global campus materials on OASIS: http://oasis.mdx.ac.uk/ (unit 1)
Unit 2:
Connolly, T.M., and Begg, C.E., Database Systems: A Practical
Approach to Design, Implementation and Management, Addison-Wesley,
4th Edition, ISBN: 0321210255(chapters 3)
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
BIS4435
24
Relational Data Model
Outline
Lecture 1
Basic Concepts: relational data model, relation schema,
domain, tuple, cardinality & degree, database schema, etc.
Relational Data Model Constraints
key, primary key & foreign key
entity integrity constraint
referential integrity
Update Operations on Relations
insert
deletion
modification
Summary
Q&A
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
25
Relational Data Model
Basic concepts
Lecture 1
Relational data model: represents a database in the form of
relations - 2-dimensional table with rows and columns of data. A
database may contain one or more such tables. A relation schema
is used to describe a relation
Relation schema: R(A1, A2,…, An) is made up of a relation name
R and a list of attributes A1, A2, . . ., An. Each attribute Ai is the
name of a role played by some domain D in the relation schema R.
R is called the name of this relation. The degree of a relation is the
number of attributes n of its relation schema.
Domain D: D is called the domain of Ai and is denoted by dom(Ai).
It is a set of atomic values and a set of integrity constraints
STUDENT(Name, SSN, HomePhone, Address, OfficePhone, Age, GPA)
Degree = ??
dom(GPA) = ??
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
26
Relational Data Model
Basic concepts
Lecture 1
Tuple: row in table, record
Cardinality: number of tuples (rows)
Database schema S = {R1, R2,…, Rm}
A relation (or relation state, relation instance) r of the relation
schema R(A1, A2, . . ., An), also denoted by r(R), is a set of ntuples r = {t1, t2, . . ., tm}. Each n-tuple t is an ordered list of n
values t = <v1, v2, . . ., vn>, where each value vi, i=1..n, is an
element of dom(Ai) or is a special null value. The ith value in tuple
t, which corresponds to the attribute Ai, is referred to as t[Ai]
Relational data model
Database schema
Relation schema
Relation
Tuple
Attribute
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
27
Relational Data Model
Lecture 1
A relation can be conveniently represented by a table, as the
above example shows
The columns of the tabular relation represent attributes
Each attribute has a distinct name, and is always referenced by
that name, never by its position
Each row of the table represents a tuple. The ordering of the
tuples is immaterial and all tuples must be distinct
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
28
Relational Data Model
Outline
Basic Concepts: relational data model, relation schema,
domain, tuple, cardinality & degree, database schema, etc.
Relational Data Model Constraints
Lecture 1
key, primary key & foreign key
entity integrity constraint
referential integrity
Update Operations on Relations
insert
deletion
modification
Summary
Q&A
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
29
Relational Data Model
Relational Data Model Constraints
Key, primary key & foreign key
Key: A key K of a relation R is a subset of the attributes of R
which has the following time-independent properties:
Lecture 1
– Unique identification: The value of K must uniquely identify each
tuple in R
– Non-redundancy: No attribute in K can be discarded without
destroying property 1
Primary Key: There may be more than one set of attributes
which satisfy both properties. Each of them is called a
candidate key. In this case only one candidate key must be
chosen as the key called primary key. Primary key (PK)
cannot contain null value
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
30
Relational Data Model
Relational Data Model Constraints
Key, primary key & foreign key
Single primary key:
Lecture 1
TASK (TASK_NAME, START_DATE,EXPECTED_COMP_DATE,
COMP_DATE, EMPNO)
Combined primary key: If we want to store in the task table, the
details of tasks for a number of different projects, TASK_NAME
is no longer a unique identifier. The solution is to use a further
attribute to provide a unique identifier for each task.
TASK (PROJECT_NAME, TASK_NAME, START_DATE,
EXPECTED_COMP_DATE, COMP_DATE, EMPNO)
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
31
Relational Data Model
Example: A relational database schema that we call COMPANY = {EMPLOYEE,
DEPARTMENT, DEPT_LOCATIONS, PROJECT, WORKS_ON, DEPENDENT}
Lecture 1
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
32
Relational Data Model
Relational Data Model Constraints
Key, primary key & foreign key
Lecture 1
Foreign Key (FK): A set of attributes is a FK of R1 if and only if it is the
primary key of another relation schema R2
FK represents the relationship between R1 and R2
FK can contain null value, but we may avoid this by replacing with a
flag
Example:
TASKS (TASK_NAME, START_DATE, EXP_COMP_DATE, COMP_DATE,
EMPNO, PROJECT_NAME*)
PROJECT (PROJECT_NAME, START_DATE, EXP_COMP_DATE,
COMP_DATE, PROJECT_LEADER)
To form the link between the two tables, we place the primary key of
the PROJECT table into the TASK table. Through the use of
PROJECT_NAME as a FK, we can see for any given task, the project
to which it belongs
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
33
Relational Data Model
Relational Data Model Constraints
Lecture 1
Entity Integrity Constraint: all entities in a database must be
identified by the primary key. That is why NO primary key value can
be null
Referential Integrity Constraint: The value of a foreign key must
be meaningful, which means that a tuple in one relation that refers
to another relation must refer to an existing tuple in that relation
Example: JobList and Company tables (see notes for labs 02, 03)
JobList
JobID CompanyID JobTitle Salary ….
Company
CompanyID CompanyName Address ...
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
34
Relational Data Model
Outline
Basic Concepts: relational data model, relation schema,
domain, tuple, cardinality & degree, database schema, etc.
Relational Data Model Constraints
Lecture 1
key, primary key & foreign key
entity integrity constraint
referential integrity
Update Operations on Relations
insert
deletion
modification
Summary
Q&A
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
35
Relational Data Model
Update Operations on Relations
Insertion: to insert a new tuple t into a relation R. When
inserting a new tuple, it should make sure that the database
constraints are not violated:
Lecture 1
The value of an attribute should be of the correct data type (i.e.
from the appropriate domain).
The value of a prime attribute (i.e. the key attribute) must not be
null
The key value(s) must not be the same as that of an existing
tuple in the same relation
The value of a foreign key (if any) must refer to an existing
tuple in the corresponding relation (labs !!)
Two options if the constraints are violated:
Reject the operation
Rectify the operation
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
36
Relational Data Model
Update Operations on Relations
Deletion: to remove an existing tuple t from a relation R.
When deleting a tuple, the following constraints must not be
violated:
Lecture 1
The tuple must already exist in the database
The referential integrity constraint is not violated
Four options if the constraints are violated:
Reject the operation
Rectify the operation to find the existing tuple concerned
Also remove the tuples that reference the tuple being deleted
Modify the referencing attribute values (i.e. values of foreign
keys) that cause the violation
Modification: to change values of some attributes of an
existing tuple t in a relation R
Test all above operations and constraints during the labs
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
37
ACTIVITY
Lecture 1
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
BIS4229 – Industrial Data Management
Technology
38
Introduction
Summary:
Relational Data Model
Lecture 1
Basic concepts
Constraints
Update operations on relations
Reading suggestion: do not forget !! (other relational operations)
Next week:
ER Modeling:
Reading suggestion:
– [1]: Chapter 3, 7
– [3]: Unit 2
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
39
Relational Data Model
Outline
Basic Concepts: relational data model, relation schema,
domain, tuple, cardinality & degree, database schema, etc.
Relational Data Model Constraints
Lecture 1
key, primary key & foreign key
entity integrity constraint
referential integrity
Update Operations on Relations
insert
deletion
modification
Summary
Q&A
Dr. Nawaz Khan, School of Computing Science
E-mail: [email protected]
40