Week 14 - California State University, Sacramento

Week 14
November 28
• Database Security
• Transaction Management and Concurrency Controls
• Distributed Database
• Data Warehouse, Data Marts and MDDBMS
• OODBMS
R. Ching, Ph.D. • MIS • California State University, Sacramento
Database Security
• The protection of the database against threats using both
technical and administrative controls
• Database security aims to minimize losses caused by
anticipated events in a cost-effective manner without
unduly constraining the users
Threats:
– Theft and fraud
– Loss of confidentiality
– Loss of privacy
– Loss of integrity
– Loss of availability
[Diagram labels: Organization Policy, Controls (objectives for system), Database Security, Organizational Resource]
Threats
• Any situation or event, whether intentional or
unintentional, that will adversely affect a system and
consequently the organization.
– Tangible losses (hardware, software, data)
– Intangible losses (credibility, confidentiality)
Countermeasures and Contingency Plans
Threats and Countermeasures
• Initiate countermeasures to overcome threats
– Consider the types of threat and their impact on the
organization
• Cost-effectiveness
• Frequency
• Severity
Threats and Countermeasures
• Objective is to achieve a balance between a reasonably
secure operation, which does not unduly hinder users, and
the costs of maintaining it.
[Diagram labels: Secured Operations, Costs, Risks]
• Risks are independent of the countermeasures
Countermeasures
• Computer-based vs. non-computer-based
– Computer-based: implemented through the operating system
and/or DBMS
– Non-computer-based: management policies and procedures
Computer-Based Controls
• Computer-based controls
– Authorization
– Views
– Backup (and recovery)
– Journaling
– Checkpointing
– Integrity
– Encryption
– Associated procedures
Computer-based Control:
Authorization or Access Controls
• Granting privileges that enable users and applications to
legitimately have access to a system or object (table, view,
application, procedure, etc.)
– Authentication ensures that users are who they claim
to be
• Layers of access or penetration into a system
– Ownership and privileges
• Access to database(s)
• Manipulation and definition of data
Authorization and Authentication
[Diagram: layers of access. An O/S user passes through the operating system, a DBMS user passes through the DBMS, and grants on objects and privileges control access to each database and its tables]
Computer-based Control:
Views
• Virtual relation to support a user’s particular needs
– Restricts access and actions
– Created upon demand of the user
[Diagram: base relations and the virtual relation derived from them]
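The restricting role of a view can be sketched with Python's built-in sqlite3 module. The table name comp_products is borrowed from a later slide; its columns, the sample rows, and the view name product_prices are assumptions made only for illustration.

```python
import sqlite3

# In-memory database; the columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comp_products (product_id INTEGER, name TEXT, cost REAL, price REAL)"
)
conn.executemany(
    "INSERT INTO comp_products VALUES (?, ?, ?, ?)",
    [(1, "Printer", 80.0, 120.0), (2, "Scanner", 60.0, 95.0)],
)

# The view is a virtual relation: it stores no rows and hides the cost column.
conn.execute(
    "CREATE VIEW product_prices AS SELECT product_id, name, price FROM comp_products"
)

cursor = conn.execute("SELECT * FROM product_prices ORDER BY product_id")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns)
print(view_rows)
```

A user who is given access only to the view can query names and prices but never sees the cost column of the base relation.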
Computer-based Control:
Tables
SQL> grant select, update, delete on comp_products to scott;
Grant succeeded.
SQL> revoke delete on comp_products from scott;
Revoke succeeded.
General form (privilege, table, user name):
GRANT privilege ON table TO user;
REVOKE privilege ON table FROM user;
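The bookkeeping behind GRANT and REVOKE can be pictured as a set of privileges kept per user and table. The sketch below is a toy model, not how any real DBMS stores its grants; the class PrivilegeTable and its method names are hypothetical.

```python
from collections import defaultdict


class PrivilegeTable:
    """Toy model of GRANT/REVOKE bookkeeping (hypothetical, for illustration)."""

    def __init__(self):
        # (user, table) -> set of privileges currently held
        self._grants = defaultdict(set)

    def grant(self, privileges, table, user):
        self._grants[(user, table)].update(privileges)

    def revoke(self, privileges, table, user):
        self._grants[(user, table)].difference_update(privileges)

    def allowed(self, user, privilege, table):
        return privilege in self._grants[(user, table)]


acl = PrivilegeTable()
acl.grant({"select", "update", "delete"}, "comp_products", "scott")
acl.revoke({"delete"}, "comp_products", "scott")
print(acl.allowed("scott", "select", "comp_products"))
print(acl.allowed("scott", "delete", "comp_products"))
```

The two calls mirror the session above: scott keeps select and update on comp_products but loses delete.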
Transaction Management
What is a “transaction?”
• An action or series of actions, carried out by a single user
or application program which reads or updates (changes)
the contents of the database
– Retrievals
– Updates (modifications)
– Insertions
– Deletions
What is a “transaction?”
• Characteristics
– Atomicity (entirety of action)
– Consistency (from one consistent state to another)
– Isolation (independent of other transactions)
– Durability (permanence)
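Atomicity and consistency can be illustrated with a small transfer between two accounts, using Python's built-in sqlite3 module. The accounts table, its CHECK constraint, and the transfer function are assumptions chosen for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount between accounts as one transaction: all or nothing."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are applied
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK; credit is undone
balances = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 has already executed when the debit is rejected, yet the rollback leaves neither change in the database.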
Transaction Management
• Provide a means for maintaining the integrity of the
database
• Importance:
– In a multi-user environment, the order of transactions'
actions must be maintained through concurrency
control
– In the event of a failure or destruction of data, data must
be reconstructed through database recovery
[Diagram label: Data Integrity]
Concurrency Control
• The process of managing simultaneous operations on the
database without having them interfere with one another
• Potential problems:
– Lost update problem (one update overrides another)
– Uncommitted dependency problem (intermediate results
of one update viewed by another before it has been
committed)
– Inconsistent analysis problem (data retrieved by one user
updated by another before the end of the retrievals)
– Nonrepeatable read (retrieval results cannot be repeated)
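The lost update problem can be reproduced with a deterministic simulation of two transactions that each add to the same balance. The starting balance and the amounts are made up, and run_schedule is a hypothetical helper.

```python
def run_schedule(serial):
    """Two transactions each add to the same balance; returns the final balance."""
    balance = 100
    if serial:
        # Serial execution: T2 reads only after T1 has written.
        t1_read = balance
        balance = t1_read + 50  # T1 writes 150
        t2_read = balance
        balance = t2_read + 25  # T2 writes 175
    else:
        # Interleaved execution: both transactions read before either writes.
        t1_read = balance
        t2_read = balance
        balance = t1_read + 50  # T1 writes 150
        balance = t2_read + 25  # T2 overwrites with 125; T1's update is lost
    return balance


print(run_schedule(serial=True))
print(run_schedule(serial=False))
```

The interleaved run ends 50 short because T2 based its write on a value that T1 had already replaced.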
Concurrency Control
• Serializability - scheduling transactions to maximize
concurrency and parallelism, yet preventing them from
interfering with one another and maintaining consistency
– Serial schedule - non-interleaved transactions
T1 → T2 → T3 → ... → Tn
– Nonserial schedule - interleaved transactions
T1 → T3 → ... → Tn
T2 → T4 → T5 → T6 → ... → Tn+1
– Conflict: the scheduler must resolve conflict
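One common way to decide whether an interleaved schedule is equivalent to some serial one is the precedence-graph test for conflict serializability. The slides do not name this test, so the sketch below is offered as an illustration; the schedule format and the function name are assumptions.

```python
def conflict_serializable(schedule):
    """schedule: list of (transaction, operation, item), operation 'r' or 'w'."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Two operations conflict if different transactions touch the
            # same item and at least one of them writes.
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))

    # The schedule is conflict-serializable if the graph has no cycle.
    nodes = {t for t, _, _ in schedule}
    visiting, done = set(), set()

    def has_cycle(node):
        visiting.add(node)
        for a, b in edges:
            if a == node:
                if b in visiting or (b not in done and has_cycle(b)):
                    return True
        visiting.discard(node)
        done.add(node)
        return False

    return not any(has_cycle(n) for n in nodes if n not in done)


serial = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
interleaved = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
print(conflict_serializable(serial))
print(conflict_serializable(interleaved))
```

The interleaved schedule produces edges in both directions between T1 and T2, so no serial order can reproduce it.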
Concurrency Control:
Locking and Timestamping
• Locking
– Prevents simultaneous access or update of the same
data
• Timestamping
– Ordering (prioritizing) transactions by their timestamp
Concurrency Controls: Locking
• Locking methods – a lock denies other users access to
the data while one user is accessing them
– Shared vs. exclusive lock
– “Deadly embrace” or deadlock – when a user has a lock
on one data item and awaits another, and a second user
awaits the data item locked by the first user and has a
lock on the data item sought by the first
User 1: Account balance (locked), Credit limit (waiting)
User 2: Credit limit (locked), Account balance (waiting)
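The deadly embrace above can be detected by following "who waits for whom" until a user repeats. This is a minimal sketch assuming each user waits for at most one item and each item has one holder; find_deadlock and the two dictionaries are hypothetical names.

```python
def find_deadlock(holds, waits):
    """holds: item -> user holding the lock; waits: user -> item awaited.
    Returns the users forming a cycle, or [] if there is none."""
    for start in waits:
        path, user = [], start
        while user in waits and user not in path:
            path.append(user)
            user = holds.get(waits[user])  # who holds what this user awaits
        if user in path:
            return path[path.index(user):]
    return []


holds = {"account balance": "user 1", "credit limit": "user 2"}
waits = {"user 1": "credit limit", "user 2": "account balance"}
print(find_deadlock(holds, waits))
print(find_deadlock(holds, {"user 1": "credit limit"}))
```

Once a cycle is found, a DBMS typically breaks it by rolling back one of the transactions involved.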
Concurrency Controls: Timestamping
• All transactions assigned a timestamp (unique identifier
that indicates its relative starting time)
• Smaller (older) timestamps are given priority
• Conflicts resolved through rollbacks and restarts
– Transaction rolled back (to its beginning) and restarted
(reassigned a newer timestamp)
Timestamping
• Problems
– A younger transaction writes a data item before an older
transaction accesses it
– An older transaction needs to write a data item already
accessed by a younger transaction
– An older transaction needs to write a data item already
written by a younger transaction
• Resolved through rollbacks and restarts
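The three problem cases correspond to the checks of basic timestamp ordering, sketched below. The class and method names are hypothetical; a return value of False stands for "roll back the transaction and restart it with a newer timestamp".

```python
class TimestampScheduler:
    """Basic timestamp ordering; False means roll back and restart."""

    def __init__(self):
        self.read_ts = {}   # item -> largest timestamp that has read it
        self.write_ts = {}  # item -> largest timestamp that has written it

    def read(self, ts, item):
        # Reject: a younger transaction has already written the item.
        if ts < self.write_ts.get(item, 0):
            return False
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        # Reject: a younger transaction has already read or written the item.
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False
        self.write_ts[item] = ts
        return True


s = TimestampScheduler()
print(s.write(2, "x"))  # younger transaction (timestamp 2) writes x
print(s.read(1, "x"))   # older transaction arrives too late to read x
print(s.read(2, "y"))   # younger transaction reads y
print(s.write(1, "y"))  # older transaction arrives too late to write y
```

Timestamps are assumed to be positive integers, with smaller values meaning older transactions.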
Distributed Databases
• Distributed database:
A logically interrelated collection of shared data,
physically distributed over a computer network
• DDBMS – software system that permits the management of the
distributed database and makes the distribution transparent to the user.
• Heterogeneous vs. homogeneous
[Diagram: geographically distributed sites (Site 1, Site 2, Site 3) connected by a network. Each site has a DDBMS, a global data dictionary, a local DBMS, and a database. Labels: Network, Transparency]
DDBMS Architecture
[Diagram: two sites (Site 1, Site 2) linked by data communications. Each site has a DDBMS, a global data dictionary, a local DBMS, and a database]
Schema levels:
– Global external schema
– Global conceptual schema
– Local external schema
– Local conceptual schema
– Local internal schema
Data Allocation
• Centralized
• Partitioned (fragmented)
– Vertical (by columns)
– Horizontal (by rows)
– Mixed (by columns and rows)
• Complete replication
• Selective replication (hybrid)
– Combination of partitioning,
replication and centralization
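Horizontal and vertical partitioning can be sketched on a small list of rows. The customers data, the column names, and the two helper functions are made up for illustration.

```python
customers = [
    {"id": 1, "name": "Ada", "region": "West", "credit_limit": 5000},
    {"id": 2, "name": "Ben", "region": "East", "credit_limit": 2000},
    {"id": 3, "name": "Cy", "region": "West", "credit_limit": 1000},
]


def horizontal_fragment(rows, predicate):
    """Horizontal (by rows): keep whole rows that satisfy a predicate."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, columns, key="id"):
    """Vertical (by columns): keep chosen columns plus the key for rejoining."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


west_site = horizontal_fragment(customers, lambda r: r["region"] == "West")
credit_site = vertical_fragment(customers, ["credit_limit"])
names_site = vertical_fragment(customers, ["name", "region"])

# Reconstruction: join the vertical fragments on the key.
by_id = {r["id"]: r for r in credit_site}
rebuilt = [{**r, **by_id[r["id"]]} for r in names_site]
print(west_site)
print(rebuilt == customers)
```

Every vertical fragment carries the key so the original rows can be rebuilt; a mixed scheme applies both functions in turn.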
Advantages to Distributing
• Reflects organizational (distributed) structure
• Improved shareability and local autonomy
• Improved availability
• Improved reliability
• Improved performance
• Economics
• Modular growth
• Integration
• Remaining competitive
Disadvantages to Distributing
• Complexity
• Cost
• Security
• Integrity control more difficult
• Lack of standards
• Lack of experience
• Database design more complex
Considerations for Fragmenting
• Usage
– Fragmenting by subsets
• Efficiency
– Store data where they are used most frequently
• Parallelism
– A query is divided into subqueries that execute
in parallel
• Security
– Store data away from sites that do not require them
Disadvantages to Fragmenting
• Performance
– Increased retrieval time
• Integrity
– Difficult to maintain across multiple sites
– What happens when two users need to update the same
data?
Transparency
• Distribution
– Users perceive the database as a single logical entity
– The user should NOT be aware of where the data
reside or are allocated
• Fragmentation transparency
• Location transparency
• Replication transparency
• Local mapping transparency
• Transaction
– All distributed transactions maintain the distributed
database’s integrity and consistency
• Concurrency transparency
• Failure transparency
Transparency
• Performance
– DDBMS must perform as if it were a centralized
DBMS
• DBMS
– Hides the knowledge that the local DBMS may be
different (applicable to heterogeneous DDBMS)
Robert Anthony’s Taxonomy of Managerial
Information Requirements
Information requirements change between levels of management
(Operational Control, Management Control, Strategic Planning)
Information requirement   Operational Control   Strategic Planning
Required accuracy         High                  Low
Frequency of use          Very frequent         Infrequent
Currency                  Highly current        Quite old
Time horizon              Historical            Future
Level of aggregation      Detailed              Aggregate
Scope                     Well defined          Wide
Source                    Internal              External
Management Control falls between the two ends of each range.
Robert Anthony’s Taxonomy of Managerial
Information Requirements
• Transaction-based databases
– Relational (Oracle, DB2, SQL7)
– Hierarchical (IMS)
– Network (Image)
Data Warehousing
• A subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s
decision-making process.
[Diagram labels: internal data (within the organization), external data, time-variant, summarized data, ad hoc queries, tools, decision-making information, competitive or strategic advantage]
Tools:
• Report generators
• EIS
• OLAP
• Data mining
Data Warehousing Characteristics
• Subject-oriented - Organized around the major business
subjects or entities, such as customers, orders, or products
• Integrated - Operational (internal) data and external data
are integrated into the data warehouse to provide a single
unified database for decision support
• Time-variant - Use time stamps to represent historical data.
Data warehouses consist of a long series of snapshots, each
of which represents operational data captured at a point in
time
• Nonvolatile - New data are appended, rather than replaced,
so that historical data are preserved
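The time-variant and nonvolatile characteristics can be sketched as an append-only list of time-stamped snapshots. The product, quantities, and dates below are invented, and load_snapshot is a hypothetical helper.

```python
from datetime import date

warehouse = []  # append-only list of snapshot rows


def load_snapshot(as_of, operational_rows):
    """Append a time-stamped copy of operational data; never overwrite."""
    for row in operational_rows:
        warehouse.append({"as_of": as_of, **row})


# The operational system holds only the current value; the warehouse keeps both.
load_snapshot(date(2024, 1, 31), [{"product": "Printer", "on_hand": 40}])
load_snapshot(date(2024, 2, 28), [{"product": "Printer", "on_hand": 25}])

history = [
    (r["as_of"].isoformat(), r["on_hand"])
    for r in warehouse
    if r["product"] == "Printer"
]
print(history)
```

Because rows are only ever appended, a query can compare any two points in time.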
Data Warehouse
[Diagram: data warehouse architecture]
• Components: load manager, warehouse manager, query manager
• Data: detailed data, lightly summarized data, highly summarized data, metadata, archive/backup data
• Flows: inflow, upflow, downflow, outflow, meta-flow
• External sources supply data; end-user tools consume it
Data Mart
• A subset of a data warehouse that supports the
requirements of a particular department or business
function
[Diagram: summarized data is extracted from an Oracle9i relational database into an Oracle Express multidimensional database]
End-user tools:
• Reporting
• EIS
• OLAP
• Data mining
Implementation
• Build data warehouse first
• Build data marts first
• Build both in parallel
[Diagram: under a common architecture, the data warehouse and data marts are developed and implemented in parallel]
Multi-dimensional Database (MDDBMS)
Dimensions:
• Products
• Geographic locations
• Sales medium (e.g., retail, Internet, mail order)
Time is an implied dimension
Multi-dimensional Database (MDDBMS)
For example…
• Products: Computers, Printers, Scanners, Cameras
• Sales medium: Retail, Mail, Internet
• Geographic locations
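A multi-dimensional cube can be pictured as cells keyed by one value per dimension, with aggregation done by summing over the dimensions that are dropped. The revenue figures, the location values, and the roll_up helper below are assumptions for illustration only.

```python
from collections import defaultdict

# (product, sales medium, location) -> revenue; the figures are made up.
cube = {
    ("Computers", "Retail", "West"): 120,
    ("Computers", "Internet", "West"): 80,
    ("Printers", "Retail", "East"): 40,
    ("Printers", "Mail", "West"): 10,
}
DIMENSIONS = ("product", "medium", "location")


def roll_up(cube, keep):
    """Sum the measure over every dimension not named in keep."""
    positions = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for cell, revenue in cube.items():
        totals[tuple(cell[p] for p in positions)] += revenue
    return dict(totals)


by_product = roll_up(cube, ["product"])
by_medium_location = roll_up(cube, ["medium", "location"])
print(by_product)
print(by_medium_location)
```

Keeping one dimension gives a simple total per product; keeping two gives a cross-tabulation such as medium by location.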
Multi-dimensional Database (MDDBMS)
Working with Two Dimensions
[Figure: total revenue laid out over two dimensions, product and time. Product labels: Electronics, Audio, Receivers, Speakers, CD/DVD, Visual, Entertainment. Time labels: years ’95 to ’99, quarters Q1 to Q4, months April, May, June. The layout is repeated for each quarter, each year, and each sales medium (Internet, Mail Order, Retail)]
Multi-dimensional Database (MDDBMS)
Working with Three Dimensions
[Figure: total revenue laid out over three dimensions: product, time, and geographic location. Product labels: Electronics, Audio, Receivers, Speakers, CD/DVD, Visual, Entertainment. Time labels: years ’95 to ’99, quarters Q1 to Q4. Location labels: USA, N. America, Europe, Asia. Sales medium labels: Internet, Mail Order, Retail]
[Screenshots: dimensions in Oracle Express, showing the time dimension, the retail sales dimension, and the distribution channels dimension]
Data Warehousing Configuration:
Star Schema
• Fact table, surrounded by dimension tables:
– Sales medium
– Product line
– Geographic divisions
– Sales staff
• Time is an implied dimension
Example questions:
• Which sales mode is becoming more effective
for certain products in particular regions?
• Which sales staff produced the highest level of sales for a
particular product line in California?
• What products sold well in different regions of the country
through e-commerce (list by quarters)?
• What is the growth rate for the past 5 years in retail sales of a
particular product line by region?
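A star schema and one of the example questions can be sketched with Python's built-in sqlite3 module. The table and column names, the sample rows, and the choice to store time as a quarter column in the fact table are all assumptions; the sales staff dimension is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_medium (medium_id INTEGER PRIMARY KEY, medium TEXT);
CREATE TABLE product_line (line_id INTEGER PRIMARY KEY, line TEXT);
CREATE TABLE region (region_id INTEGER PRIMARY KEY, region TEXT);
-- Fact table: one row per sale, with a foreign key to each dimension.
CREATE TABLE sales_fact (
    medium_id INTEGER REFERENCES sales_medium,
    line_id INTEGER REFERENCES product_line,
    region_id INTEGER REFERENCES region,
    quarter TEXT,
    revenue REAL
);
INSERT INTO sales_medium VALUES (1, 'Retail'), (2, 'Internet');
INSERT INTO product_line VALUES (1, 'Printers'), (2, 'Scanners');
INSERT INTO region VALUES (1, 'West'), (2, 'East');
INSERT INTO sales_fact VALUES
    (2, 1, 1, 'Q1', 100), (2, 1, 1, 'Q2', 150),
    (2, 2, 2, 'Q1', 70), (1, 1, 1, 'Q1', 300);
""")

# Internet (e-commerce) revenue by region, product line, and quarter.
rows = conn.execute("""
    SELECT r.region, p.line, f.quarter, SUM(f.revenue)
    FROM sales_fact f
    JOIN sales_medium m ON m.medium_id = f.medium_id
    JOIN product_line p ON p.line_id = f.line_id
    JOIN region r ON r.region_id = f.region_id
    WHERE m.medium = 'Internet'
    GROUP BY r.region, p.line, f.quarter
    ORDER BY r.region, p.line, f.quarter
""").fetchall()
print(rows)
```

Each question on the slide becomes a join from the fact table to the dimensions it mentions, followed by a GROUP BY on the attributes of interest.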
OODBMS
[Diagram: an object (OID, message, data) vs. entities]
Object-Oriented Concepts
Methods (functions)
• Determine the behavior of the object
Message
• External call to the object
• Activates a method
Object Identifier (OID)
• System generated
• Unique
• Invariant
• Independent of attribute values
• Invisible to the user
Attributes or instance variables (data)
• Simple
• Complex
• Reference
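These concepts map loosely onto a small Python class. The class Account, its attributes, and the send helper are hypothetical, and Python's object model is only an analogy for an OODBMS.

```python
import itertools

_next_oid = itertools.count(1)  # stands in for the system's OID generator


class Account:
    def __init__(self, owner, balance):
        self._oid = next(_next_oid)  # system generated, independent of attributes
        self.owner = owner           # simple attribute
        self.balance = balance       # simple attribute

    def deposit(self, amount):
        """Method: determines the behavior of the object."""
        self.balance += amount

    def send(self, message, *args):
        """Message: an external call that activates the named method."""
        return getattr(self, message)(*args)


a = Account("Lee", 100)
b = Account("Lee", 100)
a.send("deposit", 50)
print(a._oid != b._oid, a.balance, b.balance)
```

The two accounts start with identical attribute values yet remain distinct objects, because identity comes from the OID rather than from the data.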
Relational vs. Object-Relational
[Diagram, David A. Anstey, 1997]
• Relational: tables and relational views over built-in data types
• Object-relational: tables and views, plus object tables and object views, over built-in data types and abstract data types
Data Types
• Built-in
– Character (char, varchar2)
– Number (integer, decimal, number)
– Date
– Raw and long raw
– RowID
– LOB (CLOB, BLOB)
ADTs (Abstract Data Types)
• User-defined data types
• Composed of simple or built-in data types
• Types: object types and collection (aggregate) types
[Diagram: a table is built on an object type, which contains an ADT, which in turn is composed of built-in data types]
New Data Type: VARRAY
• Single-dimension arrays with a fixed maximum number of elements
SQL> create or replace type contact_addresses as varray(4) of varchar2(30);
2 /
Type created.
SQL> create or replace type contact_zip_codes as varray(4) of char(8);
2 /
Type created.
Object Types
• Three components:
– Name - unique identifier of the object
– Attributes - describes the object through built-in and
abstract data types
– Method - dictates the behavior of the object
SQL> create type students as object
2 (student_ID char(9),
3 student_information personal_information);
4 /
Type created.
SQL> create or replace type contact_addresses as varray(4) of varchar2(30);
2 /
Type created.
SQL> create or replace type contact_zip_codes as varray(4) of char(8);
2 /
Type created.
Embedding a user-defined data type
SQL> create or replace type personal_information as object
2 (first_name varchar2(20),
3 middle_name varchar2(20),
4 last_name varchar2(30),
5 address contact_addresses,
6 zip_code contact_zip_codes);
7 /
Type created.
(Each attribute line gives a data name followed by its data type.)
ADT
SQL> create or replace type personal_information as object
2 (first_name varchar2(20),
3 middle_name varchar2(20),
4 last_name varchar2(30),
5 address contact_addresses,
6 zip_code contact_zip_codes);
7 /
Type created.
Embedding an ADT
SQL> create table employees
2 (employee_id char(6) primary key,
3 employee_address personal_information);
Table created.
ADT (user-defined)
SQL> create table vendors
2 (vendor_id char(5) primary key,
3 employee_address personal_information);
Table created.
Table with ADT
SQL> describe employees;
Name                            Null?    Type
------------------------------- -------- ----------------------
EMPLOYEE_ID                     NOT NULL CHAR(6)
EMPLOYEE_ADDRESS                         PERSONAL_INFORMATION
SQL> describe vendors;
Name                            Null?    Type
------------------------------- -------- ----------------------
VENDOR_ID                       NOT NULL CHAR(5)
EMPLOYEE_ADDRESS                         PERSONAL_INFORMATION
Creating an Object Table
SQL> create or replace type personnel as object
2 (employee_id char(7),
3 manager personal_information,
4 rank varchar2(5));
5 /
Type created.
(The manager attribute uses the ADT personal_information.)
SQL> create table managers of personnel;
Table created.
SQL> describe managers;
Name                            Null?    Type
------------------------------- -------- ---------------------
EMPLOYEE_ID                              CHAR(7)
MANAGER                                  PERSONAL_INFORMATION
RANK                                     VARCHAR2(5)
Creating an Object Table
SQL> describe personal_information;
Name            Null?   Type
--------------- ------- ---------------------
FIRST_NAME              VARCHAR2(20)
MIDDLE_NAME             VARCHAR2(20)
LAST_NAME               VARCHAR2(30)
ADDRESS                 CONTACT_ADDRESSES
ZIP_CODE                CONTACT_ZIP_CODES
Object Reusability
• Create a second table using PERSONNEL
SQL> create table executives of personnel;
Table created.
SQL> describe executives;
Name                             Null?    Type
-------------------------------- -------- ----------------------
EMPLOYEE_ID                               CHAR(7)
MANAGER                                   PERSONAL_INFORMATION
RANK                                      VARCHAR2(5)
Object Tables
[Diagram: the tables Executives and Managers are both built on the object type Personnel. Personnel contains Employee_ID (built-in data type) and Personal Information (ADT), which in turn contains Contact_addresses and Contact_zip_codes (ADTs)]
Methods
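Map method:
SQL> create or replace type transactions as object
2 (trans_id number,
3 trans_date date,
4 map member function get_date return date);
5 /
SQL> create or replace type body transactions as
2 -- map method: objects are ordered by the value returned here
3 map member function get_date return date is
4 begin
5 return trans_date;
6 end;
7 end;
8 /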