Fundamentals, Design, and Implementation, 9/e by David M

Download Report

Transcript Fundamentals, Design, and Implementation, 9/e by David M

BSA206 Database Management Systems
Lecture 23:
Sharing Enterprise Data
Fundamentals, Design,
and Implementation, 9/e
Chapter 15
by David M. Kroenke
Database Processing Architectures
 System architectures for enterprise
database processing:
– Teleprocessing system
– Client-server system
– File-sharing system
– Distributed system
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 2
Teleprocessing Systems
 Classic architecture for multi-user database
processing
 Users operate dumb terminals or PC that
emulate dumb terminals
– User interface is usually simple and primitive
 A single centralized computer processes
communications control program,
application programs, DBMS, and
operating system
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 3
Teleprocessing Systems
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 4
Client-Server Systems
 A client-server system consists of a
network of computers connected via a LAN
 Clients are personal computers used to
process application programs
 Servers are PCs or mainframes that stores
DBMS and the data-management portion
of the operating system
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 5
Client-Server Systems
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 6
File-Sharing Systems
 This architecture was developed before the clientserver architecture
 File server and user computers are connected
through LAN
– File server provides access to files and other resources
– User computers must contain a copy of DBMS and
application programs
 DBMS on user’s computer sends requests to the
data management portion of the operating system
on the file server for file-level processing
– This cause more traffic across LAN than the client-server
system
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 7
File-Sharing Systems
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 8
Distributed Database Systems
 Distributed database systems use multiple
computers to process the same database
 Distributed processing: use multiple
computers for applications or DBMS
processing
– E.g., file-sharing, client-server, and distributed
database system
 Distributed database processing: distribute
database to multiple computers
– E.g., distributed database system
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 9
Distributed Database Systems
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 10
Database Partitioning
 A vertical partition, or vertical fragment,
refers to a table that is broken into two or
more sets of columns
 A horizontal partition, or horizontal
fragment, refers to a table that is broken
into two or more sets of rows
 Mixed partition refers to a database broken
into both horizontal and vertical partitions
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 11
Types of Distributed Databases
 Types of distributed database:
–
–
–
–
Nonpartitioned, nonreplicate
Partitioned, nonreplicated
Nonpartitioned, replicated
Partitioned, replicated
 The greater the degree of partitioning and
replication
– The greater the flexibility, independence, and reliability
– The greater the expense, control difficulty, and security
problems
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 12
Types of Distributed Databases
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 13
Comparing DB Distribution Alternatives
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 14
Distributed Processing Techniques
 Three types of distributed database processing
 Downloading of read-only data: only one computer
updates data, but multiple computers are sent
copies to process
 Updating by a designated computer: allows data
update requests to originate on multiple
computers, but to transmit those update requests
to a designated computer for processing
– Database at the designated computers must be
periodically synchronized
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 15
Distributed Processing Techniques
(cont.)
 Updating by multiple computers: allows multiple
updates on the same data at multiple sites
– Three types of distributed update conflict can occur:
• Loss of uniqueness
• Lost updates due to concurrent transactions
• Updates of deleted data
 Coordinating distributed atomic transactions is
difficult and requires a two-phase commit
 The OLE Distributed Transaction Server and Java
Enterprise Beans are two technologies for dealing
with these problems
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 16
Downloading Data
 Powerful personal computers enable user
to download enterprise data for local
processing
 Users can query and report on downloaded
data using DBMS products on their
machines
 Normally, users are not allowed to update
and return data to prevent data integrity
problems
 A Web server can be used to publish
downloaded data
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 17
Potential Problems of Downloaded
Databases
 Coordination
– Downloaded data must conform to database
constraints
– Local updates must be coordinated with
downloads
 Consistency
– In general, downloaded data should not be
updated
– Applications need features to prevent updating
– Users should be made aware of possible
problems
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 18
Potential Problems of Downloaded
Databases (cont.)
 Access Control
– Data may be replicated on many computers
– Procedures to control data access are more
complicated
 Potential for Computer Crime
– Illegal copying is difficult to prevent
– Diskettes and illegal online access are easy to
conceal
– Risk may prevent the development of
downloaded data applications
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 19
Processing Downloaded Data with
a Web Server
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 20
OLAP
 On Line Analytical Processing (OLAP) is a
new way of presenting information
 With it, data is viewed in cubes that have
axes, dimensions, measures, slices, and
levels
 Cube refers to
– Underlying semantic structure that is used to
interpret data
– A particular materialization of data in such a
semantic structure
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 21
Example: Relational Source Data
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 22
Example: OLAP Cube
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 23
OLAP Terminology
 OLAP
hypercube:
means a
data display
with an
unlimited
number of
axes
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 24
OLAP Schema Structures
 Star schema: every dimension table is adjacent to
the table storing the measure values
– These tables may or may not be normalized
 Snowflake schema: there can be multilevel,
normalized tables
 In general, the star schema requires greater
storage, but it is faster to process than the
snowflake schema
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 25
Example: Star Schema
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 26
Example: Snowflake Schema
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 27
OLAP Storage Alternatives
 Three different means for storing OLAP data
 ROLAP (relational OLAP): relational DBMS with
extensions is sufficient to meet OLAP
requirements
 MOLAP (multidimensional OLAP): a specialized
multidimensional processor is necessary to
produce acceptable OLAP performance
 HOLAP (hybrid OLAP): both DBMS products and
specialized OLAP engines have a role and can be
used to advantage
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 28
Data Warehouse
 A data warehouse is a store of enterprise
data that is designed to facilitate
management decision-making
 Goal: to increase the value of the
organization’s data asset
 Role: to store extracts from operational
data and make those extracts available to
users in a useful format
– Data can be extracts from databases, files,
images, recordings, photos, external data, etc.
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 29
Data Warehouse
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 30
Data Warehouse Components









Data extraction tools
Extracted data
Metadata of warehouse contents
Warehouse DBMS(s) and OLAP servers
Warehouse data management tools
Data delivery programs
End-user analysis tools
User training courses and materials
Warehouse consultants
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 31
Data Warehouse Requirements





Queries and reports with variable structure
User-specified data aggregation
User-specified drill down
Graphical outputs
Integration with domain-specific programs
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 32
Challenges for Data Warehouses
 Inconsistent data
– When data are integrated, inconsistencies can
develop due to timing and domain differences
– Solution: create metadata to describe both
timing and domains of source data
 Tool Integration
– Because of the many tools required in a data
warehouse, tools will have different user
interfaces and inconsistent means of importing
and exporting data, and it may be difficult to
obtain technical support
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 33
Challenges for Data Warehouses
 Lack of tools for managing the data
warehouse
– The organization may have to develop its own
tools for managing non-relational data and for
maintaining appropriate metadata. Such
development is difficult and expensive
 Ad hoc nature of requirements
– Such requests are difficult to satisfy
– Solution: create datamart, i.e,, limited-scope
data warehouses
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 34
Data Marts
 A data mart is a limited-scope data
warehouse
 A data mart is easier to manage than the
enterprise data warehouse because
– It has a much smaller domain
– It can be restricted
• To a particular type of input data
• To a particular business function
• To a particular business unit or geographic area
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 35
Enterprise Data Sharing Continuum
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 36
Data Administration
 Data are an important organizational asset
that can support both operations and
management decision making
 The purpose of offices of data
administration is to guard and protect the
data and to ensure that they are used
effectively
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 37
Data Administration Challenges




Many types of data exist
Basic categories of data are not obvious
The same data can have many names
The same data can have many
descriptions and formats
 Data are changed often concurrently
 Political-organizational issues complicate
operational issues
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 38
Functions of Data Administration
 Marketing
– Communicate existence of data administration to
organization
– Explain reason for existence of standards, policies, and
guidelines
– Describe in a positive light the services provided
 Data Standards
– Establish standard means for describing data items.
Standards include name, definition, description,
processing restrictions, etc.
– Establish data proponents
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 39
Functions of Data Administration
 Data Policies
– Establish organization-wide data policy, e.g., security,
data proponency, and distribution
 Forum for Data Conflict Resolution
– Establish procedures for reporting conflicts
– Provide means for hearing all perspectives and views
– Have authority to make decision to resolve conflict
 Return on Organization's Data Investment
– Focus attention on value of data investment
– Investigate new methodologies and technologies
– Take proactive attitude toward information management
Copyright © 2004 Database Processing: Fundamentals, Design, and Implementation, 9/e
by David M. Kroenke
Lecture 23 / Slide 40