Online Analytical Processing (OLAP)
Download
Report
Transcript Online Analytical Processing (OLAP)
Online Analytical
Processing (OLAP)
Hweichao Lu
CS157B-02 Spring 2007
What is OLAP
Basic idea: converting data into
information that decision makers need
Concept to analyze data by multiple
dimension in a structure called data cube
History
In 1993, E. F. Codd came up with the
term online analytical processing (OLAP)
and proposed 12 criteria to define an
OLAP database
the term OLAP seems perfect to describe
databases designed to facilitate decision
making (analysis) in an organization
Purpose of OLAP
To derive summarized information from
large volume database
To generate automated reports for
human view
Why need OLAP over
Relational Database I
Consistently fast response
OLAP obtains a consistently fast
response is by prestoring calculated
values
Why need OLAP over
Relational Database II
Metadata-based queries
provide analysis functions that are
difficult or impossible to express in SQL
SQL was developed primarily for
transaction systems, not for reporting
applications
Why need OLAP over
Relational Database III
Spreadsheet-style formulas
design the data structure with users in
mind.
Spreadsheets are key components of
business management because they are
intuitive to create
Step I
1. identify multidimensional data
measure attribute
(measure some value, can be
aggregated upon)
dimension attribute
(define the dimension and summary of
measure attribute)
(Cont.)
Each dimension is typically expressed as
a “hierarchy”
Hierarchy: Analyst is interested in
different level of detail of a dimension
Step II
2. Analyze multidimensional data into
cross-tabulation
row header: value for one attribute
column header: value for another attr.
individual cell: value aggregation
Step III
3. Visualize n-dimensional cube - data
cube
the word CUBE describe what in the
relational world would be the integration
of the fact table with dimension tables
Step IV
After you design the cube, you will use
the cube's structure to build a relational
database (known as a star schema) to
house the data for the cube
Step V
Once you load data into the relational
database, and then into the cube, you'll
be able to see how attributes, dimensions,
measures, and measure groups fit
together within a cube to create a
powerful analytical tool.
Star Schema
Cubes are easily stored in relational databases,
using a denormalized data structure called the
star schema, developed by Ralph Kimball
starts with a central fact table
Each row in the central fact table contains
some combination of keys that makes it unique.
These keys are called dimensions.
Slicing & Dicing
Additional Functionality that can be
thought of as viewing a slice of the data
cube, particularly when values for
multiple dimensions are fixed.
Slicing/Dicing simply consists of selecting
specific values for these attributes, which
are then displayed on top of the cross-tab
Rollup & Drill-down
OLAP permit users to view data at ay
desired level of granularity.
Rollup: moving from finer-granularity data
to coarser granularity
Drill-down: opposite to Rollup
OLAP Inplementation
Multidimensional OLAP (MOLAP)
Relational OLAP (ROLAP)
Hybrid OLAP (HOLAP)
MOLAP
The database is stored in a special, usually
proprietary, structure that is optimized for
multidimensional analysis.
+ : very fast query response time because data
is mostly pre-calculated
-: practical limit on the size because the time
taken to calculate the database and the space
required to hold these pre-calculated values
ROLAP
The database is a standard relational database
and the database model is a multidimensional
model, often referred to as a star or snowflake
model or schema.
+: more scalable solution
-: performance of the queries will be largely
governed by the complexity of the SQL and the
number and size of the tables being joined in
the query
HOLAP
a hybrid of ROLAP and MOLAP
can be thought of as a virtual database
whereby the higher levels of the
database are implemented as MOLAP
and the lower levels of the database as
ROLAP
DOLAP
The previous terms are used to refer to
server based OLAP technologies
DOLAP (Desktop OLAP)
DOLAP enables users to quickly pull
together small cubes that run on their
desktops or laptops
Conclusion
OLAP is a significant improvement over
query systems
OLAP is an interactive system to show
different summaries of multidimensional
data by interactively selecting the
attributes in a multidimensional data cube
References
IBM Redbooks. DB2 Cube Views: A Primer. Durham, NC,
USA: IBM, 2003. ebrary collections. San Jose State
University.
<http://site.ebrary.com/lib/sjsu/Doc?id=10113016&ppg=43>
Jacobson, Reed, Microsoft® SQL Server™ 2005 Analysis
Services Step by Step. Microsoft Press.
Berry, Michael J. A. Data Mining Techniques : For
Marketing, Sales, and Customer Relationship Management.
Hoboken, NJ, USA: John Wiley & Sons, Incorporated, 2004.
ebrary collections. San Jose State University.
<http://site.ebrary.com/lib/sjsu/Doc?id=10114278&ppg=522>
.