Transcript Slide 2
Introduction to OLAP
and Analysis Services from Microsoft
(ONLY for INTERNAL USE)
Josef Schiefer
IBM Watson Research Center
[email protected]
What is OLAP?
Slide 2
Online Analytical Processing coined by EF Codd in 1994 paper
contracted by Arbor Software*
Generally synonymous with earlier terms
such as Decisions Support, Business
Intelligence, Executive Information System
OLAP = Multidimensional Database
MOLAP: Multidimensional OLAP (Arbor
Essbase, Oracle Express)
ROLAP: Relational OLAP (Informix
MetaCube, Microstrategy DSS Agent)
OLAP is FASMI
Fast
Analysis
Shared
Multidimensional
Information
Nigel Pendse, Richard Creath - The OLAP Report
Slide 3
Cubes
Slide 4
A cube stores information in a
multidimensional structure and is the
central object in a multidimensional
database.
Each cube contains a set of dimensions
and measures.
Dimensions are derived from the tables
and columns in your data that provide the
categories you want to analyze.
Measures are the quantitative data
derived from your data columns
Dimensions
Slide 5
The dimensions you build should be
distinct categories you want to add to
cubes in your OLAP database.
example: geography, time, or employee
dimensions represented in the picture
Cube and Dimensions
Dimensions:
Product
Product, Region, Time
W
S
N
Juice
Cola
Milk
Cream
Gel
Soap
1 2 3 45 6 7
Slide 6
Month
Hierarchical
summarization paths
Product
Industry
Region
Country
Time
Category
Region
Quarter
Product
City
Office
Year
Month Week
Day
Dimensions and Hierarchy
Dimensions are the categories used to
organize or describe analysis information
Dimensions are used to navigate the
information and to summarize the details into
more aggregate data.
Frequently used dimensions
include time periods,
geography, products,
organization, and so on.
Often dimensions
are hierarchical
(World - Continents - Countries)
Slide 7
Measures =
numercial Values
Slide 8
Measures are the
quantitative data in an
OLAP database.
For example, values
such as sales, budget,
cost, and so on, are all
examples of measures.
Measure values are
organized in data
cubes according to
dimensions
Aggregations
Slide 9
Aggregations greatly improve query
efficiency and response time. A cube
can hold a number of aggregations.
The aggregation amount is based on
several factors - the size of the data,
the amount of storage space you
allocate for aggregation storage, the
mode of storage you select, and how
much you want to optimize the
aggregations design.
Primary OLAP Problems
Rigid, inflexible architectures
– MOLAP or ROLAP
Significant scalability problems
– Data explosion and sparsity
– Poor distributed client/server implementation
Separation of data warehousing from
OLAP tools
– Lack of integration between user tools and
OLAP
Difficult to prototype, develop, deploy
– Time and expense
Slide 10
MS-AS: Architecture
Microsoft Analysis Services are optimized
for all OLAP architectures and offers
seamless integration
MOLAP: aggregations & details managed in
an efficient multidimensional store
ROLAP: aggregations created in relational
store
HOLAP: different things to different vendors
– Aggregations: details in relational,
aggregations in MOLAP store
– Partitions: single logical cube physically
divided into multiple MOLAP and ROLAP
partitions
– Virtual cubes: “view-like” join of multiple
MOLAP and ROLAP cubes
Slide 11
MS-AS: Scalability
MS-AS offer major innovation
– Data explosion managed by partial preaggregation
– Automatic elimination of sparse storage
Partitioned cubes
– parallel query processing across clustered
servers
– fine tuning of aggregations, to better manage
performance and disk space trade-offs
Slide 12
MS-AS: Scalability
Cooperative client/server query
management and caching
– network traffic minimized
– server queries processed efficiently
Microsoft Data Cube Service
– desktop component ships with next release of
Office
– used with Excel, Access, and Web
– supports local, offline usage
Slide 13
Microsoft Data Cube Service
Basic architecture:
– Cache query results and metadata, not disk
pages.
Algorithms deduce missing data and
transform queries
– Aggregation
– Filtering
– Combination
Slide 14
Instant reply to cached queries
MS Data Cube Benefits
Slide 15
Efficient distribution of query and
calculation processing across client & server
Single component spans Microsoft desktop
and server platforms & products
Unifies the MD data access story across
Excel, MS-AS, and SQL Server
Enables Microsoft to establish industry
standard for MD data access
Basis for MS-AS and Excel mobile story
MS-AS: Integration
The Microsoft Analysis Services integrate the
maintenance of OLAP with the underlying
data warehouse
– Design the DW structure
– Create the DW tables/cubes
– Populate the DW tables/cubes
– Maintain by incremental loads
– Optimize by actual usage patterns
– Manage users, scripts, usage, metadata
– Multiple data sources (not just SQLS)
Slide 16
MS-AS: Integration
OLE DB for OLAP & ADO MD
– based upon existing data access technology
– establishes industry standard for MD data
access
Slide 17
OLE DB/ODBC enable MS-AS to access
data in all major RDBMs
Third party client applications
OLAP Problem: Complexity
OLAP products are traditionally difficult
to configure, develop, and deploy
– Arcane tools
– Heavy consulting
– Poor integration
Slide 18
3 Tier Architecture & Components
Client Tier
•Excel
•ActiveX Controls
•Third Party Applications
ADO MD
OLE DB for OLAP
DCube
MS-AS Server
Data Warehouse
Tier
•MS-AS Server
•SQL Server
DTS
OLTP Source Tier
•RDBMs
Slide 20
MOLAP
HOLAP
OLE DB
ROLAP
•Data selection & navigation
•Presentation and charting
•What-if formulas
•Client side caching
•Desktop object model
•Offline usage
•Multidimensional calcs
•MOLAP/ROLAP/HOLAP data
Modeling/aggregations
•Security
•Metadata management
•Server side caching
•Administrative tools
•Server object model
•Query distribution
Let‘s go to the
demonstration ...
Slide 21