102_CH-20 Information Integration

Download Report

Transcript 102_CH-20 Information Integration

CH-20 Information
Integration
Continued……
What is OLAP?
• On-Line Analytic Processing (OLAP)
• OLAP is used to query very large amount of data in the data warehouse of company
• It involves highly complex queries that use one or more aggregators
• OLAP queries are also called as decision support queries
What is OLTP?
• Common database operation touch very small amount of data and they are referred as
OLTP (online transaction processing)
• OLAP queries are considered as long transactions and long transactions locking the
entire database would shutdown the ordinary OLTP transactions so OLAP data is stored
separately in data warehouse rather then in ordinary database
Example of OLAP and OLTP queries
Consider data warehouse of automobile company the schema can be as follows
Sales (serialNo, date, dealer, price)
Autos (serialNo, model, color)
Dealers( name, city, state, phone)
The typical OLAP query can be for finding the average sales price by state
SELECT state, AVG (price)
FROM Sales, Dealers
WHERE Sales.dealer = Dealers.name AND date >= ‘2001-01-04’
GROUP BY state;
In same example the typical OLTP query can be for finding the price at which the auto with
serial number 123 was sold
Multidimensional view of OLTP data :
• In typical OLAP application we have a central relation called fact table.
• Fact table represents events or objects of interest.
• It is helpful to think that objects in the fact table are arranged in the multidimensional
space.
• consider the earlier example of automobile company the fact table can be build for sales
which is the object of interest and is viewed as a 3 dimensional data cube .
Multidimensional View Of OLAP Data
Each single point in cube represents sales of single automobile and dimension
represents properties of sales.
Star Schemas
A star schema consist of schema for the fact table, which links to several other
relations called “dimension tables”.
Slicing and Dicing
The row data cube can be partitioned along each dimension at some level of
granularity for analysis this partitioning operations are known as slicing and dicing.
• In SQL this partitioning is done by “ GROUP BY” clause.
• Lets consider the automobile example. suppose car named Gobi is not selling well and
we want to find exactly which colors are not doing well
SQL query is as follows:
SELECT color, SUM (Price)
FROM Sales NATURAL JOIN Autos
WHERE model = ‘Gobi’
GROUP BY color;
This query dice by color and slice by model, Focusing on particular model, the Gobi, and
ignores other data.
Thank You !
References:
Database systems the complete book by Garcia,Ullman,Windom