Operational Systems - Sheffield Hallam University

Download Report

Transcript Operational Systems - Sheffield Hallam University

Sheffield Hallam University
Data Warehousing & Data Mining
Principles
Operational v Analytical
Systems
1
A customer walks into a bank, and speaks to a
cashier. The cashier uses his/her computer to
answer the queries / fulfil the actions.
List five typical queries / actions
1.__________________________________
2.__________________________________
3.__________________________________
4.__________________________________
5.__________________________________

2
What similarities can you spot about the nature of the data
involved in your five queries / actions?
Hint: volume, up-to-dateness (“currency”), level of detail
List four similarities

__________________________________
1.
__________________________________
2.
__________________________________
3.
__________________________________
4.
3
What general characteristics do you think can be stated
about the IT application that the cashier uses to handle the
customer queries / actions?
Hint: Complete the following sentence: The application is
oriented towards ..........
List four characteristics

__________________________________
1.
__________________________________
2.
__________________________________
3.
__________________________________
4.
4
Operational Systems
1. High volume of
transactions
2. Small processing
per transaction
3. Frequent updating
of data
4. Data is always
current
5. Transaction driven
6. Predictable query types
7. Static structure
8. Content varies
9. High accuracy
10. High availability
11. Mature support
Table 1: Attributes of an Operational System
5
Explanation of points 7 & 8 on the previous slide
•How often does the underlying design of the bank’s
database have to change (tables, relationships, integrity
rules etc)?
•Just thinking about the current data (not the archive data),
how much bigger /smaller will the bank’s volume of data be
in one year’s time?
•Just thinking about the current data, how much of that
data will have different values in one year’s time?
Do you understand why we said
“Static structure; content varies” ?
6
Operational Systems
Operational systems are generally based on Relational
Database systems.
 Very highly optimised towards fast writing / retrieval of
small items of data (eg Oracle, DB2, SQL-Server, MySQL
= very long established, huge $$ research investments)
 Highly optimised towards using Relationships to fetch
related data (eg Customer Name, and current balance)
These reasons make Relational Databases extremely quick
7
Operational Systems
Operational systems are generally based on Relational
Database systems (cont ...)
The database itself can enforce Referential Integrity (eg
cannot delete customer name & address if they still have an account open)
Relational design: Only store each data item in one
place (eg if customer changes address, only one copy to change)
•
These reasons make Operational applications much easier to write.
8
The literature tends to use the term
On-Line Transactional Processing
(OLTP)
for what we have described as “operational” systems.
Transactional =
On-line : This term is a bit historic ... originally most
systems processed batches of data non-interactively
(cheques are one of the few batch-oriented systems left
now). Now systems all tend to be on-line / interactive
Since the industry calls it OLTP, we will too.
9
Management Reporting
Management have different needs to that of the
operational side of the business. Managers are much
more concerned with trends and totals and are
generally not so concerned with the finer details.
What they want are reporting systems that:




Give quick access to summaries of data
Have data structures that are business oriented
Allow users to explore the data
Give them control over report writing
10
The manager of the bank wants to analyse the
effectiveness of last month’s business
List five typical queries
1.__________________________________
2.__________________________________
3.__________________________________
4.__________________________________
5.__________________________________

Nb: TRY to stick with just data drawn from the Cashier system. But you will
find this hard. Later we will see that it is a feature of Analytical system that
they integrate data from many sources
11
What similarities can you spot about the nature of the data
involved in your five queries?
Hint: volume, up-to-dateness (“currency”), level of detail
List four similarities

__________________________________
1.
__________________________________
2.
__________________________________
3.
__________________________________
4.
12
What general characteristics do you think can be stated
about analytical applications?
Hint: Complete the following sentence: The application is
oriented towards ..........
List four characteristics

__________________________________
1.
__________________________________
2.
__________________________________
3.
__________________________________
4.
13
Analytical Systems
1. Small volume of
transactions
2. Often huge
processing per
transaction
3. Data output level is
summary
4. Data routinely added
to, but infrequently
changed
5. Analysis driven
6. Flexible results
structure
7. 'Fairly accurate' better
than no result
8. Medium availability
9. Requires different
database tools
Table 2: Attributes of an Analytical System 14
On-line Analytical Processing (OLAP)
The Management Data is better stored in a Data Warehouse.
 Data is stored in structures easy for business users to understand
(not constrained by Relational rules)
 Data held in duplicate if this makes access easier/quicker
(eg can hold summary/totals of data too)
 Out-of-date data held (with timestamp) as well as new
(allows examination of historical trends)
These sort of systems are referred to as OLAP systems or
On-Line Analytical Processing Systems.
15
The Process Flow
The diagram below shows how Data is converted
into Business Intelligence
Data
Information
Structure
Analyse
Knowledge
Intelligence
Apply
16
The diagram below shows the relationships
between the various components:
Application
Software, SAP, SQL
Server, Oracle,
Spreadsheets etc
OLAP
OLTP
SQL Server,
Oracle, Ingres,
Informix, DB/2
etc
Database
Cleansing/Staging
SQL Server
Analysis Services,
Crystal Analysis,
Oracle Discovery
etc
SAS Warehouse
Administrator,
SQL Server etc
Data
Warehouse
Data
Information
Structure
Knowledge
Analyse
Intelligence
Apply
Application
Software, SAP, SQL
Server, Oracle,
Spreadsheets etc
OLAP
OLTP
SQL Server,
Oracle, Ingres,
Informix, DB/2
etc
Database
Cleansing/Staging
SQL Server
Analysis Services,
Crystal Analysis,
Oracle Discovery
etc
SAS Warehouse
Administrator,
SQL Server etc
Data
Warehouse
Definition
A Data Warehouse is ...
"... where data is specifically
structured for query and analysis
performance and ease-of-use"
Kimball, 2002
19
Why a Data Warehouse?
Data Warehouses offer the flexibility needed to cope
with the Management demands. A major issue is
that often there are many differing OLTP systems
and other data storage media. The Data Warehouse
offers the opportunity to gather these together into
one system with a unified structure.
Because data is stored in a simplified aggregated
format it allow reports to be written by staff who have
a lesser computing background.
20
OLAP versus OLTP
If a report is designed to obtain information
from an OLTP system it will generally be:
• Slow to produce the answer
• Complicated to write
• Slow down the operational system
• Need complicated formulas for grouping data
21
Summary
Operational
Analytical
Data
Individual Items
Summarised
Data Relations
Simple Chains
Complex/Unknown
Time
Present
Past
Access
Record-at-a-time
Many records
Approach
Support a transaction Explore a domain
Implementation Relational
Follow the module!
22