CS 524 – High Performance Computing

CS 543 – Data Warehousing
Course Outline

Data Warehousing?

What is data warehousing?
- A paradigm specifically designed for strategic business
  information or decision making
- In essence, data warehousing is a data-driven
  decision-support system

What is a data warehouse?
- It is an informational environment with the following
  characteristics:
  - provides an integrated and total view of the enterprise (data),
    current and historical, and makes it easily available for
    decision support
  - decision support transactions do not impact operational systems
  - maintains a consistent view of the enterprise
  - provides a flexible and interactive source of strategic
    information
CS 543 - Data Warehousing (Sp 2005-2006) - Asim Karim @ LUMS

Why Study DW?
- New technologies (multidimensional modeling, business
  intelligence, OLAP, querying models, etc.)
- Research potential (data mining, business intelligence, ETL
  algorithms, multidimensional data analysis, query
  optimizations, etc.)
- Industry demand
- High market value of DW experts
- Fulfill degree requirements
- Easy course, want to sleep through it :)

Description

This course will
- cover the concepts and techniques in the design and
  construction of high-performance data warehouses
- discuss business, software, hardware, and design factors
  influencing successful implementations of data warehouses
- focus on both dimensional and relational data modeling
- distinguish between DSS (Decision Support System) and OLTP
- introduce OLAP and ETL algorithms and systems
- provide hands-on experience with data warehousing tools

Goals
- Introduction to the concepts and techniques in data
  warehousing
- Design and construction of high-performing data warehouses
- Hands-on experience with a commercial data warehousing tool
  (Teradata)
- Motivation for research in large-scale data analysis and
  business intelligence

Before Taking This Course…
You should be comfortable with…
- Basics of databases
  - CS 341 is a prerequisite
  - Fundamentals of RDBMS; EER modeling; concept of
    normalization; querying; design and development exposure
- Basics of programming and algorithms
  - CS 213 is a prerequisite
  - Understanding, evaluating, and implementing algorithms

After Taking This Course…
You should be able to…
- Design and construct data warehouses
- Understand the concepts and techniques in data warehousing
- Use a data warehouse to extract strategic information
- Pursue further studies and research in data warehousing,
  large data analysis, business intelligence, and data mining
- Work with a commercial data warehousing tool

Grading

Points distribution
  Labs/Assignments              20%
  Quizzes                       15%
  Midterm exam                  30%
  Final exam (comprehensive)    35%
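
The points distribution amounts to a weighted sum of the four components. A minimal sketch of that arithmetic follows; the function name, the dictionary keys, and the example scores are hypothetical and are not part of the course material.

```python
# Component weights in percent, taken from the points distribution above.
WEIGHTS = {
    "labs_assignments": 20,
    "quizzes": 15,
    "midterm": 30,
    "final": 35,
}


def course_score(component_scores):
    """Combine per-component percentages (0-100) into a weighted course score."""
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS) / 100


# Hypothetical scores for one student, each expressed as a percentage.
example = {"labs_assignments": 90, "quizzes": 80, "midterm": 70, "final": 60}
print(course_score(example))  # 72.0
```

Because the final exam carries the largest weight, a change of 10 points there moves the course score by 3.5 points, against 1.5 points for the same change in quizzes.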

Policies (1)

Quizzes
- Quizzes may or may not be announced. If a quiz is announced,
  the announcement will be made 1 to 2 days in advance

Sharing
- No copying is allowed for labs/assignments. Discussions are
  encouraged; however, you must submit your own work
- Violators can face mark reduction and/or be reported to the
  Disciplinary Committee for action

Plagiarism
- Do NOT pass off someone else’s work as yours! Write in your
  own words and cite the reference. This applies to code as well.
- Violators can face mark reduction and/or be reported to the
  Disciplinary Committee for action

Policies (2)

Submission policy
- Submissions are due on the day and at the time specified
- Late penalties: 1 day late = 10%; 2 days late = 20%; not
  accepted after 2 days
- An extension will be granted only if there is a need and when
  requested several days in advance.
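
The late-penalty rule can be sketched as a small function. This is only an illustration under stated assumptions: the slide does not say whether the percentage is taken from the maximum marks or from the marks earned, so the sketch assumes the maximum marks, and the function name and parameters are hypothetical.

```python
def late_adjusted_marks(marks, days_late, max_marks=100):
    """Apply the late penalty; return None when the submission is not accepted.

    Assumption: the penalty is 10% of the maximum marks per day late.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 2:
        return None  # not accepted after 2 days
    penalty = max_marks * 10 * days_late / 100
    return max(marks - penalty, 0)


# Hypothetical submission that earned 80 out of 100 before any penalty.
for days in range(4):
    print(days, late_adjusted_marks(80, days))
```

An approved extension would simply move the due date, so days_late would be counted from the extended deadline.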

Rechecking policy
- For quizzes and labs/assignments: No recheck request will be
  entertained after 2 days of return
- For midterm exam: No recheck request will be entertained
  after 5 days of return (and should be made at the time of
  collection)
- For final exam: No recheck request will be entertained after
  the start of the next quarter

Summarized Course Contents
- DW fundamentals, need for a DW, decision support vs.
  transaction processing, evolution of a DW
- Business requirements as the driving force for the DW,
  matching information to classes of users
- Dimensional modeling
- Architecture and infrastructure, data extraction,
  transformation and loading, data quality
- Selected de-normalizations, horizontal and vertical
  partitioning, materialized views
- Physical design
- Data mart design, web data warehousing
- Current topics in data warehousing

Course Material

Primary text
- P. Ponniah, Data Warehousing Fundamentals, John Wiley &
  Sons, 2001.

Supplementary text
- C. Imhoff et al., Mastering Data Warehouse Design:
  Relational and Dimensional Techniques, John Wiley & Sons,
  2003.

Other resources
- Lecture slides
- Handouts
- Web resources
- Books in library

Course Web Site

For announcements, lecture slides, handouts, labs,
assignments, quiz solutions, and web resources:
http://suraj.lums.edu.pk/~cs543s05/

The resource page has links to information available on
the Web. It is basically a meta-list for finding further
information.

Other Stuff

How to contact me?
- Office hours: 10.30 to 12.30 MW (office: 429)
- E-mail: [email protected]
- By appointment

Philosophy
- Knowledge cannot be taught; it is learned.
- Be excited. That is the best way to learn. I cannot teach
  everything in class. Develop an inquisitive mind, ask
  questions, and go beyond what is required.
- I don’t believe in strict grading. But… there has to be a way
  of rewarding performance.

Pakistani Students…
“It was good to see that the students were quite good at
abstract discussions and given my teaching experience at
foreign universities, I would rate the batch I taught as
competitive.
“My advice to Pakistani students is that they need to become
aggressive learners and realise that a university education
assumes that the student is mature enough to take control of
his or her destiny.”

- Dr. Raja Muhammad Atif Azad, Limerick, Ireland
Appeared in Dawn:
http://www.dawn.com/2006/03/02/letted.htm#4

Reference Books in LUMS Library (1)
- C. Imhoff et al., Mastering data warehouse design:
  relational and dimensional techniques, Wiley, 2003.
- W. Inmon, Building the data warehouse, Wiley, 2005.
- R. Kimball, The data warehouse toolkit: the complete
  guide to dimensional modeling, Wiley, 2002.
- R. Kimball, The data warehouse ETL toolkit: practical
  techniques for extracting, cleaning, conforming, and
  delivering data, Wiley, 2004.

Reference Books in LUMS Library (2)
- Building, using, and managing the data warehouse, Barquin,
  Ramon C., ed.; Edelstein, Herbert A., ed., 005.74 B932, 1997.
- Data warehousing and business intelligence for e-commerce,
  Simon, Alan R.; Shaffer, Steven L., 658.84 S594D, 2001.
- Data warehousing for e-business, Inmon, W. H.; Terdeman, R.
  H.; Norris-Montanari, Joyce; Meers, Dan, 658.84 D232, 2001.
- Data warehousing in the real world; a practical guide for
  building decision support systems, Anahory, Sam; Murray,
  Dennis, 005.74 A532D, 2000.
- Data warehousing; concepts, techniques, products and
  applications, Prabhu, C.S.R., 005.74 P895D, 2002.
- Data warehousing; strategies, technologies, and techniques,
  Mattison, Rob, 658.4038 M444D, 1996.
- Introduction to business intelligence and data warehousing,
  IBM, Prentice-Hall of India, 658.47 I619, 2004.

Reference Books in LUMS Library (3)
- Data warehousing; the ultimate guide to building corporate
  business intelligence, Educations B. V., SCN., ed., 005.74 D232,
  2001.
- Decision support in the data warehouse, Gray, Paul; Watson,
  Hugh J., 005.74 G778D, 1998.
- Intelligent data warehousing; from data preparation to data
  mining, Chen, Zhengxin, 005.74 C518I, 2002.
- The data webhouse toolkit; building the web-enabled data
  warehouse, Kimball, Ralph; Merz, Richard, 005.74 K495D,
  2000.
- Oracle8i data warehousing plan, Corey, Michael ... [et al.],
  005.7585 O631, 2001.
- Data warehousing with Oracle; an administrator's handbook,
  Yazdani, Sima; Wong, Shirley S., 005.74 S588D, 1998.