Introduction to Database Systems
Data Warehousing/Mining
Comp 150DW
Course Overview
Instructor: Dan Hebert
Comp 150
Thursday 6:50 - 9:50 PM
Instructor - Mr. Dan Hebert
– email - [email protected]
– Location - Halligan Hall, rm. 108
Course Description
Fundamental concepts and techniques of data
warehousing and data mining
– concepts, principles, architecture, design, implementation,
and application of data warehousing and data mining
Topics: Data warehousing and OLAP technology for
data mining, data preprocessing, data mining
primitives, languages and systems, descriptive data
mining, both characterization and comparison,
association analysis, classification and prediction,
cluster analysis, mining complex types of data, and
applications and trends in data mining
Course Prerequisite
Comp 115 – Introduction to RDBMS
– Familiarity with programming in C/C++ is assumed
Students should be comfortable with:
– relational model basics
– relational algebra
– SQL
– Views
– Security
– conceptual database design and ER models
– schema refinement and normal forms
– physical database design and tuning
Required Textbook
Data Mining Concepts and Techniques
– Jiawei Han & Micheline Kamber
– Morgan Kaufmann Publishers; ISBN: 1-55860-489-8
Reading Schedule
Lecture Date | Topic | Reading: Text Chapter
January 22 | Introduction to Comp 150, Introduction | 1
January 29 | Data Warehouse and OLAP Technology for Data Mining | 2
February 5 | Aggregation in SQL, Data Warehousing Introduction, Data Warehousing Design | Not In Book
February 19 | President’s Day Schedule Shift – No Class |
February 26 | Data Warehouse Semantics, Semistructured Data | Not In Book
March 4 | Data Preprocessing | 3
March 11 | Data Mining Primitives, Languages, and System Architectures – Midterm Review | 4
March 18 | Midterm Exam |
Reading Schedule (continued)
Lecture Date | Topic | Reading: Text Chapter
April 1 | Concept Description: Characterization and Comparison | 5
April 8 | Mining Association Rules in Large Databases | 6
April 15 | Classification and Prediction | 7
April 14 | Cluster Analysis | 8
April 22 | Mining Complex Types of Data | 9
April 29 | Applications and Trends in Data Mining – Final Exam Review | 10
May 6 | Reading Period/Project Completion |
May 13 | Final Exam |
Grading
Homework: 30%
Project: 10%
Midterm: 25%
Final: 35%
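As a rough illustration of how these weights combine, the sketch below computes a weighted course grade. The weights come from the grading breakdown above; the function name, the 0-100 score scale, and the sample scores are assumptions made only for this example.

```python
# Weights (in percent) from the Grading slide.
WEIGHTS = {"homework": 30, "project": 10, "midterm": 25, "final": 35}


def course_grade(scores):
    """Return the weighted average of component scores.

    `scores` maps each component name to a score on a 0-100 scale
    (the scale is an assumption; the slides do not specify one).
    """
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {"homework": 90, "project": 80, "midterm": 70, "final": 85}
print(course_grade(example_scores))  # 82.25
```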
Homework
Assigned weekly (each Wednesday)
– Due at the start of lecture the following Wednesday
Late policy:
– Homework turned in up to one week after the due date: 20% penalty
– Homework turned in any time later: 100% penalty
Typical homework assignment
– Exercises from the text
– “Hands-on” problems that involve building data
warehouses and performing data mining
Working with PostgreSQL
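The late policy above can be sketched as a small function. This is one possible reading, not an official rule: it assumes the 20% penalty is taken off the score earned (rather than off the maximum possible score) and that lateness is counted in whole days. The function name and parameters are hypothetical.

```python
def late_adjusted_score(raw_score, days_late):
    """Apply the homework late policy to a raw score.

    Assumptions (not stated in the slides): the 20% penalty is a
    reduction of the earned score, and lateness is in whole days.
    """
    if days_late <= 0:
        return raw_score              # on time: no penalty
    if days_late <= 7:
        return raw_score * 80 / 100   # up to one week late: 20% penalty
    return 0                          # later than one week: 100% penalty


print(late_adjusted_score(90, 0))  # 90
print(late_adjusted_score(90, 3))  # 72.0
print(late_adjusted_score(90, 8))  # 0
```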
Project
Develop a data warehouse and perform data
mining on it using Postgres as the
underlying datastore
Additional details provided as the course
progresses
Midterm & Final
Open book, open notes
Opportunity during class for review of
material covered prior to midterm and final
Computing Environment
All students will have a computer account
on psql.cs.tufts.edu
– Account will work on all workstations in the
SUN lab
The RDBMS used will be PostgreSQL,
an open-source system
– For information, see http://www.postgresql.org/index.html
Course Homepage
Course web page will be available
Lectures/homework assignments will also
be posted in my account
– ~dhebert/comp150dw