Introduction to Database Systems


Data Warehousing/Mining
Comp 150DW
Course Overview
Instructor: Dan Hebert
Comp 150
Thursday 6:50 - 9:50 PM
Instructor - Mr. Dan Hebert
– email - [email protected]
– Location - Halligan Hall, rm. 108
Course Description

Fundamental concepts and techniques of data
warehousing and data mining
– concepts, principles, architecture, design, implementation,
and application of data warehousing and data mining

Topics: Data warehousing and OLAP technology for
data mining, data preprocessing, data mining
primitives, languages and systems, descriptive data
mining, both characterization and comparison,
association analysis, classification and prediction,
cluster analysis, mining complex types of data, and
applications and trends in data mining
Course Prerequisite

Comp 115 – Introduction to RDBMS
– Familiarity with programming in C/C++ is assumed

Students should be comfortable with:
– relational model basics
– relational algebra
– SQL
– Views
– Security
– conceptual database design and ER models
– schema refinement and normal forms
– physical database design and tuning
Required Textbook

Data Mining: Concepts and Techniques
– Jiawei Han & Micheline Kamber
– Morgan Kaufmann Publishers; ISBN: 1-55860-489-8
Reading Schedule
Lecture Date | Topic | Reading: Text Chapter
January 22 | Introduction to Comp 150, Introduction | 1
January 29 | Data Warehouse and OLAP Technology for Data Mining | 2
February 5 | Aggregation in SQL, Data Warehousing Introduction, Data Warehousing Design | Not In Book
February 19 | President’s Day Schedule Shift – No Class |
February 26 | Data Warehouse Semantics, Semistructured Data | Not In Book
March 4 | Data Preprocessing | 3
March 11 | Data Mining Primitives, Languages, and System Architectures – Midterm Review | 4
March 18 | Midterm Exam |
Reading Schedule (continued)
Lecture Date | Topic | Reading: Text Chapter
April 1 | Concept Description: Characterization and Comparison | 5
April 8 | Mining Association Rules in Large Databases | 6
April 15 | Classification and Prediction | 7
April 14 | Cluster Analysis | 8
April 22 | Mining Complex Types of Data | 9
April 29 | Applications and Trends in Data Mining – Final Exam Review | 10
May 6 | Reading Period/Project Completion |
May 13 | Final Exam |
Grading
Homework - 30%
Project - 10%
Midterm - 25%
Final - 35%
Homework

Assigned weekly (each Wednesday)
– Due at the start of lecture the following Wednesday

Late policy:
– Homework turned in up to one week after the due date - 20% penalty
– Homework turned in anytime later - 100% penalty

Typical homework assignment
– Exercises from the text
– “Hands-on” problems that involve building data
warehouses and performing data mining

Working with PostgreSQL
Project
Develop a data warehouse and perform data mining on it using Postgres as the underlying datastore
Additional details provided as the course progresses

Midterm & Final
Open book, open notes
Opportunity during class for review of material covered prior to midterm and final

Computing Environment

All students will have a computer account
on psql.cs.tufts.edu
– Account will work on all workstations in the
SUN lab

The RDBMS utilized will be PostgreSQL, an open-source system
– For information, see http://www.postgresql.org/index.html
Course Homepage
Course web page will be available
Lectures/homework assignments will also be posted in my account
– ~dhebert/comp150dw