Course Overview - CSIM
Information Retrieval and Data Mining (AT71.07)
Comp. Sc. and Inf. Mgmt.
Asian Institute of Technology
Course Overview
Instructor: Prof. Sumanta Guha
Office: 104 CSIM Building
Email: [email protected]
Telephone: 5714
Credits: 3(3-0)
Prerequisite: Officially none
Course Website: http://www.cs.ait.ac.th/~guha/IRDM/
Class times: Mon. & Th. 14:00-15:30
Discussion Group: Yahoo group – ait_csim_irdm
WWW: http://groups.yahoo.com/group/ait_csim_irdm/
Email: [email protected]
You must join the group!
Membership is currently open, so anyone can join: just go to the link above and
click the join button. If you have a problem, send me an email and I will invite you.
Important: It’s good for everyone to post questions and comments to
the discussion group!! Then, everybody benefits from the interaction.
Announcements by the instructor will always be posted to the group.
However, if you wish to see me in my office you are always welcome
(provided I am not busy). It’s best to make an appointment. Note that
I am not a morning person.
Please check the group frequently and please participate in
discussions!!
Textbooks (required):
C. D. Manning, P. Raghavan, H. Schütze (2008), Introduction to
Information Retrieval, Cambridge University Press.
J. Han, M. Kamber, J. Pei (2011), Data Mining: Concepts and
Techniques, 3rd edition, Morgan Kaufmann (2nd ed. is fine!).
Brief Course Outline:
We will alternate between information retrieval and data mining in the early
weeks – one period each week studying IR and the other DM. Later the two
threads will converge.
We will begin with Chapters 1, 2, 4, 6, … from the IR book and Chapters 5, 6,
7, … from the DM book. Chapters are omitted because they are either
elementary (left to the student to read on her own) or dig too deep into one
particular area (our goal is broad coverage, not specialization).
Objectives:
To learn the fundamental concepts of modern-day IRDM.
To become familiar with recent literature. IRDM is a young field so most
developments are, in fact, recent. Therefore, using original research papers
as source material is not only possible, but advised.
To acquire some familiarity with practical IRDM software, e.g., Weka.
Reference Books:
M. J. A. Berry and G. Linoff (1997), Data Mining Techniques:
For Marketing, Sales, and Customer Relationship
Management, Wiley.
I. H. Witten and E. Frank (2001), Data Mining: Practical
Machine Learning Tools and Techniques, Morgan Kaufmann.
T. Soukup and I. Davidson (2002), Visual Data Mining:
Techniques and Tools for Data Visualization and Mining,
Wiley.
P. Tan, M. Steinbach and V. Kumar (2005), Introduction to Data
Mining, Addison-Wesley.
D. T. Larose (2006), Data Mining Methods and Models, Wiley.
B. Croft, D. Metzler, T. Strohman (2009), Search Engines:
Information Retrieval in Practice, Addison-Wesley.
Grading System (tentative):
Mid-sem – 40%
Final – 60%
Enjoy the Course!
Be enthusiastic about the material because it is interesting,
practical, and extremely important in the modern-day world. Our
job is to help you learn and enjoy the experience. We will do our
best but we also need your help.