Jieping - Arizona State University

CSE 494/598: Numerical Linear Algebra for Data Exploration
Jieping Ye
Department of Computer Science and Engineering
Arizona State University
http://www.public.asu.edu/~jye02
Course Information
• Instructor: Dr. Jieping Ye
• Office: BY 568
• Phone: 480-727-7451
• Email: [email protected]
• Web: www.public.asu.edu/~jye02/CLASSES/Fall-2007/
• Time: MW 10:40 AM - 11:55 AM
• Location: BYAC 110
• Office hours: MW 2:30 PM - 4:00 PM
Course Information (Cont’d)
• Prerequisite: Basic linear algebra skills.
• Course textbook: Matrix Methods in Data Mining and
Pattern Recognition, by Lars Eldén, 2007.
• Objectives:
– Teach the basics of numerical linear algebra.
– Provide extensive hands-on experience in applying
linear algebra techniques to real-world
applications.
Course Information (Cont’d)
• The Matrix Cookbook, by Kaare B. Petersen and Michael S. Pedersen.
Available online at
http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=3274
• Introduction to Linear Algebra, by Gilbert Strang, 2003.
• Applied Numerical Linear Algebra, by James W. Demmel, 1997.
• Matrix Computations, by Gene H. Golub and Charles F. Van Loan, 1996.
• Pattern Recognition and Machine Learning, by Christopher M. Bishop,
2006.
• The Elements of Statistical Learning: Data Mining, Inference, and
Prediction, by T. Hastie, R. Tibshirani, and J. Friedman, 2001.
Topics: Part I
• Linear algebra background
– Vectors and Matrices
– Linear Systems and Least Squares
– Singular Value Decomposition
– Reduced Rank Least Squares Models
– Tensor Decomposition
– Clustering and Non-Negative Matrix Factorization
Topics: Part II
• Applications
– Classification of Handwritten Digits and face images
– Text Mining
– Page Ranking for a Web Search Engine
– Automatic Key Word and Key Sentence Extraction
– Massive data compression using tensor SVD
– Clustering and classification of Microarray gene
expression data
– Gene expression pattern image classification and
retrieval
Tentative Class Schedule
Grading
• Homework (6): 30%
• Project (1): 10%
• Exam (2): 40%
• Quiz (2): 10%
• Attendance: 10%
• Assignments and projects are due at the beginning of
the lecture. Late assignments and projects will not be
accepted. Attendance at lectures is mandatory.
Classification of Handwritten Digits
Text Mining
• Understand methods for extracting useful information
from large and often unstructured collections of texts.
• Another closely related term is information retrieval.
• Vector space model for document representation
– Create a term-document matrix
• Each document is represented by a column vector
– Latent Semantic Indexing (LSI)
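The term-document matrix above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three sample documents, the whitespace tokenizer, raw term counts as weights, and the helper names (`build_term_document_matrix`, `column`, `cosine`) are all hypothetical and not taken from the course material. LSI itself is not implemented here; it would additionally apply a truncated SVD to the matrix `A`.

```python
import math

def build_term_document_matrix(docs):
    """Return (terms, matrix), where matrix[i][j] counts term i in document j."""
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer (assumption)
    terms = sorted({tok for toks in tokenized for tok in toks})
    index = {term: i for i, term in enumerate(terms)}
    matrix = [[0] * len(docs) for _ in terms]
    for j, toks in enumerate(tokenized):
        for tok in toks:
            matrix[index[tok]][j] += 1
    return terms, matrix

def column(matrix, j):
    """Extract document j as a column vector (a list of term counts)."""
    return [row[j] for row in matrix]

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy corpus, for illustration only.
docs = [
    "matrix factorization for text mining",
    "text mining with the vector space model",
    "page ranking for a web search engine",
]
terms, A = build_term_document_matrix(docs)

# Documents 0 and 1 share more terms than documents 0 and 2.
sim_01 = cosine(column(A, 0), column(A, 1))
sim_02 = cosine(column(A, 0), column(A, 2))
```

Each column of `A` is one document, so comparing two documents reduces to comparing two columns. Cosine similarity is used here as one common choice for that comparison.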
Page Ranking for a Web Search Engine
• PageRank, used by Google
• HITS (Hyperlink-Induced Topic Search)
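As a rough illustration of the page-ranking idea, the following sketch computes PageRank scores by power iteration on a tiny link graph. The four-page graph, the damping factor of 0.85, the convergence tolerance, and the function name `pagerank` are assumptions chosen for the example, not details from the slides. HITS is not covered by this sketch.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank.

    links maps each page to the list of pages it links to; every
    link target must also appear as a key.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution
    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for q in links[p]:
                new_rank[q] += damping * rank[p] / len(links[p])
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Hypothetical link graph: A links to B and C, B to C, C to A, D to C.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
```

In this toy graph page C receives links from three pages and page D receives none, so C ends up with the highest score and D with the lowest.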
Face Recognition and Microarray Gene Expression Data Analysis
Gene Expression Pattern Image Analysis
(a-e) Series of five embryos stained with a probe (bgm)
(f-j) Series of five embryos stained with a probe (CG4829)
Survey
• Why are you taking this course?
• What would you like to gain from this course?
• What topics are you most interested in learning about
from this course?
• Any other suggestions?