Transcript Data Mining
Data Mining
David Eichmann
School of Library and Information Science
The University of Iowa
Why?
Given enough data represented through
enough dimensions, we lose the ability to see
the patterns
How?
Decision Trees
Nearest Neighbor Clustering
Neural Networks
Rule Induction
K-Means Clustering
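As one illustration of the techniques listed above, here is a minimal sketch of K-Means clustering in plain Python. It assumes numeric points of equal dimension, squared Euclidean distance, and the first k points as starting centroids; the function name `kmeans` and its parameters are illustrative choices, not taken from the slides.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    # Assumption: seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
```

Real applications usually choose the starting centroids more carefully (for example at random, with several restarts), since the result depends on them.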
What is it?
The automated extraction of hidden predictive
information from databases.
Key points
Automated
Hidden
Predictive
The Typical Process
Evaluation Criteria
Receiver Operating Characteristic Curves
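A rough sketch of how an ROC curve can be computed from a classifier's scores: sweep a threshold from the highest score downward and record the false positive rate and true positive rate at each step. The names `roc_points` and `auc` and the sample labels and scores are made up for illustration, and the sketch assumes both classes are present in the data.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a run of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical data: 1 = relevant, 0 = not relevant.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
```

An area of 1.0 means every positive is ranked above every negative; 0.5 is what random ranking would give.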
But Nobody Said We Had To Do
MATH….
Forms of Data
Structured
Databases
Forms
Semi-Structured
Tables on the Web
Bibliographic citations
Graphs & charts
Unstructured
Full text (e.g., journal articles, physician
chart notes)
Images
Text Mining
Corpus now is a collection of text artifacts
Full text when you’ve got it (e.g. newswire)
Metadata when you don’t (e.g. MEDLINE)
The trick then becomes extracting ‘interesting’
relationships between ‘interesting’ entities
Who killed whom
Who works for whom
Who makes what
The Classic Entities
Persons
Organizations
Places (Geography)
Events
A Newswire Example
APW19981001.0262 [Israel(0.271), Jonathan Pollard (0.153),
Benjamin Netanyahu(0.102), Bill Clinton(0.102), United
States(0.055), ...]
Persons
Bill Clinton (3)
Jonathan Pollard (8)
Moshe Fogel (2)
Benjamin Netanyahu (2)
Israeli Embassy (1)
Organizations
Cabinet (1)
Places
Israel (16)
United States (5)
Washington (2)
In the Medical/Health Realm
UMLS is an excellent framework
Organism
Chemical
Activity
Disease
A MEDLINE Example
Document: 89316090 - Reconstructive surgery in Nicaragua
Provided MeSH Keywords
Human
Nicaragua
Z01.107.169.690
Surgery, Plastic/*
G02.403.810.788
Phrases
[Reconstructive, surgery]
[Nicaragua]
[letter]
MeSH Terms
Surgery (1)
G02.403.810.762
Letter [Publication Type] (1)
Other Phrases
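The dotted codes in the example above are MeSH tree numbers, where each dot-separated segment is one level of the hierarchy. A small sketch, with hypothetical helper names `ancestors` and `common_ancestor`, of how two terms can be related through their tree numbers:

```python
def ancestors(tree_number):
    """List the broader nodes above a tree number, most general first."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def common_ancestor(a, b):
    """Longest shared prefix of two tree numbers, segment by segment."""
    shared = []
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        shared.append(x)
    return ".".join(shared)


# The two G02 tree numbers from the example above.
provided = "G02.403.810.788"
extracted = "G02.403.810.762"
parent = common_ancestor(provided, extracted)  # "G02.403.810"
```

The provided keyword and the extracted term sit under the same parent node, which is one way to judge how close an extracted term is to the indexer's choice.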
Concept Extraction
Example
“Roman forces under Julius Caesar invade Britain.”
(S (NP (NP Roman forces)
(PP under
(NP Julius Caesar)))
(VP invade
(NP Britain))
.)
Entity Attributes:
<organization Roman forces>
<person Julius Caesar>
<placename Britain>
Concepts:
<Roman forces - under - Julius Caesar>
<Roman forces - invade - Britain>
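One way the concepts above could be read off the parse tree, as a minimal sketch only: it handles just the two patterns in this sentence (a noun phrase modified by a prepositional phrase, and a subject-verb-object clause). The function names are illustrative and this is not presented as the method used in the demo.

```python
import re

TREE = """(S (NP (NP Roman forces)
          (PP under
              (NP Julius Caesar)))
   (VP invade
       (NP Britain))
   .)"""


def parse(text):
    """Turn a bracketed parse into nested lists: [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # skip "("
        node = [tokens[pos]]          # phrase label, e.g. "NP"
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read())
            else:
                node.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return node

    return read()


def words(node):
    """Words sitting directly under a node."""
    return " ".join(c for c in node[1:] if isinstance(c, str))


def children(node, label):
    return [c for c in node[1:] if isinstance(c, list) and c[0] == label]


def head(np):
    """Core phrase of an NP: descend into the first nested NP, if any."""
    inner = children(np, "NP")
    return head(inner[0]) if inner else words(np)


def concepts(node):
    found = []
    if node[0] == "NP":               # NP + PP: <head - preposition - object>
        for pp in children(node, "PP"):
            for obj in children(pp, "NP"):
                found.append((head(node), words(pp), head(obj)))
    for child in node[1:]:
        if isinstance(child, list):
            found.extend(concepts(child))
    if node[0] == "S":                # clause: <subject - verb - object>
        subjects = children(node, "NP")
        for vp in children(node, "VP"):
            for obj in children(vp, "NP"):
                if subjects:
                    found.append((head(subjects[0]), words(vp), head(obj)))
    return found


triples = concepts(parse(TREE))
# [('Roman forces', 'under', 'Julius Caesar'), ('Roman forces', 'invade', 'Britain')]
```

Assigning the entity attributes (organization, person, placename) is a separate step that this sketch does not attempt.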
And a Small Demo…