Knowledge Discovery in the Digital Library

Access tools for mining science
ICSTI Public Workshop
Presented by:
Bernard Dumouchel, Director-General
February 3, 2006
Overview
• Knowledge Discovery
– Linked-Literature Analysis
– Main Path Analysis
• Digital Libraries
• Integrating access into research
Knowledge Discovery
The process of transforming data into previously unknown
or unsuspected relationships.
(Trybula 1997)
• Process for discovering and extracting new information:
– Statistics
– Pattern recognition
– Machine learning
– Visualization
• Goal of knowledge discovery is to identify higher-level,
more abstract relationships between texts.
Knowledge Discovery
              Data Mining            Knowledge Discovery
Measure       Discrete quantities    Relationships
Expression    Probabilities          Interpretations
• Computationally intensive
• Augments human expertise:
– Interactive, mental process
Linked Literature Analysis
• Don Swanson
– Specialization → Balkanization of science
– “Undiscovered Public Knowledge”
– Transitory links between disjoint concepts
» A ∩ C = Null
» ABC
Linked Literature Analysis
• Raynaud’s Disease ↔ Blood Viscosity ↔ Fish Oils
• Migraine ↔ Calcium ↔ Magnesium
• Somatomedin C ↔ HGH ↔ Arginine
Linked Literature Analysis
• ARROWSMITH
– Neil Smalheiser, MD, PhD
– Interactive software that extends the power of the
MEDLINE search
– http://arrowsmith.psych.uic.edu/arrowsmith_uic/index.html
• CISTI Research to generalize Linked Literature Analysis to
other scientific domains
Main Path Analysis
• A type of social network analysis
• Citation = formal record of intellectual link
• Citation network is a social network of science
• Study of webs of relationships between seemingly
disorganized items
Main Path Analysis
• Norman Hummon & Patrick Doreian (1989)
• Sequence of articles that best represent the development of
a research field
• Condenses web of relationships into a concise pathway
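One way to make "condensing a web of relationships into a pathway" concrete is the sketch below. It rests on assumptions the slides do not state: edges are weighted by search path count (SPC, a weighting from later literature, used here instead of Hummon and Doreian's original weights), and the main path is taken to be the source-to-sink path with the largest total weight. The citation network is a toy example invented for illustration, and the function name `main_path` is hypothetical.

```python
from collections import defaultdict
from functools import lru_cache

# Toy citation network, invented for illustration. An edge (u, v) means the
# newer paper v cites the older paper u. The graph must be acyclic.
edges = [
    ("P1", "P2"), ("P1", "P3"),
    ("P2", "P4"), ("P3", "P4"), ("P2", "P5"),
    ("P4", "P6"), ("P5", "P6"),
    ("P6", "P7"),
]


def main_path(edges):
    """Return (SPC weights, heaviest source-to-sink path, its total weight)."""
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    @lru_cache(maxsize=None)
    def paths_from_sources(v):
        # Number of distinct paths that reach v from any source.
        return sum(paths_from_sources(u) for u in pred[v]) if pred[v] else 1

    @lru_cache(maxsize=None)
    def paths_to_sinks(v):
        # Number of distinct paths that lead from v to any sink.
        return sum(paths_to_sinks(w) for w in succ[v]) if succ[v] else 1

    # SPC of an edge: how many source-to-sink paths pass through it.
    spc = {(u, v): paths_from_sources(u) * paths_to_sinks(v) for u, v in edges}

    @lru_cache(maxsize=None)
    def best_from(v):
        # (total weight, path) of the heaviest path from v down to a sink.
        if not succ[v]:
            return 0, (v,)
        return max(
            (spc[(v, w)] + best_from(w)[0], (v,) + best_from(w)[1])
            for w in succ[v]
        )

    sources = [v for v in sorted(nodes) if not pred[v]]
    total, path = max(best_from(s) for s in sources)
    return spc, list(path), total


spc, path, total = main_path(edges)
print(" -> ".join(path), "| total SPC weight:", total)
```

In this toy network the edge P6 → P7 carries the most weight because every source-to-sink path passes through it. Ties between equally heavy paths are broken arbitrarily here (by tuple ordering), which a fuller treatment would handle explicitly.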
Main Path Analysis
[Figure: main path diagram; axis label: Time]
Knowledge Discovery
• Analyze relationships, interpret structure of science
• Information is plentiful; knowing how it fits together is
knowledge
• Main Path Analysis and Linked Literature Analysis uncover
meaningful relationships that suggest new knowledge
Roles of the Digital Library
• Institutional repositories
• Preservation of research data
• Systems that make information useful
Digital libraries are systems that make digital collections come
alive, that make them usefully accessible, and that make them
useful for accomplishing work.
(Lynch, 2002)
Access for e-science
• Access makes research easier
• Access tools with built-in analysis make research faster and
more powerful
• Digital Library’s challenge: develop and offer tools where
research and access can be combined
Summary
• Seamless access is not just about convenience
• Knowledge Discovery tools enable e-science to be a more
integral part of research
• Research libraries are the labs that make information useful