A Concept Space Approach to
Semantic Exchange
Tobun Dorbin Ng
Dissertation Defense
April 19, 2000
Management Information Systems
The University of Arizona
Outline
• Introduction
• Literature Review
• Research Questions & Methodologies
• Concept Space Consultation
• Concept Space Generation
• Conclusions
Objective
• To investigate the use of information
technologies that clarify semantic
meaning to help users elaborate their
information needs by providing
library-specific knowledge during the
information-seeking process.
Introduction
Questions & Problems
• Does a query truly represent the user's information need?
• Can these knowledge sources adequately serve users' information needs?
[Diagram: Users; Query; Information Retrieval Systems (Keyword Search, Inverted Index, Summarization, Visualization); Search for Documents; Document Set; Browsing; Classifications; Knowledge Spaces (Concept Spaces, Category Spaces); Knowledge Discovery (Concept Association, Cluster Analysis); Distributed, Heterogeneous Database Collections (Text, Image, Video)]
Goal
• To adopt a user-centric and interactive
approach to helping users elaborate
their information needs with library-specific knowledge and simultaneously
gain insight into a library’s offerings
related to their information needs.
Research Issues
• Interactive Consultation with Knowledge
Sources
• Automatic Generation of Semantic-bearing Knowledge Sources from
Corresponding Libraries
Static Nature of Knowledge in
Library Collection
• Characterizing Document Objects
• Characterizing Global Knowledge in
Document Collections
– Grand Coverage
– Knowledge of Knowledge
• Revealing Knowledge in Neighborhood
– Contextual Information
Literature Review
Dynamic Nature of User
Information Need
• Expressing User Need
– Information Need
• Dynamic, not directly observable or symbolized
– Indeterminism
– Opportunism
– Vocabulary Problem
– Recognition with Contextual Information
• Key Word In Context, Relevance Feedback
Perceiving Knowledge
• What is the user’s perspective of
knowledge?
• How does a user perceive retrieved or
derived knowledge?
• Computing Relevance?
Structure & Context: Aids To
Perceive Knowledge
• Structureless and Contextless
– Document List
• Structural but Contextless
– Dynamic Clustering
• Structural and Contextual
– Path to the Knowledge
Research Questions
• Can knowledge sources be used to help users express their information needs?
[Diagram: Users; Information Need; Vocabulary & Context; Context-rich Query; Concept Consultation Systems; Concept Exploration (Branch-and-bound Search, Hopfield Net Activation); Search for Related Concepts; Information Retrieval Systems (Keyword Search, Inverted Index, Summarization, Visualization); Search for Documents; Context-coherent Document Set; Browsing; Classifications; Knowledge Spaces (Concept Spaces, Category Spaces); Knowledge Discovery (Concept Association, Cluster Analysis); Distributed, Heterogeneous Database Collections (Text, Image, Video)]
Research Questions & Methodologies
Research Methodologies
• Systems Development Approach
• Experimental Design
Concept Space Consultation
• Algorithmic Concept Exploration
• Large Networks of Knowledge
– Man-made Thesauri: LCSH & ACM CRCS
– Concept Spaces
• Spreading Activation
– Traversing a set of Knowledge Networks
automatically and suggesting a set of most
relevant concepts
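Illustrative note: the traversal described above can be sketched in Python as a best-first search in the spirit of branch-and-bound. This is a minimal sketch under stated assumptions, not the implementation evaluated in the dissertation. The function name `explore_concepts`, the toy network, and the rule that a path's score is the product of its link weights are all hypothetical choices made for illustration.

```python
import heapq

def explore_concepts(network, start_terms, max_results=5):
    """Best-first traversal of a weighted concept network.

    network: dict mapping concept -> {neighbor: link weight in (0, 1]}
    start_terms: concepts supplied by the user
    Returns up to max_results (concept, score) pairs, best score first.
    Assumption: a path's score is the product of its link weights.
    """
    start = set(start_terms)
    best = {term: 1.0 for term in start}
    # heapq is a min-heap, so scores are negated to pop the largest first.
    frontier = [(-1.0, term) for term in sorted(start)]
    heapq.heapify(frontier)
    finalized = set()
    suggestions = []

    while frontier and len(suggestions) < max_results:
        neg_score, concept = heapq.heappop(frontier)
        if concept in finalized:
            continue  # stale queue entry for an already expanded concept
        finalized.add(concept)
        score = -neg_score
        if concept not in start:
            suggestions.append((concept, score))
        for neighbor, weight in network.get(concept, {}).items():
            candidate = score * weight
            # Bound: skip paths that cannot beat the best known score.
            if neighbor not in finalized and candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(frontier, (-candidate, neighbor))
    return suggestions

# Hypothetical toy network; the weights are made up for illustration.
toy_network = {
    "information retrieval": {"indexing": 0.8, "thesaurus": 0.6},
    "indexing": {"inverted index": 0.9, "thesaurus": 0.5},
    "thesaurus": {"concept space": 0.7},
}

if __name__ == "__main__":
    for concept, score in explore_concepts(toy_network, ["information retrieval"], 3):
        print(f"{concept}: {score:.2f}")
```

Because every link weight is at most 1, a concept's score cannot improve once it has been popped from the queue, which is what lets the search stop as soon as enough suggestions are collected.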
Research Questions 1&2
• Would the automatic concept
exploration process be able to help
users identify more relevant concepts?
• Would such a process be able to
perform more efficient exploration of a
concept space than the conventional
manual browsing method?
Research Question 3
• If so, which algorithmic method, the
symbolic-based branch-and-bound or the
neural network-based Hopfield net
algorithm, is better in terms of
gathering relevant concepts from
knowledge sources?
Research Questions 4&5
• Would the concept space consultation
process provide a semantic medium to
reduce the cognitive demand from users
in terms of elaborating information
needs?
• Would the concept exploration process
be able to help users find more relevant
documents?
Two Algorithms for
Spreading Activation
• Branch-and-bound Algorithm
– Semantic Net Based: “Optimal” Search
• Hopfield Net Algorithm
– Neural Net Based: Parallel Relaxation
Search
• Spreading Activation Process
– Activation, Weight Computation, Iteration
– Stopping Condition
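Illustrative note: the activation, weight computation, iteration, and stopping steps listed above can be sketched as a Hopfield-style parallel relaxation. This is a hedged sketch rather than the algorithm as implemented in the study. The sigmoid transfer function, the parameters `theta_j`, `theta_0`, and `epsilon`, the clamping of user terms at full activation, and the function name `hopfield_activation` are assumptions introduced here for illustration.

```python
import math

def hopfield_activation(weights, start_terms, theta_j=0.5, theta_0=0.1,
                        epsilon=1e-4, max_iterations=100):
    """Spread activation over a weighted concept network until it settles.

    weights: dict mapping concept i -> {concept j: link weight from i to j}
    start_terms: concepts supplied by the user (initial activation 1.0)
    Returns (concept, activation) pairs for non-input concepts, strongest first.
    """
    concepts = sorted(set(weights) | {n for links in weights.values() for n in links})
    start = set(start_terms)
    # Activation step: user terms start at 1.0, everything else at 0.0.
    activation = {c: (1.0 if c in start else 0.0) for c in concepts}

    for _ in range(max_iterations):
        updated = {}
        for j in concepts:
            if j in start:
                updated[j] = 1.0  # assumption: user terms stay clamped
                continue
            # Weight computation: weighted sum of incoming activations.
            net = sum(weights.get(i, {}).get(j, 0.0) * activation[i] for i in concepts)
            # Assumed sigmoid transfer function with bias theta_j and slope theta_0.
            updated[j] = 1.0 / (1.0 + math.exp(-(net - theta_j) / theta_0))
        change = sum(abs(updated[c] - activation[c]) for c in concepts)
        activation = updated
        # Stopping condition: total change between iterations is small enough.
        if change <= epsilon:
            break

    ranked = [(c, a) for c, a in activation.items() if c not in start]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy network; the weights are made up for illustration.
toy_weights = {
    "information retrieval": {"indexing": 0.8, "thesaurus": 0.6},
    "indexing": {"inverted index": 0.9, "thesaurus": 0.5},
    "thesaurus": {"concept space": 0.7},
}

if __name__ == "__main__":
    for concept, level in hopfield_activation(toy_weights, ["information retrieval"]):
        print(f"{concept}: {level:.3f}")
```

Unlike a best-first search, every concept is updated in each iteration, so a concept reached by several weak links can end up ranked above one reached by a single strong link.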
User Evaluation
• 3 Subjects, 6 Tasks, 3 Phases
• Phase 1: Identify subject areas
• Phase 2: Find other topics using
spreading activation & manual browsing
• Phase 3: Document evaluation
Findings: Concepts
• Manual browsing achieved higher recall but
lower term precision than the algorithmic
systems.
• Manual browsing was also a much more
laborious and cognitively demanding process.
• When using the algorithms, subjects
reviewed the suggested terms more slowly
and treated them more seriously and carefully
than when performing manual browsing.
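Illustrative note: the slides report recall and term precision without giving formulas. The sketch below assumes the standard set-based definitions (precision is the share of suggested terms that are relevant, recall is the share of relevant terms that were suggested). The function name `precision_recall` and the sample terms are hypothetical.

```python
def precision_recall(suggested, relevant):
    """Return (precision, recall) for a set of suggested terms.

    suggested: terms proposed by a method (algorithm or manual browsing)
    relevant: terms the subject judged relevant to the task
    """
    suggested, relevant = set(suggested), set(relevant)
    hits = suggested & relevant
    precision = len(hits) / len(suggested) if suggested else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

if __name__ == "__main__":
    # Made-up example: 4 suggested terms, 3 relevant terms, 2 in common.
    p, r = precision_recall(
        ["indexing", "thesaurus", "hashing", "sorting"],
        ["indexing", "thesaurus", "concept space"],
    )
    print(f"precision={p:.2f} recall={r:.2f}")
```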
Findings: Documents
• No significant differences (in document
recall and precision) were observed between
the relevant documents suggested by the
algorithms and those generated via the
manual browsing process.
• Each approach could contribute to a larger
set of relevant documents for users.
• The essential differences between the two
approaches were time spent and cognitive effort.
Publications
• Chen, H., Lynch, K. J., Basu, K., and Ng, T. D. “Generating,
Integrating, and Activating Thesauri for Concept-Based
Document Retrieval,” IEEE Expert, Special Series on Artificial
Intelligence in Text-Based Information Systems 8(2):25-34
(1993).
• Chen, H. and Ng, T.D. “An Algorithmic Approach to Concept
Exploration in a Large Knowledge Network (Automatic
Thesaurus Consultation): Symbolic Branch-and-bound Search
vs. Connectionist Hopfield Net Activation,” Journal of the
American Society for Information Science 46(5): 348-369 (1995).
Concept Space Generation
• Automatic Generation of Large-scale
Concept Spaces
• Feasibility and Scalability Issues of
Large-scale Concept Space Generation
– Domain Knowledge
– Computing Resources
Research Question 1
• With regard to computing scalability,
would the technique of computer
generation of concept spaces be
applicable to very large textual
databases?
Research Question 2
• With regard to domain specific
knowledge scalability, would concept
space generation by technology create
satisfactory domain-specific concept
associations from corresponding textual
databases?
Research Question 3
• How does the quality of concept
associations in concept space
generated from very large textual
databases compare with that of a man-made domain-specific thesaurus?
Concept Space Techniques
• Document & Object List Collection
• Object Filtering
• Automatic Indexing
• Co-occurrence Analysis
• Parallel Supercomputing to Laptop Computing
• Large to Small Collections
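Illustrative note: the co-occurrence analysis step listed above can be sketched as follows. The slides do not give the weighting formula, so this sketch assumes a deliberately simplified asymmetric measure: the weight from term j to term k is the number of documents containing both, divided by the number of documents containing j. The function name `cooccurrence_weights` and the sample documents are hypothetical.

```python
from collections import Counter
from itertools import permutations

def cooccurrence_weights(documents, min_weight=0.0):
    """Build a small concept space from indexed documents.

    documents: list of term lists, one per document (output of indexing)
    Returns dict mapping term j -> {term k: weight from j to k}.
    Assumed weight: docs containing both j and k / docs containing j.
    """
    doc_freq = Counter()
    pair_freq = Counter()
    for terms in documents:
        unique = set(terms)
        doc_freq.update(unique)
        # Ordered pairs, so (j, k) and (k, j) are counted separately.
        pair_freq.update(permutations(sorted(unique), 2))

    space = {}
    for (j, k), both in pair_freq.items():
        weight = both / doc_freq[j]
        if weight > min_weight:
            space.setdefault(j, {})[k] = weight
    return space

if __name__ == "__main__":
    # Made-up indexed documents for illustration.
    docs = [
        ["neural network", "hopfield net"],
        ["neural network", "information retrieval"],
        ["neural network", "hopfield net", "information retrieval"],
    ]
    for term, links in sorted(cooccurrence_weights(docs).items()):
        print(term, links)
```

The measure is asymmetric on purpose: a rare term that always appears with a common term points strongly to it, while the common term points back only weakly.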
User Evaluation
• 10 Subjects, 23 Tasks
• Recall & Recognition Phases
• Findings:
– Concept space has higher concept recall
– INSPEC thesaurus has higher concept
precision
– Concept space complements man-made
thesaurus
Publications
• Chen, H., Schatz, B. R., Ng, T. D., Martinez, J., Kirchhoff, A., and Lin, C.
“A Parallel Computing Approach to Creating Engineering Concept
Spaces for Semantic Retrieval: The Illinois Digital Library Initiative
Project,” IEEE Transactions on Pattern Analysis and Machine
Intelligence, Special Section on Digital Libraries: Representation and
Retrieval 18(8): 771-782 (1996).
• Chen, H., Martinez, J., Ng, T. D., and Schatz, B. R. “A Concept Space
Approach to Addressing the Vocabulary Problem in Scientific
Information Retrieval: An Experiment on the Worm Community
System,” Journal of the American Society for Information Science
48(1): 17-31 (1997).
• Houston, A. L., Chen, H., Hubbard, S. M., Schatz, B. R., Ng, T. D.,
Sewell, R. R., and Tolle, K. M. “Medical Data Mining on the Internet:
Research on a Cancer Information System,” Artificial Intelligence
Review 13(5/6): 437-466 (1999).
Corpuses & Applications
• INSPEC, CSQuest
http://ai.bpa.arizona.edu/cgi-bin/mcsquest
• CancerLit, Cancer Space
http://ai20.bpa.arizona.edu/cgi-bin/cancerlit/cn
• Webpages, ET-Space
http://ai.bpa.arizona.edu/cgi-bin/tng/ETSpace
• GeoRef & Petroleum Abstracts, GIS Space
http://ai10.bpa.arizona.edu/gis/
• Law Enforcement, COPLINK Concept Space
• DARPA ITO Project Summary Collection
http://ai6.bpa.arizona.edu/cgi-bin/tng/Psum
• CNN News, http://processc.inf.cs.cmu.edu/tng/inf/
Conclusions
• Context-specific Concept Space
Consultation
• Concept Space As Semantic Exchange
Medium
Lessons Learned
• Both concept space consultation and
generation work
• “Strategic” use of knowledge sources
• Concept Space Technique is scalable
conceptually and computationally
• Insight into potentially retrieved
documents
Future Directions
• Performing Summarization
• Semantic Protocol for Machine Comm.
• Multimedia Concept Association
• Context Analysis with
– Metric Clusters: “distance” information
– Scalar Clusters: neighboring concepts of
two targeting concepts to compute their
similarity
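Illustrative note: the two cluster types named above are only listed as future directions, so the sketch below follows common textbook definitions rather than anything specified in the slides. It assumes a metric association is the sum of inverse word distances between two terms within the same document, and a scalar similarity is the cosine between the two terms' neighbor-weight vectors. The function names and sample data are hypothetical.

```python
import math

def metric_association(documents, term_a, term_b):
    """Sum 1/distance over all position pairs of the two terms per document.

    documents: list of token lists, in reading order
    """
    total = 0.0
    for tokens in documents:
        pos_a = [i for i, tok in enumerate(tokens) if tok == term_a]
        pos_b = [i for i, tok in enumerate(tokens) if tok == term_b]
        total += sum(1.0 / abs(i - j) for i in pos_a for j in pos_b if i != j)
    return total

def scalar_similarity(space, term_a, term_b):
    """Cosine similarity between two terms' neighbor-weight vectors.

    space: dict mapping term -> {neighbor: association weight}
    """
    vec_a = space.get(term_a, {})
    vec_b = space.get(term_b, {})
    dot = sum(w * vec_b.get(n, 0.0) for n, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a term with no neighbors is similar to nothing
    return dot / (norm_a * norm_b)

if __name__ == "__main__":
    # Made-up token sequences and neighbor weights for illustration.
    docs = [["concept", "space", "generation"],
            ["concept", "x", "y", "generation"]]
    print(metric_association(docs, "concept", "generation"))
    neighbors = {"a": {"x": 1.0, "y": 1.0}, "b": {"x": 1.0, "y": 1.0}, "c": {"z": 1.0}}
    print(scalar_similarity(neighbors, "a", "b"), scalar_similarity(neighbors, "a", "c"))
```

Two terms can score high on scalar similarity without ever co-occurring, as long as they share the same neighbors.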