THE INTERSPACE PROTOTYPE An Analysis Environment for
Applications of the Interspace
Analysis for Community Repositories
Bruce Schatz
CANIS Laboratory
Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
www.canis.uiuc.edu
Graduate School of Informatics
Kyoto University, November 28, 2001
THE THIRD WAVE OF NET EVOLUTION
CONCEPTS
OBJECTS
PACKETS
Technological Progress
In the past decade, technology has created
the Genome and the Web
In 1991, these ideas were only plans
In 2001, they have already progressed
from research systems to commercial products
In the next decade, the revolution will actually
begin and the world will be completely different!
Paradigm Shift (Pre)
Towards Dry-Lab Biology, Walter Gilbert (Jan 1991)
“The new paradigm, now emerging, is that all the 'genes' will be known
(in the sense of being resident in databases available electronically), and
that the starting point of a biological investigation will be theoretical.
An individual scientist will begin with a theoretical conjecture, only then
turning to experiment to follow or test that hypothesis. ...
To use this flood of knowledge [the total sequence of the human and
model organisms], which will pour across the computer networks of the
world, biologists not only must become computer-literate, but also
change their approach to the problem of understanding life. ...”
The Coming of Informational Science
Correlation of Information across Sources
Paradigm Shift (Post)
Dissecting Human Disease, Victor McKusick (Feb 2001)
Structural genomics -> Functional genomics
Genomics -> Proteomics
Map-based gene discovery -> Sequence-based gene discovery
Monogenic disorders -> Multifactorial disorders
Specific DNA diagnosis -> Monitoring susceptibility
Analysis of one gene -> Analysis of multi-gene pathways
Gene action -> Gene regulation
Etiology (mutation) -> Pathogenesis (mechanism)
One species -> Several species
Analysis Environments I
The Present -- Year 2001
Search Central Archives
Locating a Generic (average) solution
mining sequences from the Genome
diagnosing diseases from the Clinical Trial
some Problems may have point Solutions
find the cystic fibrosis gene
find the diabetes treatment
Analysis Environments II
The Future -- Year 2011
Navigate Distributed Repositories
Locating a Specific (situational) solution
correlating sequences, genes, expressions
correlating diagnoses, treatments, lifestyles
most Problems have cluster Solutions
find genes for Heart Disease
find treatments for Arthritis
Testbeds of the Future
WCS -- a testbed for the world of 2001
community repositories before the Web
in 1991, a distributed analysis environment
MCS -- a testbed for the world of 2011
concept navigation before the Interspace
in 2001, a biomedical analysis environment
to enable Michigan Corridor faculty and students
to live in the world of the future (information space)
Community Systems
data (database management)
results (electronic mail)
knowledge (hypertext annotations)
literature (information retrieval): Formal
news (bulletin boards): Informal
browse and share all the knowledge of a community
Worm Community System
WCS Information:
Literature: BIOSIS, MEDLINE, newsletters, meetings
Data: Genes, Maps, Sequences, strains, cells
WCS Functionality
Browsing: search, navigation
Filtering: selection, analysis
Sharing: linking, publishing
WCS: 250 users at 50 labs across Internet (1991)
WCS Molecular
WCS Cellular
WCS Publishing
WCS Linking
WCS invokes gm
WCS vis-à-vis acedb
A Model Community
1984-1988 Telesophy (Bellcore)
1989-1994 WCS (Arizona)
testbed in molecular biology
National Model for Biomedical Informatics
prototype to federate objects
NAS National Collaboratories report
NIH Human Brain project
Translational Results
NCSA Mosaic into Web browsers
acedb (worm) into Genome databases
Biology Workbench, 10K users across Web
Towards A Model Discipline
1995-1999 Interspace (Illinois, Urbana)
2000-2004 MEDSPACE (Illinois, Chicago)
testbed in clinical medicine (plan, demo)
National Model for Biomedical Informatics
prototype to federate concepts
lead news in Science on MEDLINE dry-run
Best Paper at AMIA (Medical Informatics)
2001-2005 MCS (Michigan)
testbed in biomedical research
Michigan Interspace
Gather the Information Sources
Generate the Community Repositories
Michigan Corridor System (MCS)
each (department, institute, lab) has repository
text documents with articles and annotations
specialty datatypes: databases and motifs
Construct the Analysis Environment
federated concept navigation across repositories
type-dependent parsing for text/data interlinks
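One way to picture federated concept navigation across repositories is the toy sketch below. It is not the MCS or Interspace implementation: it assumes each repository can be summarized by how often pairs of concepts occur in the same document, and that navigation means ranking related concepts by counts summed over all repositories. The repository names, the sample documents, and the functions `build_concept_space` and `federated_related` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_space(documents):
    """Count how often each pair of concepts appears in the same document."""
    space = defaultdict(lambda: defaultdict(int))
    for concepts in documents:
        for a, b in combinations(sorted(set(concepts)), 2):
            space[a][b] += 1
            space[b][a] += 1
    return space


def federated_related(concept, repositories, top_n=3):
    """Rank concepts related to `concept`, summing counts over all repositories."""
    totals = defaultdict(int)
    for space in repositories.values():
        for other, count in space.get(concept, {}).items():
            totals[other] += count
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical toy repositories: each document is a list of concept terms.
repositories = {
    "genetics_lab": build_concept_space([
        ["heart disease", "gene expression", "cholesterol"],
        ["heart disease", "gene expression"],
    ]),
    "clinic": build_concept_space([
        ["heart disease", "cholesterol", "diet"],
        ["arthritis", "diet"],
    ]),
}

related = federated_related("heart disease", repositories)
print(related)
```

In this sketch a concept such as "diet" surfaces only because the clinic repository contributes it, which is the point of navigating across repositories rather than searching a single one.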
MCS Sources
Literature
Journals: MEDLINE, BIOSIS, full-text
Specialty Conferences (e.g. Neuroscience)
Community Newsletters, Lab Annotations
Databases
Sequences: GENBANK, Celera
Genes and Maps from Model Organisms
Microarray Expressions, Protein Structures
Gene Pathways, Cellular Anatomy
Ten Steps from Here to There
Determine Users (range of needs)
Develop Hardware (networks)
Determine Collections (range of types)
Develop Software (databases)
Interlinks Automatic (name recognition)
Interlinks Manual (distributed annotation)
Community Literature (journals, conferences)
Concept Navigation (indexing, switching)
Custom Databases (community datasets)
Custom Software (specialized analysis)
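The automatic interlinks step (name recognition) can be illustrated with a minimal sketch. It assumes that name recognition means scanning text for names already present in a community database and recording where they occur, so that each occurrence can become a link to the database entry. The function `find_interlinks` and the sample names and sentence are hypothetical, chosen only for illustration.

```python
import re


def find_interlinks(text, known_names):
    """Return (name, start, end) for each known database name found in text."""
    # Try longer names first so a short name never shadows a longer one.
    alternatives = sorted(known_names, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w-])("
        + "|".join(re.escape(name) for name in alternatives)
        + r")(?![\w-])"
    )
    return [(m.group(1), m.start(), m.end()) for m in pattern.finditer(text)]


# Hypothetical gene names standing in for entries in a community database.
gene_names = {"unc-22", "unc-2", "lin-12"}
sentence = "Mutations in unc-22 and lin-12 were mapped; unc-2 was not."
links = find_interlinks(sentence, gene_names)
for name, start, end in links:
    print(name, start, end)
```

The lookbehind and lookahead keep a name from matching inside a longer token. Anything a dictionary scan like this misses would be left to the manual interlinks step (distributed annotation).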
THE NET OF THE 21st CENTURY
Beyond Objects to Concepts
Beyond Search to Analysis
Problem Solving via Cross-Correlating
Multimedia Information across the Net
Every community has its own special library
Every community does semantic indexing
The Interspace is true Cyberspace
Healthcare Infrastructure
For Amateurs rather than Professionals
Generate the Personal Repositories
interaction with health status questionnaires
builds a customized dynamic database
Construct the Analysis Environment
packaged Interspace with inferred navigation
Internet Health Monitors for ordinary persons
similarity matching to locate similar patients
Evolve Community Interspace
statistical clustering for lifestyle coaching
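Similarity matching to locate similar patients could look roughly like the sketch below. It assumes each person's questionnaire answers are reduced to a numeric vector and compared by cosine similarity. The slides do not name a specific measure, so the measure, the scoring scale, and the patient records here are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, patients, top_n=2):
    """Return the ids of the `top_n` patients most similar to `query`."""
    scored = [(cosine_similarity(query, vec), pid) for pid, vec in patients.items()]
    # Highest similarity first; ties broken by patient id.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored[:top_n]]


# Hypothetical questionnaire answers scored 0-5: [pain, fatigue, activity, sleep]
patients = {
    "p1": [5, 4, 1, 2],
    "p2": [1, 1, 5, 4],
    "p3": [4, 5, 2, 2],
}
query = [5, 5, 1, 2]
matches = most_similar(query, patients)
print(matches)
```

The same vectors could feed the statistical clustering step, grouping people with similar profiles instead of ranking them against a single query.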