THE INTERSPACE PROTOTYPE An Analysis Environment for


Applications of the Interspace
Analysis for Community Repositories
Bruce Schatz
CANIS Laboratory
Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
www.canis.uiuc.edu
Graduate School of Informatics
Kyoto University, November 28, 2001
THE THIRD WAVE OF NET EVOLUTION
CONCEPTS
OBJECTS
PACKETS
Technological Progress
In the past decade, technology has created
the Genome and the Web
In 1991, these ideas were only plans
In 2001, they have already progressed
from research systems to commercial products
In the next decade, the revolution will actually
begin and the world will be completely different!
Paradigm Shift (Pre)
Towards Dry-Lab Biology, Walter Gilbert (Jan 1991)



“The new paradigm, now emerging, is that all the 'genes' will be known
(in the sense of being resident in databases available electronically), and
that the starting point of a biological investigation will be theoretical.
An individual scientist will begin with a theoretical conjecture, only then
turning to experiment to follow or test that hypothesis. ...
To use this flood of knowledge [the total sequence of the human and
model organisms], which will pour across the computer networks of the
world, biologists not only must become computer-literate, but also
change their approach to the problem of understanding life. ...”
The Coming of Informational Science
Correlation of Information across Sources
Paradigm Shift (Post)
Dissecting Human Disease, Victor McKusick (Feb 2001)

Structural genomics        ->  Functional genomics
Genomics                   ->  Proteomics
Map-based gene discovery   ->  Sequence-based gene discovery
Monogenic disorders        ->  Multifactorial disorders
Specific DNA diagnosis     ->  Monitoring susceptibility
Analysis of one gene       ->  Analysis of multi-gene pathways
Gene action                ->  Gene regulation
Etiology (mutation)        ->  Pathogenesis (mechanism)
One species                ->  Several species
Analysis Environments I
The Present -- Year 2001

Search Central Archives

Locating a Generic (average) solution



mining sequences from the Genome
diagnosing diseases from the Clinical Trial
some Problems may have point Solutions


find the cystic fibrosis gene
find the diabetes treatment
Analysis Environments II
The Future -- Year 2011

Navigate Distributed Repositories

Locating a Specific (situational) solution



correlating sequences, genes, expressions
correlating diagnoses, treatments, lifestyles
most Problems have cluster Solutions


find genes for Heart Disease
find treatments for Arthritis
Testbeds of the Future


WCS -- a testbed for the world of 2001

community repositories before the Web

in 1991, a distributed analysis environment
MCS -- a testbed for the world of 2011

concept navigation before the Interspace

in 2001, a biomedical analysis environment
to enable Michigan Corridor faculty and students
to live in the world of the future (information space)
Community Systems
(diagram: types of community information, on an axis from Formal to Informal)
literature (information retrieval)
data (database management)
knowledge (hypertext annotations)
results (electronic mail)
news (bulletin boards)
browse and share all the knowledge of a community
Worm Community System
WCS Information:
Literature   BIOSIS, MEDLINE, newsletters, meetings
Data         Genes, Maps, Sequences, strains, cells

WCS Functionality
Browsing     search, navigation
Filtering    selection, analysis
Sharing      linking, publishing


WCS: 250 users at 50 labs across Internet (1991)
WCS Molecular
WCS Cellular
WCS Publishing
WCS Linking
WCS invokes gm
WCS vis-à-vis acedb
A Model Community

1984-1988 Telesophy (Bellcore)


1989-1994 WCS (Arizona)


testbed in molecular biology
National Model for Biomedical Informatics



prototype to federate objects
NAS National Collaboratories report
NIH Human Brain project
Translational Results



NCSA Mosaic into Web browsers
acedb (worm) into Genome databases
Biology Workbench, 10K users across Web
Towards A Model Discipline

1995-1999 Interspace (Illinois, Urbana)


2000-2004 MEDSPACE (Illinois, Chicago)


testbed in clinical medicine (plan, demo)
National Model for Biomedical Informatics



prototype to federate concepts
lead news in Science on MEDLINE dry-run
Best Paper at AMIA (Medical Informatics)
2001-2005 MCS (Michigan)

testbed in biomedical research
Michigan Interspace

Gather the Information Sources



Generate the Community Repositories



Michigan Corridor System (MCS)
each (department, institute, lab) has repository
text documents with articles and annotations
specialty datatypes: databases and motifs
Construct the Analysis Environment


federated concept navigation across repositories
type-dependent parsing for text/data interlinks
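
The text/data interlinks named above can be illustrated with a minimal sketch. It is not the MCS implementation: the record table, the record identifiers, and the naming pattern (worm-style names such as unc-22) are assumptions made for illustration. Parsing is chosen by document type, and each recognized name that has a database record becomes a link.

```python
import re

# Hypothetical records standing in for a community gene database.
GENE_RECORDS = {
    "unc-22": "gene:unc-22",
    "lin-12": "gene:lin-12",
}

# Assumed naming pattern: three or four lowercase letters, a hyphen, a number.
GENE_NAME = re.compile(r"\b[a-z]{3,4}-\d+\b")


def parse_text(content):
    """Candidate names found in free text, with their character offsets."""
    return [(m.group(0), m.start()) for m in GENE_NAME.finditer(content)]


# Type-dependent parsing: one parser per document type (only text is sketched).
PARSERS = {"text": parse_text}


def find_interlinks(doc_type, content, records=GENE_RECORDS):
    """Return (name, offset, record_id) for every recognized name with a record."""
    parser = PARSERS[doc_type]
    return [
        (name, offset, records[name])
        for name, offset in parser(content)
        if name in records
    ]


if __name__ == "__main__":
    sentence = "Mutations in unc-22 and lin-12 affect abc-1."
    for link in find_interlinks("text", sentence):
        print(link)
```

Names that match the pattern but have no record (abc-1 in the example) are skipped here; a real system might instead queue them for manual annotation.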
MCS Sources

Literature




Journals: MEDLINE, BIOSIS, full-text
Specialty Conferences (e.g. Neuroscience)
Community Newsletters, Lab Annotations
Databases




Sequences: GENBANK, Celera
Genes and Maps from Model Organisms
Microarray Expressions, Protein Structures
Gene Pathways, Cellular Anatomy
Ten Steps from Here to There

Determine Users        (range of needs)
Develop Hardware       (networks)
Determine Collections  (range of types)
Develop Software       (databases)
Interlinks Automatic   (name recognition)
Interlinks Manual      (distributed annotation)
Community Literature   (journals, conferences)
Concept Navigation     (indexing, switching)
Custom Databases       (community datasets)
Custom Software        (specialized analysis)
THE NET OF THE 21st CENTURY



Beyond Objects to Concepts
Beyond Search to Analysis
Problem Solving via Cross-Correlating
Multimedia Information across the Net

Every community has its own special library
Every community does semantic indexing

The Interspace is true Cyberspace

Healthcare Infrastructure

For Amateurs rather than Professionals



Generate the Personal Repositories



interact with health status questionnaires
builds a customized dynamic database
Construct the Analysis Environment


packaged Interspace with inferred navigation
Internet Health Monitors for ordinary persons
similarity matching to locate similar patients
Evolve Community Interspace

statistical clustering for lifestyle coaching
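
The similarity matching mentioned above can be sketched as follows. The slides do not say which measure is used, so cosine similarity is an assumption here, as are the patient identifiers and the encoding of each health status questionnaire as a numeric vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero questionnaire matches nothing
    return dot / (norm_a * norm_b)


def most_similar(query, patients, k=1):
    """Return the ids of the k patients whose vectors best match the query."""
    ranked = sorted(
        patients.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [patient_id for patient_id, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical questionnaire answers encoded as numbers.
    patients = {
        "p1": [1, 0, 0],
        "p2": [0, 1, 0],
        "p3": [1, 1, 0],
    }
    print(most_similar([2, 0, 0], patients, k=2))
```

Cosine similarity ignores overall magnitude, so a query of [2, 0, 0] matches p1 exactly. Whether that is desirable depends on how questionnaire answers are scaled.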