The Elucidation of Regulatory Networks in Complex Biological Systems:
The Convergence of Biology, Medicine and Computing
G. Poste
Stanford University, 15 March 2002
The Analysis and Application of Principles
of Biological Design
biology
1750-1980
1980-2010
• the encoded information content of biological systems
biology
chemistry
• the descriptive narrative
• empirical technology
genomics computing
• mechanistic reductionism
• mapping the basis of biological variation
• rational medicine and customized care
systems biology
Biology and Medicine as Information-Based Sciences
From Reductionism to Integrated Systems Biology
• individual genes and proteins → biological circuits, pathways and networks
• molecular interactions in simple systems → assembly of higher order systems
• limited, fragmented datasets → massive, integrated datasets
• poor annotation → stringent, standardised annotation
• limited capacity for predictive simulation → robust algorithms for predictive biology; biology in silico
• analog information → digital information
21st Century Biology and Medicine
“SYSTEMS BIOLOGY”
• the design principles of biological order and complexity
• mapping the information content of biopathways and networks
Biotechnology and Systems Biology
New Analytical Capabilities
Large Scale Computing
“BIG BIOLOGY”
• interdisciplinary, massive datasets, information-based
• infrastructure, investment and education
Convergence :
The Technological Platforms Shaping
the Evolution of Healthcare
Rule-Based Design Principles
Computational Biology
Biotechnology and Systems Biology
New Analytical Capabilities
Exploring “Biospace”
Large Scale Computing
Automation Engineering and Robotics
Materials Science
Micro-/Opto-Electronics
From Reductionism to Integrated Systems Biology
understanding the information content encoded in
biological networks
mapping the design rules for progressively greater
complexity of biological order
gene(s)
pathways, circuits and networks
progressively ordered assemblies: organelles, cells, tissues, organs
homeostatic integration of myriad, complex, interactive networks
(Physiology)
High Level Abstraction of Biological Pathways and
Network Systems
Encoded Information
Pathways and Networks
Rule Sets
Plasticity
• adaptive fitness
• pathological perturbation
Predictive Biology
• directed evolution
• biology in silico
Novel Biospace
and
Carbon : Silicon Union
Global and Nodal Pathway Map of Genomic and
Proteomic Elements in Yeast Galactose Utilization
From: T. Ideker et al. 2001. Science 292, 929
Genetic Networks
bioinformation processing involves leverage of
interactive feedback loops in diverse domains
- physical, chemical, electrical
genomic and proteomic codes represent a
dense network of nested hyperlinks
matter becomes code
Nonlinear Complexity in Biological Systems
distinct classes of nonlinear interactions
long-range (fractal) correlations
self-similarity, self-dissimilarity and self-organized
criticality
pattern formation
complex adaptive networks
highly optimized tolerance = robustness with
fragility
barriers to cascading failures
deterministic chaos
emergent properties
Nonlinear Complexity in Biological Systems
abrupt changes
- bifurcations; intermittency/bursting;
bistability/multistability; phase transitions
nonlinear oscillations
- limit cycles; phase-resetting; entrainment
nonlinear waves
- spirals; scrolls; solitons
complex periodic cycles and quasiperiodicities
scale invariance
- fractal and multifractal scaling; long-range
correlations; self-organized criticality
stochastic resonance and related noise-modulated
mechanisms
time irreversibility
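As a minimal illustration of the bifurcations and deterministic chaos listed above, the sketch below iterates the logistic map, a standard textbook model that is not taken from these slides. The function names, the starting value and the chosen values of r are illustrative assumptions.

```python
def logistic_orbit(r, x0=0.2, burn_in=1000, keep=64):
    """Iterate x -> r*x*(1-x), discard a transient, and return the next `keep` values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


def distinct_states(orbit, tol=1e-6):
    """Count the values in the orbit that differ from each other by more than `tol`."""
    states = []
    for x in orbit:
        if all(abs(x - s) > tol for s in states):
            states.append(x)
    return len(states)


if __name__ == "__main__":
    # Hypothetical parameter values chosen to straddle successive bifurcations.
    for r in (2.8, 3.2, 3.5, 3.9):
        print(r, distinct_states(logistic_orbit(r)))
```

For r = 2.8, 3.2 and 3.5 the orbit settles onto 1, 2 and 4 values respectively (successive period doublings); at r = 3.9 it does not settle onto a short cycle, even though the rule is fully deterministic.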
Information
and
Technology Platform Overload
Principal Themes in the
Analysis of Biological Systems
large scale
miniaturization
automation
parallelism
networked systems
real time, interactive, adaptive
Major Technology Gaps
rapid gene ID in complex genomes
structural genomics and protein structure-function
prediction
mapping the proteome
- abundance, modification, localisation and protein-protein interactions
- large scale parallelism (protein-arrays)
- small organic molecule networks
mapping the metabolome
- circuits, modules, networks
robust predictive algorithms for ADMET profiling of
drug candidate SAR
The Need for Standards and Stringent Semantics
“... without which …..
wanton and luxuriant fancies climbing up into the Bed of Reason,
do not only defile it by unchaste and illegitimate embraces,
but instead of real conceptions and notices of things
do impregnate the mind with nothing
but Ayerie and Subventaneous Phantasmes”
Samuel Parker, FRS 1666
standards
standards
STANDARDS
The Analysis and Comprehension
of Biological Systems
descriptive ignorance
initial mechanistic insights
complexity
• elucidation of patterns
• defining rule sets
defined rule sets
• disease heterogeneity
• patient heterogeneity
• disease predisposition
burgeoning, bewildering complexity
• elegant simplicity revealed
• predictive biology
• right Rx : right disease
• right Rx : right patient
• from reactive treatment to proactive prevention
molecular phylogenies and genealogy
chemical SAR
Integrated Distributed Heterogeneous Databases and Databanks
biological order
population genetics
clinical databanks
data warehousing and data mining
evolving hardware and electronic evolution
object-oriented and pattern / spatial array recognition
Expert Systems and Knowledge Management
human-computer interface systems
Convergence, Consilience, Cognition
and Computing
• more science
• better science
• faster science
• cross-disciplinary science
• interdisciplinary convergence
• technological convergence
• corporate convergence
MEGADATA
Volume
The Scalability Crisis
• burgeoning data volumes
• more transactions
• increasing diversity of datasets/apps
• expanding user communities
• pressures on network bandwidth
• complexity of distributed environments
• rising performance expectations
• confidentiality and privacy
Performance
Major Challenges for Life Sciences Computing
exponentially growing data repositories
(10² TB/PB)
highly variable data formats and standards as
obstacles to data access and mining
inadequate attention to data Q.C./annotation
standards
excessive reliance on customized solutions and
fragmented data sources
inadequate access and integration of public and
private datasets
primitive data visualization tools
80% time spent on data preparation tasks and
20% on productive exploration
Major Challenges for Life Sciences Computing
Big Biology
infrastructure scale and capital investment
new tools for mining, visualization, simulation
data storage conventions and technologies
dynamic, adaptive, scalable systems
active networks
- software into the network
- subnet interoperability
- integration of distributed and collaborative working
environments
fast data access at all levels
- storage, I/O and networks to support analysis and
simulation
expanded bandwidth for high usage and high transfer rates
Bracing For the Inevitable : Petabyte-Size
Databases
1000 terabytes
250 billion text pages
20 million four drawer filing cabinets
2000 mile high tower of 1 billion diskettes
typical US consumer generates 100 Gbytes
personal data/lifetime
- education, insurance, credit, medical
100 million consumers = 10,000 petabytes
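The figures on this slide can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12, 1 PB = 10^15), which the slide does not state; the variable names are illustrative.

```python
# Decimal (SI) byte units. This is an assumption: the slide does not say
# which convention it uses.
GB = 10**9
TB = 10**12
PB = 10**15

# "1000 terabytes" per petabyte
terabytes_per_petabyte = PB // TB

# "250 billion text pages" per petabyte implies this many bytes per page
pages_per_petabyte = 250 * 10**9
bytes_per_page = PB // pages_per_petabyte

# "20 million four drawer filing cabinets" implies this many pages per cabinet
cabinets_per_petabyte = 20 * 10**6
pages_per_cabinet = pages_per_petabyte // cabinets_per_petabyte

# 100 GB of personal data per consumer lifetime, 100 million consumers
total_bytes = (100 * GB) * (100 * 10**6)
total_petabytes = total_bytes // PB

print(terabytes_per_petabyte, bytes_per_page, pages_per_cabinet, total_petabytes)
```

Under these assumptions the slide's totals are consistent: 100 million consumers at 100 GB each come to 10,000 petabytes, and a petabyte of text corresponds to about 4,000 bytes per page.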
Data Grids
from Napster and Gnutella
to
ubiquitous peer-to-peer exchange of data sets
to
apportioned distributed computing for
solutions of computationally massive problems
Informatics for Big Biology and
e.Health Networks
• instructive precedents in high end computing from
other disciplines
- cosmology, quantum chromodynamics,
climate research, materials
USA
• Scientific Simulation Initiative
• National Computational Science Alliance
• Long Term Ecological Research
• NASA, DOE, NOAA
• Accelerated Strategic Computing Initiative
• Grid Physics Network
Europe
• UNICORE
• Pangea
• E-Science
• LHC Challenge
• E-Grid
The Bibliome
Proof, logic and ontology languages
• shared terms/terminology
• machine-machine communication
• inter-memetic translation
• self-evolving translators
• Resource Description Framework
• eXtensible Markup Language
• Metadata tagging standards for interoperable distributed archives
• self-assembling datasets
• self-describing documents
• HyperText Markup Language
• HyperText Transfer Protocol
• The first generation Web
The Global Virtual Archive / Universal Knowledge Web
Modified from: T. Berners-Lee and J. Hendler, Nature 2001, 410, 1023
Metadata
WWW
I
Standardized Lexical Foundations for the
Annotation, Archiving and Analysis of Complex
Biological Systems
unique complexity of biological systems
multiple levels of abstraction
- organismal
- ecosystem dynamics
- social/memetic networks
qualitative not quantitative data
- diversity of experimental conditions
- inaccessibility/replication of experimental
conditions
upgrading to hybrid qualitative/quantitative
analysis tools
Standardized Lexical Foundations for the
Annotation, Archiving and Analysis of Complex
Biological Systems
entity classes : finite elements
action properties : state properties
intramolecular site interactions
intermolecular site interactions
massively parallel networks : unit modules
continuum systems
compartments
economy and parsimony
evolutionary relationships
network pathways
- redundancy (degeneracy), pleiotropy
- complex emergent properties
submodels for searchable characteristics of functional knowledge
integration of submodels into web-based distributed model networks
Jabberwocky
“ ’Twas brillig and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves
And the mome raths outgrabe”
Lewis Carroll
The Divide Between Syntax and Semantics
“Colorless green ideas sleep furiously”
Noam Chomsky (1957)
syntactically valid
semantically void
encoded genome structure (syntax) and diverse
expression repertoires (semantics)
- alternative splicing
- overlapping reading frames
- nonsense mutations
- differential modulation by different transcription
factors
database formats (syntax) and ontology
(semantics)
The Conceptual Complexity of Ontology Design
ontology
- set of axioms in a logical language
- representational vocabulary with precise
definitions of shared understanding
- axioms constrain interpretation of defined terms
XML versus ontology and evolution of the semantic
web
- XML less complex since semantics are not
represented
- objective to reduce uncertainty favors ontologies
- objective to reduce complexity favors XML
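The contrast between XML (syntax only) and an ontology (axioms that constrain interpretation) can be sketched in a few lines. The example below is an illustrative assumption and not part of the original slides: the element and attribute names, the two-entry vocabulary and the single "encodes" axiom are all hypothetical. It uses only Python's standard xml.etree.ElementTree parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: both assertions are well-formed XML, but the second
# one reverses the roles of gene and protein.
RECORD = """
<assertions>
  <assertion subject="GAL4" relation="encodes" object="Gal4p"/>
  <assertion subject="Gal4p" relation="encodes" object="GAL4"/>
</assertions>
"""

# A toy "ontology": entity classes plus one axiom giving the allowed
# (subject class, object class) for each relation. Names are illustrative.
ENTITY_CLASS = {"GAL4": "gene", "Gal4p": "protein"}
AXIOMS = {"encodes": ("gene", "protein")}


def parse_assertions(xml_text):
    """Syntax level: return (subject, relation, object) triples from the XML."""
    root = ET.fromstring(xml_text)
    return [
        (a.get("subject"), a.get("relation"), a.get("object"))
        for a in root.iter("assertion")
    ]


def violates_axioms(triple):
    """Semantic level: True if the triple breaks the axiom for its relation."""
    subject, relation, obj = triple
    domain, range_ = AXIOMS[relation]
    return ENTITY_CLASS.get(subject) != domain or ENTITY_CLASS.get(obj) != range_


if __name__ == "__main__":
    for triple in parse_assertions(RECORD):
        status = "rejected" if violates_axioms(triple) else "accepted"
        print(triple, status)
```

The parser accepts both assertions because XML carries no semantics; only the axiom check rejects the second one. The extra vocabulary and rule set is the added complexity that buys the reduction in uncertainty.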
Convergence, Consilience, Cognition
and Computing
scientific, technological and economic
convergence
data
complexity
optimized
data
representation
data
scale
data
diversity
optimized
data
comprehension
optimized
data
utilization
• adaptive IT
• novel emergent computing networks
• novel visualization and mining tools
• human medicine interfaces
• ‘mind in the loop’
• modulation of brain function for optimum perceptualization
Bounded Rationality
human mind’s processing capacity is small relative
to the size of the problems requiring
analysis/comprehension (Simon)
objective solutions require complexity reduction in
information, task and coordination
complexity reduction
- omission and abstraction
- division of labor (systems decomposition)
complexity reduction simultaneously increases
uncertainty (Fox)
implications for evolution of ontologies for the
semantic web
Enhancing Human Cognitive Capacities
for Optimizing Information Utilization
escalating quantities and types of information
real time decision making
new multi-modal, multi-sensory high performance
human : information interfaces
representation and comprehensibility of
information flows
- optimize information representation (perception)
- modulation of brain function to optimize
comprehension
systemic application of advances in cognitive
neurobiology
Enhancing Human Cognitive Capacities
for Optimizing Information Utilization
optimizing representations of information
- perceptualization
optimizing cognitive capacities
- states of the brain affect states of mind
(perception and cognition)
- perceptual modulation techniques
Interdisciplinary Linguistics :
Memetic Engineering
molspeak, medspeak, nerdspeak
standardization coding
speech recognition
object-oriented computing
synthetic intelligence
Molecular Medicine,
Population Segmentation
and
Targeted Patient Care
Population Genetics
large-scale population genetics
geno-phenotype correlations in subpopulations
‘at-risk’ subpopulations
individual risk profiling
Linking Clinical Outcomes to Genetic Variation
population genetics
haplotype blocks
SNP maps
low cost high-throughput genotyping
dbases
informatics
gene-disease associations
ethics
Large-Scale Disease Association Genetics
and Disease Predisposition Risk Profiling
formidable logistics and cost
robust algorithms for
combinatorial gene interactions
slow evolution
complex ethical, legal and social issues
public acceptance and legislative controls
evidentiary standards and regulation
Legislative and Regulatory Considerations in the
Creation and Management of Large Scale
Population Health Data Networks
consent
identifiable (clinical) versus anonymous
(research) data
authentication of communicating parties
compliance
- HIPAA (USA)
- EU Data Directive
- individual nation/US State requirements
- ICH5 Common Technical Document
e.health
Content
Care
Population Databanks and the Rise of
Molecular Medicine
individual / family records
diabetes
CVD
CPD
renal
privacy and confidentiality
gene-disease correlations
stroke
CNS
gene-outcome correlations
gene-disease predisposition
associations
infection
cancer
individual (targeted) care
- optimum Tx
- predisposition and
proactive risk management
Who Knows Wins!
Health Databanks
population
dBase
individual
record
and
risk
profile
Population Segmentation
and Individual Patient
Profiling
Physician
Desk-Top
Network
clinical
pharmacy
lab data
outcomes
Shaping Physician Behaviour
decision support / control
Dx/PDx
Rx, PRx
clinical guidelines
education
e.Pharmacy
e.Home
Health
Rx validation
utilization
compliance
AE avoidance
wellness education
compliance
risk mitigation
remote monitoring
Shaping Consumer / Patient Behaviour
“The average person will have three to five
internet devices on their body by the end
of 2010…..
not just the mobile phone,
but health monitors,
maybe even an implanted device,
a GPS type of system, etc………..”
John Chambers
Cisco Systems
dot.CEO January 2001, p. 53
Consumer Health Information
Systems and Services
in-home to physician / pharmacy links
next generation tele-medicine and
personal health monitoring
compliance monitoring
independent living
emergency management
integration of new imaging /
diagnostic sensor systems
Biology and Medicine as Information-Based Disciplines
Cyber-Medicine
on-body / in-body / in-home remote devices
for health status / compliance monitoring
interactive computational software and
Rx of behavioral disorders
ubiquitous physician decision-support software
to optimize clinical care and compliance
The Evolution of Large-Scale Biology
genome sequencing
comparative genomics
proteomics
functional genomics
structural genomics
genetic circuits
biological order
complex systems
SNPs and gene-disease
association studies
large-scale population
and statistical genetics
robust geno-phenotype
correlations
individual genotyping
and disease risk profiling
INFORMATICS
Biology and Medicine as Information-Based Disciplines
Research
understanding the encoded instructions
for biological design
- genes → proteins → higher order assemblies
- abnormal information coding in disease
Clinical Medicine
assembly of large-scale population databases
- gene-disease correlations
- gene-Rx outcome correlations
- individual genotyping and disease
predisposition risk profiling
Systems Analysis
Biology as an Informational Science
new technological platforms
- automation, miniaturization, high-throughput
- parallelism
new computational tools
- scale, diversity of content
- mining algorithms
new organizational linkages
- convergence of biology and computing (science)
- health / telco / compco (technology)
Systems Analysis
Biology as an Informational Science
new skills
- graduate / post-graduate curricula
- clinical training
new organizational structures
- inter-disciplinary
new policies
- grant agencies
- national / international science
- regulation, legislation
Computational Biology
Grand Challenges
predictive simulation of gene regulation and
genetic networks
- from genotype to phenotype
fast algorithms for molecular simulations
modeling of molecular interactions, chemical
dynamics, transport and compartmentalization in
cells
metabolic and physiological simulations
scalar modeling
- molecules to cells to tissues to organs to
organisms to populations
predictive tools for pre-emptive stabilization of
system dysregulation
From Bioinformatics to Computational Biology
Bioinformatics : The Phenomenological Era
• ID and classification of statistical regulation among the most
recurrent objects
• optimum database design
• fast classification/clustering algorithms
• data mining software and ontological relationships
Computational Biology : The Theoretical Era
• elucidation of robust design rules
• higher order multistate detector and component interactions
• contextual recognition
• pathways, circuits, networks and higher order assemblies
• predictive biology
biology and medicine are in transition to become
information-based sciences
this transition will shift R&D focus from the current
reductionist framework to the analysis of biological
complexity (systems biology)
these transitions will demand adoption of large
scale analyses (big biology) and obligate adoption
of more stringent standardization
- data QC, annotation, curation
- dBase formats and clinical profiling tools
- massive computational capacity and dynamic,
scalable networks
- distributed computing and collaborative
networks
- from bioinformatics to ‘rules-based’
computational biology and cybermedicine