Disease Informatics: Terms and
Jargon to begin with
R. P. Deolankar
General Terms
Data and Information
Data
• Numbers
• Words
• Images
• Information is derived from the data
Information
• It is the knowledge derived from analysis of the data
• Inferences can be drawn from information
• The inferences drawn from earlier work provide the basis for
projected work
Target information and Information gap
Target information
• Information which is required but not
available
• The information goal intended to be attained
Information gap
• Total information required to hit the
information target minus available
information
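Read literally, the information gap is a set difference between what is required and what is available. A minimal Python sketch follows; the function name and the example items are assumptions chosen for illustration, not taken from the source.

```python
# Illustrative only: the item names below are hypothetical.
def information_gap(required, available):
    """Return the required items of information that are not yet available."""
    return set(required) - set(available)


required = {"causal agent", "route of transmission", "host susceptibility factors"}
available = {"causal agent"}

gap = information_gap(required, available)
print(sorted(gap))
```

Each item left in the gap is a candidate for a research question.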
Research question and Hypothesis
Research question
• This is the question that, if answered, could eliminate the information
gap
• The cycle of setting the information target, locating the information
gap and raising new research questions is part of the research
process
Hypothesis
• This is a tentative answer to the research question
• The hypothesis is tested by performing the experiment
• After testing, the hypothesis is either accepted or rejected
Postulation
• A hypothesis that cannot be tested and is hence taken for granted
• A statement taken as the basis of a theory
(Disease phenomenon is the result of several causes, not just one)
Multiple hypotheses
• More effective way of organizing research
• Provides stimulus for study and fact-finding
• See the interaction of the several causes
• Promotes much greater thoroughness
• Leads to lines of inquiry that we might otherwise overlook
• Avoids the pitfall of accepting weak or flawed evidence for
one hypothesis when another provides a more elegant
solution
Precautions
• Keeping a written list of multiple hypotheses is necessary
• Difficult to test
• Vacillation is preferable to the premature rush to a false conclusion
Thomas Chrowder Chamberlin
Author of Method of Multiple Working Hypotheses
What is ontology?
• Incomplete information gives rise to
speculation
• Hierarchical structuring of speculations about
things within a particular domain is ontology
• Ontology is the statement of a logical theory
Disease Ontology
• Controlled Medical Vocabulary
• Facilitates mapping of diseases and associated
conditions to codes such as ICD, SNOMED and
others
• Disease Ontology (DO) is developed at the
Bioinformatics Core Facility in collaboration
with the NUgene Project at the Center for
Genetic Medicine, Northwestern University, USA
Clinical event
• Clinical: related to health or disease
• Event: something that happens at a given place
and time
• Depicted at both ends of a “cause and effect”
diagram
• Link of a Disease Causal Chain
• Backend event: Event occurring before the
focused event
• Frontend event: Event occurring after the
focused event
Biomarker
• Indicator of an event in health, disease, or clinical
history
• Usually a biochemical metabolite
• Indicator of normal biologic processes,
pathogenic processes, or pharmacologic
responses to a therapeutic intervention.
Disease Causal Chain
• Diagram depicting a chain or net
• Links of the chain are events
• Progress from one event to the next is shown by
a “cause and effect” diagram
• Journey from one event to the other is driven
by factors
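A causal chain of this kind can be held as a small list of links, each joining an earlier event to a later one and labelled with the factor that drives the transition. The sketch below is a minimal illustration in Python; the events, the factors and the function names are all hypothetical, and "backend" and "frontend" are simplified here to mean the events directly before and after the focused event.

```python
# Hypothetical chain for illustration; event and factor names are assumptions.
# Each link is (earlier event, later event, factor driving the transition).
links = [
    ("exposure to virus", "infection", "lack of immunity"),
    ("infection", "viraemia", "viral replication"),
    ("viraemia", "encephalitis", "virus crossing the blood-brain barrier"),
]


def backend_events(focus, links):
    """Events that occur directly before the focused event."""
    return [earlier for earlier, later, _ in links if later == focus]


def frontend_events(focus, links):
    """Events that occur directly after the focused event."""
    return [later for earlier, later, _ in links if earlier == focus]


def driving_factors(earlier, later, links):
    """Factors recorded for the transition between two events."""
    return [f for e, l, f in links if e == earlier and l == later]


print(backend_events("infection", links))
print(frontend_events("infection", links))
print(driving_factors("infection", "viraemia", links))
```

Because an event may have several backend or frontend events, the same structure also covers a net rather than a simple chain.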
Model organism
• Animal model used in the study of diseases
• Discoveries made in the animal model provide
insight into the study of human disease
• Studies include pathogenesis, potential causes
and treatments of diseases
• Basis: common descent of all living organisms,
and the conservation of metabolic and
developmental pathways and genetic material
over the course of evolution
• Research performed using poor quality animals
could be misleading
Component cause
• Belief in “one cause, one effect” is a major error
in disease investigation
• A single component cause does not by itself result in
disease
• Virus is a component cause in a viral disease
• A subset of the components of a sufficient cause does
not result in a disease but could predispose to it
• Most causes of interest to the epidemiologist
are actually components of a sufficient cause
Sufficient cause
• A sufficient cause is a constellation of
component causes that together can result in a
disease
• Factors contributing to susceptibility to a virus are
also component causes of viral disease
• Disease can originate from any one of several
different sufficient causes
Book by Rothman and Greenland
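The relation between component and sufficient causes can be sketched with sets: each sufficient cause is a set of component causes, and disease is taken to occur once every component of at least one sufficient cause is present. The Python below is only an illustration of that reading; the component names and the two sufficient causes are invented for the example.

```python
# Hypothetical sufficient causes for one viral disease; all names are assumptions.
sufficient_causes = [
    frozenset({"virus", "no prior immunity", "malnutrition"}),
    frozenset({"virus", "immunosuppression"}),
]


def disease_occurs(present, sufficient_causes):
    """True if all components of at least one sufficient cause are present."""
    present = set(present)
    return any(cause <= present for cause in sufficient_causes)


def missing_components(present, sufficient_causes):
    """For each sufficient cause, list the components that are still absent."""
    present = set(present)
    return [sorted(cause - present) for cause in sufficient_causes]


# The virus alone is a component cause and does not complete any sufficient cause.
print(disease_occurs({"virus"}, sufficient_causes))
# Adding immunosuppression completes the second sufficient cause.
print(disease_occurs({"virus", "immunosuppression"}, sufficient_causes))
print(missing_components({"virus"}, sufficient_causes))
```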
NCI-60 lines
• Cell lines for anticancer drug screening
• Developed by the National Cancer Institute,
Maryland, USA
• Reflect diverse cell lineages [lung, renal, colorectal,
ovarian, breast, prostate, central nervous system,
melanoma, and hematological malignancies]
• Such panels could be prepared for other diseases
also
Algorithm
• A precise rule or set of rules
• A sequence of instructions
• Specifies how to solve some problem
Metathesaurus
• Vocabulary for information retrieval
• Integrated from synonyms and antonyms for
common words and phrases (thesauri)
• e.g. the Unified Medical Language System, which
integrates the terminology of the biomedical
sciences into a single system
SNOMED CT and SNOMED RT
• SNOMED: Systematized NOMenclature of
MEDicine
• CT for Clinical Terms
• RT for Reference Terminology
UMLS: Unified Medical Language
System
• UMLS is a metathesaurus
• Developed by the National Library of Medicine
(NLM)
• Contains Knowledge Sources (databases) and
associated software tools (programs)
• Useful for developers of computer systems
UML: Unified Modeling Language
Not to be confused with UMLS
• A standardized general-purpose modeling language
in the field of software engineering
• UML includes a set of graphical notation techniques
• Creates abstract models of specific systems
• Diagrams: structure (Class, Component, Composite
structure, Deployment, Object and Package
diagrams), behavior (Activity, State and Use case)
and interaction (Communication, Interaction
overview, Sequence and Timing)
Semantic Network
• Knowledge diagram with graphic notation
• Looks like a flow chart
• Contains patterns of interconnected nodes
and arcs
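One simple way to hold such a network in code is as a list of (node, arc label, node) triples. The Python sketch below is a toy illustration only; the nodes, the arc labels and the function name are assumptions and are not drawn from any real semantic network.

```python
# Illustrative triples (source node, arc label, target node); content is assumed.
arcs = [
    ("Virus", "is_a", "Organism"),
    ("Virus", "causes", "Disease"),
    ("Biomarker", "indicates", "Disease"),
]


def outgoing(node, arcs):
    """Return (arc label, target node) pairs for arcs leaving the given node."""
    return [(label, target) for source, label, target in arcs if source == node]


def nodes(arcs):
    """Return every node that appears at either end of an arc."""
    return {n for source, _, target in arcs for n in (source, target)}


print(outgoing("Virus", arcs))
print(sorted(nodes(arcs)))
```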
SPECIALIST Lexicon
• SPECIALIST is the name of a Natural Language
Processing (NLP) system
• The lexicon (a dictionary-like document) developed
for use with SPECIALIST is the SPECIALIST Lexicon
• Vocabulary encompassing English and biomedical
terminology
• The lexicon entry for each word or term records
the syntactic, morphological, and orthographic
information needed by the SPECIALIST NLP
System
Genetic terminology
Essential genes
• Genes required for growth to a fertile adult
• Essential for viability
Housekeeping genes
• Involved in basic functions needed for the
sustenance of the cell
• Constitutively expressed
• They are always turned ON, e.g. actin
Disease-associated genes
• Alleles carrying particular DNA sequences
associated with the presence of disease
• e.g. UNC-93B gene deficiency as a genetic
etiology of herpes simplex encephalitis
• Lack of the Stat1 interferon-signaling gene
enhances the pathogenesis of a viral disease
Gene Ontology (GO)
• The Gene Ontology (GO) is a project
• Provides a controlled vocabulary to describe
gene and gene product attributes in any
organism
• (the molecular function of gene products;
their role in multi-step biological processes;
and their localization to cellular components)
Epigenetic
• Relating to, being, or involving a modification
in gene expression
• It is independent of the DNA sequence of a
gene
• e.g. DNA methylation, chromatin remodeling,
transcription factors, etc.
Paralogs: Paralogous genes
• Two genes or clusters of genes at different
chromosomal locations in the same organism
• Have structural similarities indicating that they
are derived from a common ancestral gene
• Have diverged from the parent copy by
mutation and selection or drift.
Homologs: Homologous genes
• Homologous: having the same relative position,
value, or structure
• Homolog: something (such as a chemical
compound or a chromosome) that is
homologous
• Homologous sequences are of two types:
orthologous and paralogous
Orthologs: orthologous genes
• Orthologous genes: genes in different species that
have evolved directly from a single ancestral gene
• This is in contrast to paralogous genes, which lie
at different locations in the same organism
Interologs
• Suppose protein molecules A and B (from one
species of animal, say human) interact, and the
homologous protein molecules A' and B' (from
another species of animal, say dog) also
interact; then the interologs are:
• Corresponding pairs of protein-protein
interactions (e.g. A-B and A'-B')
• Can be observed in parallel in two different
organisms
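The A-B and A'-B' example suggests a simple way to propose candidate interactions in a second species: carry over each known interaction whose two partners both have a homolog there. The Python below is a minimal sketch of that idea under stated assumptions; the protein names, the homolog map and the function name are hypothetical, and any pair it returns is a prediction to be confirmed experimentally.

```python
# Hypothetical data: protein names and the homolog map are assumptions.
human_interactions = {("A", "B"), ("A", "C")}
human_to_dog = {"A": "A'", "B": "B'"}  # C has no known dog homolog in this example


def predict_interologs(interactions, homolog_map):
    """Transfer each interaction whose two partners both have a homolog."""
    predicted = set()
    for p, q in interactions:
        if p in homolog_map and q in homolog_map:
            predicted.add((homolog_map[p], homolog_map[q]))
    return predicted


print(predict_interologs(human_interactions, human_to_dog))
```

The A-C interaction is not transferred because C has no entry in the homolog map.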
Interologous Interaction Database
• Web-accessible database to facilitate
experimentation and integrated
computational analysis with model organism
Protein-Protein-Interaction networks
Regulogs
• Sets of co-regulated genes for which the
regulatory sequence has been conserved
across multiple organisms
• A quantitative method assigns a confidence
score to each predicted regulog member on
the basis of the degree of conservation of
protein sequence and regulatory mechanisms
Translational medicine: ("Bench to
bedside" research)
• Research oriented toward interaction
between basic research and clinical medicine,
particularly in clinical trials
Systems biology
• A relatively new field of biological study
• Focuses on the systematic study of complex
interactions in biological systems
• Uses a new perspective (integration instead of
reduction) to study complex interactions
Predictive medicine
• Identifying biological markers in order to
enroll individuals at high risk for developing a
disease in special early detection trials
Meta-analysis
• In statistics, a meta-analysis combines the
results of several studies that address a set of
related research hypotheses
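One common way to combine study results, though by no means the only one, is fixed-effect inverse-variance pooling, in which each study's estimate is weighted by the inverse of its squared standard error. The Python sketch below assumes that method; the function name and the three study results are made up for illustration.

```python
import math


def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance weighted mean of study estimates and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Made-up effect estimates (e.g. log odds ratios) from three hypothetical studies.
estimates = [0.30, 0.10, 0.25]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = fixed_effect_pool(estimates, std_errors)
print(round(pooled, 3), round(pooled_se, 3))
```

The less precise second study receives a quarter of the weight of each of the other two, so the pooled estimate sits close to the two precise studies.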
Bayesian approach
• Statistical approach based on Bayes' theorem
• Application of Bayes' theorem: Bayes' theorem
can be applied to calculate the probability that
a positive medical test result for a disease is a
false positive, which helps decide whether retesting is planned
• Bayes' theorem can also be applied to
calculate the probability of a false negative
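Both calculations follow directly from Bayes' theorem once the disease prevalence and the test's sensitivity and specificity are known. The Python sketch below is illustrative; the function names and the numbers (1% prevalence, 99% sensitivity, 95% specificity) are assumptions, not values from the source.

```python
def prob_false_positive(prevalence, sensitivity, specificity):
    """P(no disease | positive test), from Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return false_pos / (true_pos + false_pos)


def prob_false_negative(prevalence, sensitivity, specificity):
    """P(disease | negative test), from Bayes' theorem."""
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    return false_neg / (false_neg + true_neg)


# Made-up numbers: 1% prevalence, 99% sensitivity, 95% specificity.
print(round(prob_false_positive(0.01, 0.99, 0.95), 3))
print(round(prob_false_negative(0.01, 0.99, 0.95), 5))
```

With these assumed numbers, about five in six positive results are false positives even though the test itself is accurate, which is why a positive result for a rare disease is usually followed by retesting.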
Omics terms
Genomics
• The branch of genetics that studies organisms
in terms of their genomes (their full DNA
sequences)
Pharmacogenomics
• Study of how an individual's genetic
inheritance affects the body's response to
drugs
• Aims at medicines tailor-made for individuals
and adapted to each person's own genetic makeup
• Promises greater efficacy and safety
• Environment, diet, age, lifestyle, and state of
health all can influence a person's response to
medicines
Nutrigenomics
• Study of molecular relationships between
nutrition and the response of genes
• Personalized nutrition based on genotype
Phenomics
• Field of study concerned with the
characterization of phenotypes
• Phenotypes arise via the interaction of the
genome with the environment
Transcriptome and transcriptomics
Transcriptome
• The complete set of RNA products (mRNAs, or
transcripts in a particular tissue at a particular
time) that can be produced from the genome
Transcriptomics
• The study of the transcriptome
Proteome and proteomics
• Proteome
• PROTEin complement to a genOME
• Proteomics
• The qualitative and quantitative comparison
of proteomes under different conditions to
further unravel biological processes
Metabolome and Metabolomics
• Metabolome
• It represents the collection of all metabolites
in a biological organism, which are the end
products of its gene expression
• Metabolomics
• Study of the metabolome under different
conditions