Petersx - Buffalo Ontology Site

The Immune Epitope Database: Representing Experiments Using the Ontology of Biomedical Investigations
Bjoern Peters, La Jolla Institute for Allergy and Immunology
10/21/2011, UCSD
Presentation Overview
1. The Ontology of Biomedical Investigations
(OBI)
2. The Immune Epitope Database (IEDB)
3. Representing IEDB experiments using OBI
4. Flow cytometry in OBI
OBI – a user driven project
• 19 communities that recognized they were
trying to solve the same / related problems
• Members typically have one or more
applications that drive OBI development
• 6 year effort, 1+ phone calls per week, 1-2
meetings per year
• First stable release (Philly / 1.0) in Oct. 2009
→ Open project with constant addition of new communities; please consider joining!
High level class hierarchy (partial)
OBI – Recent Development
• The eagle-i project has been integrating a large vocabulary
of research resources into OBI
• Evidence Ontology (ECO) codes are being mapped
1:1 to OBI classes to allow ‘round-tripping’
between simple codes (‘direct assay evidence’)
and expressive OWL
• Finalization of OBI-core:
– Subset of OBI with extra promises for stability and
quality
– Education tool for both users (where to look) and
developers (where to add stuff)
OBI inner core
• planned process
– investigation
– study design execution
– acquisition
– specimen collection
– human subject enrollment
– material transformation
– assay
– data transformation
– documenting
• information content entity
– document
– study design
– hypothesis textual entity
– protocol
– independent variable specification
– dependent variable specification
– measurement datum
– data item
– conclusion textual entity
• dependent continuant
– measure function
– investigation agent role
– study subject role
– specimen role
• material entity
– device
– population
– specimen
OBI outer core
• biological_process from the Gene Ontology (GO)
• cell from the Cell Ontology (CL)
• cellular_component from the Gene Ontology (GO)
• environmental material from the Environment Ontology (EnVO)
• geographical location from Gazetteer
• gross anatomical part from the Common Anatomy Reference Ontology (CARO)
• Homo sapiens from the National Center for Biotechnology Information Taxonomy (NCBITaxon)
• measurement unit label, included to connect to the Ontology of Units of Measurement (UO)
• molecular entity from Chemical Entities of Biological Interest (ChEBI)
• organism, included to connect to the National Center for Biotechnology Information
Taxonomy (NCBITaxon)
• quality, included to connect to the Phenotypic Quality Ontology (PATO)
The following two terms are not in OBI yet:
• disease course from the Ontology for General Medical Science (OGMS)
• molecular_function from the Gene Ontology (GO)
Presentation Overview
1. The Ontology of Biomedical Investigations
(OBI)
2. The Immune Epitope Database (IEDB)
3. Representing IEDB experiments using OBI
4. Flow cytometry in OBI
Immune Epitope Definition
An immune epitope is a part of a molecule
that is directly recognized by
adaptive immune receptors, specifically by
antibodies, B cell receptors, or T cell receptors
CD8+ T cell epitopes in viral infection
[Figure: diagram labels: Mouse, Virus, APC, MHC-I, TCR, CD8, T; T cell responses: Proliferation, Cytokine Release, Cytotoxicity]
T cell epitope mapping
ORF 1
M G Q I V T M F E A L P H I I D E V I N I V I I V L I V I T G I K A V Y N ...
ORF 2
M G L K G P D I Y K G V Y Q F K S V E F D M S H L N L T M P N A C S A N N ...
ORF 3
M H N F C N L T S A F N K K T F D H T L M S I V S S L H L S I D G N S N Y ...
ORF 4
M S A Q S Q C R T F R G R V L D M F R T A F G G K Y M R S G W G W T G S D ...
ORF 5
M H C T Y A G P F G M S R I L L S Q E K T K F F T R R L A G T F T W T L S ...
ORF 6
M K C F G N T A V A K C N V N H D A E F C D M L R L I D Y N K A A L S K F ...
ORF 7
M L M R N H L L D L M G V P Y C N Y S K F W Y L E H A K T G E T S V P K C ...
ORF 8
M N M I T E M L R K D Y I K R Q G S T P L A L M D L L M F S T S A Y L V S ...
Goals of the Immune Epitope Database
and Analysis Resource (IEDB)
• To catalog, organize and make accessible immune
epitope related information
– B and T cell epitopes, MHC binding, MHC ligand elution
– Scope: infectious diseases, allergy, autoimmunity,
transplantation (HIV → LANL database; no cancer)
• Develop new methods to predict and model immune
responses (→ IEDB Analysis Resource)
www.iedb.org
Populating the IEDB
[Figure: sources feeding the IEDB (www.iedb.org): literature curation; epitope discovery contract submission. Panel labels: "Part II: Document categorization", "Part III: Data representation"]
Epitope: Structure
– Name: 74A
– Chemical Type: Peptide/Protein
– Sequence: CLTEYILWV
– Domain / Region: Defined Epitope
Epitope: Source
– Species: Vaccinia virus Ankara
– Strain: Ankara (MVA)
– Antigen: putative 21.7k protein
– Antigen Accession: 2772819
– Antigen Positions: 79-87
Context: Immunization
– Immunized Species: Homo sapiens
– Immunogen Type: Source Species
– Administration: Scarification
Context: Assay
– Antigen Type: Epitope
– Assay Type: ELISPOT
– Response Measured: Cytokine Release-IFN-g
– MHC Allele: HLA-A*0201
Literature Curation Status
Category | # Relevant articles | Percent completed
Infectious disease | 10,260 | 99.5%
Allergy | 1,639 | 99.1%
Autoimmunity | 5,160 | 99.1%
Transplant | 977 | 99.3%
Total | 18,036 | 99.3%
>99% in all categories since 2011
IEDB applications
Meta-Analyses
Prediction tool development
Presentation Overview
1. The Ontology of Biomedical Investigations
(OBI)
2. The Immune Epitope Database (IEDB)
3. Representing IEDB experiments using OBI
4. Flow cytometry in OBI
Using OBI to represent experiments in the IEDB
[Figure: epitope mapping experiments; labels: T, B, APC, T Cell Response, B Cell Response]
High level database structure
• epitope source (material entity): organism, protein, protein complex
• epitope structure (material entity): peptide, discontinuous protein residues, carbohydrate
• reference (document): journal article, author submission
• immune recognition assay (process): B cell response, T cell response, MHC binding
• immunization (process): natural infection, administered immunization
• Relations shown between these entities: has part, is about, has participant, preceded by
Replacing IEDB controlled vocabularies
with OBI classes
• Benefits:
– Increase consistency in data curation
– Avoid duplicates
– Improve documentation to external users
– Enhance search capabilities
Original approach: controlled vocabularies
• Used existing external ontologies as source where possible
(none available for epitope specific T cell assays)
• Maintain a list of assays; if a publication uses an assay that is
different, add it to this list → 140 T cell assays
• Challenges:
– Ensure curators pick the right assays
– Communicate to external users what each assay is
– Avoid introducing duplicates (“MCP-1 IFA” = “CCL-2 histostain”)
• In addition we want
– Search for groups of related assays
– Interoperability (lots of it)
→ Create an OBI class for each entry in our list of assay types
OBI
hierarchy
Assay definition:
A planned process with the objective to
produce information about an evaluant
OWL (partial):
has_specified_input some
(material_entity
and (has_role some 'evaluant role'))
has_specified_output some
('information content entity'
and ('is about' some
(continuant
and (has_role some 'evaluant role'))))
T cell epitope assay design pattern
• The majority of assays could be defined with necessary and sufficient (N&S)
conditions after specifying two variables:
<parent assay type> and has_specified_output some
('measurement datum' and 'is about' some
(<GO process Y> and 'process is result of' some 'MHC:epitope
complex binding to TCR'))
• For example, “IL-17 ELISPOT” in the IEDB is logically defined as
'ELISPOT assay' and has_specified_output some
('measurement datum' and 'is about' some
('IL-17 production' and 'process is result of' some 'MHC:epitope
complex binding to TCR'))
• Required expanding the parent assay types (OBI) and GO processes
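The two-variable pattern above lends itself to templating. The following is a minimal sketch, not the IEDB's or OBI's actual tooling: the function name `t_cell_assay_definition` is hypothetical, and the parentheses around the `'measurement datum'` clause are an assumption about the intended grouping, mirroring the general assay definition.

```python
def t_cell_assay_definition(parent_assay: str, go_process: str) -> str:
    """Fill the two variables of the design pattern and return the
    class expression as plain text (hypothetical helper)."""
    return (
        f"'{parent_assay}' and has_specified_output some "
        f"('measurement datum' and 'is about' some "
        f"('{go_process}' and 'process is result of' some "
        f"'MHC:epitope complex binding to TCR'))"
    )


# The "IL-17 ELISPOT" example from the slide.
definition = t_cell_assay_definition("ELISPOT assay", "IL-17 production")
print(definition)
```

Only the parent assay type and the GO process vary between child terms, which is why such terms can be generated mechanically once the parents exist.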
Adding parent assay types to OBI
• label: cytometric bead array assay
• definition: An assay in which a series of beads coated with
antibodies specific for different analytes and marked with discrete
fluorescent labels are used to simultaneously capture and
quantitate soluble analytes using flow cytometric analysis.
• alternative term: multiplexed bead assay, CBA assay
• example of usage: Using a Luminex machine to detect IFN-gamma
and IL-10 in the supernatant of a cell culture
→ “Parent” assay definitions are discussed in OBI as a group and
derived by consensus, to ensure exactness and ability to re-use.
→ Child terms that follow design patterns are added without group
discussion
Modifying external ontologies
• Requests for new / modified terms are made through their
respective trackers
(sometimes additional prodding is needed)
• Often results in email discussions that clarify issues and
result in improved definitions (but take time)
• Succeeded with GO, ChEBI, PRO, OGMS, IDO, PATO, UO, …
• Resulting terms are imported into OBI to reference them
in logical definitions (using the MIREOT mechanism)
• Some terms have no ‘natural home ontology’, and are
kept in OBI until they can be moved
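As a rough illustration of what a MIREOT-style import records, the sketch below captures the three pieces of information the mechanism asks for: the source ontology, the source term, and the superclass it is placed under in the importing ontology. The class and field names are hypothetical, and the choice of `biological_process` placed under BFO `process` is only an illustrative assumption, not a statement of how OBI actually files the term.

```python
from dataclasses import dataclass

OBO = "http://purl.obolibrary.org/obo/"


@dataclass(frozen=True)
class MireotImport:
    """Minimum information to reference an external term (hypothetical record)."""
    source_ontology: str    # IRI of the ontology that owns the term
    source_term: str        # IRI of the term being imported
    target_superclass: str  # parent class inside the importing ontology


def to_row(imp: MireotImport) -> str:
    """Serialize one import as a tab-separated line, e.g. for an import list."""
    return "\t".join([imp.source_ontology, imp.source_term, imp.target_superclass])


# Illustrative entry: GO biological_process placed under BFO process.
imp = MireotImport(
    source_ontology=OBO + "go.owl",
    source_term=OBO + "GO_0008150",
    target_superclass=OBO + "BFO_0000015",
)
row = to_row(imp)
print(row)
```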
Mapping IEDB assay types to OBI classes
Spreadsheet based template:
IEDB Assay ID | OBI ID | rdfs:label | is about biological process (epitope specific T cell response) | assay type (technique)
198 | OBI_0000891 | assay of epitope specific T cell proliferation | epitope specific T cell proliferation | assay
52 | OBI_0001194 | ELISPOT assay of epitope specific transforming growth factor-beta production by T cells | epitope specific transforming growth factor-beta production by T cells | ELISPOT assay
61 | OBI_0001196 | cytometric bead array assay of epitope specific IP-10 production by T cells | epitope specific IP-10 production by T cells | cytometric bead array assay
62 | OBI_0001198 | assay of epitope specific interleukin-27 production by T cells | epitope specific interleukin-27 production by T cells | assay
64 | OBI_0001203 | detection of specific nucleic acids with complementary probes of epitope specific transforming growth factor-beta production by T cells | epitope specific transforming growth factor-beta production by T cells | detection of specific nucleic acids with complementary probes
65 | OBI_0001206 | ELISPOT assay of epitope specific granulocyte macrophage colony-stimulating factor production by T cells | epitope specific granulocyte macrophage colony-stimulating factor production by T cells | ELISPOT assay
66 | OBI_0001209 | assay of epitope specific interleukin-10 production by T cells | epitope specific interleukin-10 production by T cells | assay
84 | OBI_0001210 | cytometric bead array assay of epitope specific transforming growth factor-beta production by T cells | epitope specific transforming growth factor-beta production by T cells | cytometric bead array assay
143 | OBI_0001215 | intracellular cytokine staining assay of epitope specific cytotoxic T cell degranulation | epitope specific cytotoxic T cell degranulation | intracellular cytokine staining assay
153 | OBI_0001216 | cell culture analyte detection bioassay of epitope specific interleukin-2 production by T cells | epitope specific interleukin-2 production by T cells | cell culture analyte detection bioassay
154 | OBI_0001217 | assay of epitope specific interleukin-22 production by T cells | epitope specific interleukin-22 production by T cells | assay
171 | OBI_0001218 | assay of epitope specific interleukin-8 production by T cells | epitope specific interleukin-8 production by T cells | assay
174 | OBI_0001220 | cell culture analyte detection bioassay of epitope specific interleukin-10 production by T cells | epitope specific interleukin-10 production by T cells | cell culture analyte detection bioassay
194 | OBI_0001222 | ELISA of epitope specific RANTES production by T cells | epitope specific RANTES production by T cells | ELISA
196 | OBI_0001223 | cytometric bead array assay of epitope specific … | epitope specific interleukin-1 beta production … | cytometric bead array assay
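In the rows of this template, each rdfs:label reads as the assay type followed by "of" and the biological process. A minimal sketch of generating labels from such a spreadsheet is given below; the column names and the `build_label` helper are hypothetical, and the tab-separated layout is an assumption about how the template might be stored.

```python
import csv
import io

# Three rows taken from the slide; column names are illustrative assumptions.
TEMPLATE = (
    "iedb_assay_id\tobi_id\tprocess\tassay_type\n"
    "52\tOBI_0001194\tepitope specific transforming growth factor-beta "
    "production by T cells\tELISPOT assay\n"
    "61\tOBI_0001196\tepitope specific IP-10 production by T cells\t"
    "cytometric bead array assay\n"
    "198\tOBI_0000891\tepitope specific T cell proliferation\tassay\n"
)


def build_label(assay_type: str, process: str) -> str:
    """Compose the class label from the two template columns."""
    return f"{assay_type} of {process}"


rows = list(csv.DictReader(io.StringIO(TEMPLATE), delimiter="\t"))
labels = {r["obi_id"]: build_label(r["assay_type"], r["process"]) for r in rows}

for obi_id, label in labels.items():
    print(obi_id, label)
```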
Benefits of using OBI classes
for IEDB assay types internally
• Formal definitions of assay types serve as curation rules
• Issues arising in curation are reflected 1:1 by issues in
writing definitions
• Linking to GO identified duplicate assay types
(introduced in the IEDB controlled vocabulary as a result
of changes in nomenclature over time)
The same could have been achieved by carefully writing
definitions for our controlled vocabulary terms, but
ontologies can do more…
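One way to picture how linking to GO exposes duplicates: once every assay type is reduced to a (technique, process) pair, two names with the same pair are the same assay. The sketch below uses the “MCP-1 IFA” = “CCL-2 histostain” example mentioned earlier; the normalized technique and process strings are placeholders chosen for illustration, not actual OBI or GO labels.

```python
from collections import defaultdict

# Hypothetical normalized definitions: name -> (technique, process).
ASSAY_TYPES = {
    "MCP-1 IFA": ("staining assay (placeholder)", "CCL2 production (placeholder)"),
    "CCL-2 histostain": ("staining assay (placeholder)", "CCL2 production (placeholder)"),
    "IL-17 ELISPOT": ("ELISPOT assay", "IL-17 production"),
}


def find_duplicates(assay_types):
    """Group assay names by their normalized definition and report
    every group that contains more than one name."""
    groups = defaultdict(list)
    for name, definition in assay_types.items():
        groups[definition].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


duplicates = find_duplicates(ASSAY_TYPES)
print(duplicates)
```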
Reasoning introduces hierarchy
Display with community specific “IEDB alternative label”
Benefits of using OBI for external users
Required (minimal) modification of the assay type table
Assay type ID [Primary Key] | Assay type name | Ontology ID [could be more than just OBI]
1 | IFN-g ELISPOT | http://purl.obolibrary.org/obo/OBI_0001414
2 | Survival | http://purl.obolibrary.org/obo/OBI_0001334
3 | IL-10 FACS | http://purl.obolibrary.org/obo/OBI_0000414
4 | DTH | http://purl.obolibrary.org/obo/OBI_0002114
… | … | …
→ This allowed us to use OBI
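The minimal modification described here amounts to one extra column on the assay type table. A sketch using an in-memory SQLite database follows; the table and column names are assumptions made for illustration and are not the IEDB's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: the existing table plus one ontology_id column.
conn.execute(
    """
    CREATE TABLE assay_type (
        assay_type_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        ontology_id   TEXT  -- full IRI, so it need not be an OBI term
    )
    """
)

# Rows as shown on the slide.
conn.executemany(
    "INSERT INTO assay_type VALUES (?, ?, ?)",
    [
        (1, "IFN-g ELISPOT", "http://purl.obolibrary.org/obo/OBI_0001414"),
        (2, "Survival", "http://purl.obolibrary.org/obo/OBI_0001334"),
        (3, "IL-10 FACS", "http://purl.obolibrary.org/obo/OBI_0000414"),
        (4, "DTH", "http://purl.obolibrary.org/obo/OBI_0002114"),
    ],
)

# Look up the ontology class behind a curated assay type name.
row = conn.execute(
    "SELECT ontology_id FROM assay_type WHERE name = ?", ("IFN-g ELISPOT",)
).fetchone()
print(row[0])
```

Storing the full IRI rather than a bare OBI number keeps the column open to terms from other ontologies, as the column header suggests.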
Ontology driven
search interface
• Search for groups of related assays
• Search using synonyms
• Use IEDB specific labels
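The first two search features can be sketched with a small class hierarchy and a synonym table: a query is first resolved through its synonyms, then expanded to the term and all of its descendants. The hierarchy fragment, the synonym entry, and the function names below are illustrative assumptions, not the IEDB search implementation.

```python
# Hypothetical fragment of the assay hierarchy: child -> parent.
PARENT = {
    "ELISPOT assay": "assay",
    "ELISA": "assay",
    "IFN-g ELISPOT": "ELISPOT assay",
    "IL-17 ELISPOT": "ELISPOT assay",
}

# Hypothetical synonym table: lowercase synonym -> preferred label.
SYNONYMS = {"enzyme-linked immunospot assay": "ELISPOT assay"}


def descendants(term):
    """Collect every class below `term` by walking the child -> parent map."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child, parent in PARENT.items():
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found


def search(query):
    """Resolve a synonym, then return the term together with its descendants."""
    term = SYNONYMS.get(query.lower(), query)
    return {term} | descendants(term)


hits = search("enzyme-linked immunospot assay")
print(sorted(hits))
```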
Future work
• Export IEDB data into a triple store, enabling
SPARQL queries
→ seamless interoperability
• Integration into rule based validation system
Overall Conclusions
• The IEDB catalogs and organizes experimental
data characterizing immune epitopes
• We implemented a machine learning pipeline to
identify and triage journal articles relevant for
subject areas of interest
• OBI provides a framework to represent
experimental information in an interoperable
and semantically rich format that has immediate
benefits for database resources such as the IEDB
Flow cytometry for IEDB
• IL-10 production (GO)
• Epitope specific IL-10 production by T cells (OBI helper
term).
– Textual Definition: “A biological process where T cells produce
IL-10 resulting from the recognition of a T cell epitope”
– Logical definition:
“'interleukin-10 production' and ('process is result of' some
'MHC:epitope complex binding to TCR')”
• Intracellular cytokine staining assay (OBI)
• T cell epitope intracellular cytokine staining IL-10 assay
(OBI; a term that really only the IEDB wants)
→ Tie to cells, cell populations
Immunology terms in OBI
• There is no ‘immune epitope ontology’
→ merged into OBI
• These terms are looking for a new home:
– disposition to be bound by immune receptor
– binding
– Epitope, antigen, immunogen, allergen, host
– ‘epitope specific cytokine production by T cells’
– Environmental exposure / proximity to infectious agent (IDO)
Thanks!
La Jolla Institute for Allergy & Immunology
SAIC
• Stephen Greenlee
• Jason Cantrell
• Jason Buell
• Robert Hinman
• Kelly Wheeler
• Eric Gutt
San Diego Supercomputer
Center
• Phil Bourne
• Julia Ponomarenko
Technical University of
Denmark
• Ole Lund
• Morten Nielsen
University of Copenhagen
• Søren Buus