140) IUPHAR/BPS guide to pharmacology (GtoPdb): Concise

1
http://www.slideshare.net/cdsouthan/southannciuphar-acssandiego-59444512
IUPHAR/BPS guide to pharmacology (GtoPdb):
Concise mapping for the triples of chemistry,
data, and protein target classifications
Christopher Southan, Adam J. Pawson, Joanna L. Sharman, Elena
Faccenda, Simon Harding, Jamie Davis, IUPHAR/BPS Guide to
PHARMACOLOGY, Centre for Integrative Physiology, University of
Edinburgh
ACS Wed, Mar 16 CINF 140: Chemistry, Data & the Semantic Web: An
Important Triple to Advance Science 1:30 PM - 4:45 PM Room 25B
www.guidetopharmacology.org
1:35pm - 2:00pm
2
Abstract
(will be skipped for presentation)
The International Union of Basic and Clinical Pharmacology Committee on Receptor
Nomenclature and Drug Classification (NC-IUPHAR) provides authoritative reports on
G protein-coupled receptors (GPCRs), Nuclear Hormone Receptors and Ion Channels
as pharmacology-based classifications. While these recommendations have surfaced as
Pharmacological Reviews papers (i.e. unstructured) since the 1990s, they were
already underpinning the protein tables in GtoPdb's predecessor, IUPHAR-DB, by
2003. By 2012 this hierarchical data structure had expanded into the GtoPdb schema
covering essentially all target classes for pharmacology, drug discovery and chemical
biology. As of August 2015 the expert-curated relationship capture from the literature
covers 1505 target-to-ligand mappings of which 1228 human protein IDs have
quantitative interaction data recorded against 5860 chemical structures. The
motivation, evolutionary trajectory, the need for community engagement to fill data
gaps and future directions of the resource will be outlined. Descriptions will cover the
challenges of cross-referencing alternative gene/protein hierarchies, each of which has
different navigational utilities and linkages to chemistry in GtoPdb. These now extend
beyond receptors to enzymes and include NC-IUPHAR, HGNC, UniProt, Ensembl,
InterPro, Gene Ontology and E.C. numbers. The adaptation of our classifications to
encompass a new immunopharmacology project will also be discussed.
3
Outline
• Introduction to NC-IUPHAR
• Evolution of IUPHAR-DB to GtoPdb
• Relationship statistics
• Target hierarchy and navigation
• Triple challenges with taxol
• Protein mapping and data gaps
• Introducing Guide to Immunopharmacology
• Conclusions and plans
4
International Union of Basic and Clinical Pharmacology
Committee on Receptor Nomenclature and Drug
Classification (NC-IUPHAR)
• Section within IUPHAR umbrella organisation since 1987
• Issuing guidelines for the nomenclature and classification of human
biological targets of current and future medicines
• Facilitating the interface between the Human Genome Project
entities as functional units and potential drug targets
• Designating pharmacologically important polymorphisms
• Developing an authoritative and freely available, global online
resource the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb)
• Establishment of target-specific subcommittees (650 members)
• Associated with over 90 PubMed entries since 1995
• Co-applicant on UK Wellcome Trust grants for the Edinburgh
University-based GtoPdb and GtoImmPdb projects
5
NC-IUPHAR
2015 output
6
NC-IUPHAR – Human Gene Nomenclature Committee collaboration
7
IUPHAR-DB launched in 2009:
unique model of committee-underpinned annotation
8
2012 to 2016: evolution of GtoPdb with major expansion
Human targets
Ligands
9
GtoPdb relationship statistics (Jan 2016)
[Figure: our basic “triple”]
• 15,000 PubMed IDs
• 14,117 affinity values
• 6,149 ligands
• 1,786 Swiss-Prot IDs
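As a minimal sketch of how such relationship statistics could be tallied, the block below models one curated record as chemistry (ligand), data (affinity) and protein (Swiss-Prot accession), with the PubMed ID as provenance. The class name, field names and sample values are all hypothetical placeholders, not the GtoPdb schema or real GtoPdb data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One curated record; field names are illustrative, not the GtoPdb schema."""
    ligand_id: str         # chemistry (placeholder identifier)
    swissprot_acc: str     # protein target (placeholder accession)
    affinity_type: str     # data: e.g. "Ki" or "IC50"
    affinity_value: float  # data: numeric value, units left to the caller
    pmid: int              # provenance: the source paper


def relationship_counts(records):
    """Count distinct entities on each side of the triple."""
    return {
        "pubmed_ids": len({r.pmid for r in records}),
        "affinity_values": len(records),
        "ligands": len({r.ligand_id for r in records}),
        "swissprot_ids": len({r.swissprot_acc for r in records}),
    }


# Made-up placeholder records, purely to exercise the function.
records = [
    Interaction("LIG_A", "ACC_1", "Ki", 1.0, 1),
    Interaction("LIG_A", "ACC_1", "IC50", 2.0, 1),
    Interaction("LIG_B", "ACC_2", "Ki", 3.0, 2),
]
print(relationship_counts(records))
```

Counting every affinity record but only distinct ligands, targets and papers is one plausible reading of why the affinity count on the slide exceeds the ligand and target counts.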
10
Top Level NC-IUPHAR target classification
• NC-IUPHAR underpinned but largely HGNC-concordant
• Defers to target-class nomenclature outside NC-IUPHAR domains (e.g.
MEROPS for proteases, ESTHER for α/β hydrolases)
• Includes pharmacologist-preferred NC-IUPHAR naming (e.g. Calcium-activated potassium channel KCa1.1 = KCNMA1)
• 65 “Quaternary Structure Subunit” annotations
• NC-IUPHAR use of lower-case and symbols can be problematic
11
Navigation: ligands > primary target
12
Navigation: paper > chemistry > target > affinity data
13
Trouble with triples (I):
so which taxol drug structure is it?
12 CIDs include CAS 33069-62-4
Probably not the virtual D52
14
Trouble with triples (II): so which is the molecular target?
15
Trouble with triples (III):
so which structure > activity > target?
• 22 CIDs share 4842 PubChem Bioassay results
• 89% are aligned against CID 36314
• 12 record actives
• None of the mixtures have results
16
GtoPdb: parsimonious annotation
• We curate selectively and with high stringency
• This results in minimal rather than maximal triples coverage
17
More trouble with triples:
which targets are real and which IDs cross-map 1:1?
UniProt, human = 151,569
UniProt, human, Swiss-Prot = 20,198
+ neXtProt = 20,040
+ HGNC = 19,836
+ Ensembl = 18,933
+ CCDS = 18,286
+ Entrez Gene ID = 18,245
+ RefSeq = 18,244
+ Evidence at protein level = 14,065
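The shrinkage at each cross-mapping step can be made explicit with a little arithmetic. The sketch below assumes the filter labels and counts on this slide pair up in the order listed, and takes the Swiss-Prot human set as the baseline; the function name and the choice of baseline are illustrative assumptions.

```python
# Filter labels and counts from the slide, paired in listed order (an assumption).
FUNNEL = [
    ("UniProt, human", 151_569),
    ("UniProt, human, Swiss-Prot", 20_198),
    ("+ neXtProt", 20_040),
    ("+ HGNC", 19_836),
    ("+ Ensembl", 18_933),
    ("+ CCDS", 18_286),
    ("+ Entrez Gene ID", 18_245),
    ("+ RefSeq", 18_244),
    ("+ Evidence at protein level", 14_065),
]


def funnel_report(steps, baseline_index=1):
    """Return (label, count, lost_vs_previous, percent_of_baseline) per step."""
    baseline = steps[baseline_index][1]
    rows = []
    previous = None
    for label, count in steps:
        lost = 0 if previous is None else previous - count
        rows.append((label, count, lost, round(100 * count / baseline, 1)))
        previous = count
    return rows


for label, count, lost, pct in funnel_report(FUNNEL):
    print(f"{label:<30} {count:>8,} lost {lost:>8,} {pct:>6.1f}% of Swiss-Prot")
```

On these figures, requiring evidence at protein level is the largest single drop after the Swiss-Prot restriction, removing 4,179 entries from the 18,244 that survive the RefSeq step.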
18
Even more trouble with triples: prodrugs and data gaps
• Data gaps could be experimentally filled with established assays
• For example, some early ACE inhibitors have no purified human protein
results (only rat, rabbit or hamster)
• Prodrugs may have no recorded activity – so cannot be target mapped
• On a good day we can get Ki and IC50 from the same paper
• How do we convince/entice folk to fill the gaps?
19
Utility of different target hierarchies
• NC-IUPHAR <> Swiss-Prot <> HGNC
• HGNC families and stems
• InterPro (includes Pfam)
• Gene Ontology (GO)
• EC numbers for enzymes
• Protein Ontology
• ChEMBL groupings
• Pathways (systems pharmacology)
• UniProt key words and cross-references
• Terminology for oligomeric complexes and splice variants is problematic
20
Introducing the Guide to Immunopharmacology
• Wellcome Trust funded project initiated 4Q15
• Abbreviation will be GtoImmPdb.
• Homepage portal providing an immunological perspective
onto the database.
• Will use same schema as GtoPdb but extended to
integrate GtoImmPdb data.
• Search via biological processes and target annotations to
terms in the Gene Ontology (GO)
• Mapping to a simplified specific process list
• Provide search options via the Cell Ontology.
21
Intersects between GO immunology, GO inflammation and
GtoPdb targets with quantitative ligand interactions
22
Conclusions and plans
• Resolving triples across the bioactivity big data landscape is difficult
• Our approach is concise “small data” relationship mapping
• NC-IUPHAR > new nomenclature engagements
• Consolidate GtoPdb (< 2000 stringent target mappings)
• Instantiate GtoImmPdb
• RDF-ise GtoPdb for Open PHACTS
• PubChem BioAssay submission (target class splits)
• PubChem SID splits (e.g. approved drugs)
• Fill in legacy data gaps
• Expand (flexible) rules and relationship handling e.g. protein
interaction inhibitors, hybrid therapeutics, ligands with unknown
molecular mechanism
• Work on chemistry mapping retrieval
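To illustrate what RDF-ising a curated relationship might involve, the sketch below writes one ligand, target and affinity record as N-Triples lines. Because an RDF triple holds only subject, predicate and object, the record is hung off an intermediate interaction node. The `example.org` namespace, the predicate names and the sample identifiers are hypothetical; only the UniProt and XML Schema URI prefixes are real conventions, and nothing here represents the actual GtoPdb or Open PHACTS vocabulary.

```python
UNIPROT = "http://purl.uniprot.org/uniprot/"       # UniProt's RDF URI prefix
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
EX = "http://example.org/gtopdb/"                  # hypothetical namespace


def interaction_to_ntriples(interaction_id, ligand_id, accession,
                            affinity_type, affinity_value):
    """Serialise one curated interaction as a list of N-Triples lines."""
    node = f"<{EX}interaction/{interaction_id}>"
    return [
        f"{node} <{EX}hasLigand> <{EX}ligand/{ligand_id}> .",
        f"{node} <{EX}hasTarget> <{UNIPROT}{accession}> .",
        f'{node} <{EX}affinityType> "{affinity_type}" .',
        f'{node} <{EX}affinityValue> "{affinity_value}"^^<{XSD_DOUBLE}> .',
    ]


# Placeholder identifiers and value, not real GtoPdb data.
for line in interaction_to_ntriples("i1", "L1", "P00000", "Ki", 7.5):
    print(line)
```

Keeping the affinity on the interaction node rather than on the ligand or target lets the same ligand-target pair carry several measurements from different papers.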
23
References, acknowledgments and questions
http://www.ncbi.nlm.nih.gov/pubmed/24234439
• Please visit us http://www.guidetopharmacology.org/
• Curation rules are outlined in our FAQ, the 2014 and 2016 NAR
papers and blogposts
• Funders are acknowledged in the title slide
• To retrieve NC-IUPHAR's 95 Pharmacological Reviews nomenclature
publications in PubMed: (International[Title] AND Union[Title] AND
Pharmacology[Title] AND "Pharmacol Rev"[Journal])
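The same query can also be run programmatically. The sketch below only builds a request URL for the NCBI E-utilities `esearch` endpoint and does not send it; the function name and the `retmax` default are illustrative assumptions, and the number of hits returned today may differ from the 95 quoted on the slide.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string exactly as given on the slide.
QUERY = ('(International[Title] AND Union[Title] AND '
         'Pharmacology[Title] AND "Pharmacol Rev"[Journal])')


def build_esearch_url(term, retmax=200):
    """Return an esearch URL asking PubMed for up to `retmax` matching PMIDs."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


print(build_esearch_url(QUERY))
```

Fetching the printed URL (for example with `urllib.request.urlopen`) should return a JSON document whose `esearchresult` section lists the matching PubMed IDs.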