The Pathway Tools Schema
Motivations for Understanding the Schema
SRI International
Bioinformatics
Pathway Tools visualizations and analyses depend upon the software being able to find precise information in precise places within a Pathway/Genome DB (PGDB)
When writing complex queries to PGDBs, those queries must name classes and slots within the schema
A Pathway/Genome Database is a web of interconnected objects; each object represents a biological entity
Reference
Pathway Tools User’s Guide, Volume I
Appendix A: Guide to the Pathway Tools Schema
Web of Relationships for One Enzyme
TCA Cycle
Succinate + FAD = fumarate + FADH2
Enzymatic-reaction
Succinate dehydrogenase
Sdh-flavo
Sdh-Fe-S
Sdh-membrane-1
Sdh-membrane-2
sdhA
sdhB
sdhC
sdhD
Frame Data Model
Frame Data Model -- organizational structure for a PGDB
Knowledge base (KB, Database, DB)
Frames
Slots
Facets
Annotations
Knowledge Base
Collection of frames and their associated slots, values, facets, and annotations
AKA: Database, PGDB
Can be stored within
An Oracle or MySQL DB
A disk file
Pathway Tools binary program
Frames
Entities with which facts are associated
Kinds of frames:
Classes: Genes, Pathways, Biosynthetic Pathways
Instances (objects): trpA, TCA cycle
Classes:
Superclass(es)
Subclass(es)
Instance(s)
A symbolic frame name (id, key) uniquely identifies each frame
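The class/instance distinction above can be sketched in a few lines of Python. This is an illustrative model only, not the Pathway Tools implementation: the `Frame` and `KnowledgeBase` names, the `parents` field, and the `is_a` helper are all hypothetical, and the frame IDs are adapted from the slide's examples.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical stand-in for a frame: either a class or an instance."""
    frame_id: str                                 # symbolic name, unique within the KB
    is_class: bool = False
    parents: list = field(default_factory=list)   # superclasses, or the classes of an instance
    slots: dict = field(default_factory=dict)     # slot name -> list of values


class KnowledgeBase:
    """Hypothetical KB: a collection of frames keyed by frame ID."""

    def __init__(self):
        self.frames = {}

    def add(self, frame):
        # The frame ID is the key, so a duplicate ID is rejected.
        if frame.frame_id in self.frames:
            raise ValueError(f"duplicate frame ID: {frame.frame_id}")
        self.frames[frame.frame_id] = frame
        return frame

    def is_a(self, frame_id, class_id):
        """Return True if frame_id is class_id or descends from it."""
        pending, seen = [frame_id], set()
        while pending:
            current = pending.pop()
            if current == class_id:
                return True
            if current not in seen:
                seen.add(current)
                pending.extend(self.frames[current].parents)
        return False


kb = KnowledgeBase()
kb.add(Frame("Genes", is_class=True))
kb.add(Frame("Pathways", is_class=True))
kb.add(Frame("Biosynthetic-Pathways", is_class=True, parents=["Pathways"]))
kb.add(Frame("trpA", parents=["Genes"]))
kb.add(Frame("TCA-cycle", parents=["Pathways"]))

print(kb.is_a("trpA", "Genes"))                      # True
print(kb.is_a("Biosynthetic-Pathways", "Pathways"))  # True
print(kb.is_a("trpA", "Pathways"))                   # False
```

Keying the dictionary on the frame ID mirrors the point that a symbolic name uniquely identifies each frame.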
Slots
Encode attributes/properties of a frame
Integer, real number, string
Represent relationships between frames
The value of a slot is the identifier of another frame
Every slot is described by a “slot frame” in a KB that defines meta information about that slot
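As a rough illustration of the two roles a slot can play, the sketch below keeps a small "slot frame" per slot and uses it to decide whether values are plain data or IDs of other frames. The `value-type` key, the helper names, and the coordinate value are assumptions made for this example; they are not the actual Pathway Tools slot-frame layout.

```python
# Hypothetical slot frames: one entry of meta information per slot.
slot_frames = {
    "Common-Name": {"value-type": "string"},
    "Left-End-Position": {"value-type": "integer"},
    "Product": {"value-type": "frame"},   # values are IDs of other frames
}

# Hypothetical frames: frame ID -> {slot name -> list of values}.
frames = {
    "sdhA": {
        "Common-Name": ["sdhA"],
        "Left-End-Position": [1000],      # placeholder, not a real coordinate
        "Product": ["Sdh-flavo"],
    },
    "Sdh-flavo": {"Common-Name": ["Sdh-flavo"]},
}


def slot_kind(slot):
    """Classify a slot as an attribute or a relationship from its slot frame."""
    if slot_frames[slot]["value-type"] == "frame":
        return "relationship"
    return "attribute"


def linked_frames(frame_id, slot):
    """Resolve the values of a relationship slot to the frames they name."""
    if slot_kind(slot) != "relationship":
        raise ValueError(f"{slot} holds attribute values, not frame IDs")
    return {value: frames[value] for value in frames[frame_id].get(slot, [])}


print(slot_kind("Left-End-Position"))   # attribute
print(slot_kind("Product"))             # relationship
print(linked_frames("sdhA", "Product"))
```

Consulting the slot frame, rather than inspecting the value, avoids mistaking a string attribute such as a common name for a frame reference.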
Slot Links
TCA Cycle
in-pathway
Succinate + FAD = fumarate + FADH2
reaction
Enzymatic-reaction
catalyzes
Succinate dehydrogenase
component-of
Sdh-flavo
Sdh-Fe-S
Sdh-membrane-1
Sdh-membrane-2
product
sdhA
sdhB
sdhC
sdhD
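The chain of slot links in this figure can be walked programmatically. The sketch below is a minimal model using plain dictionaries: the `follow` helper is hypothetical, the frame IDs are the figure's labels rather than real PGDB identifiers, and the pairing of each gene with a subunit is assumed from the order in which the figure lists them.

```python
# Hypothetical frames for the figure: frame ID -> {slot -> list of frame IDs}.
# Gene-to-subunit pairing is assumed from the listing order in the figure.
frames = {
    "sdhA": {"product": ["Sdh-flavo"]},
    "sdhB": {"product": ["Sdh-Fe-S"]},
    "sdhC": {"product": ["Sdh-membrane-1"]},
    "sdhD": {"product": ["Sdh-membrane-2"]},
    "Sdh-flavo": {"component-of": ["Succinate dehydrogenase"]},
    "Sdh-Fe-S": {"component-of": ["Succinate dehydrogenase"]},
    "Sdh-membrane-1": {"component-of": ["Succinate dehydrogenase"]},
    "Sdh-membrane-2": {"component-of": ["Succinate dehydrogenase"]},
    "Succinate dehydrogenase": {"catalyzes": ["Enzymatic-reaction"]},
    "Enzymatic-reaction": {"reaction": ["Succinate + FAD = fumarate + FADH2"]},
    "Succinate + FAD = fumarate + FADH2": {"in-pathway": ["TCA Cycle"]},
    "TCA Cycle": {},
}


def follow(frames, start, slot_path):
    """Follow a sequence of slots from one frame and return the frames reached."""
    current = [start]
    for slot in slot_path:
        reached = []
        for frame_id in current:
            for value in frames[frame_id].get(slot, []):
                if value not in reached:   # keep order, drop duplicates
                    reached.append(value)
        current = reached
    return current


GENE_TO_PATHWAY = ["product", "component-of", "catalyzes", "reaction", "in-pathway"]
print(follow(frames, "sdhA", GENE_TO_PATHWAY))   # ['TCA Cycle']
```

A query of this shape has to name each slot along the path, which is why familiarity with the schema matters when writing complex queries.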
Slots
Number of values
Single valued
Multivalued: sets, bags
Slot values
Any LISP object: Integer, real, string, symbol (frame name)
Slotunits define properties of slots: datatypes, classes, constraints
Two slots are inverses if they encode opposite relationships
Slot Product in class Genes
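One way to picture inverse slots is a store that mirrors every value onto the opposite slot. The sketch below is a simplified model, not the Pathway Tools mechanism: the `SLOTUNITS` table and the `KB` class are hypothetical, and pairing Product (on genes) with Gene (on polypeptides) is an assumption based on the slot lists in this deck.

```python
from collections import defaultdict

# Hypothetical slotunits: meta information per slot, including its inverse.
# The Product/Gene pairing is an assumption made for this example.
SLOTUNITS = {
    "Product": {"domain": "Genes", "inverse": "Gene"},
    "Gene": {"domain": "Polypeptides", "inverse": "Product"},
}


class KB:
    def __init__(self):
        # frame ID -> slot name -> list of values
        self.values = defaultdict(lambda: defaultdict(list))

    def add_value(self, frame_id, slot, value):
        """Add a value and mirror it on the inverse slot, if one is declared."""
        self._add(frame_id, slot, value)
        inverse = SLOTUNITS.get(slot, {}).get("inverse")
        if inverse is not None:
            self._add(value, inverse, frame_id)

    def _add(self, frame_id, slot, value):
        # Treat the slot as a set: ignore values that are already present.
        if value not in self.values[frame_id][slot]:
            self.values[frame_id][slot].append(value)


kb = KB()
kb.add_value("trpA", "Product", "TRPA-MONOMER")
print(kb.values["trpA"]["Product"])        # ['TRPA-MONOMER']
print(kb.values["TRPA-MONOMER"]["Gene"])   # ['trpA']
```

Because the mirror write goes through `_add` rather than `add_value`, the update does not bounce back and forth between the two slots.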
Representation of Function
TCA Cycle
EC#
Keq
Succinate + FAD = fumarate + FADH2
Enzymatic-reaction
Succinate dehydrogenase
Cofactors
Inhibitors
Molecular wt
pI
Sdh-flavo
Sdh-Fe-S
Sdh-membrane-1
Sdh-membrane-2
sdhA
sdhB
sdhC
sdhD
Left-end-position
Monofunctional Monomer
Pathway
Reaction
Enzymatic-reaction
Monomer
Gene
Bifunctional Monomer
Pathway
Reaction
Reaction
Enzymatic-reaction
Enzymatic-reaction
Monomer
Gene
Monofunctional Multimer
Pathway
Reaction
Enzymatic-reaction
Multimer
Monomer
Monomer
Monomer
Monomer
Gene
Gene
Gene
Gene
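The three diagrams above differ in two respects: how many enzymatic reactions hang off the protein, and whether the protein has component monomers. A hedged sketch of telling them apart follows. The frame IDs are invented placeholders, the lowercase slot names `catalyzes` and `components` are assumed spellings of the links shown in this deck, and the wording of the labels is chosen for this example.

```python
# Hypothetical protein frames: frame ID -> {slot -> list of frame IDs}.
frames = {
    "monomer-1": {"catalyzes": ["enzrxn-1"]},
    "monomer-2": {"catalyzes": ["enzrxn-2", "enzrxn-3"]},
    "complex-1": {
        "catalyzes": ["enzrxn-4"],
        "components": ["monomer-a", "monomer-b", "monomer-c", "monomer-d"],
    },
}


def describe_enzyme(frames, protein_id):
    """Label a protein by its enzymatic-reaction count and subunit structure."""
    protein = frames[protein_id]
    count = len(protein.get("catalyzes", []))
    if count == 0:
        function = "non-catalytic"
    elif count == 1:
        function = "monofunctional"
    elif count == 2:
        function = "bifunctional"
    else:
        function = "multifunctional"
    form = "multimer" if protein.get("components") else "monomer"
    return f"{function} {form}"


for protein_id in frames:
    print(protein_id, "->", describe_enzyme(frames, protein_id))
```

The point of the enzymatic-reaction layer is visible here: one protein frame can carry several enzymatic reactions without duplicating the protein or the reactions.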
Pathway and Substrates
Reactant-1
Pathway
left
in-pathway
Reactant-2
Reaction
Product-1
Product-2
right
Reaction
Reaction
Reaction
Transcriptional Regulation
trp
apoTrpR
trpLEDCBA
Int005
site001
Int001
pro001
Int003
trpL
trpE
trpD
trpC
trpB
trpA
TrpR*trp
RpoSig70
Principal Classes
Class names are capitalized, plural, separated by dashes
Genetic-Elements, with subclasses:
Chromosomes
Plasmids
Genes
Transcription-Units
RNAs
rRNAs, snRNAs, tRNAs, Charged-tRNAs
Proteins, with subclasses:
Polypeptides
Protein-Complexes
Principal Classes
Reactions, with subclasses:
Transport-Reactions
Enzymatic-Reactions
Pathways
Compounds-And-Elements
Frame IDs of Instances
Instance frame ID conventions have evolved over time
Examples:
Pathways: TRPSYN-PWY, P23-PWY
Genes: AG10045
Monomers: TRPA-MONOMER, AG10045-MONOMER
Slots in Multiple Classes
Common-Name
Synonyms
Names (computed as union of Common-Name, Synonyms)
Comment
Citations
DB-Links
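The computed Names slot listed above can be illustrated with a small helper. This is only a sketch of the union, assuming frames are plain dictionaries of slot name to value list; the `names` function and the synonym strings are placeholders, not curated data or a Pathway Tools API.

```python
def names(frame):
    """Return Common-Name values followed by Synonyms, without duplicates."""
    combined = []
    for slot in ("Common-Name", "Synonyms"):
        for value in frame.get(slot, []):
            if value not in combined:
                combined.append(value)
    return combined


# Illustrative frame; the synonym strings are placeholders.
gene = {
    "Common-Name": ["trpA"],
    "Synonyms": ["synonym-1", "trpA", "synonym-2"],
}
print(names(gene))   # ['trpA', 'synonym-1', 'synonym-2']
```

Searching on the computed slot means a query matches an object whether the user typed its common name or any synonym.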
Genes Slots
Component-Of (links to replicon, transcription unit)
Left-End-Position
Right-End-Position
Centisome-Position
Transcription-Direction
Product
Proteins Slots
Molecular-Weight-Seq
Molecular-Weight-Exp
pI
Locations
Modified-Form
Unmodified-Form
Component-Of
Polypeptides Slots
Gene
Protein-Complexes Slots
Components
Reactions Slots
EC-Number
Left, Right
Substrates (computed as union of Left, Right)
DeltaG0
Keq
Spontaneous?
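To make the Left, Right, and Substrates slots concrete, the sketch below models the succinate dehydrogenase reaction from earlier in the deck as a dictionary. The `substrates` and `equation` helpers are hypothetical names chosen for this example, and the compound names stand in for real compound frame IDs.

```python
# Hypothetical reaction frame: slot name -> list of compound names.
reaction = {
    "Left": ["succinate", "FAD"],
    "Right": ["fumarate", "FADH2"],
}


def substrates(rxn):
    """Union of Left and Right, in order of appearance, without duplicates."""
    return list(dict.fromkeys(rxn.get("Left", []) + rxn.get("Right", [])))


def equation(rxn):
    """Render the reaction in the 'A + B = C + D' style used on the slides."""
    return " + ".join(rxn["Left"]) + " = " + " + ".join(rxn["Right"])


print(substrates(reaction))   # ['succinate', 'FAD', 'fumarate', 'FADH2']
print(equation(reaction))     # succinate + FAD = fumarate + FADH2
```

Note that Substrates covers both sides of the reaction, so it includes what a biochemist would call products as well as reactants.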
Enzymatic-Reactions Slots
Enzyme
Reaction
Activators
Inhibitors
Physiologically-Relevant
Cofactors
Prosthetic-Groups
Alternative-Substrates
Alternative-Cofactors
Pathways Slots
Reaction-List
Predecessors
Primaries
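One plausible use of Reaction-List together with Predecessors is to recover the order of steps in a pathway. The sketch below assumes each Predecessors entry is a (reaction, predecessor) pair; that encoding, the `ordered_reactions` helper, and the reaction IDs are assumptions for illustration and should be checked against the schema guide.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pathway frame. Each Predecessors entry is assumed to be a
# (reaction, predecessor) pair; the reaction IDs are made up.
pathway = {
    "Reaction-List": ["RXN-C", "RXN-A", "RXN-B"],
    "Predecessors": [("RXN-B", "RXN-A"), ("RXN-C", "RXN-B")],
}


def ordered_reactions(pwy):
    """Order the reactions so that each one follows its predecessors."""
    graph = {rxn: set() for rxn in pwy["Reaction-List"]}
    for rxn, pred in pwy["Predecessors"]:
        graph.setdefault(rxn, set()).add(pred)
    # static_order() raises graphlib.CycleError if the links form a cycle.
    return list(TopologicalSorter(graph).static_order())


print(ordered_reactions(pathway))   # ['RXN-A', 'RXN-B', 'RXN-C']
```

A cyclic pathway such as the TCA cycle has no topological order, so this approach only fits linear or branched pathways; a cycle would need a chosen starting reaction instead.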