schema-api-part1 - Bioinformatics Research Group at SRI

Computing with Pathway/Genome Databases
SRI International Bioinformatics

Approximate presentation time: 1.5 hrs

Overview
• Summary of Pathway Tools data access mechanisms and formats
• Pathway Tools APIs
• Overview of Pathway Tools schema

Motivations for Understanding the Schema
• When writing complex queries to PGDBs, those queries must refer to classes and slots within the schema
    • Queries using Lisp, Perl, Java APIs
    • Queries using Structured Advanced Query Form
    • Queries using BioVelo
• Example: find the products of all genes longer than 1,000 nucleotides
    (loop for g in (get-class-all-instances '|Genes|)
          when (< 1000 (abs (- (get-slot-value g 'left-end-position)
                               (get-slot-value g 'right-end-position))))
          collect (get-slot-value g 'product))

More Information
• Pathway Tools Web Site, Tutorial Slides
    • http://bioinformatics.ai.sri.com/ptools/
    • PTools APIs: http://brg.ai.sri.com/ptools/ptools-resources.html
    • Web services: http://biocyc.org/web-services.shtml
• Guide to the Pathway Tools Schema
    • http://biocyc.org/schema.shtml
• Curator's Guide
    • http://bioinformatics.ai.sri.com/ptools/curatorsguide.pdf

References
• Ontology Papers section of http://biocyc.org/publications.shtml
    • "An Evidence Ontology for use in Pathway/Genome Databases"
    • "An ontology for biological function based on molecular interactions"
    • "Representations of metabolic knowledge: Pathways"
    • "Representations of metabolic knowledge"

Data Exchange
• APIs: Lisp API, Java API, and Perl API
    • Read and modify access
• Web services
• Cyclone
• Export to files
    • BioPAX export (biopax.org)
    • Export PGDB genome to GenBank format
    • Export entire PGDB as column-delimited and attribute-value file formats
    • Export PGDB reactions as SBML (sbml.org)
    • Import/export of pathways between PGDBs
    • Import/export of selected frames, for spreadsheets
    • Import/export of compounds as Molfile, CML
• BioWarehouse: loader for flat files, SQL access
    • http://bioinformatics.ai.sri.com/biowarehouse/
    • BMC Bioinformatics 7:170 2006

Pathway Tools Ontology / Schema
• Ontology classes: 1621
    • Datatype classes: define objects from genomes to pathways
    • Classification systems for pathways, chemical compounds, enzymatic reactions (EC system)
    • Protein Feature ontology
    • Controlled vocabularies: Cell Component Ontology, Evidence codes
• Comprehensive set of 279 attributes and relationships

High-Level Classes in the Pathway Tools Ontology
    Chemicals              -- All molecules
    Polymer-Segments       -- Regions of polymers
    Protein-Features       -- Features on proteins
    Organisms
    Reactions              -- Biochemical reactions
    Enzymatic-Reactions    -- Link enzymes to reactions they catalyze
    Pathways               -- Metabolic and signaling pathways
    Regulation             -- Regulatory interactions
    CCO                    -- Cell Component Ontology
    Evidence               -- Evidence ontology
    Gene-Ontology-Terms    -- GO
    Growth-Observations    -- Observations of growth of organism
    Notes                  -- Timestamped, person-stamped notes
    Organizations, People
    Publications

Navigating the Schema
Use GKB Editor to Inspect the Pathway Tools Ontology
• GKB Editor = Generic Knowledge Base Editor
• Type in Navigator window: (GKB)
  or
• [Right-Click] Edit -> Ontology Editor
• View -> Browse Class Hierarchy
• [Middle-Click] to expand hierarchy
• To view classes or instances, select them and:
    • Frame -> List Frame Contents
    • Frame -> Edit Frame

Use the SAQP to Inspect the Schema
Pathway Tools Schema
• Guide to the Pathway Tools Schema
• Schema overview diagram

Principal Classes
• Class names are capitalized, plural, separated by dashes
• Genetic-Elements, with subclasses:
    • Chromosomes
    • Plasmids
• Genes
• Transcription-Units
• RNAs
    • rRNAs, snRNAs, tRNAs, Charged-tRNAs
• Proteins, with subclasses:
    • Polypeptides
    • Protein-Complexes

Principal Classes (continued)
• Reactions, with subclasses:
    • Transport-Reactions
• Enzymatic-Reactions
• Pathways
• Compounds-And-Elements

Principal Classes (continued)
• Regulation

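Slot Links
[Diagram: frames from a pathway down to its genes, connected by the slots named below]
• in-pathway: Succinate + FAD = fumarate + FADH2 -> TCA Cycle
• reaction: Enzymatic-reaction -> Succinate + FAD = fumarate + FADH2
• catalyzes: Succinate dehydrogenase -> Enzymatic-reaction
• component-of: Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2 -> Succinate dehydrogenase
• product: sdhA, sdhB, sdhC, sdhD -> the four subunits
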
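The slot links in the diagram can be read as a chain to walk from a gene up to its pathway. The following is a minimal Python sketch of that idea using a toy in-memory frame table, not the Pathway Tools API. The frame IDs ENZRXN-SDH and SUCC-FAD-RXN and the pairing of each gene with a particular subunit are assumptions made for illustration; only the slot names come from the slide.

```python
# Toy frame table: frame -> slot -> list of values.
# Slot names follow the slide; frame IDs and gene/subunit pairing are illustrative.
KB = {
    "sdhA": {"product": ["Sdh-flavo"]},
    "sdhB": {"product": ["Sdh-Fe-S"]},
    "sdhC": {"product": ["Sdh-membrane-1"]},
    "sdhD": {"product": ["Sdh-membrane-2"]},
    "Sdh-flavo": {"component-of": ["Succinate-dehydrogenase"]},
    "Sdh-Fe-S": {"component-of": ["Succinate-dehydrogenase"]},
    "Sdh-membrane-1": {"component-of": ["Succinate-dehydrogenase"]},
    "Sdh-membrane-2": {"component-of": ["Succinate-dehydrogenase"]},
    "Succinate-dehydrogenase": {"catalyzes": ["ENZRXN-SDH"]},
    "ENZRXN-SDH": {"reaction": ["SUCC-FAD-RXN"]},
    "SUCC-FAD-RXN": {"in-pathway": ["TCA-Cycle"]},
    "TCA-Cycle": {},
}


def get_slot_values(frame, slot):
    """Return all values of frame.slot, or an empty list if there are none."""
    return KB.get(frame, {}).get(slot, [])


def follow(frame, slots):
    """Follow a chain of slots from one frame and return the frames reached."""
    frontier = [frame]
    for slot in slots:
        frontier = [v for f in frontier for v in get_slot_values(f, slot)]
    return frontier


GENE_TO_PATHWAY = ["product", "component-of", "catalyzes", "reaction", "in-pathway"]

print(follow("sdhA", GENE_TO_PATHWAY))  # ['TCA-Cycle']
```

Each hop is one slot lookup, which is why a query that crosses several levels of the schema needs to know every slot name along the way.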
Programmatic Access to BioCyc
• Common Lisp
    • Native language of Pathway Tools
    • Interactive and mature environment
    • Full access to the data and many utility functions
    • Source code is available to academics
• PerlCyc
    • API of functions, exposed to Perl
    • Communication through Unix socket
• JavaCyc
    • API of functions, exposed to Java
    • Communication through Unix socket
• Cyclone

Cyclone
• Developed by Schachter and colleagues from Genoscope
    • http://nemo-cyclone.sourceforge.net/archi.php
• Cyclone is a Java-based system that:
    • Extracts data from a Pathway Tools PGDB
    • Converts it to an XML schema
    • Maps the data to Java objects and to a relational database
• Changes made to the data on the Java side can be committed back to a Pathway Tools PGDB

Lisp API
• Accessible whenever you start Pathway Tools with the -lisp argument
• Lisp queries evaluate against the running Pathway Tools binary and execute very fast

Ocelot Object Database
Pathway Tools Implementation Details
• Platforms: Macintosh, PC/Linux, and PC/Windows
• Same binary can run as desktop app or Web server
• Production-quality software
    • Version control
    • Two regular releases per year
    • Extensive quality assurance
    • Extensive documentation
    • Auto-patch
    • Automatic DB-upgrade
• 600,000 lines of Lisp code

Pathway Tools Architecture
[Diagram: layered architecture, top to bottom]
• Pathway/Genome Navigator (Web Mode and Desktop Mode); Protein Editor, Pathway Editor, Reaction Editor
• Lisp, Perl, Java access
• GFP API
• Ocelot DBMS
• Storage: disk file, or Oracle or MySQL

Ocelot Object Database
• Frame data model
    • Classes, instances, inheritance
    • Frames have slots that define their properties, attributes, relationships
    • A slot has one or more values
    • Datatypes include numbers, strings, etc.
• Slotunit frames define metadata about slots:
    • Domain, range, inverse
    • Collection type, number of values, value constraints

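To make the frame data model concrete, here is a minimal Python sketch, assuming a much-simplified design that is not how Ocelot is implemented. It shows frames with multi-valued slots, inheritance through parent classes, and a slotunit-like record that carries a slot's domain, value type, inverse, and maximum number of values. The class names SlotUnit, Frame, and KB, and the frames MONOMER-1 and COMPLEX-1, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SlotUnit:
    """Metadata about one slot, loosely modelled on a slotunit frame."""
    name: str
    domain: str                        # class whose frames may carry the slot
    value_type: type                   # datatype each value must have
    inverse: Optional[str] = None      # slot maintained on the value frame
    max_values: Optional[int] = None   # None means any number of values


@dataclass
class Frame:
    name: str
    parents: list = field(default_factory=list)   # direct classes
    slots: dict = field(default_factory=dict)     # slot name -> list of values


class KB:
    def __init__(self):
        self.frames = {}
        self.slotunits = {}

    def add_frame(self, name, parents=()):
        self.frames[name] = Frame(name, list(parents))

    def is_a(self, name, cls):
        """True if frame `name` is `cls` or inherits from it."""
        if name == cls:
            return True
        return any(self.is_a(p, cls) for p in self.frames[name].parents)

    def add_slot_value(self, name, slot, value):
        unit = self.slotunits[slot]
        if not self.is_a(name, unit.domain):
            raise ValueError(f"{slot} is not defined for {name}")
        if not isinstance(value, unit.value_type):
            raise TypeError(f"{slot} expects {unit.value_type.__name__}")
        values = self.frames[name].slots.setdefault(slot, [])
        if unit.max_values is not None and len(values) >= unit.max_values:
            raise ValueError(f"{slot} allows at most {unit.max_values} value(s)")
        values.append(value)
        if unit.inverse is not None:
            # Keep the inverse link in step; the value is assumed to name a frame.
            self.frames[value].slots.setdefault(unit.inverse, []).append(name)


kb = KB()
kb.add_frame("Proteins")
kb.add_frame("Polypeptides", parents=["Proteins"])
kb.add_frame("MONOMER-1", parents=["Polypeptides"])
kb.add_frame("COMPLEX-1", parents=["Proteins"])
kb.slotunits["molecular-weight-kd"] = SlotUnit(
    "molecular-weight-kd", "Proteins", float, max_values=1)
kb.slotunits["component-of"] = SlotUnit(
    "component-of", "Proteins", str, inverse="components")

kb.add_slot_value("MONOMER-1", "molecular-weight-kd", 12.5)   # made-up value
kb.add_slot_value("MONOMER-1", "component-of", "COMPLEX-1")
```

Because the constraints live in the slotunit rather than in the calling code, every editor or API that writes a slot value is checked the same way.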
Storage System Architecture
• File KBs
• Read-only applications can be distributed without a relational DBMS
    • Load all objects and code into Lisp memory
    • Dump virtual memory to binary executable file

Ocelot Storage System Architecture
• Persistent storage via disk files, MySQL or Oracle DBMS
    • Concurrent development: MySQL or Oracle
    • Single-user development: disk files
• Relational DBMS storage
    • RDBMS is submerged within Ocelot, invisible to users
    • Frames transferred from RDBMS to Ocelot
        • On demand
        • By background prefetcher
    • Memory cache
    • Persistent disk cache to speed performance via Internet

Transaction Logging
• Relational DBMS stores
    • The latest version of each Ocelot frame
    • A log of all GFP operations applied to KB
• Transaction log enables:
    • Reconstruction of earlier versions of KB
    • View history of changes to an object
    • Update replicates of a KB
    • Detection of update conflicts during concurrency control
    • Undo of updates

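The benefits listed for the transaction log follow from one idea: if every operation is recorded, any earlier state can be rebuilt by replaying a prefix of the log. Below is a minimal Python sketch of that idea, assuming a log of simple add/remove operations. The class LoggedKB and its method names are hypothetical and do not reflect Ocelot's actual storage format.

```python
class LoggedKB:
    """Toy KB that keeps the latest state plus a log of every operation."""

    def __init__(self):
        self.frames = {}   # latest version: frame -> slot -> list of values
        self.log = []      # (kind, frame, slot, value) tuples, in order applied

    @staticmethod
    def _apply(state, op):
        kind, frame, slot, value = op
        values = state.setdefault(frame, {}).setdefault(slot, [])
        if kind == "add":
            values.append(value)
        elif kind == "remove":
            values.remove(value)

    def add_slot_value(self, frame, slot, value):
        op = ("add", frame, slot, value)
        self._apply(self.frames, op)
        self.log.append(op)

    def remove_slot_value(self, frame, slot, value):
        op = ("remove", frame, slot, value)
        self._apply(self.frames, op)
        self.log.append(op)

    def version(self, n):
        """Rebuild the KB as it was after the first n logged operations."""
        state = {}
        for op in self.log[:n]:
            self._apply(state, op)
        return state

    def history(self, frame):
        """All logged operations that touched one frame."""
        return [op for op in self.log if op[1] == frame]

    def undo(self):
        """Drop the most recent operation and rebuild the current state."""
        self.log.pop()
        self.frames = self.version(len(self.log))


kb = LoggedKB()
kb.add_slot_value("TRP", "SYNONYMS", "W")
kb.add_slot_value("TRP", "SYNONYMS", "trp")
kb.remove_slot_value("TRP", "SYNONYMS", "W")
```

Replaying the whole log on every undo is the simplest correct choice for a sketch; a real system would more likely apply an inverse operation or keep snapshots.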
Optimistic Concurrency Control
• Locking approach: edits to one object can require locking all connected objects
• Optimistic approach: no locking
    • User performs updates in local workspace
    • When user commits changes, storage system compares user changes against all other committed changes

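The commit-time comparison can be sketched in a few lines of Python. This is a minimal illustration, assuming conflicts are detected per frame and slot with a simple commit counter; the classes Store, Workspace, and ConflictError are hypothetical, and the real Ocelot comparison uses its transaction log rather than this scheme.

```python
class ConflictError(Exception):
    """Raised when a commit clashes with a change committed by someone else."""


class Store:
    def __init__(self):
        self.data = {}       # (frame, slot) -> value
        self.versions = {}   # (frame, slot) -> commit number of the last change
        self.clock = 0       # number of commits so far

    def begin(self):
        return Workspace(self)


class Workspace:
    def __init__(self, store):
        self.store = store
        self.start = store.clock   # commits up to here were visible at start
        self.changes = {}

    def put(self, frame, slot, value):
        # Local only: nothing is locked and the shared store is untouched.
        self.changes[(frame, slot)] = value

    def commit(self):
        # Compare local changes against everything committed since we started.
        clashes = [key for key in self.changes
                   if self.store.versions.get(key, 0) > self.start]
        if clashes:
            raise ConflictError(clashes)
        self.store.clock += 1
        for key, value in self.changes.items():
            self.store.data[key] = value
            self.store.versions[key] = self.store.clock


store = Store()
alice, bob = store.begin(), store.begin()
alice.put("TRP", "COMMON-NAME", "L-tryptophan")
bob.put("TRP", "COMMON-NAME", "tryptophan")
alice.commit()            # succeeds: nobody committed since alice started
try:
    bob.commit()          # fails: alice changed the same slot in the meantime
    bob_committed = True
except ConflictError:
    bob_committed = False
```

The trade-off is that users never wait for locks, but a user whose commit conflicts has to reconcile the changes and try again.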
Ocelot Knowledge Server Schema Evolution
• FRSs (frame representation systems) store and process class and instance information similarly
    • Application can query schema information as easily as it can query instances
• Schema is stored within the DB
    • Schema is self-documenting
• Schema evolution facilitated by
    • Easy addition/removal of slots, or alteration of slot datatypes
    • Flexible data formats that do not require dumping/reloading of data

Generic Frame Protocol (GFP)
• A library of procedures for accessing Ocelot DBs
• GFP specification:
    • http://www.ai.sri.com/~gfp/spec/paper/paper.html
• A small number of GFP functions are sufficient for most complex queries

Example of a Single GFP Call
The general pattern:
    gfp-function(frame slot value ...)
In Lisp:
    (gfp-function frame slot value ...)
Example:
    (get-slot-values 'TRYPSYN-RXN 'LEFT)
    ==> (INDOLE-3-GLYCEROL-P SER)

Frame References
• At the GFP level, every Ocelot frame can be referred to using either its symbol frame name or its frame object
• Most GFP functions return frame objects
• It is therefore important to use fequal for comparisons, so that a frame name and a frame object for the same frame compare as equal

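Why an ordinary equality test goes wrong can be shown with a small Python analogy. This is a sketch only: the Frame class and this fequal are stand-ins written for illustration, assuming that a frame's identity is its name, and they are not the Lisp implementation.

```python
class Frame:
    """Stand-in for a frame object; its identity is its name."""

    def __init__(self, name):
        self.name = name


def frame_name(thing):
    """Accept either a frame object or a frame name and return the name."""
    return thing.name if isinstance(thing, Frame) else str(thing)


def fequal(a, b):
    """True when a and b refer to the same frame, by name or by object."""
    return frame_name(a) == frame_name(b)


# A query typically returns frame objects...
regulators = [Frame("ATP"), Frame("ADP")]

# ...so comparing against a bare name with the default test finds nothing,
naive_hit = "ATP" in regulators

# while comparing with fequal finds the frame.
fequal_hit = any(fequal("ATP", r) for r in regulators)

print(naive_hit, fequal_hit)  # False True
```

The same pitfall appears in Lisp when member or find is called on a list of frames without :test #'fequal.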
Generic Frame Protocol
• get-class-all-instances (Class)
    • Returns direct and indirect instances of Class
• coercible-to-frame-p (Thing)
    • Is Thing a frame? Returns True if Thing is the name of a frame, or a frame object; else False

Generic Frame Protocol
• Notation Frame.Slot means a specified slot of a specified frame. Note: Slot must be a symbol!
• get-slot-value(Frame Slot)
    • Returns first value of Frame.Slot
• get-slot-values(Frame Slot)
    • Returns all values of Frame.Slot as a list
• slot-has-value-p(Frame Slot)
    • Returns True if Frame.Slot has at least one value; else False
• member-slot-value-p(Frame Slot Value)
    • Returns True if Value is one of the values of Frame.Slot; else False
• instance-all-instance-of-p(Instance Class)
    • Returns True if Instance is an all-instance of Class

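The behaviour of these read functions can be mimicked over a toy in-memory table. The Python sketch below is an illustration of the semantics described on the slides, not the GFP implementation. The TRYPSYN-RXN values come from the earlier example; the intermediate class Small-Molecule-Reactions and the table layout are assumptions.

```python
# Toy schema and data; only TRYPSYN-RXN's LEFT values come from the slides.
SUPERCLASSES = {"Reactions": [], "Small-Molecule-Reactions": ["Reactions"]}
INSTANCE_TYPES = {"TRYPSYN-RXN": ["Small-Molecule-Reactions"]}
SLOTS = {"TRYPSYN-RXN": {"LEFT": ["INDOLE-3-GLYCEROL-P", "SER"]}}


def get_slot_values(frame, slot):
    """All values of frame.slot as a list (empty if there are none)."""
    return list(SLOTS.get(frame, {}).get(slot, []))


def get_slot_value(frame, slot):
    """First value of frame.slot, or None when the slot is empty."""
    values = get_slot_values(frame, slot)
    return values[0] if values else None


def slot_has_value_p(frame, slot):
    return bool(get_slot_values(frame, slot))


def member_slot_value_p(frame, slot, value):
    return value in get_slot_values(frame, slot)


def subclass_of_p(cls, ancestor):
    """True if cls is ancestor or inherits from it, directly or indirectly."""
    return cls == ancestor or any(
        subclass_of_p(parent, ancestor) for parent in SUPERCLASSES.get(cls, []))


def instance_all_instance_of_p(instance, cls):
    """True if instance is a direct or indirect instance of cls."""
    return any(subclass_of_p(t, cls) for t in INSTANCE_TYPES.get(instance, []))


def get_class_all_instances(cls):
    """Direct and indirect instances of cls."""
    return [i for i in INSTANCE_TYPES if instance_all_instance_of_p(i, cls)]


print(get_slot_values("TRYPSYN-RXN", "LEFT"))  # ['INDOLE-3-GLYCEROL-P', 'SER']
```

Note the difference between the singular and plural getters: get_slot_value silently returns only the first value, which is easy to overlook on multi-valued slots.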
Generic Frame Protocol
• print-frame(Frame)
    • Prints the contents of Frame

Generic Frame Protocol – Update Operations
• put-slot-value(Frame Slot Value)
    • Replace the current value(s) of Frame.Slot with Value
• put-slot-values(Frame Slot Value-List)
    • Replace the current value(s) of Frame.Slot with Value-List, which must be a list of values
• add-slot-value(Frame Slot Value)
    • Add Value to the current value(s) of Frame.Slot, if any
• remove-slot-value(Frame Slot Value)
    • Remove Value from the current value(s) of Frame.Slot
• replace-slot-value(Frame Slot Old-Value New-Value)
    • In Frame.Slot, replace Old-Value with New-Value
• remove-local-slot-values(Frame Slot)
    • Remove all of the values of Frame.Slot

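The difference between the update operations is easiest to see by applying them in sequence to one slot. The following Python sketch mimics the described semantics on a plain dictionary; it is an illustration under that assumption and says nothing about how GFP handles duplicates, inverses, or constraints.

```python
SLOTS = {}   # frame -> slot -> list of values


def _values(frame, slot):
    """Return the (possibly new, empty) value list of frame.slot."""
    return SLOTS.setdefault(frame, {}).setdefault(slot, [])


def put_slot_value(frame, slot, value):
    """Replace the current value(s) with a single value."""
    SLOTS.setdefault(frame, {})[slot] = [value]


def put_slot_values(frame, slot, value_list):
    """Replace the current value(s) with a list of values."""
    SLOTS.setdefault(frame, {})[slot] = list(value_list)


def add_slot_value(frame, slot, value):
    """Add one value to whatever is already there."""
    _values(frame, slot).append(value)


def remove_slot_value(frame, slot, value):
    """Remove one value if it is present."""
    values = _values(frame, slot)
    if value in values:
        values.remove(value)


def replace_slot_value(frame, slot, old_value, new_value):
    """Swap old_value for new_value, keeping its position."""
    values = _values(frame, slot)
    values[:] = [new_value if v == old_value else v for v in values]


def remove_local_slot_values(frame, slot):
    """Remove every value of the slot."""
    SLOTS.get(frame, {}).pop(slot, None)


put_slot_values("TRP", "SYNONYMS", ["W", "trp"])
add_slot_value("TRP", "SYNONYMS", "tryptophan")
replace_slot_value("TRP", "SYNONYMS", "trp", "Trp")
remove_slot_value("TRP", "SYNONYMS", "W")
print(SLOTS["TRP"]["SYNONYMS"])  # ['Trp', 'tryptophan']
```

In a real PGDB none of these changes persist until the KB is saved.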
Generic Frame Protocol – Update Operations
• save-kb
    • Saves the current KB

Additional Pathway Tools Functions – Semantic Inference Layer
• Semantic inference layer defines built-in functions to compute commonly required relationships in a PGDB
    • http://bioinformatics.ai.sri.com/ptools/ptools-fns.html

PerlCyc and JavaCyc
• Work on Unix (Solaris or Linux) only
• Start up Pathway Tools with the -api arg
• Pathway Tools listens on a Unix socket; the Perl program communicates through this socket
• Supports both querying and editing PGDBs
• Must run Perl or Java program on the same machine that runs Pathway Tools
    • This is a security measure, as the API server has no built-in security
• Can only handle one connection at a time

Obtaining PerlCyc and JavaCyc
• Download from http://www.sgn.cornell.edu/downloads/
• PerlCyc written and maintained by Lukas Mueller at Boyce Thompson Institute for Plant Research.
• JavaCyc written by Thomas Yan at Carnegie Institute, maintained by Lukas Mueller.
• Easy to extend…

Examples of PerlCyc, JavaCyc Functions
GFP functions (require knowledge of Pathway Tools schema):
    PerlCyc                    JavaCyc
    get_slot_values            getSlotValues
    get_class_all_instances    getClassAllInstances
    put_slot_values            putSlotValues
Pathway Tools functions (described at http://bioinformatics.ai.sri.com/ptools/ptools-fns.html):
    PerlCyc                    JavaCyc
    genes_of_reaction          genesOfReaction
    find_indexed_frame         findIndexedFrame
    pathways_of_gene           pathwaysOfGene
    transport_p                transportP

Writing a PerlCyc or JavaCyc program
• Create a PerlCyc or JavaCyc object:
    perlcyc -> new ("ORGID")
    new Javacyc ("ORGID")
• Call PerlCyc or JavaCyc functions on this object:
    my $cyc = perlcyc -> new ("ECOLI");
    my @pathways = $cyc -> all_pathways ();

    Javacyc cyc = new Javacyc("ECOLI");
    ArrayList pathways = cyc.allPathways ();
• Functions return object IDs, not objects.
    • Must connect to server again to retrieve attributes of an object.
    foreach my $p (@pathways) {
        print $cyc -> get_slot_value ($p, "COMMON-NAME");
    }

    for (int i = 0; i < pathways.size(); i++) {
        String pwy = (String) pathways.get(i);
        System.out.println (cyc.getSlotValue (pwy, "COMMON-NAME"));
    }

Sample PerlCyc Query
• Number of proteins in E. coli
    use perlcyc;
    my $cyc = perlcyc -> new ("ECOLI");
    my @proteins = $cyc->get_class_all_instances("|Proteins|");
    my $protein_count = scalar(@proteins);
    print "Protein count: $protein_count.\n";

Sample PerlCyc Query
• Print IDs of all proteins with molecular weight between 10 and 20 kD and pI between 4 and 5.
    use perlcyc;
    my $cyc = perlcyc -> new ("ECOLI");
    foreach my $p ($cyc->get_class_all_instances("|Proteins|")) {
        my $mw = $cyc->get_slot_value($p, "molecular-weight-kd");
        my $pI = $cyc->get_slot_value($p, "pi");
        if ($mw <= 20 && $mw >= 10 && $pI <= 5 && $pI >= 4) {
            print "$p\n";
        }
    }

Sample PerlCyc Query
• List all the transcription factors in E. coli, and the list of genes that each regulates:
    use perlcyc;
    my $cyc = perlcyc -> new ("ECOLI");
    foreach my $p ($cyc->get_class_all_instances("|Proteins|")) {
        if ($cyc->transcription_factor_p($p)) {
            my $name = $cyc->get_slot_value($p, "common-name");
            my %genes = ();
            foreach my $tu ($cyc->regulon_of_protein($p)) {
                foreach my $g ($cyc->transcription_unit_genes($tu)) {
                    $genes{$g} = $cyc->get_slot_value($g, "common-name");
                }
            }
            print "\n\n$name: ";
            print join " ", values %genes;
        }
    }

Sample Editing Using PerlCyc
• Add a link from each gene to the corresponding object in MY-DB (assume ID is same in both cases)
    use perlcyc;
    my $cyc = perlcyc -> new ("HPY");
    my @genes = $cyc->get_class_all_instances ("|Genes|");
    foreach my $g (@genes) {
        $cyc->add_slot_value ($g, "DBLINKS", "(MY-DB \"$g\")");
    }
    $cyc->save_kb();

Sample JavaCyc Query: Enzymes for which ATP is a regulator
    import java.util.*;

    public class JavacycSample {
        public static void main(String[] args) {
            Javacyc cyc = new Javacyc("ECOLI");
            ArrayList regframes =
                cyc.getClassAllInstances("|Regulation-of-Enzyme-Activity|");
            for (int i = 0; i < regframes.size(); i++) {
                String reg = (String) regframes.get(i);
                // Does the Regulator slot contain the compound ATP?
                boolean bool = cyc.memberSlotValueP(reg, "Regulator", "ATP");
                if (bool) {
                    String enzrxn = cyc.getSlotValue(reg, "Regulated-Entity");
                    String enzyme = cyc.getSlotValue(enzrxn, "Enzyme");
                    System.out.println(enzyme);
                }
            }
        }
    }

Simple Lisp Query Example: Enzymes for which ATP is a regulator (inhibitor)
    (defun atp-inhibits ()
      (loop for x in (get-class-all-instances '|Regulation-of-Enzyme-Activity|)
            ;; Does the Regulator slot contain the compound ATP, and is the mode
            ;; of regulation negative (inhibition)?
            when (and (member-slot-value-p x 'Regulator 'ATP)
                      (member-slot-value-p x 'Mode "-"))
            ;; Whenever the test is positive, we collect the value of the slot Enzyme
            ;; of the Regulated-Entity of the regulatory interaction frame.
            ;; The collected values are returned as a list, once the loop terminates.
            collect (get-slot-value (get-slot-value x 'Regulated-Entity) 'Enzyme)))

    ;;; invoking the query:
    (select-organism :org-id 'ECOLI)
    (atp-inhibits)

Simple Perl Query Example: Enzymes for which ATP is a regulator (inhibitor)
    use perlcyc;
    my $cyc = perlcyc -> new("ECOLI");
    my @regs = $cyc -> get_class_all_instances("|Regulation-of-Enzyme-Activity|");
    ## We check every instance of the class
    foreach my $reg (@regs) {
        ## We test whether the Regulator slot contains the compound frame ATP
        ## and whether the mode of regulation is negative (inhibition)
        my $bool1 = $cyc -> member_slot_value_p($reg, "Regulator", "ATP");
        my $bool2 = $cyc -> member_slot_value_p($reg, "Mode", "-");
        if ($bool1 && $bool2) {
            ## Whenever the test is positive, we collect the value of the slot Enzyme.
            ## The results are printed in the terminal.
            my $enzrxn = $cyc -> get_slot_value($reg, "Regulated-Entity");
            my $enz = $cyc -> get_slot_value($enzrxn, "Enzyme");
            print STDOUT "$enz\n";
        }
    }

Getting started with Lisp
• pathway-tools -lisp
• (load "file") (compile-file "file.lisp")
• Emacs is a useful editor
• Pathway Tools source code is available: ask
• Overview of Lisp information resources:
    • http://bioinformatics.ai.sri.com/ptools/ptools-resources.html
• Documented Pathway Tools Lisp functions:
    • http://brg.ai.sri.com/ptools/ptools-fns.html

Viewing Results via the Answer List
    (loop for r in (get-class-all-instances '|Reactions|)
          when (< 3 (length (get-slot-values r 'left)))
          collect r)
    (setq answer *)
    (object-table answer)
    (replace-answer-list answer)
    (pt)
• Next Answer

Query Gotchas
• Study schema carefully
• When comparing frames, use :test #'fequal
• Cascade of slot-values: check for NIL at each step

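The third gotcha is about chained lookups such as "the Enzyme of the Regulated-Entity of a regulation frame": if an intermediate slot is empty, the next lookup is applied to nothing. The Python sketch below shows one way to guard such a cascade, using None in place of NIL. The frame IDs and the helper name cascade are hypothetical.

```python
# Toy data: REG-2 has no Regulated-Entity, so the cascade must stop early.
SLOTS = {
    "REG-1": {"Regulated-Entity": ["ENZRXN-1"]},
    "ENZRXN-1": {"Enzyme": ["PROTEIN-1"]},
    "REG-2": {},
}


def get_slot_value(frame, slot):
    """First value of frame.slot, or None when there is no value."""
    values = SLOTS.get(frame, {}).get(slot, [])
    return values[0] if values else None


def cascade(frame, *slots):
    """Follow slots one after another; return None as soon as a value is missing."""
    for slot in slots:
        if frame is None:
            return None
        frame = get_slot_value(frame, slot)
    return frame


enzymes = []
for reg in ("REG-1", "REG-2"):
    enzyme = cascade(reg, "Regulated-Entity", "Enzyme")
    if enzyme is not None:          # skip frames where the chain is broken
        enzymes.append(enzyme)

print(enzymes)  # ['PROTEIN-1']
```

In Lisp the equivalent guard is usually an extra binding in the loop followed by a test such as when (and enzrxn ...), as in the pyridoxal phosphate example later in the deck.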
Semantic Inference Layer
• relationships.lisp
• Library of functions that encapsulate common query building blocks and intricacies of navigating the schema
    • enzymes-of-gene
    • reactions-of-gene
    • pathways-of-gene
    • genes-of-pathway
    • pathway-hole-p
    • reactions-of-compound
    • top-containers(protein)
    • all-rxns(type), where type is one of :metab-smm, :metab-all, :metab-pathways, :enzyme, :transport, etc.
      Example: (all-rxns :metab-pathways)

Pathway Tools Schema and Semantic Inference Layer
Genes, Operons, and Replicons

Representing a Genome
[Diagram: ORG is linked by its genome slot to CHROM1, CHROM2, and PLASMID1; the replicons are linked by their components slot to Gene1, Gene2, and Gene3; Gene1 is linked by its product slot to Product1]
Classes:
• ORG is of class Organisms
• CHROM1 is of class Chromosomes
• PLASMID1 is of class Plasmids
• Gene1 is of class Genes
• Product1 is of class Polypeptides or RNAs

Polynucleotides
Review slots of COLI and of COLI-K12
Genetic-Elements
• Sequence is stored in a separate file or database table

Polymer-Segments
Review slots of Genes
Complexities of Gene / Gene-Product Relationships
• The Product of a gene can be an instance of Polypeptides or RNAs
• An instance of Polypeptides can have more than one gene encoding it
• Sequence position:
    • Nucleotide positions of starting and ending codons specified in Left-End-Position and Right-End-Position (usually greater, except at origin)
    • Transcription-Direction: + / -
• Alternative splicing:
    • Nucleotide positions of starting and ending codons specified in Left-End-Position and Right-End-Position
    • Intron positions specified in Splice-Form-Introns of gene product, e.g. (200 300) (350 400)

Gene Reaction Schematic
Exercises
• Find all genes on a given chromosome
• Find all ribosomal RNAs
• Find the DNA sequence of a given gene
• Find all proteins longer than 1,000 amino acids

Exercises
• Find all genes on a given chromosome
    (defun genes-of-chrom (chrom)
      (loop for x in (get-slot-values chrom 'components)
            when (instance-all-instance-of-p x '|Genes|)
            collect x))
• Find all ribosomal RNAs
    (get-class-all-instances '|rRNAs|)
• Find the DNA sequence of a given gene
    (get-gene-sequence gene)

Exercises
• Find all monomers whose genes are longer than 1,000 nucleotides
    (loop for g in (get-class-all-instances '|Genes|)
          for p = (get-slot-value g 'product)
          when (and (< 1000 (abs (- (get-slot-value g 'left-end-position)
                                    (get-slot-value g 'right-end-position))))
                    (instance-all-instance-of-p p '|Polypeptides|))
          collect p)

Proteins
Proteins and Protein Complexes
• Polypeptide: the monomer protein product of a gene (may have multiple isoforms, as indicated at gene level)
• Protein complex: proteins consisting of multiple polypeptides or protein complexes
• Example: DNA pol III
    • DnaE is a polypeptide
    • pol III core is DnaE and two other polypeptides
    • pol III holoenzyme is several protein complexes combined

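Because a complex can contain other complexes, counting the polypeptides in it is a recursive walk over components, multiplying the coefficients along the way. The Python sketch below illustrates this with invented frame names and coefficients; it does not describe the real composition of DNA pol III, and the function name monomers_of is hypothetical.

```python
# complex -> list of (component, coefficient); all names and numbers are invented.
COMPONENTS = {
    "COMPLEX-AB2": [("POLYPEPTIDE-A", 1), ("POLYPEPTIDE-B", 2)],
    "COMPLEX-TOP": [("COMPLEX-AB2", 2), ("POLYPEPTIDE-C", 1)],
}


def monomers_of(protein, multiplier=1, totals=None):
    """Return {polypeptide: copies} for a polypeptide or a (nested) complex."""
    totals = {} if totals is None else totals
    parts = COMPONENTS.get(protein)
    if not parts:
        # No components recorded: treat the frame as a polypeptide.
        totals[protein] = totals.get(protein, 0) + multiplier
        return totals
    for component, coefficient in parts:
        # Copies multiply as we descend through nested complexes.
        monomers_of(component, multiplier * coefficient, totals)
    return totals


print(monomers_of("COMPLEX-TOP"))
# {'POLYPEPTIDE-A': 2, 'POLYPEPTIDE-B': 4, 'POLYPEPTIDE-C': 1}
```

The same walk in the opposite direction, following component-of upward, gives the top-level containers of a polypeptide.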
Protein Complex Relationships
Slots of a protein (DnaE)
• catalyzes
    • Is it an activator/reactant/etc?
• comments
• component-of
• dblinks
• features (edited in feature editor)
• Many other features possible

A complex at the frame level (pol III)
• Same features as polypeptide frame, different use
• comment
• component-of and components
    • note coefficients

Protein Complex Relationships
Relationships are Defined in Many Places
• component-of comes from creating a complex
• appears-in-left-side-of comes from defining a reaction (as do modified forms)
• inhibitor-of comes from an enzymatic reaction
• can only edit dna-footprint if protein has been associated with a TU

Semantic Inference Layer
• Reactions-of-protein (prot)
    • Returns a list of rxns this protein catalyzes
• Transcription-units-of-proteins (prot)
    • Returns a list of TUs activated/inhibited by the given protein
• Transporter? (prot)
    • Is this protein a transporter?
• Polypeptide-or-homomultimer? (prot)
• Transcription-factor? (prot)
• Obtain-protein-stats
    • Returns 5 values
    • Length of: all-polypeptides, complexes, transporters, enzymes, etc.

Example
• Find all enzymes that use pyridoxal phosphate as a cofactor or prosthetic group
    (loop for protein in (get-class-all-instances '|Proteins|)
          for enzrxn = (get-slot-value protein 'enzymatic-reaction)
          when (and enzrxn
                    (or (member-slot-value-p enzrxn 'cofactors 'pyridoxal_phosphate)
                        (member-slot-value-p enzrxn 'prosthetic-groups 'pyridoxal_phosphate)))
          collect protein)
• (member-slot-value-p frame slot value): T if Value is one of the values of Slot of Frame.

Example Queries
• Find all homomultimers
• Find proteins whose pI > 10, and that reside on the negative strand of the first chromosome

Sample
• Find all proteins without a comment anywhere

Compounds / Reactions / Pathways
Compounds / Reactions / Pathways
• Think of a three-tiered structure:
    • Reactions built on top of compounds
    • Pathways built on top of reactions
• Metabolic network defined by reactions alone; pathways are an additional "optional" structure
• Some reactions not part of a pathway
• Some reactions have no attached enzyme
• Some enzymes have no attached gene

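The three tiers and the gaps between them can be modelled with a few dictionaries. The Python sketch below uses invented IDs (CPD-A, RXN-1, PWY-1, and so on) and a made-up table layout to show that the network is derivable from reactions alone, and that a query has to allow for reactions with no pathway, reactions with no enzyme, and enzymes with no gene.

```python
# All IDs are invented for illustration.
REACTIONS = {                       # reaction -> (left side, right side)
    "RXN-1": (["CPD-A"], ["CPD-B"]),
    "RXN-2": (["CPD-B"], ["CPD-C"]),
    "RXN-3": (["CPD-C"], ["CPD-D"]),
}
PATHWAYS = {"PWY-1": ["RXN-1", "RXN-2"]}                  # RXN-3 is in no pathway
ENZYMES_OF = {"RXN-1": ["ENZ-1"], "RXN-3": ["ENZ-3"]}     # RXN-2 has no enzyme
GENES_OF = {"ENZ-1": ["GENE-1"]}                          # ENZ-3 has no gene


def network_edges():
    """Substrate -> product edges, computed from reactions alone."""
    return sorted((s, p)
                  for left, right in REACTIONS.values()
                  for s in left for p in right)


def reactions_outside_pathways():
    in_pathway = {r for rxns in PATHWAYS.values() for r in rxns}
    return sorted(set(REACTIONS) - in_pathway)


def reactions_without_enzyme():
    return sorted(r for r in REACTIONS if not ENZYMES_OF.get(r))


def enzymes_without_gene():
    enzymes = {e for es in ENZYMES_OF.values() for e in es}
    return sorted(e for e in enzymes if not GENES_OF.get(e))


print(reactions_outside_pathways())  # ['RXN-3']
print(reactions_without_enzyme())    # ['RXN-2']
print(enzymes_without_gene())        # ['ENZ-3']
```

A query that starts from pathways will therefore miss RXN-3, which is one reason to decide up front which tier a query should start from.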
Compounds
Compounds
• Relatively few aspects of a compound defined within the compound editor
    • MW, formula calculated from edited structure
• Most aspects defined in other editors
    • "Pathway reactions" comes from reaction editing followed by pathway editing
    • Activator, etc. come from the enzymatic reaction editor

-- Instance TRP --
Types: |Amino-Acid|, |Aromatic-Amino-Acids|, |Non-polar-amino-acids|
APPEARS-IN-LEFT-SIDE-OF: RXN0-287, TRANS-RXN-76, TRYPTOPHAN-RXN, TRYPTOPHAN--TRNA-LIGASE-RXN
APPEARS-IN-RIGHT-SIDE-OF: RXN0-2382, RXN0-301, TRANS-RXN-76, TRYPSYN-RXN
CHEMICAL-FORMULA: (C 11), (H 12), (N 2), (O 2)
COMMON-NAME: "L-tryptophan"
DBLINKS: (LIGAND-CPD "C00078" NIL |kaipa| 3311532640 NIL NIL), (CAS "6912-86-3"), (CAS "73-22-3")
NAMES: "L-tryptophan", "W", "tryptacin", "trofan", "trp", "tryptophan", "2-amino-3-indolylpropanic acid"
SMILES: "c1(c(CC(N)C(=O)O)c2(c([nH]1)cccc2))"
SYNONYMS: "W", "tryptacin", "trofan", "trp", "tryptophan", "2-amino-3-indolylpropanic acid"

Where is diphosphate in the ontology?

Semantic Inference Layer
• Reactions-of-compound (cpd)
• Pathways-of-compound (cpd)
• Is-substrate-an-autocatalytic-enzyme-p (cpd)
• Activated/inhibited-by? (cpds slots)
    • Returns a list of enzrxns for which a cpd in cpds is a modulator (example slots: activators-all, activators-allosteric)
• All-substrates (rxns)
    • All unique substrates specified in the given rxns
• Has-structure-p (cpd)
• Obtain-cpd-stats
    • Returns two values
    • Length of: all-cpds, cpds with structures

Miscellaneous things…
• History List
    • Back/Forward and History buttons
    • Default list is 50 items
• Show frame
    • (print-frame 'frame)

Queries with Multiple Answers
• Navigator queries:
    • Example: substring search for "pyruvate"
    • Selected list is placed on the Answer list
    • Use "Next Answer" button to view each one of them
• Lisp queries:
    • Example: find reactions involving pyruvate as a substrate
    (get-class-all-instances '|Compounds|)
    (loop for rxn in (get-class-all-instances '|Reactions|)
          when (member 'pyruvate (get-slot-values rxn 'substrates) :test #'fequal)
          collect rxn)
    (replace-answer-list *)