schema-api-part1 - Bioinformatics Research Group at SRI
Computing with Pathway/Genome Databases
SRI International Bioinformatics
Approx. presentation time: 1.5 hrs
Overview
Summary of Pathway Tools data access mechanisms and formats
Pathway Tools APIs
Overview of Pathway Tools schema
Motivations for Understanding the Schema
When writing complex queries to PGDBs, those queries must refer to classes and slots within the schema
Queries using Lisp, Perl, Java APIs
Queries using Structured Advanced Query Form
Queries using BioVelo
Find the products of all genes longer than 1,000 nucleotides
(loop for g in (get-class-all-instances '|Genes|)
      when (< 1000 (abs (- (get-slot-value g 'left-end-position)
                           (get-slot-value g 'right-end-position))))
      collect (get-slot-value g 'product))
More Information
Pathway Tools Web Site, Tutorial Slides
http://bioinformatics.ai.sri.com/ptools/
PTools APIs: http://brg.ai.sri.com/ptools/ptools-resources.html
Web services: http://biocyc.org/web-services.shtml
Guide to the Pathway Tools Schema
http://biocyc.org/schema.shtml
Curator's Guide
http://bioinformatics.ai.sri.com/ptools/curatorsguide.pdf
References
Ontology Papers section of http://biocyc.org/publications.shtml
"An Evidence Ontology for use in Pathway/Genome Databases"
"An ontology for biological function based on molecular interactions"
"Representations of metabolic knowledge: Pathways"
"Representations of metabolic knowledge"
Data Exchange
APIs: Lisp API, Java API, and Perl API
Read and modify access
Web services
Cyclone
Export to files
BioPAX Export -- biopax.org
Export PGDB genome to GenBank format
Export entire PGDB as column-delimited and attribute-value file formats
Export PGDB reactions as SBML -- sbml.org
Import/Export of Pathways: between PGDBs
Import/Export of Selected Frames, for Spreadsheets
Import/Export of Compounds as Molfile, CML
BioWarehouse : Loader for Flatfiles, SQL access
http://bioinformatics.ai.sri.com/biowarehouse/
BMC Bioinformatics 7:170 2006
Pathway Tools Ontology / Schema
Ontology classes: 1621
Datatype classes: Define objects from genomes to pathways
Classification systems for pathways, chemical compounds, enzymatic reactions (EC system)
Protein Feature ontology
Controlled vocabularies:
Cell Component Ontology
Evidence codes
Comprehensive set of 279 attributes and relationships
High-Level Classes in the Pathway Tools Ontology
Chemicals -- All molecules
Polymer-Segments -- Regions of polymers
Protein-Features -- Features on proteins
Organisms
Reactions -- Biochemical reactions
Enzymatic-Reactions -- Link enzymes to reactions they catalyze
Pathways -- Metabolic and signaling pathways
Regulation -- Regulatory interactions
CCO -- Cell Component Ontology
Evidence -- Evidence ontology
Gene-Ontology-Terms -- GO
Growth-Observations -- Observations of growth of organism
Notes -- Timestamped, person-stamped notes
Organizations, People
Publications
Navigating the Schema
Use GKB Editor to Inspect the Pathway Tools Ontology
GKB Editor = Generic Knowledge Base Editor
Type in Navigator window: (GKB)
or [Right-Click] Edit->Ontology Editor
View->Browse Class Hierarchy
[Middle-Click] to expand hierarchy
To view classes or instances, select them and:
Frame -> List Frame Contents
Frame -> Edit Frame
Use the SAQP to Inspect the Schema
Pathway Tools Schema
Guide to the Pathway Tools Schema
Schema overview diagram
Principal Classes
Class names are capitalized, plural, separated by dashes
Genetic-Elements, with subclasses:
  Chromosomes
  Plasmids
Genes
Transcription-Units
RNAs
  rRNAs, snRNAs, tRNAs, Charged-tRNAs
Proteins, with subclasses:
  Polypeptides
  Protein-Complexes
Reactions, with subclasses:
  Transport-Reactions
Enzymatic-Reactions
Pathways
Compounds-And-Elements
Regulation
Slot Links
[Diagram: frames linked by slots, from a pathway down to its genes. Frames and the slot labels on the links between them, top to bottom:]
TCA Cycle
in-pathway
Succinate + FAD = fumarate + FADH2
reaction
Enzymatic-reaction
catalyzes
Succinate dehydrogenase
component-of
Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2
product
sdhA, sdhB, sdhC, sdhD
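The chain of slot links in this diagram can be walked programmatically. The following is a minimal in-memory sketch in Java, not the Pathway Tools or JavaCyc API: the class SlotLinkSketch, the frame IDs ENZRXN-1 and RXN-1, and the pairing of each gene with a subunit in the order listed are all assumptions made for illustration. It assumes each slot is stored on the lower frame and points to the frame above it.

```java
import java.util.*;

// In-memory sketch only: this is not the Pathway Tools or JavaCyc API.
class SlotLinkSketch {
    // frame name -> slot name -> values (each value is another frame's name)
    static final Map<String, Map<String, List<String>>> KB = new HashMap<>();

    static void link(String frame, String slot, String... values) {
        KB.computeIfAbsent(frame, f -> new HashMap<>()).put(slot, Arrays.asList(values));
    }

    static List<String> values(String frame, String slot) {
        return KB.getOrDefault(frame, Collections.emptyMap())
                 .getOrDefault(slot, Collections.emptyList());
    }

    static {
        // Gene-to-subunit pairing and the IDs ENZRXN-1 and RXN-1 are assumptions.
        link("sdhA", "product", "Sdh-flavo");
        link("sdhB", "product", "Sdh-Fe-S");
        link("sdhC", "product", "Sdh-membrane-1");
        link("sdhD", "product", "Sdh-membrane-2");
        for (String subunit : Arrays.asList("Sdh-flavo", "Sdh-Fe-S",
                                            "Sdh-membrane-1", "Sdh-membrane-2")) {
            link(subunit, "component-of", "Succinate dehydrogenase");
        }
        link("Succinate dehydrogenase", "catalyzes", "ENZRXN-1");
        link("ENZRXN-1", "reaction", "RXN-1");   // Succinate + FAD = fumarate + FADH2
        link("RXN-1", "in-pathway", "TCA Cycle");
    }

    // Follow the chain of slots from a gene up to the pathways it takes part in.
    static Set<String> pathwaysOfGene(String gene) {
        List<String> frontier = Collections.singletonList(gene);
        for (String slot : Arrays.asList("product", "component-of", "catalyzes",
                                         "reaction", "in-pathway")) {
            List<String> next = new ArrayList<>();
            for (String frame : frontier) next.addAll(values(frame, slot));
            frontier = next;
        }
        return new TreeSet<>(frontier);
    }

    public static void main(String[] args) {
        System.out.println(pathwaysOfGene("sdhA"));
    }
}
```

In a real PGDB the same walk would be written with slot lookups against the running Pathway Tools instance, or with a ready-made function such as pathways-of-gene.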
Programmatic Access to BioCyc
Common Lisp
• Native language of Pathway Tools
• Interactive & Mature Environment
• Full Access to the Data & Many Utility Functions
• Source code is available for academics
PerlCyc
• API of Functions, Exposed to Perl
• Communication through UNIX Socket
JavaCyc
• API of Functions, Exposed to Java
• Communication through UNIX Socket
Cyclone
Cyclone
Developed by Schachter and colleagues from Genoscope
http://nemo-cyclone.sourceforge.net/archi.php
Cyclone is a Java-based system that:
Extracts data from a Pathway Tools PGDB
Converts it to an XML schema
Maps the data to Java objects and to a relational database
Changes made to the data on the Java side can be committed back to a Pathway Tools PGDB
Lisp API
Accessible whenever you start Pathway Tools with the -lisp argument
Lisp queries evaluate against the running Pathway Tools binary and execute very fast
Ocelot Object Database
Pathway Tools Implementation Details
Platforms:
Macintosh, PC/Linux, and PC/Windows platforms
Same binary can run as desktop app or Web server
Production-quality software
Version control
Two regular releases per year
Extensive quality assurance
Extensive documentation
Auto-patch
Automatic DB-upgrade
600,000 lines of Lisp code
Pathway Tools Architecture
[Architecture diagram. Recoverable labels:]
Modes: Web Mode, Desktop Mode
APIs: Lisp, Perl, Java
Applications: Pathway/Genome Navigator, Protein Editor, Pathway Editor, Reaction Editor
GFP API
Ocelot DBMS
Storage: Disk File, Oracle or MySQL
Ocelot Object Database
Frame data model
Classes, instances, inheritance
Frames have slots that define their properties, attributes, relationships
A slot has one or more values
Datatypes include numbers, strings, etc.
Slotunit frames define metadata about slots:
Domain, range, inverse
Collection type, number of values, value constraints
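The frame data model above can be illustrated with a small in-memory sketch in Java. This is not Ocelot or any Pathway Tools API: the classes FrameModelSketch, Frame, and SlotUnit are hypothetical, and only two pieces of slotunit metadata (a domain class and a maximum number of values) are modeled, as an assumption about how such constraints might be checked.

```java
import java.util.*;

// In-memory illustration of a frame data model; not Ocelot or any Pathway Tools API.
class FrameModelSketch {
    // A slotunit carries metadata about a slot. Only a domain class and a
    // maximum number of values are modeled here.
    static class SlotUnit {
        final String slot, domainClass;
        final int maxValues;
        SlotUnit(String slot, String domainClass, int maxValues) {
            this.slot = slot; this.domainClass = domainClass; this.maxValues = maxValues;
        }
    }

    static class Frame {
        final String name;
        final Frame parent;   // direct class of an instance, or superclass of a class
        final Map<String, List<Object>> slots = new LinkedHashMap<>();

        Frame(String name, Frame parent) { this.name = name; this.parent = parent; }

        // True if cls is this frame's direct class or any of its ancestors.
        boolean isA(String cls) {
            for (Frame f = parent; f != null; f = f.parent) {
                if (f.name.equals(cls)) return true;
            }
            return false;
        }

        // Add a value after checking the slotunit's domain and value-count constraints.
        void addValue(SlotUnit unit, Object value) {
            if (!isA(unit.domainClass)) {
                throw new IllegalArgumentException(name + " is outside the domain of " + unit.slot);
            }
            List<Object> values = slots.computeIfAbsent(unit.slot, s -> new ArrayList<>());
            if (values.size() >= unit.maxValues) {
                throw new IllegalStateException(unit.slot + " allows at most "
                        + unit.maxValues + " value(s)");
            }
            values.add(value);
        }
    }

    public static void main(String[] args) {
        Frame proteins = new Frame("Proteins", null);
        Frame polypeptides = new Frame("Polypeptides", proteins);
        Frame dnaE = new Frame("DnaE", polypeptides);            // an instance
        SlotUnit names = new SlotUnit("names", "Proteins", 10);  // hypothetical limit
        dnaE.addValue(names, "DnaE");
        dnaE.addValue(names, "hypothetical synonym");
        System.out.println(dnaE.name + " names: " + dnaE.slots.get("names"));
    }
}
```

Because DnaE is an instance of Polypeptides, which is a subclass of Proteins, a slot whose domain is Proteins accepts it; that is the inheritance the slide refers to.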
Storage System Architecture
File KBs
Read-only applications can be distributed without a relational DBMS
Load all objects and code into Lisp memory
Dump virtual memory to binary executable file
Ocelot Storage System Architecture
Persistent storage via disk files, MySQL or Oracle DBMS
Concurrent development: MySQL or Oracle
Single-user development: disk files
Relational DBMS storage
RDBMS is submerged within Ocelot, invisible to users
Frames transferred from RDBMS to Ocelot
On demand
By background prefetcher
Memory cache
Persistent disk cache to speed performance via Internet
Transaction Logging
Relational DBMS stores:
The latest version of each Ocelot frame
A log of all GFP operations applied to KB
Transaction log enables:
Reconstruction of earlier versions of KB
View history of changes to an object
Update replicates of a KB
Detection of update conflicts during concurrency control
Undo of updates
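How a log of operations enables undo, history, and reconstruction of earlier versions can be sketched in a few lines of Java. The slides do not describe the actual Ocelot log format, so everything here (the class TransactionLogSketch, single-valued slots, and recording the old value with each operation) is an assumption chosen to keep the illustration short.

```java
import java.util.*;

// Illustrative sketch only; the real Ocelot log format is not described in the slides.
class TransactionLogSketch {
    // One logged operation: Frame.Slot changed from oldValue to newValue.
    static class Op {
        final String key, oldValue, newValue;
        Op(String key, String oldValue, String newValue) {
            this.key = key; this.oldValue = oldValue; this.newValue = newValue;
        }
    }

    final Map<String, String> current = new HashMap<>();  // "Frame.Slot" -> latest value
    final List<Op> log = new ArrayList<>();

    void put(String frame, String slot, String value) {
        String key = frame + "." + slot;
        log.add(new Op(key, current.get(key), value));
        current.put(key, value);
    }

    // Undo the most recent operation by restoring its recorded old value.
    void undo() {
        if (log.isEmpty()) return;
        Op op = log.remove(log.size() - 1);
        if (op.oldValue == null) current.remove(op.key);
        else current.put(op.key, op.oldValue);
    }

    // Reconstruct the KB as it was after the first n logged operations.
    Map<String, String> versionAt(int n) {
        Map<String, String> kb = new HashMap<>();
        for (Op op : log.subList(0, n)) kb.put(op.key, op.newValue);
        return kb;
    }

    // History of values written to one Frame.Slot, oldest first.
    List<String> history(String frame, String slot) {
        List<String> out = new ArrayList<>();
        for (Op op : log) if (op.key.equals(frame + "." + slot)) out.add(op.newValue);
        return out;
    }

    public static void main(String[] args) {
        TransactionLogSketch kb = new TransactionLogSketch();
        kb.put("TRP", "COMMON-NAME", "tryptophan");
        kb.put("TRP", "COMMON-NAME", "L-tryptophan");
        System.out.println(kb.history("TRP", "COMMON-NAME"));
        kb.undo();
        System.out.println(kb.current);
    }
}
```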
Optimistic Concurrency Control
Locking approach: edits to one object can require locking all connected objects
No locking
User performs updates in local workspace
When user commits changes, storage system compares user changes against all other committed changes
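The commit-time comparison described above can be sketched as follows. This is a generic illustration of optimistic concurrency control in Java, not Ocelot's implementation: the class OptimisticStoreSketch and its per-frame version counters are assumptions, and a real system would compare at a finer grain than whole frames.

```java
import java.util.*;

// Generic optimistic concurrency sketch; not the Ocelot implementation.
class OptimisticStoreSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Integer> versions = new HashMap<>();

    // A user's private workspace: edits stay local until commit.
    class Workspace {
        private final Map<String, Integer> baseVersions = new HashMap<>();
        private final Map<String, String> edits = new HashMap<>();

        String read(String frame) {
            baseVersions.putIfAbsent(frame, versions.getOrDefault(frame, 0));
            return edits.containsKey(frame) ? edits.get(frame) : values.get(frame);
        }

        void write(String frame, String value) {
            baseVersions.putIfAbsent(frame, versions.getOrDefault(frame, 0));
            edits.put(frame, value);
        }

        // Returns false, changing nothing, if another commit touched a frame
        // this workspace read or wrote since it first saw that frame.
        boolean commit() {
            synchronized (OptimisticStoreSketch.this) {
                for (Map.Entry<String, Integer> e : baseVersions.entrySet()) {
                    if (!versions.getOrDefault(e.getKey(), 0).equals(e.getValue())) return false;
                }
                for (Map.Entry<String, String> e : edits.entrySet()) {
                    values.put(e.getKey(), e.getValue());
                    versions.merge(e.getKey(), 1, Integer::sum);
                }
                edits.clear();
                baseVersions.clear();
                return true;
            }
        }
    }

    Workspace open() { return new Workspace(); }

    String get(String frame) { return values.get(frame); }

    public static void main(String[] args) {
        OptimisticStoreSketch store = new OptimisticStoreSketch();
        Workspace a = store.open();
        Workspace b = store.open();
        a.write("TRP", "comment from user A");
        b.write("TRP", "comment from user B");
        System.out.println("A commits: " + a.commit());
        System.out.println("B commits: " + b.commit());
    }
}
```

No frame is locked while the two users edit; the conflict surfaces only when the second user tries to commit.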
Ocelot Knowledge Server
Schema Evolution
Frame representation systems (FRSs) store and process class and instance information similarly
Application can query schema information as easily as it can query instances
Schema is stored within the DB
Schema is self documenting
Schema evolution facilitated by
Easy addition/removal of slots, or alteration of slot datatypes
Flexible data formats that do not require dumping/reloading of data
Generic Frame Protocol (GFP)
A library of procedures for accessing Ocelot DBs
GFP specification: http://www.ai.sri.com/~gfp/spec/paper/paper.html
A small number of GFP functions are sufficient for most complex queries
Example of a Single GFP Call
The General Pattern:
gfp-function(frame slot value ...)
Lisp: (gfp-function frame slot value ...)
(get-slot-values 'TRYPSYN-RXN 'LEFT)
==> (INDOLE-3-GLYCEROL-P SER)
Frame References
At the GFP level, every Ocelot frame can be referred to using either its symbolic frame name or the frame object
Most GFP functions return frame objects
Importance of using fequal for comparisons
Generic Frame Protocol
get-class-all-instances (Class)
Returns direct and indirect instances of Class
coercible-to-frame-p (Thing)
Is Thing a frame? Returns True if Thing is the name of a frame, or a frame object; else False
Notation Frame.Slot means a specified slot of a specified
frame. Note: Slot must be a symbol!
get-slot-value(Frame Slot)
Returns first value of Frame.Slot
get-slot-values(Frame Slot)
Returns all values of Frame.Slot as a list
slot-has-value-p(Frame Slot)
Returns True if Frame.Slot has at least one value; else False
member-slot-value-p(Frame Slot Value)
Returns True if Value is one of the values of Frame.Slot; else False
instance-all-instance-of-p(Instance Class)
Returns True if Instance is a direct or indirect instance of Class
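The difference between these read functions is easiest to see side by side. The following Java sketch is an in-memory stand-in that mimics the semantics listed above; it does not talk to Pathway Tools, and the class GfpReadSketch and its define helper are hypothetical. The method names follow the camelCase convention JavaCyc uses, but this is not the JavaCyc class.

```java
import java.util.*;

// In-memory stand-in mimicking the listed semantics; it does not talk to Pathway Tools.
class GfpReadSketch {
    private final Map<String, Map<String, List<String>>> frames = new HashMap<>();

    // Hypothetical helper to populate the sketch.
    void define(String frame, String slot, String... values) {
        frames.computeIfAbsent(frame, f -> new HashMap<>())
              .put(slot, new ArrayList<>(Arrays.asList(values)));
    }

    // All values of Frame.Slot as a list (empty if the frame or slot is unknown).
    List<String> getSlotValues(String frame, String slot) {
        return frames.getOrDefault(frame, Collections.emptyMap())
                     .getOrDefault(slot, Collections.emptyList());
    }

    // First value of Frame.Slot, or null when there is none (NIL in Lisp).
    String getSlotValue(String frame, String slot) {
        List<String> all = getSlotValues(frame, slot);
        return all.isEmpty() ? null : all.get(0);
    }

    boolean slotHasValueP(String frame, String slot) {
        return !getSlotValues(frame, slot).isEmpty();
    }

    boolean memberSlotValueP(String frame, String slot, String value) {
        return getSlotValues(frame, slot).contains(value);
    }

    boolean coercibleToFrameP(String thing) {
        return frames.containsKey(thing);
    }

    public static void main(String[] args) {
        GfpReadSketch kb = new GfpReadSketch();
        kb.define("TRYPSYN-RXN", "LEFT", "INDOLE-3-GLYCEROL-P", "SER");
        System.out.println(kb.getSlotValue("TRYPSYN-RXN", "LEFT"));
        System.out.println(kb.getSlotValues("TRYPSYN-RXN", "LEFT"));
        System.out.println(kb.memberSlotValueP("TRYPSYN-RXN", "LEFT", "SER"));
    }
}
```

Note that the single-value form silently drops every value after the first, so a multi-valued slot should be read with the plural form.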
print-frame(Frame)
Prints the contents of Frame
Generic Frame Protocol – Update Operations
put-slot-value(Frame Slot Value)
Replace the current value(s) of Frame.Slot with Value
put-slot-values(Frame Slot Value-List)
Replace the current value(s) of Frame.Slot with Value-List, which must be a list of values
add-slot-value(Frame Slot Value)
Add Value to the current value(s) of Frame.Slot, if any
remove-slot-value(Frame Slot Value)
Remove Value from the current value(s) of Frame.Slot
replace-slot-value(Frame Slot Old-Value New-Value)
In Frame.Slot, replace Old-Value with New-Value
remove-local-slot-values(Frame Slot)
Remove all of the values of Frame.Slot
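The effect of each update operation on a slot's value list can be traced with a small in-memory sketch in Java. It is not the GFP library or JavaCyc: the class GfpUpdateSketch and its values helper are hypothetical, and skipping a value that is already present in addSlotValue is an assumption about the intended set-like behavior.

```java
import java.util.*;

// In-memory sketch of the update semantics listed above; not the GFP library.
class GfpUpdateSketch {
    private final Map<String, List<String>> slots = new HashMap<>();  // key: "Frame.Slot"

    private List<String> cell(String frame, String slot) {
        return slots.computeIfAbsent(frame + "." + slot, k -> new ArrayList<>());
    }

    // Hypothetical read helper: a read-only view of the current values.
    List<String> values(String frame, String slot) {
        return Collections.unmodifiableList(cell(frame, slot));
    }

    void putSlotValue(String frame, String slot, String value) {
        putSlotValues(frame, slot, Collections.singletonList(value));
    }

    void putSlotValues(String frame, String slot, List<String> newValues) {
        List<String> c = cell(frame, slot);
        c.clear();
        c.addAll(newValues);
    }

    // Assumption: a value that is already present is not added a second time.
    void addSlotValue(String frame, String slot, String value) {
        List<String> c = cell(frame, slot);
        if (!c.contains(value)) c.add(value);
    }

    void removeSlotValue(String frame, String slot, String value) {
        cell(frame, slot).remove(value);
    }

    void replaceSlotValue(String frame, String slot, String oldValue, String newValue) {
        List<String> c = cell(frame, slot);
        int i = c.indexOf(oldValue);
        if (i >= 0) c.set(i, newValue);
    }

    void removeLocalSlotValues(String frame, String slot) {
        cell(frame, slot).clear();
    }

    public static void main(String[] args) {
        GfpUpdateSketch kb = new GfpUpdateSketch();
        kb.putSlotValues("TRP", "SYNONYMS", Arrays.asList("W", "trp"));
        kb.addSlotValue("TRP", "SYNONYMS", "tryptophan");
        kb.replaceSlotValue("TRP", "SYNONYMS", "trp", "Trp");
        kb.removeSlotValue("TRP", "SYNONYMS", "W");
        System.out.println(kb.values("TRP", "SYNONYMS"));
    }
}
```

In Pathway Tools itself none of these changes is persistent until the KB is saved.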
save-kb
Saves the current KB
Additional Pathway Tools Functions – Semantic Inference Layer
Semantic inference layer defines built-in functions to compute commonly required relationships in a PGDB
http://bioinformatics.ai.sri.com/ptools/ptools-fns.html
PerlCyc and JavaCyc
Work on Unix (Solaris or Linux) only
Start up Pathway Tools with the -api arg
Pathway Tools listens on a Unix socket; the Perl or Java program communicates through this socket
Supports both querying and editing PGDBs
Must run the Perl or Java program on the same machine that runs Pathway Tools
This is a security measure, as the API server has no built-in security
Can only handle one connection at a time
Obtaining PerlCyc and JavaCyc
Download from http://www.sgn.cornell.edu/downloads/
PerlCyc written and maintained by Lukas Mueller at Boyce Thompson Institute for Plant Research.
JavaCyc written by Thomas Yan at Carnegie Institute, maintained by Lukas Mueller.
Easy to extend…
Examples of PerlCyc, JavaCyc Functions
GFP functions (require knowledge of Pathway Tools schema):
PerlCyc                    JavaCyc
get_slot_values            getSlotValues
get_class_all_instances    getClassAllInstances
put_slot_values            putSlotValues
Pathway Tools functions (described at http://bioinformatics.ai.sri.com/ptools/ptools-fns.html):
PerlCyc                    JavaCyc
genes_of_reaction          genesOfReaction
find_indexed_frame         findIndexedFrame
pathways_of_gene           pathwaysOfGene
transport_p                transportP
Writing a PerlCyc or JavaCyc program
Create a PerlCyc, JavaCyc object:
perlcyc -> new ("ORGID")
new Javacyc ("ORGID")
Call PerlCyc, JavaCyc functions on this object:
my $cyc = perlcyc -> new ("ECOLI");
my @pathways = $cyc -> all_pathways ();
Javacyc cyc = new Javacyc("ECOLI");
ArrayList pathways = cyc.allPathways ();
Functions return object IDs, not objects.
Must connect to server again to retrieve attributes of an object.
foreach my $p (@pathways) {
  print $cyc -> get_slot_value ($p, "COMMON-NAME"); }
for (int i = 0; i < pathways.size(); i++) {
  String pwy = (String) pathways.get(i);
  System.out.println (cyc.getSlotValue (pwy, "COMMON-NAME")); }
Sample PerlCyc Query
Number of proteins in E. coli
use perlcyc;
my $cyc = perlcyc -> new ("ECOLI");
my @proteins = $cyc->get_class_all_instances("|Proteins|");
my $protein_count = scalar(@proteins);
print "Protein count: $protein_count.\n";
Sample PerlCyc Query
Print IDs of all proteins with molecular weight between 10 and 20 kD and pI between 4 and 5.
use perlcyc;
my $cyc = perlcyc -> new ("ECOLI");
foreach my $p ($cyc->get_class_all_instances("|Proteins|")) {
  my $mw = $cyc->get_slot_value($p, "molecular-weight-kd");
  my $pI = $cyc->get_slot_value($p, "pi");
  if ($mw <= 20 && $mw >= 10 && $pI <= 5 && $pI >= 4) {
    print "$p\n";
  }
}
Sample PerlCyc Query
List all the transcription factors in E. coli, and the list of genes that each regulates:
use perlcyc;
my $cyc = perlcyc -> new ("ECOLI");
foreach my $p ($cyc->get_class_all_instances("|Proteins|")) {
  if ($cyc->transcription_factor_p($p)) {
    my $name = $cyc->get_slot_value($p, "common-name");
    my %genes = ();
    foreach my $tu ($cyc->regulon_of_protein($p)) {
      foreach my $g ($cyc->transcription_unit_genes($tu)) {
        $genes{$g} = $cyc->get_slot_value($g, "common-name");
      }
    }
    print "\n\n$name: ";
    print join " ", values %genes;
  }
}
Sample Editing Using PerlCyc
Add a link from each gene to the corresponding object in MY-DB (assume ID is same in both cases)
use perlcyc;
my $cyc = perlcyc -> new ("HPY");
my @genes = $cyc->get_class_all_instances ("|Genes|");
foreach my $g (@genes) {
  $cyc->add_slot_value ($g, "DBLINKS", "(MY-DB \"$g\")");
}
$cyc->save_kb();
Sample JavaCyc Query:
Enzymes for which ATP is a regulator
import java.util.*;
public class JavacycSample {
  public static void main(String[] args) {
    Javacyc cyc = new Javacyc("ECOLI");
    ArrayList regframes =
      cyc.getClassAllInstances("|Regulation-of-Enzyme-Activity|");
    for (int i = 0; i < regframes.size(); i++) {
      String reg = (String) regframes.get(i);
      // Does the Regulator slot of this regulation frame contain ATP?
      boolean bool = cyc.memberSlotValueP(reg, "Regulator", "ATP");
      if (bool) {
        // The regulated entity is an enzymatic reaction; print its enzyme.
        String enzrxn = cyc.getSlotValue(reg, "Regulated-Entity");
        String enzyme = cyc.getSlotValue(enzrxn, "Enzyme");
        System.out.println(enzyme); } } } }
Simple Lisp Query Example:
Enzymes for which ATP is a regulator
(defun atp-inhibits ()
  (loop for x in (get-class-all-instances '|Regulation-of-Enzyme-Activity|)
        ;; Does the Regulator slot contain the compound ATP, and is the mode
        ;; of regulation negative (inhibition)?
        when (and (member-slot-value-p x 'Regulator 'ATP)
                  (member-slot-value-p x 'Mode "-"))
        ;; Whenever the test is positive, we collect the value of the slot Enzyme
        ;; of the Regulated-Entity of the regulatory interaction frame.
        ;; The collected values are returned as a list, once the loop terminates.
        collect (get-slot-value (get-slot-value x 'Regulated-Entity) 'Enzyme)))
;;; invoking the query:
(select-organism :org-id 'ECOLI)
(atp-inhibits)
Simple Perl Query Example:
Enzymes for which ATP is a regulator
use perlcyc;
my $cyc = perlcyc -> new("ECOLI");
my @regs = $cyc -> get_class_all_instances("|Regulation-of-Enzyme-Activity|");
## We check every instance of the class
foreach my $reg (@regs) {
  ## We test whether the Regulator slot contains the compound frame ATP
  ## and whether the mode of regulation is negative (inhibition)
  my $bool1 = $cyc -> member_slot_value_p($reg, "Regulator", "ATP");
  my $bool2 = $cyc -> member_slot_value_p($reg, "Mode", "-");
  if ($bool1 && $bool2) {
    ## Whenever the test is positive, we collect the value of the slot Enzyme
    ## of the Regulated-Entity. The results are printed in the terminal.
    my $enzrxn = $cyc -> get_slot_value($reg, "Regulated-Entity");
    my $enz = $cyc -> get_slot_value($enzrxn, "Enzyme");
    print STDOUT "$enz\n";
  }
}
Getting started with Lisp
pathway-tools -lisp
(load "file") (compile-file "file.lisp")
Emacs is a useful editor
Pathway Tools source code is available: ask
Overview of Lisp information resources:
http://bioinformatics.ai.sri.com/ptools/ptools-resources.html
Documented Pathway Tools Lisp functions:
http://brg.ai.sri.com/ptools/ptools-fns.html
Viewing Results via the Answer List
(loop for r in (get-class-all-instances '|Reactions|)
      when (< 3 (length (get-slot-values r 'left)))
      collect r)
(setq answer *)
(object-table answer)
(replace-answer-list answer)
(pt)
Next Answer
Query Gotchas
Study schema carefully
Use :test #'fequal when comparing frames
Cascade of slot-values: check for NIL
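The last gotcha, checking for NIL at each step of a cascade of slot lookups, applies equally when querying from Java, where a missing value shows up as null. Below is a minimal in-memory sketch; the class CascadeSketch, its put helper, and the frame IDs in main are hypothetical, and the map stands in for calls to a running Pathway Tools server.

```java
import java.util.*;

// In-memory sketch of a null-safe cascade of slot lookups; not the JavaCyc API.
class CascadeSketch {
    static final Map<String, Map<String, String>> KB = new HashMap<>();

    // Hypothetical helper to populate the sketch.
    static void put(String frame, String slot, String value) {
        KB.computeIfAbsent(frame, f -> new HashMap<>()).put(slot, value);
    }

    // Single-valued lookup that returns null when the frame or slot has no value.
    static String slotValue(String frame, String slot) {
        if (frame == null) return null;
        Map<String, String> slots = KB.get(frame);
        return slots == null ? null : slots.get(slot);
    }

    // Follow the slots in order; stop with null as soon as any step has no value,
    // instead of passing a missing value on to the next lookup.
    static String cascade(String frame, String... slots) {
        String current = frame;
        for (String slot : slots) {
            current = slotValue(current, slot);
            if (current == null) return null;
        }
        return current;
    }

    public static void main(String[] args) {
        put("REG-1", "Regulated-Entity", "ENZRXN-1");   // hypothetical frame IDs
        put("ENZRXN-1", "Enzyme", "PROT-1");
        put("REG-2", "Regulator", "ATP");               // REG-2 has no Regulated-Entity
        System.out.println(cascade("REG-1", "Regulated-Entity", "Enzyme"));
        System.out.println(cascade("REG-2", "Regulated-Entity", "Enzyme"));
    }
}
```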
Semantic Inference Layer
relationships.lisp
Library of functions that encapsulate common query
building blocks and intricacies of navigating the schema
enzymes-of-gene
reactions-of-gene
pathways-of-gene
genes-of-pathway
pathway-hole-p
reactions-of-compound
top-containers(protein)
all-rxns(type) (:metab-smm :metab-all :metab-pathways :enzyme :transport etc.)
(all-rxns :metab-pathways)
Pathway Tools Schema and
Semantic Inference Layer
Genes, Operons, and Replicons
Representing a Genome
[Diagram: ORG is linked by the genome slot to CHROM1, CHROM2, and PLASMID1; the replicons are linked by the components slot to Gene1, Gene2, and Gene3; Gene1 is linked by the product slot to Product1]
Classes:
ORG is of class Organisms
CHROM1 is of class Chromosomes
PLASMID1 is of class Plasmids
Gene1 is of class Genes
Product1 is of class Polypeptides or RNAs
Polynucleotides
Review slots of COLI and of COLI-K12
Genetic-Elements
Sequence is stored in a separate file or database table
Polymer-Segments
Review slots of Genes
Complexities of Gene / Gene-Product Relationships
The Product of a gene can be an instance of Polypeptides or RNAs
An instance of Polypeptides can have more than one gene encoding it
Sequence position:
Nucleotide positions of starting and ending codons specified in Left-End-Position and Right-End-Position (Right-End-Position is usually the greater, except at the origin)
Transcription-Direction + / -
Alternative splicing:
Nucleotide positions of starting and ending codons specified in Left-End-Position and Right-End-Position
Intron positions specified in Splice-Form-Introns of gene product
(200 300) (350 400)
Gene Reaction Schematic
Exercises
Find all genes on a given chromosome
Find all ribosomal RNAs
Find the DNA sequence of a given gene
Find all proteins longer than 1,000 amino acids
Exercises
Find all genes on a given chromosome
(defun genes-of-chrom (chrom)
  (loop for x in (get-slot-values chrom 'components)
        when (instance-all-instance-of-p x '|Genes|)
        collect x))
Find all ribosomal RNAs
(get-class-all-instances '|rRNAs|)
Find the DNA sequence of a given gene
(get-gene-sequence gene)
Find all monomers whose genes are longer than 1,000 nucleotides
(loop for g in (get-class-all-instances '|Genes|)
      for p = (get-slot-value g 'product)
      ;; p is NIL for genes with no product, so test it first
      when (and p
                (< 1000 (abs (- (get-slot-value g 'left-end-position)
                                (get-slot-value g 'right-end-position))))
                (instance-all-instance-of-p p '|Polypeptides|))
      collect p)
Proteins
Proteins and Protein Complexes
Polypeptide: the monomer protein product of a gene (may have multiple isoforms, as indicated at gene level)
Protein complex: proteins consisting of multiple polypeptides or protein complexes
Example: DNA pol III
DnaE is a polypeptide
pol III core is DnaE and two other polypeptides
pol III holoenzyme is several protein complexes combined
Protein Complex Relationships
Slots of a protein (DnaE)
catalyzes
Is it an activator/reactant/etc?
comments
component-of
dblinks
features (edited in feature editor)
Many other features possible
A complex at the frame level (pol III)
Same features as polypeptide frame, different use
comment
component-of and components
note coefficients
Protein Complex Relationships
Relationships are Defined in Many Places
component-of comes from creating a complex
appears-in-left-side-of comes from defining a reaction (as do modified forms)
inhibitor-of comes from an enzymatic reaction
can only edit dna-footprint if protein has been associated with a TU
Semantic Inference Layer
Reactions-of-protein (prot)
Returns a list of rxns this protein catalyzes
Transcription-units-of-proteins (prot)
Returns a list of TUs activated/inhibited by the given protein
Transporter? (prot)
Is this protein a transporter?
Polypeptide-or-homomultimer? (prot)
Transcription-factor? (prot)
Obtain-protein-stats
Returns 5 values: length of all-polypeptides, complexes, transporters, enzymes, etc…
Example
Find all enzymes that use pyridoxal phosphate as a cofactor or prosthetic group
(loop for protein in (get-class-all-instances '|Proteins|)
      for enzrxn = (get-slot-value protein 'enzymatic-reaction)
      when (and enzrxn
                (or (member-slot-value-p enzrxn 'cofactors 'pyridoxal_phosphate)
                    (member-slot-value-p enzrxn 'prosthetic-groups 'pyridoxal_phosphate)))
      collect protein)
(member-slot-value-p frame slot value) : T if Value is one of the values of Slot of Frame.
Example Queries
Find all homomultimers
Find proteins whose pI > 10, and that reside on the negative strand of the first chromosome
Sample
Find all proteins without a comment anywhere
Compounds / Reactions / Pathways
Think of a three-tiered structure:
Reactions built on top of compounds
Pathways built on top of reactions
Metabolic network defined by reactions alone; pathways are an additional “optional” structure
Some reactions not part of a pathway
Some reactions have no attached enzyme
Some enzymes have no attached gene
Compounds
Relatively few aspects of a compound defined within the compound editor
MW, formula calculated from edited structure
Most aspects defined in other editors
“Pathway reactions” comes from reaction editing followed by pathway editing
Activator, etc. come from the enzymatic reaction editor
-- Instance TRP --
Types: |Amino-Acid|, |Aromatic-Amino-Acids|, |Non-polar-amino-acids|
APPEARS-IN-LEFT-SIDE-OF: RXN0-287, TRANS-RXN-76, TRYPTOPHAN-RXN,
TRYPTOPHAN--TRNA-LIGASE-RXN
APPEARS-IN-RIGHT-SIDE-OF: RXN0-2382, RXN0-301, TRANS-RXN-76, TRYPSYN-RXN
CHEMICAL-FORMULA: (C 11), (H 12), (N 2), (O 2)
COMMON-NAME: "L-tryptophan"
DBLINKS: (LIGAND-CPD "C00078" NIL |kaipa| 3311532640 NIL NIL),
(CAS "6912-86-3"), (CAS "73-22-3")
NAMES: "L-tryptophan", "W", "tryptacin", "trofan", "trp", "tryptophan",
"2-amino-3-indolylpropanic acid"
SMILES: "c1(c(CC(N)C(=O)O)c2(c([nH]1)cccc2))"
SYNONYMS: "W", "tryptacin", "trofan", "trp", "tryptophan",
"2-amino-3-indolylpropanic acid"
____________________________________________
Where is diphosphate in the ontology?
Semantic Inference Layer
Reactions-of-compound (cpd)
Pathways-of-compound (cpd)
Is-substrate-an-autocatalytic-enzyme-p (cpd)
Activated/inhibited-by? (cpds slots)
Returns a list of enzrxns for which a cpd in cpds is a modulator (example slots: activators-all, activators-allosteric)
All-substrates (rxns)
All unique substrates specified in the given rxns
Has-structure-p (cpd)
Obtain-cpd-stats
Returns two values: length of all-cpds, cpds with structures
Miscellaneous things….
History List
Back/Forward and History buttons
Default list is 50 items
Show frame
(print-frame 'frame)
Queries with Multiple Answers
Navigator queries:
Example: Substring search for “pyruvate”
Selected list is placed on the Answer list
Use “Next Answer” button to view each one of them
Lisp queries:
Example : Find reactions involving pyruvate as a substrate
(get-class-all-instances '|Compounds|)
(loop for rxn in (get-class-all-instances '|Reactions|)
      ;; compare frames with fequal, since slot values may be frame objects
      when (member 'pyruvate (get-slot-values rxn 'substrates) :test #'fequal)
      collect rxn)
(replace-answer-list *)