pathologic-cplxs+operons - Bioinformatics Research Group at

SRI International
Bioinformatics
Selected PathoLogic Refining Tasks
• Creation of Protein Complexes
• Assignment of Modified Proteins
• Operon Prediction
Creating Protein Complexes
• Refine -> Create Protein Complexes
• When multiple polypeptides catalyze the same reaction:
  • Could be isozymes -> Do nothing
  • Could be components of a complex
  • Software can’t tell the difference
Manually-assisted Complex Creation
Curators decide based on:
• Names, e.g. subunit A, subunit B
• How enzyme is organized in other organisms
• Members of a complex are often neighbors on chromosome
• Specific biological knowledge based on literature, etc.
Complex creation tool:
• Lists names, gene IDs
• Shows reaction in MetaCyc
• Indicates which genes are neighbors
• Leaves final decision up to curator
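The neighbor check listed above can be sketched as follows. This is only an illustration, not the tool's actual code: the function name `neighbor_pairs`, the gene IDs, and the definition of "neighbor" as immediately adjacent in chromosome order are all assumptions.

```python
def neighbor_pairs(candidate_genes, gene_order):
    """Return pairs of candidate genes that are adjacent on the chromosome.

    candidate_genes: genes whose products catalyze the same reaction.
    gene_order: all genes listed in chromosome order (hypothetical input).
    "Neighbor" is assumed to mean immediately adjacent in that order.
    """
    index = {gene: i for i, gene in enumerate(gene_order)}
    ranked = sorted(candidate_genes, key=index.__getitem__)
    return [(a, b) for a, b in zip(ranked, ranked[1:])
            if index[b] - index[a] == 1]


# Hypothetical gene IDs, for illustration only.
chromosome = ["g1", "g2", "g3", "g4", "g5"]
print(neighbor_pairs(["g4", "g2", "g3"], chromosome))
```

Adjacency is only a hint that the polypeptides may form a complex; the final decision stays with the curator.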
Creating Protein Complexes
Complex Subunit Stoichiometries
• Leave coefficients blank if unknown
Proteins that are Reaction Substrates
• Reactions are defined in MetaCyc with protein classes as substrates
• Need to find which genes in the genome code for instances of those classes.
• Refine -> Assign Modified Proteins
  • Finds all reactions that
    • Have an enzyme
    • Have a protein class as substrate
  • Name search for substrate
  • Presents possibilities, asks curator to choose
  • Chosen protein will be made a child of the protein class
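The name-search step could look roughly like the sketch below. The slides do not say how the search is implemented, so the matching rule (every word of the class name, minus articles, must occur in the protein name), the function name `candidate_proteins`, and the sample IDs and names are all assumptions made for illustration.

```python
def candidate_proteins(class_name, protein_names):
    """Return IDs of proteins whose name contains every word of the class name.

    class_name: name of the protein class used as a reaction substrate.
    protein_names: dict mapping protein ID -> protein name (hypothetical input).
    The matching rule is an assumption, not the documented behavior.
    """
    articles = {"a", "an", "the"}
    words = [w for w in class_name.lower().split() if w not in articles]
    return [pid for pid, name in protein_names.items()
            if all(w in name.lower() for w in words)]


# Hypothetical proteins, for illustration only.
proteins = {
    "P1": "thioredoxin 1",
    "P2": "thioredoxin reductase",
    "P3": "DNA polymerase I",
}
print(candidate_proteins("a thioredoxin", proteins))
```

In this toy case both "thioredoxin 1" and "thioredoxin reductase" match, which shows why the possibilities are presented to a curator rather than assigned automatically.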
Proteins as Reaction Substrates
Operon Predictor
• Refine -> Predict Transcription Units
Nomenclature
• WO pair = pair of genes within an operon
• TUB pair = pair of genes at a transcription unit boundary (delineate operons)
Operation of the operon predictor
• For each contiguous gene pair, predict whether the two genes are within the same operon or at a transcription unit boundary
• Use pairwise predictions to identify potential operons
Example, for contiguous genes A B C D E:
AB = TUB pair
BC = WO pair
CD = WO pair
DE = TUB pair
operon = BCD
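The step from pairwise predictions to operons can be sketched as below, using the A to E example. The function name `group_transcription_units` is hypothetical, and the sketch assumes the genes are given in chromosome order with one WO/TUB label per adjacent pair.

```python
def group_transcription_units(genes, pair_labels):
    """Split an ordered gene list into transcription units.

    genes: gene names in chromosome order.
    pair_labels: one label per adjacent pair, "WO" (within operon)
                 or "TUB" (transcription unit boundary).
    """
    if not genes:
        return []
    if len(pair_labels) != len(genes) - 1:
        raise ValueError("need exactly one label per adjacent gene pair")
    units = [[genes[0]]]
    for gene, label in zip(genes[1:], pair_labels):
        if label == "WO":
            units[-1].append(gene)   # same unit as the previous gene
        else:
            units.append([gene])     # boundary: start a new unit
    return units


units = group_transcription_units(list("ABCDE"), ["TUB", "WO", "WO", "TUB"])
operons = [u for u in units if len(u) > 1]   # multi-gene units only
print(units)    # [['A'], ['B', 'C', 'D'], ['E']]
print(operons)  # [['B', 'C', 'D']]
```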
Operon predictor
• We use the method from Salgado et al., PNAS (2000) as a starting point.
  • Uses E. coli experimentally verified data as a training set.
  • Computes the log likelihood of two genes being a WO or TUB pair based on intergenic distance.
• Predicts operon gene pairs based on:
  • intergenic distance between genes
  • genes in the same functional class
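A distance-based log-likelihood score of this kind can be sketched as follows. This is a simplified illustration and not the published procedure: the 10 bp bin width, the pseudocount, the fallback score of 0 for unseen distances, the function names, and the toy training distances are all assumptions.

```python
import math
from collections import Counter

BIN_WIDTH = 10  # bp per distance interval (assumed value)


def distance_bin(distance):
    """Map an intergenic distance in bp (negative if genes overlap) to a bin."""
    return math.floor(distance / BIN_WIDTH)


def train_log_likelihoods(wo_distances, tub_distances, pseudocount=1.0):
    """Per-bin log ratio of the WO frequency to the TUB frequency."""
    wo_counts = Counter(distance_bin(d) for d in wo_distances)
    tub_counts = Counter(distance_bin(d) for d in tub_distances)
    bins = set(wo_counts) | set(tub_counts)
    # Pseudocounts keep the ratio finite when a bin is empty in one class.
    wo_total = len(wo_distances) + pseudocount * len(bins)
    tub_total = len(tub_distances) + pseudocount * len(bins)
    return {
        b: math.log(((wo_counts[b] + pseudocount) / wo_total)
                    / ((tub_counts[b] + pseudocount) / tub_total))
        for b in bins
    }


def classify_pair(distance, scores):
    """Label a gene pair WO if its bin favors operon pairs, else TUB."""
    score = scores.get(distance_bin(distance), 0.0)  # unseen bin: no evidence
    return "WO" if score > 0 else "TUB"


# Toy distances for illustration only; not real E. coli data.
scores = train_log_likelihoods(
    wo_distances=[-4, 5, 8, 12, 20],
    tub_distances=[8, 90, 150, 200],
)
print(classify_pair(5, scores), classify_pair(150, scores))
```

A positive score means the distance is more typical of pairs inside an operon than of pairs at a boundary; in this sketch a pair with no evidence either way defaults to TUB.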
Operon predictor
Additional features easily computed from a PGDB
1. both gene products are enzymes in the same metabolic pathway
2. both gene products are monomers in the same protein complex
3. one gene product transports a substrate for a metabolic pathway in which the other gene product is involved as an enzyme
4. a gene upstream or downstream from the gene pair (and within the same directon) is related to either one of the genes in the pair as per features 1, 2 and 3 above.
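The four features above can be sketched as boolean tests over per-gene annotations. The `GeneAnnotation` record and its fields are hypothetical stand-ins for what would be queried from a PGDB, and feature 4 is assumed here to consider every other gene in the same directon.

```python
from dataclasses import dataclass, field


@dataclass
class GeneAnnotation:
    """Hypothetical per-gene summary of PGDB annotations."""
    enzyme_pathways: set = field(default_factory=set)     # pathways it catalyzes in
    complexes: set = field(default_factory=set)           # complexes it is a monomer of
    transports_for_pathways: set = field(default_factory=set)  # pathways whose substrates it transports


def related(a, b):
    """Features 1-3 for two genes: (same pathway, same complex, transport link)."""
    same_pathway = bool(a.enzyme_pathways & b.enzyme_pathways)
    same_complex = bool(a.complexes & b.complexes)
    transport_link = bool(a.transports_for_pathways & b.enzyme_pathways
                          or b.transports_for_pathways & a.enzyme_pathways)
    return same_pathway, same_complex, transport_link


def pair_features(directon, i):
    """Features for the adjacent pair (i, i + 1) of an ordered directon."""
    left, right = directon[i], directon[i + 1]
    f1, f2, f3 = related(left, right)
    flanking = [g for j, g in enumerate(directon) if j not in (i, i + 1)]
    # Feature 4: any flanking gene related to either member of the pair.
    f4 = any(any(related(g, member))
             for g in flanking for member in (left, right))
    return {"same_pathway": f1, "same_complex": f2,
            "transport_link": f3, "flanking_gene_related": f4}


# Hypothetical directon of three genes, for illustration only.
directon = [
    GeneAnnotation(enzyme_pathways={"P1"}),
    GeneAnnotation(enzyme_pathways={"P2"}),
    GeneAnnotation(enzyme_pathways={"P1"}, complexes={"C1"}),
]
print(pair_features(directon, 1))
```

In this toy directon the second and third genes share no pathway or complex, but the first gene shares pathway P1 with the third, so only the flanking-gene feature is true for that pair.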