pathologic-cplxs+operons - Bioinformatics Research Group at
SRI International
Bioinformatics
Selected PathoLogic Refining Tasks
Creation of Protein Complexes
Assignment of Modified Proteins
Operon Prediction
Creating Protein Complexes
Refine -> Create Protein Complexes
When multiple polypeptides catalyze the same reaction:
• Could be isozymes -> Do nothing
• Could be components of a complex
• Software can’t tell the difference
Manually-assisted Complex Creation
Curators decide based on:
• Names, e.g. subunit A, subunit B
• How enzyme is organized in other organisms
• Members of a complex are often neighbors on chromosome
• Specific biological knowledge based on literature, etc.
Complex creation tool:
• Lists names, gene IDs
• Shows reaction in MetaCyc
• Indicates which genes are neighbors
• Leaves final decision up to curator
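The kind of evidence the complex creation tool gathers can be sketched as follows. This is a minimal illustration and not the Pathway Tools implementation: the `Gene` record, the `candidate_complexes` function, the `max_gap` neighbor threshold, and all gene and reaction IDs are hypothetical names invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    product_name: str
    position: int   # hypothetical: index of the gene along the chromosome
    reaction: str   # reaction catalyzed by the gene product


def candidate_complexes(genes, max_gap=1):
    """Group genes by reaction; for each reaction with more than one
    polypeptide, report names, gene IDs, and which genes are neighbors."""
    by_reaction = defaultdict(list)
    for g in genes:
        by_reaction[g.reaction].append(g)

    report = {}
    for reaction, group in by_reaction.items():
        if len(group) < 2:
            continue  # a single polypeptide: nothing to decide
        group = sorted(group, key=lambda g: g.position)
        neighbors = [
            (a.gene_id, b.gene_id)
            for a, b in zip(group, group[1:])
            if b.position - a.position <= max_gap
        ]
        report[reaction] = {
            "genes": [(g.gene_id, g.product_name) for g in group],
            "neighbors": neighbors,
        }
    return report  # evidence only; the curator makes the final decision


genes = [
    Gene("g1", "enzyme X subunit A", 10, "RXN-1"),
    Gene("g2", "enzyme X subunit B", 11, "RXN-1"),
    Gene("g3", "enzyme Y", 40, "RXN-2"),
    Gene("g4", "enzyme Y", 95, "RXN-2"),
    Gene("g5", "enzyme Z", 7, "RXN-3"),
]
report = candidate_complexes(genes)
```

In this toy data, the two genes for RXN-1 are adjacent and carry subunit names, which points toward a complex, while the two distant genes for RXN-2 look more like isozymes. The code only surfaces the evidence; it does not decide.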
Creating Protein Complexes
Complex Subunit Stoichiometries
Leave coefficients blank if unknown
Proteins that are Reaction Substrates
Reactions are defined in MetaCyc with protein classes as substrates
Need to find which genes in the genome code for instances of those classes.
Refine -> Assign Modified Proteins
Finds all reactions that
• Have an enzyme
• Have a protein class as substrate
Name search for substrate
Presents possibilities, asks curator to choose
Chosen protein will be made a child of the protein class
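The search described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Pathway Tools code: the `Reaction` record, the `candidates_for_modified_proteins` function, the case-insensitive substring match used as the "name search", and all IDs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Reaction:
    rxn_id: str
    has_enzyme: bool
    # Names of protein classes that appear as substrates of the reaction.
    protein_class_substrates: list = field(default_factory=list)


def candidates_for_modified_proteins(reactions, protein_names):
    """For each reaction that has an enzyme and a protein-class substrate,
    list the proteins whose names contain the class name."""
    suggestions = {}
    for rxn in reactions:
        if not rxn.has_enzyme:
            continue  # only reactions with an assigned enzyme are considered
        for cls in rxn.protein_class_substrates:
            hits = [
                gene_id
                for gene_id, name in protein_names.items()
                if cls.lower() in name.lower()
            ]
            suggestions[(rxn.rxn_id, cls)] = hits
    return suggestions  # possibilities to present; the curator chooses


reactions = [
    Reaction("RXN-A", True, ["acyl carrier protein"]),
    Reaction("RXN-B", False, ["thioredoxin"]),
    Reaction("RXN-C", True, []),
]
protein_names = {
    "g1": "Acyl carrier protein",
    "g2": "thioredoxin 1",
    "g3": "DNA polymerase",
}
suggestions = candidates_for_modified_proteins(reactions, protein_names)
```

Only RXN-A qualifies here: RXN-B has no enzyme and RXN-C has no protein-class substrate. The protein the curator picks from the suggestions would then be made a child of the protein class.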
Proteins as Reaction Substrates
Operon Predictor
Refine -> Predict Transcription Units
Nomenclature
• WO pair = pair of genes within an operon
• TUB pair = pair of genes at a transcription unit boundary (delineate operons)
Operation of the operon predictor
For each pair of contiguous genes, predict whether the pair is within the same operon or at a transcription unit boundary
Use pairwise predictions to identify potential operons
Example, for genes A B C D E in chromosomal order:
AB = TUB pair
BC = WO pair
CD = WO pair
DE = TUB pair
operon = BCD
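The step from pairwise predictions to operons can be sketched as a single pass over the ordered genes, starting a new transcription unit at every TUB pair. The function name `operons_from_pairs` and the list-of-labels input format are assumptions made for this example.

```python
def operons_from_pairs(genes, pair_labels):
    """genes: genes in chromosomal order.
    pair_labels[i] labels the pair (genes[i], genes[i + 1]) as 'WO' or 'TUB'.
    Returns the transcription units as lists of genes."""
    if not genes:
        return []
    if len(pair_labels) != len(genes) - 1:
        raise ValueError("need exactly one label per adjacent gene pair")

    units = [[genes[0]]]
    for gene, label in zip(genes[1:], pair_labels):
        if label == "WO":
            units[-1].append(gene)   # same operon as the previous gene
        elif label == "TUB":
            units.append([gene])     # boundary: start a new unit
        else:
            raise ValueError(f"unknown label: {label!r}")
    return units


units = operons_from_pairs(list("ABCDE"), ["TUB", "WO", "WO", "TUB"])
# Units with more than one gene are the predicted operons.
operons = ["".join(u) for u in units if len(u) > 1]
```

For the labels AB = TUB, BC = WO, CD = WO, DE = TUB, `units` is `[['A'], ['B', 'C', 'D'], ['E']]` and `operons` is `['BCD']`.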
Operon predictor
We use the method from Salgado et al., PNAS (2000) as a starting point.
Uses experimentally verified E. coli data as a training set.
Computes the log likelihood of two genes being a WO or TUB pair based on intergenic distance.
Predicts operon gene pairs based on:
• intergenic distance between genes
• genes in the same functional class
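One way to turn a training set into a distance-based score is a binned log-likelihood ratio, sketched below. This is an illustration of the general idea only: the bin width, the pseudocount smoothing, the function name `distance_log_likelihoods`, and the training distances are all assumptions for the example and are not taken from Salgado et al. or from the operon predictor itself.

```python
import math
from collections import Counter


def distance_log_likelihoods(wo_distances, tub_distances,
                             bin_width=10, pseudocount=1.0):
    """Build a scoring function for intergenic distances (in bp).
    Positive scores favor a WO pair, negative scores favor a TUB pair."""
    # Count training pairs per distance bin for each class.
    wo_bins = Counter(d // bin_width for d in wo_distances)
    tub_bins = Counter(d // bin_width for d in tub_distances)
    n_bins = len(set(wo_bins) | set(tub_bins))
    # Pseudocounts keep the ratio finite for bins seen in only one class.
    wo_total = len(wo_distances) + pseudocount * n_bins
    tub_total = len(tub_distances) + pseudocount * n_bins

    def score(distance):
        b = distance // bin_width
        p_wo = (wo_bins[b] + pseudocount) / wo_total
        p_tub = (tub_bins[b] + pseudocount) / tub_total
        return math.log(p_wo / p_tub)

    return score


# Made-up training distances; negative values stand for overlapping genes.
wo_training = [-4, 5, 8, 12, 15, 30]
tub_training = [8, 90, 120, 150, 210, 300]
score = distance_log_likelihoods(wo_training, tub_training)
short_gap = score(6)     # short distance, common among the WO examples
long_gap = score(150)    # long distance, seen only among the TUB examples
```

With this toy data, short distances score positive and long distances score negative. A real training set would contain many more pairs per bin.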
Operon predictor
Additional features easily computed from a PGDB
1. both gene products are enzymes in the same metabolic pathway
2. both gene products are monomers in the same protein complex
3. one gene product transports a substrate for a metabolic pathway in which the other gene product is involved as an enzyme
4. a gene upstream or downstream from the gene pair (and within the same directon) is related to either one of the genes in the pair as per features 1, 2 and 3 above
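The four features above can be sketched as boolean tests over simple lookup tables. Everything about the data layout here is an assumption for illustration: the `kb` dictionary of sets stands in for queries against a PGDB, and the key names, function names, and IDs are hypothetical rather than part of any Pathway Tools API.

```python
def related(a, b, kb):
    """Features 1-3 for an unordered pair of genes."""
    pathways, complexes = kb["pathways"], kb["complexes"]
    same_pathway = bool(pathways.get(a, set()) & pathways.get(b, set()))
    same_complex = bool(complexes.get(a, set()) & complexes.get(b, set()))

    def transports_for(t, e):
        # Does t transport a substrate of a pathway in which e is an enzyme?
        compounds = kb["transports"].get(t, set())
        return any(
            compounds & kb["pathway_substrates"].get(p, set())
            for p in pathways.get(e, set())
        )

    transporter_link = transports_for(a, b) or transports_for(b, a)
    return same_pathway, same_complex, transporter_link


def pair_features(a, b, directon, kb):
    """Features 1-4 for the gene pair (a, b) within its directon."""
    f1, f2, f3 = related(a, b, kb)
    others = [g for g in directon if g not in (a, b)]
    # Feature 4: some other gene in the directon is related to a or b.
    f4 = any(any(related(g, x, kb)) for g in others for x in (a, b))
    return {"same_pathway": f1, "same_complex": f2,
            "transporter_link": f3, "context_link": f4}


kb = {
    "pathways": {"g1": {"PWY-1"}, "g2": {"PWY-1"}, "g4": {"PWY-2"}},
    "complexes": {"g2": {"CPLX-1"}, "g3": {"CPLX-1"}},
    "transports": {"g5": {"cpd-x"}},
    "pathway_substrates": {"PWY-2": {"cpd-x"}},
}
directon = ["g1", "g2", "g3", "g4", "g5"]
features_12 = pair_features("g1", "g2", directon, kb)
features_45 = pair_features("g4", "g5", directon, kb)
```

In this toy directon, g1 and g2 share a pathway and g3 shares a complex with g2, so the pair (g1, g2) gets features 1 and 4. The pair (g4, g5) gets feature 3 because g5 transports a substrate of a pathway in which g4 is an enzyme.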