Slide 1 - PL-Grid

Polish Infrastructure for Supporting Computational Science in the European Research Space
Examining Protein Folding Process Simulation and Searching for Common Structure Motifs in a Protein Family as Experiments in the GridSpace2 Virtual Laboratory
T. JADCZYK, M. Malawski, M. Bubak, and I. Roterman
ACC Cyfronet AGH
Cracow Grid Workshop
Krakow, 09.11.2011
Outline
Protein folding process and Fuzzy Oil Drop (FOD) model
FOD algorithm
FOD experiment workflow
Comparison of protein sequences and structures
Tools used
Conservation score
Searching for common structure motifs - experiment workflow
Fuzzy Oil Drop model
• Protein structure: chains, amino acids, atoms
• Protein folding: primary, secondary, tertiary structure
• Fuzzy Oil Drop is based on Kauzmann's oil-drop model of the protein molecule
• Assumes that the folding process is directed by the water environment
• Extends the discrete oil-drop model to a 3-D Gaussian function that describes the idealized hydrophobicity distribution
• Hydrophobic: "water hating". Hydrophobic amino acids prefer a non-aqueous (lipid) environment because they cannot make favorable interactions with water.
• The highest hydrophobicity is expected at the center of the protein body, decreasing toward the surface, where it is expected to be close to zero
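The idealized distribution described above can be sketched as follows. This is only a minimal illustration, assuming the molecule is already centered at the origin and that the Gaussian is normalized so the values sum to 1; the function name, the sample points, and the spread values (`sigmas`) are hypothetical and are not taken from the slides.

```python
import math

def theoretical_hydrophobicity(points, sigmas):
    """Idealized hydrophobicity at each point, from a 3-D Gaussian centered at the origin.

    points: list of (x, y, z) coordinates (assumed already centered at the origin)
    sigmas: (sigma_x, sigma_y, sigma_z), illustrative spread of the "drop" along each axis
    """
    sx, sy, sz = sigmas
    raw = [
        math.exp(-x * x / (2 * sx * sx))
        * math.exp(-y * y / (2 * sy * sy))
        * math.exp(-z * z / (2 * sz * sz))
        for x, y, z in points
    ]
    total = sum(raw)
    # Normalize so the values form a distribution over all points.
    return [v / total for v in raw]

# Three hypothetical points: at the center, inside the body, near the surface.
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
profile = theoretical_hydrophobicity(points, sigmas=(3.0, 3.0, 3.0))
```

The value is largest at the center and falls toward zero at the surface, matching the expectation stated on the slide.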
Fuzzy Oil Drop – algorithm structure
1. Input: PDB file
2. Simplify the protein's residues to an "effective atoms" representation
3. Find the two furthest atoms, move the protein so that its center is at the origin, and rotate it
4. Determine the theoretical hydrophobicity distribution
5. Calculate the observed hydrophobicity
6. Test the similarity of both distributions
7. Store results:
   • PDB ID, chain, chain length, organism, method, function
   • O/T, O/R values
   • Future: O/T, O/R profiles
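Steps 2, 3, and 6 above can be sketched roughly as below. This is a hedged illustration, not the authors' implementation. It assumes that an "effective atom" is the centroid of a residue's atoms, and that O/T and O/R are Kullback-Leibler divergences of the observed distribution from the theoretical (T) and uniform "random" (R) distributions; the slides do not define either term. All function names and sample data are hypothetical, and the rotation in step 3 is omitted.

```python
import math
from itertools import combinations

def effective_atoms(residues):
    """Step 2 (assumed form): collapse each residue, given as a list of
    (x, y, z) atom coordinates, to the centroid of its atoms."""
    return [tuple(sum(c) / len(atoms) for c in zip(*atoms)) for atoms in residues]

def center_at_origin(points):
    """Step 3 (translation only): shift the points so their centroid is at the origin."""
    n = len(points)
    cx, cy, cz = (sum(p[i] for p in points) / n for i in range(3))
    return [(x - cx, y - cy, z - cz) for x, y, z in points]

def furthest_pair(points):
    """Step 3: indices of the two points that are furthest apart (brute force)."""
    return max(
        combinations(range(len(points)), 2),
        key=lambda ij: math.dist(points[ij[0]], points[ij[1]]),
    )

def kl_divergence(p, q):
    """Step 6 (assumed measure): Kullback-Leibler divergence of p from q, in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical toy input: three residues with two atoms each.
residues = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(4.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
    [(10.0, 0.0, 0.0), (10.0, 2.0, 0.0)],
]
centered = center_at_origin(effective_atoms(residues))
pair = furthest_pair(centered)

# Hypothetical normalized hydrophobicity distributions over the three residues.
observed = [0.5, 0.3, 0.2]
theoretical = [0.6, 0.3, 0.1]
uniform = [1.0 / len(observed)] * len(observed)

o_t = kl_divergence(observed, theoretical)  # distance from the idealized drop
o_r = kl_divergence(observed, uniform)      # distance from a uniform distribution
```

Under this reading, a small O/T relative to O/R would suggest that the observed distribution is closer to the idealized drop than to a uniform one.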
Fuzzy Oil Drop – algorithm structure
Fuzzy Oil Drop – experiment workflow
Fuzzy Oil Drop – GS2 Experiment Workbench
Fuzzy Oil Drop – experiment results
• Input: PDB database (March 2011), 71,100 files, 11.4 GB
• Final results: 68,100 proteins, 321,800 chains
Structure and Sequence comparison
• Search for conserved regions to discover protein function or find a ligand binding site
• Comparison on three levels of protein description:
   • Amino acid sequence
   • Structural codes
   • 3D structures
• W score to determine how conserved a region is
• Alignment tools used:
   • ClustalW: multiple sequence alignment
   • Mammoth: multiple structure alignment
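The slides do not define the W score, so the sketch below does not reproduce it. As a stand-in, it shows one common way to score per-column conservation in a multiple sequence alignment, using normalized Shannon entropy. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def column_conservation(alignment):
    """Entropy-based conservation per alignment column (a stand-in, NOT the W score).

    alignment: list of equal-length aligned sequences; '-' marks a gap.
    Returns one score per column: 1.0 means fully conserved, lower means more varied.
    """
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]  # gaps are ignored
        if not residues:
            scores.append(0.0)  # a gap-only column carries no conservation signal
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores

# Hypothetical aligned fragments, e.g. as a tool such as ClustalW might produce.
alignment = ["MKTAY", "MKSAY", "MRTAY"]
scores = column_conservation(alignment)
```

Columns where every sequence carries the same residue score 1.0. Runs of high-scoring columns are candidates for the conserved regions mentioned above.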
Structure and sequence comparison - experiment
Thanks for your attention!