Notes on Zing Conference
Chao Wang, Bin Ling
Overview
• Key words: Chemical shift, docking, interaction
• Freq:
– Conclusion: 3
– Method: 30
– App: 9
– Discovery: 15
– New Insight: 7
– Server: 5
• Chao’s suggestion:
– No pure prediction, contact-assisted prediction
– Know about all direct-use tools
– Refinement
John Moult, 10 CASP experiments: Successes, Bottlenecks,
and Challenges in Protein Structure Prediction
• GDT_TS remains the same
• In CASP10, the number of overline increases
• Methods that did not work: threading (single-domain), approximate potential
functions, folding pathways (including discrete way, statistic zip and funnel)
• Stayed the same: SS prediction, loop modeling
• Methods that worked: multi-template methods, consensus methods,
structure clustering, and knowledge-based methods such as fragment
assembly and refinement.
• Next: contact prediction (distance-assisted prediction), conformational
sampling, hierarchical potentials, complexes (refer to CAPRI, the community-wide
experiment on the comparative evaluation of protein-protein docking for
structure prediction)
Nick Grishin, ECOD,
Evolutionary Classification of Protein Domains
• Expand evolutionary core
• Focus on inter-domain docking
• With sequence information
• From the SCOP ‘‘multi-domain proteins’’ class
• Methods:
– Identifying number and general position of domains.
– Refinement of domain boundaries.
– Sequence continuity and alternate domain definitions.
• TM-score: between superfamily and fold in SCOP
Andriy Kryshtafovych, New Developments in the Assessment of Contact
Predictions in CASP, and How to Quantify Improvements in Model
Accuracy Due to Incorporation of Contact Information
• Contact-assisted prediction will greatly improve the accuracy of
protein structure prediction.
• A prediction that satisfies the contact constraints is not
necessarily a good prediction.
• Predicted contacts can help structure prediction,
too.
• But the report did not address how to deal with the low
accuracy of contact prediction.
• More details in the 4 Proteins papers.
Andrei Korostelev, Modeling the Structures of
Large Macromolecular Complexes
• Main method: simulated annealing
• Apply multistart simulated annealing crystallographic refinement to a 70S
ribosome-RF1 translation termination complex that was recently solved at
3.2 Å resolution.
• The analysis improves the interpretability of the electron density map of this
2.5-MDa ribonucleo-protein complex and provides insights into its structural
dynamics.
Samuel Flores, Reliable Evaluation of Mutations in
Protein-Protein Interfaces
• Here we describe a community-wide assessment of methods to
predict the effects of mutations on protein-protein interactions.
• The most successful methods considered the effects of mutation on
monomer stability in addition to binding affinity, carried out explicit
side-chain sampling and backbone relaxation, evaluated packing,
electrostatic, and solvation effects, and correctly identified around a
third of the beneficial mutations.
Banu Ozkan, Novel Physics-Based Protein Structure
Refinement Method Through Local Unfolding and Refolding
• The pathway method works for the single-domain case.
• Analogous to a genetic algorithm: swap parts of two decoys that share
90% of their contacts, then unfold and refold, using the speaker’s own
zipping software.
Faruck Morcos, Conformational Changes leave an
Evolutionary Footprint across Protein Lineages
• To determine a complete functional conformational landscape of
proteins.
• Methods: DCA
• reveal a signature of functionally important states in several protein
families, using direct coupling analysis, which detects residue pair
coevolution of protein sequence composition.
• DCA also predicts several intermediates or hidden states that are of
functional importance.
Dominik Gront,
Bioshell: Modular Platform for Biomolecular Modeling
Marcin Pawlowski, Protein Model Quality
Assessment Prediction by using a Residue
Specific Statistical Potential
• Traditional QA: physical (domain knowledge) potentials,
knowledge-based potentials, and consensus
• The proposed method relies on an assumption
that common substructure motifs among
different protein folds can still share similar
patterns of interaction with neighboring residues.
• The new potential is a modification of DFIRE:
replace N(i, j, r) with N(i, j, AA, r, d).
Marek Cieplak, Energy Landscape and
Dynamics of a Lattice Model of Proteins
• Funnel-like landscape, transition state theory, high temperature to unfold,
two-state protein: P_N + P_D = 1
• Square lattice
• Lattice tube model (Banavar, Cieplak, Maritan 2004)
• An Ising-like model for beta-hairpin (2004)
• Consider as a two-dimensional chain; the unfolding process is to change the
direction
Sebastian Kmiecik, Fast Modeling of Protein
Structure Flexibility
• a computationally efficient alternative to all-atom
molecular dynamics—a classical simulation approach.
Greg Chirikjian, Analysis of Preferred Orientations
of Interacting Substructures in Proteins
• Previous efforts to normalize angle distribution
data did not include the following effect: helices can interact
with each other in three distinct ways, which we
refer to as ‘‘line-on-line,’’ ‘‘endpoint-to-line,’’ and
‘‘endpoint-to-endpoint,’’ and each of these
interactions has its own geometric effects, which
must be included in the proper normalization of
data
For our H-form
Jie Liang, Sampling and Predicting Conformations
of Single and Multiple Loops of Proteins
• Previous: Sequential Chain Growth with
importance sampling
• Now: ab initio Distance-guided sequential
chain growth
• E_i_j_D: log-odds
• Multiple loops: rejection sampling
• Local environment effect
Chris Sander, Protein structure prediction
from sequence variation
• Published in NBT
• Evolutionary couplings
• Contacts in proteins leave an evolutionary record
• Although evolutionary couplings show promise for the identification of functional sites,
homomultimer contacts and alternative conformations, many of the
predicted contacts involved in these protein features may appear as false positives in
the prediction of intradomain residue contacts.
Ron Elber, Predicting the Structure of
Protein Switches
• Our approach was to create 2 proteins that (i) are stably folded into 2
different folds, (ii) have 2 different functions, and (iii) are very similar in
sequence. In this simplified sequence space we explore the mutational path
from one fold to another.
• Two folds: 4beta+alpha, 3alpha
• Chao’s comment: this is used to compare the stability of A1-A4 in helices.
Just like the 88% sequence identity PNAS paper.
Brinda Vallat, Modeling Proteins using a Super-Secondary Structure Library
and NMR Chemical Shift Information
• We present a hybrid modeling algorithm that relies on an exhaustive
Smotif library and on nuclear magnetic resonance chemical shift
patterns without any input of primary sequence information.
• In a test of 102 proteins, the algorithm delivered 90 homology-model-quality
models, among them 24 high-quality ones, and a
topologically correct solution for almost all cases.
Xiaoqin Zou, A Scoring Scheme for
Predicting Protein Structures
• In this study, we have developed a statistical mechanics-based iterative
method to extract statistical atomic interaction potentials from known,
nonredundant protein structures.
• Use coarse-grained atom-pair distances to describe interactions
• Iterate to extract the effective potential
Drena Dobbs, Analyzing & Predicting RNA-Protein
Interactions
• Prediction is better using sequence information than structure information
• Hypothesis: conformational changes upon RNA binding confound
structure-based prediction
• Motif-based strategy: generate an RNA-protein motif lookup table
• RNA sequence: 4-mer; protein sequence: 3-mer, 7-letter reduced alphabet
• SVM classifier
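The k-mer encoding behind the motif lookup can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the slides only say "7 letter reduced alphabet", so the residue grouping below (a conjoint-triad style grouping) is an assumption, and the function names `kmer_counts`, `protein_features`, `rna_features` and `pair_vector` are hypothetical.

```python
from itertools import product

# ASSUMPTION: a conjoint-triad style grouping into 7 classes.
# The slides do not specify which reduced alphabet was used.
REDUCED = {
    "A": "1", "G": "1", "V": "1",
    "I": "2", "L": "2", "F": "2", "P": "2",
    "Y": "3", "M": "3", "T": "3", "S": "3",
    "H": "4", "N": "4", "Q": "4", "W": "4",
    "R": "5", "K": "5",
    "D": "6", "E": "6",
    "C": "7",
}


def kmer_counts(seq, k, alphabet):
    """Count every k-mer over `alphabet`; k-mers with unknown symbols are skipped."""
    counts = {"".join(p): 0 for p in product(alphabet, repeat=k)}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    return counts


def protein_features(protein):
    """3-mer counts over the reduced alphabet (7**3 = 343 features)."""
    reduced = "".join(REDUCED.get(aa, "?") for aa in protein.upper())
    return kmer_counts(reduced, 3, "1234567")


def rna_features(rna):
    """4-mer counts over A, C, G, U (4**4 = 256 features)."""
    return kmer_counts(rna.upper().replace("T", "U"), 4, "ACGU")


def pair_vector(protein, rna):
    """Concatenate both count vectors in a fixed (sorted k-mer) order."""
    p = protein_features(protein)
    r = rna_features(rna)
    return [p[k] for k in sorted(p)] + [r[k] for k in sorted(r)]
```

A vector like this (599 counts per RNA-protein pair) is the kind of fixed-length input an SVM classifier needs; normalization and the classifier itself are left out of the sketch.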
Shi-Jie Chen, Predicting Structure and Stability for RNA
Complexes with Intermolecular Loop-loop Tertiary Contacts
• Tetraloop-receptor is a frequently
occurring tertiary motif
• Obtaining motif by ss prediction
• Finding motif-motif contacts
Ioan Andricioaei, Protein and RNA Folding Exhibits
Universal Signatures of Granular Jamming
Upon folding, proteins develop a peak
in the interatomic force distributions
that falls on a universal curve with
experimentally measured forces on
jammed grains and droplets.
Keith Dunker, Advantages of Intrinsic
Disorder for Protein Function
• List functions of 90 loops
• Chao’s comment: to build SSS, Smotif for threading
Sandor Vajda, What Docking tells us about
Protein-protein Association?
• Classification of Protein Complexes based on Biophysics of
Association
“Tell me how you contact your partners,
and I'll tell you who you are.”
• Docking will change disordered structure.
• List 5 classes of complexes
Joel Sussman, Structure Based Drug Design: Can
Crystal Structures Lead to Erroneous Predictions?
• Conformational changes upon ligand binding thus involve
preexisting equilibrium dynamics.
• Consequently, rational drug design could benefit significantly from
conformations monitored by MD simulations of native targets.
William Ray, Visualizing Co-Evolution: What does Structure
look like, and How can we tell it from Function?
• The evolutionary signature of structural constraints is
not the most strongly coevolving residues, as have
been searched for, for decades by ever-more-powerful
algorithms, but rather, diffuse clusters of
weakly co-evolving residues.
• Strongly co-evolving residues correlate more strongly
with functional requirements for fitness.
• Mutual information, many alternatives
• Consider as a peculiar type of graph
• What’s necessary for protein function, Co-Evolution ->
function
• Bin’s comments: Maybe we should change our focus
from single-site contacts to motif contacts, and from single-site
correlation to window correlation
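The mutual information measure mentioned above can be sketched for two alignment columns. This is a minimal illustration with hypothetical function names; it uses raw frequencies with no pseudocounts, sequence weighting or background correction, and it is not direct coupling analysis, which additionally separates direct from indirect correlations.

```python
from collections import Counter
from math import log


def column(msa, idx):
    """Extract one column from a list of equal-length aligned sequences."""
    return [seq[idx] for seq in msa]


def mutual_information(col_i, col_j):
    """Mutual information (in nats) between two alignment columns."""
    n = len(col_i)
    if n == 0 or n != len(col_j):
        raise ValueError("columns must be non-empty and of equal length")
    f_i = Counter(col_i)
    f_j = Counter(col_j)
    f_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in f_ij.items():
        p_ab = count / n
        mi += p_ab * log(p_ab / ((f_i[a] / n) * (f_j[b] / n)))
    return mi
```

For the toy alignment `["AK", "AK", "DE", "DE"]` the two columns vary together perfectly and the score is ln 2; for `["AK", "AE", "DK", "DE"]` they are independent and the score is 0.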
Jinbo Xu, A New Protein Statistical Potential Empowered
by High-Throughput Sequencing
• Position-specific
• A context-specific potential.
• Context-specific means a local window profile
• This article takes a rather different view on the observed probability and
parameterizes it by the protein sequence profile context of the atoms and
the radius of gyration, in addition to atom types.
• Experiments confirm that our position-specific statistical potential
outperforms the currently popular ones in several decoy discrimination tests.
• This implies that, in addition to the reference state, the observed probability
also makes energy potentials different, and that evolutionary information greatly
boosts the performance of energy potentials.
Jianzhu Ma, MRFalign: Protein Homology
Detection through Alignment of Markov Random
Fields
• Sheng has reported on this before
• Markov Random Fields are used to build the alignment
• Alignment potentials: p(a) p(b) log-odds, alignment log-odds
• Bin’s comments: Efficiency is the problem.
Daisuke Kihara, Evaluating Protein Structure
Models by Predicted Pairwise Subunits
• Generate a series of predicted models (decoys) of various
accuracies by our multiple protein docking pipeline, Multi-LZerD, for
three multi-chain complexes with 3, 4, and 6 chains.
• Analyze the decoys in terms of the number of correctly predicted pair
conformations in the decoys.
• Term the fraction of correctly predicted pairs (RMSD at the interface
of less than 4.0 Å) as fpair and propose to use it for evaluation of the
accuracy of multiple protein docking.
• Residue contacts defined by a Cβ−Cβ distance of 7.0 Å work best
overall among those tested to identify proteins of the same fold.
• Effective contact definitions differ from fold to fold, suggesting that
using a different residue contact definition specific to each template
will lead to improvement of the performance of threading.
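The fpair measure above reduces to a simple ratio once the interface RMSD of each chain pair is known. The sketch below assumes those values are already computed; the function name, the input layout, and the choice of which pairs go into the denominator are illustrative assumptions, not taken from the talk.

```python
def fpair(interface_rmsd, cutoff=4.0):
    """Fraction of chain pairs whose interface RMSD is below `cutoff` (in Å).

    `interface_rmsd` maps a chain pair, e.g. ("A", "B"), to the interface
    RMSD of that pair in the decoy relative to the native complex.
    ASSUMPTION: the dict holds exactly the pairs that should be counted.
    """
    if not interface_rmsd:
        raise ValueError("at least one chain pair is required")
    correct = sum(1 for rmsd in interface_rmsd.values() if rmsd < cutoff)
    return correct / len(interface_rmsd)
```

For a three-chain decoy with interface RMSDs of 2.1, 5.3 and 3.9 Å for its three pairs, two pairs fall under the 4.0 Å cutoff, so fpair is 2/3.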
George Stan, Topology-Dependent Mechanisms of
Protein Unfolding and Translocation by AAA+
Nanomachines
• conserved loop motif: G-Ar-Phi-G
• Coarse-grained model: Hydrophobic, Hydrophilic,
Natural loop
• Unfolding HBP by force
• Chao: Experimental Results
Chen Keasar, Scoring Individual Protein Models by
a Purely Structural Function
• MESHI energy function
• MESHI is a software package for protein modeling. It is written
solely in Java in strict object oriented design (OOD). We hope that
the use of OOD will encourage other groups interested in protein
modeling to use the code and take part in its development. Please
note though, that MESHI is in a rather preliminary stage of
development. Not all the features you may expect are already
available (e.g. Molecular Dynamics), and documentation is less than
perfect. We will gladly receive any comment and within our limited
resources do our best to help users/developers.
A legalistic remark: MESHI is intended to be free for use and
development. We are still not sure about the exact format (GNU,
Open-Source or anything else).
• Prof. Dong Xu suggests that we use the MESHI energy function instead of
ROSETTA
Phil Bradley, Predicting Protein-Nucleic Acid
Structures and Interactions
• Conformational flexibility is likely important for design and template-based
modeling, where non-native conformations need to be sampled and
accurately scored.
• A successful application of such computational modeling techniques in the
construction of the TAL-DNA complex structure is discussed.
Falk Hoffmann, Protein Structure Prediction
using Basin-Hopping Global Optimization
• In the current work, we exploit chemical shifts by
combining the basin-hopping approach to global
optimization with chemical shift restraints using a
penalty function.
• We further show that our chemical shift restraint
BH approach also works for incomplete chemical
shift assignments, where the information from
only one chemical shift type is considered.
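The idea of adding chemical shift restraints through a penalty function can be sketched as below. The slides do not give the functional form, so the flat-bottom harmonic penalty, the `weight` and `tolerance` parameters, and both function names are assumptions for illustration. Unassigned shifts are skipped, which is how an incomplete assignment would enter.

```python
def shift_penalty(predicted, observed, weight=1.0, tolerance=0.0):
    """Flat-bottom harmonic penalty summed over assigned chemical shifts.

    Both arguments map a (residue index, atom name) key to a shift in ppm.
    Entries that are None in `observed`, or absent from `predicted`,
    are treated as unassigned and contribute nothing.
    """
    total = 0.0
    for key, obs in observed.items():
        if obs is None or key not in predicted:
            continue
        deviation = abs(predicted[key] - obs) - tolerance
        if deviation > 0.0:
            total += deviation * deviation
    return weight * total


def restrained_energy(force_field_energy, predicted, observed, weight=1.0):
    """Objective a basin-hopping run would minimize: energy plus penalty."""
    return force_field_energy + shift_penalty(predicted, observed, weight)
```

In a basin-hopping loop, each perturbed and locally minimized structure would be scored with `restrained_energy` before the accept/reject step; predicting shifts from a structure is outside this sketch.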
Peter Rogen, Which Distance Measure is best
for Training and Testing Protein Pair Potentials?
• 4 metrics: RMSD, MTP (mean Tods Potential),
GDT_TS, Q* (1 - the fraction of native
contacts)
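Of the four metrics, Q* is the simplest to write down: one minus the fraction of native contacts that the model reproduces. The sketch below works on one representative point per residue; the 7.0 Å cutoff and the minimum sequence separation of 3 are illustrative assumptions, not values given in the talk, and the function names are hypothetical.

```python
from math import dist


def contacts(coords, cutoff=7.0, min_separation=3):
    """Set of residue index pairs (i, j) closer than `cutoff` Å.

    `coords` holds one (x, y, z) point per residue, e.g. a Cβ position.
    """
    n = len(coords)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if dist(coords[i], coords[j]) <= cutoff
    }


def q_star(native, model, cutoff=7.0, min_separation=3):
    """Q* = 1 - (fraction of native contacts also present in the model)."""
    native_contacts = contacts(native, cutoff, min_separation)
    if not native_contacts:
        raise ValueError("native structure has no contacts under this definition")
    model_contacts = contacts(model, cutoff, min_separation)
    return 1.0 - len(native_contacts & model_contacts) / len(native_contacts)
```

Q* is 0 for a model that reproduces every native contact and 1 for a model that reproduces none, so unlike GDT_TS it is a distance: lower is better.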
Joanna I. Sulkowska
Ilya Vakser, Knowledge-Based Modelling of
Protein-Protein Interactions
• Training sets of protein-protein matches were generated based on bound
and unbound forms of proteins taken from the DOCKGROUND resource.
• Each residue was represented by a pseudo-atom in the geometric center of
the side chain.
Michael Widom, Folding Kinetics of Riboswitch
Transcriptional Terminators and Sequesterers
• Employ kinetic Monte Carlo simulation to model the time-dependent folding
during transcription of riboswitch expression platforms.
• Results suggest both that riboswitch transcriptional terminator sequences
have been naturally selected for high folding efficiency, and that sequesterers
can maintain their function even in the presence of significant misfolding.
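A single kinetic Monte Carlo step can be sketched as below, in the standard Gillespie form: pick one event with probability proportional to its rate, then advance the clock by an exponentially distributed waiting time. How the talk's simulation defines its events (for example base-pair formation, base-pair breaking, or addition of the next transcribed nucleotide) is not stated in the slides, so the rate list here is a placeholder and `kmc_step` is a hypothetical name.

```python
import math
import random


def kmc_step(rates, rng):
    """Choose one event and a waiting time from a list of event rates.

    Returns (index of the chosen event, time increment).
    `rng` is a random.Random instance, passed in for reproducibility.
    """
    total = sum(rates)
    if total <= 0.0:
        raise ValueError("at least one event must have a positive rate")
    # Event i is chosen with probability rates[i] / total.
    threshold = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1  # fallback against floating-point round-off
    for index, rate in enumerate(rates):
        cumulative += rate
        if threshold < cumulative:
            chosen = index
            break
    # Waiting time is exponential with mean 1 / total.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt
```

A simulation repeats this step, applying the chosen event to the current structure and recomputing the rates, until the elapsed time reaches the end of the transcription window.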
Leon Martinez, Rd.HMM Score as a Reliable Objective
Function for the Prediction of the Three-Dimensional
Structure of Proteins
• The approach described in this work begins by generating a large number of amino
acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449–460], a
program with notable robustness in the assignment of amino acids to a known
three-dimensional structure.
• The resulting sequence-sets showed no conservation of amino acids at active sites,
or protein-protein interfaces.
• Hidden Markov models built from the resulting sequence sets were used to search
sequence databases.
• Surprisingly, the models retrieved from the database sequences belonged to proteins
with the same or a very similar function.
• Given an appropriate cutoff, the rate of false positives was zero.
Jianlin Cheng
Andrzej Joachimiak, Structure Determination of
Transcriptional Factors and their Complexes with
DNA
• We have determined the crystal structures of HetR complexed with
palindromic DNA targets, 21, 23, and 29 bp at 2.50-, 3.00-, and 3.25-Å
resolution, respectively.
• The highest-resolution structure shows fine details of specific protein–DNA
interactions.
• The lower-resolution structures with longer DNA duplexes have similar
interaction patterns and show how the flap domains interact with DNA in a
sequence nonspecific fashion.
Jarek Meller, Ultrafast Clustering of
Macromolecular Structures
• Consider a fast alternative, in which structural similarity is assessed using
1D profiles, e.g., consisting of relative solvent accessibilities and secondary
structures of equivalent amino acid residues in the respective models.
• The new approach, dubbed 1D-Jury, makes it possible to implicitly compare and
rank N models in O(N) time, as opposed to the quadratic complexity of 3D-Jury and
related clustering-based methods.
• In addition, 1D-Jury avoids computationally expensive 3D superposition of
pairs of models.
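The linear-time trick can be illustrated with a toy consensus score over 1D profiles. This is not the 1D-Jury scoring function, which the slides do not spell out; it only shows how per-position state counts let each model be compared with all others implicitly. Profiles are assumed to be equal-length strings of discrete states (for example H/E/C secondary structure), and `jury_scores` is a hypothetical name.

```python
from collections import Counter


def jury_scores(profiles):
    """Average agreement of each model's 1D profile with all other models.

    Counts are built once per position, so the cost is O(N * L) for
    N models of length L, with no explicit pair of models ever compared.
    """
    n = len(profiles)
    if n < 2:
        raise ValueError("need at least two models")
    length = len(profiles[0])
    if any(len(p) != length for p in profiles):
        raise ValueError("all profiles must have the same length")
    # counts[pos][state] = number of models showing `state` at `pos`.
    counts = [Counter(p[pos] for p in profiles) for pos in range(length)]
    scores = []
    for p in profiles:
        # Subtract 1 at every position so a model does not vote for itself.
        agree = sum(counts[pos][p[pos]] - 1 for pos in range(length))
        scores.append(agree / ((n - 1) * length))
    return scores
```

Each score equals the model's mean per-residue identity to every other model, which is the same number an all-against-all comparison would give, obtained without the quadratic loop.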
Dong Xu, MUFOLD
• Mini-threading, Use structural fragments of
all sizes and broad confidence levels
• Build distance constraints: distance
matrix -> MDScaling
• Z-score for assessment: Opus energy,
modelevaluator score, rapdf score, dfire,
hopp score, geometric score, DOPE
• MESHI: ranked 2nd in FM in CASP9
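One way to combine several scoring functions through Z-scores is sketched below. The slides do not say how MUFOLD weights or combines the individual scores, so the unweighted average, the sign convention, and the names `z_scores` and `consensus_rank` are assumptions for illustration only.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of scores; a constant list maps to all zeros."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def consensus_rank(score_table, lower_is_better):
    """Rank models by the mean Z-score over several scoring functions.

    `score_table` maps a scoring-function name to one score per model.
    `lower_is_better` maps the same names to True for energy-like scores,
    whose Z-scores are negated so that higher always means better.
    Returns (model indices from best to worst, combined Z-scores).
    """
    n_models = len(next(iter(score_table.values())))
    combined = [0.0] * n_models
    for name, values in score_table.items():
        sign = -1.0 if lower_is_better[name] else 1.0
        for i, z in enumerate(z_scores(values)):
            combined[i] += sign * z
    combined = [c / len(score_table) for c in combined]
    order = sorted(range(n_models), key=lambda i: combined[i], reverse=True)
    return order, combined
```

Standardizing first puts energies and quality scores with very different ranges on a common scale, so no single function dominates the ranking just because of its units.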
• Developed a database with predicted protein domains for five plant
proteomes (http://pfp.bio.nyu.edu) and used both protein structural fold
recognition and de novo Rosetta-based protein structure prediction to
predict protein structure for Arabidopsis and rice proteins.
• Based on sequence similarity, we have identified ~15,000
orthologous/paralogous protein family clusters among these species and
used codon-based models to predict positive selection in protein evolution
within 175 of these sequence clusters.
• Results show that codons that display positive selection appear to be less
frequent in helical and strand regions and are overrepresented in amino acid
residues that are associated with a change in protein secondary structure.
• Show that the performance of three mirrortree-related methodologies
depends on the set of organisms used for building the trees, and it is not
always directly related to the number of organisms in a simple way.
• Certain subsets of organisms seem to be more suitable for the predictions
of certain types of interactions.
• This relationship between type of interaction and optimal set of organisms for
detecting them makes sense in the light of the phylogenetic distribution of
the organisms and the nature of the interactions.
• We present a knowledge-based function to score protein decoys based on
their similarity to native structure.
• A set of features is constructed to describe the structure and sequence of
the entire protein chain.
• The features we use are associated with residue–residue distances,
residue–solvent distances, pairwise knowledge-based potentials and a
four-body potential.
• This new approach enables us to obtain information both from decoys and
from native structures.