- Cal State LA - Instructional Web Server



An analysis of
pdb-care (PDB CArbohydrate REsidue check): a
program to support annotation of complex
carbohydrate structures in PDB files
by Thomas Lütteke
and Claus-W von der Lieth
By David Chapman
Background


Protein Data Bank includes 3-D data for
carbohydrate structures as well as amino acid
structures
3-D data for protein / carbohydrate interactions is
obtained through X-ray crystallography and
nuclear magnetic resonance
 The absence of 3-D glycan data in PDB does
not necessarily mean a potential glycosylation
site is unoccupied
Background
The crystallography may have been done on
proteins expressed from plasmids, which may not
have the same carbohydrates attached as the
human form.
 Glycosylation usually occurs at asparagine
residues in Asn-X-Ser/Thr sequons, where X
is any amino acid except proline
 Approximately 30% of all 1663 PDB entries
(Sep 2003) containing carbohydrates contain
errors in glycan description
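
The Asn-X-Ser/Thr sequon rule above can be sketched as a short pattern search. This is only an illustration of the rule, not part of pdb-care; the function name `find_sequons` and the example sequence are made up, and a matching sequon marks a potential site, not an occupied one.

```python
import re

# N-glycosylation sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after the Asn, so overlapping
# sequons (e.g. "NNST") are both reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Made-up one-letter sequence, for illustration only.
example = "MKNGSAANPTLLNVTQ"
print(find_sequons(example))  # prints [3, 13]; the Asn at 8 is followed by Pro
```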

Biological Significance

Protein / Carbohydrate interactions are
important because they are involved in a
variety of biological processes
 Fertilization
 Embryonic development
 Cellular differentiation
Background

The high error rate in PDB glycan descriptions is mainly due to
incorrect assignment of saccharide units
 Sequences for complex carbohydrates differ
significantly from single letter amino acid sequences
 The number of naturally occurring residues is much
larger for carbohydrates
 Each pair of monosaccharide residues can be linked in
several ways
 A residue can be connected to three or four others
(branching)
Background
Unlike amino acids, carbohydrates have no
one-letter code; they use three-letter codes,
which are defined in the HET dictionary in PDB
 A new residue name is required for each
stereochemically different sugar unit
 This makes the correct assignment
complicated, tedious and error prone

Background


Examples of Definitions of carbohydrate
residues:

AGC  alpha-D-Glucopyranose
BGC  beta-D-Glucopyranose
FCA  alpha-D-Fucose
FCB  beta-D-Fucose

There are more than 200 carbohydrate residues
used in PDB
Implementation


Pdb-care is based on the pdb2linucs carbohydrate detection
program
 Pdb2linucs is able to identify and assign carbohydrate
structures using only the reported atom types and their
3D coordinates
 The program output is in LINUCS notation and is used
to normalize complex carbohydrate structures
Pdb-care uses a translation table built in XML in order to
compare the LINUCS notation from pdb2linucs to the
residue assignments in the PDB group dictionary
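
The comparison step can be sketched roughly as follows. The XML layout, the attribute names `pdb` and `name`, and the functions `load_table` and `check_residue` are all assumptions made for illustration; the slides do not describe the real pdb-care table format. The `detected_name` argument stands in for whatever pdb2linucs derives from the coordinates and is not real LINUCS notation.

```python
import xml.etree.ElementTree as ET

# Hypothetical table layout; the actual pdb-care XML schema is not shown
# in the source. Entries reuse the residue examples from the slides.
TABLE_XML = """
<translation_table>
  <residue pdb="AGC" name="alpha-D-Glucopyranose"/>
  <residue pdb="BGC" name="beta-D-Glucopyranose"/>
  <residue pdb="FCA" name="alpha-D-Fucose"/>
  <residue pdb="FCB" name="beta-D-Fucose"/>
</translation_table>
"""


def load_table(xml_text):
    """Map each three-letter PDB residue code to its sugar name."""
    root = ET.fromstring(xml_text.strip())
    return {r.get("pdb"): r.get("name") for r in root.iter("residue")}


def check_residue(table, pdb_code, detected_name):
    """Compare the residue code used in the file with the detected sugar."""
    expected = table.get(pdb_code)
    if expected is None:
        return "unknown residue code"
    if expected.lower() == detected_name.lower():
        return "ok"
    return f"mismatch: {pdb_code} means {expected}, detected {detected_name}"


table = load_table(TABLE_XML)
print(check_residue(table, "BGC", "beta-D-Glucopyranose"))
print(check_residue(table, "AGC", "beta-D-Glucopyranose"))
```

A plain dictionary lookup is enough here because each residue code maps to exactly one stereochemically distinct sugar unit, which is also why a wrong code is easy to assign by hand.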
Implementation
The translation table contains:
 141 monosaccharides
 31 oligosaccharides
 77 combined residues
 Pdb-care was written in the C language
 Front end is a web interface implemented in
PHP

Implementation
The pdb-care web interface accepts input in
three ways: pasting the text of a PDB file,
uploading a file from a local hard drive, or
entering a PDB ID
 The pdb-care protocol reports the type of
problems, inconsistencies and errors
detected
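
To make the input side concrete, here is a minimal sketch of listing the hetero residues named in pasted PDB text. It relies only on the fixed-column PDB format (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26). It is not how pdb-care detects carbohydrates, since pdb2linucs works from atom types and coordinates; it only shows which residue names an annotation check would have to verify. The function name `hetero_residues` is hypothetical.

```python
def hetero_residues(pdb_text, skip=("HOH",)):
    """Return sorted unique (residue name, chain ID, residue number)
    tuples taken from the HETATM records of a PDB file given as text."""
    found = set()
    for line in pdb_text.splitlines():
        # Ignore everything except HETATM records long enough to parse.
        if not line.startswith("HETATM") or len(line) < 26:
            continue
        res_name = line[17:20].strip()  # columns 18-20
        chain_id = line[21].strip()     # column 22
        res_seq = line[22:26].strip()   # columns 23-26
        if res_name in skip:            # water is skipped by default
            continue
        found.add((res_name, chain_id, res_seq))
    return sorted(found)
```

Calling `hetero_residues(text)` on the contents of a PDB file returns one tuple per hetero residue, which could then be checked against the residue dictionary.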

Program Example

pdb-care examples
Conclusion


The authors made relevant points regarding the
biological significance of protein-carbohydrate
interactions and the need for accurate glycan
residue information in PDB.
However, the authors did not go into detail
regarding the actual implementation of the
translation table used in pdb-care, so it is difficult
to judge the accuracy of their program.