Glycobiology Advance Access published online on May 9, 2007
Glycobiology, doi:10.1093/glycob/cwm050
NetCGlyc 1.0: Prediction of mammalian C-mannosylation sites
Department of Medical Biochemistry and Biophysics, Karolinska Institutet, SE-171 77 Stockholm, Sweden
Stockholm Bioinformatics Center, Albanova, Stockholm University, SE-106 91 Stockholm, Sweden
1 To whom correspondence should be addressed; email: karin.julenius{at}sbc.su.se; fax: +46 8 313 445; tel: +46 8 524 86 976
Received on March 21, 2007; revised on May 3, 2007; accepted on May 5, 2007
C-mannosylation is the attachment of an α-mannopyranose to a tryptophan via a C-C link. The sequence WXXW, in which the first Trp becomes mannosylated, has been suggested as a consensus motif for the modification, but only two thirds of known sites follow this rule. We gathered a data set of 69 experimentally verified C-mannosylation sites from the literature. We analyzed these for sequence context and found that, in addition to Trp, Cys is accepted in position +3. We also found a clear preference in position +1, where a small and/or polar residue (Ser, Ala, Gly, or Thr) is preferred, whereas Phe and Leu are disfavored. The Protein Data Bank was searched for structural information, and five structures of C-mannosylated proteins were obtained. We showed that modified tryptophan residues are at least partly solvent-exposed. A method for predicting the location of C-mannosylation sites in proteins was developed using a neural network approach. The best overall network used a 21-residue sequence input window plus information on the presence or absence of the WXXW motif. NetCGlyc 1.0 correctly predicts 93% of both positive and negative C-mannosylation sites. This is a significant improvement over the WXXW consensus motif itself, which identifies only 67% of positive sites. NetCGlyc 1.0 is available at http://www.cbs.dtu.dk/services/NetCGlyc/. Using NetCGlyc 1.0, we scanned the human genome and found 2573 exported or transmembrane transcripts with at least one predicted C-mannosylation site.
Key words: machine learning / neural networks / C-mannosylation / prediction
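The two sequence-derived inputs named in the abstract, a 21-residue window around each candidate tryptophan and a flag for the WXXW motif, can be illustrated with a short Python sketch. This is not the NetCGlyc implementation and contains no neural network; it only shows how such inputs might be extracted from a protein sequence. The names scan_tryptophans and TrpSite, the "X" padding symbol used beyond the sequence termini, and the separate flag for Cys in position +3 are assumptions made for illustration.

```python
from typing import List, NamedTuple

WINDOW = 21          # window size reported in the abstract
HALF = WINDOW // 2   # 10 residues on each side of the central Trp
PAD = "X"            # assumed padding symbol for positions beyond the termini


class TrpSite(NamedTuple):
    position: int    # 1-based position of the candidate Trp
    window: str      # 21-residue context centred on the Trp
    has_wxxw: bool   # Trp in position +3 (the WXXW consensus)
    has_wxxc: bool   # Cys in position +3 (the alternative noted in the abstract)


def scan_tryptophans(seq: str) -> List[TrpSite]:
    """Return one record per Trp in seq, with its window and motif flags."""
    seq = seq.upper()
    # Pad both ends so that every Trp gets a full-length window.
    padded = PAD * HALF + seq + PAD * HALF
    sites = []
    for i, residue in enumerate(seq):
        if residue != "W":
            continue
        # Residue three positions downstream, or "" if past the C-terminus.
        plus3 = seq[i + 3] if i + 3 < len(seq) else ""
        sites.append(
            TrpSite(
                position=i + 1,
                window=padded[i:i + WINDOW],
                has_wxxw=(plus3 == "W"),
                has_wxxc=(plus3 == "C"),
            )
        )
    return sites
```

Calling scan_tryptophans("AWSPWSEC") on this made-up peptide returns two candidate sites, at positions 2 and 5. The first carries the WXXW flag and the second the Cys-at-+3 flag. A motif flag alone corresponds to the consensus rule that the abstract reports as identifying only 67% of positive sites; a trained predictor would combine it with the window contents.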