ASSP
A Program for assigning Secondary Structures in proteins

 ASSP - Validation of Results:

Dataset Preparation:

  • Preparation of the training dataset involved the following steps:
    • A total of 1195 representative folds were downloaded from the ASTRAL-1.75 release database.
    • All protein folds with an ASTRAL-SPACI score < 0.4 (266 in total) were discarded.
    • Since the number of π-helices was very small, the dataset was enriched by adding 79 protein chains with 30% sequence similarity.
    • The final dataset consists of 1008 protein chains containing 5465 α-, 2340 3₁₀- and 46 π-helices assigned in common by STRIDE and DSSP.
  • Preparation of the test dataset involved the following steps:
    • The performance of ASSP and other algorithms was compared and analyzed on the four datasets considered by Martin et al. (2005). The composition of the datasets is given below:
                             HRes    MRes       LRes    NMR
      Resolution (Å)         < 1.7   1.7 - 3.0  > 3.0   -
      R-Factor               < 0.19  < 0.30     > 0.30  -
      Sequence Identity (%)  < 30    < 30       < 30    < 30
      Total Structures       689     624        332     296
    • For the analysis of π-helices, a dataset of 85 protein chains used by Fodje et al. was considered. These protein chains were shown to contain π-helices; the authors confirmed the presence of 104 π-helices by visual inspection.
    • The performance of ASSP was also checked on proteins with 100% sequence identity that were solved by different methods (X-ray, NMR and EM).
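
The dataset bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline actually used: the function names (`filter_by_spaci`, `training_set_size`, `resolution_class`), the `"spaci"` record field, the treatment of boundary values, and the rule that both the resolution and the R-factor criterion must hold are all hypothetical. Only the thresholds and counts are taken from the text.

```python
from typing import Optional

SPACI_CUTOFF = 0.4  # folds scoring below this were discarded (from the text)


def filter_by_spaci(folds: list, cutoff: float = SPACI_CUTOFF) -> list:
    """Keep fold records whose (hypothetical) 'spaci' field is >= cutoff."""
    return [fold for fold in folds if fold["spaci"] >= cutoff]


def training_set_size(downloaded: int, discarded: int, added: int) -> int:
    """Chains remaining after filtering, plus chains added for enrichment."""
    return downloaded - discarded + added


def resolution_class(method: str,
                     resolution: Optional[float] = None,
                     r_factor: Optional[float] = None) -> str:
    """Assign a structure to HRes / MRes / LRes / NMR.

    Assumption: both criteria of a class must be satisfied; anything
    else is reported as 'unclassified'.
    """
    if method.upper() == "NMR":
        return "NMR"
    if resolution is None or r_factor is None:
        return "unclassified"
    if resolution < 1.7 and r_factor < 0.19:
        return "HRes"
    if 1.7 <= resolution <= 3.0 and r_factor < 0.30:
        return "MRes"
    if resolution > 3.0 and r_factor > 0.30:
        return "LRes"
    return "unclassified"


# Counts quoted in the text: 1195 downloaded, 266 discarded, 79 added.
print(training_set_size(1195, 266, 79))  # 1008
```

The final count reproduces the 1008 chains quoted for the training dataset.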

Analysis and comparison of ASSP results with different algorithms:

  • On training dataset
             DSSP         KAKSI        PALSSE       SST          STRIDE       XTLSSTR
    ASSP     90.4 (93.8)  80.2 (96.7)  63.7 (99.1)  83.3 (89.6)  90.2 (92.5)  85.8 (85.4)
    DSSP     -            97.9         66.7         85.2         93.9         88.3
    KAKSI                 -            78.6         94.1         83.4         93.1
    PALSSE                             -            98.1         98.9         97.3
    SST                                             -            86.7         82.9
    STRIDE                                                       -            85.2
    XTLSSTR                                                                   -

    Table1: Percentage agreement between different algorithms. Each cell in the upper triangular matrix gives the residue-level % agreement between a pair of algorithms, with the algorithm in the column taken as reference. Values in parentheses give the agreement between ASSP and the other algorithm with ASSP taken as reference. Only α-helices were considered.
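
A residue-level agreement of this kind depends on which algorithm is taken as reference, which is why the ASSP row carries two values per cell. The sketch below shows one plausible definition: the share of the reference's helical residues that the other algorithm also calls helical. The function name `helix_agreement`, the one-letter-per-residue string encoding, and the definition itself are assumptions for illustration; the page does not spell out the exact formula.

```python
def helix_agreement(assignment: str, reference: str, helix_code: str = "H") -> float:
    """Percentage of helical residues in `reference` that are also
    helical in `assignment` (one character per residue)."""
    if len(assignment) != len(reference):
        raise ValueError("assignments must cover the same residues")
    ref_positions = [i for i, code in enumerate(reference) if code == helix_code]
    if not ref_positions:
        raise ValueError("reference contains no helical residues")
    shared = sum(1 for i in ref_positions if assignment[i] == helix_code)
    return 100.0 * shared / len(ref_positions)


# Made-up assignments for a 12-residue stretch (illustrative only).
method_a = "--HHHHHHHH--"  # 8 helical residues
method_b = "---HHHHHH---"  # 6 helical residues, all inside method_a's helix

print(helix_agreement(method_a, reference=method_b))  # 100.0
print(helix_agreement(method_b, reference=method_a))  # 75.0
```

The two calls give different numbers for the same pair of assignments, so the measure is asymmetric: a method that assigns longer helices scores high when the shorter assignment is the reference, and lower the other way round.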

             ASSP         DSSP         KAKSI        PALSSE       PROSS        SST          STRIDE       XTLSSTR
    C (%)    2331 (63.3)  2338 (60.4)  2674 (56.2)  1265 (27.1)  2433 (61.0)  2180 (54.8)  2283 (58.7)  2288 (58.6)
    K (%)    123 (3.3)    264 (6.8)    564 (11.8)   2579 (55.3)  324 (8.1)    647 (16.3)   349 (9.0)    1254 (32.1)
    L (%)    1223 (33.2)  1265 (32.7)  1512 (31.8)  812 (17.4)   1229 (30.8)  1147 (28.8)  1251 (32.1)  355 (9.1)
    U (%)    7 (0.2)      6 (0.2)      11 (0.2)     7 (0.1)      5 (0.1)      6 (0.1)      9 (0.2)      8 (0.2)
    Total    3684         3873         4761         4663         3991         3980         3892         3905

    Table2: HELANAL-Plus geometry analysed on our training dataset. α-helices of length > 8 residues identified by all programs were considered. C (%), K (%), L (%) and U (%) are the number and percentage of helices classified by HELANAL-Plus as curved, kinked, linear and undefined. Total is the total number of α-helices of length > 8 residues.
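
The bookkeeping behind Table 2 can be illustrated as follows: pick out α-helices longer than 8 residues from a per-residue assignment, then convert per-class counts into percentages of the total. This is a sketch only. The function names, the string encoding with `H` for α-helix, and the simplification that adjacent helices are not distinguished are assumptions; the geometric classification itself is done by HELANAL-Plus and is not reproduced here.

```python
from itertools import groupby


def helix_segments(assignment: str, helix_code: str = "H", min_length: int = 9) -> list:
    """Return (start, end) index pairs, end exclusive, of helix runs with at
    least `min_length` residues (9 corresponds to 'length > 8')."""
    segments = []
    position = 0
    for code, run in groupby(assignment):
        length = len(list(run))
        if code == helix_code and length >= min_length:
            segments.append((position, position + length))
        position += length
    return segments


def class_percentages(counts: dict) -> dict:
    """Convert per-class helix counts into percentages of the total."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


# ASSP column of Table 2: curved, kinked, linear, undefined.
assp_counts = {"C": 2331, "K": 123, "L": 1223, "U": 7}
print(sum(assp_counts.values()))       # 3684
print(class_percentages(assp_counts))  # {'C': 63.3, 'K': 3.3, 'L': 33.2, 'U': 0.2}
```

The computed percentages and total match the ASSP column of the table.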

  • On test dataset
            HRes              MRes              LRes              NMR
            %α    %3₁₀  %π    %α    %3₁₀  %π    %α    %3₁₀  %π    %α    %3₁₀  %π
    ASSP    35.2  3.5   0.9   35.7  3.4   0.8   32.1  3.6   0.8   32.5  3.7   0.8
    DSSP    35.3  4.8   0.02  36.1  4.2   0.02  33.7  3.3   0.03  33.7  1.6   0.04
    KAKSI   36.4              38                35.1              32.2
    PALSSE  57.6              57.3              54.6              54.6
    SST     34.8  1.5   0.5   35.3  1.5   0.4   33.1  1.7   0.01  30.8  2.2   0.5
    STRIDE  36.5  5     0.01  37.3  4.4   0.01  34.3  3.5   0.3   35.1  2     0

    Table3: Percentage of α, 3₁₀ and π helical content assigned by different algorithms. KAKSI and PALSSE report only a single helix class, which includes all α, 3₁₀ and π helices; hence only one value is given for each dataset. %α, %3₁₀ and %π are the percentages of residues constituting α, 3₁₀ and π helices, respectively.
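
Helical content as a percentage of residues can be computed directly from a per-residue assignment. The sketch below assumes DSSP-style one-letter codes (`H` for α-helix, `G` for 3₁₀-helix, `I` for π-helix); whether the percentages in Table 3 were pooled over all residues of a dataset or averaged per chain is not stated, and the function name `helical_content` is hypothetical.

```python
from collections import Counter

# DSSP-style codes; other programs may use different letters.
HELIX_CODES = {"H": "alpha", "G": "3-10", "I": "pi"}


def helical_content(assignment: str) -> dict:
    """Percentage of residues in each helix type for one assignment string."""
    if not assignment:
        raise ValueError("empty assignment")
    counts = Counter(assignment)
    total = len(assignment)
    return {name: 100.0 * counts[code] / total for code, name in HELIX_CODES.items()}


# Made-up 10-residue assignment (illustrative only).
print(helical_content("HHHHGGGII-"))  # {'alpha': 40.0, '3-10': 30.0, 'pi': 20.0}
```

For programs that report a single helix class, the three values would simply be summed into one figure.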

            DSSP         KAKSI        PALSSE       SST          STRIDE
    ASSP    90.4 (93.4)  81.3 (95.7)  61.3 (99.9)  83.7 (88.4)  89.3 (95.4)
    DSSP    -            83.2         63.3         86.4         94
    KAKSI                -            64.3         94.1         83.5
    PALSSE                            -            99.3         99.9
    SST                                            -            85.4
    STRIDE                                                      -

    Table4: Percentage agreement between different algorithms for the HRes dataset. The arrangement is the same as in Table 1. Only α-helices were considered.

                          ASSP          DSSP-PPII     PROSS         SEGNO         XTLSSTR
    Twist (°)             237.6 (9.2)   224.4 (44)    231.3 (35.9)  234.6 (27.8)  234.4 (33.7)
    Rise per Residue (Å)  3.0 (0.1)     2.8 (0.5)     2.9 (0.4)     2.9 (0.4)     2.9 (0.5)
    Radius (Å)            1.3 (0.1)     .4 (0.4)      1.4 (0.3)     1.4 (0.3)     1.3 (0.4)

    Table5: Mean (std) values of twist, rise per residue and radius for the PPII helices assigned by different algorithms.
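
Each cell in Table 5 is a mean with its standard deviation in parentheses. A small helper of the following kind would produce that format from per-helix values. The name `mean_std` and the input values are made up for illustration, and the use of the sample standard deviation rather than the population one is an assumption, since the page does not say which was used.

```python
from statistics import mean, stdev


def mean_std(values: list, digits: int = 1) -> str:
    """Format values as 'mean (std)' using the sample standard deviation."""
    return f"{mean(values):.{digits}f} ({stdev(values):.{digits}f})"


# Hypothetical twist angles in degrees for four PPII helices.
twists = [230.0, 240.0, 235.0, 245.0]
print(mean_std(twists))  # 237.5 (6.5)
```

A smaller spread for one program means its PPII helices are more uniform in that parameter.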

    Sl. No.  Hres    Res (Å)  Agreement  Lres    Res (Å)  Agreement  NMR     Agreement  EM      Agreement
    1        1mms:A  2.57     -          1giy:L  5.5      100        1oln:A  100        1eg0:K  100
    2        1ya7:D  2.3      -          1pma:D  3.4      84.7       2ku1:A  84.7       3c91:D  81.9
    3        4j9z:R  1.66     -          1a29:A  2.74     86.5       1cfc:A  94.6       3j41:E  82.4
    4        132l:A  1.8      -          2znw:Y  2.71     96.2       1e8l:A  92.3       4a8a:M  88.5
    5        3h47:A  1.9      -          3p0a:A  5.95     97.5       2lf4:A  92.6       1vu4:0  92.6

    Table6: Agreement of ASSP assignments for the same protein structure solved by different methods or at different resolutions. PDB IDs under the Hres column are the best-resolved X-ray structures, while those under the Lres column are comparatively lower-resolution ones. Only the part of the sequence common to the Hres, Lres, NMR and EM structures was considered for comparison.
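
A comparison restricted to the common part of the sequence can be sketched as below. The example assumes each structure's assignment is available as a mapping from residue number to a one-letter state, that residue numbering is consistent across structures (in practice a sequence alignment may be needed), and that agreement means the percentage of common residues with an identical state. These assumptions and the name `agreement_with_reference` are illustrative and not taken from the page.

```python
def agreement_with_reference(reference: dict, others: dict) -> dict:
    """Percentage of residues, among those present in every structure,
    whose assigned state matches the reference structure."""
    common = set(reference)
    for assignment in others.values():
        common &= set(assignment)
    if not common:
        raise ValueError("structures share no residues")
    result = {}
    for name, assignment in others.items():
        same = sum(1 for res in common if assignment[res] == reference[res])
        result[name] = 100.0 * same / len(common)
    return result


# Made-up assignments keyed by residue number (illustrative only).
hres = {1: "H", 2: "H", 3: "H", 4: "C", 5: "C"}
lres = {2: "H", 3: "H", 4: "H", 5: "C", 6: "C"}
nmr = {1: "H", 2: "H", 3: "H", 4: "C", 5: "C"}

print(agreement_with_reference(hres, {"Lres": lres, "NMR": nmr}))
# {'Lres': 75.0, 'NMR': 100.0}
```

Residues 1 and 6 are ignored because they are not present in every structure, mirroring the restriction to the common sequence described in the caption.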

Questions or problems regarding this web site should be directed to [mb@mbu.iisc.ernet.in].
Copyright © 2010 [Molecular Biophysics Unit,IISC]. All rights reserved.