Flavors of Protein Disorder

  • 3 flavors: V, C, S; distinguishable by amino acid compositions, sequence locations, biological function (pg 573)
  • Disordered protein: lacks specific tertiary structure. Composed of members with distinct and usually dynamic Ramachandran angles (pg 573)
  • Ordered protein: nearly all members have the same canonical set of Ramachandran angles (pg 573)
  • Disordered proteins can be "extended" or "collapsed" with a possible third intermediate state between them (pg 573)
  • Regions of extended disorder can exhibit secondary structure (pg 573)
  • Ordered proteins:
    • structural classes: α-helix, β-sheet, and other (pg 573)
    • protein folding classes: all α, all β, αβ, and α + β (pg 574)
  • The dataset of 145 disordered proteins is best divided into three subsets (pg 577); these groupings and their memberships were not known beforehand (pg 582)
  • All predictors most heavily influenced by the presence of the most hydrophobic residues (pg 578)
  • Extended disorder had a lower sequence complexity, higher net charge, and reduced overall hydrophobicity compared with collapsed disorder, but flavors could not clearly distinguish between the two (pg 580)
  • Estimated disordered protein percentages increased to 26-51% in archaea, 16-45% in eubacteria, and 52-67% in eukaryotes (pg 581), up from 9-37% in archaea, 7-33% in eubacteria, and 35-51% in eukaryotes (cite) (pg 582)
  • Many disordered regions appear to have multiple flavors (pg 583)
  • Future enlargement of database will help with a more accurate estimate of the number of disorder flavors (pg 583)
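
The three sequence properties compared above for extended and collapsed disorder (sequence complexity, net charge, overall hydrophobicity) can be sketched in a few lines. This is only an illustration, not the paper's procedure: it assumes Shannon entropy as the complexity measure, counts K/R as +1 and D/E as -1 for charge (histidine ignored), and uses the Kyte-Doolittle hydropathy scale. All function names are hypothetical.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale; the notes do not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def shannon_entropy(seq):
    """Sequence complexity as Shannon entropy in bits (assumed measure)."""
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def net_charge(seq):
    """Net charge: +1 for K and R, -1 for D and E (histidine ignored)."""
    positive = sum(seq.count(a) for a in "KR")
    negative = sum(seq.count(a) for a in "DE")
    return positive - negative

def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy; expects the 20 standard residues."""
    return sum(KD[a] for a in seq) / len(seq)
```

Under these assumptions, a region of extended disorder would be expected to show lower `shannon_entropy`, larger absolute `net_charge`, and lower `mean_hydropathy` than a region of collapsed disorder.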

1 Statistical test for amino acid distributions (pg 579)

Protein sequences can be considered as random samples of amino acids taken from a distribution. To test whether two samples $S_1$ and $S_2$ are from the same distribution:

$$D = \frac{n_1 n_2}{n_1 + n_2} \sum_{j=1}^{20} \frac{(P_j^{(1)} - P_j^{(2)})^2}{P_j^{(12)}}$$

  • $n_1$ and $n_2$: number of examples in each of the two samples
  • $P_j^{(1)}$, $P_j^{(2)}$, $P_j^{(12)}$: frequencies of amino acid $j$ in the two samples and in the joint sample, $S_1 + S_2$
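
A minimal sketch of how the D statistic might be computed is given here. It assumes that each sample is a list of protein sequences and that the "examples" counted by n_1 and n_2 are individual residues; the notes do not spell this out. The names `composition` and `d_statistic` are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (j = 1..20)

def composition(sequences):
    """Return (per-residue counts, total residue count) for a list of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(c for c in seq.upper() if c in AMINO_ACIDS)
    return counts, sum(counts.values())

def d_statistic(sample1, sample2):
    """D = n1*n2/(n1+n2) * sum_j (P1_j - P2_j)^2 / P12_j."""
    c1, n1 = composition(sample1)
    c2, n2 = composition(sample2)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must contain at least one residue")
    total = 0.0
    for aa in AMINO_ACIDS:
        p1 = c1[aa] / n1                       # frequency in sample 1
        p2 = c2[aa] / n2                       # frequency in sample 2
        p12 = (c1[aa] + c2[aa]) / (n1 + n2)    # frequency in the joint sample
        if p12 > 0:                            # residues absent from both samples add nothing
            total += (p1 - p2) ** 2 / p12
    return n1 * n2 / (n1 + n2) * total
```

Two samples with identical compositions give D = 0, and D grows as the compositions diverge.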


To check that the D-values were significant, the 145 disordered proteins were randomly partitioned into 3 subsets 1,000 times, and the D statistic was calculated for each partition.
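
The random-partition check can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact procedure: it keeps the random subset sizes equal to the observed group sizes and reports, for each pair of groups, the fraction of random partitions whose statistic is at least as large as the observed one. The function names and the `statistic` callback are hypothetical.

```python
import random
from itertools import combinations

def random_partition(items, sizes, rng):
    """Shuffle items and split them into consecutive subsets of the given sizes."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    parts, start = [], 0
    for size in sizes:
        parts.append(shuffled[start:start + size])
        start += size
    return parts

def permutation_p_values(groups, statistic, n_partitions=1000, seed=0):
    """Empirical p-value per pair of groups.

    groups: list of lists (e.g. three lists of sequences).
    statistic: any function taking two samples and returning a number.
    """
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]          # assumption: sizes are preserved
    pooled = [x for g in groups for x in g]
    pairs = list(combinations(range(len(groups)), 2))
    observed = {(i, j): statistic(groups[i], groups[j]) for i, j in pairs}
    exceed = {pair: 0 for pair in pairs}
    for _ in range(n_partitions):
        parts = random_partition(pooled, sizes, rng)
        for i, j in pairs:
            if statistic(parts[i], parts[j]) >= observed[(i, j)]:
                exceed[(i, j)] += 1
    return {pair: exceed[pair] / n_partitions for pair in pairs}
```

Passing a two-sample function that implements the D statistic defined above as `statistic` gives one empirical p-value for each pair of the three subsets; small values indicate that the observed differences are unlikely to arise from a random grouping.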

2 Flavors

  • All flavors different from ordered proteins (pg 579)
  • NMR characterized disorder evenly distributed between 3 flavors (pg 580)

2.1 Flavor C

  • More histidine (H), methionine (M), and alanine (A) than found in ordered or other two flavors (pg 579)
  • Differs more from flavors V and S than they differ from each other (pg 579)
  • Few eubacteria (2/18) biased towards C (pg 582)

2.2 Flavor V

  • More of the least flexible amino acids (C, F, I, Y) than other disorder flavors (pg 579)
  • More similar to S than C, except by sequence complexity (pg 579)
  • More CD and proteolysis characterized disorder (pg 580)
  • Disordered proteins that bind to the genomic RNA of viruses not in V (pg 580)
  • Few disordered proteins that bind to DNA in V (pg 580)
  • Predicted in less than 5% of the 5 bacterial genomes (pg 581)
  • Archaea biased towards V (pg 582)
  • Some eubacteria (4/18) biased towards V (pg 582)

2.3 Flavor S

  • Less histidine (H) than other flavors or ordered protein (pg 579)
  • More similar to V than C, except by sequence complexity (pg 579)
  • More X-ray disorder (pg 580)
  • Half of the disordered regions that bind to other proteins are in S (pg 580)
  • Over 40% of eukaryotic proteins estimated to have a long S region (pg 581)
  • Most eubacteria (12/18) biased towards S (pg 582)