Intrinsically unstructured proteins evolve by repeat expansion

Main point: the genetic instability of repetitive regions combined with the structurally and functionally permissive nature of unstructured proteins powers the extension and possible functional expansion of disordered proteins

Note: The paper uses the term "intrinsically unstructured protein" (IUP), but these notes will use the term disordered protein


  • The functions fulfilled by disordered proteins depend on the lack of structure, so disorder is advantageous and evolutionarily stable for these proteins (pg 847)
  • The fraction of the genome encoding disordered proteins increases with organism complexity (pg 847)
  • Paper discussion: compositionally biased and low-complexity regions, often found in disordered proteins, evolve rapidly by repeat expansion, replication slippage, and substitution mutations; these repetitive segments are functionally indispensable, and their three alternative evolutionary routes of expansion are probably driven by positive selection

1 Tandem repeats

  • Tandem arrays of sequence repeats are abundant in all genomes studied so far
  • Classified by length of repeat unit (pg 847):
    • Satellite (several thousand bp)
    • Minisatellite (10-1000 bp)
    • Microsatellite (1-10 bp)
  • Evolve rapidly through mitotic replication or meiotic recombination events (gene conversion and unequal crossover)
  • Coding micro- and minisatellites have a high level of interspecies variation and polymorphism and can rapidly generate new functional variants
  • Coding microsatellites are implicated mostly in neurodegenerative diseases but can also carry function
  • Coding minisatellites correspond to structural units arranged in tandem and constitute building blocks of stable secondary or supersecondary elements
  • Longer repeats encode domain-like, autonomously folding units
  • A large fraction of disordered proteins contain significant repeat regions, at a frequency exceeding that of any other group of proteins
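
The length-based classification above can be sketched in a few lines of Python. This is only an illustration, not a method from the paper: the function names `classify_repeat_unit` and `find_tandem_repeats` are hypothetical, the cutoffs are taken from the ranges listed in these notes (a 10 bp unit, which the notes place in both classes, is assigned to microsatellites here, and anything above 1000 bp is treated as a satellite), and the regex search only finds perfect, uninterrupted repeats.

```python
import re


def classify_repeat_unit(unit_length_bp):
    """Classify a tandem repeat by the length of its repeat unit (in bp)."""
    if unit_length_bp < 1:
        raise ValueError("repeat unit length must be at least 1 bp")
    # Assumption: the notes list 10 bp under both micro- and minisatellites;
    # this sketch assigns a 10 bp unit to the microsatellite class.
    if unit_length_bp <= 10:
        return "microsatellite"
    # Assumption: "several thousand bp" is not a precise cutoff, so every
    # unit longer than 1000 bp is treated as a satellite here.
    if unit_length_bp <= 1000:
        return "minisatellite"
    return "satellite"


def find_tandem_repeats(seq, min_unit=1, max_unit=10, min_copies=3):
    """Find perfect tandem repeats in a DNA sequence.

    Returns a list of (start, unit, copies, repeat_class) tuples.
    Imperfect (diverged) repeats are not detected by this simple search.
    """
    if min_copies < 2:
        raise ValueError("a tandem repeat needs at least 2 copies")
    # A lazily matched unit followed by at least (min_copies - 1) exact
    # copies of itself; the lazy quantifier prefers the shortest unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies, classify_repeat_unit(len(unit))))
    return hits


if __name__ == "__main__":
    # A CAG trinucleotide repeat flanked by two bases on either side.
    for hit in find_tandem_repeats("TTCAGCAGCAGCAGTT"):
        print(hit)
```

The default `max_unit=10` restricts the search to microsatellite-sized units; raising it lets the same search pick up minisatellite-sized units, at the cost of slower regex matching on long sequences.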

Three different evolutionary types:

  • Type I: regions where repeats generated by tandem duplications do not undergo diversification
  • Type II: regions where repeats diversify due to their differential localization within the sequence or sequential changes, specialize in binding different ligands, and become functionally non-redundant over time
  • Type III: regions where repeats acquire novel functions as a result of expansion

2 Examples

  • Salivary proline-rich proteins (PRP), recognition (type I)
  • Titin PEVK domain, entropic chain (type I)
  • Fibronectin-binding protein A (type I)
  • Involucrin, Q-region, transglutaminase cross-linking (type I)
  • Neurofilament-H, KSP domain, entropic sidearm (type I)
  • Prion protein, octarepeats, copper binding (type III)
  • Sry, transactivator domain, complex assembly (type I or II)
  • Tau protein, imperfect repeats, microtubule binding and polymerization (type I)

3 Conclusion

  • Repeat regions in disordered sequences carry important functions
  • The inherent genetic instability of repetitive regions and the structural/functional permissiveness of disordered proteins encourage rapid and advantageous evolutionary changes

Facts about "Intrinsically unstructured proteins evolve by repeat expansion":

  • Title: Intrinsically unstructured proteins evolve by repeat expansion
  • Author: P. Tompa
  • Published in: BioEssays
  • Date published: 1 September 2003
  • PubMed ID: 12938174
  • Paper topics: Disordered proteins, Protein evolution