The rapid generation of mutation data matrices from protein sequences

1 Existing scoring matrices

  • Unitary Protein Matrix (UPM) -- scores 1 for matching residues and 0 for all other combinations
  • Genetic Code Matrix (GCM) -- scores amino acid similarity by the maximum number of nucleotide bases shared between the closest matching representative codons
  • Structure Genetic Matrix (SGM) -- McLachlan (1972) used a statistical analysis of observed amino acid exchanges, plus transition values for each pair of amino acids based on the number of overlapping physico-chemical properties, to bias the UPM
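
The first two schemes above are simple enough to sketch in a few lines. The following is an illustration only, assuming the standard genetic code and assuming that "common bases" means position-by-position identity between two codons. The names `upm_score` and `gcm_score` are hypothetical and do not come from the original sources.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in the order TTT, TTC, TTA, TTG, TCT, ...
# '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it (stops skipped).
codons = {}
for triple, aa in zip(product(BASES, repeat=3), AMINO):
    if aa != "*":
        codons.setdefault(aa, []).append("".join(triple))


def upm_score(a, b):
    """Unitary Protein Matrix entry: 1 for identical residues, else 0."""
    return 1 if a == b else 0


def gcm_score(a, b):
    """Genetic Code Matrix entry (assumed definition): the largest number
    of positions at which any codon for a matches any codon for b."""
    return max(
        sum(x == y for x, y in zip(c1, c2))
        for c1 in codons[a]
        for c2 in codons[b]
    )
```

Under these assumptions, identical residues score 3 in the GCM, residues one base change apart (such as F and L) score 2, and a pair such as M and Y, whose codons share no position, scores 0.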

2 Construction of raw PAM matrix

  • ideal: compare every present-day sequence with its immediate predecessor (in practice the predecessors are not available, so they have to be inferred)
  • Dayhoff used the common ancestor method: for each closely homologous pair of present-day sequences, a common ancestral sequence was inferred
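
To show how a raw count matrix might be tallied once an ancestor is available, here is a minimal sketch. It assumes the ancestral sequence has already been inferred and aligned to its descendants; the inference step is not shown. The function `count_exchanges` and the toy sequences are hypothetical, and the sketch counts only off-diagonal exchanges, treating each pair as unordered.

```python
from collections import Counter


def count_exchanges(ancestor, descendants):
    """Tally amino acid exchanges between an (assumed, pre-inferred)
    ancestral sequence and each aligned descendant sequence.

    Returns a Counter keyed by the alphabetically sorted residue pair,
    so an A->S exchange and an S->A exchange land in the same cell.
    """
    counts = Counter()
    for seq in descendants:
        if len(seq) != len(ancestor):
            raise ValueError("sequences must be aligned to equal length")
        for a, d in zip(ancestor, seq):
            # Skip gap positions and unchanged residues.
            if a == "-" or d == "-" or a == d:
                continue
            counts[tuple(sorted((a, d)))] += 1
    return counts


# Toy example: one hypothetical ancestor and three descendants.
raw = count_exchanges("ACDE", ["ACDE", "AGDE", "SGDQ"])
```

In this toy case `raw` records two C/G exchanges, one A/S exchange, and one E/Q exchange. A full implementation would accumulate such counts over many sequence families before any normalization.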