BLOSUM


BLOSUM stands for BLOcks SUbstitution Matrix.

BLOSUM matrices generally perform better than the PAM matrices when aligning distantly related sequences.

1 Weighting by clustering

To reduce multiple contributions to amino acid pair frequencies from closely related members of a family, the sequence segments within each block are clustered, and each cluster is weighted as a single sequence when counting pairs. Clusters are formed by grouping together sequence segments at or above a given percent identity. If a segment's percent identity with any segment already in a cluster is equal to or higher than the cutoff, it is included in that cluster.

The number in a BLOSUM matrix's name is the percent identity cutoff used for clustering. For example, BLOSUM80's clusters are composed of sequences at or above 80% identity, and BLOSUM62's clusters are composed of sequences at or above 62% identity.
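
The clustering and weighting described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation. The function names (`percent_identity`, `cluster_segments`, `weighted_pair_counts`) and the toy block are hypothetical. It also assumes that each member of a cluster of size n carries weight 1/n and that pairs are counted only between different clusters, which is one reading of "weighted as a single sequence".

```python
from collections import defaultdict
from itertools import combinations


def percent_identity(a, b):
    """Percent of positions at which two equal-length segments match."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def cluster_segments(segments, cutoff):
    """Group segment indices so that a segment joins a cluster if it
    reaches the cutoff with ANY member (single-linkage clustering)."""
    parent = list(range(len(segments)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(segments)), 2):
        if percent_identity(segments[i], segments[j]) >= cutoff:
            parent[find(i)] = find(j)  # merge the two clusters

    clusters = defaultdict(list)
    for i in range(len(segments)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())


def weighted_pair_counts(segments, cutoff):
    """Count amino acid pairs column by column, with each cluster
    contributing as a single sequence (assumed weight: 1 / cluster size)."""
    clusters = cluster_segments(segments, cutoff)
    weight = {i: 1.0 / len(c) for c in clusters for i in c}
    cluster_of = {i: k for k, c in enumerate(clusters) for i in c}

    counts = defaultdict(float)
    for i, j in combinations(range(len(segments)), 2):
        if cluster_of[i] == cluster_of[j]:
            continue  # assumption: pairs inside a cluster are not counted
        for x, y in zip(segments[i], segments[j]):
            counts[tuple(sorted((x, y)))] += weight[i] * weight[j]
    return dict(counts)


# Toy block of three aligned, ungapped segments (made-up data).
toy_block = ["AVLK", "AVLR", "GIMK"]

print(cluster_segments(toy_block, 62))  # [[0, 1], [2]]
print(cluster_segments(toy_block, 80))  # [[0], [1], [2]]
print(weighted_pair_counts(toy_block, 62))
```

In the toy block the first two segments are 75% identical, so they merge at a cutoff of 62 but stay separate at a cutoff of 80. A lower cutoff therefore merges more segments, so closely related sequences contribute less to the pair counts.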

2 Links

3 References
