Alignment data

1 Iterations

Iterations were run for each class until no value in the scaled log-odds matrix deviated by more than 1 from the corresponding value in the previous iteration's scaled log-odds matrix.

Class Disordered iterations Ordered iterations
85% 3 3
60% banded 4 3
40% banded 3 3
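
The stopping rule described above can be sketched as follows. This is a minimal illustration, not the code used to produce the table: `iterate_until_stable`, `step`, and `max_iterations` are hypothetical names, the matrix is assumed to be a list of rows of numbers, and counting each call to `step` as one iteration is an assumption.

```python
def max_abs_deviation(prev, curr):
    """Largest absolute element-wise difference between two equal-shaped matrices."""
    return max(
        abs(c - p)
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    )


def iterate_until_stable(initial, step, tolerance=1, max_iterations=100):
    """Apply `step` until no entry moves by more than `tolerance`.

    `step` is a hypothetical function that takes the current scaled
    log-odds matrix and returns the next one.
    Returns (final_matrix, iterations_run).
    """
    prev = initial
    for iteration in range(1, max_iterations + 1):
        curr = step(prev)
        if max_abs_deviation(prev, curr) <= tolerance:
            return curr, iteration
        prev = curr
    raise RuntimeError("did not stabilise within max_iterations")


# Toy update rule (integer halving) standing in for the real re-estimation step.
def halve(matrix):
    return [[value // 2 for value in row] for row in matrix]


final, iterations = iterate_until_stable([[8, -8], [4, 0]], halve)
print(iterations, final)
```

The tolerance is applied to the largest single-entry change rather than to an average, which matches the "no value deviated more than 1" wording.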

2 Number of alignments

That ordered and disordered both have 213 families at 85% is a coincidence.

Disordered has fewer alignments per sequence than ordered. The number of disordered families decreases at lower bands (213, 207, 182) rather than increasing. The number of included ordered families, however, increases with banding (213, 224, 242).

Note from previous: the banded classes are not simply the unbanded classes minus the upper echelons, because the unbanded matrix classes will include things like alignments at 85% identity with more than 1 gap. Banding can decrease the number of families present, presumably by dropping families that contain only similar sequences. Banding increases the percent mutation rate and the sum of the off-diagonal entries, and it closes the gap between the ordered and disordered mutation rates.
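
A toy example may make the first point concrete. It assumes that "gap cutoff" means the maximum number of gaps allowed in an alignment (0 for the 85% class and 4 for the 60% classes, as in the gap cutoff table), and that a band excludes its upper identity bound; both are assumptions, and the `Alignment` class, `in_class` helper, and the four alignments are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alignment:
    name: str
    percent_identity: float
    gaps: int


def in_class(aln: Alignment, min_pi: float, max_gaps: int,
             max_pi: Optional[float] = None) -> bool:
    """True if the alignment passes the identity floor, the optional
    identity ceiling (exclusive, by assumption), and the gap limit."""
    if aln.percent_identity < min_pi:
        return False
    if max_pi is not None and aln.percent_identity >= max_pi:
        return False
    return aln.gaps <= max_gaps


# Made-up alignments, not taken from the data set.
alignments = [
    Alignment("a", 90.0, 0),
    Alignment("b", 90.0, 2),   # high identity, but gapped
    Alignment("c", 70.0, 1),
    Alignment("d", 70.0, 6),   # too many gaps for any class
]

class_85 = {a.name for a in alignments if in_class(a, 85, max_gaps=0)}
unbanded_60 = {a.name for a in alignments if in_class(a, 60, max_gaps=4)}
banded_60 = {a.name for a in alignments if in_class(a, 60, max_gaps=4, max_pi=85)}

print("85% class:          ", sorted(class_85))
print("unbanded 60% - 85%: ", sorted(unbanded_60 - class_85))
print("banded 60%:         ", sorted(banded_60))
```

Alignment `b` is too gapped for the 85% class but passes the looser gap limit of the unbanded 60% class, so subtracting the 85% class from the unbanded class does not remove it, while banding does.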

Class Num families Num seqs Num alignments Num substitutions
Ordered 85% 213 3,408 38,117 6,976,696
Ordered 60% banded 224 27,316 3,327,744 590,311,337
Ordered 40% banded 242 52,527 5,949,331 936,880,297
Disordered 85% 213 3,127 29,259 2,284,355
Disordered 60% banded 207 18,883 662,599 50,474,512
Disordered 40% banded 182 31,361 1,738,606 147,793,757
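
The "fewer alignments per sequence" observation can be checked directly from the table. The sketch below copies the sequence and alignment counts from the rows above; the names `COUNTS` and `alignments_per_sequence` are illustrative only.

```python
# (order class, identity class) -> (num seqs, num alignments), from the table above.
COUNTS = {
    ("Ordered", "85%"): (3_408, 38_117),
    ("Ordered", "60% banded"): (27_316, 3_327_744),
    ("Ordered", "40% banded"): (52_527, 5_949_331),
    ("Disordered", "85%"): (3_127, 29_259),
    ("Disordered", "60% banded"): (18_883, 662_599),
    ("Disordered", "40% banded"): (31_361, 1_738_606),
}

BANDS = ("85%", "60% banded", "40% banded")


def alignments_per_sequence(num_seqs, num_alignments):
    """Average number of alignments each sequence takes part in being counted for."""
    return num_alignments / num_seqs


ratios = {key: alignments_per_sequence(*value) for key, value in COUNTS.items()}

for band in BANDS:
    ordered = ratios[("Ordered", band)]
    disordered = ratios[("Disordered", band)]
    print(f"{band:12s} ordered {ordered:8.2f}  disordered {disordered:8.2f}")
```

In every identity class the disordered ratio comes out lower than the ordered one.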

2.1 Gap cutoff effects

This table shows how many alignments are excluded by the maximum gap limit in each class.

Class Total alignments Gap cutoff Included alignments Percent cut off
Ordered 85% 68,397 0 41,551 39.250
Disordered 85% 76,840 0 29,259 61.922
Ordered 60% banded 3,483,138 4 3,327,744 4.461
Disordered 60% banded 750,526 4 662,599 11.715
Ordered 40% banded 8,548,724 4 5,949,331 30.407
Disordered 40% banded 2,417,993 4 1,738,606 28.097
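
The last column is consistent with the share of total alignments that were not included. A short check, using the counts copied from the table (the names `ALIGNMENT_COUNTS` and `percent_cut_off` are illustrative):

```python
# class -> (total alignments, included alignments), copied from the table above.
ALIGNMENT_COUNTS = {
    "Ordered 85%": (68_397, 41_551),
    "Disordered 85%": (76_840, 29_259),
    "Ordered 60% banded": (3_483_138, 3_327_744),
    "Disordered 60% banded": (750_526, 662_599),
    "Ordered 40% banded": (8_548_724, 5_949_331),
    "Disordered 40% banded": (2_417_993, 1_738_606),
}


def percent_cut_off(total, included):
    """Percentage of alignments excluded by the gap limit."""
    return 100.0 * (total - included) / total


for name, (total, included) in ALIGNMENT_COUNTS.items():
    print(f"{name:24s} {percent_cut_off(total, included):7.3f}")
```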

3 Percent identity quantiles

Good news: ordered and disordered now have very similar percent identities (PIs). Gap cutoffs do not change the average PI by much either (at most about 1.3 points in any class), which is also good.

3.1 With gap cutoffs

Quartiles with gap cutoffs
Class Gap cutoff 0% 25% 50% 75% 100% Average
Ordered 85% 0 85.000 88.186 92.063 96.429 99.890 92.285
Disordered 85% 0 85.000 87.879 91.304 94.737 99.932 91.521
Ordered 60% banded 4 60.000 63.953 68.468 74.107 85.000 69.365
Disordered 60% banded 4 60.000 63.636 68.571 75.510 85.000 69.826
Ordered 40% banded 4 40.000 45.033 49.701 54.651 60.000 49.855
Disordered 40% banded 4 40.000 43.704 48.246 53.731 60.000 48.841

3.2 Without gap cutoffs

Quartiles with no gap cutoffs
Class 0% 25% 50% 75% 100% Average
Ordered 85% 85.000 87.143 89.710 94.898 99.890 90.977
Disordered 85% 85.000 87.273 89.815 93.548 99.932 90.510
Ordered 60% banded 60.000 63.799 68.158 73.889 85.000 69.142
Disordered 60% banded 60.000 63.415 68.085 75.000 85.000 69.521
Ordered 40% banded 40.000 43.946 48.168 53.285 60.000 48.771
Disordered 40% banded 40.000 43.704 48.246 53.731 60.000 47.822
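
To see how little the gap cutoffs move the average PI, the Average columns of the two tables can be compared directly. The values below are copied from those columns; `AVERAGE_PI` and `shifts` are illustrative names.

```python
# class -> (average PI with gap cutoffs, average PI without gap cutoffs)
AVERAGE_PI = {
    "Ordered 85%": (92.285, 90.977),
    "Disordered 85%": (91.521, 90.510),
    "Ordered 60% banded": (69.365, 69.142),
    "Disordered 60% banded": (69.826, 69.521),
    "Ordered 40% banded": (49.855, 48.771),
    "Disordered 40% banded": (48.841, 47.822),
}

# Positive shift: the average is higher once gapped alignments are cut off.
shifts = {
    name: with_cutoff - without_cutoff
    for name, (with_cutoff, without_cutoff) in AVERAGE_PI.items()
}

for name, shift in shifts.items():
    print(f"{name:24s} {shift:+.3f}")

largest = max(shifts.values(), key=abs)
print(f"largest shift: {largest:+.3f}")
```

Every shift is positive and small, so in these tables the cutoffs raise the average slightly rather than lowering it.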

4 Percent identity histograms

4.1 85%

Percent identity proportions at 85% identity

4.2 60% banded

Percent identity proportions at 60% identity banded

4.3 40% banded

Percent identity proportions at 40% identity banded

5 Family contributions

Note: families with no alignments in a class are excluded from these charts. In these boxplots, the midline is the median and the box extends from the first to the third quartile. The whiskers extend to the most extreme data point that is no more than a fixed multiple of the interquartile range from the box (conventionally 1.5; not confirmed for these plots).
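
The boxplot elements described above can be computed as in the sketch below. It assumes the conventional whisker multiple of 1.5 and uses the standard library's inclusive quartile method, which may differ slightly from the hinges used by the plotting software; `boxplot_stats` and the sample counts are made up for illustration.

```python
import statistics


def boxplot_stats(values, whisker_range=1.5):
    """Median, quartiles, whisker ends, and outliers for one box.

    whisker_range=1.5 is the conventional multiple of the interquartile
    range; it is an assumption, not a confirmed setting of these charts.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_range * iqr
    high_fence = q3 + whisker_range * iqr
    # Whiskers stop at the most extreme data points inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Made-up per-family counts, for illustration only.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(stats)
```

With these sample values the single very large family falls outside the upper fence, so it is drawn as an outlier and does not stretch the whisker.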

Observations: ordered families have higher median family contributions, so fluctuations in individual families should affect the ordered classes less.

6 Sequences

Number of unique sequences contributed by each family

7 Alignments

Number of alignments contributed by each family

8 Substitutions

Note: include this chart in the results, but not the sequence or alignment charts.


Number of substitutions contributed by each family