PAM250 extrapolations

Note: these PAM250 extrapolations have not been rounded to the nearest integer. They have been done with $10 \log_{10}$ instead of $2 \log_2$ like the other matrices, because Benner uses that C.
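As a rough illustration of how the two scalings relate, the sketch below converts a single log-odds score between the $10 \log_{10}$ and $2 \log_2$ conventions. The function name `rescale_log_odds` and its parameters are hypothetical and are not taken from the scripts used for these tables; the conversion is exact only for unrounded scores.

```python
import math

def rescale_log_odds(score, from_mult=10.0, from_base=10.0,
                     to_mult=2.0, to_base=2.0):
    """Convert a score in from_mult*log_{from_base} units to to_mult*log_{to_base} units."""
    # Undo the multiplier to recover log_{from_base}(odds ratio).
    log_odds = score / from_mult
    # Change of base, then apply the target multiplier.
    return to_mult * log_odds * math.log(from_base) / math.log(to_base)

# Constant factor between the two scalings (about 0.664).
factor = rescale_log_odds(1.0)

# An odds ratio of 2 scores 10*log10(2) in one scaling and 2*log2(2) = 2 in the other.
score_for_odds_two = rescale_log_odds(10.0 * math.log10(2.0))
```

Because the conversion is a single constant factor, sums and differences computed in one scaling can be compared with the other only after rescaling.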

1 PAM250 extrapolation sums

Note by Celeste: Audra, could you interpret this for me? I am still not sure what these values are telling us.


Note by Audra: The sum gives an idea of how unlikely the matrix is overall: the more negative entries there are, and the more extreme they are relative to the number and magnitude of the positive entries, the more negative the sum will be. The absolute sum gives an idea of the "magnitude" of the matrix's departure from random chance, i.e., the more entries lie far from 0, the higher this number is. Dividing it by 210 (the number of entries in the upper triangle of a 20 x 20 matrix, diagonal included) gives the average distance of an entry from 0. Originally, these numbers were used as a way of judging the scale of the PAM250 class differences, since the disordered differences were so much higher than the ordered ones.


Sums are taken over the upper triangle of the extrapolated PAM250 log-odds matrix, including the diagonal.

       Disordered sum   Ordered sum   Disordered absolute sum   Ordered absolute sum
85%    -351             -402          1335                      1331
60%    -90              -326          955                       1203
40%    31               -201          826                       1018
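The sum, absolute sum, and average distance from 0 described in the note above can be sketched as follows. This is a minimal illustration on a toy 3 x 3 symmetric matrix, not the script used to produce the table; the helper names `upper_triangle` and `triangle_sums` are hypothetical.

```python
def upper_triangle(matrix):
    """Return the entries on and above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i, n)]

def triangle_sums(matrix):
    """Return (sum, absolute sum, mean absolute value) over the upper triangle."""
    values = upper_triangle(matrix)
    total = sum(values)
    abs_total = sum(abs(v) for v in values)
    return total, abs_total, abs_total / len(values)

# Toy symmetric log-odds matrix (made-up values, not real PAM250 scores).
toy = [[ 2.0, -1.0, -3.0],
       [-1.0,  4.0,  0.5],
       [-3.0,  0.5,  1.0]]

total, abs_total, mean_abs = triangle_sums(toy)

# For a 20 x 20 amino-acid matrix the upper triangle has 20 * 21 / 2 = 210 entries.
n_entries_20 = len(upper_triangle([[0.0] * 20 for _ in range(20)]))
```

With a real 20 x 20 matrix, the third returned value corresponds to the absolute sum divided by 210.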

2 PAM250 class differences

             Disordered   Ordered
60% - 85%    548.47       424.02
40% - 85%    764.96       569.38
40% - 60%    289.37       230.84

3 PAM250 expected values

Had to check this--expected values should all be negative, and the positive sum for D40 was worrying me.

              Expected value
D85 PAM250    -0.73
O85 PAM250    -0.92
D60 PAM250    -0.31
O60 PAM250    -0.70
D40 PAM250    -0.27
O40 PAM250    -0.49
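For reference, a sketch of the check being described, assuming the usual definition of a substitution matrix's expected score as the sum of f_i * f_j * s_ij over all ordered residue pairs, where f are background frequencies. The page does not state which frequencies were used, so the function `expected_score` and the toy inputs below are illustrative assumptions only.

```python
def expected_score(matrix, freqs):
    """Expected score of a random pair: sum over i, j of freqs[i] * freqs[j] * matrix[i][j]."""
    n = len(matrix)
    return sum(freqs[i] * freqs[j] * matrix[i][j]
               for i in range(n) for j in range(n))

# Toy two-letter alphabet with uniform background frequencies (made-up values).
toy_matrix = [[ 1.0, -2.0],
              [-2.0,  1.0]]
toy_freqs = [0.5, 0.5]

toy_expected = expected_score(toy_matrix, toy_freqs)
```

Note that this weights each entry by how often the pair occurs by chance, so a matrix can have a positive unweighted sum and still have a negative expected value.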

4 Comparison with Genetic Code matrix

The disordered matrix gets much closer to the genetic code matrix as divergence increases, but all of the matrices are still very different from the genetic code matrix.

       Disordered   Ordered   Difference
85%    1088.97      1084.13   4.84
60%    734.81       990.99    -256.18
40%    635.16       819.56    -184.40

For a sense of scale, the differences between the disordered and ordered PAM250s are smaller; that is, the two are more similar to each other than either is to the genetic code matrix:

       PAM250 D - O
85%    502.43
60%    530.30
40%    469.39

Although the difference between order and disorder shows *no* steady increase or decrease from 85% to 40%, the difference between disorder and the genetic code matrix decreases steadily as divergence increases.

This might be because of high diagonal values. The table below gives the nondiagonal differences between the PAM250s and the genetic code matrix (GCM):

       Disordered   Ordered   Difference
85%    898.99       902.84    -3.84
60%    569.15       815.90    -246.74
40%    489.34       658.20    -168.86
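The page does not say how the difference between two matrices is measured. Purely as an illustration of separating out the diagonal, the sketch below assumes the difference is the sum of absolute element-wise differences over the upper triangle; that metric, the function name `matrix_difference`, and the toy values are all assumptions, not the method behind the tables above.

```python
def matrix_difference(a, b, include_diagonal=True):
    """Sum of |a[i][j] - b[i][j]| over the upper triangle (assumed metric)."""
    n = len(a)
    total = 0.0
    for i in range(n):
        # Start at the diagonal, or one past it when the diagonal is excluded.
        start = i if include_diagonal else i + 1
        for j in range(start, n):
            total += abs(a[i][j] - b[i][j])
    return total

# Toy matrices with large diagonal values (made-up numbers).
toy_pam = [[5.0, 1.0],
           [1.0, 5.0]]
toy_gcm = [[1.0, 0.0],
           [0.0, 2.0]]

with_diag = matrix_difference(toy_pam, toy_gcm)
without_diag = matrix_difference(toy_pam, toy_gcm, include_diagonal=False)
diagonal_share = with_diag - without_diag
```

Under this assumed metric the diagonal and nondiagonal contributions simply add, which makes it easy to see how much of a difference comes from high diagonal values.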

4.1 log2 Test

All the GC comparisons had to be done with $10 \log_{10}$ values, but what about when we use the $2 \log_2$ substitution matrices?

       Disordered   Ordered   Difference
85%    291.45       292.53    -1.09
60%    276.59       303.72    -27.13
40%    292.72       288.55    4.17