Finding the information contained in
insulin is straightforward. The math is tedious, but the procedure is at least well defined,
and today insulin contains 189 bits of information. So how much of this information does
insulin require to provide a selective advantage? This question is much more difficult to
answer.
This is where human insight is necessary. Some amino acid side chains
have very similar chemical properties. Others are similar in size. Thus, some amino acid
substitutions should be allowed even if they are not found in the alignment. The groups of
interchangeable amino acids are summarized below, with the shared trait given in parentheses:
Group 1: leucine, isoleucine, valine, alanine, and methionine (do not like water, so they
tend to cluster on the inside of the protein).
Group 2: tyrosine, phenylalanine, and tryptophan (very large amino acids that can
influence protein folding).
Group 3: aspartate and glutamate (acidic - proton donors, like water).
Group 4: histidine, arginine, and lysine (basic - proton acceptors, like water).
Group 5: glutamine and asparagine (uncharged but polar, like water).
Group 6: serine and threonine (like water, tend to be found on the outside of protein).
Group 7: glycine (very small).
Group 8: proline (introduces a bend into the chain).
Group 9: cysteine (cross links peptide chains).
Based on these properties, this chapter proposes the following procedure
to calculate knowledge: if a column in a multiple sequence alignment like figure 4.3
contains only a single amino acid, or if the variation is limited to any one of the above 9
groups, then the column is included in the calculation of molecular knowledge. If
the column contains amino acids from different groups, it is excluded.
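The inclusion rule can be sketched in a few lines of Python. This is only an illustration of the procedure as stated above; the names `GROUPS` and `column_group` are made up for this sketch, and amino acids are written with the three-letter codes used in the tables.

```python
# The nine groups listed above, keyed by group number (three-letter codes).
GROUPS = {
    1: {"leu", "ile", "val", "ala", "met"},
    2: {"tyr", "phe", "trp"},
    3: {"asp", "glu"},
    4: {"his", "arg", "lys"},
    5: {"gln", "asn"},
    6: {"ser", "thr"},
    7: {"gly"},
    8: {"pro"},
    9: {"cys"},
}


def column_group(observed):
    """Return the group number if all observed residues share one group, else None."""
    residues = set(observed)
    if not residues:
        raise ValueError("an alignment column needs at least one residue")
    for number, members in GROUPS.items():
        if residues <= members:
            return number
    return None  # mixed column: excluded from the knowledge calculation


print(column_group(["val", "ile"]))         # 1    (included, like position 19)
print(column_group(["val", "ala", "pro"]))  # None (excluded, like position 3)
```

A gap character such as `-` belongs to no group, so a column containing one is treated as mixed and excluded.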
Furthermore, for the columns included in the molecular knowledge
calculation, all amino acids in the same group must be counted as allowed, whether or not
they are present in the alignment. For example, at position 19 in table 4.2, only isoleucine
and valine are found. But because alanine, methionine, and leucine also belong to group 1,
it is assumed that these amino acids can be substituted at position 19 without destroying
the function of insulin. With this procedure, table 4.2 becomes table 4.4. The parentheses in
table 4.4 mark amino acids that are not present in the multiple sequence alignment
(figure 4.3). The positions that are assigned 0 bits all have amino acids from more than
one of the 9 predefined groups.
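As a rough sketch, the way a table cell is filled in can be written as follows. The helper name `allowed_entry` and the ordering of the parenthesized residues are assumptions of this sketch; the order within a cell carries no meaning.

```python
# The nine groups from this chapter, in a fixed order so the output is reproducible.
GROUPS = [
    ("leu", "ile", "val", "ala", "met"),
    ("tyr", "phe", "trp"),
    ("asp", "glu"),
    ("his", "arg", "lys"),
    ("gln", "asn"),
    ("ser", "thr"),
    ("gly",),
    ("pro",),
    ("cys",),
]


def allowed_entry(observed):
    """Build the 'allowed amino acids' cell for one alignment column.

    Observed residues come first; group members that were not observed follow
    in parentheses. A column that spans several groups is returned unchanged.
    """
    if not observed:
        raise ValueError("an alignment column needs at least one residue")
    for members in GROUPS:
        if set(observed) <= set(members):
            extras = [f"({aa})" for aa in members if aa not in observed]
            return ", ".join(list(observed) + extras)
    return ", ".join(observed)


print(allowed_entry(["val", "ile"]))         # val, ile, (leu), (ala), (met)
print(allowed_entry(["val", "ala", "pro"]))  # val, ala, pro
```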
Table 4.4: Molecular Knowledge in B chain of Insulin
pos | allowed amino acids             | bits | pos | allowed amino acids             | bits
2   | phe, ala, leu, val              | 0    | 17  | phe, tyr, (trp)                 | 3.7
3   | val, ala, pro                   | 0    | 18  | leu, (ile), (val), (ala), (met) | 1.8
4   | pro, lys, asn                   | 0    | 19  | val, ile, (ala), (leu), (met)   | 1.8
5   | gln, (asn)                      | 4    | 20  | cys                             | 5
6   | his, arg, (lys)                 | 2.7  | 21  | gly                             | 4
7   | leu, (ile), (ala), (val), (met) | 1.8  | 22  | asp, glu                        | 4
8   | cys                             | 5    | 23  | arg, (lys), (his)               | 2.7
9   | gly                             | 4    | 24  | gly                             | 4
10  | ala, pro, ser                   | 0    | 25  | phe, (tyr), (trp)               | 3.7
11  | his, (lys), (arg)               | 2.7  | 26  | phe, tyr, (trp)                 | 3.7
12  | leu, (ile), (val), (ala), (met) | 1.8  | 27  | tyr, (phe), (trp)               | 3.7
13  | val, (ile), (leu), (ala), (met) | 1.8  | 28  | thr, ser, asn                   | 0
14  | glu, asp                        | 4    | 29  | pro                             | 4
15  | ala, (leu), (ile), (val), (met) | 1.8  | 30  | lys, arg, (his)                 | 2.7
16  | leu, (ala), (val), (ile), (met) | 1.8  | 31  | ala, thr, ser, -                | 0
Total = 76 bits
Example calculation: at position 3, val, ala, and pro are found. Because these amino acids
are in different groups, the knowledge is defined as zero bits. At position 16 only leu is
found, but ala, val, ile, and met probably would not be very damaging to protein function
because they are in the same group. The total number of codons that encode these 5 amino
acids is 18 (leu 6, ala 4, val 4, ile 3, met 1). Thus, knowledge = 3.32 x log10(64/18) =
log2(64/18) = 1.8 bits.
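The same arithmetic can be checked with a short script. This is a sketch, not part of the original calculation: `CODONS` holds the number of codons per amino acid in the standard genetic code, and `knowledge_bits` is a name chosen here for illustration.

```python
import math

# Codons per amino acid in the standard genetic code (61 sense codons in total).
CODONS = {
    "ala": 4, "arg": 6, "asn": 2, "asp": 2, "cys": 2,
    "gln": 2, "glu": 2, "gly": 4, "his": 2, "ile": 3,
    "leu": 6, "lys": 2, "met": 1, "phe": 2, "pro": 4,
    "ser": 6, "thr": 4, "trp": 1, "tyr": 2, "val": 4,
}


def knowledge_bits(allowed):
    """Bits of knowledge for a column whose allowed residues share one group."""
    codons = sum(CODONS[aa] for aa in allowed)
    return math.log2(64 / codons)


group1 = ["leu", "ala", "val", "ile", "met"]
print(sum(CODONS[aa] for aa in group1))      # 18
print(round(knowledge_bits(group1), 1))      # 1.8
# 3.32 x log10(x) approximates log2(x), which is why the two forms agree.
print(round(3.32 * math.log10(64 / 18), 1))  # 1.8
```

By the same formula the three basic residues of group 4 (10 codons) give about 2.7 bits, and the aromatic residues of group 2 (5 codons) give about 3.7 bits.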
Comparing table 4.4 to table 4.2, it is clear that knowledge is much less than
information (76 bits vs. 108 bits). The ratio of knowledge to information for the insulin
B chain is thus 76/108 = 70%. The same procedure is repeated for the A chain as shown in
table 4.5.
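For readers who want to reproduce the B chain total, the sketch below applies the procedure to the observed residues in table 4.4 (the entries without parentheses). The 108 bit information figure is taken from the text, not recomputed, and all names in the code are illustrative. Because the script sums unrounded values, its total can differ from the sum of the rounded table entries by a few tenths of a bit.

```python
import math

GROUPS = [
    {"leu", "ile", "val", "ala", "met"}, {"tyr", "phe", "trp"},
    {"asp", "glu"}, {"his", "arg", "lys"}, {"gln", "asn"},
    {"ser", "thr"}, {"gly"}, {"pro"}, {"cys"},
]

# Codons per amino acid in the standard genetic code.
CODONS = {
    "ala": 4, "arg": 6, "asn": 2, "asp": 2, "cys": 2,
    "gln": 2, "glu": 2, "gly": 4, "his": 2, "ile": 3,
    "leu": 6, "lys": 2, "met": 1, "phe": 2, "pro": 4,
    "ser": 6, "thr": 4, "trp": 1, "tyr": 2, "val": 4,
}

# Observed residues at B chain positions 2 to 31, copied from table 4.4.
B_CHAIN = {
    2: "phe ala leu val", 3: "val ala pro", 4: "pro lys asn", 5: "gln",
    6: "his arg", 7: "leu", 8: "cys", 9: "gly", 10: "ala pro ser",
    11: "his", 12: "leu", 13: "val", 14: "glu asp", 15: "ala", 16: "leu",
    17: "phe tyr", 18: "leu", 19: "val ile", 20: "cys", 21: "gly",
    22: "asp glu", 23: "arg", 24: "gly", 25: "phe", 26: "phe tyr",
    27: "tyr", 28: "thr ser asn", 29: "pro", 30: "lys arg",
    31: "ala thr ser -",
}

INFORMATION_BITS = 108  # B chain information quoted in the text (table 4.2)


def column_bits(observed):
    """Knowledge for one column: 0 if it mixes groups, else log2(64 / group codons)."""
    for members in GROUPS:
        if set(observed) <= members:
            return math.log2(64 / sum(CODONS[aa] for aa in members))
    return 0.0


total = sum(column_bits(residues.split()) for residues in B_CHAIN.values())
print(round(total))                     # 76
print(round(total) / INFORMATION_BITS)  # knowledge-to-information ratio, about 0.70
```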
Table 4.5: Molecular Knowledge in Insulin A Chain
pos | allowed amino acids             | bits | pos | allowed amino acids             | bits
32  | gly                             | 4    | 43  | ser, asn, asp                   | 0
33  | ile, (val), (leu), (ala), (met) | 1.8  | 44  | leu, ile, (val), (ala), (met)   | 1.8
34  | val, (leu), (ile), (ala), (met) | 1.8  | 45  | phe, tyr, (trp)                 | 3.7
35  | glu, asp                        | 4    | 46  | gln, asp                        | 0
36  | gln, (asn)                      | 4    | 47  | leu, (val), (ile), (ala), (met) | 1.8
37  | cys                             | 5    | 48  | glu, gln                        | 0
38  | cys                             | 5    | 49  | asn, ser, his                   | 0
39  | glu, his, thr, ala              | 0    | 50  | tyr, (phe), (trp)               | 3.7
40  | asn, lys, arg, ser, gly         | 0    | 51  | cys                             | 5
41  | pro, thr, ile, val              | 0    | 52  | asn, (gln)                      | 4
42  | cys                             | 5    |     |                                 |
Total = 51 bits