|Published (Last):||9 August 2008|
Not long ago, I read a bit on the problem of random packing: if you park cars randomly on a street, how many cars can you fit in before you run out of space? During their work on this problem and its related variants, mathematicians Itoh and Solomon discovered a seemingly surprising connection to… molecular biology. Proteins are built from amino acids, and each amino acid is encoded by a codon, a triplet of nucleotides; with four possible nucleotides there are 4^3 = 64 possible codons. However, the human body uses only 20 amino acids to build proteins. The following table summarizes which acid is encoded by which triplet:
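There are several interesting phenomena in this table. I am no professional in the field of biochemistry, but I will give my humble thoughts on why they occur. First, note that the amino acids tend to be encoded in clusters. Second, within a cluster, the codons differ mostly in the last nucleotide. The answer to the first is probably rooted in resistance to mutations. Suppose that due to a random mutation, one of the letters of the codon changes. If the mutation is random, i.e. any letter of the codon is equally likely to change, then a codon whose neighbors in the table encode the same acid has a good chance of still encoding that acid after the mutation.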
However, if the same acid had codons that all differ from one another by two or three nucleotides, chances are that any mutation would change it. This can be stated more formally in terms of Hamming distance, which, given two strings of equal length, counts the positions at which they differ. Codons which are farther away from each other have a smaller chance of turning into one another due to mutation, and scattering the encoding across the whole table means a large Hamming distance from one codon to another.
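To make the notion of Hamming distance concrete, here is a minimal Python sketch. The function name `hamming_distance` and the example codons are my own choices for illustration and do not come from Itoh and Solomon.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Two codons that differ only in the last letter are at distance 1.
print(hamming_distance("GCU", "GCC"))  # 1
# Two codons that differ in every letter are at distance 3.
print(hamming_distance("GCU", "UAG"))  # 3
```

A single-letter mutation can only reach codons at distance 1, which is why clustering the codons of one acid close together protects it.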
As for the second question, why the variation is in the last nucleotide, I do not know the answer and can only speculate.
One idea is that in the distant past, organisms were simpler, and used fewer building blocks. Specifically, codons were of length two, and not three, and in the transition from two to three, a nucleotide was added at the end. Another idea, by my friend Eyal, is that the ribosome, which is responsible for putting the proteins together, has lower accuracy after already processing two nucleotides, and hence we want redundancy in the third.
Another question regarding the distribution of encodings in the table above is, why are there 21 different functions (the 20 amino acids plus the stop signal)? Why not more, and why not fewer?
It is here that a mathematical model may provide some insight, as found by Itoh and Solomon. A codon may be written as a sequence of three letters out of a choice of the four letters CGAU.
Since it takes two bits to encode each letter, a codon corresponds to a binary string of length six, and there are 64 such strings. Choose one at random, and write it down. Then keep choosing strings at random; however, write them down only if their Hamming distance from each of the strings you have already written down is at least 2. At the end, all the strings that you have written down will differ from each other by more than 1 bit. Sometimes the process ends and you have written only 14 strings, and other times you have written a full 32 (half the corners of the six-dimensional hypercube generated by the strings).
The natural question to ask is: if you repeat this process, what is the average number of strings that you eventually write down? I did not solve this analytically, but I did write a Lisp program to calculate it. The average over the runs turns out to be about … This is strikingly close to the 21 different functions encoded by codons. Of course, this may be entirely coincidental.
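For readers who want to try the experiment themselves, here is a rough Python sketch of the random process, treating codons as six-bit strings. This is not my original Lisp program; the names `random_packing` and `average_size` are made up for this sketch, and it assumes that drawing strings at random with rejection is equivalent to going through all 64 strings in a uniformly random order.

```python
import random


def hamming(a: int, b: int) -> int:
    """Hamming distance between two integers viewed as bit strings."""
    return bin(a ^ b).count("1")


def random_packing(n_bits=6, min_dist=2, rng=None):
    """Visit all n_bits-bit strings in random order and keep each one
    whose distance from every string kept so far is at least min_dist."""
    rng = rng or random.Random()
    candidates = list(range(2 ** n_bits))
    rng.shuffle(candidates)
    kept = []
    for s in candidates:
        if all(hamming(s, t) >= min_dist for t in kept):
            kept.append(s)
    return kept


def average_size(runs=1000, seed=0):
    """Average number of strings kept over the given number of runs."""
    rng = random.Random(seed)
    total = sum(len(random_packing(rng=rng)) for _ in range(runs))
    return total / runs


print(average_size())
```

The seed is fixed only so that repeated runs print the same estimate; more runs give a steadier average.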
Whatever way the modern genetic apparatus has developed, it was certainly more complex than picking out random strings and checking if they are close to each other or not. Further, a Hamming distance of 2 may still describe codons which are different by only one nucleotide, since it takes two bits for each letter.
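The caveat about two bits per letter is easy to see in a small Python sketch. The particular mapping in `BITS` is a hypothetical choice of mine; the text does not say which two-bit encoding is used, but any such encoding has pairs of letters that differ in both bits.

```python
# Hypothetical two-bit encoding of the four nucleotides (an assumption).
BITS = {"C": "00", "G": "01", "A": "10", "U": "11"}


def codon_to_bits(codon: str) -> str:
    """Translate a three-letter codon into a six-bit string."""
    return "".join(BITS[n] for n in codon)


def bit_distance(c1: str, c2: str) -> int:
    """Hamming distance between the six-bit encodings of two codons."""
    return sum(x != y for x, y in zip(codon_to_bits(c1), codon_to_bits(c2)))


print(codon_to_bits("GCU"))        # 010011
print(bit_distance("GCC", "GCU"))  # 2: one nucleotide changed, C=00 vs U=11
print(bit_distance("GCC", "GCG"))  # 1: one nucleotide changed, C=00 vs G=01
```

So two codons that pass the "distance at least 2" test in bits may still be a single nucleotide apart.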
However, one cannot help but think that there is indeed a connection, for as we have established above, resistance to mutations comes from having a large Hamming distance between the codons of different amino acids. In any case, this is just a quotation of a result, not a proposed model. Still, it is interesting to see that our own genetics are basically governed by the same considerations and principles we use when designing an error-resistant communication system and encoding.