Quantitative Biology / q-bio.GN Quantitative Biology / q-bio.PE

Origin and evolution of the genetic code: The universal enigma

February 23, 2026

Reading time: 7 minutes


📝 Original Info

Title: Origin and evolution of the genetic code: The universal enigma
ArXiv ID: 0807.4749
Date: 2008-09-10
Authors: Eugene V. Koonin, Artem S. Novozhilov

📝 Abstract

The genetic code is nearly universal, and the arrangement of the codons in the standard codon table is highly non-random. The three main concepts on origin and evolution of the code are the stereochemical theory; the coevolution theory; and the error minimization theory. These theories are not mutually exclusive and are also compatible with the frozen accident hypothesis. Mathematical analysis of the structure and possible evolutionary trajectories of the code shows that it is highly robust to translational error but there is a huge number of more robust codes, so that the standard code potentially could evolve from a random code via a short sequence of codon series reassignments. Thus, much of the evolution that led to the standard code can be interpreted as a combination of frozen accident with selection for translational error minimization although contributions from coevolution of the code with metabolic pathways and/or weak affinities between amino acids and nucleotide triplets cannot be ruled out. However, such scenarios for the code evolution are based on formal schemes whose relevance to the actual primordial evolution is uncertain, so much caution in interpretation is necessary. A real understanding of the code's origin and evolution is likely to be attainable only in conjunction with a credible scenario for the evolution of the coding principle itself and the translation system.

💡 Deep Analysis

Deep Dive into Origin and evolution of the genetic code: The universal enigma.

📄 Full Content

arXiv:0807.4749v2 [q-bio.GN] 10 Sep 2008

Origin and evolution of the genetic code: The universal enigma

Eugene V. Koonin* and Artem S. Novozhilov

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894

*e-mail: koonin@ncbi.nlm.nih.gov

Abstract

The genetic code is nearly universal, and the arrangement of the codons in the standard codon table is highly non-random. The three main concepts on origin and evolution of the code are the stereochemical theory, according to which codon assignments are dictated by physico-chemical affinity between amino acids and the cognate codons (anticodons); the coevolution theory, which posits that the code structure coevolved with amino acid biosynthesis pathways; and the error minimization theory under which selection to minimize the adverse effect of point mutations and translation errors was the principal factor of the code’s evolution. These theories are not mutually exclusive and are also compatible with the frozen accident hypothesis, i.e., the notion that the standard code might have no special properties but was fixed simply because all extant life forms share a common ancestor and remained, mostly, unchanged because of the deleterious effect of codon reassignment. Mathematical analysis of the structure and possible evolutionary trajectories of the code shows that it is highly robust to translational misreading but there are a huge number of more robust codes, so that the standard code potentially could evolve from a random code via a short sequence of codon series reassignments. Thus, much of the evolution that led to the standard code can be interpreted as a combination of frozen accident with selection for error minimization although contributions from coevolution of the code with metabolic pathways and/or weak affinities between amino acids and nucleotide triplets cannot be ruled out. However, such scenarios for the code evolution are based on formal schemes whose relevance to the actual primordial evolution is uncertain, so much caution in interpretation is necessary. A real understanding of the code’s origin and evolution is likely to be attainable only in conjunction with a credible scenario for the evolution of the coding principle itself and the translation system.

Keywords: Evolution of the genetic code, stereochemical theory, coevolution theory, adaptive theory

1 Introduction

Shortly after the genetic code of Escherichia coli was deciphered (Nirenberg et al. 1963), it was recognized that this particular mapping of 64 codons to 20 amino acids and two punctuation marks (start and stop signals) is shared, with relatively minor modifications, by all known life forms on earth (Hinegardner and Engelberg 1963; Woese, Hinegardner, and Engelberg 1964).

UUU [F] Phe | UCU [S] Ser | UAU [Y] Tyr | UGU [C] Cys
UUC [F] Phe | UCC [S] Ser | UAC [Y] Tyr | UGC [C] Cys
UUA [L] Leu | UCA [S] Ser | UAA [ ] Ter | UGA [ ] Ter
UUG [L] Leu | UCG [S] Ser | UAG [ ] Ter | UGG [W] Trp

CUU [L] Leu | CCU [P] Pro | CAU [H] His | CGU [R] Arg
CUC [L] Leu | CCC [P] Pro | CAC [H] His | CGC [R] Arg
CUA [L] Leu | CCA [P] Pro | CAA [Q] Gln | CGA [R] Arg
CUG [L] Leu | CCG [P] Pro | CAG [Q] Gln | CGG [R] Arg

AUU [I] Ile | ACU [T] Thr | AAU [N] Asn | AGU [S] Ser
AUC [I] Ile | ACC [T] Thr | AAC [N] Asn | AGC [S] Ser
AUA [I] Ile | ACA [T] Thr | AAA [K] Lys | AGA [R] Arg
AUG [M] Met | ACG [T] Thr | AAG [K] Lys | AGG [R] Arg

GUU [V] Val | GCU [A] Ala | GAU [D] Asp | GGU [G] Gly
GUC [V] Val | GCC [A] Ala | GAC [D] Asp | GGC [G] Gly
GUA [V] Val | GCA [A] Ala | GAA [E] Glu | GGA [G] Gly
GUG [V] Val | GCG [A] Ala | GAG [E] Glu | GGG [G] Gly

Figure 1. The standard genetic code. The codon series are shaded in accordance with the polar requirement scale values (Woese et al. 1966b), which is a measure of an amino acid’s hydrophobicity: the greater hydrophobicity the darker the shading (the stop codons are shaded black).
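The table in Figure 1 can be treated as a plain lookup structure. The following is a minimal sketch, not taken from the paper, that builds the standard code as a Python dictionary and counts, for each codon position, how many single-nucleotide substitutions leave the assignment unchanged. The names `STANDARD_CODE` and `synonymous_by_position` are hypothetical, and treating the three stop codons as one extra label (`*`) is an assumption of this sketch.

```python
from itertools import product

BASES = "UCAG"

# One-letter assignments in the codon order of Figure 1
# (UUU, UUC, UUA, UUG, UCU, ...); "*" marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Hypothetical name: maps each of the 64 codons to its label.
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_by_position(code):
    """Count directed single-base substitutions that keep the same label,
    separately for codon positions 1, 2 and 3."""
    counts = [0, 0, 0]
    for codon, aa in code.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                neighbor = codon[:pos] + base + codon[pos + 1:]
                if code[neighbor] == aa:
                    counts[pos] += 1
    return counts


if __name__ == "__main__":
    per_position = synonymous_by_position(STANDARD_CODE)
    total = 64 * 3  # substitutions available at each position
    for pos, n in enumerate(per_position, start=1):
        print(f"position {pos}: {n} of {total} substitutions are synonymous")
```

Under these assumptions, nearly all synonymous substitutions fall at the third codon position, which is one simple way to see the grouping of codons into series in the table.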
Even a perfunctory inspection of the standard genetic code table (Fig. 1) shows that the arrangement of amino acid assignments is manifestly nonrandom (Woese 1965a; Woese 1967; Crick 1968; Ycas 1969). Generally, related codons (i.e., the codons that differ by only one nucleotide) tend to code for either the same or two related amino acids, i.e., amino acids that are physico-chemically similar (although there are no unambiguous criteria to define physicochemical similarity). The fundamental question is how these regularities of the standard code came into being, considering that there are more than 10^84 possible alternative code tables if each of the 20 amino acids and the stop signal are to be assigned to at least one codon. More specifically, the question is, what kind of interplay of chemical constraints, historical accidents, and evolutionary forces could have produced the standard amino acid assignment, which displays many remarkable properties. The features of the code that seem to require a special explanation include, but are not limited to, the block structure of the code, which is thought to be a necessary condit

…(Full text truncated)…
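The Introduction's figure of more than 10^84 alternative code tables can be checked with a short calculation. The sketch below assumes the counting problem is the one the text describes: 64 codons are mapped to 21 labels (20 amino acids plus the stop signal), and every label must be used at least once. It counts such maps by inclusion-exclusion, which is a standard technique chosen here for illustration and not necessarily how the authors obtained their number. The function name `surjective_codes` is hypothetical.

```python
from math import comb


def surjective_codes(n_codons=64, n_labels=21):
    """Number of ways to assign n_codons to n_labels so that every label
    receives at least one codon (inclusion-exclusion over unused labels)."""
    return sum(
        (-1) ** k * comb(n_labels, k) * (n_labels - k) ** n_codons
        for k in range(n_labels + 1)
    )


if __name__ == "__main__":
    unrestricted = 21 ** 64       # any label for any codon
    constrained = surjective_codes()
    print(f"unrestricted assignments: {unrestricted:.3e}")
    print(f"every label used at least once: {constrained:.3e}")
    print("exceeds 10^84:", constrained > 10 ** 84)
```

Python integers have arbitrary precision, so the alternating sum is exact. The constrained count lies between 10^84 and the unrestricted total 21^64, consistent with the statement in the text.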

Reference

This content is AI-processed based on ArXiv data.
