📝 Original Info
- Title: Origin and evolution of the genetic code: The universal enigma
- ArXiv ID: 0807.4749
- Date: 2008-09-10
- Authors: Eugene V. Koonin, Artem S. Novozhilov
📝 Abstract
The genetic code is nearly universal, and the arrangement of the codons in the standard codon table is highly non-random. The three main concepts on origin and evolution of the code are the stereochemical theory; the coevolution theory; and the error minimization theory. These theories are not mutually exclusive and are also compatible with the frozen accident hypothesis. Mathematical analysis of the structure and possible evolutionary trajectories of the code shows that it is highly robust to translational error but there is a huge number of more robust codes, so that the standard code potentially could evolve from a random code via a short sequence of codon series reassignments. Thus, much of the evolution that led to the standard code can be interpreted as a combination of frozen accident with selection for translational error minimization although contributions from coevolution of the code with metabolic pathways and/or weak affinities between amino acids and nucleotide triplets cannot be ruled out. However, such scenarios for the code evolution are based on formal schemes whose relevance to the actual primordial evolution is uncertain, so much caution in interpretation is necessary. A real understanding of the code's origin and evolution is likely to be attainable only in conjunction with a credible scenario for the evolution of the coding principle itself and the translation system.
📄 Full Content
arXiv:0807.4749v2 [q-bio.GN] 10 Sep 2008
Origin and evolution of the genetic code: The universal enigma
Eugene V. Koonin∗ and Artem S. Novozhilov
National Center for Biotechnology Information,
National Library of Medicine, National Institutes of Health, Bethesda, MD 20894
Abstract
The genetic code is nearly universal, and the arrangement of the codons in the standard codon table is highly non-random. The three main concepts on origin and evolution of the code are the stereochemical theory, according to which codon assignments are dictated by physico-chemical affinity between amino acids and the cognate codons (anticodons); the coevolution theory, which posits that the code structure coevolved with amino acid biosynthesis pathways; and the error minimization theory under which selection to minimize the adverse effect of point mutations and translation errors was the principal factor of the code’s evolution. These theories are not mutually exclusive and are also compatible with the frozen accident hypothesis, i.e., the notion that the standard code might have no special properties but was fixed simply because all extant life forms share a common ancestor and remained, mostly, unchanged because of the deleterious effect of codon reassignment. Mathematical analysis of the structure and possible evolutionary trajectories of the code shows that it is highly robust to translational misreading but there are a huge number of more robust codes, so that the standard code potentially could evolve from a random code via a short sequence of codon series reassignments. Thus, much of the evolution that led to the standard code can be interpreted as a combination of frozen accident with selection for error minimization although contributions from coevolution of the code with metabolic pathways and/or weak affinities between amino acids and nucleotide triplets cannot be ruled out. However, such scenarios for the code evolution are based on formal schemes whose relevance to the actual primordial evolution is uncertain, so much caution in interpretation is necessary. A real understanding of the code’s origin and evolution is likely to be attainable only in conjunction with a credible scenario for the evolution of the coding principle itself and the translation system.
Keywords: Evolution of the genetic code, stereochemical theory, coevolution theory, adaptive theory
1 Introduction
Shortly after the genetic code of Escherichia coli was deciphered (Nirenberg et al. 1963), it was
recognized that this particular mapping of 64 codons to 20 amino acids and two punctuation
marks (start and stop signals) is shared, with relatively minor modifications, by all known life
forms on earth (Hinegardner and Engelberg 1963; Woese, Hinegardner, and Engelberg 1964).
∗e-mail: koonin@ncbi.nlm.nih.gov
UUU [F] Phe    UCU [S] Ser    UAU [Y] Tyr    UGU [C] Cys
UUC [F] Phe    UCC [S] Ser    UAC [Y] Tyr    UGC [C] Cys
UUA [L] Leu    UCA [S] Ser    UAA [ ] Ter    UGA [ ] Ter
UUG [L] Leu    UCG [S] Ser    UAG [ ] Ter    UGG [W] Trp

CUU [L] Leu    CCU [P] Pro    CAU [H] His    CGU [R] Arg
CUC [L] Leu    CCC [P] Pro    CAC [H] His    CGC [R] Arg
CUA [L] Leu    CCA [P] Pro    CAA [Q] Gln    CGA [R] Arg
CUG [L] Leu    CCG [P] Pro    CAG [Q] Gln    CGG [R] Arg

AUU [I] Ile    ACU [T] Thr    AAU [N] Asn    AGU [S] Ser
AUC [I] Ile    ACC [T] Thr    AAC [N] Asn    AGC [S] Ser
AUA [I] Ile    ACA [T] Thr    AAA [K] Lys    AGA [R] Arg
AUG [M] Met    ACG [T] Thr    AAG [K] Lys    AGG [R] Arg

GUU [V] Val    GCU [A] Ala    GAU [D] Asp    GGU [G] Gly
GUC [V] Val    GCC [A] Ala    GAC [D] Asp    GGC [G] Gly
GUA [V] Val    GCA [A] Ala    GAA [E] Glu    GGA [G] Gly
GUG [V] Val    GCG [A] Ala    GAG [E] Glu    GGG [G] Gly
Figure 1. The standard genetic code. The codon series are shaded in accordance with the polar requirement scale values (Woese et al. 1966b), which is a measure of an amino acid’s hydrophobicity: the greater the hydrophobicity, the darker the shading (the stop codons are shaded black).
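For readers who want to work with the table in Figure 1 programmatically, the following is a minimal sketch of one way to encode it. The names `BASES`, `MEANINGS`, and `STANDARD_CODE` are illustrative choices, not notation from the paper, and `*` is used here as a stand-in for the Ter (stop) signal.

```python
from collections import Counter
from itertools import product

# Base order chosen so that codons are generated as UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"

# One-letter meanings in that codon order; "*" is an assumed symbol for Ter.
MEANINGS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its amino acid (or stop) in the standard code.
STANDARD_CODE = {
    "".join(codon): meaning
    for codon, meaning in zip(product(BASES, repeat=3), MEANINGS)
}

# Degeneracy: how many codons encode each meaning.
degeneracy = Counter(STANDARD_CODE.values())

if __name__ == "__main__":
    for meaning, count in sorted(degeneracy.items(), key=lambda kv: -kv[1]):
        print(meaning, count)
```

Printing the degeneracy counts makes the uneven block sizes of the table visible: some amino acids occupy six codons while Met and Trp occupy one each.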
Even a perfunctory inspection of the standard genetic code table (Fig. 1) shows that the arrangement of amino acid assignments is manifestly nonrandom (Woese 1965a; Woese 1967; Crick 1968; Ycas 1969). Generally, related codons (i.e., the codons that differ by only one nucleotide) tend to code for either the same or two related amino acids, i.e., amino acids that are physico-chemically similar (although there are no unambiguous criteria to define physicochemical similarity).
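The "same amino acid" half of this observation can be checked directly. The sketch below, under the assumption that the table is encoded as a codon-to-letter dictionary (the names `CODE` and `synonymous_fraction_by_position` are hypothetical), counts how often a single-nucleotide change at each codon position leaves the meaning unchanged. It does not attempt to score physico-chemical similarity, since that would require picking one particular similarity scale.

```python
from itertools import product

BASES = "UCAG"
# Standard code in the order UUU, UUC, UUA, UUG, UCU, ...; "*" stands for Ter.
MEANINGS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): m for c, m in zip(product(BASES, repeat=3), MEANINGS)}


def synonymous_fraction_by_position(code):
    """Return, for codon positions 1 to 3, the fraction of single-nucleotide
    substitutions that map to the same meaning as the original codon."""
    same = [0, 0, 0]
    total = [0, 0, 0]
    for codon, meaning in code.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                total[pos] += 1
                if code[mutant] == meaning:
                    same[pos] += 1
    return [s / t for s, t in zip(same, total)]


if __name__ == "__main__":
    for pos, frac in enumerate(synonymous_fraction_by_position(CODE), start=1):
        print(f"position {pos}: {frac:.3f}")
```

Under this count, substitutions at the third codon position are synonymous far more often than those at the first or second position, which is one simple numerical expression of the nonrandom arrangement described above.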
The fundamental question is how these regularities of the standard code came into being, considering that there are more than 10^84 possible alternative code tables if each of the 20 amino acids and the stop signal are to be assigned to at least one codon.
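The order of magnitude quoted here can be reproduced with a short calculation. One way to read the condition is as the number of ways to map 64 codons onto 21 meanings (20 amino acids plus stop) so that every meaning is used at least once, which inclusion-exclusion gives as the sum over k of (-1)^k C(21, k) (21 - k)^64. The sketch below assumes that reading; the function name `count_codes` is illustrative.

```python
from math import comb


def count_codes(n_codons=64, n_meanings=21):
    """Count assignments of codons to meanings in which every meaning
    receives at least one codon (inclusion-exclusion over unused meanings)."""
    return sum(
        (-1) ** k * comb(n_meanings, k) * (n_meanings - k) ** n_codons
        for k in range(n_meanings + 1)
    )


if __name__ == "__main__":
    total = count_codes()
    # Python integers are exact; the format call only rounds for display.
    print(f"{total:.3e}")
```

The exact integer arithmetic avoids any floating-point loss, and small cases such as `count_codes(3, 2)` can be verified by hand as a sanity check.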
More specifically, the question is, what kind of interplay of chemical constraints, historical accidents, and evolutionary forces could have produced the standard amino acid assignment, which displays many remarkable properties. The features of the code that seem to require a special explanation include, but are not limited to, the block structure of the code, which is thought to be a necessary condit
…(Full text truncated)…