Circular Reasoning in Origin of Life Research: Insights from a Recent Study on the Genetic Code

After studying countless articles purporting to explain the origin or transformation of life through natural processes, I have seen certain patterns emerge repeatedly. One of the most common is circular reasoning. The logic usually proceeds as follows:
- Investigators examine some trait that exists in different species, such as a specific protein.
- They assume that the different versions of the trait evolved from a common ancestor, and they attempt to envision the ancestral version, such as an ancestral protein’s amino acid sequence.
- They imagine what steps could have transformed the ancestral version into the modern versions, such as the series of amino acid changes in an evolving protein.
Researchers rarely attempt to demonstrate the plausibility of the proposed steps. Just crafting an engaging story is often sufficient in their minds. This pattern is clearly demonstrated in an article recently published in PNAS, “Order of amino acid recruitment into the genetic code resolved by last universal common ancestor’s protein domains.” It purports to elucidate the origin and evolution of the genetic code, but it is entirely based on circular reasoning.
Enigma of the Genetic Code
Explaining the origin of the genetic code has proven to be one of the most intractable problems in origin of life research. Even the most primitive biological information storage and retrieval system requires the following:
- The sequence of nucleotides in DNA must encode the sequences of amino acids corresponding to all essential cellular proteins.
- A complex protein must separate the two DNA strands to allow for the genetic information to be accessed. The protein in modern cells is called helicase.
- As the strands separate, the DNA ahead of the separation point becomes overwound (supercoiled). A protein must relieve this strain by breaking the DNA, passing one section of the DNA through the break, and then rejoining the broken ends. The protein in modern cells is called topoisomerase.
- A suite of proteins must transfer the information from DNA to RNA in a process called transcription.
- Another suite of proteins must translate the information in RNA into the amino acid sequences corresponding to proteins.
Origins researchers assume that the original system was simpler than today's. The modern genetic code specifies 20+ amino acids (e.g., valine) using sets of three nucleotides known as codons (e.g., GTA). The first autonomous cell is believed to have used only around half of the amino acids used today. Trifonov (2000) proposes that the original set includes the following nine:
- Glycine (Gly, G)
- Alanine (Ala, A)
- Valine (Val, V)
- Aspartic acid (Asp, D)
- Proline (Pro, P)
- Serine (Ser, S)
- Glutamic acid (Glu, E)
- Leucine (Leu, L)
- Threonine (Thr, T)
The other amino acids are believed to have been added to the code sequentially. The last ones to be added could not have formed through natural processes in more than trace quantities, so cells are believed to have evolved the chemical pathways to manufacture them before they were incorporated into the genetic code. The amino acids that must initially have been manufactured in cells include those with more complicated structures, such as histidine (His, H) and tyrosine (Tyr, Y).
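To make the codon mapping and the proposed early set concrete, here is a minimal Python sketch. It assumes a six-entry excerpt of the standard genetic code (DNA codons) and a made-up 18-nucleotide sequence chosen only for illustration. The names translate and residues_outside_early_set are hypothetical helpers, not taken from any of the studies discussed.

```python
# Excerpt of the standard genetic code (DNA codons); only the entries this toy example needs.
CODON_TABLE = {
    "GGA": "G",  # glycine
    "GCA": "A",  # alanine
    "GTA": "V",  # valine
    "GAT": "D",  # aspartic acid
    "CAT": "H",  # histidine
    "TAT": "Y",  # tyrosine
}

# One-letter codes of the nine amino acids listed above as the proposed original set.
EARLY_SET = set("GAVDPSELT")


def translate(dna):
    """Translate a DNA coding sequence by reading it three nucleotides at a time."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def residues_outside_early_set(protein):
    """Return, in sorted order, the amino acids in `protein` that are not in EARLY_SET."""
    return sorted(set(protein) - EARLY_SET)


# Hypothetical sequence, invented for illustration only.
peptide = translate("GGAGCAGTAGATCATTAT")
print(peptide)
print(residues_outside_early_set(peptide))
```

Under these assumptions, the last call reports which residues of the toy peptide fall outside the nine-member set, here histidine and tyrosine.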
New Order of Incorporation
Previous studies postulated the order of amino acid incorporation based on such factors as the ease with which the amino acids could have formed on the early Earth through simple chemistry, the complexity of the amino acids’ structure, and their biosynthetic pathways in modern cells. In contrast, the PNAS study compared the sequences of closely related proteins in modern organisms to reconstruct the sequences of ancestral proteins believed to reside in the last universal common ancestor (LUCA) of life today. The investigators also reconstructed sequences of proteins believed to reside in even earlier cells. They compared the sequences in ancient proteins to those today to postulate the amino acids’ order of incorporation.
An article posted on Phys.org, “The origin of genetic code: Study finds textbook version needs revision,” summarizes the research as follows:
To get a handle on when a specific amino acid likely was recruited into the genetic code, the researchers used statistical data analysis tools to compare the enrichment of each individual amino acid in protein sequences dating back to LUCA, and even farther back in time. An amino acid that shows up preferentially in ancient sequences was likely incorporated early on. Conversely, LUCA’s sequences are depleted for amino acids that were recruited later but became available by the time less ancient protein sequences emerged.
The study still concludes that the simplest amino acids comprised the original code, but it proposes an order of incorporation of later amino acids that differs from previous proposals:
We find that smaller amino acids were added to the code earlier, with no additional predictive power in the previous consensus order. Metal-binding (cysteine and histidine) and sulfur-containing (cysteine and methionine) amino acids were added to the genetic code much earlier than previously thought. Methionine and histidine were added to the code earlier than expected from their molecular weights and glutamine later.
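The enrichment comparison described in the Phys.org summary can be illustrated with a rough Python sketch. This is not the study's statistical method. It is a simple log-ratio of amino acid frequencies between two sets of sequences, and the sequences, the pseudocount, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def frequencies(sequences):
    """Fraction of all residues in `sequences` accounted for by each amino acid."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def log2_enrichment(older, younger, pseudo=1e-6):
    """log2 of (frequency in `older` / frequency in `younger`) for each amino acid.

    Positive scores mean the amino acid is over-represented in the older set.
    The small pseudocount (an assumption) avoids division by zero.
    """
    f_old, f_new = frequencies(older), frequencies(younger)
    residues = set(f_old) | set(f_new)
    return {
        aa: math.log2((f_old.get(aa, 0.0) + pseudo) / (f_new.get(aa, 0.0) + pseudo))
        for aa in residues
    }


# Hypothetical toy sequences; real analyses use reconstructed protein domains.
older = ["GAVG", "GGAV"]
younger = ["GAWY", "VAWG"]

scores = log2_enrichment(older, younger)
for aa in sorted(scores, key=scores.get, reverse=True):
    print(aa, round(scores[aa], 2))
```

In this toy data, glycine and valine score as enriched in the older set, alanine is neutral, and tryptophan and tyrosine, which are absent from the older set, score strongly negative. On the reasoning quoted above, such a pattern would be read as evidence for earlier or later recruitment.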
Circular Reasoning and Causal Circularity
Neither the PNAS study nor earlier studies present substantive details about how the genetic code originated or how amino acids were later added. They simply assume that everything transpired through undirected natural processes and then, based on circumstantial evidence, construct the order in which the amino acids were incorporated. When serious questions are raised about the details of what would have been required to engineer the genetic systems or significantly modify them, the entire narrative collapses.
The central paradox is that neither proteins nor the genetic code could have evolved until the basic machinery was in place to translate DNA sequences (or RNA sequences, in the RNA World hypothesis) into amino acid sequences. Yet the proteins required for these processes, such as polymerase and topoisomerase, require the amino acids that are believed to have been added after the enzymes were needed. Late-arriving amino acids are required for such essential tasks as maintaining protein stability, identifying target molecules, adding nucleotides to a growing chain, and joining broken chains. They are even required in the enzymes that manufacture them. Biologist Ann Gauger has termed this challenge causal circularity.
The PNAS study and earlier studies represent honest attempts to explain the genetic code within the materialist framework, which only allows for natural causes to be considered. The investigators’ methodologies and conclusions are reasonable given their philosophical commitments. Yet their research entirely sidesteps the central challenge: explaining how the intricate systems of translation and replication could emerge without pre-existing machinery and the structurally complex amino acids that they are believed to precede. An honest evaluation of the evidence, freed from bias, inevitably leads to the conclusion that the information systems in cells resulted from a mind.