ENV recently reported on new research on long non-coding RNAs. We observed that a prior commitment to the “central dogma” of molecular biology has influenced assumptions about the non-coding portions of the genome. On further reflection, we decided that some readers may appreciate and benefit from a bit of clarification.
In the article, we offered the following definition of the central dogma: “DNA is transcribed into RNA and RNA is translated into amino acids to make proteins.” Yes, that’s not bad.
Indeed, a survey of biochemistry and genetics textbooks spanning several decades gives the following definitions of the central dogma (please note that not all textbooks actually use the phrase “central dogma”). Here, for example, is the caption under a flow chart showing the relationship “DNA → RNA → Protein” with a circle to indicate DNA replication: “The central dogma of molecular genetics, showing the flow of genetic information via the three fundamental processes of replication, transcription, and translation. Later we shall see that the central dogma had to be modified [emphasis added].” The source: Lehninger Principles of Biochemistry, Albert L. Lehninger (Johns Hopkins University School of Medicine), Worth Publishers, 1982, p. 792.
How was the central dogma modified? According to this text:
It was observed that onset of protein synthesis in cells is accompanied by an increase in the RNA content of the cytoplasm and an increase in its rate of turnover. These and other observations led Francis Crick to propose, as part of the central dogma of molecular genetics, that RNA serves to carry genetic information from DNA to the process of protein biosynthesis in the ribosome. Later, in 1961, Francois Jacob and Jacques Monod proposed the name messenger RNA for that portion of the total cell RNA carrying the genetic information from DNA to the ribosomes…[p. 853]
Regarding the discovery of reverse transcriptase:
Their discovery aroused much attention, particularly because it constituted molecular proof that genetic information can sometimes flow “backward,” i.e. from RNA to DNA. It also provided a mechanism for the incorporation into the host-cell genome of cancer genes carried in the form of RNA by RNA viruses. Because of this discovery the central dogma of molecular biology has had to be restated, as shown in Figure 28-25. [The figure demonstrates that information can flow from RNA to DNA.] The RNA viruses containing reverse transcriptases are also known as retroviruses… [p. 863, emphasis added]
Regarding the discovery that RNA can make more RNA: “These properties thus require that the central dogma of molecular genetics be modified even further, as shown in Figure 28-26” (p. 865, emphasis added).
Some additional citations:
From Biochemistry 2nd ed., Lippincott Illustrated Reviews, Pamela C. Champe and Richard A. Harvey, J.B. Lippincott Company, 1994, p. 389, emphasis added:
Genetic information, stored in the chromosomes and transmitted to daughter cells through DNA replication is expressed through transcription to RNA and, in the case of mRNA, subsequent translation into polypeptide chains (Figure 32.1). This flow of information from DNA to RNA to protein is termed the “central dogma” and is descriptive of all organisms (with the exception of some viruses that have RNA as a repository of their genetic information).
From Garrett and Grisham’s Biochemistry 2nd ed., Harcourt College Publishers, 1999, p. 1014:
In 1958, Francis Crick enunciated the “central dogma of molecular biology” (Figure 31.1). This scheme outlined the residue-by-residue transfer of biological information as encoded in the primary structure of the informational biopolymers, nucleic acids, and proteins. The predominant path of information transfer, DNA → RNA → protein, postulated that RNA was an information carrier between DNA and proteins, the agents of biological function.
From The Molecular Biology of the Cell 4th ed., Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, Peter Walter, Garland Science, 2002, p. 301, emphasis added:
The DNA in genomes does not direct protein synthesis itself, but instead uses RNA as an intermediary molecule. When the cell needs a particular protein, the nucleotide sequence of the appropriate portion of the immensely long DNA molecule in a chromosome is first copied into RNA (a process called transcription). It is these RNA copies of segments of the DNA that are used directly as templates to direct the synthesis of the protein (a process called translation). The flow of genetic information in cells is therefore from DNA to RNA to protein (Figure 6-2). All cells, from bacteria to humans, express their genetic information in this way — a principle so fundamental that it is termed the central dogma of molecular biology.
From the same source, p. 301:
Despite the universality of the central dogma, there are important variations in the way information flows from DNA to protein. Principal among these is that RNA transcripts in eukaryotic cells are subject to a series of processing steps in the nucleus, including RNA splicing, before they are permitted to exit from the nucleus and be translated into protein…Finally, although we focus on the production of the proteins encoded by the genome in this chapter, we see that for some genes RNA is the final product.
This text does not use the term “central dogma” but “gene expression” and “flow of information” and describes it in the following manner:
This process of transcription is followed by translation, the synthesis of proteins according to instructions given by mRNA templates. Thus the flow of genetic information, or gene expression, in normal cells is:
DNA → RNA → Protein
From Biochemistry 5th ed, Jeremy M. Berg, John L. Tymoczko, and Lubert Stryer (international edition), W.H. Freeman and Company, 2002, p. 118:
This flow of information is dependent on the genetic code, which defines the relation between the sequence of bases in DNA (or its mRNA transcript) and the sequence of amino acids in a protein. The code is nearly the same in all organisms: a sequence of three bases, called a codon, specifies an amino acid.
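The two steps these textbooks describe, transcription followed by codon-by-codon translation, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example sequence are ours, the input is treated as a DNA coding strand (so transcription amounts to swapping T for U), and the codon table is a deliberately tiny excerpt of the standard genetic code.

```python
# Toy excerpt of the standard genetic code; only the codons used below.
# A real table has 64 entries.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "UAA": None,   # stop codon
}

def transcribe(coding_dna: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError for codons outside the toy table
        if amino_acid is None:
            break
        peptide.append(amino_acid)
    return peptide

# Hypothetical example sequence, chosen only to exercise the toy table.
mrna = transcribe("ATGTTTGGCAAATAA")
print(mrna)             # AUGUUUGGCAAAUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly', 'Lys']
```

Each group of three bases maps to one amino acid, which is the relation the quoted passage calls the genetic code.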
From the same text, p. 128-129:
Another important class of RNA virus comprises the retroviruses, so called because the genetic information flows from RNA to DNA rather than from DNA to RNA. This class includes human immunodeficiency virus 1 (HIV-1), the cause of AIDS, as well as a number of RNA viruses that produce tumors in susceptible animals. Retrovirus particles contain two copies of a single-stranded RNA molecule. On entering the cell, the RNA is copied into DNA through the action of a viral enzyme called reverse transcriptase (Figure 5.23). The resulting double-helical DNA version of the viral genome can become incorporated into the chromosomal DNA of the host and is replicated along with the normal cellular DNA.
From AP Edition Biology 7th ed., Neil A. Campbell and Jane B. Reece, Pearson Education, 2005, p. 312, emphasis added. This text does not use the term “central dogma”:
Let’s summarize: Genes program protein synthesis via genetic messages in the form of messenger RNA. Put another way, cells are governed by a molecular chain of command: DNA → RNA → protein.
From Concepts of Genetics 8th ed., William S. Klug, Michael R. Cummings, and Charlotte A. Spencer, Pearson Education, 2006, p. 232, emphasis in the original text:
Expression of the stored genetic information is a complex process and is the basis for the concept of information flow within the cell. Figure 10-1 shows a simplified illustration of this concept. The initial event is the transcription of DNA, resulting in the synthesis of three types of RNA molecules: messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA). Of these, mRNA are translated into proteins. Each type of mRNA is the product of a specific gene and leads to the synthesis of a different protein. Translation occurs in conjunction with rRNA-containing ribosomes and involves tRNA, which acts as an adapter to convert the chemical information in mRNA to the amino acids that make up proteins. Collectively, these processes serves as the foundation for the central dogma of molecular genetics: DNA makes RNA, which makes proteins.
And lastly, the following reference is not from a textbook, but rather from James Watson’s book on DNA (DNA: The Secret of Life, Knopf, 2003, p. 69):
Francis Crick would later refer to this DNA → RNA → protein flow of information as the “central dogma.”
Nature published an article in 1970 in which Francis Crick defended his original definition of the central dogma. This was at a time when the discovery of reverse transcriptase in retroviruses had called the one-way direction of information flow into question. Crick defined the central dogma there as follows:
The central dogma of molecular biology deals with the detailed residue-by-residue transfer of sequential information. It states that such information cannot be transferred from protein to either protein or nucleic acid.
While Crick described the flow of information in this article, he distinguished it from his definition of the central dogma. He emphasized that information cannot be transferred from protein to either protein or nucleic acid. He offered four points of clarification about the central dogma:
1) It says nothing about what the machinery of the transfer is made of, and in particular nothing about errors. (It was assumed that, in general, the accuracy of transfer was high.)
2) It says nothing about control mechanisms — that is, about the rate at which the processes work.
3) It was intended to apply only to present-day organisms, and not to events in the remote past, such as the origin of life or the origin of the code.
4) It is not the same, as is commonly assumed, as the sequence hypothesis… In particular the sequence hypothesis was a positive statement saying that the (overall) transfer of nucleic acid → protein did exist, whereas the central dogma was a negative statement saying that transfer from protein did not exist.
In various textbooks, including Watson’s book on DNA, the central dogma describes a flow of information from DNA to RNA to protein. Crick calls this the sequence hypothesis. Watson’s definition seems to have taken hold in most textbooks today.
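The difference between the two statements can be made concrete with a small Python sketch. It is an illustration only: the names below are ours, not Crick’s, and the textbook flow is reduced to the two steps quoted above (replication is left out).

```python
NUCLEIC_ACIDS = {"DNA", "RNA"}

# The textbook chain quoted above, written as (source, target) steps.
TEXTBOOK_FLOW = {("DNA", "RNA"), ("RNA", "protein")}

def forbidden_by_central_dogma(source: str, target: str) -> bool:
    """Crick's negative statement: no transfer from protein to protein or nucleic acid."""
    return source == "protein" and (target == "protein" or target in NUCLEIC_ACIDS)

def asserted_by_sequence_hypothesis(source: str, target: str) -> bool:
    """Crick's positive statement: (overall) nucleic acid -> protein transfer exists."""
    return source in NUCLEIC_ACIDS and target == "protein"

reverse_transcription = ("RNA", "DNA")
print(reverse_transcription in TEXTBOOK_FLOW)              # False
print(forbidden_by_central_dogma(*reverse_transcription))  # False
print(forbidden_by_central_dogma("protein", "DNA"))        # True
```

Reverse transcription falls outside the textbook chain, yet it does not violate the central dogma as Crick stated it, since only transfers out of protein are ruled out.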
Pertinent to current research on non-coding DNA, neither Crick’s paper nor the other definitions of the central dogma specifically address the function of non-coding DNA, which comprises more of the human genome than does DNA that codes for proteins. The prevailing view was that coding DNA was more important, and priority was placed on studying it.
This is an assumption that was drawn from the central dogma or the sequence hypothesis: DNA’s primary function is information transfer in a nucleotide-by-nucleotide reading of the sequence. This assumption was held so strongly that many scientists, including Francis Crick and Leslie Orgel, believed non-coding DNA must be “junk” or otherwise an evolutionary relic.
However, this view of the genome is fast becoming outmoded. Non-coding DNA has functions apart from that of DNA as specified in Crick’s definition of the sequence hypothesis and Watson’s definition of the central dogma.