A new paper in the journal Cellular and Molecular Life Sciences, “Integration of syntactic and semantic properties of the DNA code reveals chromosomes as thermodynamic machines converting energy into information,” argues that cellular mechanisms involved in processing genetic information make up an irreducibly complex system. The system requires genetic information, genetic machinery keyed to read that genetic information, as well as specific chromosomal organization. All of these components are necessary for what the paper calls “the organisational complexity of the genetic regulation system.”
To be precise, the paper uses the term “irreducible organization,” but it amounts to the same thing as biochemist Michael Behe’s “irreducible complexity,” and points implicitly to the same challenge to Darwinian accounts of origins.
The paper aims to critique the reductionist “Jacob-Monod paradigm,” which fails to appreciate the complexity of genetic information, as well as the interaction between transcription factors and their target genes. Of course we’re all familiar with genetic information in DNA being required to produce proteins. But the paper argues that in addition to the “digital information” in the primary DNA sequence, there is also “analog information” in the three-dimensional structure of chromosomes:
Recent studies have made it increasingly evident that the primary sequence of DNA in addition to the linear genetic code also provides three-dimensional information by means of spatially ordered supercoil structures relevant to all DNA transactions, including transcriptional control. In this review, we adopt the previously introduced terms “analog” and “digital” with regard to the two logically distinct types of information provided by the DNA. … [A]ny DNA gene is a carrier of digital information by virtue of its unique base sequence. Moreover, a gene conceived as an isolated piece of linear code (no matter whether this isolation occurs at the level of transcription or posttranscriptional processing), is a discontinuous entity that can be expressed or not, thus principally consistent with an “on-or-off” logic and, therefore, belonging to digital information type. Conversely, the physicochemical properties of DNA, as exemplified by supercoiling and mechanical stiffness, are determined not by individual base pairs but by the additive interactions of successive base steps. Supercoiling is by definition a continuous parameter ranging between positive and negative values (you can have more or less of it), and so belongs to analog information type.
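To make the digital/analog distinction concrete, here is a minimal Python sketch. It is not from the paper: the function name at_fraction_profile, the example sequence, and the window size are all hypothetical. A+T content is used only as a crude stand-in for the base-step properties the authors describe. A-T pairs are held by two hydrogen bonds and G-C pairs by three, so AT-rich stretches tend to come apart more easily. The point is simply that the same string of discrete letters also yields a smoothly varying profile once neighboring bases are considered together.

```python
def at_fraction_profile(seq, window):
    """Return the A+T fraction of every sliding window along seq.

    Illustrative only: A+T fraction is an assumed proxy for the
    additive base-step properties discussed in the paper, not the
    paper's own measure.
    """
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    profile = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        at_count = sum(1 for base in chunk if base in "AT")
        profile.append(at_count / window)
    return profile


# Hypothetical sequence: a GC-rich flank, an AT-rich core, a GC-rich flank.
example_seq = "GCGCATATATGCGC"

# "Digital" view: a discrete string of symbols.
digital_view = list(example_seq)

# "Analog" view: a graded value that rises and falls along the molecule.
analog_view = at_fraction_profile(example_seq, window=4)
print(analog_view)
```

The profile climbs from 0.0 in the GC-rich flank to 1.0 in the AT-rich core and back down again, a graded quantity of the "more or less of it" kind, even though the underlying sequence is a plain string of four letters.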
On top of the various forms of information in DNA, chromosomal structure is vital for gene regulation, as it helps control the interaction between transcription factors (TFs) and the target genes (TGs) whose transcription they initiate:
several studies have proposed that the organisation of chromosomal structure on the evolutionary time scale is largely determined by the need of spatial optimisation of TF-TG interactions.
The paper then argues that the system of genetic regulation in cells is characterized by “irreducible organization”:
Genetic regulation is crucial not only for sustaining the self-reproduction of a cell but also for substituting its worn-out constituents. This implies that a genetic regulation system, as a system consisting of physical elements, must be able not only to perform its primary function but also to perceive any internal changes of state so that it retains the potential, for example, to replenish its own components. In other words, it has to be self-referential. This peculiarity of organisation becomes conspicuous when compared to information coding in natural language, the syntactic and semantic properties of which provide logically different types of information. Syntax determines the structure of the rules of language and, thus, the way in which the words are assembled in sentences, whereas semantics determine the meaning of the words and so the available vocabulary. However, the structural rules of language cannot determine the meanings of the words, and nor is the vocabulary determinative for the structural rules of the language (we do not concern ourselves with any generative mechanisms relevant to the formal language theory here). Therefore, viewed as a coding system composed of two non-convertible types of information, natural language is not self-referential. By the same token, the Jacob-Monod paradigm separating the gene regulatory context from the genetic information is at variance with self-referential organisation. Notably, we do not use this term in the sense of elaborated mathematical concepts of distinction, circulation, feedback, re-entry, recursion, etc. Self-referential organisation, as we put it here, implies inter-conversion of information between logically distinct coding systems specifying each other reciprocally. Thus, the holistic approach assumes selfreferentiality (completeness of the contained information and full consistency of the different codes) as an irreducible organisational complexity of the genetic regulation system of any cell.
Put another way, this implies that the structural dynamics of the chromosome must be fully convertible into its genetic expression and vice versa. Since the DNA is an essential carrier of genetic information, the fundamental question is how this self-referential organisation is encoded in the sequence of the DNA polymer.
The article even specifies that there are “Three basic components underlying the irreducible organisational complexity of any living cell” where “the organisation is essentially circular with all three basic components standing in relationship to reciprocal determination.” Those three components are specified as transcriptional machinery, DNA topology, and metabolic energy. The authors are perplexed by how the “irreducible” and “circular” organization of this system arose since they admit, “we face a ‘chicken or egg’ dilemma — on the one hand the TF-TG interactions are determinative for the chromosomal structure, and on the other hand this very same structure determines the regulatory interactions.”
As noted, the paper recognizes that there are other types of information in DNA beyond merely the sequence of bases. What’s incredible is that even though these two types of information are specified through different physical means, they nonetheless interact to regulate gene expression. The article explains that the supercoiling structure of DNA is vital to regulating gene expression, and at the same time it is not specified by individual base pairs the way a coding sequence is. However, the base-pair sequence does interact with the supercoiling, and the sequences at transcription start sites are more prone to localized untwisting, which allows transcription to begin:
In general, the regions of chromosomes that are sites for topological manipulation (such as, e.g., transcription and replication initiation sites) correlate strongly with low base stacking energies and high flexibility. Indeed, the sequences at the start sites of transcription and replication are prone to localised untwisting, whereas the termination sites — and especially the regions between two converging translocases (be it a replisome or RNA polymerase) — appear to easily adopt a writhed configuration acting as supercoil repositories. The emerging view is that manipulation of superhelical density and regulation of partitioning between twist and writhe is a fundamental property of both.
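The twist and writhe mentioned in the quote obey a standard relation for closed circular DNA: the linking number Lk equals twist plus writhe (Lk = Tw + Wr), and superhelical density is the fractional deviation of Lk from its relaxed value Lk0. Here is a rough Python sketch of that bookkeeping. It assumes about 10.5 base pairs per helical turn for relaxed B-DNA; the function names and the 4,200 bp example molecule are hypothetical and do not come from the paper.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of relaxed B-DNA


def relaxed_linking_number(n_bp, bp_per_turn=BP_PER_TURN):
    """Lk0: linking number of a relaxed closed circle of n_bp base pairs."""
    return n_bp / bp_per_turn


def superhelical_density(lk, n_bp, bp_per_turn=BP_PER_TURN):
    """sigma = (Lk - Lk0) / Lk0; negative means underwound DNA."""
    lk0 = relaxed_linking_number(n_bp, bp_per_turn)
    return (lk - lk0) / lk0


def writhe_from(lk, twist):
    """Lk = Tw + Wr, so whatever Lk is not taken up as twist is writhe."""
    return lk - twist


# Hypothetical 4,200 bp circle with a linking number of 376.
n_bp = 4200
lk = 376

lk0 = relaxed_linking_number(n_bp)      # 400 turns when relaxed
sigma = superhelical_density(lk, n_bp)  # (376 - 400) / 400

# The same linking deficit can be partitioned differently between
# twist and writhe; Lk itself stays fixed unless a strand is cut.
mostly_untwisted = writhe_from(lk, twist=380)  # deficit mostly in twist
mostly_writhed = writhe_from(lk, twist=398)    # deficit mostly in writhe

print(lk0, sigma, mostly_untwisted, mostly_writhed)
```

In this example the molecule is underwound by 24 turns. A region that absorbs the deficit as reduced twist is locally untwisted, while a region that absorbs it as writhe coils up on itself, which is the partitioning the authors say start sites and termination sites handle differently.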
There are other levels of organization that have to do with the location and shape of the chromosome in time and space:
spatiotemporal integration of the analog (syntactic) and digital (semantic) properties of the chromosomal DNA code appears as a basic device coordinating the bacterial growth program. This coordination is facilitated by organising genes in a highly conserved order and orientation…
The article concludes: “chromosomes act as machines in which coordinated topological transitions operating at local (e.g. transcription initiation sites), regional (constrained superhelical domains) and global (entire chromosomes) levels specify the genetic activity.”
All of this is pretty technical, but a summary sent to me by a pro-ID biologist helps explain how all of these different levels of information are coordinated with one another to facilitate basic cellular functions:
The authors describe how the supercoiling (superhelicity) of the DNA, which affects levels of transcription in a rheostatic (analog) manner, is arranged in a gradient from the origin of replication to the terminus. Anabolic functions, which are expressed early in the cell cycle, show a preference to be on the leading strand (with regard to replication) and are organized close to the origin of replication, whereas catabolic functions are expressed late in the cell cycle, organized toward the terminal region of replication. Furthermore, the anabolic genes require high negative superhelicity for transcription, which is increased during rapid growth and therefore rapid replication of the DNA. So, during rapid growth, when anabolic functions become a limiting factor, a bottleneck if you will, the DNA replication generates more strain on the chromosome, i.e. more negative superhelicity, which is exactly the parameters for increasing anabolic functions. Brilliant.
How did these various independent levels of information become “coordinated”? Brilliance seems the best explanation for something brilliant.