A new scientific paper, titled “Dichotomy in the definition of prescriptive information suggests both prescribed data and prescribed algorithms: biosemiotics applications in genomic systems,” seeks to study functional information in biological systems.
As we’ve discussed before, the concept of “Shannon information” does not distinguish between functional and non-functional information, so it is not always useful as a measure of biological information. Many theorists have therefore proposed measures of biological information that take the function of a sequence into account. Several terms are in use for such functional biological information, including complex and specified information (CSI), prescriptive information (PI), and functional sequence complexity (FSC). The paper states:
Biological information frequently manifests its “meaning” through instruction or actual production of formal bio-function. Such information is called Prescriptive Information (PI). PI programs organize and execute a prescribed set of choices. Closer examination of this term in cellular systems has led to a dichotomy in its definition suggesting both prescribed data and prescribed algorithms are constituents of PI. This paper looks at this dichotomy as expressed in both the genetic code and in the central dogma of protein synthesis. An example of a genetic algorithm is modeled after the ribosome, and an examination of the protein synthesis process is used to differentiate PI data from PI algorithms.
(David J. D’Onofrio, David L. Abel, and Donald E. Johnson, “Dichotomy in the definition of prescriptive information suggests both prescribed data and prescribed algorithms: biosemiotics applications in genomic systems,” Theoretical Biology and Medical Modelling, Vol. 9:8 (2012).)
They are not the only ones to recognize the need to measure biological information in terms of its function. In 2003, Nobel Prize-winning origin-of-life researcher Jack Szostak wrote a review article in Nature lamenting that the problem with “classical information theory” is that it “does not consider the meaning of a message” and instead defines information “as simply that required to specify, store or transmit the string.” According to Szostak, “a new measure of information — functional information — is required” in order to take account of the ability of a given protein sequence to perform a given function. In 2007 Szostak co-published a paper in Proceedings of the National Academy of Sciences, with Robert Hazen and other scientists, furthering these arguments. Criticizing those who insist on measuring biological complexity using the outmoded tools of Shannon information, the authors wrote, “A complexity metric is of little utility unless its conceptual framework and predictive power result in a deeper understanding of the behavior of complex systems.” Thus they “propose to measure the complexity of a system in terms of functional information, the information required to encode a specific function.”
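To make the contrast concrete, here is a minimal Python sketch; it is not code from either paper. It computes a simple frequency-based Shannon entropy, which comes out the same for a sequence and for a scrambled copy of it, alongside the functional information measure of Hazen and colleagues, I(Ex) = -log2 F(Ex), where F(Ex) is the fraction of all possible configurations that achieve at least a given degree of function. The function names, the example sequence, and the counts used below are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def shannon_entropy_per_symbol(seq):
    """Zeroth-order Shannon entropy (bits per symbol) from symbol frequencies."""
    counts = Counter(seq)
    total = len(seq)
    # Sorting the counts makes the floating-point sum independent of symbol order.
    return -sum((n / total) * math.log2(n / total) for n in sorted(counts.values()))


def functional_information(n_functional, n_total):
    """I(Ex) = -log2(F(Ex)), with F(Ex) = n_functional / n_total."""
    if not 0 < n_functional <= n_total:
        raise ValueError("need 0 < n_functional <= n_total")
    return -math.log2(n_functional / n_total)


# Hypothetical example sequence; any string would do.
original = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
chars = list(original)
random.Random(0).shuffle(chars)          # scramble the order, keep the composition
scrambled = "".join(chars)

h_original = shannon_entropy_per_symbol(original)
h_scrambled = shannon_entropy_per_symbol(scrambled)

# Hypothetical count: 1 functional sequence among all 4**10 ten-base sequences.
bits = functional_information(1, 4**10)
```

Because this entropy estimate depends only on how often each symbol occurs, `h_original` and `h_scrambled` are equal even though scrambling would be expected to destroy any function. The functional information value, by contrast, grows as the functional fraction shrinks: with the made-up counts above, one sequence in 4^10 corresponds to 20 bits.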
This new paper in Theoretical Biology and Medical Modelling suggests taking much the same approach, looking specifically at how the ribosome handles information. According to the paper:
In our view the ribosome is a machine that executes a sequence of discrete instructions operating upon a set of arbitrary discrete codon packages (PI data) producing a protein product as its output. The machine can produce any variation of protein product by simply changing the syntax of both the tRNA (anti-codon/amino acid map) and the DNA codons.
The authors argue that the ribosome uses an “algorithm” to create proteins:
An operational analysis of the ribosome has revealed that this molecular machine with all of its parts follows an order of operations to produce a protein product. This order of operations has been detailed in a step-by-step process that has been observed to be self-executable. The ribosome operation has been proposed to be algorithmic (R-algorithm) because it has been shown to contain a step-by-step process flow allowing for decision control, iterative branching and halting capability. The R-algorithm contains logical structures of linear sequencing, branch and conditional control. All of these features at a minimum meet the definition of an algorithm and when combined with the data from the mRNA, satisfy the rule that Algorithm = data + control. Remembering that mere constraints cannot serve as bona fide formal controls, we therefore conclude that the ribosome is a physical instantiation of an algorithm.
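The step-by-step flow the authors describe (linear sequencing, a conditional branch, iteration, and a halt) can be sketched in a few lines of Python. This is only an illustrative toy, not the model from the paper: the function name `translate`, the abbreviated codon table, and the example mRNA are assumptions made here, and real translation involves initiation factors, tRNA charging, and much else that is left out.

```python
# Abbreviated codon table (a few entries from the standard genetic code).
# In the paper's terms this mapping plays the role of data, not algorithm.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCC": "Ala",
    "AAA": "Lys", "UGG": "Trp",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}
START_CODON = "AUG"


def translate(mrna, codon_table=CODON_TABLE):
    """Toy translation loop: find a start codon, read codons, halt at a stop codon."""
    start = mrna.find(START_CODON)            # initiation: locate the first AUG
    if start == -1:
        return []                             # no start codon, nothing is produced
    peptide = []
    for i in range(start, len(mrna) - 2, 3):  # iteration: one codon per step
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:              # branch: a stop codon halts the loop
            break
        # Lookup: codon -> amino acid; "Xaa" marks a codon that is
        # missing from this abbreviated table.
        peptide.append(codon_table.get(codon, "Xaa"))
    return peptide


mrna = "GGAUGUUUGGCAAAUGGUAAGCC"              # hypothetical message
protein = translate(mrna)

# Same loop, different mapping: a made-up table that reassigns UUU.
altered_table = {**CODON_TABLE, "UUU": "Leu"}
altered_protein = translate(mrna, altered_table)
```

With the made-up message above, `protein` is Met, Phe, Gly, Lys, Trp, and reading stops at UAA. Passing `altered_table` changes the second residue to Leu without touching the loop, which is one way to picture the paper's point that the product varies with the codon-to-amino-acid map and the codon sequence while the order of operations stays fixed.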
This algorithm is instantiated via a “programming language”:
There is a synergy between the machinery of the ribosome and its coherence with the language context of the DNA/RNA environment, reinforcing the prescribed algorithmic operations of the ribosome. There is no known physicodynamic cause for the codon to tRNA translation scheme. Since all genes can be modeled using rules (be they grammar or logical) rather than physicodynamic determinism, we inductively assert that the operation and organization of the genome operate under the influence of a programming language. The genome can be considered as a collective ensemble of instructions and data. Portions of the DNA sequences are algorithmic instantiations. This is evidenced for example, by pre-initiation, enhancer and promoter regions, lincRNA’s, siRNA’s and a host of other instructive sequences, that collectively instruct direct functionality such as gene regulation. In addition to the instruction constructs, the genome is also composed of data in the form of codons. This results in mature mRNAs that are handled as data by other processors (ribosome) which are executing their own algorithms. In other words there are “multiple programming languages” in the cell.
And what, in our experience, generates programming languages?