Orgelian Specified Complexity

As I noted at the start of this series on “specified complexity,” which I’m concluding today, Leslie Orgel introduced that term in his 1973 book The Origins of Life. Although specified complexity as developed by Winston Ewert, Robert Marks, and me attempts to get at the same informational reality that Orgel was trying to grasp, our formulations differ in important ways.

For a fuller understanding of specified complexity, as an appendix to the series, it will therefore help to review what Orgel originally had in mind and to see where our formulation of the concept improves on his. Strictly speaking, this subject is mainly of historical interest. Because The Origins of Life is out of print and hard to get, I will quote from it extensively, offering exegetical commentary. I will focus on the three pages of his book where Orgel introduces and then discusses specified complexity (pages 189–191).

“Terrestrial Biology”

Orgel introduces the term “specified complexity” in a section titled “Terrestrial Biology.” Elsewhere in his book, Orgel also considers non-terrestrial biology, which is why the title of his book refers to the origins (plural) of life: radically different forms of life might arise in different parts of the universe. To set the stage for introducing specified complexity, Orgel discusses the various commonly cited defining features of life, such as reproduction or metabolism. Thinking these don’t get at the essence of life, he introduces the term that is the focus of this series:

It is possible to make a more fundamental distinction between living and nonliving things by examining their molecular structure and molecular behavior. In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple, well-specified structures because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures which are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. (p. 189)

So far, so good. Everything Orgel writes here makes good intuitive sense. It matches up with the three types of order discussed at the start of this series: repetitive order, random order, complex specified order. Wanting to put specified complexity on a firmer theoretical basis, Orgel next connects it to information theory:

These vague ideas can be made more precise by introducing the idea of information. Roughly speaking, the information content of a structure is the minimum number of instructions needed to specify the structure. One can see intuitively that many instructions are needed to specify a complex structure. On the other hand, a simple repeating structure can be specified in rather few instructions. Complex but random structures, by definition, need hardly be specified at all. (p. 190)

Orgel’s elaboration here of specified complexity calls for further clarification. His use of the term “information content” is ill-defined. He unpacks it in terms of “minimum number of instructions needed to specify a structure.” This suggests a Kolmogorov information measure. Yet complex specified structures, according to him, require lots of instructions, and so suggest high Kolmogorov information. By contrast, specified complexity as developed in this series requires low Kolmogorov information.

At the same time, for Orgel to write that “complex but random structures … need hardly be specified at all” suggests low Kolmogorov complexity for random structures, which is exactly the opposite of how Kolmogorov information characterizes randomness. For Kolmogorov, the random structures are those that are incompressible, and thus, in Orgel’s usage, require many instructions to specify (not “need hardly be specified at all”).
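The contrast between compressible and incompressible structures can be made concrete with a minimal sketch. It assumes that the length of a zlib-compressed string is an acceptable stand-in for description length. That is only a crude upper bound, since Kolmogorov information itself is not computable, and the seeded pseudo-random generator merely imitates a genuinely random polymer. The function name `compressed_bits` and the sequence length are hypothetical choices made for illustration.

```python
import random
import zlib


def compressed_bits(data: bytes) -> int:
    """Bits in the zlib-compressed form of data (a rough upper bound on description length)."""
    return 8 * len(zlib.compress(data, 9))


n = 10_000  # hypothetical sequence length

# Repetitive order: one letter repeated n times.
repetitive = b"A" * n

# Random order: letters drawn uniformly from a four-letter alphabet.
rng = random.Random(0)  # fixed seed only so the sketch is reproducible
random_seq = bytes(rng.choice(b"ACGT") for _ in range(n))

print("repetitive:", compressed_bits(repetitive), "bits")
print("random:    ", compressed_bits(random_seq), "bits")
```

The repetitive string shrinks to a tiny fraction of its raw size, while the random one stays near two bits per letter, which is the sense in which random structures require many instructions rather than few.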

Perhaps Orgel had something else in mind — I am trying to read him charitably — but from the vantage of information theory, his options are limited. Shannon and Kolmogorov are, for Orgel, the only games in town. And yet, Shannon information, focused as it is on probability rather than instruction sets, doesn’t clarify Orgel’s last remarks. Fortunately, Orgel elaborates on them with three examples:

These differences are made clear by the following example. Suppose a chemist agreed to synthesize anything that could be described accurately to him. How many instructions would he need to make a crystal, a mixture of random DNA-like polymers or the DNA of the bacterium E. coli? (p. 190)

This passage seems promising for understanding what Orgel is getting at with specified complexity. Nonetheless, it also suggests that Orgel is understanding information entirely in terms of instruction sets for building chemical systems, which then weds him entirely to a Kolmogorov rather than Shannon view of information. In particular, nothing here suggests that he will bring both views of information together under a coherent umbrella.

The Language of Short Descriptions

Here is how Orgel elaborates the first example, which is replete with the language of short descriptions (as in the account of specified complexity given in this series):

To describe the crystal we had in mind, we would need to specify which substance we wanted and the way in which the molecules were to be packed together in the crystal. The first requirement could be conveyed in a short sentence. The second would be almost as brief, because we could describe how we wanted the first few molecules packed together, and then say “and keep on doing the same.” Structural information has to be given only once because the crystal is regular. (p. 190)

This example has very much the feel of our earlier example in which Kolmogorov information was illustrated in a sequence of 100 identical coin tosses (0 for tails) described very simply by “repeat ‘0’ 100 times.” For specified complexity as developed in this series, an example like this one by Orgel yields a low degree of specified complexity. It combines both low Shannon information (the crystal forms reliably and repeatedly with high probability and thus low complexity) and low Kolmogorov information (the crystal requires only a short description or instruction set). It exhibits specified non-complexity, or what could be called specified simplicity.

A Fatal Difficulty

Orgel’s next example, focused on randomness, is more revealing, and indicates a fatal difficulty with his approach to specified complexity:

It would be almost as easy to tell the chemist how to make a mixture of random DNA-like polymers. We would first specify the proportion of each of the four nucleotides in the mixture. Then, we would say, “Mix the nucleotides in the required proportions, choose nucleotide molecules at random from the mixture, and join them together in the order you find them.” In this way the chemist would be sure to make polymers with the specified composition, but the sequences would be random. (p. 190)

Orgel’s account of forming random polymers here betrays information-theoretic confusion. Previously, he was using the terms “specify” and “specified” in the sense of giving a full instruction set to bring about a given structure — in this case, a given nucleotide polymer. But that’s not what he is doing here. Instead, he is giving a recipe for forming random nucleotide polymers in general. Granted, the recipe is short (i.e., bring together the right separate ingredients and mix), suggesting a short description length since it would be “easy” to tell a chemist how to produce it.

But the synthetic chemist here is producing not just one random polymer but a whole bunch of them. And even if the chemist produced a single such polymer, it would not be precisely identified. Rather, it would belong to a class of random polymers. To identify and actually build a given random polymer would require a large instructional set, and would thus indicate high, not low Kolmogorov information, contrary to what Orgel is saying here about random polymers.

Finally, let’s turn to the example that for Orgel motivates his introduction of the term “specified complexity” in the first place:

It is quite impossible to produce a corresponding simple set of instructions that would enable the chemist to synthesize the DNA of E. coli. In this case, the sequence matters: only by specifying the sequence letter-by-letter (about 4,000,000 instructions) could we tell the chemist what we wanted him to make. The synthetic chemist would need a book of instructions rather than a few short sentences. (p. 190)

Orgel’s Takeaway

Given this last example, it becomes clear that for Orgel, specified complexity is all about requiring a long instructional set to generate a structure. Orgel’s takeaway, then, is this:

It is important to notice that each polymer molecule in a random mixture has a sequence just as definite as that of E. coli DNA. However, in a random mixture the sequences are not specified. Whereas in E. coli, the DNA sequence is crucial. Two random mixtures contain quite different polymer sequences, but the DNA sequences in two E. coli cells are identical because they are specified. The polymer sequences are complex but random: although E. coli DNA is also complex, it is specified in a unique way. (pp. 190–191)

This is confused. The reason it’s confused is that Orgel’s account of specified complexity commits a category mistake. He admits that a random sequence requires just as long an instruction set to generate as E. coli DNA because both are, as he puts it, “definite.” Yet with random sequences, he looks at an entire class or range of random sequences whereas with E. coli DNA, he is looking at one particular sequence.

Orgel is correct, as far as he goes, that from an instruction set point of view, it’s easy to generate elements from such a class of random sequences. And yet, from an instruction set point of view, it is no easier to generate a particular random sequence than a particular non-random sequence, such as E. coli DNA. That’s the category mistake. Orgel is applying instruction sets in two very different ways, one to a class of sequences, the other to particular sequences. But he fails to note the difference.
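The difference between a recipe for a class of sequences and an instruction set for one particular sequence can be sketched as follows. This is an illustration only: the names `random_polymer` and `letter_by_letter_bits`, the polymer length, and the two-bits-per-base encoding are assumptions, not anything taken from Orgel.

```python
import random

ALPHABET = "ACGT"


def random_polymer(length: int, rng: random.Random) -> str:
    """Short recipe for the class: draw nucleotides at random and join them in order."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def letter_by_letter_bits(sequence: str) -> int:
    """Bits needed to spell out one particular sequence base by base (2 bits per base)."""
    return 2 * len(sequence)


rng = random.Random()
first = random_polymer(1000, rng)
second = random_polymer(1000, rng)

# The same short recipe yields a different member of the class on each run.
print("same sequence twice?", first == second)

# Pinning down either particular sequence costs the same as any other
# sequence of that length, random or not.
print("bits to spell out first: ", letter_by_letter_bits(first))
print("bits to spell out second:", letter_by_letter_bits(second))
```

The recipe stays a few lines long however long the polymers are, but it never says which polymer comes out. Naming one particular output letter by letter costs as much as naming any other sequence of the same length.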

A Different Tack

The approach to specified complexity that Winston Ewert and I take, as characterized in this series, takes a different tack. Repetitive order yields high probability and specification, and therefore combines low Shannon and low Kolmogorov information, yielding, as we’ve seen, what can be called specified simplicity. This is consistent with Orgel. But note that our approach yields a specified complexity value (albeit a low one in this case). Specified complexity, as a difference between Shannon and Kolmogorov complexity, takes continuous values and thus comes in degrees. For repetitive order, specified complexity, as characterized in this series, will thus take on low values.

That said, Orgel’s application of specified complexity to distinguish a random nucleotide polymer from E. coli DNA diverges sharply from how specified complexity as outlined in this series applies to these same polymers. A random sequence, within the scheme outlined in the series, will have large Shannon information but also, because it has no short description, will have large Kolmogorov information, so the two will cancel each other, and the specified complexity of such a sequence will be low or indeterminate.
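A toy calculation may help fix the idea of specified complexity as a difference between Shannon and Kolmogorov information. It is a sketch under stated assumptions: the probabilities and description lengths below are hypothetical numbers chosen for illustration, and `description_bits` merely stands in for Kolmogorov information, which cannot be computed in general.

```python
import math


def shannon_bits(probability: float) -> float:
    """Shannon information of an event, in bits: -log2(probability)."""
    return -math.log2(probability)


def specified_complexity(probability: float, description_bits: float) -> float:
    """Shannon information minus description length, both in bits."""
    return shannon_bits(probability) - description_bits


n = 100  # hypothetical polymer length, four possible letters per position

# (probability of the event, length of its shortest known description in bits)
# All values are illustrative assumptions, not measurements.
cases = {
    "repetitive order": (0.99, 20),
    "random order": (4.0 ** -n, 2 * n),
    "improbable, briefly described": (4.0 ** -n, 20),
}

for label, (p, k) in cases.items():
    print(f"{label}: {specified_complexity(p, k):.2f} bits")
```

With these made-up inputs, the repetitive case comes out low, the random case cancels to roughly zero, and only the improbable event with a short description yields a large value.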

On the other hand, for E. coli DNA, within the scheme outlined in this series, there will be work to do in showing that it actually exhibits specified complexity. The problem is that the particular sequence in question will have low probability and thus high Shannon information. At the same time, that particular sequence will be unlikely to have a short exact description. Rather, what will be needed to characterize the E. coli DNA as exhibiting specified complexity within the scheme of this series is a short description to which the sequence answers but which also describes an event of small probability, thus combining high Shannon information with low Kolmogorov information.

Specified complexity as characterized in this series and applied to this example will thus mean that the description will include not just the particular sequence in question but a range of sequences that answer to the description. Note that there is no category mistake here as there was with Orgel. The point of specified complexity as developed in this series is always to match events with descriptions of those events, where any particular event counts as described provided it answers to the description. For instance, a die roll exhibiting a 6 answers to the description “an even die roll.”
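The die example can be spelled out in a few lines. This is a minimal sketch assuming a fair six-sided die, with a description modeled as a predicate on outcomes. The names `is_even` and `event_probability` are hypothetical.

```python
import math
from fractions import Fraction

OUTCOMES = range(1, 7)  # faces of a fair six-sided die (an assumption)


def is_even(roll: int) -> bool:
    """The description 'an even die roll', modeled as a predicate."""
    return roll % 2 == 0


def event_probability(description) -> Fraction:
    """Probability of the whole event picked out by a description."""
    matching = [o for o in OUTCOMES if description(o)]
    return Fraction(len(matching), len(OUTCOMES))


observed = 6
print("6 answers to the description:", is_even(observed))

# The probability that matters is that of the described event {2, 4, 6},
# not that of the single observed outcome.
p = event_probability(is_even)
print("probability of described event:", p)
print("Shannon information in bits:", -math.log2(p))
```

The probability assessed is that of the full range of outcomes answering to the description, which is why a description and the event it identifies are always evaluated together.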

So, is there a simple description of the E. coli DNA that shows this sequence to exhibit specified complexity in the sense outlined in this series? That’s in fact not an easy question to answer. The truth of Darwinian evolution versus intelligent design hinges on the answer. Orgel realized this when he wrote the following immediately after introducing the concept of specified complexity, though his reference to miracles is a red herring (at issue is whether life is the result of intelligence, and there’s no reason to think that intelligence as operating in nature need act miraculously):

Since, as scientists, we must not postulate miracles we must suppose that the appearance of “life” is necessarily preceded by a period of evolution. At first, replicating structures are formed that have low but non-zero information content. Natural selection leads to the development of a series of structures of increasing complexity and information content, until one is formed which we are prepared to call “living.” (p. 192)

Orgel is here proposing that life evolves to increasing levels of complexity, where at each stage nothing radically improbable is happening. Natural selection is thus seen as a probability amplifier that renders probable what otherwise would be improbable. Is there a simple description to which the E. coli DNA answers and which is highly improbable, not just when the isolated nucleotides making up the E. coli DNA are viewed as a purely random mixture but rather by factoring in their evolvability via Darwinian evolution?

A Tough Question

That’s a tough question to answer precisely because it is far from clear how to evaluate the probability of forming E. coli DNA, with or without natural selection. Given Orgel’s account of specified complexity, he would have to say that the E. coli DNA exhibits specified complexity. But within the account of specified complexity given in this series, ascribing specified complexity always requires doing some work: finding a description to which an observed event answers, showing the description to be short, and showing that the event precisely identified by the description has small probability, implying high Shannon information and low Kolmogorov information.

For intelligent design in biology, the challenge in demonstrating specified complexity is always to find a biological system that can be briefly described (yielding low Kolmogorov complexity) and whose evolvability, even by Darwinian means, has small probability (yielding high Shannon information). Orgel’s understanding of specified complexity is quite different. In my view, it is not only conceptually incoherent but also stacks the deck unduly in favor of Darwinian evolution.

To sum up, I have presented Orgel’s account of specified complexity at length so that readers can decide for themselves which account of specified complexity they prefer, Orgel’s or the one presented in this series.

Editor’s note: This article appeared originally at BillDembski.com.
