Why We Do Not Evolve Software?

https://journals.sagepub.com/doi/10.1177/1176934318815906


Great find! A good example of why we software engineers find ID so obvious.

Our analysis of relevant literature shows that no one has succeeded at evolving nontrivial software from scratch; in other words, the Darwinian algorithm works in theory but does not work in practice, when applied in the domain of software production. The reason we do not evolve software is that the space of working programs is very large and discrete. Although hill-climbing heuristic–based evolutionary computations are excellent at solving many optimization problems, they fail in the domains of noncontinuous fitness [87]. This is also the reason we do not evolve complex alife or novel engineering designs. With respect to our 2 predictions, we can conclude that (1) simulations of evolution do not produce comparably complex artifacts and (2) running EAs longer leads to progressively diminishing results. With respect to the 3 falsifiability conditions, we observe that all 3 are true as of this writing. Likewise, neither the longest-running EA nor the most complex-evolved algorithm nor the most complex digital organism are a part of our common cultural knowledge. This is not an unrealistic expectation as successful software programs, such as Deep Blue [88] or Alpha Go [89, 90], are well known to the public.
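For anyone who wants to see the quoted claim about "noncontinuous fitness" in miniature, here is a toy sketch. It is not taken from the article. The function names (`hill_climb`, `onemax`, `needle`, `run_demo`) and the 40-bit problem size are assumptions chosen purely for illustration. The same single-bit-flip hill climber is run on a smooth fitness function, where every correct bit is rewarded, and on an all-or-nothing one, where only the perfect string scores.

```python
import random

def hill_climb(fitness, n_bits, steps, rng):
    """Single-bit-flip hill climber: keep a mutation unless it lowers fitness."""
    current = [rng.randint(0, 1) for _ in range(n_bits)]
    for _ in range(steps):
        candidate = current[:]
        i = rng.randrange(n_bits)
        candidate[i] ^= 1  # flip one randomly chosen bit
        if fitness(candidate) >= fitness(current):
            current = candidate
    return current

def onemax(bits):
    # Smooth landscape: each 1-bit adds one point, so every step gives feedback.
    return sum(bits)

def needle(bits):
    # All-or-nothing landscape: only the all-ones string scores anything.
    return 1 if all(bits) else 0

def run_demo(n_bits=40, steps=2000, seed=0):
    rng = random.Random(seed)
    smooth_result = hill_climb(onemax, n_bits, steps, rng)
    needle_result = hill_climb(needle, n_bits, steps, rng)
    return onemax(smooth_result), needle(needle_result)

score_smooth, score_needle = run_demo()
print("smooth landscape score:", score_smooth)
print("needle landscape score:", score_needle)
```

On the smooth landscape the climber reaches the optimum quickly; on the needle landscape it just wanders, because no intermediate step is rewarded. Whether biological fitness landscapes resemble the first case or the second is exactly what the rest of this thread argues about, so the sketch only illustrates the distinction and settles nothing.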

This reminds me of the urban myth about the scientist who modeled the flight of bumblebees and found that they shouldn’t be able to fly.


OR it’s why code is a poor analogy to DNA. Genetic algorithms do a better job with neural networks.


Also a good example of why evolutionary biologists find so many ID-pushing software engineers clueless about actual science.


You have to get time on a supercomputer to properly predict protein folding and protein interactions, and even then it takes pre-existing models to get something close. I really don’t see how any computer program could properly model all of the biology involved, especially when you are talking about interactions between millions of proteins and nucleotide molecules.

“Why We Do Not Evolve Software?” is a great title for a journal article. But it also reminds me of similar essays by scholars of centuries (and decades) past. I’ll not try to compile a carefully footnoted list but will paraphrase a few of them from memory:

“If science is powerful, why can’t it produce a biochemical in the laboratory? (No, only the living God and the life God creates can produce an organic compound.)” [Friedrich Wöhler demolished that one. He synthesized urea in his laboratory. That was a shocking announcement at the time and organic chemistry adopted a new definition: carbon chemistry.]

“Science has produced impressive lighter-than-air machines—but science has entirely failed to produce a machine that flies in any way like birds do!”

“Why can’t computer scientists write a program which beats a grandmaster at chess? That goal appears elusive.”

“Why has machine translation of human languages largely been abandoned? We shouldn’t hold our breath.” (I remember such articles in the 1970s.)

Of course, most of us probably had a first reaction to the article which was basically, “Huh? We do evolve software! That’s why Genetic Algorithms are very important tools in software engineering.” Yes, as one reads the abstract, the key is apparently the solving of “non-trivial” problems. I have mixed feelings about successfully defining the boundary between trivial and non-trivial problems which are addressed by Evolutionary Algorithms. And I’m not so sure that the fact that human intelligence plays such a big role in applying EAs necessarily negates the fact that we do evolve software in the course of software engineering.

I’m not casually dismissing the journal article. I just wonder how it will be regarded in another generation or two.


The first part of the article correctly claims that genetic algorithms alone do a poor job of explaining how humans develop software. So far so good.

It is correct that intelligent behavior, such as software development, does not rely solely on trial-and-error learning or even solely on the pattern- and correlation-seeking approaches of deep learning algorithms. Intelligence requires embodied learning of causal models so that counterfactuals can be evaluated (possibly subconsciously).

As best I can tell on a brief reading, the second part of the article tries to apply the search algorithm analogy to various scientific domains: origin of life, origin of intelligence, brain development. But even assuming that the best science models in those domains can be summarized as searches, in each case the article ignores the constraints that the resulting science places on the structure of the search space and so on the effective means to find local optima in that space.

Overall, the article seems to me to be another example of the ID worldview that math dictates correct science: that is, that one can draw conclusions about the world by a priori mathematical reasoning based on intuitive assessment of probabilities and complexity, ungrounded in the relevant science and empirical knowledge.


My professional life was mostly about managing software development. There are some interesting parallels between modern software development and biological evolution.

The latest approaches to software development, namely agile methodology, have taken the lessons of evolution to heart: they de-emphasize ornate design processes and instead emphasize smaller cycles of trial, feedback, improvement.

One can even draw analogies between agile software development techniques and biological evolution:

  1. Fitness is reflected by user requirements. In agile development, one uses small sets of user stories, i.e. something close to the current fitness function, rather than large requirements documents.

  2. Automated unit tests confirm that no current functionality is lost. That corresponds to biological fitness eliminating non-viable mutations quickly.

  3. Code, not architecture. Agile de-emphasizes paper design and emphasizes code to gain immediate feedback on fitness (i.e. through user testing).

  4. Technical debt. A recognized problem with agile approaches is the accumulation of technical debt, which is software that is difficult to modify due to non-generalized design. That corresponds to the “poor design” resulting from biological evolution.


I think we can also learn from the “Tornado in a junkyard assembling a 747” analogy. That analogy fails because airplanes are not living things that reproduce. BUT if we extend the analogy to include the environment of all airplanes and the humans who build them, we see an “evolution” of aviation technology starting before the Wright Flyer and extending to the present day. Along the way there has been an enormous amount of experimentation and failure before the most successful designs were put into production - not entirely unlike the process of evolution by natural selection - but here the selection criteria are what flies, how fast, how much cargo, and how economically.

I think we might make the same parallel to evolving software, but we need to extend the boundary of the software environment to include humans. In this sense we do see the evolution of software. I realize that isn’t the same thing as computers evolving software on their own; I am suggesting that if we want to achieve that goal we need to reconsider the sort of environment being simulated, and make it more like the environment where software does evolve.

To make that a little less hand-wavy, I suggest that a combination of GA and agent-based modelling will be more successful than GA alone in evolving software. The agents would have knowledge of specific types of coding tasks, but not the overall goal.


I haven’t seen a very compelling argument for this position. I see another thread has gone on for hundreds of posts on this topic.

The only argument I see is that the genetic code is not analogous to human-written code. That’s true, but it does not prove that the genetic code is not some sort of digital code.

If a computer simulation does not mimic what we observe in nature, which is wrong, nature or the simulation?

Analogies don’t have to be perfect, so I have no problem with using computer code as an analogy for DNA. However, problems arise when you forget that it is an analogy and start to make conclusions about DNA based on computer code.


Then you haven’t been paying attention.


In that case we need some kind of summary or pin function in these threads, because there is no way I’m analyzing hundreds of comments to figure out what all the counter arguments are. I read through the beginning of that long thread, and other threads where this debate has shown up, and DNA != human code is the only argument that has stood out to me.

I grant I may have missed some subtleties somewhere, but for whatever reason, I cannot recall any compelling reason to think the genetic code is not a digital code, and thus the same theory that applies to Turing machines and computer code can be applied to the genetic code. Hence my befuddlement as to why ID is not just a completely obvious implication of the genetic code.

Granting the genetic code is a digital code, so I can call it a “program”, here is my list of other reasons I think you all don’t think the OP applies:

  • Function is much denser in genetic sequences than valid programs are among possible computer programs.
  • Pathways to functionality are much better orchestrated than in the case of evolving computer programs.
  • There is error correction built into the processing stage, so invalid genetic programs are adjusted to become valid programs. So, we can blast a program with a bunch of errors which will move it into the neighborhood of another valid program, which the error-correcting mechanism will complete to become a genetic program that generates a new kind of functionality.

Anything besides these? Note, these reasons are not reasons the genetic code is not a computational code. They are reasons that grant it is a computational code, yet still a genetic algorithm should be able to evolve some new functionality, and thus these reasons can be analyzed using standard computational theory.

That is why I had been splitting topics in the past, but that seemed to annoy some people. Regardless, from the start I’ve been repeating a different response on that particular case anyway.

It seems you forgot to complete your sentence. Looking forward to seeing you try.

Function is much less dense in DNA than in a computer program. That is something we can objectively measure too.

Not sure I agree.

I’m not sure I’d call it error correction, but the general sentiment is correct here. This is an example of a difference.

There are more differences too…

This is an example of why I don’t find the argument “genetic code != computer code” convincing.

Error correction is a well established part of computer coding. You are only able to see what I’m typing to you because of a myriad of error correcting codes from hardware to software to internet.
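To make the point about error-correcting codes concrete, here is a minimal sketch of the simplest one, a 3x repetition code with majority-vote decoding. The names `encode` and `decode` and the sample message are assumptions for illustration. Real hardware and network stacks use far stronger schemes (Hamming codes, Reed-Solomon, checksums with retransmission), and nothing here is meant to model how cells repair DNA.

```python
def encode(bits, copies=3):
    """Repetition code: send every bit `copies` times."""
    return [b for b in bits for _ in range(copies)]

def decode(received, copies=3):
    """Recover each original bit by majority vote over its group of copies."""
    decoded = []
    for i in range(0, len(received), copies):
        group = received[i:i + copies]
        decoded.append(1 if sum(group) * 2 > len(group) else 0)
    return decoded

message = [1, 0, 1, 1]
sent = encode(message)

# Simulate channel noise: flip one bit in the first group and one in the last.
corrupted = sent[:]
corrupted[1] ^= 1
corrupted[9] ^= 1

recovered = decode(corrupted)
print("message:  ", message)
print("corrupted:", corrupted)
print("recovered:", recovered)
```

The decoder restores the original message as long as no group of three suffers more than one flipped bit. That redundancy-plus-voting idea is the sense in which error correction is routine in computing.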

So, error correction is not a dissimilarity between genetic code and computer code.

Can you give an example of anything which cannot be expressed as a digital code? (Perhaps simpler if we limit this to things found in nature.) If not, then anything could be a digital code. IOW: calling DNA a digital code does not appear to tell us anything useful about DNA.

I meant to mention something about networks here. I don’t think neural networks apply, but chemical networks, if I can call them that: chains of reactions and products in a micro-environment of promoters and inhibitors. Chemical reactions have an element of randomness which we do not see in any human-designed code.


Yes, everything in a discrete finite realm (i.e. our universe as we know it) can be expressed as a digital code. However, not everything operates as a digital code. A digital code lists a series of options, like checkmarks on a form or toggle switches, and produces a certain result based on which options are selected. Very little in nature operates this way, except for the genetic code. Hence it being called the “genetic code.”
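As a rough illustration of what "operates as a digital code" means here, the sketch below reads a DNA string three letters at a time and looks each codon up in a table. The table is only a small excerpt of the standard genetic code, and the function name `translate` and the sample sequence are assumptions for illustration. It is deliberately simplified: it ignores regulation, reading-frame selection, alternative genetic codes, and any context effects on how a sequence is processed.

```python
# Small excerpt of the standard genetic code (DNA codons -> amino acids).
CODON_TABLE = {
    "ATG": "Met",   # also the usual start codon
    "TTT": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "TGG": "Trp",
    "TAA": "Stop",
    "TAG": "Stop",
    "TGA": "Stop",
}

def translate(dna):
    """Read codons three bases at a time until a stop codon is reached."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino = CODON_TABLE.get(codon, "?")  # "?" marks codons outside this excerpt
        if amino == "Stop":
            break
        peptide.append(amino)
    return peptide

peptide = translate("ATGTTTGGCAAATGGTAA")
print("-".join(peptide))
```

The lookup is the "checkmarks on a form" picture: a discrete choice of symbols selects a discrete result.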

Not even that in all cases. There are modifiers and conditions that can alter sequence “processing outcomes” as well.
