PLOS Genetics on non-random mutations

Gonna need to see a citation for this. We haven’t discovered 20 billion microbial species.

I was conservative and rounded the percentage of discovered species down:

We estimate that Earth is inhabited by 10^11–10^12 microbial species. This prediction is based on ecological theory reformulated for large-scale predictions, an expansive dominance scaling law, a richness scaling relationship with empirical and theoretical support, and the largest molecular surveys compiled to date. The profound magnitude of our prediction for Earth’s microbial diversity stresses the need for continued investigation. We expect the dominance scaling law that we uncovered to be valuable in predicting richness, commonness, and rarity across all scales of abundance. To move forward, biologists will need to push beyond current computational limits and increase their investment in collaborative sampling efforts to catalog Earth’s microbial diversity. For context, ∼10^4 species have been cultured, less than 10^5 species are represented by classified sequences, and the entirety of the EMP has cataloged less than 10^7 species, 29% of which were only detected twice. Powerful relationships like those documented here and a greater unified study of commonness and rarity will greatly contribute to finding the potentially 99.999% of microbial taxa that remain undiscovered. (Locey & Lennon, 2016)
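As a rough check of how the quoted percentage follows from the quoted counts, here is a minimal sketch. The function and variable names are my own, and the inputs are simply the upper bounds given in the passage above.

```python
# Upper-bound figures taken from the quoted passage (Locey & Lennon, 2016).
ESTIMATED_SPECIES = 10**12  # upper end of the 10^11 to 10^12 estimate
CATALOGED_BY_EMP = 10**7    # "less than 10^7" species cataloged by the EMP


def undiscovered_percent(cataloged: int, estimated: int) -> float:
    """Percentage of the estimated species that have not been cataloged."""
    return 100 * (estimated - cataloged) / estimated


print(f"{undiscovered_percent(CATALOGED_BY_EMP, ESTIMATED_SPECIES):.3f}%")
```

With the lower end of the estimate (10^11) the same calculation gives 99.99%, so the conclusion is not sensitive to which end of the range is used.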

Yeah, that certainly doesn’t indicate anything like the number of discovered species you were implying.

@davecarlson: You’re right, the potential for falsification is even greater than I suggested.

You are confusing the last universal common ancestor with the first life form. LUCA itself has a pretty substantial evolutionary history. The fact that all its descendants inherited its genes doesn’t mean those genes did not themselves evolve before it.

By the way, what is the nature of the archaeal system analogous to the SOS response in bacteria? Are they actually homologous? If they aren’t, that implies analogous systems have evolved at least twice and were not actually present in LUCA.

@Krauze, I am curious as to what you imagine this computer might be like. I think the pieces of such an instrument are already in place, and that they don’t seem to be taxing the cell. But you may have a very different notion of this.

I’m pretty sure this isn’t possible if you spend a bit of time thinking about what this would require. First of all, you’d need a computer that could accurately predict protein folding under a multitude of physical circumstances. We can barely do that even with warehouse-size supercomputers. Then you’d need the computer to be able to analyze new challenges, and it’s difficult to imagine how that would even work. It’s supposed to be able to predict how the environment will affect the organism in the future at the molecular level, and then compute not only how adding a new protein to this environment will alter it, but also that the fitness effect of this protein will be beneficial, in addition to how it folds?

Now to make matters worse, the computer needs to know the genome of the organism it sits in, and be able to figure out how to create this new protein with the right properties by incremental mutation of already existing genetic sequences? Forget about even doing these calculations and having this kind of knowledge in a computer smaller than a cellular organelle, which is absurd on its face; you now also want this computer system to be in possession of a system of targeted mutagenesis that could literally take any genetic sequence and induce any desired mutation?

You haven’t considered what it even takes to have a system that can induce ONE specific mutation. Suppose there’s a particular spot in the genome where there’s a T nucleotide, and this needs to be changed to a C. What is required to reliably induce this mutation, and be able to discriminate against every other T nucleotide in the genome? Lots of very specific DNA binding proteins are required just to ensure they target the right stretch of DNA, and each of these targeting proteins needs its own regulation to ensure it is only expressed at the right moment, and to prevent its accidental binding to other parts of the genome.

So we are talking many tens of thousands of basepairs of DNA just to encode a system that can make ONE particular mutation. Now multiply that across an entire bacterial genome: you might have 3 million basepairs, for all of which you need a system able to induce a specific mutation your occult molecular supercomputer has ordained will be required to meet a new environmental challenge. A system to change the T into an A, into a G, and into a C. For each basepair. And then a system to induce a deletion, or an insertion, and not just any insertion, a system to insert any particular required nucleotide (or more?), so one of each. Suppose the system computes that an existing protein requires a 6 nucleotide insertion GCATTG in some spot in a gene? How baroquely complex a system of proteins does it require to be able to produce such a piece of DNA on demand? Now that system needs to be encoded in DNA too.

So we can multiply the required genome length by a factor of at least 40,000. Literally we now need to take that 3 megabase bacterial genome, and multiply its length by a factor of 4x10^4.

But now we’ve made the genome 40,000 times larger, which means we have 40,000 times more genome that also needs to be able to mutate by targeted mutagenesis, and we’re caught in a positive feedback loop where every time we need a new mutation, we need more genome to encode the systems capable of mutating this new stretch of genome, but this new genome also needs to be able to be mutated, so we need … more genome to encode the systems that can mutate this new stretch of genome, ad infinitum.

Even if you imagine there’s a clever system of compression that can significantly reduce the amount of extra space required, say down to a factor of 100 instead of 40,000, you’re still caught with this problem that any time you want to add a system to make new potentially required mutations, you’ll need more genome to encode it, which itself needs to be able to be mutated by this supercomputer system that is somehow miraculously small and unimaginably powerful.
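To put numbers on the regress described above, here is a minimal sketch. It assumes, purely for illustration, that each round of added targeting machinery multiplies the genome length by the same factor; that repeated-multiplication model and the function name are my own simplification, not something established in the thread.

```python
GENOME_BP = 3_000_000  # the 3 megabase bacterial genome used in the example


def genome_after_rounds(base_bp: int, factor: int, rounds: int) -> int:
    """Genome length after `rounds` rounds in which the newly added
    machinery must itself be made mutable, multiplying the length
    by `factor` each time (an illustrative assumption)."""
    return base_bp * factor**rounds


# Compare the original factor with the hypothetical "compressed" one.
for factor in (40_000, 100):
    sizes = [genome_after_rounds(GENOME_BP, factor, r) for r in range(4)]
    print(f"factor {factor}: {sizes}")
```

Even with the generous factor of 100, the length grows without bound under this model, which is the point of the argument: compression slows the regress but does not end it.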

Sorry, it just isn’t possible. You haven’t even begun to consider what you are proposing.

I’m in agreement with @Rumraket here: “Sorry, it just isn’t possible. You haven’t even begun to consider what you are proposing.”

@Rumraket, I’m getting to your replies later. Busy day around the house today.

It was meant as a response to you. It seems to me it’s you who is proposing that a teleological system such as I described exists, and I took Art to be implying that the system that actually exists is the blind evolutionary process where random (unpredicted, ateleological) mutations are filtered by natural selection.

It doesn’t have foresight, it can’t compute how proteins will fold (they fold, and nothing knows or determines beforehand whether it will be beneficial) or determine what is required for adaptation, so when something advantageous evolves it’s usually because there were lots of misses.

It helps to read for understanding, not merely scan for things to disagree with.

Of course, if I’ve misunderstood your point then I’ve made a mistake.

My point was that even though one could in principle imagine such a cellular computer, it would be problematic (for the very reasons mentioned by you). A better solution would be the one we observe, where lots of cells experiment with solving a problem (using mutations) and where successful attempts spread by “cross-talk” between cells (through horizontal genetic transfer and exchange of plasmids). In other words, let life itself be the “computer”.

Everyone can make mistakes. But I wonder if you would have written such a vigorous critique if you’d known you were arguing the same position as someone positive towards ID? Aid and comfort to the enemy and all that.

No, I’m well aware of the extensive evolutionary pre-LUCA history required by the ateleological view. As I wrote: “every organism with structure X is descended from an organism without that structure, all the way back to the first replicator.”

Of course, the teleological view doesn’t need to posit a hypothetical history of increasingly simple lifeforms. The teleological view is free to consider the first lifeforms as consisting of cells of comparable complexity to bacteria and archaea found today, including SOS response systems.

This is where the issue of risky predictions discussed by @T_aquaticus and me comes into play. We’ve barely scratched the surface in terms of discovering microbes, and there are about one trillion undiscovered microbial species out there. In other words, lots of opportunities for finding those simple cellular ancestors, from before the SOS response system arose. If the SOS response system consists of an originally designed core (as I suspect it does), I’m betting we won’t find these.

I haven’t looked at the archaeal SOS response system. But for unrelated reasons I tentatively suggest that the original population of engineered cells consisted of both representatives of archaea and bacteria, and that the two domains therefore do not share a common ancestor. That would explain the lack of homology between the core components of the DNA replication systems (Leipe et al., 1999) and distinct membrane chemistries and lack of homology between the enzymes of lipid biosynthesis in archaea and bacteria (Koonin, 2009).

Then we are in agreement here.

Everyone can make mistakes. But I wonder if you would have written such a vigorous critique if you’d known you were arguing the same position as someone positive towards ID? Aid and comfort to the enemy and all that.

I’m not afraid of calling out what I see as bad arguments on “my side” of things and to spend some time explaining why I think they’re bad, if that’s what you’re asking.

An evolutionary history for which there is actually substantial evidence, I might add.

Of course, the teleological view doesn’t need to posit a hypothetical history of increasingly simple lifeforms. The teleological view is free to consider the first lifeforms as consisting of cells of comparable complexity to bacteria and archaea found today, including SOS response systems.

And so much the worse for it as a predictive theory, it seems to be compatible with life starting at any imaginable level of complexity.

This is where the issue of risky predictions discussed by @T_aquaticus and me comes into play. We’ve barely scratched the surface in terms of discovering microbes, and there are about one trillion undiscovered microbial species out there.

Well we don’t know that there’s actually a trillion, that’s an unsubstantiated hypothetical estimation which you now seem to be proposing as if it was an established fact.

Though I would certainly agree there are still many undiscovered prokaryotic species out there, and finding and analyzing them is sure to improve our understanding of the early evolutionary history of life.

In other words, lots of opportunities for finding those simple cellular ancestors, from before the SOS response system arose.

That supposes they would have survived to the present day. Presumably DNA repair is advantageous to any organism that can suffer damage to its genetic material (that would be all of them), which is why it evolved (or, heck, was designed) in the first place. That makes it rather difficult to imagine that a species of prokaryotes has survived for ~3800 million years to the present without any such system.

If the SOS response system consists of an originally designed core (as I suspect it does), I’m betting we won’t find these.

Though problematically that is not a prediction that actually distinguishes it from common descent of extant life from an ancestor that had such a system.

Right, but that doesn’t explain why the core components of the translation system are homologous (and a host of other widely though not universally distributed genes are too), which implies they really do share common descent. By implication their shared ancestor might not even have had a DNA genome but RNA instead (a subject still of much contention though), which could even be taken as actual evidence for a simpler stage of cellular life before many of the systems we see distributed in extant cells had evolved.

It seems rather odd to jump to independent ancestry just because not ALL genes are universally shared even though some are.

Let’s step back and consider just what I am thinking of.

First, the backdrop: regulated mutagenesis as we understand it induces mutations in hundreds or thousands of genes, and these may include a gene in which changes will be adaptive for a given stress or challenge. The challenge for the hypothetical designer of the cells that seeded life on earth is to target just those genes that will be adaptive for a given challenge, and avoid all the collateral damage.

I would argue that all of the machinery that is needed to accomplish this exists in cells today (and thus should have been in the cells that seeded life on earth). Certainly, cells can sense stresses and call for a host of downstream responses. Also, cells have several ways to target and alter, with single-base resolution, specific nucleic acid targets (RNA editing and CRISPR/Cas come immediately to mind). So, there already exists a foundation whereby one might design a system wherein specific challenges lead to alteration of specific genes (without all the collateral damage). One need only link the expression of the appropriate guide RNAs with the challenges posed to the cell.

Of course, we must grant our designer a bit of skill here, in that it will need to be able to anticipate which families of enzymes should be targeted to respond to a given challenge. But that shouldn’t be beyond the abilities of an entity that can seed life at the outset. Heck, we are talking about a handful of basic chemical reactions and enzyme types, certainly not an insurmountable accounting or anticipatory challenge. It should also be possible (perhaps even trivial) to build in a degree of evolvability so that life can incorporate new mutational schemes and explore/expand the reach of the original blueprint, merely by modifying the complement of guide RNAs.

None of this sounds particularly impossible. And, except for bringing some new enzymatic functionality to the CRISPR/Cas toolkit, my scheme is “off the shelf” as it were, and clearly doesn’t tax cells in an insurmountable fashion.

It’s an estimation, carried out by microbiology diversity researchers, and published in the peer-reviewed literature. As all other estimations, it is open to further revision, but even if the number is off by a factor of, say, three, it doesn’t change my point that we’ve barely started scratching the surface of microbial diversity.

A minor point, but I wouldn’t consider the SOS response a mechanism of DNA repair, but of regulated mutagenesis.

More to the point, why is it difficult to imagine that an organism originally lacking an SOS response would survive to the present day? There are organisms today that have lost the SOS response. Are you saying that if an organism that was ancestrally lacking an SOS response was discovered, evolutionary biology would be in trouble?

The common descent view makes no predictions regarding the current existence of organisms ancestrally lacking an SOS response. If such an organism was found to exist, the explanation would be that it occupied an ecological niche in which the SOS response was not required, just as is the case for extant organisms in which the lack of an SOS response is a derived trait. And if no such organisms are found to exist, the explanation would be that they had gone extinct or had failed to be discovered.

Another minor quibble; the core components of the translation system are similar. Whether that similarity is due to common descent (homology) is the issue in question.

The teleological view would explain the similarity of those translation system components in terms of function, not history. Either their function is so tightly constrained that it can only be carried out by a specifically constructed set of parts, or the requirements of the archaeal vs. the bacterial cell architecture are such that the design can be re-used. An interesting avenue of research under a teleological paradigm.

Mind you, we aren’t talking about a random set of genes. We’re talking about genes involved in DNA replication and cell membrane synthesis, two of the most fundamental processes of life (replication and seclusion).

That very much depends on your perspective. Where you see evidence of a simpler stage of cellular life, I see attempted reconstructions of LUCA that result in an organism defined by negations (no DNA replication, no cell membrane) that requires one to abandon cell biology.

Interesting system, Art. Lots of hypotheticals, but I suppose that in principle a system like that can be imagined.

Of course, restricting mutagenesis to specific genes would decrease the risk of damaging mutations, but it would also decrease the chance that a solution is found in an unexpected part of the organism’s genome. If “evolution is cleverer than you are,” as Leslie Orgel is said to have remarked, then maybe that is because it hasn’t been restrained in its experiments.

Nevertheless you argued as if it was fact, and the mere fact that it is published doesn’t make it that. I’ve already agreed with the overall point that there’s more microbial diversity out there than has so far been characterized, so it’s a minor point.

It is both. The act of repairing damaged DNA (such as double strand breaks) often has the effect of causing mutations where the DNA is being repaired. This is just one aspect of the full SOS response. To find an organism without the full SOS response is not the same as finding an organism without DNA repair.

I didn’t say SOS response, I said DNA repair.

An organism that was ancestrally lacking an SOS response? No, that would rather seem to confirm that the SOS response was not originally present but evolved later.

The only thing to take away from this is that the presence or absence of the SOS response is not evidence for or against common descent.

But your supposed teleological hypothesis seems to be in no better position here, as it doesn’t seem to actually predict anything that sets it apart from common descent. If your proposed alternative is supposed to predict something that would make us able to observationally distinguish between common descent and whatever it is you call your alternative, then it needs to actually posit something that would be incompatible with, or highly surprising on, common descent.

That’s a question you are posing, but the answer is actually known. They are not merely similar, but exhibit significant degrees of nested hierarchical structure, and individual components of the translation system when subjected to phylogenetic analysis still overwhelmingly corroborate a common branching topology. As in, the phylogenies you get from 16S rRNA are very similar to the trees you get from 23S rRNA, and to the trees we get from tRNA, from aminoacyl-tRNA synthetases, from individual ribosomal proteins, and so on.

There’s no functional reason why independently inferred phylogenetic trees derived from translation system components should converge on a similar topology of branching. Functional constraint would explain why they are merely similar in sequence of encoding or in structure, but it would not explain the fact that different components yield so similar trees.
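To make "converge on a similar topology" concrete, here is a minimal sketch of one way to compare branching patterns: count the clades (groupings of taxa) that appear in one rooted tree but not the other, a simplified rooted variant of the Robinson-Foulds idea. The trees and the taxon labels A to D are made-up toy examples, not real data, and the function names are my own.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree written as
    nested tuples of leaf names, e.g. (("A", "B"), "C")."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the group of all leaves under this node
    return leaves, found


def clade_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees.
    Zero means the two trees have the same branching topology."""
    return len(clades(tree_a)[1] ^ clades(tree_b)[1])


# Hypothetical toy trees standing in for trees built from different genes.
tree_gene1 = ((("A", "B"), "C"), "D")
tree_gene2 = (("C", ("B", "A")), "D")  # same branching, written in another order
tree_gene3 = ((("A", "D"), "C"), "B")  # a conflicting branching

print(clade_distance(tree_gene1, tree_gene2))  # 0
print(clade_distance(tree_gene1, tree_gene3))  # 4
```

Sequence similarity alone would not guarantee a distance near zero between trees built from different components; agreement in branching order is an additional observation that needs explaining.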

You could add the translation system to that, since the proteins responsible for carrying out replication are all basically produced by the translation system, and the cell membrane lipids are biosynthesized by protein enzymes made by the translation system.

That’s like saying evidence that we were once oceanic fish over 400 million years ago requires us to abandon human or mammalian biology. Uhm, yes.

Yes, once upon a time, we weren’t human, or even mammals, but something else. I think the study of evolutionary biology is one big elucidation of this principle of life becoming something else than it used to be. Of change and replacement, loss and addition.
Though I don’t see why we should think life wasn’t actually cellular. Would a cell with RNA instead of DNA not be a cell? Or one based on a different kind of membrane material, such as simpler fatty acids or fatty alcohols?

That is exactly what I am talking about. Anything that helps a species survive is labelled afterwards as good design simply because it helped the species survive. This is ad hoc reasoning.

This would also be true for a lack of design. You need something that would distinguish design from universal common descent/evolution.

Ummm, that is wrong. Common descent explains why species share features.
