Genetic evidence for common ancestry (split-off from "Dating the Noachian Deluge")

@thoughtful asked me in the “Dating the Noachian Deluge” thread to share the genetic evidence which convinced me about common ancestry, so I’ll summarize it here for her. There are many converging lines of genetic evidence for common ancestry, but I’ll just include the three that convinced me the most.

  1. Ancestral sequence convergence

Using phylogenetic techniques, it’s possible to determine the ancestral sequence of a group of organisms with high accuracy. The accuracy of ancestral sequence reconstruction (ASR) techniques has been confirmed by many studies.

For example, an early study used ASR to determine the ancestral sequence of 111 modern HIV sequences, and the resulting predicted sequence was found to be very close to a preserved 1958 HIV sequence from Africa (Zhu et al. 1998). More recently, another study induced rapid mutagenesis in a red fluorescent protein to produce 19 different strains. They then predicted the ancestral sequence using five different ASR techniques, with accuracies between 97.88 and 98.17% (Randall et al. 2016). So we know that ASR is very accurate.

Now, using ASR, it’s possible to test a strong prediction of common ancestry (CA). Since CA states that two given groups of organisms share common ancestry, it predicts that the ancestral sequences for those groups of organisms should be closer to one another than the extant (modern) sequences of such organisms. In contrast, the separate ancestry (SA) hypothesis states that such groups of organisms began at different points and diversified from there.

(Taken from Testing a Strong Prediction of Universal Common Ancestry – EvoGrad)

This prediction was finally tested in 2013 by White, Zhong, and Penny. Using a statistical test involving ASR, they calculated the probability (p-value) for SA for sixteen different clades. Here are their results:

Table 2 from White, Zhong, and Penny (2013). P-values are in column “p (Χ^2)”.

As you should be able to see in the above table, they found ancestral sequence convergence at progressively deeper levels on the “tree of life.” The absolute maximum p-value was 1.05E-6 (for Vertebrata+Urochordata and Echinoderms+Hemichordates), which is just infinitesimal. In statistics, usually a p-value of 0.01 or less is enough to reject the null hypothesis with high confidence – all of the p-values for the null hypothesis (SA) in this study were orders of magnitude lower than 0.01. So this shows, with extremely high confidence, that (at least) all plants and all animals, respectively, share common ancestry.

This evidence for common ancestry isn’t susceptible to any of the usual creationist objections. For example, ancestral sequence convergence can’t have been due to a “common designer” – unless that designer was being duplicitous. And there’s no bias in the statistical tests; they didn’t presuppose common ancestry between the groups tested, nor did they even presuppose common ancestry within the groups.

The only possible way this could have happened on creationism is by chance. But the chance that none of the groups tested are related is 2.59E-132, basically impossible. It can really only have been due to common ancestry.

  2. Shared endogenous retroviruses

When an organism is infected by an RNA retrovirus, the virus inserts its genome into that organism’s cells, which basically turns their cells into a virus factory that allows the retrovirus to reproduce. If this happens in the germline of an organism, then the genetic material from the virus becomes part of that organism’s offspring, and is passed on from generation to generation. The genetic material is now known as an endogenous retrovirus (ERV).

This is where common ancestry comes in. According to CA (at least, human CA) humans should share at least some of our ERVs with other great apes, since it’s very unlikely that every ERV in the human genome came after the human-chimp split. In contrast, SA predicts that there should be no ERVs shared between humans and other great apes, since it’s extremely unlikely for a retrovirus to insert in exactly the same place in two different lineages.

This prediction was tested in 2018 by Grandi et al. According to their research, out of 211 human ERVs that were examined, there are 205 that chimps have in the exact same location. Likewise, we share 207 ERVs with gorillas, 205 with orangutans, 190 with gibbons, and 131 with rhesus monkeys.

Now, based on a study by Wang et al. (2007), it’s been calculated that there are about 10 million different locations for retroviruses to insert in the human genome (see here). Using the binomial distribution, it’s possible to calculate the chance that we share 205 out of 211 ERVs with chimps by random chance (as would be necessary if SA is true) – this chance is 8.954E-1424, or in other words, 1 in 112 followed by 1421 zeroes.
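As a rough check on the order of magnitude, the binomial calculation can be sketched in log space, since the probability is far too small for ordinary floating point. This is only a minimal sketch: it assumes exactly 10 million equally likely insertion sites and fully independent insertions, and the function name is just illustrative. The exact figure quoted above depends on the precise site count that was used, so only the order of magnitude should be compared.

```python
import math

def log10_binomial_pmf(k, n, p):
    """Return log10 of the binomial probability of exactly k successes in n trials."""
    log10_comb = math.log10(math.comb(n, k))  # exact integer coefficient, then its log
    return log10_comb + k * math.log10(p) + (n - k) * math.log10(1.0 - p)

# Assumption: 10 million equally likely insertion sites, so each independent
# insertion lands on a given site with probability 1e-7.
SITES = 10_000_000
log10_p = log10_binomial_pmf(k=205, n=211, p=1.0 / SITES)
print(f"log10(P) is about {log10_p:.2f}")
```

Under these assumptions the result is on the order of 10^-1424, in line with the figure above.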

Obviously, this chance is so infinitesimal it renders separate ancestry for humans and chimps effectively impossible. For this reason, creationists have chosen to argue that ERVs weren’t actually inserted by retroviruses, but were created in place as functional elements of the genome.

However, this is also impossible unless God was intentionally being duplicitous. There is actually a huge body of evidence showing that ERVs are the result of retrovirus insertion. For one, there are genetic ‘scars’ surrounding ERVs, consisting of “Long Terminal Repeats” and small DNA duplications on either side of the ERV (e.g., Mamedov et al. 2004), which are only produced by the enzymes used in retrovirus insertion. Furthermore, we have actually observed the process of endogenization (e.g., Crittenden et al. 1989), and there are many ERVs which are currently being fixed in human (Belshaw et al. 2005), chicken (Lee et al. 2017), and koala genomes (Stoye 2006). In light of all of this evidence, it’s clear that ERVs must have been produced by retroviruses.

Some creationists have argued that since ERVs are functional and contain ‘information,’ they must have been created. But that’s not true. First of all, only a few ERVs have function, and we still share many non-functional ERVs with chimps.

Second, the type of ‘function’ that creationists point to is gene regulatory function. But one recent study showed that no less than 1 in 10 random DNA sequences 100 nucleotides in length can act as gene regulators, with no modification (Yona et al. 2018). So this type of function evidently does not require a creator, it can appear totally randomly. In light of this, it’s not at all surprising that some ERVs were co-opted as gene regulators.

Third, we’ve actually observed an ERV that we know to have been inserted by a retrovirus confer function to its host. There are a few ERVs which can confer a special “dilute coat color” allele to a mouse when they insert in the correct location, and this has been known for decades (Jenkins et al. 1981; Hutchison et al. 1984; Tanave and Koide 2020). Creating an entirely new coat color is certainly a biochemical function, especially by creationist definitions of “function” (by which some 80% of the genome is “functional”). So this definitively demonstrates that ERVs can have a function without being created in place.

In summary, we share 205 out of 211 ERVs with chimps, and >130 out of 211 with even distantly related primates like rhesus monkeys. The probability of this occurring if we don’t share common ancestry is infinitesimal. All of the evidence shows that ERVs are the result of viral insertion, even those which confer function to their host, so God could not have created them in place unless He was being duplicitous.

  3. Shared silent mutations

As you probably already know, DNA codes for proteins. Specifically, each amino acid is coded for by different three-nucleotide DNA codons. However, there is functional redundancy in the DNA code, so the same amino acid can be coded for by different codons – for example, the amino acid serine is coded for by the UCU, UCC, UCA, and UCG codons. Because of this, there are some sites in the genome where single-nucleotide mutations do not affect the output protein. Such mutations are effectively neutral, and the sites where they occur are referred to as “silent sites.”
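To make the redundancy concrete, here is a small sketch using a handful of entries from the standard genetic code (written as RNA codons, as above). The table is deliberately partial and the helper name is hypothetical; it only illustrates that some third-position changes leave the amino acid unchanged while others do not.

```python
# Partial table of the standard genetic code (RNA codons); only the
# entries needed for this illustration are included.
CODON_TABLE = {
    "UCU": "Ser", "UCC": "Ser", "UCA": "Ser", "UCG": "Ser",
    "UGU": "Cys", "UGC": "Cys",
    "UGG": "Trp",
    "UGA": "Stop",
}

def is_silent(codon_before, codon_after):
    """True if the change leaves the encoded amino acid unchanged."""
    return CODON_TABLE[codon_before] == CODON_TABLE[codon_after]

print(is_silent("UCU", "UCG"))  # third-position change, still serine
print(is_silent("UGU", "UGG"))  # third-position change, cysteine becomes tryptophan
```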

Now, CA (human CA, that is) predicts that such silent mutations should consistently produce the same nested hierarchical pattern between humans and other great apes, since we share common ancestry. In contrast, SA predicts that they would not produce a consistent nested hierarchy, because a “common designer” would only create genetic differences where they make a functional difference. This is true according to creationists themselves: AiG tells us that “the Creator God… use[d] similar design plans for his creatures when best suited for particular functions.”

This prediction was tested in 2016 by Bontrager et al. What they found is that silent sites do indeed produce a consistent phylogenetic signal between humans and great apes. Statistically, the p-value for the null hypothesis of separate ancestry was between 3E-9 and 1E-12, based on the phylogenetic tree agreement. Again, in statistics, usually a p-value of 0.01 or lower is enough to reject the null hypothesis, and this is many orders of magnitude lower than that. Therefore, we can (yet again) confidently reject separate ancestry for humans and other great apes, and conclude that we do share common ancestry with all the other primates.

I suppose the only ‘out’ for a creationist is to suppose that silent mutations do make a functional difference. One recent study suggests that silent mutations in yeast are strongly deleterious (Shen et al. 2022), and creationists have subsequently jumped on this bandwagon (Carter 2022).

However, as other authors have pointed out, this would overturn decades of evidence to the contrary, and there are significant methodological problems with Shen et al.'s study (Kruglyak et al. 2022). It seems safe to say that silent mutations do not have a fitness effect, or at least that if they do, it’s certainly not enough to explain the extremely strong phylogenetic signal of silent sites between humans and other primates.

That was a lot, but it doesn’t even scratch the surface of all of the evidence that I considered before realizing that universal common ancestry is true. Those are just the three things that convinced me the most. Those are also the only quantitative tests of common ancestry that have been done (that I know of), and I’m more of a numbers guy, so I guess that explains why I found them convincing.

Hopefully that helped you understand why I concluded that common ancestry is true, @thoughtful!


Hi Andrew
First of all I appreciate the thoughtful argument.

I was convinced by this argument in the past, but what is left open is how a retrovirus gets inserted in a specific location in an animal and eventually becomes fixed in that population. Let’s say retrovirus x is fixed in the same location in humans and chimps. If we traced that back to the common ancestor, that retrovirus had to get fixed in that population in the same location. This in itself is an unlikely event if the insertion location is random. Now multiply the probability of that event by all the common retroviruses.

Are these sequences really retroviruses?

I see you made an argument on this but there seems to be more detail on how this story is feasible as you try and trace how the events happened.

Why is that an unlikely event? If a common ancestor between all humans and chimps had a retrovirus insert, then by default it would be in the same location in all humans and chimps. That’s not unlikely at all.

Yep, they are, for the reasons I discussed above. The existence of long terminal repeats and terminal site duplications – which are only produced by the enzymes used in retrovirus insertion (reverse transcriptase and integrase) – shows that ERVs are the result of retrovirus insertion. If God created them in place, He was being intentionally duplicitous by including these LTRs and TSDs.


Hi Andrew
The insertion itself is not unlikely in an individual, but the insertion in a specific location would need fixation in the population to have two isolated (humans and chimps) populations pass this down as a fixed mutation. According to neutral theory the rate of mutation is equal to the rate of fixation. The problem is these retroviruses in a specific location being repeated is a rare event (infrequent mutation) given the insertions are random. You now have to show this happened multiple times.

For argument’s sake I will agree with this claim pending resolving the problem of specific location retroviruses being fixed in a population based on random insertion.

Ah, I think I see your confusion. No, the ERV wouldn’t have to insert in the same place multiple times. Since these 205 shared ERVs would have existed in the common ancestor of humans and chimps, it wouldn’t need to occur in all humans and chimps, it would simply be passed down to all of them.

I may be misunderstanding your question though.

It’s not as though these ERVs have to be fixed in the population rapidly, either. There are millions of years during which they can be fixed. We share 131 ERVs with rhesus monkeys, and 205 with orangutans, so our lineage would have only needed to accumulate about 74 ERVs between the Cercopithecoidea/Hominoidea split (ca. 30 Ma) and the Pongidae/Hominidae split (ca. 14 Ma) – in other words, about one ERV fixation per 200,000 years.
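The arithmetic behind that rate can be spelled out as a quick sketch. The counts and split dates are the approximate ones given above; the variable names are only for illustration.

```python
shared_with_orangutan = 205   # ERVs shared with orangutans
shared_with_rhesus = 131      # ERVs shared with rhesus monkeys
older_split_years_ago = 30_000_000    # approximate date of the older split
younger_split_years_ago = 14_000_000  # approximate date of the younger split

# ERVs that must have been fixed between the two splits
new_ervs = shared_with_orangutan - shared_with_rhesus
interval_years = older_split_years_ago - younger_split_years_ago

years_per_fixation = interval_years / new_ervs
print(f"{new_ervs} ERVs over {interval_years:,} years")
print(f"about one fixation every {years_per_fixation:,.0f} years")
```

This comes out to a little over 200,000 years per fixation, matching the rounded figure above.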

This is a very plausible rate of fixation, seeing as there are at least 27 ERVs which are insertionally polymorphic in humans even today (Marchi et al. 2014), and probably many more which have not yet been discovered. So 74 ERVs becoming fixed over a period of 16 million years isn’t at all implausible; in fact, it seems to be a rather slow rate of fixation compared to what’s occurring in humans today.


I think you see only some of Bill’s confusion. Bill seems to think that fixation occurs by multiple mutations throughout the population. He doesn’t understand the neutralist mantra that the fixation rate is equal to the mutation rate, and he thinks it has something to do with those multiple mutations. I don’t know why he thinks that, but he does.

Bill, if you’re reading: the insertion only has to happen once, in a single individual, and spreads through the population by drift. Now of course most insertions will not spread. Many more insertions happen than ever become fixed or even reach any appreciable frequency. But some do, and those are the ones we see.


Hi Andrew
The question is how did these ERVs get fixed in the common ancestor?

What is happening is the argument is using common descent to solve the problem of random insertion in the same location being improbable but common descent as you stated assumes that they exist in the common ancestor.

How did these rare events originate? How did they spread in the population?

Have you looked at any population genetic models to validate there is enough time given how rare the individual events are?

Just a counter argument to stimulate thinking. I am impressed with how you are articulating your position. I have learned that these arguments that can be very convincing are more problematic once you dig further into them.

Hi John
We are not talking about a single rare mutation but hundreds. The Behe/Lynch models were limited to 6 rare mutations for Behe and 2 for Lynch.

They originated by retroviral insertion, a very common event resulting from retroviral infection. They spread in the population by drift, over the course of many generations.

They aren’t rare at all. The overwhelming majority of insertions are lost within a few generations, but there are enough. Remember, the number of retroviral insertions fixed in each generation equals (on average) the number that occur in the average individual.
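That last point is the standard neutral-theory result, and it can be sketched with exact fractions. This assumes an idealized diploid population of constant size with strictly neutral insertions; the rate `u` (new insertions per genome copy per generation) and the function name are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def neutral_fixation_rate(pop_size, insertions_per_copy):
    """Expected number of new neutral insertions per generation that eventually fix."""
    copies = 2 * pop_size                          # diploid: two genome copies per individual
    new_per_generation = copies * insertions_per_copy
    fixation_probability = Fraction(1, copies)     # chance one new neutral variant drifts to fixation
    return new_per_generation * fixation_probability

u = Fraction(1, 1000)  # hypothetical rate, for illustration only
for n in (100, 10_000, 1_000_000):
    # The population size cancels out: the fixation rate equals u every time.
    print(n, neutral_fixation_rate(n, u))
```

More individuals means more new insertions each generation, but each one is proportionally less likely to fix, so the two effects cancel.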

You are wrong about that. You were claiming that many insertions in the same place were necessary. You have now been informed that only one is necessary to explain any given shared insertion. Those hundreds of shared insertions happened over many millions of years, in various ancestors of various clades of primates.


Hi Andrew

I certainly don’t need convincing about common descent, but the detail you provide, especially on ERVs, is fascinating. Would you mind if I quoted some of your material elsewhere, where I’ve been exchanging views with ID proponents?

Just one insertion out of many happening, in just one individual out of many who gets infected with the virus.

This happens many times during the evolution of primates. Retroviruses come and go, groups/tribes of primates become infected and many individuals get retroviral insertions, and it just takes one out of all those that get it, to go to fixation.

In the same way all other traits spread in a population. They pass it on to their children, who grow up to have children of their own etc.

Other people who had insertions in other places of their genome had fewer children on average, so we got this particular insertion rather than another.


Alan, feel free to use it. It’s not like any of it is my work anyway, it’s just citing the work of other people.

@John_Harshman is right, you’re badly misunderstanding something about genetic drift and fixation. I’m not entirely sure what you’re misunderstanding, but you should know that only one mutation needs to occur to come to fixation via genetic drift. It doesn’t need to happen more than once to be fixed.

Also, ERV insertion by itself isn’t rare at all; as I explained earlier, there are >27 ERVs which are insertionally polymorphic in humans even today (Marchi et al. 2014). What’s rare is multiple ERVs inserting in the same place in two different individuals, but that doesn’t need to happen for it to come to fixation. So it’s not at all implausible that one ERV should become fixed per 200000 years in a hominid population of about 10000 individuals. No, I haven’t looked at any population genetics models, but it seems intuitively plausible to me.


I’m not sure that’s true. The “problems” that you’re seeing seem to come from a misunderstanding of population genetics.

No, we’re only talking about a single mutation, and one that’s not all that rare. A mutation only has to happen one time before it reaches fixation by genetic drift.


Then you should be able to cite a model that shows this is feasible. How do hundreds of retroviruses in the same position become fixed in a population? What are the parameters and what is the waiting time to fixation?

This is your central misconception. It isn’t hundreds of retroviruses in the same position. It’s one retroviral insertion. They become fixed in the same way a point mutation becomes fixed: by being inherited over many generations. Your link is irrelevant; did you put it in by accident?


There are about 100 point mutations in the germ line every generation. How many retroviruses are there in the germ line per generation?

That varies a lot, depending on whether the population is currently experiencing a retroviral infection. But let’s consider: there have been around 35 million point substitutions in the human and chimp lineages (combined) since their separation. How many retroviral insertions per generation would be required to achieve a proportional result?


How are you still this confused?


According to neutral theory about 0.2% of the common ancestors would need to have the retrovirus fixed in their germ line on average.
Would we expect the sequences of these retroviruses to be conserved?
Some retroviruses we know, like HIV, can be deleterious, so that is another issue which challenges fixation.

How are you still so gullible? 🙂

This is word salad. I have no idea where you got that notion.

More word salad. Do you have any idea what you’re talking about here?

We would not. But I’m suspecting you don’t understand what “conserved” means.