Some molecular evidence for human evolution

I can’t answer for what seems extraordinarily high to you. But no, it isn’t high. Let’s remember that there are only 4 possible states at any position. Given the rarity of transversions, it might be accurate to say that there are only two likely states.

2 Likes

I think it makes an excellent case for common ancestry, but does it make a strong case for universal common descent, or in your words “common ancestry of all living things on earth”? Many creationists would agree with a significant amount of common ancestry, I think, but disagree that it is truly universal. So do scientists have a sense of how universal it is? In other words, have there been enough tests like the ones you and @John_Harshman have done, not only across multiple sequences in the human genome compared to apes and monkeys, but across the entire “tree of life”?

2 Likes

Yes, there have, though in fact the deepest comparisons use protein sequences rather than DNA sequences. Even so, the gap between what creationists are willing to accept and what’s accessible through DNA sequences is quite large.

1 Like

According to @glipsnort’s article at 32, the sum of the transversion mutations (G-C, A-T, A-C, and G-T) should be roughly equal to the number of transition mutations (C-T or A-G). So, in your example, of the 22 cases of homoplasy, we would have expected to see about 50% transitions, not 100% as is the case. What is the explanation for this strange observation?

1 Like

Several explanations: first, transitions are more likely to be silent than transversions and, if non-silent, more likely to connect similar amino acids; second, the transition bias in mitochondrial DNA is greater than in nuclear DNA; third, you will note that most of the changes are C-T, because there’s a base composition bias making G rare; fourth, transversions are often complicated by subsequent transitions that make them no longer informative for our sample (see, for example, the second-to-last site).

7 Likes

Fifth, you need two independent transversions at the same site to get into this list, so such cases would be only a quarter as common as the transition cases, even if transversions were half as common as transitions.
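To make the arithmetic behind that last point explicit, here is a minimal sketch. It assumes the two changes arise independently in two lineages and that a single ratio (transversion rate divided by transition rate) applies to both; the function name is made up for illustration.

```python
def relative_parallel_rate(tv_to_ts_ratio: float) -> float:
    """How often two independent transversions hit the same site,
    relative to two independent transitions, given the ratio of
    transversion rate to transition rate (an assumed input)."""
    # Each lineage contributes one factor of the ratio, so the
    # parallel case scales with the square of the ratio.
    return tv_to_ts_ratio ** 2


# With transversions half as common as transitions:
print(relative_parallel_rate(0.5))  # 0.25, i.e. a quarter
```

This ignores the fact that a transversion has two possible target bases, which would make matching transversions rarer still.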

6 Likes

Very much so. As John noted, when DNA identities approach the statistically undetectable level, protein identities/similarities are still there.

I’ll add that when the protein identities/similarities poop out, there remain structural homologies. Universal common ancestry predicts that we’ll find a lot more of those in the future.

2 Likes

This is surprising to me, mostly because it seems like a lot of work (both sequencing and the comparison). Do you (or @Mercer or @NLENTS or anybody else) know of any good seminal work reviewing this (establishing universal common ancestry through DNA/protein sequences) that you could point me towards? I’d like to dig into it a little bit more. As a non-expert, sometimes it’s nice to get a peek inside some of the major results of another field.

For some ribosomal RNA sequences used to infer the deepest phylogenies, they actually do use the DNA sequences (as they would be directly complementary to the RNA), because even at these ages ribosomal RNA sequences are decently well conserved. For example, extant bacterial 23S ribosomal RNA sequences have roughly 55-60% sequence identity to extant archaeal 23S ribosomal RNA sequences.
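For anyone who wants to see what a sequence identity figure like that means in practice, here is a minimal sketch of one common way to compute it from a pair of already-aligned sequences. The function name and the toy sequences are made up for illustration, and skipping gapped columns is only one of several conventions, so numbers from different tools will not necessarily agree.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither
    sequence has a gap ('-'). Inputs must already be aligned."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy example, not real rRNA: 8 ungapped columns, 7 of them identical.
print(percent_identity("ACGU-ACGUA", "ACGAUACG-A"))  # 87.5
```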

In an attempt to show some evidence for universal common descent, and inspired by a blog post by evograd, almost a year ago I inferred a bacterial and an archaeal tree using 23S ribosomal RNA sequences. What I wanted to show is that, without using outgroup rooting (which would force the result), internal nodes in the two trees (inferred independently of each other, using midpoint rooting) still exhibit ancestral convergence. The idea is that if bacteria and archaea really did evolve from a common ancestor, then as we go deeper back in time towards the approximate root of each tree, the sequences should become more similar, both in sequence identity and in alignment scores.

That does indeed appear to be the case. I never finished aligning all my sequences. I don’t do this professionally, so I did a lot of copy-pasting into browser windows, using online tools to infer trees and get alignment scores, and I still have hundreds of alignments to do, so forgive me if I don’t finish this up anytime soon. But here is a screenshot of a preliminary result:


Alignment scores go from lower=more green, to higher=more red.

Here are the bacterial and archaeal trees with labeled nodes.

The root node is N1 in both trees. As you can see, alignment scores between nodes in each tree become progressively more red in color as we move closer to the roots of each tree. Interestingly, the particular data set I had collected seems to imply that the true root of my bacterial RNA tree lies closer to node N2 (it gets ever so slightly better alignment scores to archaeal sequences) than to the midpoint root node N1.

5 Likes

This might not be quite what you’re looking for, but you could start with it:

Theobald D.L. A formal test of the theory of universal common ancestry. Nature 2010; 465:219-223.

2 Likes

Jordan, while some of the other responders have covered this, let me also chime in and say that part of the logic of universal ancestry is not that any one comparison or experiment can show it all at once (although there are efforts to do that), but that all kinds of overlapping ones thread together to cover the tree of life. One challenge is one of resolution. Some genes are excellent for inferring distant relationships (like genes for 16S rRNA, COX1, ITS1/2) but poor at resolving closer relationships because the sequences will be 100% identical or nearly so. Other genes are great for inferring close ancestry (because they are relatively new and evolving more rapidly) but useless for more distant relationships (because they are only found in certain lineages). There are literally thousands of studies with hundreds of genes (and protein sequences) and you can think of these as generating overlapping “contigs” of relationships that can then be pieced together like a giant puzzle.

But the evidence for universal ancestry can also be found in the remarkable similarity of the construction of cells in ways that are totally arbitrary and could have turned out any number of ways. The simplest example of this is the genetic code itself, the code by which DNA triplets encode amino acids. There is no structural reason that the code had to work out any specific way. It’s essentially a random code and yet, nearly all living things use the exact same code in the exact same ways, with 64 possible codons coding for 21 possible outputs: 20 amino acids and a stop signal. (The exceptions are extremely rare and limited to a codon or two.) Here are some other incredible commonalities across all living cells:
1.) Identity, structure, and function of phospholipids
2.) Universal use of basic energy currency: ATP, pyruvate, glucose, Acetyl-CoA
3.) Ribosome structure (two subunits, large and small, made of very similar proteins and rRNA)
4.) DNA and RNA are structurally and functionally conserved in all cells
5.) Enzymes like ATP synthase, tRNA activating enzymes, those involved in electron transport chain, glycolysis, tRNA synthesis and activation, chaperonins…

These could all have ended up any possible way and there are plenty of theoretical ways that these could have been designed “better” by rational design. That all cells share the basic machinery of life, with all its limitations and shortcomings (carbon fixation by photosynthesis is especially inefficient), is the most powerful evidence of common ancestry to me. But I’m a cell/molecular biologist, so that’s what speaks to me. :slight_smile:
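For readers who want to check the codon arithmetic mentioned above (64 codons, 21 outputs), here is a minimal sketch that builds the standard genetic code from its usual compact listing, with bases in T, C, A, G order and `*` marking a stop. The variable names are made up for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

print(len(CODON_TABLE))                # 64 codons
print(len(set(CODON_TABLE.values())))  # 21 outputs: 20 amino acids + stop
print(CODON_TABLE["ATG"])              # M (methionine, the usual start)
```

The mapping itself is what is shared almost universally; a different assignment of the same 64 codons to the same 21 outputs would work in principle.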

4 Likes

This reminds me of some of the challenges in radioisotope dating. Some radionuclides are good for very old samples and some (like C-14) for very new ones. This has led to misinterpretation by non-experts at times (looking at much of the YEC literature). I imagine there are analogous issues in phylogenetics where you’re either too identical or not enough for some folks.

1 Like

@Jordan, I second this. There’s no reason to expect to find a single paper on this subject.

If you’d like to try some of this for yourself, looking at myosins offers a two-fer: the head (motor) domains show deep homologies, while the tails (oligomerization and/or cargo binding) vary much more, limited to eukaryotes of course.

2 Likes

Yep, exactly. I think radiometry is a good analogy.

I’d not heard of this before, but I’d love to explore it as a possible classroom activity. Do you have any resources on this? I’m also a fan of DIY projects with this stuff (and even having the students DIY if there is time), but if you have something as a launching point, that would be most appreciated!

2 Likes

I’d suggest it as DIY for both you and the students.

You can locate the head/tail boundary easily by aligning the protein sequences of two myosins of different classes, like a muscle myosin (II) and a V (my speciality). The head is the N-terminal ~900 residues for most. Some have N-terminal doodads that will change that.

Using head domains goes very deep and has been used as a tool:

Richards, T. A. & Cavalier-Smith, T. Myosin domain evolution and the primary divergence of eukaryotes. Nature 436(7054):1113-8

Then, if you want to go into the tails of a particular group, I’d suggest my beloved myosin-V family. It has 3 members in vertebrate genomes.

If you need some help, maybe we should go through the alignments in public on a new thread and others can play along…

3 Likes

Another good analogy would be telescopes. Some telescopes are great for focusing in on single galaxies, but there is still a lot of information that we can gather from telescopes that have a large field of view but lower resolution. Different tools have different resolution, and the pictures they gather are used to put the larger collage together. No single paper using a single telescope is going to give us all the answers.

Because I am a science geek, the Large Synoptic Survey Telescope and the James Webb telescope are the two upcoming telescopes to get excited about. They also fit neatly into the wide field of view vs. high resolution analogy.

If my understanding is correct, this is why mitochondrial DNA is so helpful for comparing closely related species. Its rate of divergence is much higher, which allows for greater resolution over shorter time periods. What you are looking for is DNA that will accumulate a useful number of mutations over the time period you are interested in: not too many and not too few.
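The "not too high and not too low" idea can be sketched with the Jukes-Cantor model, under which the expected fraction of differing sites saturates at 75% as substitutions pile up. This is only an illustration: the two rates below are invented round numbers, not measured values for any real marker, and the function name is made up.

```python
import math


def expected_p_distance(rate_per_site_per_myr: float, divergence_myr: float) -> float:
    """Expected fraction of differing sites between two lineages
    under the Jukes-Cantor model (equal rates for all changes)."""
    # Substitutions per site summed over both diverging lineages.
    d = 2.0 * rate_per_site_per_myr * divergence_myr
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


# Hypothetical rates, chosen only to contrast a fast and a slow marker.
FAST_MARKER = 0.01
SLOW_MARKER = 0.0005

for t in (1, 10, 100, 1000):
    fast = expected_p_distance(FAST_MARKER, t)
    slow = expected_p_distance(SLOW_MARKER, t)
    print(f"{t:>5} Myr  fast: {fast:.3f}  slow: {slow:.3f}")
```

With these made-up numbers the fast marker separates recent splits well but flattens out near 0.75 at deep times, while the slow marker barely changes at first and only becomes informative later.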

Since you’re a chemist, I strongly recommend Nick Lane’s book:
https://www.amazon.com/Vital-Question-Evolution-Origins-Complex-dp-0393088812/dp/0393088812/ref=mt_hardcover?_encoding=UTF8&me=&qid=

It’s written from a metabolic POV, so it’s not only a relief from nucleic acids, it may be a better way to think about life’s origins.

2 Likes

I think I resemble that remark :roll_eyes:.

1 Like

Let’s start with your 694-nucleotide sequence within the last common ancestor of humans and gibbons. And let’s say that 4% of the positions within this sequence have changed by drift during the journey leading to humans. Now, what is the probability P that a given mutation that has occurred along the branch leading to humans also occurred during the journey leading to gibbons? As a rough estimate, can’t we say that P is more or less equal to 0.04^2, i.e., 0.16%?
According to the above reasoning, we would expect no more than 1 or 2 cases of homoplasy in your example. But we observe 22 of them.

There are many problems with your calculation, the greatest being your assumption that all sites are equally free to vary. Given that many sites are fixed, that 4% divergence will be concentrated into a much smaller piece of those 694 nucleotides.
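Here is a rough sketch of how much that one assumption matters. It keeps the simple model from the question (4% change on each branch, changes independent between branches, one likely alternative state per site) but lets only a fraction of sites vary. The fractions tried are illustrative guesses, not estimates from the data, and the function name is made up.

```python
def expected_parallel_changes(n_sites: int, divergence: float, variable_fraction: float) -> float:
    """Expected number of sites that change on both branches when the
    same overall divergence is packed into a subset of variable sites."""
    variable_sites = n_sites * variable_fraction
    # Chance that a given variable site changed along one branch.
    per_site = divergence / variable_fraction
    if per_site > 1.0:
        raise ValueError("divergence cannot exceed the variable fraction")
    # Independent branches: both change with probability per_site squared.
    return variable_sites * per_site ** 2


for fraction in (1.0, 0.5, 0.2, 0.1):
    expected = expected_parallel_changes(694, 0.04, fraction)
    print(f"variable fraction {fraction:.1f}: about {expected:.1f} parallel changes")
```

With every site free to vary this gives about 1.1, matching the estimate in the question; with a tenth of the sites free to vary it gives about 11. The expectation scales as one over the variable fraction, so concentrating the changes raises the expected homoplasy quickly.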

3 Likes