What @John_Harshman said. But I figure it actually takes a bit of explaining. The bootstrap is basically a test of tree consistency.
You have to understand something about how the phylogeny is made in the first place, and then how it’s tested for consistency. As the paper says, the tree is inferred on the basis of 54 nuclear genes from each of those 191 species. It’s important to note that those 54 nuclear genes come from different chromosomes and have wildly different functions: some come from coding regions and some from non-coding regions; introns, exons, genes that regulate other genes and control development, genes that encode enzymes, genes that take part in DNA replication, and so on. Some come from X and some from Y chromosomes. So whatever functional constraint you might imagine operates on one gene, it’s rather difficult to see how the same constraint could be operating on another, particularly in a way that should force trees inferred from each of those 54 loci to come out similar. That just doesn’t make sense.
Anyway, what that means is that for every species they collect the sequences of those 54 genes and put them literally end-to-end in one long DNA sequence (~35,000 DNA base pairs per species), and then they construct one giant alignment from those 191 x ~35,000 bp sequences. This alignment is then used to infer a tree with a phylogenetic algorithm.
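In code, that concatenation step might look something like this. This is a minimal sketch with made-up 4 bp "genes" and species names standing in for the real ~35,000 bp of data per species:

```python
# Toy per-gene alignments keyed by species name (hypothetical data).
gene1 = {"human": "ACGT", "chimp": "ACGA", "gorilla": "ACTA"}
gene2 = {"human": "TTGC", "chimp": "TTGG", "gorilla": "TAGG"}

def concatenate(*gene_alignments):
    """Join each species' genes end-to-end into one long aligned row."""
    species = gene_alignments[0].keys()
    return {sp: "".join(g[sp] for g in gene_alignments) for sp in species}

supermatrix = concatenate(gene1, gene2)
print(supermatrix["human"])  # ACGTTTGC
```

Each row of the resulting "supermatrix" is one species, each column one aligned position, which is what the tree-building algorithm actually sees.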
Here’s where the bootstrap comes in.
The bootstrap basically constructs a new alignment from the original one by randomly sampling columns from it, with replacement, until the new alignment is the same size as the original. A new tree is then inferred from this resampled alignment, and the trees are compared, noting how many nodes differ between them. This is done one hundred times: 100 new alignments, each a random sample of the original data, each yielding its own tree. Since the original data is randomly sampled, you will not get the same alignment every time, so some alignments contain more data from some genes and less from others. And since the sampling is random and there’s so much data, it’s very unlikely you get the same sample twice. So each replicate tree should have different biases.
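The resampling step itself is simple. Here is a minimal sketch in Python, using a toy 3-species, 8-column alignment (the real one is 191 species by ~35,000 columns):

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement, keeping the
    original length; some columns appear several times, others not at all."""
    n_cols = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {sp: "".join(seq[i] for i in picks) for sp, seq in alignment.items()}

# Hypothetical alignment; in the paper this would be the 54-gene supermatrix.
alignment = {"human": "ACGTTTGC", "chimp": "ACGATTGG", "gorilla": "ACTATAGG"}
replicate = bootstrap_alignment(alignment, random.Random(1))

# Same dimensions as the original, but a random multiset of its columns.
assert len(replicate["human"]) == len(alignment["human"])
```

Each replicate alignment would then be fed to the same tree-building algorithm as the original data.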
There’s this nice table in Baum & Smith’s Tree Thinking that shows what a resampled bootstrap data set is. Notice how, for example, column 35 from the original alignment was sampled 4 times for the bootstrap data set, which means whatever tree is inferred from that bootstrap alignment will be more strongly influenced by the data in column 35 than the tree from the original alignment was:
In the primate phylogeny paper the number of columns is then of course ~35,000, and there are 191 rows.
One hundred times such a resampled alignment is created, a new tree is inferred from it, and the resulting trees are compared.
For each node in the tree it is then noted how many times out of those 100 replicates that particular node is recovered. So the bootstrap value for a node is literally how often that node is found when the original data set is randomly resampled and a new tree is made from the sample. If the data from those 54 genes were highly inconsistent, that would likely show up as low bootstrap values for lots of nodes. I think a rule of thumb is that bootstrap values above 90 are considered good, which means the node is recovered in more than 90 out of 100 replicates.
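As a toy sketch of that counting step: represent each replicate tree (very simplified) as the set of clades it contains, a clade being just the set of tip names descended from a node. The trees below are invented, and there are 4 replicates instead of 100:

```python
from collections import Counter

# Four hypothetical replicate trees, each given as its set of clades.
replicates = [
    {frozenset({"human", "chimp"}), frozenset({"human", "chimp", "gorilla"})},
    {frozenset({"human", "chimp"}), frozenset({"human", "chimp", "gorilla"})},
    {frozenset({"chimp", "gorilla"}), frozenset({"human", "chimp", "gorilla"})},
    {frozenset({"human", "chimp"}), frozenset({"human", "chimp", "gorilla"})},
]

# The bootstrap value of a node is simply how many replicates recover its clade.
support = Counter(clade for tree in replicates for clade in tree)
print(support[frozenset({"human", "chimp"})])             # 3 (of 4 replicates)
print(support[frozenset({"human", "chimp", "gorilla"})])  # 4 (of 4 replicates)
```

With 100 replicates those counts are exactly the familiar bootstrap percentages written on the nodes.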
The vast majority of nodes in the tree from the molecular phylogeny of primates have bootstrap values above 90.
Again, it’s important to understand just how different the trees could be. For 191 species there are about 6.2137×10^407 possible rooted trees. Trees can disagree in an incomprehensible number of ways, so finding almost identical ones over and over again, basically no matter what genetic locus they are inferred from, is an incredible degree of consistency. What explanation is there for that, other than common descent? That is, that the sequences have been sharing the same genealogical history, and are therefore constrained by the same speciation events, having been part of the genomes of members of the same species and having evolved for similar amounts of time. Common functional constraint, or common design as creationists often invoke, simply does not explain that. What IS this supposed functional constraint operating on such wildly different DNA sequences that forces them all to give similar trees?
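That number isn't pulled out of thin air: the count of rooted, bifurcating, labelled trees on n tips is the double factorial (2n−3)!!, which is easy to check:

```python
def rooted_tree_count(n):
    """(2n-3)!! = 3 * 5 * ... * (2n-3): the number of rooted,
    bifurcating, labelled trees on n tips."""
    count = 1
    for k in range(3, 2 * n - 2, 2):
        count *= k
    return count

assert rooted_tree_count(3) == 3    # three possible rooted trees for 3 species
assert rooted_tree_count(4) == 15
print(len(str(rooted_tree_count(191))) - 1)  # 407, i.e. ~10^407 trees
```

So picking a tree at random and landing anywhere near the same topology 100 times in a row is, to put it mildly, not going to happen by chance.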
So that’s it. Did that make sense?
Edit: Btw, the number of bootstrap replicates doesn’t have to be 100; it can be many more. But creating phylogenetic trees from this much data is extremely computationally demanding, so afaik it’s usually kept to something like 1000 replicates or fewer, and the bootstrap value for a node can then be something like 852 or 978 out of 1000, or whatever.