Would you like an explanation? This whole thread seems to have died.
I suppose that does mean you want a reply, though I have to say this is the least communicative discussion group I’ve seen.
So, ANOPA (Analysis of Pattern) is a baraminological ordination method. First you create a multidimensional character space, which is nothing more than a character matrix. Then you construct a centroid for that space: the unweighted average character state (this assumes numerical characters) for each character over all taxa. This gives you the first dimension of ANOPA: the Euclidean distance between a taxon and the centroid (a0). To get the second and third dimensions you must designate an outlier with high distance from the centroid (probably what would be called an outgroup in real systematics) and draw a line in character space between the outlier and the centroid. The second dimension (t0) is the perpendicular distance between a taxon and the line, while the third dimension (d2) is the distance from the centroid along that line to the perpendicular. One can derive another parameter from the angle between the taxon, the centroid, and the origin (presumably the “all states are zero” point), and use that to generate new coordinates for each taxon in a 3D space. You use those distances to generate a plot that you then inspect (visually?) for clusters, and clusters are considered separate baramins. I think I have all that right.
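For anyone who wants to poke at this, here is a minimal sketch of the three distances as I described them above. It is my reading of the description, not the baraminologists' own code: the function name, the example matrix, and the choice to keep d2 signed are all assumptions, and it leaves out the angle-derived parameter because the description doesn't pin it down well enough.

```python
import math


def anopa_coordinates(matrix, outlier_index):
    """Return one (a0, t0, d2) tuple per taxon.

    matrix: list of taxa, each a list of numerical character states.
    outlier_index: row of the taxon designated as the outlier (assumed
    to be chosen by hand, as in the description).
    """
    n_taxa = len(matrix)
    n_chars = len(matrix[0])

    # Centroid: unweighted mean state of each character over all taxa.
    centroid = [sum(row[j] for row in matrix) / n_taxa for j in range(n_chars)]

    # Unit vector along the line from the centroid to the outlier.
    axis = [matrix[outlier_index][j] - centroid[j] for j in range(n_chars)]
    axis_len = math.sqrt(sum(x * x for x in axis))
    if axis_len == 0:
        raise ValueError("outlier coincides with the centroid")
    unit = [x / axis_len for x in axis]

    coords = []
    for row in matrix:
        v = [row[j] - centroid[j] for j in range(n_chars)]
        # a0: Euclidean distance from the taxon to the centroid.
        a0 = math.sqrt(sum(x * x for x in v))
        # d2: position of the foot of the perpendicular along the line,
        # measured from the centroid (signed here; that is an assumption).
        d2 = sum(x * u for x, u in zip(v, unit))
        # t0: perpendicular distance from the taxon to the line.
        residual = [x - d2 * u for x, u in zip(v, unit)]
        t0 = math.sqrt(sum(x * x for x in residual))
        coords.append((a0, t0, d2))
    return coords


# Hypothetical data: 4 taxa scored for 3 numerical characters.
example_matrix = [
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [5.0, 5.0, 0.0],  # the taxon designated as the outlier
]
example_coords = anopa_coordinates(example_matrix, outlier_index=3)
for a0, t0, d2 in example_coords:
    print(f"a0={a0:.3f}  t0={t0:.3f}  d2={d2:.3f}")
```

The outlier itself sits on the line, so its t0 comes out as zero and its a0 equals its d2.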
So why is this cargo cult science? Because there is no justification offered for why the various features of this method were chosen or why clusters in ANOPA plots should be expected to correspond to baramins. You just do it, you get a plot, you write it up, and it looks all very sciencey.
So, problems? Well, first, the absence of justification for the method is the most obvious. The correlation of the dimensions seems problematic too, as a0^2 = t0^2 + d2^2. There are many other questions. What do Euclidean distances actually mean here? Why are simple distances relevant to baramins? What is the meaning of the centroid? The outlier? Does this method actually cluster taxa that are close in character space? What do the clusters, if any, actually mean? What would we expect from taxa in the same baramin? From taxa in different baramins? Why? And so on.
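To spell out why the three dimensions are tied together: this is just the Pythagorean theorem applied to the description of the method. The symbols x (a taxon), c (the centroid), and o (the outlier) are my own notation, not anything from the ANOPA literature.

```latex
\[
\begin{aligned}
v &= x - c, \qquad \hat{u} = \frac{o - c}{\lVert o - c \rVert}, \\
a_0 &= \lVert v \rVert, \qquad d_2 = v \cdot \hat{u}, \qquad
t_0 = \lVert v - d_2 \hat{u} \rVert, \\
a_0^2 &= \lVert d_2 \hat{u} + (v - d_2 \hat{u}) \rVert^2
       = d_2^2 + t_0^2 + 2 d_2 \, \hat{u} \cdot (v - d_2 \hat{u})
       = d_2^2 + t_0^2 .
\end{aligned}
\]
```

The cross term vanishes because û · (v − d2 û) = v · û − d2 = 0. So any two of the three values fix the third (up to the sign of d2), and a plot of a0, t0, and d2 has at most two independent dimensions.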
I would also think that you could form a baramin for almost any group you wanted depending on the characteristics and the outgroup. For example, it would seem trivially easy to get clustering for bears, humans, and bats into a baramin with fish as your outgroup.
Just to remind you, I am a professor with a full time job, active research, and no staff. I’m happy to host everyone here but please do what you can to move important things forward, and police yourselves according to our shared goals.
Yes, I suppose there is a rough correlation between the distances generated by ANOPA and patristic distances, so the choice of a sufficiently distant outgroup/outlier would tend to render the ingroup tightly clustered. Then again, any deep-ish divergence within a clade would be more likely to show up as two clusters the closer the outgroup/outlier is to the root.
It does seem arbitrary, but isn’t all of cladistics subject to this problem?
This might be my cue to get all huffy. It most certainly is not. If by “cladistics” you refer to modern phylogenetics, its methods have extensive theoretical and empirical justification, and systematists are constantly examining and validating their conclusions. There is a huge literature on this subject. If you aren’t familiar with it I can probably point you in that direction.
Now, I will allow that the occasional molecular biologist just grabs a few sequences, dumps them into MEGA, and pushes a button for a neighbor-joining tree. But that’s not how actual phylogeneticists work.
Anyway, phylogenetics is by no means cargo cult science and shows none of its identifying features. ANOPA, on the other hand, shows those features prominently.
I most certainly do not refer to phylogenetics. I refer to cladistics of phenotypic data, which seems much more arbitrary.
That’s phylogenetics too. But there is still plenty of theoretical and empirical justification for methods even in morphological phylogenetics and still a huge literature on the subject. Scoring morphological characters certainly isn’t as easy as aligning sequences, and it involves more subjectivity, but it’s still done using justified procedures. And I will point out that its results usually agree with molecular phylogenies; when they don’t, that’s worth a publication all by itself. There is no reason to equate it with cargo cult science.
If I understand the situation correctly, cladistics uses monophyletic groups while baramins would be paraphyletic groups. To use an example, humans belong to the Hominidae clade which is monophyletic while baraminologists put humans and apes into two separate paraphyletic clades. Monophyletic groups are robust and empirically supported while paraphyletic groups are arbitrary and not well supported.
Is this the essence of it?
I don’t think so. It isn’t a question of monophyly vs. paraphyly. It’s a question of separate creation vs. common descent. Baraminologists try to identify separately created groups, and my point is that their methods for determining the limits of such groups are not justified even on their terms, quite aside from the question of whether such groups do exist or whether we would consider them paraphyletic. (If they existed they would not be paraphyletic; a “kind” is necessarily a clade.) As to whether baraminologists commonly designate groups that scientists would consider paraphyletic, there really are not enough examples of baraminologists designating groups to draw any sort of conclusion. Most often they pick a recognized taxon, probably a clade, and declare it to be a baramin. The main exception would be Hominidae, about which there is no consensus. And the focus is not on whether apes are a single baramin; it’s on making sure that humans are.
It seems to me that ANOPA is somewhat akin to standard tree-building methods, but instead of building a tree it makes an a priori choice to eliminate hierarchy in favor of a single layer of clusters.
This a priori choice makes no sense to me. Surely even a baraminologist would regard breeds of dogs as closer to one another than to, say, various breeds of coyotes. So there is really a nested hierarchy within a baramin.
Once such a nested hierarchy is accepted, there is no mathematical justification for setting up an impassable barrier at the baramin level. Nested hierarchy would extend to the grouping of closer baramins into larger clusters, and then of larger clusters into even larger clusters.
Eventually, this method would arrive at a tree of life. The only reason not to follow the method to its logical mathematical conclusion is an a priori commitment not to go that far.
What might my analysis be overlooking?
It’s more akin to ordination methods, in which a multidimensional space is collapsed into fewer dimensions to facilitate display and understanding. Principal components analysis is the one you may be most familiar with. You can try to turn PCA clusters into a tree, but there’s no obvious algorithm to do it objectively. Same with ANOPA. Also, PCA scores have meanings, but I’m not sure ANOPA scores do.
You’re right, I missed the part about projecting the N-dimensional space (where N = number of characters) into a 3-dimensional space. Thanks for the enlightenment, John.
Unlike the PCA approach, which seeks to minimize information loss, ANOPA seems to discard almost all of the information about any species that are not close neighbors of the outgroup. Thus it must by definition be a truly unfruitful approach to doing systematic biology.**
Am I seeing this correctly?
**However, by discarding the vast majority of relevant information, ANOPA seriously impedes any ability to construct meaningful nested hierarchies. To someone committed to supporting YEC viewpoints, as opposed to supporting the best possible science, this is no doubt a feature rather than a flaw.
All true. ANOPA is fairly easy to visualize in 3-dimensional space. I suppose PCA is too, i.e. just the three orthogonal axes of a 3-D ellipsoid. ANOPA, however, makes less sense when you visualize it. What’s the biological meaning of the centroid? What’s the biological meaning of the line from outgroup to centroid? And what’s the biological meaning of the distances from that line to the taxa or from the centroid to the perpendicular? I can’t imagine.
If the number of characters N = 50, for example, even a 3-dimensional PCA would likely discard too much information. However, you would at least be able to understand the meaning of a PCA axis by inspecting which source dimensions have the largest values along the axis.
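To make the point about reading a PCA axis concrete, here is a rough sketch that finds the first principal axis by power iteration on the covariance matrix and then looks at which character has the largest loading. The function name and the toy data are made up for illustration, and a real analysis would use a proper eigendecomposition from a numerical library rather than this hand-rolled loop.

```python
import math


def first_principal_axis(matrix, iterations=200):
    """Approximate the first PCA axis (unit vector of loadings)."""
    n = len(matrix)
    m = len(matrix[0])

    # Center each character on its mean.
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in matrix]

    # Sample covariance matrix of the characters.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(m)]
        for i in range(m)
    ]

    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize. Assumes the start vector is not orthogonal to the
    # dominant eigenvector.
    vec = [1.0] * m
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break  # no variance at all; nothing to find
        vec = [x / norm for x in nxt]
    return vec


# Hypothetical data: 5 taxa, 3 characters; character 0 varies the most.
toy_data = [
    [0.0, 1.0, 2.0],
    [10.0, 1.0, 2.0],
    [20.0, 2.0, 2.0],
    [30.0, 1.0, 3.0],
    [40.0, 2.0, 3.0],
]
loadings = first_principal_axis(toy_data)
dominant_character = max(range(len(loadings)), key=lambda j: abs(loadings[j]))
print("loadings:", [round(x, 3) for x in loadings])
print("character with largest loading:", dominant_character)
```

Here the first axis is dominated by character 0, so you can say what that axis "means" in terms of the original characters. I don't see an equivalent reading for an ANOPA coordinate.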
One might also ask what ANOPA gets from Euclidean distances that wouldn’t be equally well or better represented by Manhattan distances.
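As a small illustration of why the choice of metric isn't neutral, here is a toy case (entirely hypothetical character states) where the two metrics disagree about which taxon is nearer to a reference point. It doesn't answer the question, it just shows that the question matters.

```python
import math


def euclidean(p, q):
    """Straight-line distance in character space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute per-character differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical taxa scored for 4 characters.
reference = [0, 0, 0, 0]
taxon_a = [3, 0, 0, 0]  # differs a lot in one character
taxon_b = [1, 1, 1, 1]  # differs a little in every character

print("Euclidean:", euclidean(reference, taxon_a), euclidean(reference, taxon_b))
print("Manhattan:", manhattan(reference, taxon_a), manhattan(reference, taxon_b))
```

Euclidean distance puts taxon_b closer to the reference, while Manhattan distance puts taxon_a closer. With discrete character states, the Manhattan figure is at least interpretable as a total amount of state change.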