This.
Consider the space of all possible nucleotide sequences. Suppose we simply pick some long genome sequence at random and then work out (if we imagine we can do this) what sort of organism that randomly picked sequence would produce.
I think we can agree that the probability of randomly and blindly picking a genome sequence that corresponds to a highly anatomically streamlined fish is unfathomably small. In the space of all possible nucleotide sequences, the vast majority must either be nonfunctional, as in not resulting in a viable form of life at all, or at the very least result in something very, very different from a highly streamlined fish.
Yet it should also be very clear that insofar as we have an organism that can live and swim in water, it is essentially guaranteed that under natural selection this species can and will evolve ever more streamlined body morphology. So the real question isn't how likely it is to pick a streamlined fish from among all possible genomes; that's just not how anything evolves. The real question is: what is the probability that a mutation occurs, in a member of the species, that affects body morphology in a way that reduces hydrodynamic drag? Given that such mutations obviously occur in large quantities every generation (the drag coefficient of fish is a variable and heritable trait in any population), a population can move gradually towards extremely rare genotypes.
This is the power of cumulative selection, and why an attractor in a space can vastly affect the probability of obtaining something that is a priori very unlikely.
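To make the contrast concrete, here is a toy sketch in Python, in the spirit of Dawkins' "weasel" program. It is only an illustration under loud assumptions: the fixed `target` string is a stand-in for "more streamlined" (real selection has no target, only differential reproductive success), and the function names, mutation rate, and brood size are all made up for the example.

```python
import random

ALPHABET = "ACGT"

def fitness(seq, target):
    # Toy fitness: number of positions that match the stand-in target.
    return sum(a == b for a, b in zip(seq, target))

def random_sequence(length, rng):
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def random_sampling(target, tries, rng):
    # Blind draws from sequence space: every try starts from scratch.
    return max(fitness(random_sequence(len(target), rng), target)
               for _ in range(tries))

def mutate(seq, rate, rng):
    # Each site is independently redrawn with probability `rate`.
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base
                   for base in seq)

def cumulative_selection(target, max_generations, rng, brood=50, rate=0.02):
    # Each generation keeps the fittest of the parent and its mutated
    # offspring, so small gains are retained and built upon.
    parent = random_sequence(len(target), rng)
    for generation in range(max_generations):
        if parent == target:
            return generation
        offspring = [mutate(parent, rate, rng) for _ in range(brood)]
        parent = max(offspring + [parent], key=lambda s: fitness(s, target))
    return None  # target not reached within the generation limit

if __name__ == "__main__":
    rng = random.Random(0)
    target = random_sequence(60, rng)
    print("best of 5000 blind draws:",
          random_sampling(target, 5000, rng), "of", len(target))
    print("generations with cumulative selection:",
          cumulative_selection(target, 5000, rng))
```

Blind sampling faces odds of 1 in 4^60 per draw for a 60-base target, whereas the selection loop only ever needs the next single improvement to be reachable by mutation. That is the whole point: the a priori rarity of the endpoint says very little about how hard it is to reach step by step.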
It is an entirely reasonable and interesting question to ponder what that attractor might be in protein sequence space. It doesn't actually have to be (and probably isn't, in many cases) a persistent selection for a specific function. If certain functions are extremely rare in protein sequence space, then you obviously can't select for those functions right out of the gate, but you can select for something else that might lead to such rare functions as an unavoidable byproduct. Some of the rarest functions in protein sequence space might in fact have begun long after the protein's first origin, as spandrels that resulted as a byproduct of some other selection pressure, such as the long-term consequence of selection against the inherent aggregation propensity of polypeptide sequences.