https://www.biorxiv.org/content/10.64898/2025.12.04.691947v1.full
It turns out that new AI-based structure prediction tools can now corroborate the long-standing hypothesis that folded protein structures are much easier to evolve from repeats of smaller peptides than from entirely random amino acid sequences.
Conclusions
Our results show that the inclusion of repeats enriches the fraction of otherwise random sequences that can adopt folded conformations. With repeat lengths of 5 to 20 residues, over 1% of the sequences are predicted to fold, mostly into repeating solenoids or β-hairpins, although we also observe helical bundles and tightly wound helical screws that represent a new family of super-secondary structures. With the inclusion of INDELs between blocks of repeats, we also observe the formation of helical bundles at a frequency of 0.5 to 1%.
On the other hand, beyond a repeat length of 30 residues, the frequency of observing predicted folded structures decreased sharply, reaching 0.001% for sequences lacking repeats. Also, these structures were all globular. Evidently, without the imposition of a sequence repeat, the probability of adopting a repeating conformational form becomes very low, and only more asymmetric arrangements typical of the domains seen in globular proteins are observed. The resulting structures are representative of ones seen in globular proteins, and include the α-helical, α/β, and all-β classes of proteins.
So somewhere around 0.1 to 1% of sequences built from repeats of random fragments 5 to 20 amino acids long are predicted to fold.
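To make "built from repeats of random fragments" concrete, here is a minimal sketch of how one such sequence could be constructed. The total length of 150, the uniform amino acid frequencies, the truncation of the last copy, and the function name `random_repeat_sequence` are all my assumptions for illustration; the preprint's actual generation protocol may well differ.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def random_repeat_sequence(repeat_len, total_len, rng):
    """Tile one random fragment of repeat_len residues up to total_len.

    Hypothetical helper: uniform residue frequencies and truncating the
    final copy are assumptions, not the preprint's stated method.
    """
    unit = "".join(rng.choice(AMINO_ACIDS) for _ in range(repeat_len))
    copies = -(-total_len // repeat_len)  # ceiling division
    return (unit * copies)[:total_len]


rng = random.Random(0)  # fixed seed so the example is reproducible
seq = random_repeat_sequence(10, 150, rng)
print(seq[:10], "repeated to", len(seq), "residues")
```

Each such sequence would then go to a structure predictor, and the fraction predicted to fold is what the percentages above refer to.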
Let’s contrast that with what the Discovery Institute blog “Science and Culture Today” says Douglas Axe’s research has shown (my bold):
This paper is interesting because it relates to the work of Douglas Axe that resulted in a paper in the Journal of Molecular Biology in 2004. Axe answered questions about this paper earlier this year, and also mentioned it in his recent book Undeniable (p. 54). In the paper, Axe estimated the prevalence of sequences that could fold into a functional shape by random combinations. It was already known that the functional space was a small fraction of sequence space, but Axe put a number on it based on his experience with random changes to an enzyme. He estimated that one in 10^74 sequences of 150 amino acids could fold and thereby perform some function — any function.
Another one of those “what’s a mere 72 or so orders of magnitude discrepancy between friends?” situations.
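For anyone who wants to check the size of that gap, here is a quick back-of-the-envelope sketch. It takes the 0.1% and 1% figures for repeat-based sequences and Axe's one-in-10^74 estimate at face value, and it glosses over the fact that the two numbers measure somewhat different things (predicted folding versus folding into a functional shape). The function name is just for illustration.

```python
import math

AXE_LOG10 = -74  # log10 of the "one in 10^74" estimate


def gap_in_orders_of_magnitude(fraction, other_log10=AXE_LOG10):
    """Number of powers of ten separating `fraction` from 10**other_log10."""
    return math.log10(fraction) - other_log10


low = gap_in_orders_of_magnitude(0.001)  # 0.1% of repeat sequences
high = gap_in_orders_of_magnitude(0.01)  # 1% of repeat sequences
print(f"gap: about {low:.0f} to {high:.0f} orders of magnitude")
```

Even against the 0.001% figure quoted above for sequences with no repeats at all, the same function gives a gap of about 69 orders of magnitude.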