Much recent work shows, often mutation by mutation, how a non-coding region evolved across the genic threshold.
Evolution of "antifreeze gene" in Arctic fish
De novo emergence of adaptive membrane proteins in yeast
https://www.nature.com/articles/s41467-020-14500-z
Emergence of de novo transcripts in yeast
https://www.nature.com/articles/s41467-021-20911-3
Structural and functional characterization of goddard de novo gene in Drosophila
This Nature Communications report contains a nice summary of the state of research on de novo genes as of 2021.
https://www.nature.com/articles/s41467-021-21667-6
Orphan genes are involved in drought adaptations in domesticated cowpea
Structure and function of naturally evolved *de novo* proteins
According to this summary of the state of research:
the genetic mechanisms underlying the emergence of these ‘de novo’ protein coding genes (‘de novo emergence’) are now quite well understood
Thus biologists are focusing their attention on the roles and interactions of the emergent de novo proteins. The 5-stage Pittsburgh model of de novo protein evolution is discussed.
There is, of course, plenty of room for exploration and discovery on the topic of de novo proteins, but this summary article should put to rest the erroneous notion that biologists do not have a working model that can be tested. Computational tools (such as the maximum likelihood and MCMC models of the Mani and Tlusty paper) are particularly important in resolving wait-time issues.
https://www.sciencedirect.com/science/article/pii/S0959440X2030213X
Best,
Chris Falter