News in experiments on de novo protein coding gene origination

Knopp et al. are at it again:


Antibiotic resistance is a rapidly increasing medical problem that severely limits the success of antibiotic treatments, and the identification of resistance determinants is key for surveillance and control of resistance dissemination. Horizontal transfer is the dominant mechanism for spread of resistance genes between bacteria but little is known about the original emergence of resistance genes. Here, we examined experimentally if random sequences can generate novel antibiotic resistance determinants de novo. By utilizing highly diverse expression libraries encoding random sequences to select for open reading frames that confer resistance to the last-resort antibiotic colistin in Escherichia coli, six de novo colistin resistance conferring peptides (Dcr) were identified. The peptides act via direct interactions with the sensor kinase PmrB (also termed BasS in E. coli), causing an activation of the PmrAB two-component system (TCS), modification of the lipid A domain of lipopolysaccharide and subsequent colistin resistance. This kinase-activation was extended to other TCS by generation of chimeric sensor kinases. Our results demonstrate that peptides with novel activities mediated via specific peptide-protein interactions in the transmembrane domain of a sensory transducer can be selected de novo, suggesting that the origination of such peptides from non-coding regions is conceivable. In addition, we identified a novel class of resistance determinants for a key antibiotic that is used as a last resort treatment for several significant pathogens. The high-level resistance provided at low expression levels, absence of significant growth defects and the functionality of Dcr peptides across different genera suggest that this class of peptides could potentially evolve as bona fide resistance determinants in natura.

Author summary

We expressed over 100 million randomly generated DNA sequences in Escherichia coli and selected 6 variants that encode peptides that provide resistance to the last-resort antibiotic colistin. We show that the selected peptides are auxiliary activators of the two-component system PmrAB, and that resistance is mediated via modifications of the cell envelope causing decreased antibiotic uptake. This is the first example where random expression libraries have been employed to select for peptides that perform an activating function by direct peptide-protein interactions in vivo, adding support to the idea that non-coding DNA can serve as a substrate for de novo gene evolution. Additionally, the described peptides expand the narrow list of colistin resistance genes and further analyses of clinical isolates will be necessary to determine if similar resistance determinants have evolved in natura.


Can you boil that down for a math geek? :laughing:

Doug Axe et al. are wrong again, as yet another specific function is easy to find in a random sampling of protein sequence space.

They basically make a large number (over 100 million) of random protein-coding genes encoding 10 to 50 amino acids (technically this is the size range considered “peptides”, not “proteins”) and put these genes into lots of E. coli cells grown in the presence of an antibiotic that E. coli is normally not resistant to, to see whether any of them can make E. coli resistant to it.

We generated a set of highly diverse plasmid libraries encoding randomly generated short open reading frames (sORFs) that were expressed from a strong promoter to select for de novo sequences that provide a beneficial effect in Escherichia coli [16]. The five libraries encode 10 to 50 amino acids with either no bias (rnd 10, 20, 50a), a restriction to primordial amino acids (rnd 50b) to mimic the amino acid availability supposedly present when the first genes originated de novo [19] or a bias for hydrophilic amino acids (rnd 50c) to promote intrinsic disorder and functional promiscuity (Fig 1A). By utilizing large scale library cloning and parallel plasmid transformations of approximately 80 pooled ligation reactions, we generated a set of over 5.8 x 10^8 unique sequences in total. We subjected the five libraries to a selection for variants that are able to confer resistance to the clinically important antibiotic colistin (polymyxin E) (Fig 1B). This last-resort antimicrobial peptide is classified as a ‘critically important antimicrobial for humans with the highest priority’ by the World Health Organization [20].

They find six different genes that give E. coli the ability to grow in the presence of the antibiotic.

From this selection, we isolated six inserts that enabled growth of E. coli BW25113 at normally inhibitory colistin concentrations. The encoded peptides range from 26 to 51 amino acids and do not share sequence homologies with one another (ClustalOmega) (S1 Table) or other proteins (tblastn) using default search parameters (see Materials and methods).

They then determine how these new genes function (how they make E. coli able to grow in the presence of the antibiotic). Lots of technical stuff there about controls and whatnot.

The work is consistent with work by other labs, which shows novel protein coding genes are likely to begin as small-ish hydrophobic peptides that insert in the cell membrane.

They also show that these peptides are all gain-of-function peptides that activate other proteins, and that they do not cause any significant growth defects (whereas other types of antibiotic resistance mechanisms often come with some sort of functional or fitness trade-off).

I think the discussion is somewhat readable even for laypeople. Well worth a look.


Take some random parts, throw them under the hood and see if it runs. Very cool! :slight_smile:

Oh, I should add: the peptides are thought to mediate their function by direct peptide-protein interactions. That is one of the things Michael Behe argued, in his book The Edge of Evolution, is too unlikely to evolve.

Dcr peptides function as auxiliary activators of PmrB

The isolated peptides act in a PmrAB-dependent manner, causing full activation of the regulon. Based on the high hydrophobicity and predictions for transmembrane helix formation, we hypothesized that the mode of action involves a direct interaction with the membrane-localized sensor kinase PmrB. While no such regulator for PmrB is known, auxiliary proteins are described in other TCSs [28–30]. For example, the functionally related TCS PhoPQ has been shown to be regulated by the membrane protein UgtL by a direct binding to the sensor kinase PhoQ [31]. The employment of a bacterial two-hybrid system, in which the proteins of interest are fused individually to two components (T18 and T25) of an adenylate cyclase, has been particularly useful for demonstrating interactions between auxiliary regulators and their corresponding kinases. Co-localization of the proteins of interest activates the two reporter-fragments, which produces cAMP and drives expression of a reporter gene (Fig 3A) [32]. We chose to test three colistin resistance peptides (Dcr1, Dcr2 and Dcr3) for interaction with PmrB using the bacterial two-hybrid system. In the initial assays, all variants lost the ability to confer colistin resistance when fused to the T25 subunit and we did not detect any interactions with PmrB. We hypothesized that the loss in function was likely due to steric hindrance from the T25 fusions. Therefore, we cloned randomized sequences (NNN repeats) between T25 and the different colistin resistance peptides to act as linkers and selected for fusion variants that maintained functionality, i.e. the ability to confer resistance (Fig 3B). We isolated one fusion variant that contained four concatenated 10x(NNN) repeats and maintained the same level of resistance as the originally selected, non-fused Dcr3. This variant (Dcr3L) caused a strong activation of the two-hybrid system reporter, providing compelling evidence for an interaction between the Dcr peptides and PmrB (Fig 3C). This interaction was specific for PmrB, as no interaction between Dcr3L and PmrA could be detected.



A screen of 10^8 random polypeptides yields 6 that interact with a sensor kinase.

A protein-interacting domain like this is roughly the same as what Behe would call a “CCC”. According to Behe, one should expect to find one such peptide in a population of 10^20. Finding six in 10^8 is out of the question, if one believes the Edge of Evolution.
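For anyone who wants to put rough numbers on that comparison, here is a back-of-envelope sketch. It assumes hits in the screen are independent and rare enough to treat as Poisson, takes 1 in 10^20 as the per-sequence rate attributed to Behe above, and uses the round 10^8 screen size. The function name `poisson_upper_tail` is just mine for illustration.

```python
import math

def poisson_upper_tail(k_min, lam, terms=30):
    """P(X >= k_min) for X ~ Poisson(lam), summed term by term.

    Summing the tail directly (rather than 1 - CDF) keeps precision
    when the tail probability is far below floating-point epsilon.
    """
    return sum(
        math.exp(-lam) * lam**k / math.factorial(k)
        for k in range(k_min, k_min + terms)
    )

# Assumptions taken from the thread, not from the paper's exact counts:
rate_claimed = 1e-20   # per-sequence rate attributed to Behe above
screen_size = 1e8      # round number of random sequences screened
hits_observed = 6

expected_hits = rate_claimed * screen_size          # mean number of hits expected
p_six_or_more = poisson_upper_tail(hits_observed, expected_hits)

print(f"expected hits at the claimed rate: {expected_hits:.1e}")
print(f"P(at least {hits_observed} hits):  {p_six_or_more:.1e}")
```

Under those assumptions the expected number of hits is about one in a trillion screens, and six hits in a single screen would be vastly less likely still.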

In other words, yet another finding that shows just how completely wrong Behe is.


In the spirit of math geekiness, this (6 in 10^8) gives 99% confidence that the probability that a random polypeptide will interact with a sensor kinase is at least one in 4.29E7 (~2.33E-8).
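One way to sanity-check a figure like this is an exact one-sided Poisson lower bound. The sketch below assumes the six hits are independent rare events among 10^8 screened sequences (the thread's round number, not the paper's total library size), and the function names are made up for illustration. It lands at roughly 1.8E-8 (about one in 5.6E7), a bit more conservative than the number quoted above, so the quoted figure presumably comes from a slightly different method. Either way the order of magnitude is the same.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    cdf_below_k = sum(
        math.exp(-lam) * lam**j / math.factorial(j) for j in range(k)
    )
    return 1.0 - cdf_below_k

def lower_bound_rate(hits, trials, confidence=0.99):
    """One-sided lower confidence bound on the per-trial hit rate.

    Finds the smallest Poisson mean for which seeing `hits` or more
    events still has probability (1 - confidence), by bisection,
    then divides by the number of trials.
    """
    alpha = 1.0 - confidence
    lo, hi = 0.0, float(hits)  # P(X >= hits) at mean = hits is well above alpha
    for _ in range(200):
        mid = (lo + hi) / 2
        if poisson_sf(hits, mid) < alpha:
            lo = mid   # mean too small to explain the data, move up
        else:
            hi = mid
    return lo / trials

p_min = lower_bound_rate(hits=6, trials=10**8)
print(f"99% lower bound on hit rate: {p_min:.2e} (about 1 in {1 / p_min:.2e})")
```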

And yet perfectly consistent with every other random screen experiment I’ve seen. Almost as though Behe didn’t bother checking or was dishonest in his treatment of the literature.

