Deep Learning the Determinants of Protein Function

swamidass · May 27, 2021, 5:52pm

https://www.nature.com/articles/s41467-021-23246-1

Understanding the structural determinants of a protein’s biochemical properties, such as activity and stability, is a major challenge in biology and medicine. Comparing computer simulations of protein variants with different biochemical properties is an increasingly powerful means to drive progress. However, success often hinges on dimensionality reduction algorithms for simplifying the complex ensemble of structures each variant adopts. Unfortunately, common algorithms rely on potentially misleading assumptions about what structural features are important, such as emphasizing larger geometric changes over smaller ones. Here we present DiffNets, self-supervised autoencoders that avoid such assumptions, and automatically identify the relevant features, by requiring that the low-dimensional representations they learn are sufficient to predict the biochemical differences between protein variants. For example, DiffNets automatically identify subtle structural signatures that predict the relative stabilities of β-lactamase variants and duty ratios of myosin isoforms. DiffNets should also be applicable to understanding other perturbations, such as ligand binding.
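To make the abstract's central idea concrete, here is a minimal sketch of an objective that combines reconstruction with a classification term on the latent vector, so the low-dimensional representation must also predict which variant a structure came from. This is not the authors' implementation: the linear encoder and decoder, the function name `diffnet_style_loss`, and the weight `alpha` are all assumptions made for illustration.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def diffnet_style_loss(x, label, W_enc, W_dec, w_clf, b_clf, alpha=1.0):
    """Hypothetical combined loss for one structure (illustrative only).

    x      : feature vector for one simulation frame
    label  : 0 or 1, the biochemical class of the variant (may be a soft label)
    alpha  : assumed weight balancing the two terms
    Returns (total, reconstruction, classification, predicted_probability).
    """
    z = matvec(W_enc, x)        # low-dimensional latent vector
    x_hat = matvec(W_dec, z)    # reconstruction from the latent vector
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

    # The classifier sees only the latent vector, so the latent space
    # has to carry whatever distinguishes the variants.
    p = sigmoid(sum(w * zi for w, zi in zip(w_clf, z)) + b_clf)
    eps = 1e-12
    clf = -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))
    return recon + alpha * clf, recon, clf, p


# Toy usage: 3 input features compressed to a 2-dimensional latent space.
W_enc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_dec = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
total, recon, clf, p = diffnet_style_loss(
    [1.0, 2.0, 3.0], 1, W_enc, W_dec, w_clf=[0.0, 0.0], b_clf=0.0
)
```

Minimizing only the reconstruction term would favor the largest geometric changes. The classification term is what pushes the latent space toward features that separate the variants, even when those features are small.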

One of my more fun, out-of-the-box projects was just published. Given all the discussion here about protein structure and function, I thought it would interest the community.

Dan_Eastwood · May 27, 2021, 10:49pm

Congratulations!

Chris_Falter · May 28, 2021, 1:27am

Very ingenious model design! The use of E-M and of concatenated latent spaces from 2 auto-encoders is quite instructive.

Two thumbs up!

swamidass · May 28, 2021, 6:09am

My group came up with that EM trick for multi-instance training a few years back. It has since helped solve previously unsolved problems in about four domains. I’m really proud of that!
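For readers unfamiliar with the general idea, here is a generic sketch of an EM-style label update for multi-instance training, where only the bag (for example, a whole variant) has a known label and the individual instances (frames) do not. It is an assumption-laden illustration, not the exact procedure from the paper: the function name `refine_labels` and the `lo`/`hi` bounds on the expected fraction of positive instances are hypothetical.

```python
def refine_labels(probs, lo, hi):
    """E-step sketch: turn classifier outputs into per-instance soft labels.

    probs  : current classifier probabilities for every instance in one bag
    lo, hi : assumed bounds on the mean soft label allowed for this bag
    The soft labels are rescaled only when their mean falls outside [lo, hi].
    """
    mean = sum(probs) / len(probs)
    if mean > hi:
        # Shrink toward 0 so the mean equals hi; values stay in [0, 1].
        scale = hi / mean
        return [p * scale for p in probs]
    if mean < lo:
        # Shrink the complements toward 0 so the mean equals lo.
        scale = (1.0 - lo) / (1.0 - mean)
        return [1.0 - (1.0 - p) * scale for p in probs]
    return list(probs)


# Toy usage: three frames from one variant with mean probability 0.5.
frame_probs = [0.2, 0.4, 0.9]
raised = refine_labels(frame_probs, lo=0.6, hi=0.9)   # mean pushed up to 0.6
lowered = refine_labels(frame_probs, lo=0.0, hi=0.3)  # mean pushed down to 0.3
kept = refine_labels(frame_probs, lo=0.4, hi=0.6)     # already within bounds
```

In a full loop, an M-step would retrain the classifier on these soft labels and the two steps would alternate. That retraining is omitted here.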

