Revolutionizing the Protein Landscape: MIT Researchers …

MIT researchers have created machine learning algorithms to create novel proteins beyond those in nature. They employed generative models to predict the amino acid sequences of proteins that meet particular structural requirements. These models learn the molecular linkages that govern how proteins develop. The models can produce millions of proteins in just a few days, giving researchers access to a variety of fresh research possibilities. This tool could be used to create food coatings based on proteins that would keep producing fresher for longer while still being safe for people to consume or to create materials with particular mechanical properties that might eventually replace materials made from ceramics or petroleum with materials that have a significantly lower carbon footprint.

The order of the amino acids in a protein chain influences the protein’s mechanical properties. Chains of amino acids are folded together in 3D patterns to form proteins. Although hundreds of proteins produced by evolution have been identified, experts believe that a vast majority of amino acid sequences are still unknown. Deep learning algorithms that can forecast the structure of protein for some amino acid sequences have recently been created by researchers to speed up the process of protein discovery. However, the inverse problem, which involves foretelling a series of amino acid sequences that satisfy design objectives, has proven to be more difficult. When creating proteins, attention-based diffusion models must be able to learn very long-range associations because a single mutation in a lengthy amino acid sequence might make or break the entire structure. By first learning to recover the training data by eliminating the noise, a diffusion model can then learn to produce new data by first introducing noise to the training data.

Using this architecture, the researchers created two machine-learning models that can forecast a wide range of novel amino acid sequences that will result in proteins that match predetermined structural design goals. Users enter desired percentages of various structures for the model that works with overall structural qualities, and the model then constructs sequences that adhere to those targets. The scientist also selects the order of amino acid structures for the second model, providing much finer-grained control. The models are linked to a protein folding prediction algorithm that the researchers use to ascertain the protein’s three-dimensional (3D) structure. They then compute the resulting properties and compare them to the design requirements.

By contrasting the novel proteins with well-known proteins with comparable structural characteristics, they were able to test their models. A majority of them shared 50 to 60 percent of their amino acid sequences with already known ones, although several also included wholly unique sequences. According to the degree of similarity, several of the produced proteins are synthesizable. The researchers attempted to fool the models by feeding them design targets that were physically impossible in order to make sure the predicted proteins made sense. They were amazed to observe that the models yielded the nearest synthesizable answer rather than the unlikely proteins.

Check out the Paper and MIT Blog. Don’t forget to join our 20k+ ML SubRedditDiscord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at

🚀 Check Out 100’s AI Tools in AI Tools Club

Niharika is a Technical consulting intern at Marktechpost. She is a third year undergraduate, currently pursuing her B.Tech from Indian Institute of Technology(IIT), Kharagpur. She is a highly enthusiastic individual with a keen interest in Machine learning, Data science and AI and an avid reader of the latest developments in these fields.

Source link

You cannot copy content of this page