Shanghai scientists develop novel protein design tech using AI

By ZHOU WENTING | chinadaily.com.cn | Updated: 2025-03-24 17:31

Shanghai scientists have made a breakthrough in protein design by leveraging artificial intelligence. They have established the world's largest protein sequence dataset and built models on it that enable the targeted modification and selection of proteins with specific functions.

The advance could drastically reduce the time and cost involved in industrial protein modification, the research team from Shanghai Jiao Tong University said on Saturday.

Proteins play crucial roles in everyday production and life activities, from drug preparation to stain removal, plastic degradation, and green manufacturing. However, natural proteins often require modification to withstand acidic or alkaline conditions, temperature variations, and other environmental factors to meet specific application requirements.

For example, scientists say that if a protein is used in laundry detergent, it must be able to withstand cold and hot water, so that it can play a role in decomposing stains in the actual washing process.

Traditionally, protein modification has relied on thousands of trial-and-error experiments, making the process slow, costly, and labor-intensive for industry.

The Shanghai team's innovative approach transforms protein production from a slow trial-and-error process to an efficient and precise design method, significantly reducing the research and development cycle from the conventional two to five years to just six to 12 months.

Their technology allows for directed modification or selection of proteins with unique properties, such as extreme heat resistance, alkaline resistance, and resistance to gastrointestinal digestion. Such proteins hold immense promise in various fields, including biotechnology, pharmaceutical research, and industrial production.

The technology, together with industry-leading automation equipment, has been put into industrial use, turning protein design from a complex scientific problem into a much simpler engineering task.

The team's protein sequence dataset, the Venus-Protein Outsize Database (Venus-Pod), contains over 9 billion protein sequences covering a wide range of organisms, from conventional terrestrial life forms to extremophilic microorganisms. The dataset contains 3.62 billion terrestrial microbial protein sequences, 2.94 billion marine microbial protein sequences, 2.43 billion antibody protein sequences, and 60 million viral protein sequences.

Notably, 500 million of the sequences are labeled with functional tags indicating the working temperature, pressure, acidity, and alkalinity of each protein.
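The reported figures can be checked with simple arithmetic. The short Python sketch below sums the four categories named in the article and estimates what share of the dataset carries functional tags. It assumes the four categories make up the whole dataset and that the 500 million labeled sequences are counted against that total; the variable names are illustrative only and are not taken from the research team.

```python
# Sequence counts as reported in the article, in billions.
# Assumption: these four categories make up the entire dataset.
counts_billion = {
    "terrestrial microbial": 3.62,
    "marine microbial": 2.94,
    "antibody": 2.43,
    "viral": 0.06,
}

# Sequences reported to carry functional tags, in billions.
# Assumption: this figure is counted against the full dataset.
labeled_billion = 0.5

total = sum(counts_billion.values())
labeled_share = labeled_billion / total

print(f"total: {total:.2f} billion sequences")
for name, count in counts_billion.items():
    print(f"  {name}: {count / total:.1%} of the total")
print(f"labeled with functional tags: {labeled_share:.1%}")
```

Under these assumptions the categories add up to about 9.05 billion sequences, in line with the "over 9 billion" figure, and the labeled subset comes to roughly 5.5 percent of the total.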

Based on the Venus-Pod dataset, the team trained the Venus series models, which focus on predicting and designing protein functions. The models predict the functional effects of protein mutations with unparalleled accuracy and rank at the top of the industry leaderboard, said Hong Liang, lead scientist on the research team.

The Venus series models boast two core functions: AI-directed protein evolution, and AI-empowered massive screening, said Hong.

"The first can optimize various properties of underperforming protein products to meet specific application requirements, while the latter is able to precisely identify proteins with exceptional functionalities from the vast unknown functional protein dataset, such as extreme heat resistance and gastrointestinal digestion resistance. Such unconventional proteins are expected to bring innovative breakthroughs to related scientific fields and industries," he said.

In conjunction with the Venus series models, the team has also developed the world's first integrated machine that allows processing of low-throughput, high-volume protein expression, purification, and functional testing. The machine can continuously complete over 100 protein expression, purification, and testing tasks within 24 hours. This automation improves efficiency nearly tenfold compared to manual methods, significantly reducing the labor, resources, and time costs in the research and development process, thereby boosting the efficiency of protein engineering and synthetic biology research.

Over the past two years, the Venus series models have successfully designed multiple proteins that are now entering the industrialization phase.

For instance, in the field of early diagnosis of Alzheimer's disease, researchers optimized an alkaline phosphatase (ALP), a type of enzyme, to achieve three times the activity of the best global product so far, enabling the detection of biomarkers for Alzheimer's disease at extremely low concentrations.

The modified ALP has entered the 200-liter scale-up production stage, marking the successful translation of the Venus series models into practical industrial applications. Researchers said the achievement holds great value for diagnostic chemiluminescent detection projects that require ultra-sensitive testing.
