Profluent Introduces ProGen3, Demonstrating Scaling Laws for Foundation Models in Writing Biology

  • Scientific preprint examines the first wet laboratory evidence of scaling benefits for biological design involving billion-parameter model sizes trained on over 3.4 billion protein sequences
  • Platform enables single-shot design of antibodies against 20 drug targets, whose existing therapies have treated 7 million patients and generated $660B in historical sales, and development of an ultracompact gene editor, all available for licensing
  • The frontier AI model suite for protein generation to be made available to select partners through an exclusive early access program

Profluent, the AI-first protein design company, today unveiled ProGen3, a family of frontier protein models. Trained on the world’s largest, highly curated dataset of protein sequences, the company’s foundation model for protein generation enables writing new biology to solve challenges across biomedicine, agriculture, and industrial applications.

With ProGen3, Profluent provides evidence that the AI breakthroughs that revolutionized natural language processing are now ready for impact in biology. Just as large language models learn the underlying rules and patterns of language and gain new emergent capabilities as they scale with increased data and computing power, Profluent has shown that the relationship between scale and performance also applies to biological design. In a first for the field, the company detailed real-world evidence of scaling benefits for billion-parameter models trained on over 3.4 billion full-length protein sequences. This achievement signals that AI biological models will continue to unlock more value as they scale, enabling a future of programmable biology and a shift from incidental discovery to intentional design.

“The tremendous advancements we’ve witnessed in AI for text generation are linked to scaling laws, where increasing compute and data translates to improved performance and unlocked capabilities. The next frontier for these scaling benefits will be in the physical realm – particularly in biology,” said Ali Madani, Ph.D., Profluent co-founder and Chief Executive Officer. “We view this as the starting line in the race toward emergent capabilities in biological design and look forward to working with our partners to actualize the potential of our protein design technology.”

ProGen3 expands on previous models with unprecedented scale of data, curation, and model size. In a preprint on bioRxiv, Profluent details the first compute-optimal scaling laws for sparse protein language models, ranging from 112 million to 46 billion parameters, pre-trained on 1.5 trillion tokens. In wet laboratory validation, Profluent finds that larger models generate high-fitness proteins across a wide diversity of protein families. Computational and experimental results also show that larger models are more responsive to alignment with laboratory data, resulting in improved protein fitness prediction and sequence generation capabilities. Additionally, Profluent continues to expand its curated database of proteins, which currently contains approximately 80 billion sequences, about 30 times more than the AlphaFold database.

By integrating its frontier model into the broader Profluent platform, which includes its in-house wet lab, Profluent continues to tackle moonshots in biological design. Early versions of its foundation models were used to design the world’s first AI-created and open-source gene editor, OpenCRISPR-1. With its latest platform, Profluent has designed a novel, ultracompact editor and expanded beyond genomic medicine to tackle challenging problems in the biologics space. In particular, Profluent has designed antibodies, termed OpenAntibodies, in a single shot that rival or beat blockbuster therapeutic antibodies for 20 disease targets, which have collectively treated 7 million patients and yielded over $660 billion in cumulative drug sales. As shown for antibodies, Profluent’s foundation models are broadly useful across multiple modalities and sectors.

Profluent is working with leaders in therapeutics, agriculture, and biomanufacturing on applications of its AI-designed proteins. Partners can access the company’s technology in the following ways:

  • Molecules: Straightforward licensing of AI-generated antibodies and editors or a strategic alliance on bespoke solutions
  • Models: Exclusive early access program that allows select partners to customize our best foundation models to specific data and use cases

For more details about ProGen3, licensing, and partnership opportunities, visit https://www.profluent.bio/showcase/progen3.

About Profluent

Profluent is an AI-first company pushing the frontier of de novo protein design to author new biology. Grounded in nature with AI as an interpreter, Profluent’s powerful foundation model platform unlocks solutions that transform medicine, agriculture, and beyond. Founded in 2022 and headquartered in Emeryville, CA, Profluent is backed by leading investors including Spark Capital, Insight Partners, Air Street Capital, AIX Ventures, and Convergent Ventures. To learn more, visit profluent.bio.
