Profluent Bio raises $106M to design proteins from scratch with AI — trained on 100 billion proteins and 20 trillion tokens
Nov 19, 2025 with Ali Madani
Key Points
- Profluent Bio raises $106 million from investors including Jeff Bezos to design proteins from scratch with AI, replacing what CEO Ali Madani calls the 'caveman-like' approach of discovering medicines by accident.
- The company's models train on over 100 billion proteins and more than 20 trillion tokens, roughly 33 to 50 times the 2 to 3 billion proteins cited for AlphaFold 3, then refine predictions through wet-lab testing in human cells.
- Profluent has already released OpenCRISPR-1, an AI-designed gene-editing protein now used by thousands of researchers across pharma, biotech, and academic labs.
Summary
Profluent Bio raised $106 million to build AI models that design proteins from scratch. Jeff Bezos is among the backers. CEO Ali Madani holds a PhD in machine learning from UC Berkeley and previously led a biology-focused language model effort at Salesforce.
According to Madani, most life-saving medicines, from penicillin to CRISPR-Cas9, were found by accident. He calls this approach "caveman-like" and argues that Profluent is replacing random discovery with AI-designed proteins built to specification.
Training scale
Profluent's models are pre-trained on over 100 billion proteins, representing more than 20 trillion tokens. By comparison, AlphaFold 3 was trained on roughly 2 to 3 billion proteins. The pre-training data comes from proteins that evolved under natural selection, which Madani describes as the biological equivalent of scraping the internet to learn grammar and semantics.
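As a rough sanity check on these figures, the arithmetic can be worked through in a few lines. The numbers below are the ones quoted in this summary, treated as round values (the article gives them as lower bounds or ranges), and the function name is only illustrative.

```python
# Figures as quoted in the article; "over" and "more than" are treated as exact
# round numbers here, which is an assumption made for the sake of the arithmetic.
PROFLUENT_PROTEINS = 100_000_000_000      # 100 billion proteins
PROFLUENT_TOKENS = 20_000_000_000_000     # 20 trillion tokens
ALPHAFOLD3_PROTEINS_LOW = 2_000_000_000   # lower end of the quoted 2 to 3 billion
ALPHAFOLD3_PROTEINS_HIGH = 3_000_000_000  # upper end of the quoted 2 to 3 billion


def scale_summary():
    """Return (average tokens per protein, lowest data ratio, highest data ratio)."""
    tokens_per_protein = PROFLUENT_TOKENS / PROFLUENT_PROTEINS
    ratio_low = PROFLUENT_PROTEINS / ALPHAFOLD3_PROTEINS_HIGH
    ratio_high = PROFLUENT_PROTEINS / ALPHAFOLD3_PROTEINS_LOW
    return tokens_per_protein, ratio_low, ratio_high


tokens_per_protein, ratio_low, ratio_high = scale_summary()
print(f"average tokens per protein: {tokens_per_protein:.0f}")
print(f"protein-count ratio vs. AlphaFold 3: {ratio_low:.0f}x to {ratio_high:.0f}x")
```

On these numbers the average works out to about 200 tokens per protein, and the protein-count ratio falls between roughly 33x and 50x depending on which end of the AlphaFold 3 range is used.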
The post-training loop brings the lab into the cycle. Profluent generates candidate protein sequences, then tests them in human cells and relevant cellular contexts rather than just in test tubes, and feeds those results back into the models. Madani frames this wet-lab-to-model feedback cycle as the defining differentiator. The models improve because they are grounded in real biological assays instead of in silico prediction alone.
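The generate, test, and feed-back cycle described above can be illustrated with a deliberately tiny toy. This is not Profluent's method: the `ToyGenerator` class, the `toy_assay` scoring function, and the weight-update rule are all hypothetical stand-ins, chosen only so that the three steps of the loop are visible in runnable form.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


class ToyGenerator:
    """Hypothetical stand-in for a pre-trained model: one sampling weight per residue."""

    def __init__(self):
        self.weights = {aa: 1.0 for aa in AMINO_ACIDS}

    def sample(self, n, length, rng):
        residues = list(self.weights)
        weights = [self.weights[aa] for aa in residues]
        return ["".join(rng.choices(residues, weights=weights, k=length)) for _ in range(n)]

    def update(self, measured, top_k):
        """Shift sampling weight toward residues seen in the best-scoring sequences."""
        best = sorted(measured, key=lambda pair: pair[1], reverse=True)[:top_k]
        counts = Counter("".join(seq for seq, _ in best))
        total = sum(counts.values())
        for aa in self.weights:
            self.weights[aa] += len(AMINO_ACIDS) * counts[aa] / total


def toy_assay(sequence):
    """Hypothetical stand-in for a lab measurement: fraction of leucine residues."""
    return sequence.count("L") / len(sequence)


def run_feedback_loop(model, rounds, n_candidates, length, top_k, seed=0):
    rng = random.Random(seed)
    mean_scores = []
    for _ in range(rounds):
        candidates = model.sample(n_candidates, length, rng)       # 1. generate
        measured = [(seq, toy_assay(seq)) for seq in candidates]   # 2. test
        model.update(measured, top_k)                              # 3. feed back
        mean_scores.append(sum(score for _, score in measured) / len(measured))
    return mean_scores


model = ToyGenerator()
history = run_feedback_loop(model, rounds=3, n_candidates=50, length=30, top_k=5)
print(history)
```

In this toy, the mean assay score in `history` should tend to rise across rounds, because each update biases sampling toward whatever the assay rewarded. A real system would replace the assay with actual cell-based measurements and the weight update with model fine-tuning.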
Early traction
Profluent released a protein called OpenCRISPR-1, generated from scratch using models trained on gene-editing proteins. Thousands of researchers at large pharma companies, small biotechs, academic institutions, and industrial labs now use it. The company has commercial partners across therapeutics, diagnostics, biomanufacturing, and agriculture.
Madani positions Profluent at roughly the GPT-1 or GPT-2 stage of maturity. The raise is intended to extend the pre-training and wet-lab infrastructure needed to push further along that curve.