Google's AI model generates a novel cancer hypothesis — validated in living cells
Oct 16, 2025
Key Points
- Google's C2S model identified a drug candidate that converts immune-cold tumors into immune-hot tumors, validated in living cells: a concrete win for AI in a field where only about 600 drug-cancer indication pairs have ever reached FDA approval.
- The model treats cells as text tokens ranked by gene expression, evidence that scaling laws apply to biology as they do to code generation, with qualitative improvements from 4 billion to 27 billion parameters.
- Real bottlenecks shift from compute to wet-lab cycle time and FDA approval, leaving data availability as the next constraint before Google can scale from one discovery to thousands.
Summary
Google's C2S, a 27-billion-parameter foundation model built with Yale on Google's Gemma architecture, generated a novel hypothesis about cancer cell behavior: it identified a drug that acts as a conditional amplifier, boosting immune signals to turn cold tumors into hot tumors that killer T cells can target. The hypothesis was experimentally validated in living cells, not in animal models or humans.
For context, roughly 600 FDA-approved drug-cancer indication pairs exist in the U.S. alone, and only about 5% of drugs submitted to the FDA win approval. Aggregated across decades, human-discovered cancer drug candidates number in the tens of thousands or hundreds of thousands, narrowed by the FDA to just 600 on the market. Google found one candidate with AI. The slope matters more than the y-intercept.
The discovery arrived within 24 hours of OpenAI's o1 announcement. Google deserves credit for the execution, but the timing highlights a competitive reality: with hundreds of billions in revenue, Google can dedicate resources to exploratory science that OpenAI, despite its resources, may not prioritize as core to its roadmap.
Text as universal interface
Yann LeCun's concept of text as a universal interface applies here. The model treats cells as text, representing each cell as a sentence of gene names ordered from most to least expressed. No 3D structure modeling required, just tokens. The same architecture that powers code generation and retrieval can power biology. Scaling laws apply: the team saw qualitative improvements scaling from roughly 4 billion parameters up to 27 billion.
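The encoding described above can be sketched in a few lines. This is an illustrative approximation, not Google's actual pipeline: the gene names, expression counts, and the `cell_to_sentence` helper are all hypothetical, and real cell sentences cover thousands of genes rather than a handful.

```python
def cell_to_sentence(expression: dict[str, float], top_k: int = 5) -> str:
    """Turn a cell's expression profile into a 'sentence' of gene names,
    ordered from most to least expressed, dropping unexpressed genes."""
    ranked = sorted(expression.items(), key=lambda kv: kv[1], reverse=True)
    return " ".join(gene for gene, count in ranked[:top_k] if count > 0)

# Illustrative single-cell expression counts (gene name -> normalized count).
cell = {"MALAT1": 120.0, "GAPDH": 97.0, "CD3D": 42.0, "IL2RA": 3.0, "FOXP3": 0.0}
print(cell_to_sentence(cell))  # MALAT1 GAPDH CD3D IL2RA
```

Once a cell is a token sequence like this, it can be fed to a standard language model with no architecture changes, which is the whole point of the text-as-interface framing.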
Not an AGI scenario
This is emphatically not an AGI discovery. Humans and machines collaborate: the AI surfaces hypotheses; scientists validate them in the wet lab. The real constraint isn't compute or model scale but the cycle time of actual testing and FDA approval. This could follow the pattern of the 2005 DARPA Grand Challenge, where 20 years passed before Waymo deployed autonomous vehicles at scale. Regulation and wet-lab throughput become the bottleneck, not model capability.
The open question is whether Google can produce thousands of these discoveries. If they scale the model further and expand the training data, the pattern suggests they can. But data availability may become the limiting factor before compute does. No established data broker for cellular and drug interaction datasets exists yet. That gap signals early innings.
Narratively, the announcement aligns with Google's mission to organize the world's information and make it useful. It also reframes concerns about AI's electricity consumption with a concrete proof point that the technology delivers on its decade-old promise to accelerate medicine.