Modal Labs raises $87M Series B led by Lux Capital to scale serverless AI infrastructure
Sep 29, 2025 with Erik Bernhardsson
Key Points
- Modal Labs closes $87M Series B led by Lux Capital to scale serverless GPU infrastructure billed per GPU hour, targeting companies burned by overcapacity from the 2023-2024 GPU hoarding wave.
- Modal expands upmarket from early-stage startups to later-stage and enterprise customers by positioning itself as a high-level software layer above Kubernetes and Docker.
- CEO Erik Bernhardsson sees NVIDIA's CUDA as the only viable option for the next two years but views Google TPUs as a credible longer-term alternative and acknowledges that coding agents could eventually break CUDA lock-in.
Summary
Modal Labs has closed an $87 million Series B led by Lux Capital, with existing investors also participating. The round positions the company to scale its serverless GPU infrastructure platform, which charges on a pure usage basis — per GPU hour — rather than requiring customers to make large capacity reservations.
Erik Bernhardsson (CEO, Modal Labs) frames the core value proposition around the widespread misallocation of GPU spend over the past two years. Companies that rushed to lock in thousands of GPUs during the perceived scarcity window of 2023-2024 are now sitting on underutilized capacity. Modal's model is explicitly designed to avoid that trap, scaling compute up and down automatically so customers pay only for active runtime.
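To see why underutilized reserved capacity hurts, the back-of-the-envelope comparison below contrasts a long-term reservation with pure usage-based billing across utilization levels. All rates and the utilization figures are hypothetical assumptions for the sketch, not numbers from the conversation.

```python
# Illustrative reserved vs. usage-based GPU spend for one GPU over a month.
# The rates below are hypothetical assumptions, not figures from the interview.
RESERVED_HOURLY_RATE = 2.00   # assumed $/GPU-hour under a long-term reservation
ON_DEMAND_HOURLY_RATE = 3.50  # assumed $/GPU-hour under pay-per-use billing
HOURS_PER_MONTH = 730

def monthly_cost(utilization: float) -> tuple[float, float]:
    """Monthly cost of one GPU at a given fraction of busy hours."""
    reserved = RESERVED_HOURLY_RATE * HOURS_PER_MONTH               # paid whether or not the GPU is busy
    usage_based = ON_DEMAND_HOURLY_RATE * HOURS_PER_MONTH * utilization  # paid only for active runtime
    return reserved, usage_based

for util in (0.1, 0.3, 0.6, 0.9):
    reserved, usage = monthly_cost(util)
    print(f"utilization {util:.0%}: reserved ${reserved:,.0f} vs usage-based ${usage:,.0f}")
```

Under these assumed rates, the reservation only pays for itself when utilization stays high; at the low utilization typical of hoarded capacity, the pay-per-use column comes out well ahead, which is the trap the summary describes.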
The platform targets engineers directly, positioning itself as a high-level software layer above traditional orchestration tools like Kubernetes and Docker. Early product-market fit came with startups, but traction has since expanded to later-stage, public, and enterprise customers. Modal is not, however, competing for hyperscale contracts at the Stargate level.
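To make the "software layer above Kubernetes and Docker" positioning concrete, here is a minimal sketch of what a serverless GPU function roughly looks like in Modal's Python SDK. The decorator names, GPU parameter, and entrypoint are drawn from Modal's public documentation rather than from the conversation, so treat the exact signatures as assumptions.

```python
import modal

# Minimal sketch of a Modal app, based on the publicly documented Python SDK;
# exact names and parameters are assumptions, not details from the interview.
app = modal.App("example-inference")

# Dependencies are declared as an image in Python, so no Dockerfile or
# Kubernetes manifest is written by hand.
image = modal.Image.debian_slim().pip_install("torch")

@app.function(gpu="A100", image=image)
def embed(text: str) -> list[float]:
    # Runs in a container with an attached GPU; billing covers only the
    # time this function is actually executing.
    import torch
    return torch.tensor([float(len(text))]).tolist()

@app.local_entrypoint()
def main():
    # Invoked via `modal run`; the GPU container is spun up on demand
    # and torn down when the call finishes.
    print(embed.remote("hello world"))
```

The point of the sketch is that the unit of deployment is a Python function rather than a cluster: scale-to-zero and per-use billing come from the platform instead of being engineered by the customer.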
On the hardware landscape, Erik views NVIDIA as effectively the only viable option for customers over the next two years, citing CUDA's software ecosystem as the decisive advantage. He is personally bullish on Google TPUs as a credible alternative further out and treats AMD and other ASIC players as longer-term variables. Notably, he concedes that CUDA is notoriously difficult to work with, which he sees as an opening for better kernel-writing tools, pointing to Modular as one company attempting that abstraction layer. The broader thesis raised in discussion is that coding agents running autonomously could eventually port CUDA-dependent workloads to competing hardware, which would significantly undermine the lock-in narrative.