Interview

Nathan Lambert: America needs its own DeepSeek — the case for publicly funding open-source AI to stay competitive with China

Jul 9, 2025 with Nathan Lambert

Key Points

  • The Chinese Qwen model family has displaced Meta's Llama as the dominant base for open-source AI research over the past three to five months, offering a practical breadth of model sizes that researchers prefer to DeepSeek's frontier-class but hard-to-fine-tune weights.
  • Nathan Lambert argues the US could fund a fully open model ecosystem at a fraction of what major AI labs spend on proprietary development, positioning public compute investment as cheap geopolitical insurance against Chinese AI dominance.
  • OpenAI's anticipated open model release will target specific gaps with permissive licensing like Apache or MIT, sidestepping the usage restrictions in Meta's Llama license and matching the licensing posture already adopted by leading Chinese labs.

Summary

The open-source AI ecosystem in the US has a structural problem: the research community has quietly migrated toward Chinese models. Nathan Lambert, a researcher at the Allen Institute for AI (AI2), argues that over the past three to five months the dominant base for open-source AI development has shifted to Qwen, Alibaba's model family, displacing Llama as the de facto open research standard.

The contrast between DeepSeek's and Qwen's release strategies explains the shift. DeepSeek produces frontier-class models with permissive weights that are widely hosted by cloud providers, but their size makes fine-tuning impractical for most researchers. Qwen takes a different approach, releasing tens of models across a broad parameter range, from 500 million parameters up to large frontier scale, covering both base and post-trained variants. That breadth makes Qwen the practical choice for anyone building niche products or doing constrained-compute research.

Meta's Llama ceded this ground. Llama 3 was the open research standard, so dominant that Lambert joked Hugging Face could have been rebranded around it. Llama 4 shifted toward more bespoke, internally oriented releases, and new leadership hires at Meta are seen as diluting Zuckerberg's historically strong open-source commitment. Lambert puts the probability that Meta doubles down and reclaims the national-champion role at roughly 50/50, leaning against it.

The policy argument Lambert makes is straightforward. The cost of maintaining a fully open US model ecosystem, with weights, training data, and code released publicly, is a fraction of what major AI labs spend on proprietary development. A targeted compute investment, doubling or tripling what AI2 currently has access to for pre-training, would produce a meaningfully better open American model. The political case is, in his framing, an easy win given bipartisan anxiety about Chinese AI dominance.

The structural obstacle is not funding in principle but execution. Relocating AI talent to stand up a government-backed open lab is operationally difficult. AI2 itself, though already partially embedded in academia through co-appointments at the University of Washington, is culturally too academic to replicate the resource-intensive scaling mentality that drove early OpenAI. Lambert's preferred path is incremental: route more compute to existing institutions like AI2 rather than building new entities.

On the corporate side, Lambert identifies Nvidia and AMD as the most natural private-sector funders of a US open-source counterweight. Both have direct exposure to the risk that Qwen's momentum pulls researchers toward Huawei's hardware and software stack, an outcome that erodes the value of their own ecosystems. Funding open US models is cheap insurance against that scenario.

OpenAI's anticipated open model release is expected to be a single, tightly scoped artifact targeting one specific gap, whether ultra-long context, low-latency agent inference, or a particular reasoning model size, rather than a Qwen-style suite covering the full research stack. OpenAI has also signaled it will commit to genuinely permissive licensing, Apache or MIT, avoiding the usage restrictions and legal carve-outs embedded in Meta's Llama license terms. That licensing posture, which the leading Chinese labs have already adopted, is seen as a meaningful step toward simplifying adoption and closing the credibility gap with China's open-weight releases.