News

Cursor's Composer 2 is built on Kimi K2 with RL fine-tuning, not trained from scratch as implied

Mar 20, 2026

Key Points

  • Cursor's Composer 2 is built on the open-source Kimi K2 model with reinforcement learning fine-tuning, not trained from scratch; the discrepancy surfaced when a developer found the model identifier in the company's configuration.
  • Co-founder Lee Robinson acknowledged the approach, framing it as validation of open-source strategy, and claimed that only one quarter of the final model's compute came from the Kimi K2 base, with the remaining three quarters spent on RL optimization.
  • The reputational damage stems from ambiguous positioning around 'continual pre-training' rather than the technical strategy itself, which is industry-standard; licensing compliance remains unclear.

Summary

Cursor's Composer 2, announced yesterday, is built on Kimi K2 with reinforcement learning fine-tuning rather than trained from scratch, despite earlier framing that suggested otherwise. The discovery emerged when a developer inspecting Cursor's OpenAI base URL configuration spotted the model identifier kimi_k2_p5_rl, revealing the underlying base model.

Cursor co-founder Lee Robinson acknowledged the approach in a follow-up comment, framing it as validation of open-source strategy. He stated that Composer 2 started from an open-source base and that the company will pursue full pretraining in the future. By Robinson's account, only one quarter of the final model's compute came from the Kimi K2 base, with the remaining three quarters spent on RL optimization.

The framing matters because Cursor had positioned Composer 2 using language around "continual pre-training," which created ambiguity about whether the model was trained from scratch. A straightforward disclosure—that Cursor took a world-class open-source model and improved it for a specific domain—would likely have drawn minimal pushback. Instead, the oblique positioning triggered skepticism across developer communities.

The technical strategy itself is defensible. RL fine-tuning on open-source bases is a standard and effective approach across the industry; Anthropic-trained Claude models have reportedly been similarly fine-tuned by Chinese open-source labs, and the method allows targeted optimization on narrow tasks like coding without the full cost of pretraining. Cursor's reported performance gains on SWE-bench Multilingual (1%) and Terminal-Bench 2.0 (21%) reflect this trade-off: improvements that are modest relative to the compute invested, but tailored specifically to coding workflows.

Licensing compliance remains an open question. Cursor claims to be following license terms through its inference partner, and the full disclosure may already exist in the model card or terms of service without having been widely read. No concrete licensing violation has been documented.

The core tension is reputational rather than strategic. The company made a sensible engineering decision but presented it in a way that invited the opposite interpretation, generating unnecessary criticism. Despite the surrounding noise, Cursor's business metrics, growth in ARR and continued user expansion, remain the actual measure of competitive viability.