Crusoe acquires stealth AI inference startup Atapar to dominate GPU memory optimization
Aug 21, 2025 with Chase Lochmiller & Alon Yariv
Key Points
- Crusoe acquires Atapar, a stealth AI inference startup, to deploy ATOM, a memory virtualization layer that moves model weights and KV cache across GPU clusters to boost utilization.
- Doubling GPU utilization through better memory management cuts inference cost per token in half, the core economic lever for competing cloud providers.
- Atapar's co-founders include an engineer who led infrastructure at OpenAI for five years, bringing production-scale orchestration expertise to Crusoe's gigawatt-scale AI factory plans.
Summary
Crusoe has acquired Atapar, a stealth-stage AI infrastructure startup, to strengthen the memory optimization layer of its GPU cloud platform. Chase Lochmiller, Crusoe's CEO, and Atapar co-founder Alon Yariv announced the deal, with the full Atapar team joining Crusoe.
Memory as the inference bottleneck
AI inference infrastructure is badly underutilized because memory is the primary constraint. Models are roughly a thousand times larger than standard containerized applications, so loading one onto a GPU takes significant time. KV cache volumes—the inference equivalent of a user session—can run to gigabytes, compared to the kilobytes or megabytes of a typical consumer app session. That three-orders-of-magnitude gap in memory demand means GPU utilization collapses under real-world workloads that shift between models, prompt types, and resource demands.
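The episode doesn't give exact figures, but the gigabytes-per-session claim is easy to sanity-check with standard transformer arithmetic. A minimal sketch, using an illustrative Llama-2-7B-like configuration (32 layers, 32 KV heads, head dimension 128, fp16) that is our assumption rather than anything stated in the conversation:

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, dtype_bytes=2):
    """Per-sequence KV cache size: one K and one V tensor per layer,
    each of shape (num_kv_heads, seq_len, head_dim)."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * dtype_bytes

# Illustrative 7B-class config, fp16, 4096-token context
per_seq = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128, seq_len=4096)
print(per_seq / 2**30)  # 2.0 -> a single 4096-token session holds ~2 GiB of KV cache
```

A few concurrent long-context sessions like this can saturate an 80 GB accelerator's memory long before its compute is busy, which is the utilization collapse described above.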
Atapar built ATOM, a unified memory layer that virtualizes GPU memory across a cluster, moving model weights and KV cache assets fluidly and at high speed to wherever compute needs them. Alon describes it as the VMware equivalent for AI infrastructure—native virtualization, but for memory rather than compute.
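ATOM's internals aren't public, so the following is purely an illustrative toy, not Atapar's design: the core idea of memory virtualization is a mapping from logical assets (model weights, per-session KV cache) to physical locations, with migration handled beneath the runtime. All names here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Placement:
    device: str   # e.g. "gpu:0", "cpu"
    nbytes: int

class VirtualMemoryLayer:
    """Toy sketch: track where each logical asset lives in the cluster."""
    def __init__(self):
        self.table = {}

    def register(self, asset_id, device, nbytes):
        self.table[asset_id] = Placement(device, nbytes)

    def migrate(self, asset_id, target_device):
        # A real system would stream tensors over NVLink/RDMA at line rate;
        # this toy only updates the logical-to-physical mapping.
        self.table[asset_id].device = target_device

    def locate(self, asset_id):
        return self.table[asset_id].device

vm = VirtualMemoryLayer()
vm.register("llama-70b/weights", "gpu:0", 140 * 2**30)
vm.migrate("llama-70b/weights", "gpu:4")   # rebalance without restarting the runtime
print(vm.locate("llama-70b/weights"))      # gpu:4
```

The VMware analogy fits this shape: the runtime addresses logical assets, and the layer decides (and changes) where they physically reside.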
Why Crusoe acquired it
Doubling GPU utilization through better KV cache and model memory management cuts cost per token in half, according to Lochmiller. For a cloud provider competing on dollar-per-GPU-hour, that translates directly to margin. The acquisition is also a reliability play. Raw chip availability means little if data cannot move into the GPU efficiently or if workloads are unstable.
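The utilization-to-margin claim is simple arithmetic: cost per token scales inversely with utilization, holding the GPU-hour price fixed. The dollar figures and throughput below are illustrative assumptions, not numbers from the episode.

```python
def cost_per_token(gpu_hour_usd, tokens_per_hour_at_full_util, utilization):
    """Effective cost per token at a given fraction of peak throughput."""
    return gpu_hour_usd / (tokens_per_hour_at_full_util * utilization)

# Illustrative: $2.50/GPU-hour, 3.6M tokens/hour at full utilization
base    = cost_per_token(2.50, 3_600_000, utilization=0.30)
doubled = cost_per_token(2.50, 3_600_000, utilization=0.60)
print(doubled / base)  # ~0.5: doubling utilization halves cost per token
```

This is why memory management, not just chip supply, sets the floor on dollar-per-token pricing.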
Atapar officially supports Nvidia hardware today but is architecturally hardware-agnostic. It sits alongside the CUDA stack rather than modifying it, and it can plug into any inference runtime, including vLLM. That flexibility matters as Crusoe scales toward what Lochmiller describes as gigawatt-scale AI factories.
Inference cost and the role of software
Inference pricing will follow two curves simultaneously, in Lochmiller's view. One is a steady downward slope tied to Nvidia's silicon roadmap. The other comes from periodic step-change improvements in low-level software optimization. As hosting widely used open-source models commoditizes, memory optimization layers like ATOM become essential rather than optional.
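The two curves compose multiplicatively: a smooth hardware decline times discrete software step-changes. A minimal sketch with made-up rates (a 30% annual hardware decline and one utilization-doubling software release in year 2, both assumptions for illustration):

```python
def hardware_cost(year):
    # Illustrative: steady ~30% annual decline from new silicon generations
    return 0.70 ** year

def software_multiplier(year):
    # Illustrative step change: a utilization-doubling optimization ships in year 2
    return 0.5 if year >= 2 else 1.0

for year in range(4):
    relative_cost = hardware_cost(year) * software_multiplier(year)
    print(year, round(relative_cost, 3))
```

The software step in year 2 produces a one-time drop on top of the smooth slope, which is the shape Lochmiller describes.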
Alon built the company in roughly a year, starting from the conviction that inference—not training—is where AI economics actually get resolved. One of Atapar's co-founders led infrastructure at OpenAI for five years, which shaped the team's focus on the orchestration problems that only appear at production scale.