OpenAI's Nikunj Handa on the Responses API's new MCP support and why remote MCP servers are a breakthrough for agentic apps
May 21, 2025 with Nikunj Handa
Key Points
- OpenAI's Responses API now supports remote MCP servers, letting developers connect to Stripe, Shopify, Asana, Linear, and HubSpot in four lines of code.
- MCP servers designed for LLMs reduce agent errors by combining multiple API calls into single functions, improving reliability in agentic workflows.
- OpenAI is opening its code interpreter tool through the Responses API, exposing the Python execution environment that powers o3's image reasoning capabilities.
Summary
OpenAI's Nikunj Handa, a product manager focused on the developer API, joined the show to announce remote MCP server support inside the Responses API, the successor to Chat Completions that OpenAI had shipped roughly two months earlier.
The Responses API was designed from the ground up for reasoning models like o3 and o4-mini. Unlike Chat Completions, which required developers to manually loop requests until a task completed, the Responses API accepts a task and a set of tools in a single call and runs autonomously until it has a result. The addition of remote MCP support means developers can now connect that loop directly to hosted third-party servers from Stripe, Shopify, Asana, Linear, and HubSpot, among others, in roughly four lines of code.
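A minimal sketch of what that roughly-four-line hookup looks like, following the tool shape in OpenAI's Responses API documentation. The server URL below is a placeholder, and the payload is only constructed (not sent), so the exact field values should be checked against the current API reference.

```python
# A remote MCP server is attached as a tool of type "mcp" in the
# Responses API request. This builds the tool definition only; the
# commented-out call shows how it would be used with the openai SDK.
mcp_tool = {
    "type": "mcp",                            # remote MCP server tool
    "server_label": "shopify",                # label the model sees
    "server_url": "https://example.com/mcp",  # placeholder endpoint
    "require_approval": "never",              # skip per-call approvals
}

# With the openai SDK installed and an API key configured:
#   client.responses.create(model="gpt-4.1",
#                           input="Find my three best-selling products",
#                           tools=[mcp_tool])
```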
Why MCP over a direct API call
The honest case for MCP is not purely about developer time, though Handa acknowledges that collapsing two hours of integration work into a few lines matters at the margin. The deeper argument is that companies building MCP servers are designing their interfaces specifically for LLMs — combining multiple underlying API calls into single functions and returning tightly scoped responses rather than large JSON objects. That design discipline reduces the number of steps an agent has to take and lowers the probability of the model taking a wrong turn and needing to backtrack. Because LLMs are non-deterministic, curated tool surfaces meaningfully improve reliability in agentic workflows.
Handa also flags that the shift from local-only MCP to remotely hosted servers, enabled by the streamable HTTP transport, is what unlocks the ecosystem at scale. Local-only MCP, where each server had to run on the developer's own machine, was a meaningful constraint; remote hosting removes it.
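Under the streamable HTTP transport, an MCP client POSTs JSON-RPC 2.0 messages to a single server endpoint and receives either a JSON response or a server-sent event stream. A minimal sketch of the request body for listing a remote server's tools (the endpoint URL is a placeholder, and only the body is constructed here):

```python
import json

# tools/list is the MCP method a client uses to discover what a
# server exposes; messages follow JSON-RPC 2.0 framing.
list_tools = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}
body = json.dumps(list_tools)
# e.g. POST https://example.com/mcp with Content-Type: application/json
```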
Code interpreter and o3's image reasoning
Separately, OpenAI is also opening up its code interpreter tool — the Python execution environment — through the Responses API. Handa explains that o3's image analysis capability is not purely neural: when the model receives an image, it writes Python to crop, zoom, invert, and manipulate it, then runs web searches on top of that. The reasoning traces visible to users reflect actual Python execution. The code interpreter tool now makes that same capability available to developers building on the API.
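As a sketch of how a developer would switch that same capability on, here is the code interpreter tool definition following the shape in OpenAI's Responses API documentation at the time of writing; treat the exact field names as an assumption and verify against the current reference.

```python
# The code interpreter runs in a sandboxed container; "auto" asks the
# API to provision one on demand. Built but not sent here.
ci_tool = {
    "type": "code_interpreter",
    "container": {"type": "auto"},
}

# With the openai SDK:
#   client.responses.create(model="o3",
#                           input="Zoom into the chart and read the axis labels",
#                           tools=[ci_tool])
```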
Pricing
Using MCP through the Responses API carries no separate charge; developers pay only for the tokens the model consumes.
Internal Codex adoption
On OpenAI's own use of Codex, Handa says engineers across the company are using it continuously for tasks ranging from fixing error messages to more complex work. His own use case is documentation — flagging typos or heading changes, submitting the task, and letting Codex open a pull request without pulling in a colleague.