Remote MCP Servers, Code Interpreter, and Image Generation in the Responses API

OpenAI Responses API

The Responses API now gives developers and businesses built-in support for Code Interpreter, image generation, and remote MCP servers.

The Responses API, OpenAI’s primary API and the foundation for building agentic applications, is getting additional built-in tools today. Alongside image generation, Code Interpreter, and improved file search, it now supports any remote Model Context Protocol (MCP) server. These tools are compatible with the OpenAI o-series reasoning models, the GPT-4.1 series, and the GPT-4o series.

The Responses API now allows o3 and o4-mini to call tools and functions directly within their chain of thought, producing more relevant and contextually rich responses. By preserving reasoning tokens across requests and tool calls, o3 and o4-mini in the Responses API improve model intelligence while lowering cost and latency for developers.

The Responses API, a foundational component for building agentic systems, has seen substantial enhancements. Since its launch in March 2025, the API has been used by hundreds of thousands of developers, processing trillions of tokens across a range of agentic applications, including education assistants, market intelligence agents, and coding agents.

The latest updates improve the functionality and reliability of agentic systems built on the Responses API by introducing new features and built-in tools.

New Built-in Tools in the Responses API

The Responses API now has a number of new built-in tools:

Support for Remote MCP Servers

The API can now connect to tools hosted on any remote Model Context Protocol (MCP) server. MCP is an open protocol that standardizes how applications provide context to Large Language Models (LLMs). With MCP server support, developers can connect OpenAI models to tools on a variety of popular platforms, including Cloudflare, HubSpot, Intercom, PayPal, Plaid, Shopify, Stripe, Square, Twilio, and Zapier, with very little code. To support the ecosystem and advance the standard, OpenAI has also joined the MCP steering committee.
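For illustration, here is a minimal sketch using OpenAI’s Python SDK and the documented "mcp" tool type; the server label and URL below are hypothetical placeholders:

```python
from openai import OpenAI

client = OpenAI()

# Attach a remote MCP server as a built-in tool. The model can then
# discover and call whatever tools that server exposes.
response = client.responses.create(
    model="gpt-4.1",
    tools=[
        {
            "type": "mcp",
            "server_label": "example_store",               # arbitrary label (hypothetical)
            "server_url": "https://mcp.example.com/sse",   # hypothetical MCP endpoint
            "require_approval": "never",                   # skip per-call approval prompts
        }
    ],
    input="Which tools does this MCP server expose?",
)

print(response.output_text)
```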

Image Generation

gpt-image-1, OpenAI’s latest image generation model, is now available to developers as a tool within the Responses API. The tool supports multi-turn edits, which enable granular, step-by-step refinement of an image through prompts, and real-time streaming of previews while the image is being generated. While image generation is also available through the standalone Images API, exposing it as a tool inside the Responses API is new. Among the reasoning models, only o3 currently supports this tool.
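A short sketch of calling the tool from the Python SDK, assuming the documented "image_generation" tool type and the "image_generation_call" output item it produces:

```python
import base64

from openai import OpenAI

client = OpenAI()

# Let the model invoke the built-in image generation tool.
response = client.responses.create(
    model="gpt-4.1",
    input="Generate an image of a lighthouse at dawn, watercolor style.",
    tools=[{"type": "image_generation"}],
)

# The generated image comes back base64-encoded on the tool-call output item.
images = [
    item.result for item in response.output
    if item.type == "image_generation_call"
]
if images:
    with open("lighthouse.png", "wb") as f:
        f.write(base64.b64decode(images[0]))
```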

Code Interpreter

This tool is now available in the Responses API. The Code Interpreter helps with data analysis, solving complex math and coding problems, and enabling models to understand and manipulate images in depth, a process known as “thinking with images.” Using the Code Interpreter within their chain of thought has helped models like o3 and o4-mini perform better on benchmarks such as Humanity’s Last Exam.
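As a rough example, assuming the documented "code_interpreter" tool type with an auto-managed container:

```python
from openai import OpenAI

client = OpenAI()

# Give the model a sandboxed Python environment it can use while reasoning.
response = client.responses.create(
    model="o4-mini",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
    instructions="Write and run Python code to answer the question.",
    input="What is the standard deviation of [3, 7, 7, 19]?",
)

print(response.output_text)
```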

Improvements to File Search

File search has been available since the API’s March 2025 launch, but it gains new capabilities. Developers can use the file search tool to pull relevant document chunks into the model’s context based on a user query. The updates add support for searching across multiple vector stores and for attribute filtering with arrays.
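A sketch of both updates in one request, assuming the documented "file_search" tool shape; the vector store IDs and the "region" attribute here are hypothetical:

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    input="Summarize our Q1 returns policy.",
    tools=[
        {
            "type": "file_search",
            # Search across more than one vector store (IDs are hypothetical).
            "vector_store_ids": ["vs_policies", "vs_faqs"],
            # Attribute filter: match documents whose "region" is us OR eu.
            "filters": {
                "type": "or",
                "filters": [
                    {"type": "eq", "key": "region", "value": "us"},
                    {"type": "eq", "key": "region", "value": "eu"},
                ],
            },
        }
    ],
)

print(response.output_text)
```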

These tools are compatible with the GPT-4o series, the GPT-4.1 series, and the OpenAI o-series reasoning models (o1, o3, o3-mini, and o4-mini; see Cost and Availability). With a single API request, developers can build more capable agents using these built-in tools. Models that call multiple tools while reasoning have shown markedly better tool-calling performance on industry-standard benchmarks, and the ability of o3 and o4-mini to call tools and functions directly within their chain of thought yields more contextually rich and relevant responses.

Additionally, preserving reasoning tokens across requests and tool calls improves model intelligence while lowering latency and cost.

The Responses API’s New Features

Along with the new tools, developers and businesses can now take advantage of additional features that improve reliability, visibility, and privacy:

Background Mode: This feature lets developers handle long-running tasks asynchronously and more reliably. Answering complex problems with reasoning models can take several minutes, and background mode helps avoid timeouts and network interruptions. Developers can either poll these background objects to check whether they are complete, or start streaming events to see the latest state. Similar functionality powers agentic products such as Operator, Codex, and deep research.
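A minimal polling sketch, assuming the background=True flag and the responses.retrieve endpoint as documented:

```python
import time

from openai import OpenAI

client = OpenAI()

# Kick off a long-running reasoning task without holding the connection open.
response = client.responses.create(
    model="o3",
    input="Draft a detailed migration plan from REST to gRPC for a large service.",
    background=True,
)

# Poll the background object until it reaches a terminal state.
while response.status in ("queued", "in_progress"):
    time.sleep(2)
    response = client.responses.retrieve(response.id)

print(response.status)
print(response.output_text)
```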

Reasoning Summaries: The API can now produce concise natural-language summaries of the model’s internal chain of thought. This makes debugging and auditing easier and helps developers build better end-user experiences; similar functionality exists in ChatGPT. Reasoning summaries are offered at no additional cost.
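A small sketch requesting a summary alongside the answer, assuming the documented reasoning.summary parameter:

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="o4-mini",
    input="How many prime numbers are there between 10 and 30?",
    reasoning={"effort": "medium", "summary": "auto"},  # ask for a reasoning summary
)

# Reasoning summaries arrive as summary parts on the reasoning output item.
for item in response.output:
    if item.type == "reasoning":
        for part in item.summary:
            print(part.text)

print(response.output_text)
```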

Encrypted Reasoning Items: Customers eligible for Zero Data Retention (ZDR) can now reuse reasoning items across API requests as encrypted reasoning items, none of which are stored on OpenAI’s servers. For models such as o3 and o4-mini, reusing reasoning items between function calls improves intelligence, reduces token usage, and raises cache hit rates, which in turn lowers latency and cost.
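A sketch of the round trip, assuming the documented store=False and include=["reasoning.encrypted_content"] parameters:

```python
from openai import OpenAI

client = OpenAI()

# With store=False (as under ZDR), request the encrypted reasoning content
# so it can be handed back on the next request instead of being stored.
first = client.responses.create(
    model="o4-mini",
    input="Plan the steps to parse and validate a CSV upload.",
    store=False,
    include=["reasoning.encrypted_content"],
)

# Pass the prior output items (including encrypted reasoning) back in,
# preserving the model's reasoning context without server-side storage.
follow_up = client.responses.create(
    model="o4-mini",
    input=first.output + [{"role": "user", "content": "Now add error handling."}],
    store=False,
    include=["reasoning.encrypted_content"],
)

print(follow_up.output_text)
```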

Cost and Availability

These new features and tools are all available now. They are supported by the OpenAI o-series reasoning models (o1, o3, o3-mini, and o4-mini) as well as the GPT-4o and GPT-4.1 series. However, within the reasoning series, only o3 supports the image generation tool.

Pricing for the existing tools is unchanged. Pricing for the new tools is as follows:

  • Image generation: $5.00/1M text input tokens, $10.00/1M image input tokens, and $40.00/1M image output tokens, with a 75% discount on cached input tokens.
  • Code Interpreter: $0.03 per container.
  • File search: $2.50 per 1,000 tool calls and $0.10/GB of vector storage per day.
  • Remote MCP server tool: no additional charge for calling the tool itself; developers pay only for the output tokens from the API.