Transmission

How To Convert Any API to MCP with Zero Code and Zero Deployments Using API Agent


Every time you want to expose an API to an LLM, you write another MCP server. At Agoda, with hundreds of internal services each with its own schema and auth, "another MCP server" means hundreds of them.

We didn't do that. We built API Agent instead.

One server. Any API. Point it at a GraphQL endpoint or OpenAPI spec, send two headers from your MCP client, and it figures out the rest. No integration code, no new deployment per API.

github.com/agoda-com/api-agent


Why this keeps happening

Building an MCP server for an API is the same work every time. Read the schema, turn operations into tool definitions, handle auth, deploy it. The URL changes. The auth headers change. Everything else is identical.

MCP is clean and does what it says. The protocol isn't the problem. Nothing in the ecosystem stops you from doing the same structural work over and over, so that's what people do.

How it works

API Agent routes incoming requests by HTTP headers. Your MCP client sends X-Target-URL and X-API-Type with each request. The server uses those to find the right API, introspect the schema if it hasn't already, and return the right tool definitions.

You configure those headers once per API in your MCP client config (see below). Each named entry points to a different target API but the same running API Agent process. Add ten entries, get ten apparently-separate MCP servers, zero additional deployments.
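The routing idea can be sketched in a few lines of Python. This is an illustration of the mechanism, not API Agent's internals; every name here is an assumption:

```python
# Hypothetical sketch of per-request routing keyed by MCP headers.
schema_cache = {}  # target URL -> tool definitions, filled on first contact

def handle_request(headers: dict) -> dict:
    target = headers["X-Target-URL"]
    api_type = headers["X-API-Type"]  # e.g. "graphql"
    if target not in schema_cache:
        # Introspect once, then serve every later request from the cache.
        schema_cache[target] = introspect(target, api_type)
    return schema_cache[target]

def introspect(target: str, api_type: str) -> dict:
    # Placeholder: a real server would run a GraphQL introspection
    # query or fetch the OpenAPI spec here.
    return {"target": target, "type": api_type, "tools": []}
```

One process, many targets: each distinct X-Target-URL gets its own cache entry, which is what makes ten config entries behave like ten separate servers.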

For GraphQL it runs an introspection query at connection time and turns each operation into a callable tool. For REST it fetches the OpenAPI spec and does the same. The schemas are cached, so subsequent requests don't re-fetch. Nothing to configure by hand.
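For the GraphQL side, turning an introspection result into tool definitions is mostly a mapping exercise. A simplified sketch (the tool-definition shape here is illustrative, not the MCP wire format):

```python
def schema_to_tools(introspection: dict) -> list[dict]:
    """Map each top-level query field in a GraphQL introspection
    result to a simplified tool definition."""
    schema = introspection["__schema"]
    query_type = schema["queryType"]["name"]
    types = {t["name"]: t for t in schema["types"]}
    tools = []
    for field in types[query_type].get("fields", []):
        tools.append({
            "name": field["name"],
            "description": field.get("description") or "",
            "args": [a["name"] for a in field.get("args", [])],
        })
    return tools

# A tiny hand-written introspection result for illustration.
sample = {"__schema": {"queryType": {"name": "Query"}, "types": [
    {"name": "Query", "fields": [
        {"name": "characters", "description": "List characters",
         "args": [{"name": "page"}]},
    ]},
]}}
print(schema_to_tools(sample))
```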

Getting started

OPENAI_API_KEY=your_key uvx --from git+https://github.com/agoda-com/api-agent api-agent

Add to your MCP config:

{
  "mcpServers": {
    "rickandmorty": {
      "url": "http://localhost:3000/mcp",
      "headers": {
        "X-Target-URL": "https://rickandmortyapi.com/graphql",
        "X-API-Type": "graphql"
      }
    }
  }
}

APIs that need auth take a third header, X-Target-Headers, as a JSON string with whatever credentials the target API expects.
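For illustration, an authenticated entry might look like this. The URL and token are placeholders; note that X-Target-Headers is itself a JSON string, so the inner quotes are escaped:

```json
"internal-analytics": {
  "url": "http://localhost:3000/mcp",
  "headers": {
    "X-Target-URL": "https://analytics.internal.example.com/graphql",
    "X-API-Type": "graphql",
    "X-Target-Headers": "{\"Authorization\": \"Bearer <token>\"}"
  }
}
```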

The SQL layer

Real APIs return a lot of data. An internal analytics service might come back with thousands of rows. The model can't filter what it hasn't seen yet—the raw response hits the context window first, and if it's large enough, that's where the request dies.

API Agent puts DuckDB between the API response and the model. It's an in-process, in-memory SQL engine that handles JSON natively, so there's nothing extra to run or configure. When a tool call returns a large result set, the agent writes a SQL query to filter and aggregate it before passing anything to the model. Ask for the top 10 characters by episode count and you get exactly that, not the full dataset.
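The shape of that step can be sketched with Python's stdlib sqlite3 standing in for DuckDB (the data and column names are made up; DuckDB additionally ingests JSON directly, which sqlite3 does not):

```python
import json
import sqlite3

# A stand-in for a large raw API response: 1,000 rows of JSON.
response = json.dumps(
    [{"name": f"char{i}", "episode_count": i % 40} for i in range(1000)]
)

conn = sqlite3.connect(":memory:")  # in-process, in-memory, nothing to deploy
conn.execute("CREATE TABLE results (name TEXT, episode_count INTEGER)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [(r["name"], r["episode_count"]) for r in json.loads(response)],
)

# The agent-written query: aggregate BEFORE anything reaches the model.
top10 = conn.execute(
    "SELECT name, episode_count FROM results "
    "ORDER BY episode_count DESC, name LIMIT 10"
).fetchall()
print(len(top10))  # 10 rows cross the boundary, not 1,000
```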

This also handles things the API itself doesn't support: joining results across multiple endpoints, ranking, aggregating over paginated responses. We picked SQL over something like Python because it's sandboxed by design. No network calls, no filesystem access, no side effects. And LLMs write decent SQL.

Recipes

The first time you ask something complex—say, "compare alive vs. dead characters by species, only for species with more than 10 characters"—the agent works through it: figures out which API calls to make, fetches the data, writes SQL, returns the answer. That reasoning takes time and tokens.

After it succeeds, it extracts a parameterized version of that workflow and saves it as a "recipe," exposed as a new MCP tool prefixed with r_. The next time a similar question comes in, the agent runs the recipe directly. No LLM reasoning step, just execution. The results are also more consistent since the same query runs each time rather than being reconstructed from scratch.
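Conceptually, a recipe is just a named, parameterized workflow that can be replayed. A toy sketch, with everything but the r_ prefix invented for illustration (a real implementation would bind parameters safely rather than interpolate them into SQL):

```python
# Hypothetical recipe store: tool name -> parameterized SQL template.
recipes = {}

def save_recipe(name: str, sql_template: str) -> None:
    # Saved recipes surface as new MCP tools with an r_ prefix.
    recipes[f"r_{name}"] = sql_template

def run_recipe(name: str, **params) -> str:
    # Direct execution: no reasoning step, the same query every time.
    return recipes[name].format(**params)

save_recipe(
    "species_by_status",
    "SELECT species, status, COUNT(*) AS n FROM characters "
    "GROUP BY species, status HAVING COUNT(*) > {min_count}",
)
print(run_recipe("r_species_by_status", min_count=10))
```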

Safety

The server is read-only by default. Mutations are blocked unless you explicitly allowlist specific operations in config. If your LLM hallucinates a write operation against an API it shouldn't be touching, nothing happens.
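The check itself is simple to picture. A sketch of the policy, with the operation and allowlist entry invented for illustration:

```python
# Hypothetical allowlist: mutations pass only if explicitly named in config.
ALLOWED_MUTATIONS = {"updateBookingNote"}

def is_permitted(operation_type: str, operation_name: str) -> bool:
    """Read-only by default: queries always pass, mutations must be allowlisted."""
    if operation_type == "query":
        return True
    return operation_name in ALLOWED_MUTATIONS

print(is_permitted("query", "characters"))          # True
print(is_permitted("mutation", "deleteCharacter"))  # False: blocked by default
```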


github.com/agoda-com/api-agent