Integrating Model Context Protocol with Gemini: The Definitive Guide to Modern Tool Calling (Agentic-AI)

When it comes to tool-calling and agentic AI, the internet seems awash with tutorials for Anthropic’s Claude or the latest from OpenAI’s GPTs. But what if you want to ride the (less-documented) Gemini wave? If you’ve searched far and wide for a comprehensive guide on integrating the Model Context Protocol (MCP), Gemini, and modern tool schemas, only to find sparse blog posts tailored for other ecosystems, you’re not alone.

This tutorial fills that void, focusing on practical integration of Gemini and Model Context Protocol, from both server and client perspectives. We’ll draw on Node.js with TypeScript, @google/genai for Gemini and schema definition, and @modelcontextprotocol/sdk as our primary toolkit. By the end, you should be able to register tools server-side and invoke them client-side, all via MCP.

But before diving into code, let’s clarify a crucial foundation: what is Model Context Protocol? MCP is an emerging open protocol standard for AI tool-calling, defining how AI models (clients) can discover and call arbitrary server-side tools in a language- and vendor-agnostic way: it doesn’t matter what language your server-side tools are written in, or which vendor’s AI model is calling them. Think of it as the universal handshake between advanced AI clients and your server’s unique APIs or business logic. By using MCP, you unlock true interoperability: write a tool once, and access it from any AI that speaks the protocol.
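Concretely, MCP speaks JSON-RPC 2.0 under the hood: clients discover tools with a tools/list request and invoke them with tools/call. A tool invocation on the wire looks roughly like this (simplified sketch; the SDKs we use below handle all of this for you):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get-user-details",
    "arguments": { "phoneNumber": "+11234567890" }
  }
}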

Ready? Let’s dive in!

Setting Up Your AI Tooling Playground

Before we start, we need the ol’ ‘getting started’ boilerplate. This phase is all about laying the groundwork: initializing a project and installing the bare essentials.

Step 1: Initialize Your Node.js Project

Crack open a terminal and create a fresh directory:

mkdir gemini-mcp-tutorial && cd gemini-mcp-tutorial
npm init -y

Step 2: Install Required Dependencies

You’ll need just four core libraries:

  • @google/genai: Official Google AI SDK (includes Gemini, plus handy features like schema definitions for tool-calling).
  • @modelcontextprotocol/sdk: The heart of MCP communication (both server and client).
  • zod: Used by the MCP SDK to define tool input schemas, as we’ll do for our example tool.
  • (Optionally) google-libphonenumber, if you want to parse phone numbers as we will in our example tool.

Install them in a single swing:

npm install @google/genai @modelcontextprotocol/sdk zod google-libphonenumber

If you’re using TypeScript (recommended), also install typing essentials:

npm install -D typescript ts-node @types/node

And, as with any journey, don’t forget to wire up your usual Node.js/TypeScript configuration (tsconfig.json and so forth).
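If you want a starting point, here’s a minimal tsconfig.json matching this tutorial’s layout (src in, dist out); treat it as a sketch and tweak to taste:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "esModuleInterop": true,
    "strict": true,
    "rootDir": "src",
    "outDir": "dist"
  },
  "include": ["src"]
}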


Defining MCP Servers and Tools

With dependencies squared away, let’s architect both the server (which hosts our tools) and the client (which acts as a Gemini-empowered AI agent that calls them).

(A) Creating an MCP Server

Your MCP server will describe its tools and listen for tool-calling requests from clients. Let’s work from the skeleton up.

Start by defining the MCP server itself:

// src/mcp/server.ts
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';

const mcpServer = new McpServer({
  name: 'music-guys-mcp',
  version: '1.0.0',
  capabilities: {
    resources: {},
    tools: {},
  },
});

export default mcpServer;

Next, you’ll want to actually run this server and choose how it listens for requests. Here, we’ll use a simple stdio (Standard Input/Output) transport, which works beautifully for testing and local development:

// src/mcp/serverMain.ts
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import mcpServer from './server';

export const createServer = async () => {
  const transport = new StdioServerTransport();
  await mcpServer.connect(transport);
  console.info('MCP Server running on stdio');
};

createServer().catch((error) => {
  console.error('Fatal error in main():', error);
  process.exit(1);
});

Just like that, you’ve spun up an MCP-compliant tool server, ready for calls.

(B) Defining and Registering a Tool

Let’s register a real tool. Here’s a genuinely practical example: a tool that fetches user details by phone number. (We’ve stripped out some internal details, but you’re welcome to adapt as needed.)

// src/mcp/tools/getUser.ts
import { z } from 'zod';
import mcpServer from '../server';
import { PhoneNumberUtil } from 'google-libphonenumber';

mcpServer.tool(
  'get-user-details',
  "Returns user details according to their phone number, or 'undefined' if user doesn't exist.",
  {
    phoneNumber: z
      .string()
      .describe('User phone number including country code as registered.'),
  },
  async ({ phoneNumber }) => {
    try {
      const formattedPhone = phoneNumber.startsWith('+')
        ? phoneNumber
        : `+${phoneNumber}`;
      const phoneUtil = PhoneNumberUtil.getInstance();
      const parsedNum = phoneUtil.parse(formattedPhone);

      // Here you'll implement your user lookup logic (e.g., using your database)
      // const user = ...;

      // For demonstration:
      if (parsedNum.getNationalNumber() === 1234567890) {
        // Mocked user:
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify({
                id: 'user-id',
                name: 'Jane Doe',
                phone: formattedPhone,
              }),
            },
          ],
        };
      }
      return {
        content: [
          {
            type: 'text',
            text: 'User not found',
          },
        ],
      };
    } catch (e) {
      return {
        content: [
          {
            type: 'text',
            text: `Failed to parse phone number: ${phoneNumber}: ${e}`,
          },
        ],
      };
    }
  }
);

And wire up your tool handler so it loads when the server starts (add this to the serverMain.ts file we've just defined above):

// src/mcp/serverMain.ts (add at the top)
import './tools/getUser';

Now, run your server with ts-node src/mcp/serverMain.ts, and it will listen for tool calls from any MCP-compatible client!


Implementing the MCP Client

At this point, we’re halfway there. In this section, I’ll walk you through connecting Gemini (or any MCP-aware model) to your server.

(A) MCP Client Bootstrap

First, let’s lay down the scaffolding for establishing the connection to the server (and to Gemini itself):

// src/mcp/client.ts
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
import { GoogleGenAI } from '@google/genai';

export class MCPClient {
  private mcp: Client;
  private googleGenAI: GoogleGenAI;
  private tools: any[] = [];
  private transport: StdioClientTransport | null = null;

  constructor(apiKey: string) {
    this.googleGenAI = new GoogleGenAI({ apiKey });
    this.mcp = new Client({
      name: 'music-guys-mcp-client-cli',
      version: '1.0.0',
    });
  }

  // Connect through stdio to our running mcp server.
  async connectToServer(serverScriptPath: string) {
    const command = process.execPath;
    this.transport = new StdioClientTransport({
      command,
      args: [serverScriptPath],
    });
    await this.mcp.connect(this.transport);

    // Fetch available tools
    const toolsResult = await this.mcp.listTools();
    this.tools = toolsResult.tools.map(tool => {
      // Gemini expects an OpenAPI-style schema, so strip JSON-Schema-only
      // keys (like $schema) that may appear in the MCP tool's inputSchema.
      const { $schema, ...parameters } = (tool.inputSchema as any) ?? {};
      return {
        functionDeclarations: [{
          name: tool.name,
          description: tool.description,
          parameters,
        }],
      };
    });
    console.log('Available tools:', this.tools.map(t => t.functionDeclarations[0].name));
  }

  // Called via our chat loop; this is the main function that interfaces
  // between the tools and the Gemini model.
  async processQuery(query: string) {
    const contents: any[] = [{role: 'user', parts: [{text: query}]}];
    const finalText: string[] = [];

    // We send the list of tools we have available as part of the request.
    const response = await this.googleGenAI.models.generateContent({
      model: 'gemini-2.5-pro',
      contents,
      config: {tools: this.tools, maxOutputTokens: 1024},
    });

    // Handle and display the response, including any tool calls.
    for (const cand of response.candidates || []) {
      for (const part of cand.content?.parts || []) {
        if (part.text) {
          finalText.push(part.text);
          continue;
        }
        if (!part.functionCall) continue;

        const toolName = part.functionCall.name!;
        const toolArgs = part.functionCall.args as
          | { [x: string]: unknown }
          | undefined;
        console.log('toolName', toolName);
        console.log('toolArgs: ', toolArgs);

        // Ask the MCP server to execute the requested tool.
        const result = await this.mcp.callTool({
          name: toolName,
          arguments: toolArgs || {},
        });

        const resultText = (result.content as any)?.[0]?.text;
        if (resultText) {
          // Feed the tool output back into the conversation history.
          contents.push({
            role: 'model',
            parts: [{text: resultText}],
          });
          finalText.push(resultText);
        }

        // Call the LLM again so it can fold the tool output into its answer.
        const followUp = await this.googleGenAI.models.generateContent({
          model: 'gemini-2.5-pro',
          contents,
          config: {tools: this.tools, maxOutputTokens: 1024},
        });

        for (const candidate of followUp.candidates || []) {
          for (const p of candidate.content?.parts || []) {
            if (p.text) {
              finalText.push(p.text);
            }
          }
        }
      }
    }

    console.log(finalText.join('\n'));
    return finalText.join('\n');
  }
}

What is processQuery Doing?

This function is the heart of the Gemini+MCP client:
It orchestrates a conversation between the user, the Gemini model, and your set of tools (accessible via the Model Context Protocol).

TL;DR, the function does several key things:

  1. Submits a user query to Gemini (with a list of available tools).
  2. Inspects Gemini’s response to determine if a tool function needs to be called.
  3. Invokes local/server tools when Gemini requests it (via a functionCall).
  4. Feeds the tool results back into Gemini for a more complete or grounded answer.
  5. Displays the final answer(s) to the user.

Here's a step-by-step breakdown for a clearer understanding:

  1. Sending the User Query to Gemini
const contents: any[] = [{role: 'user', parts: [{text: query}]}];
const finalText: string[] = [];

const response = await this.googleGenAI.models.generateContent({
  model: 'gemini-2.5-pro',
  contents,
  config: {tools: this.tools, maxOutputTokens: 1024},
});
  • What’s happening here?
    You’re sending the user's question to Gemini’s generateContent() API.
  • Why the tools in config?
    This tells Gemini, “Hey, here are the functions (‘tools’) I can call if you need them.”
  • What does this let Gemini do?
    Gemini can, instead of answering directly, choose to call one of your tools if it thinks that’s appropriate (e.g. “get-user-details”).
  2. Handling the Gemini Response (Including Tool Calls)
for (const cand of response.candidates || []) {
  for (const part of cand.content?.parts || []) {
    ...
  }
}
  • What’s happening here?
    Gemini may return several “candidates” (possible responses). Each candidate has “parts” which are chunks of its response, which could be text, images, or function calls.
  • Why multiple loops?
    To handle all possible parts of all possible responses.
  3. Detecting Function Calls (Tool Requests)
if (!part.functionCall) continue;

const toolName = part.functionCall.name!;
const toolArgs = part.functionCall.args as { [x: string]: unknown } | undefined;
  • What’s happening here?
    If Gemini thinks the answer requires calling a tool, it will return a functionCall instead of just text.
  • toolName/toolArgs:
    These are the name and arguments for the requested tool.
  4. Calling the Server-Side Tool with Those Arguments
const result = await this.mcp.callTool({
  name: toolName,
  arguments: toolArgs || {},
});
  • What’s happening here?
    You forward the requested tool name and parsed arguments to the MCP server, which runs the matching registered tool and returns its result.
  • This is where your server, database, or API logic runs.
  5. Feeding the Tool Result Back to Gemini
const resultText = (result.content as any)?.[0]?.text;
if (resultText) {
  contents.push({
    role: 'model',
    parts: [{text: resultText}],
  });
  finalText.push(resultText);
}
  • What’s happening here?
    After getting the tool’s output, you add it to the contents. This simulates Gemini’s internal dialogue and gives it the additional “real” data it just requested. (A functionResponse-based alternative is sketched right after this list.)
  6. Asking Gemini for a Final Answer (Now Informed by the Tool Output)
const followUp = await this.googleGenAI.models.generateContent({
  model: 'gemini-2.5-pro',
  contents,
  config: {tools: this.tools, maxOutputTokens: 1024},
});
  • What’s happening here?
    Now that Gemini has the tool output, you ask for another response. Gemini can now integrate that tool data and produce a final, well-informed answer.
  7. Output the Final Result(s)
for (const candidate of followUp.candidates || []) {
  for (const p of candidate.content?.parts || []) {
    if (p.text) {
      finalText.push(p.text);
    }
  }
}
  • What’s happening here?
    Finally, you extract and print/return all the textual answers Gemini crafts after seeing the real data from your tool.
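As promised in step 5: instead of pushing the tool output back as plain model text, you can use the dedicated functionResponse part shape from @google/genai, which tells Gemini explicitly which function produced the data. A hedged sketch of that variant, as a drop-in replacement for step 5 (same toolName, part, and result variables):

// Hedged alternative to step 5: echo the model's functionCall, then answer
// it with a functionResponse part (shape per @google/genai's Part type).
contents.push({ role: 'model', parts: [{ functionCall: part.functionCall }] });
contents.push({
  role: 'user',
  parts: [{
    functionResponse: {
      name: toolName,                       // must match the called function
      response: { result: result.content }, // arbitrary JSON payload for the model
    },
  }],
});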

Why This Structure?

  • Two calls to Gemini:
    First, to see if a tool call is requested; second, to let Gemini integrate the tool output into its answer.
  • MCP Tool Calling:
    The magic of MCP+Gemini is this “AI calls your tools, you execute, and AI incorporates the result into its reasoning.”
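And nothing forces you to stop at exactly two calls. If Gemini wants to chain tools (call one, inspect the result, call another), the same pattern generalizes to a loop. Here’s a hedged, self-contained sketch (not part of the tutorial’s MCPClient; runToolLoop and its parameters are our invention) that keeps executing requested tools until Gemini answers in plain text:

// A hedged sketch: loop until Gemini stops requesting tools (or we hit a cap).
import { GoogleGenAI, Content } from '@google/genai';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';

async function runToolLoop(
  genAI: GoogleGenAI,
  mcp: Client,
  tools: any[],
  query: string,
  maxTurns = 5, // cap turns so a confused model can't loop forever
): Promise<string> {
  const contents: Content[] = [{role: 'user', parts: [{text: query}]}];

  for (let turn = 0; turn < maxTurns; turn++) {
    const response = await genAI.models.generateContent({
      model: 'gemini-2.5-pro',
      contents,
      config: {tools, maxOutputTokens: 1024},
    });
    const parts = response.candidates?.[0]?.content?.parts ?? [];
    const call = parts.find(p => p.functionCall)?.functionCall;

    // No function call requested: Gemini has its final, text-only answer.
    if (!call?.name) {
      return parts.map(p => p.text).filter(Boolean).join('');
    }

    // Execute the tool on the MCP server and feed the result back to Gemini.
    const result = await mcp.callTool({
      name: call.name,
      arguments: (call.args as Record<string, unknown>) ?? {},
    });
    contents.push({role: 'model', parts: [{functionCall: call}]});
    contents.push({
      role: 'user',
      parts: [{functionResponse: {name: call.name, response: {result}}}],
    });
  }
  return ''; // gave up after maxTurns
}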

(B) Running the Client Loop

Welcome to the interactive part. The chat loop lets you query the Gemini model, which will decide whether to call your MCP server’s tools as needed.

// src/mcp/runClient.ts
import { MCPClient } from './client';
// Ensure you have your Google API key ready
const API_KEY = process.env.GEMINI_API_KEY;
if (!API_KEY) {
  throw new Error('GEMINI_API_KEY environment variable is not set');
}

async function main() {
  const mcpClient = new MCPClient(API_KEY);
  // The stdio transport spawns this path with node, so it must point at the
  // compiled JavaScript (e.g. dist/mcp/serverMain.js after running npx tsc).
  await mcpClient.connectToServer(__dirname + '/serverMain.js');
  // Minimal chat/question loop
  process.stdin.setEncoding('utf-8');
  process.stdout.write("Ask your query or type 'quit':\n");

  process.stdin.on('data', async (input) => {
    const query = input.toString().trim();
    if (query.toLowerCase() === 'quit') {
      process.exit(0);
    }
    await mcpClient.processQuery(query);
    process.stdout.write("\nAsk next query or 'quit':\n");
  });
}
main();

Now, compile the project (npx tsc) and run your client with node dist/mcp/runClient.js (the stdio transport needs the compiled serverMain.js sitting next to the client script). Type in a phone number, or ask for your details (as the get-user-details tool expects), and watch as Gemini seamlessly calls your custom tool!
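One last aside: recent releases of @google/genai ship an experimental mcpToTool() helper that wraps a connected MCP client and performs the whole discover-call-respond dance for you. If your SDK version exports it, the manual orchestration in processQuery can collapse to something like this (hedged sketch; the helper is experimental, and askWithMcp is our own wrapper name):

// Hedged sketch: assumes your @google/genai version exports mcpToTool.
import { GoogleGenAI, mcpToTool } from '@google/genai';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';

async function askWithMcp(genAI: GoogleGenAI, mcpClient: Client, question: string) {
  // mcpToTool exposes the MCP client's tools to Gemini, and the SDK
  // executes any requested tool calls automatically.
  const response = await genAI.models.generateContent({
    model: 'gemini-2.5-pro',
    contents: question,
    config: {tools: [mcpToTool(mcpClient)]},
  });
  return response.text;
}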


Wrapping Up: What We’ve Built and Where You Can Go Next

Congratulations! You’ve now experienced end-to-end Model Context Protocol integration tailored to Gemini, spanning both a tool-hosting server and a tool-calling client. Feel free to expand your tool arsenal by adding more business logic and tool endpoints (CRUD ops, APIs, complex workflows). Have thoughts, questions, or wild new use cases? Drop a comment or issue below! Happy contextual hacking!