Best Open-Source Language Models for Tool Calling in 2025 (With Ollama Setup Guide)
As AI continues to evolve, developers are pushing language models beyond simple text generation. One key feature in high demand is tool calling — the ability for a model to select and execute predefined functions based on natural language input.
While GPT-4 and Claude 3 handle this with ease, open-source alternatives have traditionally lagged behind. But with advances in models like OpenChat, DeepSeek, and Qwen, open-source LLMs are now catching up.
So, which open-source models are actually good at tool calling in 2025? And how do you run them locally using Ollama? Let’s break down the problem, explore top model choices, and offer a step-by-step solution.
1. The Problem: Tool Calling in Open-Source LLMs
Tool calling requires structured reasoning: the model must parse a user's request, map it to a tool, and generate the correct arguments in a predefined schema (usually JSON).
Challenges include:
- Hallucinating tool names or incorrect parameters
- Malformed JSON output
- Lack of structured training for function invocation
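These failure modes can be caught before a tool is ever executed by adding a thin validation layer between the model and your functions. A minimal Python sketch, assuming a hypothetical tool registry (`TOOLS`) that mirrors the weather/news examples used later in this article:

```python
import json

# Hypothetical registry: tool name -> the exact parameter names it accepts.
TOOLS = {"get_weather": {"city"}, "get_news": {"topic"}}

def validate_tool_call(raw: str):
    """Reject malformed JSON, hallucinated tool names, and bad parameters."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None, "malformed JSON"
    name = call.get("tool")
    if name not in TOOLS:
        return None, f"unknown tool: {name!r}"
    params = call.get("parameters", {})
    if set(params) != TOOLS[name]:
        return None, f"bad parameters for {name}: {sorted(params)}"
    return call, None

# A well-formed call passes; a hallucinated tool name is rejected.
good, err = validate_tool_call('{"tool": "get_weather", "parameters": {"city": "Chicago"}}')
bad, err2 = validate_tool_call('{"tool": "launch_rocket", "parameters": {}}')
```

Running validation on every model response turns silent failures into explicit, loggable errors.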
As one developer put it: “GPT-4 can do it, but I want an open-source option. Anything close?”
2. Understanding Tool Calling
Tool calling enables an LLM to:
- Recognize user intent
- Select an appropriate tool/function
- Return structured arguments (e.g., JSON)
Use cases include AI assistants, coding agents, web automation, and chatbot plugins.
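The three steps above map naturally onto a small dispatcher: the model handles intent recognition and tool selection, and the host application routes the structured arguments to a real function. A sketch with hypothetical stub implementations:

```python
import json

# Hypothetical local implementations backing the tools.
def get_weather(city: str) -> str:
    return f"72°F and sunny in {city}"  # stub; a real app would call a weather API

def get_news(topic: str) -> str:
    return f"Top headlines about {topic}"  # stub

DISPATCH = {"get_weather": get_weather, "get_news": get_news}

def run_tool_call(model_output: str) -> str:
    """Take the model's JSON tool call and execute the matching function."""
    call = json.loads(model_output)
    func = DISPATCH[call["tool"]]
    return func(**call["parameters"])

result = run_tool_call('{"tool": "get_weather", "parameters": {"city": "Chicago"}}')
```

The model never executes anything itself; it only emits JSON, and your code decides what actually runs.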
3. Best Open-Source Models for Tool Calling
✅ OpenChat 3.5 / 3.6
Strengths: Instruction-following, low hallucination, schema-adherence
Command: ollama run openchat
✅ DeepSeek-V2
Strengths: Code and function reasoning, nested tool support
Command: ollama run deepseek-v2
✅ Qwen1.5 / Qwen2
Strengths: Multilingual, accurate JSON handling
Command: ollama run qwen
✅ CodeGemma / Code Llama
Strengths: Schema parsing, code structure understanding
Note: Best for IDEs and developer tools; may require schema priming.
✅ Phi-3
Strengths: Works on 8GB RAM, reliable for simple calls
Command: ollama run phi3
4. Tips to Improve Tool-Calling Accuracy
- Use a strong system prompt: “You are an AI assistant. Only respond in valid JSON using the tools provided.”
- Define tools with clarity:
[
  {
    "name": "get_weather",
    "description": "Returns weather info",
    "parameters": { "city": "string" }
  }
]
- Fine-tune using LoRA: If your use case is specialized, fine-tuning yields higher precision.
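The first two tips can be combined in code: embed the tool definitions directly in the system prompt (schema priming), so the model sees both the instruction and the exact schema in one place. A minimal sketch; the prompt wording and tool list are illustrative:

```python
import json

tools = [
    {
        "name": "get_weather",
        "description": "Returns weather info",
        "parameters": {"city": "string"},
    },
]

def build_system_prompt(tools: list) -> str:
    """Embed the tool schema in the system prompt (schema priming)."""
    return (
        "You are an AI assistant. Only respond in valid JSON using the tools provided.\n"
        "Tools:\n" + json.dumps(tools, indent=2)
    )

prompt = build_system_prompt(tools)
```

Generating the prompt from the same tool list your dispatcher uses keeps the schema the model sees and the schema you validate against from drifting apart.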
5. How to Run These Models with Ollama
- Install Ollama:
  curl -fsSL https://ollama.com/install.sh | sh
- Run a model:
  ollama run openchat
  ollama run deepseek-coder
  ollama run phi3
- Test with a sample function call and check the output.
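The test step can be scripted against Ollama's local REST API. A minimal sketch, assuming an Ollama server running on the default localhost:11434 and using the /api/chat endpoint with JSON-output mode ("format": "json"); the model name and prompts are illustrative:

```python
import json
import urllib.request

def build_request(model: str, system_prompt: str, user_msg: str) -> dict:
    # "format": "json" asks Ollama to constrain the reply to valid JSON.
    return {
        "model": model,
        "stream": False,
        "format": "json",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_msg},
        ],
    }

def chat(payload: dict) -> str:
    """POST to a locally running Ollama server (default port 11434)."""
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]

payload = build_request("openchat", "Only respond in valid JSON.", "Weather in Chicago?")
# reply = chat(payload)  # uncomment once `ollama run openchat` is available locally
```

No API key is involved; the request never leaves your machine.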
6. Community Insights
- OpenChat 3.5 is widely praised for accurate tool selection.
- DeepSeek-V2 handles structured multi-parameter functions well.
- Phi-3 works surprisingly well on entry-level hardware.
- Qwen1.5 needs more memory but delivers solid accuracy.
Recommended toolkits to pair with these models include CrewAI, LangGraph, and OpenDevin.
7. Conclusion
Tool calling is no longer limited to proprietary models. In 2025, open-source LLMs like OpenChat, DeepSeek, and Qwen provide reliable function calling, especially when paired with clear prompts and local runners like Ollama.
Model recommendations based on use case:
- For lightweight usage: Phi-3 or DeepSeek-Coder 1.3B
- For accurate JSON output: OpenChat 3.6 or DeepSeek-V2
- For complex apps: Qwen1.5 or CodeGemma (with more RAM)
All you need is Ollama to get started with these models locally — no API key or cloud required.
8. Sample Prompt for Testing
System Prompt:
You are an AI agent. Only use the tools provided below. Return a JSON object with the selected tool and parameters.
Tools:
[
  {
    "name": "get_weather",
    "description": "Gets the weather for a specific city",
    "parameters": { "city": "string" }
  },
  {
    "name": "get_news",
    "description": "Fetches top news headlines based on a topic",
    "parameters": { "topic": "string" }
  }
]
User: What’s the weather like in Chicago today?
Expected Output:
{
  "tool": "get_weather",
  "parameters": {
    "city": "Chicago"
  }
}
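When scripting this check, compare parsed JSON objects rather than raw strings, so differences in whitespace or key order don't cause false failures. A small sketch:

```python
import json

expected = {"tool": "get_weather", "parameters": {"city": "Chicago"}}

def matches_expected(model_reply: str) -> bool:
    """Compare as parsed objects so formatting differences don't matter."""
    try:
        return json.loads(model_reply) == expected
    except json.JSONDecodeError:
        return False

# Same object, different formatting and key order: still a match.
ok = matches_expected('{ "parameters": {"city": "Chicago"}, "tool": "get_weather" }')
```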
FAQ
What is tool calling in LLMs?
Tool calling allows a language model to choose and invoke a predefined function (like getting the weather or accessing data) by formatting input parameters correctly — typically in JSON.
Which open-source models are best for tool calling?
Top models include OpenChat 3.5/3.6, DeepSeek-V2, Qwen1.5, CodeGemma, and Phi-3, depending on your hardware and use case.
Can I run these models locally?
Yes. You can run these models locally using Ollama with a simple command-line install, even on modest hardware.