Developing a Resilient Multi-Tool Gemma 4 Agent

By Matthew Mayo on May 22, 2026 in Artificial Intelligence

In this guide, we’ll walk you through the process of enhancing a basic tool-calling script, turning it into a resilient agent capable of managing unexpected errors. This transformation will address common pitfalls like tool malfunctions, erroneous model outputs, and service unavailability.

The scope of this article covers:

Structuring an iterative agent loop with safeguards against excess iterations.
Identifying and managing four primary types of errors that can arise during tool calls.
Creating intuitive error messages that help the model learn how to recover efficiently, minimizing wasted steps.

Introduction

In a prior tutorial, we connected Gemma 4 to several Python functions through Ollama’s tool-calling API. This setup enabled a basic single-turn dispatcher: the model selects a tool, the script executes it, and results are returned. While functional, it barely scratches the surface of what qualifies as a bona fide agent.

What differentiates a basic tool-calling demonstration from a fully-fledged agent is the capacity to manage failures effectively. Tools encounter issues. Whether it's the model misidentifying a function, sending inappropriate data types, or querying a city that's not in the lookup table, things can—and will—go awry. If an upstream API times out or a required parameter is missing, the previous incarnation of our script would either crash or rely on a rudimentary try/except clause that ultimately surrendered. That approach is adequate for demonstration purposes but insufficient for long-term operation.

This article rebuilds the agent on the assumption that failures are the norm, and details how to recover from them gracefully. The approach is straightforward: catch errors at the tool boundary (the point where the script executes a tool call), translate them into actionable messages the model can interpret, feed those messages back to the model, and let it choose whether to retry, work around the problem, or tell the user what failed. We will also implement a structured, iterative agent loop with a hard cap on the number of iterations.
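As a rough illustration of that approach, the sketch below wraps a tool call so that every failure comes back as a readable string rather than an exception. The `TOOLS` registry, the `execute_tool` helper, the trimmed-down `get_city_population`, and the error wording are all hypothetical stand-ins rather than the article's actual script. The failure cases it covers are the ones mentioned in the introduction (an unknown tool name, unusable arguments, and a lookup that fails inside the tool).

```python
import json


def get_city_population(city: str) -> int:
    """Toy tool used only for this illustration (hypothetical data subset)."""
    populations = {"london": 8_982_000, "tokyo": 13_960_000}
    key = city.strip().lower()
    if key not in populations:
        raise ValueError(f"Unknown city '{city}'. Known cities: {sorted(populations)}")
    return populations[key]


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_city_population": get_city_population}


def execute_tool(name, arguments):
    """Run one tool call and always return a string the model can read."""
    # Case 1: the model named a tool that does not exist.
    if name not in TOOLS:
        return f"ERROR: unknown tool '{name}'. Available tools: {sorted(TOOLS)}."

    # Case 2: the arguments arrived as a string that is not valid JSON.
    if isinstance(arguments, str):
        try:
            arguments = json.loads(arguments)
        except json.JSONDecodeError:
            return f"ERROR: arguments for '{name}' were not valid JSON."
    if not isinstance(arguments, dict):
        return f"ERROR: arguments for '{name}' must be a JSON object."

    try:
        result = TOOLS[name](**arguments)
    except TypeError as exc:
        # Case 3: usually a missing or unexpected parameter name.
        return f"ERROR: bad arguments for '{name}': {exc}. Check the parameter names and retry."
    except Exception as exc:
        # Case 4: the tool ran but failed (for example, a city not in the table).
        return f"ERROR: '{name}' failed: {exc}"

    return json.dumps({"result": result})
```

Because the wrapper never raises, the calling loop can append whatever it returns to the message history and let the model decide what to do next.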

You can access the full script here. This article highlights the key components that matter in this process.

Rethinking the Tool Loop

The initial dispatcher executed a single round: send the user's query, receive the model's tool calls, run them, and return the resulting output. This one-shot interaction suffices when the model gets everything right on its first response, but it leaves no room for correction when errors occur. If a tool fails, the model has just one chance to react before the process concludes. If it wants to invoke another tool based on the initial result, the chance is lost, because the script has already finished.

An effective agent loop must be iterative. The fundamental structure is simple:

Transmit the current message history to the model.
If the model generates tool calls, execute them, append the results to the history, and continue looping.
If a plain text response is generated, that represents the conclusion. Return the answer.
Set a limit at MAX_ITERATIONS to prevent a perplexed model from unnecessarily exhausting CPU resources.

The last point is vital. Smaller models can get trapped in loops, repeatedly invoking the same tool or oscillating between two options. There's nothing more frustrating than returning to a terminal to find your laptop’s fans raging because Gemma decided to check the weather in London for the umpteenth time.

Here’s a look at how the loop is structured:

def run_agent(user_query):
    messages = [{"role": "user", "content": user_query}]

    for iteration in range(1, MAX_ITERATIONS + 1):
        payload = {
            "model": MODEL_NAME,
            "messages": messages,
            "tools": available_tools,
            "stream": False,
        }

        print(f"[EXECUTION — iteration {iteration}]")
        print("  ● Querying model...\n")

        try:
            response_data = call_ollama(payload)
        except Exception as e:
            print(f"  └─ [ERROR] Error calling Ollama API: {e}")
            print(f"  └─ Make sure Ollama is running and {MODEL_NAME} is pulled.")
            return

        message = response_data.get("message", {})
        tool_calls = message.get("tool_calls") or []

        # Branch A: the model wants to use tools
        if tool_calls:
            print(f"[TOOL EXECUTION — {len(tool_calls)} call(s)]")
            messages.append(message)
            tool_messages = print_tool_calls(tool_calls)
            messages.extend(tool_messages)
            print()
            continue

        # Branch B: the model produced a final answer
        print("[RESPONSE]")
        print(message.get("content", "") + "\n")
        return

    # Safety rail: we exhausted MAX_ITERATIONS without a final answer
    print("[RESPONSE]")
    print(
        f"Hit the {MAX_ITERATIONS}-iteration cap without a final answer. "
        "This usually means the model is stuck in a tool-calling loop. "
        "Try simplifying the query.\n"
    )
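The loop above relies on a `call_ollama` helper that this excerpt does not show. A minimal sketch of what such a helper might look like follows, assuming Ollama is serving on its default local port (11434) and using only the standard library. The `build_request` helper, the `OLLAMA_CHAT_URL` constant, and the 120-second timeout are illustrative assumptions, not taken from the article's script.

```python
import json
import urllib.request

# Assumption: Ollama running locally on its default port.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_request(payload, url=OLLAMA_CHAT_URL):
    """Serialize the payload and wrap it in a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_ollama(payload, timeout=120):
    """Send one chat request and return the decoded JSON response.

    Connection failures and non-2xx responses surface as exceptions
    (urllib.error.URLError / HTTPError), which the loop's broad
    `except Exception` clause would catch.
    """
    with urllib.request.urlopen(build_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes it possible to inspect the payload without a running Ollama instance.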

Error Handling Framework

When you're building an agent framework that relies heavily on external API calls, effective error handling isn't just a nice-to-have; it's essential. The loop we've outlined lets us identify and manage errors as they occur and give the model immediate feedback, which makes the whole system more responsive.

One of the fundamental pillars of this approach is that the model is stateless. It doesn’t hold any memory of past interactions on its own. Instead, all relevant context, such as the user’s original request and the outputs from any tool calls, is packaged into the message history sent on each loop iteration. This accumulated context is what enables the model to respond intelligently to failures: it can read earlier messages, including any error messages, and adjust its next action accordingly.
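To make the statelessness point concrete, here is a small sketch of how one failed tool round might be recorded in the history. The `record_tool_round` helper and the sample messages are hypothetical, and the shape of the tool message (`role`, `tool_name`, `content`) is an assumption about what the chat endpoint accepts; check it against the Ollama version you are running.

```python
def record_tool_round(messages, assistant_message, results):
    """Append the model's tool-call message, then one tool message per result.

    `results` is a list of (tool_name, text) pairs; `text` may be a normal
    result or an error description. Either way it becomes part of the
    history the model sees on the next iteration.
    """
    messages.append(assistant_message)
    for tool_name, text in results:
        messages.append({"role": "tool", "tool_name": tool_name, "content": text})
    return messages


# Illustrative history after one failed lookup (all values are made up).
history = [{"role": "user", "content": "What is the population of Atlantis?"}]
assistant_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {"function": {"name": "get_city_population", "arguments": {"city": "Atlantis"}}}
    ],
}
record_tool_round(
    history,
    assistant_message,
    [("get_city_population", "ERROR: unknown city 'Atlantis'.")],
)
```

On the next iteration the full `history` list is sent again, so the error text is the only way the model learns that its previous call failed.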

Creating Tools for Robust Testing

Next, we develop a set of tools designed to be fully deterministic and offline. This design choice negates the complexities tied to network dependencies, a significant advantage when exploring error-handling protocols. You won't find API keys or unreliable external services here, just straightforward functionality. This allows us to intentionally invoke different failure scenarios and observe how the framework responds. Here’s the lineup of tools we'll be working with:

- **`get_weather(city)`**: Retrieves weather information from a preset dictionary of data.
- **`get_local_time(city)`**: Calculates the current time for a given city using the `zoneinfo` module.
- **`convert_currency(amount, from_currency, to_currency)`**: Performs currency conversions based on a fixed exchange rate table.
- **`get_city_population(city)`**: Looks up population figures from a static dataset.

The basic data sets are defined at the top of the script, making them easily accessible for our tool functions. This design choice sets the stage for predictable behavior, which is vital when testing our error-handling architecture. Here’s a look at the foundational data we're working with:

CITY_DATA = {
    "london":     {"timezone": "Europe/London",       "population": 8_982_000},
    "tokyo":      {"timezone": "Asia/Tokyo",          "population": 13_960_000},
    "sao paulo":  {"timezone": "America/Sao_Paulo",   "population": 12_330_000},
    "paris":      {"timezone": "Europe/Paris",        "population":  2_161_000},
    "new york":   {"timezone": "America/New_York",    "population":  8_336_000},
    "sydney":     {"timezone": "Australia/Sydney",    "population":  5_312_000},
    "mumbai":     {"timezone": "Asia/Kolkata",        "population": 20_410_000},
}

EXCHANGE_RATES = {
    "USD": 1.00,  "EUR": 0.92,  "GBP": 0.79,  "JPY": 156.40,
    "BRL": 5.12,  "CAD": 1.37,  "AUD": 1.51,  "INR": 83.20,
}

It's this straightforward setup that gives us the platform to explore error management effectively, incrementally enhancing the system’s reliability.
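For orientation, three of the tools could be implemented over these tables roughly as follows. This is a sketch rather than the article's code: the `_lookup_city` helper and the error wording are hypothetical, the tables are repeated in shortened form so the block runs on its own, and the conversion assumes each rate is expressed in units of that currency per 1 USD (suggested by `"USD": 1.00`, but not stated in the text).

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Shortened copies of the article's tables so this block is self-contained.
CITY_DATA = {
    "london": {"timezone": "Europe/London", "population": 8_982_000},
    "tokyo": {"timezone": "Asia/Tokyo", "population": 13_960_000},
}
EXCHANGE_RATES = {"USD": 1.00, "EUR": 0.92, "GBP": 0.79, "JPY": 156.40}


def _lookup_city(city):
    """Normalize the city name and fail with a message that lists valid options."""
    key = city.strip().lower()
    if key not in CITY_DATA:
        known = ", ".join(sorted(CITY_DATA))
        raise ValueError(f"Unknown city '{city}'. Known cities: {known}")
    return CITY_DATA[key]


def get_local_time(city):
    """Return the current local time in the city's time zone as a string."""
    tz = ZoneInfo(_lookup_city(city)["timezone"])
    return datetime.now(tz).strftime("%Y-%m-%d %H:%M %Z")


def get_city_population(city):
    """Return the static population figure for the city."""
    return _lookup_city(city)["population"]


def convert_currency(amount, from_currency, to_currency):
    """Convert via USD, assuming rates are quoted as units per 1 USD."""
    src, dst = from_currency.upper(), to_currency.upper()
    for code in (src, dst):
        if code not in EXCHANGE_RATES:
            known = ", ".join(sorted(EXCHANGE_RATES))
            raise ValueError(f"Unknown currency '{code}'. Known currencies: {known}")
    amount_in_usd = amount / EXCHANGE_RATES[src]
    return round(amount_in_usd * EXCHANGE_RATES[dst], 2)
```

Raising a `ValueError` that names the valid options gives the error-handling layer something specific to pass back to the model when a lookup misses.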

The Bigger Picture

As we wrap up this analysis, it’s vital to reflect on what these numbers represent beyond mere data points. The populations of major cities like Paris (2.16 million), New York (8.34 million), Sydney (5.31 million), and Mumbai (20.41 million) do more than illustrate urban density—they reveal the shifting dynamics of migration, resource allocation, and infrastructure development in our increasingly interconnected world. What’s striking is the sheer scale of Mumbai's population compared to its peers. With over 20 million residents, its challenges around urban planning and service provision dwarf those of the other cities mentioned. It’s easy to overlook these implications, but if you're working in urban development or policy, understanding the root causes and potential solutions for such disparities could shape the future of city living.

Currency Exchange Trends

Looking into the exchange rates listed, a keen observer might note that every rate is quoted against the USD, which serves as the base at 1.00. At a rate of 1 USD to 0.92 EUR and 156.40 JPY, for instance, these figures reflect not just economic relationships but also hint at geopolitical developments. The GBP’s rate of 0.79 and the higher BRL rate of 5.12 signal interesting trends that may affect international trade and investment strategies. The clarity of this data is both a boon and a potential pitfall. While these numbers provide a snapshot of economic conditions, they are fixed values in a static table and can mask deeper underlying issues. Currency fluctuations impact global markets in ways that are sometimes unpredictable. It's worth keeping an eye on these rates if you're involved in international business, since knowing when to hedge your bets could be the difference between profit and loss.

Final Thoughts

In synthesizing these insights, we find that when examining complex urban settings or navigating currency markets, a straightforward portrayal misses many nuances. These indicators push us to think critically about the social and economic forces at play. So, as we move forward, remember that the data isn’t just numbers; it reflects real-world conditions—and that's what makes it matter. What strategies will we adopt to deal with these ongoing complexities? That’s the question we need to keep asking ourselves.