An LLM without tools is just an autocomplete engine — extraordinarily knowledgeable, but completely passive. It can describe how to search the web, but it cannot actually search it. It can write a file path, but it cannot write the file. Tools are what transform a language model into an agent — an entity that can take actions, retrieve live data, modify state, and produce real-world outcomes. Everything in this guide is about bridging the gap between "the model knows what to do" and "the model can actually do it."
How Tool Calling Works
Tool calling (also called "function calling") is a structured protocol between you and the model. You declare what tools exist, the model decides when to call them, and your code actually runs them. The model never executes code directly — it only requests that you run a function by name with specific arguments.
This separation is important: it means the model stays in a sandboxed text world while your Python code handles the real execution, giving you full control over what actions actually happen.
The Three Phases
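- Declare: you send the tool definitions (name, description, input schema) along with the conversation.
- Request: the model decides a tool is needed and responds with a tool call that names the function and its arguments.
- Execute: your code runs the function and returns the result to the model.
After you return the result, the model continues its response: it may call more tools, or it may generate a final answer to the user. The loop continues until the model produces a text-only response.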
Basic Tool Definition Structure
tool = {
    "name": "tool_name",
    "description": "What this tool does and when to use it. Be specific!",
    "input_schema": {
        "type": "object",
        "properties": {
            "param1": {
                "type": "string",
                "description": "What this parameter is for"
            },
            "param2": {
                "type": "integer",
                "description": "Another parameter with a clear purpose",
                "default": 5
            }
        },
        "required": ["param1"]
    }
}
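A "default" in the schema describes the input to the model; the model may still omit an optional parameter, so it is safest for your own code to fill defaults in before dispatch. The following is a minimal sketch under that assumption. The helper name prepare_inputs is hypothetical and not part of any SDK.

```python
from typing import Any


def prepare_inputs(schema: dict, inputs: dict) -> dict[str, Any]:
    """Check required keys and fill declared defaults (hypothetical helper)."""
    props = schema.get("properties", {})
    missing = [k for k in schema.get("required", []) if k not in inputs]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    unknown = [k for k in inputs if k not in props]
    if unknown:
        raise ValueError(f"Unknown parameters: {unknown}")
    prepared = dict(inputs)
    for name, spec in props.items():
        # Only fill a default when the caller did not supply the parameter
        if name not in prepared and "default" in spec:
            prepared[name] = spec["default"]
    return prepared


schema = {
    "type": "object",
    "properties": {
        "param1": {"type": "string"},
        "param2": {"type": "integer", "default": 5},
    },
    "required": ["param1"],
}

print(prepare_inputs(schema, {"param1": "hello"}))
# {'param1': 'hello', 'param2': 5}
```

Raising ValueError here is a design choice for the sketch; in an agent loop you would catch it and return the message as the tool result so the model can correct its call.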
Here is a minimal end-to-end example showing the full tool-calling loop with the Anthropic SDK:
import anthropic
import json

client = anthropic.Anthropic()

# 1. Define your tool
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city. Use when the user asks about weather conditions.",
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'London' or 'Tokyo'"}
        },
        "required": ["city"]
    }
}]

# 2. Your actual function (this runs in YOUR code, not the model)
def get_weather(city: str) -> str:
    # In production, call a real weather API here
    return f"Weather in {city}: 18°C, partly cloudy, wind 12 km/h"

messages = [{"role": "user", "content": "What's the weather like in London?"}]

# 3. Agentic loop
while True:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )

    # 4. Check if the model wants to call a tool
    if response.stop_reason == "tool_use":
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # 5. Execute the tool in YOUR code
                if block.name == "get_weather":
                    result = get_weather(**block.input)
                else:
                    result = f"Unknown tool: {block.name}"
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })
        # 6. Add model response + tool results back to message history
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
    else:
        # 7. Model is done: extract the final text response
        for block in response.content:
            if hasattr(block, "text"):
                print(block.text)
        break
Tool 1 — Web Search
Web search is the most universally useful tool you can give an agent. It transforms the model from "knows things up to training cutoff" to "can look anything up right now." It also directly fixes one of the most common failure modes: hallucinating sources, URLs, and recent facts.
The simplest option is DuckDuckGo's HTML interface: it is free and needs no API key, so there is no usage-based bill to surprise you, although heavy use gets throttled (see the best practices below). For production use, consider Brave Search API ($3/1,000 queries) or Serper ($1/1,000 queries) for better reliability.
import requests
from bs4 import BeautifulSoup
import json
from urllib.parse import quote

def search_web(query: str, num_results: int = 5) -> str:
    """
    Search the web using DuckDuckGo HTML interface.
    No API key required.
    """
    try:
        encoded_query = quote(query)
        url = f"https://html.duckduckgo.com/html/?q={encoded_query}"
        headers = {
            "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
            "Accept-Language": "en-US,en;q=0.5"
        }
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()

        soup = BeautifulSoup(response.text, "html.parser")
        results = []
        for result in soup.find_all("div", class_="result", limit=num_results):
            title_el = result.find("a", class_="result__a")
            snippet_el = result.find("a", class_="result__snippet")
            url_el = result.find("a", class_="result__url")
            if title_el:
                results.append({
                    "title": title_el.get_text(strip=True),
                    "url": url_el.get_text(strip=True) if url_el else "",
                    "snippet": snippet_el.get_text(strip=True) if snippet_el else ""
                })

        if not results:
            return json.dumps({"error": "No results found", "query": query})
        return json.dumps(results, indent=2)
    except requests.Timeout:
        return json.dumps({"error": "Search timed out after 10s", "query": query})
    except requests.HTTPError as e:
        return json.dumps({"error": f"HTTP error: {e.response.status_code}", "query": query})
    except Exception as e:
        return json.dumps({"error": str(e), "query": query})

# Tool definition for Claude
SEARCH_TOOL_DEFINITION = {
    "name": "search_web",
    "description": (
        "Search the web for current information. Use this when you need to look up "
        "recent events, facts, prices, documentation, or anything that might have "
        "changed since your training data. Returns titles, URLs, and snippets."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query. Be specific: 'Python asyncio tutorial 2024' is better than 'python help'"
            },
            "num_results": {
                "type": "integer",
                "description": "Number of results to return (1-10). Default is 5.",
                "default": 5
            }
        },
        "required": ["query"]
    }
}
Best Practices for Web Search
- Cache results within a session. If the agent searches for "Python asyncio" twice in one run, return the cached result rather than making two identical HTTP requests.
- Limit num_results by default. Five results is usually enough context. Ten or more bloats the conversation history with minimal benefit.
- Include the query in error responses. When search fails, tell the agent what query failed. It can then try a refined query or fallback approach.
- Rate limit aggressively. DuckDuckGo will temporarily block IPs that send too many requests. Add a 1–2 second delay between searches in multi-step research agents.
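The caching and rate-limiting advice above can be combined in one small wrapper. This is a sketch under stated assumptions: ThrottledCache is a hypothetical name, it wraps any function with the same (query, num_results) signature as search_web, and the fake_search function exists only so the example runs without network access.

```python
import time
from typing import Callable


class ThrottledCache:
    """Per-session cache plus a minimum delay between real calls (illustrative)."""

    def __init__(self, search_fn: Callable[[str, int], str], min_interval: float = 1.5):
        self._search_fn = search_fn
        self._min_interval = min_interval
        self._cache: dict[tuple[str, int], str] = {}
        self._last_call = None  # monotonic timestamp of the last real call

    def search(self, query: str, num_results: int = 5) -> str:
        # Normalize the key so trivially different spellings share one entry
        key = (query.strip().lower(), num_results)
        if key in self._cache:
            return self._cache[key]
        if self._last_call is not None:
            wait = self._min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)
        result = self._search_fn(query, num_results)
        self._last_call = time.monotonic()
        self._cache[key] = result
        return result


calls = []

def fake_search(query: str, num_results: int = 5) -> str:
    """Stand-in for search_web so the sketch runs offline."""
    calls.append(query)
    return f"results for {query!r}"

cached = ThrottledCache(fake_search, min_interval=0.05)
cached.search("Python asyncio")
cached.search("python asyncio ")   # same key after normalization: served from cache
cached.search("Python threading")  # new query: waits out the interval, then calls
print(len(calls))  # 2
```

This sketch caches error responses too; if transient failures are common, you may prefer to skip caching any result that contains an "error" key.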
Tool 2 — File Operations
File tools let agents persist state between turns, generate reports, process uploaded documents, and build up artifacts over multiple steps. The key design decision is the workspace concept: the agent can only read and write within a designated directory. Attempts to escape it are blocked and reported to the agent with a clear error message.
import os
import pathlib

# Set allowed directory (the agent can NEVER escape this)
ALLOWED_DIR = pathlib.Path("./agent_workspace").resolve()
ALLOWED_DIR.mkdir(exist_ok=True)  # Create it if it doesn't exist

def _safe_path(filename: str) -> pathlib.Path:
    """Validate and resolve path, ensuring it stays in ALLOWED_DIR"""
    # resolve() normalizes ".." and follows symlinks
    path = (ALLOWED_DIR / filename).resolve()
    # Compare path components, not string prefixes: a plain startswith() check
    # would also accept a sibling directory such as "./agent_workspace_old".
    if path != ALLOWED_DIR and ALLOWED_DIR not in path.parents:
        raise ValueError(f"Path traversal attempt blocked: {filename}")
    return path
def read_file(filename: str) -> str:
    """Read a file from the agent workspace"""
    try:
        path = _safe_path(filename)
        if not path.exists():
            return f"Error: File '{filename}' not found. Use list_directory() to see available files."
        if not path.is_file():
            return f"Error: '{filename}' is a directory, not a file."
        if path.stat().st_size > 100_000:  # 100KB limit
            return f"Error: File too large (max 100KB). Size: {path.stat().st_size:,} bytes"
        return path.read_text(encoding="utf-8")
    except ValueError as e:
        return f"Security error: {e}"
    except UnicodeDecodeError:
        return f"Error: '{filename}' is not a text file (binary content detected)"
    except Exception as e:
        return f"Error reading file: {e}"

def write_file(filename: str, content: str) -> str:
    """Write or overwrite a file in the agent workspace"""
    try:
        path = _safe_path(filename)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        return f"Successfully wrote {len(content):,} characters to {filename}"
    except ValueError as e:
        return f"Security error: {e}"
    except Exception as e:
        return f"Error writing file: {e}"

def append_to_file(filename: str, content: str) -> str:
    """Append content to an existing file (or create it if it doesn't exist)"""
    try:
        path = _safe_path(filename)
        path.parent.mkdir(parents=True, exist_ok=True)
        with open(path, "a", encoding="utf-8") as f:
            f.write(content)
        return f"Appended {len(content):,} characters to {filename}"
    except ValueError as e:
        return f"Security error: {e}"
    except Exception as e:
        return f"Error appending to file: {e}"

def list_directory(subdir: str = "") -> str:
    """List files and folders in the agent workspace"""
    try:
        path = _safe_path(subdir) if subdir else ALLOWED_DIR
        if not path.is_dir():
            return f"Error: '{subdir}' is not a directory"
        files = []
        for item in sorted(path.iterdir()):
            size = item.stat().st_size
            tag = "[DIR]" if item.is_dir() else "[FILE]"
            files.append(f"{tag} {item.name} ({size:,} bytes)")
        return "\n".join(files) if files else "Empty directory"
    except ValueError as e:
        return f"Security error: {e}"
    except Exception as e:
        return f"Error listing directory: {e}"
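One detail of workspace sandboxing is easy to get wrong: comparing resolved paths as strings. A sibling directory whose name merely begins with the workspace name passes a startswith() test. The sketch below illustrates the difference in a throwaway temp directory; is_inside is a hypothetical helper name, and the directory names are made up for the demonstration.

```python
import pathlib
import tempfile


def is_inside(base: pathlib.Path, candidate: pathlib.Path) -> bool:
    """True if candidate is base itself or located below it (both already resolved)."""
    return candidate == base or base in candidate.parents


root = pathlib.Path(tempfile.mkdtemp()).resolve()
workspace = root / "agent_workspace"
workspace.mkdir()

inside = (workspace / "reports" / "summary.md").resolve()
sibling = (workspace / ".." / "agent_workspace_backup" / "secrets.txt").resolve()

# A string-prefix comparison accepts the sibling directory...
prefix_ok = str(sibling).startswith(str(workspace))
# ...while a component-wise comparison rejects it and still accepts real children.
print(prefix_ok, is_inside(workspace, sibling), is_inside(workspace, inside))
# True False True
```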
File Tool Definitions
READ_FILE_DEFINITION = {
    "name": "read_file",
    "description": "Read the text contents of a file from the agent workspace. Use this to examine existing files before editing them, or to load data for analysis. Maximum file size is 100KB.",
    "input_schema": {
        "type": "object",
        "properties": {
            "filename": {"type": "string", "description": "Filename or relative path within the workspace, e.g. 'report.txt' or 'data/analysis.csv'"}
        },
        "required": ["filename"]
    }
}

WRITE_FILE_DEFINITION = {
    "name": "write_file",
    "description": "Write or overwrite a file in the agent workspace with the given content. Creates parent directories automatically. Use this to save results, create reports, or persist data.",
    "input_schema": {
        "type": "object",
        "properties": {
            "filename": {"type": "string", "description": "Filename or relative path, e.g. 'output.txt' or 'reports/summary.md'"},
            "content": {"type": "string", "description": "The full text content to write to the file"}
        },
        "required": ["filename", "content"]
    }
}

LIST_DIR_DEFINITION = {
    "name": "list_directory",
    "description": "List all files and folders in the agent workspace, or a subdirectory within it. Use this to discover what files are available before reading them.",
    "input_schema": {
        "type": "object",
        "properties": {
            "subdir": {"type": "string", "description": "Optional subdirectory to list. Leave empty for root workspace."}
        },
        "required": []
    }
}
Tool 3 — HTTP / API Calls
A generic HTTP tool is one of the most powerful you can give an agent — it lets the agent interact with essentially any web service without you needing to build a specific integration for each one. The agent can call weather APIs, stock prices, internal microservices, webhook endpoints, and more.
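The critical security requirement is Server-Side Request Forgery (SSRF) protection: blocking requests to localhost and internal network IP ranges. Without this, a prompt-injected agent could read and exfiltrate data from your cloud metadata endpoint (AWS's 169.254.169.254 being the classic target). The hostname check below is a first line of defense only: it inspects the URL text, so a public hostname that resolves to a private address, or a redirect to one, is not caught by string matching alone.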
import requests
import json
from typing import Optional
import urllib.parse

# Domains/IP ranges the agent should never be able to reach
_BLOCKED_HOSTS = {'localhost', '127.0.0.1', '0.0.0.0', '::1'}
_BLOCKED_PREFIXES = ('192.168.', '10.', '172.16.', '172.17.', '172.18.',
                     '172.19.', '172.20.', '172.21.', '172.22.', '172.23.',
                     '172.24.', '172.25.', '172.26.', '172.27.', '172.28.',
                     '172.29.', '172.30.', '172.31.', '169.254.')

def _is_safe_url(url: str) -> tuple[bool, str]:
    """Returns (is_safe, reason) for an outbound HTTP request"""
    try:
        parsed = urllib.parse.urlparse(url)
        host = parsed.hostname or ""
        if host in _BLOCKED_HOSTS:
            return False, f"Blocked: '{host}' is a loopback address"
        if any(host.startswith(p) for p in _BLOCKED_PREFIXES):
            return False, f"Blocked: '{host}' is a private/internal network address"
        if parsed.scheme not in ('http', 'https'):
            return False, f"Blocked: scheme '{parsed.scheme}' not allowed (use http or https)"
        return True, ""
    except Exception as e:
        return False, f"Invalid URL: {e}"

def http_get(url: str, headers: Optional[dict] = None, params: Optional[dict] = None) -> str:
    """Make an HTTP GET request to an external URL"""
    safe, reason = _is_safe_url(url)
    if not safe:
        return json.dumps({"error": reason})
    try:
        resp = requests.get(
            url,
            headers=headers or {},
            params=params or {},
            timeout=15,
            # Redirects are not followed: the redirect target would never
            # pass through _is_safe_url and could point at an internal host.
            allow_redirects=False
        )
        # Attempt JSON first, fall back to trimmed text
        try:
            data = resp.json()
            result = json.dumps(data, indent=2)
        except ValueError:
            result = resp.text
        # Cap response at 5,000 chars to keep context window manageable
        if len(result) > 5000:
            result = result[:5000] + "\n\n[Response truncated at 5,000 characters]"
        return result if resp.ok else json.dumps({
            "error": f"HTTP {resp.status_code}",
            "body": result[:500]
        })
    except requests.Timeout:
        return json.dumps({"error": "Request timed out after 15 seconds"})
    except requests.ConnectionError as e:
        return json.dumps({"error": f"Connection failed: {e}"})
    except Exception as e:
        return json.dumps({"error": str(e)})

def http_post(url: str, body: dict, headers: Optional[dict] = None) -> str:
    """Make an HTTP POST request with a JSON body"""
    safe, reason = _is_safe_url(url)
    if not safe:
        return json.dumps({"error": reason})
    try:
        default_headers = {"Content-Type": "application/json"}
        if headers:
            default_headers.update(headers)
        # As in http_get, redirects are not followed (the target is unchecked)
        resp = requests.post(url, json=body, headers=default_headers,
                             timeout=15, allow_redirects=False)
        try:
            result = json.dumps(resp.json(), indent=2)
        except ValueError:
            result = resp.text
        if len(result) > 5000:
            result = result[:5000] + "\n\n[Response truncated at 5,000 characters]"
        return result
    except requests.Timeout:
        return json.dumps({"error": "Request timed out after 15 seconds"})
    except Exception as e:
        return json.dumps({"error": str(e)})
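Matching on the hostname text misses hosts that resolve to internal addresses and alternative IP spellings. A stricter variant, sketched here with only the standard library, resolves the host first and classifies every returned address with the ipaddress module. The function names is_public_url and _ip_is_public are hypothetical, and this is an illustration of the idea, not a complete SSRF defense.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def _ip_is_public(ip_text: str) -> bool:
    """Classify one resolved address; strips an IPv6 scope id such as '%eth0'."""
    ip = ipaddress.ip_address(ip_text.split("%", 1)[0])
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_multicast or ip.is_reserved or ip.is_unspecified)


def is_public_url(url: str) -> tuple[bool, str]:
    """Return (allowed, reason). Resolves the host and checks every address."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False, f"scheme '{parsed.scheme}' not allowed"
    host = parsed.hostname
    if not host:
        return False, "URL has no host"
    try:
        infos = socket.getaddrinfo(host, parsed.port or 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror as e:
        return False, f"cannot resolve '{host}': {e}"
    for info in infos:
        address = info[4][0]  # sockaddr tuple starts with the IP string
        if not _ip_is_public(address):
            return False, f"'{host}' resolves to non-public address {address}"
    return True, ""


for candidate in ("http://127.0.0.1:8080/",
                  "http://169.254.169.254/latest/meta-data/",
                  "ftp://example.com/file"):
    print(candidate, is_public_url(candidate))
```

Two gaps remain even with this check: the name can resolve differently between the check and the actual request (DNS rebinding), and every redirect hop would need the same validation. Running the agent's HTTP traffic through an egress proxy or a network policy that blocks internal ranges closes both more reliably.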
Tool 4 — Database Queries
Giving agents database access unlocks powerful capabilities: querying business data, generating reports from live records, and answering analytical questions. The non-negotiable constraint is read-only access. An agent that can run DELETE FROM users or DROP TABLE orders is an incident waiting to happen.
The implementation below enforces read-only access at two levels: the SQLite connection uses mode=ro (read-only URI), and we additionally block dangerous keywords as a defense-in-depth measure.
import json
import re
import sqlite3
from typing import Optional

# Keywords that indicate write operations: block all of them
_DANGEROUS_KEYWORDS = [
    'INSERT', 'UPDATE', 'DELETE', 'DROP', 'CREATE', 'ALTER',
    'TRUNCATE', 'EXEC', 'EXECUTE', 'GRANT', 'REVOKE', 'ATTACH'
]

def query_database(
    sql: str,
    db_path: str = "./data.db",
    params: Optional[list] = None
) -> str:
    """
    Execute a read-only SQL SELECT query against a SQLite database.
    Blocks all write operations. Returns up to 100 rows as JSON.
    """
    sql_upper = sql.upper().strip()

    # Keyword-level check (belt-and-suspenders)
    for keyword in _DANGEROUS_KEYWORDS:
        # Word boundary check to avoid blocking 'CREATED_AT' etc.
        if re.search(rf'\b{keyword}\b', sql_upper):
            return json.dumps({
                "error": f"Operation '{keyword}' is not permitted. This tool provides read-only access.",
                "hint": "Use SELECT statements only."
            })

    if not sql_upper.startswith('SELECT') and not sql_upper.startswith('WITH'):
        return json.dumps({
            "error": "Only SELECT and WITH (CTE) statements are allowed.",
            "received": sql[:100]
        })

    conn = None
    try:
        # mode=ro makes SQLite open the file read-only, so any write fails
        # inside the database engine regardless of the SQL text
        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
        conn.row_factory = sqlite3.Row
        cursor = conn.cursor()
        cursor.execute(sql, params or [])
        rows = cursor.fetchmany(100)  # Hard cap: never return more than 100 rows
        # Check if there were more rows beyond the limit
        has_more = cursor.fetchone() is not None
        result = {
            "rows": [dict(row) for row in rows],
            "count": len(rows),
            "truncated": has_more,
            "message": "Results capped at 100 rows. Refine your query with LIMIT/WHERE to get specific data." if has_more else None
        }
        return json.dumps(result, indent=2, default=str)
    except sqlite3.OperationalError as e:
        # Includes "attempt to write a readonly database"
        return json.dumps({"error": f"SQL error: {str(e)}"})
    except sqlite3.Error as e:
        return json.dumps({"error": f"Database error: {str(e)}"})
    finally:
        # Close the connection on both the success and the error path
        if conn is not None:
            conn.close()

# PostgreSQL variant (requires psycopg2: pip install psycopg2-binary)
def query_postgres(sql: str, connection_string: str, params: Optional[list] = None) -> str:
    """Query a PostgreSQL database in read-only mode"""
    import psycopg2
    import psycopg2.extras

    sql_upper = sql.upper().strip()
    if not (sql_upper.startswith('SELECT') or sql_upper.startswith('WITH')):
        return json.dumps({"error": "Only SELECT and WITH statements are allowed."})

    conn = None
    try:
        conn = psycopg2.connect(connection_string)
        conn.set_session(readonly=True)  # Enforce read-only at connection level
        with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
            cur.execute(sql, params or [])
            rows = cur.fetchmany(100)
        return json.dumps({"rows": [dict(r) for r in rows], "count": len(rows)}, default=str)
    except psycopg2.Error as e:
        return json.dumps({"error": str(e)})
    finally:
        # The cursor context manager does not close the connection itself
        if conn is not None:
            conn.close()
Tool 5 — Email via SMTP
Email is irreversible. Once sent, you cannot unsend it. The pattern here — draft first, require explicit approval to send — is the right approach for any tool that has real-world side effects: sending messages, making purchases, publishing content, calling external APIs with write operations. Always show the user what is about to happen and require explicit confirmation.
import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
import os

# IMPORTANT: Never auto-send without human confirmation.
# The pattern: agent calls draft_email() → you show the draft to the user
# → user confirms → agent calls send_email(approved=True)

def draft_email(to: str, subject: str, body: str) -> str:
    """Create an email draft for human review. Does NOT send anything."""
    return f"""EMAIL DRAFT (not sent; requires your approval before sending):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
To: {to}
Subject: {subject}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
{body}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Reply 'send it' to send, or ask me to make changes."""

def send_email(to: str, subject: str, body: str, approved: bool = False) -> str:
    """
    Send an email via SMTP. Requires explicit approved=True.

    Configure via environment variables:
        SMTP_USER: your Gmail address (e.g. you@gmail.com)
        SMTP_PASS: your Gmail App Password (NOT your account password)
    Generate at: myaccount.google.com → Security → App Passwords
    """
    if not approved:
        return (
            "Email NOT sent. The 'approved' parameter must be True to send. "
            "Show the draft to the user first and get their explicit confirmation."
        )

    smtp_user = os.getenv('SMTP_USER')
    smtp_pass = os.getenv('SMTP_PASS')
    if not smtp_user or not smtp_pass:
        return "Error: SMTP_USER and SMTP_PASS environment variables are not set."

    try:
        msg = MIMEMultipart()
        msg['From'] = smtp_user
        msg['To'] = to
        msg['Subject'] = subject
        msg.attach(MIMEText(body, 'plain'))

        with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
            server.login(smtp_user, smtp_pass)
            server.send_message(msg)
        return f"Email sent successfully to {to} (Subject: {subject})"
    except smtplib.SMTPAuthenticationError:
        return "Error: SMTP authentication failed. Check your App Password (not your Gmail password)."
    except smtplib.SMTPRecipientsRefused:
        return f"Error: Recipient address '{to}' was refused by the mail server."
    except Exception as e:
        return f"Failed to send email: {e}"
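A flag such as approved only protects you if the model cannot set it on its own. One way to make sure it reflects a real human decision is to keep the send step out of the tool list entirely and let application code perform it after the user confirms. The sketch below illustrates that idea; ApprovalGate, pending_ids, and approve_and_send are hypothetical names, and fake_send stands in for the real send function so the example runs without SMTP.

```python
import uuid
from typing import Callable


class ApprovalGate:
    """Holds drafted actions until application code records a human decision."""

    def __init__(self, send_fn: Callable[[str, str, str], str]):
        self._send_fn = send_fn
        self._pending: dict[str, tuple[str, str, str]] = {}

    def draft(self, to: str, subject: str, body: str) -> str:
        """Tool exposed to the model: stores the draft and sends nothing."""
        draft_id = uuid.uuid4().hex[:8]
        self._pending[draft_id] = (to, subject, body)
        return f"Draft {draft_id} saved for review. Nothing has been sent."

    def pending_ids(self) -> list[str]:
        """For your UI: which drafts are waiting for a decision."""
        return list(self._pending)

    def approve_and_send(self, draft_id: str) -> str:
        """Called by your UI code after the human confirms; never by the model."""
        if draft_id not in self._pending:
            return f"Error: no pending draft with id '{draft_id}'."
        to, subject, body = self._pending.pop(draft_id)
        return self._send_fn(to, subject, body)


sent = []

def fake_send(to: str, subject: str, body: str) -> str:
    """Stand-in for the real SMTP call."""
    sent.append((to, subject))
    return f"Email sent successfully to {to}"

gate = ApprovalGate(fake_send)
gate.draft("ana@example.com", "Q3 report", "Draft body")
draft_id = gate.pending_ids()[0]
# ... show the draft in your UI; only a human click reaches this call:
print(gate.approve_and_send(draft_id))
```

Only draft would be registered as a tool. With the send_email function above, send_fn could be a small lambda that passes approved=True, so the flag is set in exactly one place that the model cannot reach.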
Gmail App Password Setup
- Enable 2-Factor Authentication on your Google Account
- Go to myaccount.google.com → Security → App Passwords
- Create a new app password for "Mail" on "Other (custom name)"
- Store it as SMTP_PASS in your .env file; never hardcode it
Tool 6 — Slack Integration
Slack is where most teams live. A Slack tool lets your agents post updates, report completion, flag errors for review, and keep humans informed without requiring them to constantly poll a dashboard. It is also excellent for the "human in the loop" approval pattern: the agent posts a draft action to Slack, a human reacts or replies, and the agent proceeds accordingly.
The simplest integration is an Incoming Webhook — a single URL that accepts POST requests and posts to a specific channel. No OAuth, no scope management, no bot tokens to rotate. Perfect for agents that only need to post notifications.
import requests
import json
import os

def post_to_slack(message: str, channel: str = "#general") -> str:
    """
    Post a message to Slack via Incoming Webhook.

    Setup:
    1. Go to api.slack.com/apps → Create App → Incoming Webhooks
    2. Activate Incoming Webhooks, click 'Add New Webhook to Workspace'
    3. Choose a channel, copy the webhook URL
    4. Set as SLACK_WEBHOOK_URL environment variable
    """
    webhook_url = os.getenv('SLACK_WEBHOOK_URL')
    if not webhook_url:
        return "Error: SLACK_WEBHOOK_URL environment variable is not set."
    try:
        payload = {
            "text": message,
            "channel": channel,
            "username": "AI Agent",
            "icon_emoji": ":robot_face:"
        }
        resp = requests.post(webhook_url, json=payload, timeout=10)
        if resp.status_code == 200 and resp.text == 'ok':
            return f"Message posted to {channel} successfully"
        elif resp.text == 'channel_not_found':
            return f"Error: Channel '{channel}' not found. The webhook may only post to its configured channel."
        else:
            return f"Slack error: {resp.status_code} - {resp.text}"
    except requests.Timeout:
        return "Error: Slack request timed out"
    except Exception as e:
        return f"Failed to post to Slack: {e}"

# For richer messages, use Block Kit format:
def post_to_slack_rich(title: str, body: str, status: str = "info") -> str:
    """Post a formatted Block Kit message to Slack"""
    webhook_url = os.getenv('SLACK_WEBHOOK_URL')
    if not webhook_url:
        return "Error: SLACK_WEBHOOK_URL not configured"
    emoji_map = {"info": ":information_source:", "success": ":white_check_mark:",
                 "warning": ":warning:", "error": ":x:"}
    emoji = emoji_map.get(status, ":robot_face:")
    payload = {
        "blocks": [
            {
                "type": "header",
                "text": {"type": "plain_text", "text": f"{emoji} {title}"}
            },
            {
                "type": "section",
                "text": {"type": "mrkdwn", "text": body}
            },
            {"type": "divider"}
        ]
    }
    try:
        resp = requests.post(webhook_url, json=payload, timeout=10)
        return "Message posted successfully" if resp.status_code == 200 else f"Error: {resp.text}"
    except Exception as e:
        return f"Failed to post: {e}"
Model Context Protocol (MCP)
Model Context Protocol (MCP) is an open standard created by Anthropic for connecting LLMs to tools and data sources. Think of it as the USB-C of AI tooling: you write a tool once as an MCP server, and any MCP-compatible client — Claude Desktop, Cursor, your own agent, or anyone else's — can use it.
Direct Tool Calling vs MCP — When to Use Which
Use direct tool calling when your tools are specific to one agent, you want the simplest possible implementation, or you are building a prototype. Direct tool calling in Python is easier to get started with and easier to debug.
Use MCP when you want to reuse tools across multiple agents and clients, you want your tools available in Claude Desktop or Cursor for interactive use, or you are building tools for others to use. MCP adds a small layer of complexity but enables powerful reuse.
# Install: pip install mcp
# Run: python my_server.py
# Connect in Claude Desktop via claude_desktop_config.json
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp import types

# search_web() and read_file() are the functions defined earlier in this
# guide; import them or paste them into this file before running the server.

app = Server("my-agent-tools")

@app.list_tools()
async def list_tools() -> list[types.Tool]:
    """Return all available tools. Claude reads this list on startup."""
    return [
        types.Tool(
            name="search_web",
            description="Search the web for current information. Use when you need recent facts, documentation, or anything that might have changed since your training data.",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results (1-10)",
                        "default": 5
                    }
                },
                "required": ["query"]
            }
        ),
        types.Tool(
            name="read_file",
            description="Read a text file from the workspace",
            inputSchema={
                "type": "object",
                "properties": {
                    "filename": {"type": "string", "description": "File to read"}
                },
                "required": ["filename"]
            }
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    """Route tool calls to the appropriate function."""
    if name == "search_web":
        result = search_web(
            arguments["query"],
            arguments.get("num_results", 5)
        )
        return [types.TextContent(type="text", text=result)]
    elif name == "read_file":
        result = read_file(arguments["filename"])
        return [types.TextContent(type="text", text=result)]
    else:
        raise ValueError(f"Unknown tool: {name}")

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(
            read_stream,
            write_stream,
            app.create_initialization_options()
        )

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
Adding to Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) to register your MCP server:
{
  "mcpServers": {
    "my-agent-tools": {
      "command": "python",
      "args": ["/absolute/path/to/my_server.py"],
      "env": {
        "SLACK_WEBHOOK_URL": "your_webhook_url_here"
      }
    }
  }
}
Error Handling When Tools Fail
Tools will fail. Networks time out. APIs return 429s. Files get deleted. The question is not whether your tools will fail — it is whether your agent handles failure gracefully or crashes in a confusing way. Never let exceptions propagate to the model as Python tracebacks. Always return a structured, descriptive error string that the agent can understand and adapt to.
The key insight is that the model is pretty good at handling errors — if you tell it what went wrong. A response of "Search timed out after 10s for query 'Q3 revenue'" lets the model decide to retry with a shorter query, try a different tool, or tell the user it cannot get that information right now. A Python traceback tells the model nothing useful.
import time
from functools import wraps

def with_retry(max_retries: int = 3, delay: float = 1.0, backoff: float = 2.0,
               retry_on: tuple = (Exception,)):
    """
    Decorator that retries a tool function on specified exceptions.
    Uses exponential backoff: with the defaults it waits 1s, then 2s,
    between the three attempts.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            wait = delay
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except retry_on as e:
                    last_error = e
                    if attempt < max_retries - 1:
                        # Log before sleeping so the message shows the actual wait
                        print(f"[retry] {func.__name__} attempt {attempt+1} failed: {e}. Retrying in {wait:.1f}s...")
                        time.sleep(wait)
                        wait *= backoff
            # All retries exhausted: return a descriptive error string
            return f"Tool '{func.__name__}' failed after {max_retries} attempts. Last error: {last_error}"
        return wrapper
    return decorator

# Apply to any tool that might have transient failures
import requests

# Note: the retry only fires if the wrapped call RAISES. search_web as written
# above catches its own exceptions and returns a JSON error string, so it must
# let requests.Timeout / requests.ConnectionError propagate for this to help.
@with_retry(max_retries=3, delay=1.0, retry_on=(requests.Timeout, requests.ConnectionError))
def search_web_reliable(query: str, num_results: int = 5) -> str:
    """Web search with automatic retry on network errors"""
    return search_web(query, num_results)  # Your base search_web function

# Circuit breaker pattern for persistent failures
class CircuitBreaker:
    """
    After 'threshold' consecutive failures, opens the circuit for 'timeout' seconds.
    Prevents hammering a failing service and wasting tokens on guaranteed-to-fail calls.
    """
    def __init__(self, threshold: int = 5, timeout: float = 60.0):
        self.threshold = threshold
        self.timeout = timeout
        self.failures = 0
        self.opened_at = None
        self.is_open = False

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.is_open = True
            self.opened_at = time.time()

    def record_success(self):
        self.failures = 0
        self.is_open = False

    def can_attempt(self) -> bool:
        if not self.is_open:
            return True
        # Check if timeout has elapsed (half-open state)
        if time.time() - self.opened_at > self.timeout:
            self.is_open = False
            self.failures = 0
            return True
        return False

    def wrap(self, func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if not self.can_attempt():
                remaining = self.timeout - (time.time() - self.opened_at)
                return f"Service unavailable: circuit breaker open. Will retry in {remaining:.0f}s."
            try:
                result = func(*args, **kwargs)
                self.record_success()
                return result
            except Exception as e:
                self.record_failure()
                return f"Tool failed: {e}"
        return wrapper

# Usage (as with the retry decorator, the breaker only counts calls that raise)
search_breaker = CircuitBreaker(threshold=5, timeout=60)
search_web_protected = search_breaker.wrap(search_web)
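Both with_retry and CircuitBreaker react only to raised exceptions, whereas tools such as search_web report failure by returning a JSON string with an "error" key. One possible bridge, sketched below, is a thin adapter that turns such a return value back into an exception. The names raise_on_error and ToolError are hypothetical, and fake_search is a stand-in so the example runs offline.

```python
import json
from functools import wraps


class ToolError(Exception):
    """Raised when a tool reports failure through its JSON return value."""


def raise_on_error(func):
    """Wrap a tool so that a returned {"error": ...} payload becomes an exception."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        try:
            parsed = json.loads(result)
        except (TypeError, ValueError):
            return result  # Not JSON: pass plain-text results through untouched
        if isinstance(parsed, dict) and "error" in parsed:
            raise ToolError(parsed["error"])
        return result
    return wrapper


def fake_search(query: str) -> str:
    """Stand-in tool: fails on an empty query, succeeds otherwise."""
    if not query:
        return json.dumps({"error": "Search timed out after 10s", "query": query})
    return json.dumps([{"title": "Result", "url": "example.com"}])


strict_search = raise_on_error(fake_search)
print(strict_search("python"))
try:
    strict_search("")
except ToolError as e:
    print(f"raised: {e}")
```

With this adapter in place, the wrapping order would be search_breaker.wrap(raise_on_error(search_web)), or retry_on=(ToolError,) for the retry decorator. The outermost wrapper still converts the final failure into a descriptive string for the model.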
Putting It All Together — The Tool Router
As you add more tools, you need a clean way to register them and dispatch tool calls. The Tool Router pattern gives you a single object that holds all tool definitions and handles execution, making your agent loop clean and your tools easy to add or remove.
class ToolRouter:
    """
    Central registry for all agent tools.
    Holds definitions for the API and dispatches calls to functions.
    """
    def __init__(self):
        self._tools: dict = {}

    def register(self, name: str, func, definition: dict):
        """Register a tool function with its Claude tool definition."""
        self._tools[name] = {"func": func, "definition": definition}
        return self  # Allows chaining: router.register(...).register(...)

    @property
    def definitions(self) -> list:
        """Get all tool definitions to pass to the API."""
        return [t["definition"] for t in self._tools.values()]

    def execute(self, name: str, inputs: dict) -> str:
        """Execute a tool by name with the given inputs."""
        if name not in self._tools:
            available = list(self._tools.keys())
            return f"Unknown tool: '{name}'. Available tools: {available}"
        try:
            return str(self._tools[name]["func"](**inputs))
        except TypeError as e:
            return f"Tool '{name}' received wrong arguments: {e}"
        except Exception as e:
            return f"Tool '{name}' failed unexpectedly: {e}"

    def __repr__(self):
        return f"ToolRouter({list(self._tools.keys())})"

# ─── Build the router ────────────────────────────────────────────────────────
# HTTP_GET_DEFINITION and SLACK_TOOL_DEFINITION are not shown in this guide;
# define them in the same shape as SEARCH_TOOL_DEFINITION before running this.
router = ToolRouter()
(router
    .register("search_web", search_web, SEARCH_TOOL_DEFINITION)
    .register("read_file", read_file, READ_FILE_DEFINITION)
    .register("write_file", write_file, WRITE_FILE_DEFINITION)
    .register("list_directory", list_directory, LIST_DIR_DEFINITION)
    .register("http_get", http_get, HTTP_GET_DEFINITION)
    .register("post_to_slack", post_to_slack, SLACK_TOOL_DEFINITION)
)

# ─── Clean agent loop using the router ───────────────────────────────────────
import anthropic

client = anthropic.Anthropic()

def run_agent(user_task: str, max_iterations: int = 10) -> str:
    """
    Agentic loop with tool routing. Hard cap on iterations prevents runaway cost.
    """
    messages = [{"role": "user", "content": user_task}]
    for iteration in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=router.definitions,
            messages=messages
        )
        if response.stop_reason == "end_turn":
            # Done: extract and return the final text
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text
            return "Agent completed with no text output."
        elif response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    print(f"[tool] {block.name}({block.input})")
                    result = router.execute(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
        else:
            return f"Unexpected stop reason: {response.stop_reason}"
    return f"Agent stopped: reached maximum {max_iterations} iterations."

# Run it
result = run_agent("Search for the latest Python release, then write a summary to summary.txt")
print(result)
With the Tool Router, retry logic, and the six tool implementations above, you have everything you need to build production-grade agents. Your next step is cost optimization — because agents with rich tool sets tend to be expensive to run. Head to the Cost Guide to learn how to keep your bill manageable.