LLM Agents in Production: Function Calling, ReAct Pattern, and LangGraph Orchestration
Build production LLM agents: structured function calling with tool use, the ReAct reasoning pattern, LangGraph for multi-step agent workflows, and patterns for reliable agent behavior.
LLM agents extend language models from text generation to action-taking: an agent can search the web, query a database, call an API, and use the results to inform its next step. The difference between a useful agent and an unreliable one is the design of its tool definitions, the robustness of its error handling, and the orchestration framework controlling the loop.
This post covers production agent patterns: typed function/tool definitions, the ReAct (Reason-Act-Observe) pattern, multi-agent orchestration with LangGraph, and the failure modes that make agents unreliable in production.
Agent Architecture Overview
User query
    ↓
LLM reasons about the query (Thought)
    ↓
LLM decides to use a tool (Action)
    ↓
Tool executes, returns result (Observation)
    ↓
LLM reasons about the result (Thought)
    ↓
LLM generates final answer OR takes another action
    ↓
Final response to user
This loop (ReAct: Reason-Act-Observe) continues until the LLM produces a final answer or hits a max-step limit.
Function/Tool Calling with the Anthropic API
# src/agents/tools.py

# Define tools with precise schemas: the quality of your tool definitions
# directly determines agent reliability
tools = [
{
"name": "search_knowledge_base",
"description": (
"Search the internal knowledge base for documentation, FAQs, and support articles. "
"Use this when the user asks about product features, pricing, or troubleshooting. "
"Do NOT use for real-time data like order status or account balance."
),
"input_schema": {
"type": "object",
"properties": {
"query": {
"type": "string",
                "description": "The search query. Be specific: 'how to reset password', not 'password'",
},
"max_results": {
"type": "integer",
"description": "Maximum number of results to return (1-10)",
"default": 5,
},
},
"required": ["query"],
},
},
{
"name": "get_order_status",
"description": (
"Retrieve the current status of a customer's order. "
"Requires a valid order ID in format ORD-XXXXXX. "
"Only use when the user provides an order ID or asks about a specific order."
),
"input_schema": {
"type": "object",
"properties": {
"order_id": {
"type": "string",
"pattern": "^ORD-[A-Z0-9]{6}$",
"description": "Order ID in format ORD-XXXXXX",
},
"customer_email": {
"type": "string",
"format": "email",
"description": "Customer email for verification (optional but recommended)",
},
},
"required": ["order_id"],
},
},
{
"name": "create_support_ticket",
"description": (
"Create a support ticket for issues that cannot be resolved immediately. "
"Use as a LAST RESORT โ only after attempting to resolve via knowledge base. "
"Always summarize what was already tried before creating a ticket."
),
"input_schema": {
"type": "object",
"properties": {
"subject": {
"type": "string",
"description": "Brief, descriptive subject line (max 100 chars)",
"maxLength": 100,
},
"description": {
"type": "string",
"description": "Full problem description including what was already tried",
},
"priority": {
"type": "string",
"enum": ["low", "medium", "high", "urgent"],
"description": "urgent = service down; high = major feature broken; medium = workaround exists; low = question",
},
"customer_email": {"type": "string", "format": "email"},
},
"required": ["subject", "description", "priority", "customer_email"],
},
},
]
# Tool implementations
def search_knowledge_base(query: str, max_results: int = 5) -> list[dict]:
# In production: call your vector DB or search API
return [
        {"title": "How to reset your password", "content": "Go to Settings → Security → Reset Password...", "score": 0.95},
]
def get_order_status(order_id: str, customer_email: str | None = None) -> dict:
# In production: query your orders database
return {"order_id": order_id, "status": "shipped", "tracking": "1Z999AA10123456784"}
def create_support_ticket(subject: str, description: str, priority: str, customer_email: str) -> dict:
# In production: call your support system API (Zendesk, Linear, etc.)
return {"ticket_id": "TKT-78901", "status": "created", "eta": "24 hours"}
TOOL_REGISTRY = {
"search_knowledge_base": search_knowledge_base,
"get_order_status": get_order_status,
"create_support_ticket": create_support_ticket,
}
ReAct Agent Loop
# src/agents/react_agent.py
import anthropic
import json

from src.agents.tools import TOOL_REGISTRY, tools
def run_support_agent(
user_message: str,
customer_email: str,
max_iterations: int = 10,
) -> str:
"""ReAct agent that loops until it has a final answer or hits max iterations."""
client = anthropic.Anthropic()
messages = [{"role": "user", "content": user_message}]
system = f"""You are a helpful customer support agent for MyApp.
Customer email: {customer_email}
Guidelines:
- Search the knowledge base FIRST before creating support tickets
- Be concise but complete in your responses
- Always verify order IDs before querying them
- If you cannot resolve an issue, create a support ticket with full context
- Never make up information; only report what tools return"""
for iteration in range(max_iterations):
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=4096,
system=system,
tools=tools,
messages=messages,
)
# Add assistant response to conversation
messages.append({"role": "assistant", "content": response.content})
        # Agent is done when it stops for any reason other than a tool call
        if response.stop_reason != "tool_use":
            # Extract final text response
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text
            return "I was unable to complete your request."
# Process tool calls
tool_results = []
for block in response.content:
if block.type != "tool_use":
continue
tool_name = block.name
tool_input = block.input
try:
if tool_name not in TOOL_REGISTRY:
raise ValueError(f"Unknown tool: {tool_name}")
result = TOOL_REGISTRY[tool_name](**tool_input)
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": json.dumps(result),
})
except Exception as e:
# Report tool errors back to the agent โ let it decide how to proceed
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": json.dumps({"error": str(e), "tool": tool_name}),
"is_error": True,
})
# Add tool results to conversation
messages.append({"role": "user", "content": tool_results})
    return "I was unable to complete your request within the allowed steps. Please contact support directly or rephrase your request."
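One refinement worth calling out: each iteration appends an assistant turn plus tool results, so `messages` grows without bound on long tasks. A naive trimming helper can cap it between iterations. The window size here is an assumption to tune; a production version would summarize dropped turns rather than discard them, and must keep tool_use/tool_result pairs together:

```python
def trim_history(messages: list[dict], keep_recent: int = 6) -> list[dict]:
    """Keep the first message (the original user request) plus the last
    `keep_recent` turns. Naive sketch: dropping a tool_use turn while
    keeping its tool_result (or vice versa) breaks the conversation, so a
    real implementation trims only at turn-pair boundaries."""
    if len(messages) <= keep_recent + 1:
        return messages
    return [messages[0]] + messages[-keep_recent:]
```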
LangGraph: Multi-Step Agent Workflows
For complex workflows with conditional paths, parallel steps, and human-in-the-loop approval, LangGraph provides a stateful graph abstraction:
# src/agents/langgraph_agent.py
from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage
from typing import TypedDict, Annotated
import operator

from src.agents.tools import TOOL_REGISTRY, tools
class AgentState(TypedDict):
messages: Annotated[list, operator.add]
customer_email: str
ticket_created: bool
iteration_count: int
# Initialize LLM with tools bound
llm = ChatAnthropic(model="claude-sonnet-4-6")
llm_with_tools = llm.bind_tools(tools)
def agent_node(state: AgentState) -> AgentState:
"""Main reasoning node โ calls LLM with current conversation."""
system = SystemMessage(content=f"""You are a support agent. Customer: {state['customer_email']}
Search the knowledge base first. Create tickets only as last resort.""")
response = llm_with_tools.invoke([system] + state["messages"])
return {
"messages": [response],
"iteration_count": state["iteration_count"] + 1,
}
def should_continue(state: AgentState) -> str:
"""Routing function: continue loop or end."""
last_message = state["messages"][-1]
# Max iterations guard
if state["iteration_count"] >= 10:
return "end"
# No tool calls = agent is done
if not hasattr(last_message, "tool_calls") or not last_message.tool_calls:
return "end"
# Human approval required for ticket creation
for tool_call in last_message.tool_calls:
if tool_call["name"] == "create_support_ticket":
return "human_approval"
return "tools"
def human_approval_node(state: AgentState) -> AgentState:
"""
Pause for human review before creating tickets.
In production: send to queue, wait for webhook callback.
Here: simplified synchronous implementation.
"""
last_message = state["messages"][-1]
for tool_call in last_message.tool_calls:
if tool_call["name"] == "create_support_ticket":
# In production: interrupt the graph and wait
# For now: auto-approve (implement HITL per your requirements)
print(f"[HUMAN APPROVAL] Creating ticket: {tool_call['args']['subject']}")
return state
# Build the graph
graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.add_node("tools", ToolNode(list(TOOL_REGISTRY.values())))
graph.add_node("human_approval", human_approval_node)
graph.set_entry_point("agent")
graph.add_conditional_edges(
"agent",
should_continue,
{
"tools": "tools",
"human_approval": "human_approval",
"end": END,
},
)
graph.add_edge("tools", "agent")
graph.add_edge("human_approval", "tools") # After approval, execute the tool
app = graph.compile()
def run_agent(user_message: str, customer_email: str) -> str:
result = app.invoke({
"messages": [HumanMessage(content=user_message)],
"customer_email": customer_email,
"ticket_created": False,
"iteration_count": 0,
})
# Extract final text
for msg in reversed(result["messages"]):
if hasattr(msg, "content") and isinstance(msg.content, str):
return msg.content
return "Unable to process request."
Production Reliability Patterns
# src/agents/reliable_agent.py
import asyncio
import json
import logging
import re
from typing import Any

logger = logging.getLogger(__name__)

# 1. Tool timeouts: agents can hang waiting on slow APIs
async def with_timeout(coro, timeout_seconds: float = 10.0):
    try:
        return await asyncio.wait_for(coro, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        raise ValueError(f"Tool timed out after {timeout_seconds}s")

# 2. Tool output size limits: LLMs degrade when given too much context
def truncate_tool_output(result: Any, max_chars: int = 2000) -> str:
    text = result if isinstance(result, str) else json.dumps(result)
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + f"\n...[truncated, {len(text) - max_chars} chars omitted]"

# 3. Input validation: validate before calling tools
def validate_order_id(order_id: str) -> str:
    if not re.match(r"^ORD-[A-Z0-9]{6}$", order_id):
        raise ValueError(f"Invalid order ID format: {order_id}. Expected: ORD-XXXXXX")
    return order_id

# 4. Observability: trace every agent step
def trace_tool_call(tool_name: str, inputs: dict, result: Any, duration_ms: float):
    logger.info(
        "agent_tool_call",
        extra={
            "tool": tool_name,
            "inputs": inputs,
            "result_preview": str(result)[:200],
            "duration_ms": duration_ms,
        },
    )
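A fifth pattern worth adding to this list: retries with exponential backoff for transient tool failures. A sketch (the attempt count and delays are illustrative; production code should retry only on transient errors such as timeouts and 5xx responses, never on validation failures, and should add jitter):

```python
import time
from functools import wraps

def retry_tool(max_attempts: int = 3, base_delay: float = 0.5):
    """Retry a flaky tool call with exponential backoff: 0.5s, 1s, 2s, ...
    Re-raises the last exception once attempts are exhausted."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: surface the error to the agent
                    time.sleep(base_delay * (2 ** attempt))
        return wrapper
    return decorator
```

Wrapping a tool is one line: `TOOL_REGISTRY["get_order_status"] = retry_tool()(get_order_status)`.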
Agent Failure Modes and Mitigations
| Failure Mode | Symptom | Mitigation |
|---|---|---|
| Infinite loop | Agent calls tools repeatedly without converging | Max iterations (default: 10) |
| Tool hallucination | Agent invents data not in tool results | Require citations; validate facts |
| Context overflow | Agent loses track of earlier steps | Summarize old turns; limit history |
| Over-confident errors | Agent reports tool failure as success | Always check is_error field |
| Prompt injection | User input hijacks agent behavior | Separate system/user content; sanitize |
| Slow tools | Agent waits indefinitely on API calls | Tool timeouts + fallback behavior |
| Expensive loops | Many LLM calls for simple tasks | Cost monitoring; alert on high usage |
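The "expensive loops" row deserves concrete enforcement: accumulate token usage across iterations and abort the loop once a budget is exhausted. A minimal sketch (the budget number is illustrative; wire `record` to the usage fields your provider returns on each response):

```python
class CostGuard:
    """Track cumulative token usage across agent iterations and raise
    when a per-conversation budget is exceeded."""

    def __init__(self, max_tokens: int = 50_000):
        self.max_tokens = max_tokens
        self.used = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        # Call once per LLM response, before starting the next iteration
        self.used += input_tokens + output_tokens
        if self.used > self.max_tokens:
            raise RuntimeError(
                f"Agent token budget exceeded: {self.used} > {self.max_tokens}"
            )
```

Catching the `RuntimeError` at the loop boundary lets you return a graceful "handing off to a human" response instead of silently burning budget.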
Working With Viprasol
We build production LLM agent systems: from tool definition design through ReAct loop implementation, LangGraph orchestration, and reliability patterns.
What we deliver:
- Tool/function definition design with precise schemas
- ReAct agent implementation with error handling and max-iteration guards
- LangGraph workflow design for multi-step, conditional agent flows
- Human-in-the-loop approval integration for high-stakes actions
- Observability: agent step tracing, tool call logging, cost monitoring
→ Discuss your AI agent project: AI and machine learning services
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.