Function Calling and the ReAct Loop: Designing Agents that Act Independently

LLMs are great at chatting, but they’re essentially goldfish. They lose context the moment the request ends, and they can’t touch your database or call an API on their own. I spent the last few weeks building agents that move beyond static responses, and the secret isn't just "prompt engineering". It’s mastering the ReAct (Reason + Act) loop combined with structured function calling.

Why Function Calling Changes Everything

If you’re still parsing LLM output with regex to trigger a script, stop. Modern models (like GPT-4o or Claude 3.5 Sonnet) support native tool definitions. You provide a JSON schema describing your functions, and the model returns a structured call (a function name plus JSON-encoded arguments) instead of free-form text you have to parse.
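To make the "structured call" concrete, here is a minimal sketch of one tool definition and how your code might consume the call. The get_row_count tool, its table parameter, and the fake counts are all hypothetical, and the call is hard-coded rather than fetched from an API, so the snippet runs offline.

```python
import json

# Hypothetical tool definition: the name, description, and "table"
# parameter are illustrative, not taken from a real system
get_row_count_tool = {
    "type": "function",
    "function": {
        "name": "get_row_count",
        "description": "Return the number of rows in one table",
        "parameters": {
            "type": "object",
            "properties": {
                "table": {"type": "string", "description": "Table name, e.g. 'users'"}
            },
            "required": ["table"],
        },
    },
}

def get_row_count(table):
    # Stand-in data; a real implementation would query the database
    fake_counts = {"users": 500, "orders": 1200}
    return fake_counts[table]

# The shape of a structured call: a function name plus a JSON-encoded
# string of arguments (hard-coded here, no API request is made)
example_call = {"name": "get_row_count", "arguments": '{"table": "users"}'}

args = json.loads(example_call["arguments"])
result = get_row_count(**args)
print(result)  # prints 500
```

The point is that nothing here needs regex: the arguments arrive as JSON, so json.loads plus keyword unpacking is enough to turn the model's request into a normal function call.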

The real power comes when you wrap this in a loop. Instead of expecting the model to solve a problem in one go, you give it the ability to pause, call a tool, ingest the output, and decide the next step.

The ReAct Loop Architecture

The ReAct pattern follows a simple, repeating cycle:

Thought: The model evaluates what it knows vs. what it needs.
Action: The model decides which tool to use.
Observation: Your code executes the tool and feeds the result back to the LLM.

If you don't implement a loop, your agent is a one-shot responder: it can ask for a tool but never gets to act on the result. With the loop, it’s an autonomous worker.
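The cycle is easier to see with the LLM taken out of the picture. Below is a rough sketch where a scripted function stands in for the model: scripted_model, its return format, and the history entries are all assumptions made up for illustration, not any provider's API.

```python
def scripted_model(history):
    """Hypothetical stand-in for an LLM: asks for a tool first, then answers."""
    observations = [h for h in history if h["type"] == "observation"]
    if not observations:
        return {"thought": "I need the row counts before I can answer.",
                "action": "get_database_stats"}
    return {"thought": "I have the counts, so I can answer.",
            "answer": f"Stats: {observations[-1]['content']}"}

def get_database_stats():
    # Stand-in tool with fixed output
    return "users=500, orders=1200"

TOOLS = {"get_database_stats": get_database_stats}

def react(question, max_steps=5):
    history = [{"type": "question", "content": question}]
    for _ in range(max_steps):
        # Thought: the model looks at the history and picks a next step
        step = scripted_model(history)
        history.append({"type": "thought", "content": step["thought"]})
        if "answer" in step:
            return step["answer"], history
        # Action: our code, not the model, executes the chosen tool
        observation = TOOLS[step["action"]]()
        # Observation: the result goes back into the history
        history.append({"type": "observation", "content": observation})
    raise RuntimeError("step limit reached")

answer, trace = react("How many users vs. orders?")
print(answer)  # prints: Stats: users=500, orders=1200
```

The trace ends up as question, thought, observation, thought, which is exactly the loop described above run once around.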

Implementing a Robust Agent Loop

Here is a simplified Python implementation using the standard OpenAI client pattern. This approach avoids heavy frameworks, giving you full control over the execution context.

import json
from openai import OpenAI

client = OpenAI()

# Define the tools the agent can use
tools = [{
    "type": "function",
    "function": {
        "name": "get_database_stats",
        "description": "Fetch current row counts for tables",
        "parameters": {"type": "object", "properties": {}}
    }
}]

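def get_database_stats():
    # Stub: in a real app this would query your database
    return "Table 'users': 500 rows, Table 'orders': 1200 rows"

# Map tool names from the schema to the Python callables that implement them
TOOL_REGISTRY = {"get_database_stats": get_database_stats}

def run_agent(user_query, max_iterations=10):
    messages = [{"role": "user", "content": user_query}]

    # Loop until the model gives a final answer or we hit the turn limit
    for _ in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools
        )

        msg = response.choices[0].message
        messages.append(msg)

        # No tool calls means the model has reached a conclusion
        if not msg.tool_calls:
            return msg.content

        for tool_call in msg.tool_calls:
            func = TOOL_REGISTRY.get(tool_call.function.name)
            if func is None:
                tool_result = f"Error: unknown tool '{tool_call.function.name}'"
            else:
                # Arguments arrive as a JSON string ("{}" when the tool takes none)
                args = json.loads(tool_call.function.arguments or "{}")
                tool_result = func(**args)

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": tool_result
            })

    raise RuntimeError(f"Agent did not finish within {max_iterations} iterations")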

# Usage
print(run_agent("How many users do we have compared to orders?"))

Architectural Trade-offs

When building these, I’ve found three areas where things usually break:

Context Window Bloat: Every loop iteration adds to the conversation history, and the full history is resent on every call. If your loop runs ten times, your prompt becomes massive. I recommend summarizing the "Thought" process once the history exceeds a token threshold you set.
Infinite Loops: An agent might get stuck calling the same tool repeatedly if the output isn't what it expected. Always enforce a max_iterations limit in your loop logic to prevent runaway API costs.
Tool Complexity: Don't give the model 50 tools. It struggles to choose the right one. Group tools by domain and only inject the relevant ones based on the initial user intent.
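Two of these fixes are small enough to sketch. Note the swap: summarizing needs another model call, so the sketch below uses plain truncation instead, and it counts characters as a rough proxy for tokens. It also assumes messages are plain dicts. The domain names and tool names in TOOLS_BY_DOMAIN are hypothetical.

```python
import json

def trim_history(messages, max_chars=8000):
    """Keep the first message plus the newest messages that fit in max_chars.

    Character count is a rough stand-in for tokens (an assumption of this sketch).
    """
    def size(message):
        return len(json.dumps(message))

    head, tail = messages[:1], messages[1:]
    budget = max_chars - sum(size(m) for m in head)
    kept = []
    for message in reversed(tail):  # walk from newest to oldest
        if size(message) > budget:
            break
        kept.append(message)
        budget -= size(message)
    kept.reverse()
    # A tool result only makes sense after the assistant message that
    # requested it, so drop leading tool results whose request was cut
    while kept and kept[0]["role"] == "tool":
        kept.pop(0)
    return head + kept

# Hypothetical domain grouping; the tool names are illustrative only
TOOLS_BY_DOMAIN = {
    "database": ["get_database_stats", "run_readonly_query"],
    "billing": ["get_invoice", "refund_payment"],
}

def select_tools(domain, all_tools):
    """Return only the tool definitions that belong to the given domain."""
    wanted = set(TOOLS_BY_DOMAIN.get(domain, []))
    return [t for t in all_tools if t["function"]["name"] in wanted]
```

In a loop like the one above, you would call trim_history on the message list before each request and pass select_tools(...) as the tools argument. How you detect the domain from the user's intent (keywords, a cheap classifier call) is a separate decision this sketch leaves open.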

Debugging Tips for Autonomous Agents

Debugging an agent is harder than debugging standard code because the "logic" is probabilistic.

Log the intermediate state: Don't just log the final answer. Log the tool_calls and the tool return values. If the agent gets the wrong answer, you need to see if it was because it used the wrong tool or because the tool returned bad data.
Deterministic Fallbacks: If an agent fails to parse a tool output twice, force a stop. Don't let it keep guessing.
Observability: I use tools like LangSmith or simple local SQLite logs to track the chain of thought. If you can’t see the "Thought" process, you’re flying blind.
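For the "simple local SQLite logs" option, something like the following is roughly what I mean. The table layout, the function names, and the FailureGuard helper are assumptions of this sketch; it uses only the standard library and an in-memory database so it runs anywhere.

```python
import json
import sqlite3
import time

def open_log(path=":memory:"):
    """Create (or open) a SQLite log; the table layout is just one possible choice."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS agent_steps (
               run_id TEXT, step INTEGER, tool_name TEXT,
               arguments TEXT, result TEXT, logged_at REAL
           )"""
    )
    return conn

def log_step(conn, run_id, step, tool_name, arguments, result):
    """Record one tool call together with what the tool returned."""
    conn.execute(
        "INSERT INTO agent_steps VALUES (?, ?, ?, ?, ?, ?)",
        (run_id, step, tool_name, json.dumps(arguments), str(result), time.time()),
    )
    conn.commit()

class FailureGuard:
    """Force a stop once the agent has failed max_failures times."""
    def __init__(self, max_failures=2):
        self.max_failures = max_failures
        self.failures = 0

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            raise RuntimeError("Too many failures, stopping the agent")

conn = open_log()
log_step(conn, "run-1", 1, "get_database_stats", {}, "Table 'users': 500 rows")
rows = conn.execute("SELECT step, tool_name, result FROM agent_steps").fetchall()
print(rows)  # prints [(1, 'get_database_stats', "Table 'users': 500 rows")]
```

With both the arguments and the result stored per step, a wrong final answer can be traced to either a bad tool choice or bad tool data with one SELECT. The guard covers the deterministic fallback: call record_failure() whenever a tool output can't be parsed, and the second failure raises instead of letting the agent keep guessing.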

The goal isn't to build an agent that does everything; it's to build one that knows exactly when to ask for help and when to execute. Keep your tool definitions tight, and always enforce a hard limit on the number of turns an agent can take.

Aditya Shenvi