
Implementing AI Agents from Scratch using LangChain and OpenAI

Build AI agents from scratch with LangChain and OpenAI. From tools to agent loops — this guide covers it all with real code, best practices, and advanced tips.

10 min read

Author: Surjeet Singh, Software Engineer II — GeekyAnts
Original article: geekyants.com

AI agents within LangChain take a language model and tie it together with a set of tools to address larger, more complex tasks. Unlike a static chain of instructions, an agent dynamically decides at each step which action (tool) to take based on the conversation and intermediate results.

In practice, this means the agent prompts the LLM (the "reasoning engine") to formulate its next action, potentially invokes a tool (like a web search or calculator), and then feeds the result back into its reasoning as new information. The agent keeps looping until it arrives at a final answer.

What Is a LangChain Agent?

A LangChain agent is an LLM-based system that can decide actions dynamically — such as calling tools or answering directly. Unlike a chain (fixed steps), an agent reasons step-by-step, adapting based on context.

Why Use an Agent Instead of a Simple Chain?

If your task always follows a fixed sequence, a hard-coded chain suffices. If you want flexibility, letting the system choose among different tools or follow branching logic, you need an agent.

For instance, agents are useful for processes where you want the system to search the web, interact with a knowledge base, do something computational, and then summarize these steps in a single seamless process.

This guide covers the basics and then walks through building a LangChain agent step by step: the primary components (tools, LLMs, prompts), how the agent loop works, and best practices for creating more robust agents. The examples use the current LangChain API (2025), and the reader is expected to be familiar with Python and large language models (LLMs).

How It Works

  • Follows a loop: Think → Act → Observe → Repeat.

  • The LLM decides whether to act or respond.

  • Tool results are fed back as context for the next step.

Analogy: Like a detective solving a case:

  • Uses a notebook (scratchpad) for thoughts.

  • Chooses from a toolbox (APIs/functions).

  • Stops when confident in the answer.

Key Components of Agents

Tools

  • External functions or APIs with names and descriptions.

  • Used by the agent to perform tasks (e.g., search, math).

LLM

  • The decision-making model (e.g., GPT-4o-mini, Gemini 2.0).

  • Chooses the next action or gives the final answer based on tool outputs.

Prompt / Scratchpad

  • Guides the LLM on proper tool usage, enforces guardrails, and clearly differentiates between tools.

  • Stores previous actions and results for context.

Tools: Building Blocks for Actions

A tool is simply a Python function wrapped with metadata. For example, to make a calculator tool that evaluates arithmetic expressions:

from langchain.tools import Tool

def calculate_expression(expr: str) -> str:
    try:
        # Note: eval() runs arbitrary Python; fine for a demo, but sandbox
        # or whitelist operations before using this on untrusted input.
        result = eval(expr)
        return str(result)
    except Exception as e:
        return f"Error: {e}"

def return_dummy_weather(city: str) -> str:
    return f"The weather in {city} is cloudy"

calc_tool = Tool(
    name="Calculator",
    description="Performs simple arithmetic. Input should be a valid Python expression, e.g. '2+2'.",
    func=calculate_expression
)

# Dummy weather tool
weather_tool = Tool(
    name="WeatherSearch",
    description="Tells the current weather for a city. Input should be a city name as a string, e.g. 'Paris'.",
    func=return_dummy_weather
)

This calc_tool tells the agent that whenever it needs to do math, it can call the tool "Calculator" with an input string. The agent's prompt will include this tool's name and description. The description should be clear and specific — vagueness can confuse the agent, causing it to choose the wrong tool or misuse it.
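Since eval executes arbitrary Python, the calculator above should not be exposed to untrusted input. One possible safer variant (a sketch, not part of LangChain) walks the expression's AST and allows only basic arithmetic:

```python
import ast
import operator

# Sketch: a safer drop-in for the eval-based calculator. Only the listed
# arithmetic operators are permitted; anything else raises an error.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_calculate(expr: str) -> str:
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")

    try:
        return str(_eval(ast.parse(expr, mode="eval")))
    except Exception as e:
        return f"Error: {e}"
```

Because the walker rejects any node type it does not recognize, attempts to call functions or import modules come back as an error string instead of executing.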

LangChain comes with many built-in tools and tool wrappers. For example:

  • Web Search Tool — Interfaces like TavilySearchResults or GoogleSerperAPIWrapper let an agent search the web. (You'll need API keys for external search services.)

  • Retriever Tool — Wraps a vector database or document store. Use create_retriever_tool to expose it to the agent. This tool fetches relevant text snippets from your data given the query.

  • Custom API Tools — You can define your own tool that calls any API. For instance, a weather_tool calling a weather API, or a jira_tool that creates JIRA tickets.

When giving tools to an agent, put them in a list:

tools = [calc_tool, weather_tool, search_tool, retriever_tool, ...]

Note: Each tool should ideally perform a clear, atomic function. Complex or multi-step logic can confuse the agent. If needed, break tasks into simpler tools or chains and let the agent sequence them.
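To make the tool-selection idea concrete, here is a minimal sketch (plain Python, not LangChain API) of how an executor might route the tool name chosen by the LLM to the matching function; the registry and function names mirror the examples above:

```python
# Minimal sketch: routing an agent's chosen action to the matching tool.
# The registry maps the tool name the LLM emits to the Python callable.
def calculate_expression(expr: str) -> str:
    try:
        return str(eval(expr))  # demo only; avoid eval on untrusted input
    except Exception as e:
        return f"Error: {e}"

def return_dummy_weather(city: str) -> str:
    return f"The weather in {city} is cloudy"

TOOL_REGISTRY = {
    "Calculator": calculate_expression,
    "WeatherSearch": return_dummy_weather,
}

def run_tool(name: str, tool_input: str) -> str:
    if name not in TOOL_REGISTRY:
        return f"Unknown tool: {name}"
    return TOOL_REGISTRY[name](tool_input)
```

Returning an "Unknown tool" string (rather than raising) lets the agent see its mistake as an observation and pick a valid tool on the next step.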

Language Model: The Reasoning Engine

The LLM (often a chat model) processes prompts and generates next steps. In LangChain 2025, a common import is:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0.0)

You may also use ChatAnthropic, ChatGoogleGenerativeAI, etc. Setting temperature=0 (or low) can make the agent's decisions more deterministic, which is often desirable for tool use.

Prompt (Agent Scratchpad)

The agent prompt template defines how the LLM is instructed to behave. A common pattern (ReAct-style) includes:

  • System / Instruction — Explains to the assistant that it is an agent with certain tools.

  • Tool Descriptions — Lists each tool's name and description so the model knows what actions it can take.

  • Format Guide — Tells the model how to output its reasoning. You can also use libraries like Pydantic to get more precise and formatted JSON objects for tool calls.

Here's an example prompt based on our Calculator and Weather tools:

<Persona>
  You are a helpful, precise AI assistant capable of solving user queries using available tools.
  You can perform reasoning, fetch information, and carry out calculations when needed.
</Persona>

<Guardrails>
  - Only call a tool if it's required to answer the question.
  - Do not guess values or fabricate information.
  - Never perform code execution or arithmetic by yourself; use the Calculator tool for all such tasks.
</Guardrails>

<AvailableTools>
  <Tool>
    <Name>Calculator</Name>
    <Description>
      Performs simple arithmetic. Input must be a valid Python expression, such as '3 * (4 + 5)'.
      Use this tool only for basic math operations (e.g., +, -, *, /, parentheses).
    </Description>
    <Format>
      To use this tool, return:
      Action: Calculator
      Action Input: 2 + 2
    </Format>
  </Tool>

  <Tool>
    <Name>WeatherSearch</Name>
    <Description>
      Tells the current weather for a city. Input should be a city name as a string, e.g. 'Paris'.
    </Description>
    <Format>
      To use this tool, return:
      Action: WeatherSearch
      Action Input: Paris
    </Format>
  </Tool>
</AvailableTools>
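The prompt above asks the model to answer either with an `Action:` / `Action Input:` pair or with plain text. A minimal sketch of how such output might be parsed (the regex and return shape are assumptions, not LangChain's built-in parser):

```python
import re

# Sketch: parse the "Action: / Action Input:" format the prompt requests.
# Returns ("tool", name, input) for a tool call, or ("final", text, None)
# when no action line is present (i.e., a direct answer).
def parse_agent_output(text: str):
    match = re.search(
        r"Action:\s*(?P<name>.+?)\s*\n\s*Action Input:\s*(?P<input>.+)",
        text,
    )
    if match:
        return ("tool", match.group("name").strip(), match.group("input").strip())
    return ("final", text.strip(), None)
```

A parser like this is one of the most fragile parts of a hand-rolled agent; structured outputs (e.g., JSON tool calls validated with Pydantic, as mentioned above) are generally more robust.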

How the Agent Loop Works

Under the hood, an agent uses a loop to repeatedly query the LLM, parse its output, execute tools, and update context. Conceptually:

  1. Initial Input — The user's question is given to the agent (along with any system instructions).

  2. LLM Response — The agent prompts the LLM, which returns either a final answer or an action.

  3. Tool Invocation (if any) — If the output is an action, the agent executes the corresponding tool function with the provided input.

  4. Observe — The result from the tool (text, JSON, etc.) is captured and added to the scratchpad.

  5. Loop or End — The agent checks if the LLM signaled a final answer or if stopping criteria (max steps/time) are met. If not finished, it goes back to step 2, calling the LLM again with the new observations included.

  6. Return Answer — Once the agent decides it's done, it returns the final answer to the user.

This process is illustrated by the pseudocode below:

import json
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage

def process_with_tool_loop(user_input: str):
    MAX_ITERATIONS = 10
    current_iteration = 0

    messages = [
        SystemMessage(content="You are a helpful assistant with access to a calculator tool."),
        HumanMessage(content=user_input)
    ]

    while current_iteration < MAX_ITERATIONS:
        print(f"Iteration {current_iteration + 1}")
        response = llm.invoke(messages)

        # If the LLM did not request a function call, we have the final answer
        if not response.additional_kwargs.get("function_call"):
            print(f"Final answer: {response.content}")
            break

        function_call = response.additional_kwargs["function_call"]
        function_name = function_call["name"]
        args = json.loads(function_call["arguments"])

        # Execute the requested tool
        if function_name == "Calculator":
            tool_result = calculate_expression(args.get("expr", ""))
        elif function_name == "WeatherSearch":
            tool_result = return_dummy_weather(args.get("city", ""))
        else:
            tool_result = f"Unknown tool: {function_name}"

        # Add the function call and its result back into the conversation
        messages.append(response)
        messages.append(AIMessage(content=f"Function result: {tool_result}"))

        current_iteration += 1

    return response.content
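To exercise this control flow without an API key, the same loop can be run against a scripted stand-in for the LLM. Everything below is illustrative: FakeLLM and its dict-based responses are stand-ins, not real LangChain classes.

```python
import json

def calculate_expression(expr: str) -> str:
    try:
        return str(eval(expr))  # demo only; avoid eval on untrusted input
    except Exception as e:
        return f"Error: {e}"

class FakeLLM:
    """Scripted stand-in: first requests a Calculator call, then answers."""
    def __init__(self):
        self.calls = 0

    def invoke(self, messages):
        self.calls += 1
        if self.calls == 1:
            return {"content": "", "function_call": {
                "name": "Calculator",
                "arguments": json.dumps({"expr": "21 * 2"}),
            }}
        return {"content": "The result is 42.", "function_call": None}

def process_with_tool_loop(user_input: str, llm, max_iterations: int = 10) -> str:
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_iterations):
        response = llm.invoke(messages)
        call = response.get("function_call")
        if not call:  # no tool requested: this is the final answer
            return response["content"]
        args = json.loads(call["arguments"])
        if call["name"] == "Calculator":
            tool_result = calculate_expression(args.get("expr", ""))
        else:
            tool_result = f"Unknown tool: {call['name']}"
        # feed the observation back into the conversation for the next step
        messages.append({"role": "assistant", "content": f"Function result: {tool_result}"})
    return "Stopped: iteration limit reached."
```

Running `process_with_tool_loop("What is 21 * 2?", FakeLLM())` walks through one tool call and returns the scripted final answer, demonstrating the Think, Act, Observe cycle end to end.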

Managing History for the Conversation

When building AI chat systems, preserving conversation history is essential for providing contextual, coherent responses. The ConversationHistoryService handles this by transforming stored messages into LangChain-compatible message objects that the model can understand.

This formatting is especially important when using OpenAI LLM models, as LangChain standardizes the message structure (e.g., HumanMessage, AIMessage, ToolMessage) required for proper tool invocation and response handling.

Note: Different models may expect varying formats for tool calls and conversation history. For other LLMs such as Gemini, the history format may differ — especially when supporting agentic behavior — so the message transformation logic must be adapted to match each model's specific input requirements.

This system:

  • Handles multiple sender types (USER, AI, TOOL)

  • Ensures messages are properly ordered and valid according to OpenAI LLM (gpt-4o-mini) requirements

  • Constructs an array of LangChain messages starting with the system prompt

We must store the complete conversation history along with the tool call and tool response in the database. At every LLM call, fetch the history from the DB and format it according to LangChain/LLM requirements.

For example:

from langchain_core.messages import HumanMessage, AIMessage, ToolMessage

def convert_to_langchain_message(message, next_message=None):
    sender_type = message.get("sender_type")

    if sender_type == "TOOL":
        return ToolMessage(
            tool_call_id=message.get("tool_call_id"),
            name=message.get("content"),  # ideally the tool's name, if stored separately
            content=message.get("content")
        )
    elif sender_type == "USER":
        return HumanMessage(content=message.get("content"))
    else:  # Assume AI
        if next_message is None:
            return None
        if message.get("additional_metadata", {}).get("tool_calls") and next_message.get("sender_type") != "TOOL":
            return None
        return AIMessage(
            content=message.get("content"),
            additional_kwargs=message.get("additional_metadata", {})
        )

It loops through stored conversation messages and, based on the sender_type, converts each into the appropriate LangChain message:

  • TOOL → ToolMessage

  • USER → HumanMessage

  • Otherwise (typically AI) → AIMessage
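The ordering requirement noted above can be checked before the history is sent to the model. Here is a sketch using plain dicts, with field names (sender_type, additional_metadata) assumed to match the snippet above:

```python
# Sketch: validate stored history before sending it to the model.
# Rule enforced: an AI message announcing tool_calls must be immediately
# followed by a TOOL message carrying the corresponding result.
def history_is_valid(messages: list) -> bool:
    for i, msg in enumerate(messages):
        has_tool_calls = (
            msg.get("sender_type") == "AI"
            and (msg.get("additional_metadata") or {}).get("tool_calls")
        )
        if has_tool_calls:
            nxt = messages[i + 1] if i + 1 < len(messages) else None
            if nxt is None or nxt.get("sender_type") != "TOOL":
                return False
    return True
```

Dropping or repairing invalid runs like this up front avoids hard-to-debug API errors, since OpenAI-style endpoints reject a tool-call message that has no matching tool response.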

Best Practices & Advanced Tips

Building robust agents often requires attention to detail and thoughtful configuration. Here are some tips and advanced considerations:

1. Write Clear Tool Descriptions

The agent relies heavily on the text descriptions of tools. Make these descriptions concise and unambiguous. Specify input/output formats or any usage constraints. Poor descriptions can cause the agent to pick the wrong tool or misuse one entirely.

2. Zero-Shot vs Few-Shot Prompts

You can provide examples in the system prompt to guide the agent's reasoning (few-shot prompting), especially if the default behavior is incorrect. Give one or two example interactions showing how to use each tool.
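For example, a single worked interaction appended to the system prompt might look like this (the wording is illustrative, not a LangChain-provided template):

```python
# Illustrative few-shot example to append to the system prompt.
FEW_SHOT_EXAMPLE = """\
Example:
User: What is 15% of 80?
Thought: This is arithmetic, so I should use the Calculator tool.
Action: Calculator
Action Input: 0.15 * 80
Observation: 12.0
Thought: I now know the answer.
Final Answer: 15% of 80 is 12.
"""
```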

3. Control Temperature

Use a low temperature (e.g. 0.1 or 0.2) for agents to make consistent decisions. High randomness can lead to inconsistent tool use.

4. Set Iteration Limits

To avoid infinite loops, configure the agent executor's limits. LangChain's AgentExecutor has parameters max_iterations (default 10) and max_execution_time to halt runaway loops.

Conclusion

LangChain makes it surprisingly straightforward to build intelligent agents by combining LLM reasoning with tool usage. By defining clear tools and prompt instructions, you can create a system that handles multi-step questions and leverages external data or computation.

Remember that agents are powerful but also require careful crafting of prompts, descriptions, and limits to behave reliably. Whether you're building a QA chatbot that searches the web, an analytics assistant that processes databases, or any autonomous tool-based LLM system — understanding the agent loop and its components is key.

With the foundations in this guide, you can start designing your LangChain agents and explore more advanced topics like multi-agent coordination or integration with LangGraph for complex pipelines. Happy agent-building!


Stay tuned for an exciting deep dive into building AI Agents using Google Gemini!
In the next blog, we will explore how to leverage Gemini's powerful multimodal capabilities to create intelligent, tool-using agents that can reason, act, and adapt to complex tasks — all in real time.

