Building Your First AI Agent: A Step-by-Step Guide
In this tutorial, we'll walk through building a basic AI agent that can perceive inputs, decide on actions, and interact with external tools. We'll use Python and the LangChain framework to demonstrate how to wire together a prompt-driven decision loop, tool integrations, and context management.
What Is an AI Agent?
An AI agent is a program that:
- Perceives its environment (user inputs, API responses, files)
- Decides on an action based on its internal logic (prompts, chain-of-thought, planner)
- Acts by invoking tools or generating outputs
- Learns/Remembers by updating its context or memory store between steps
Such agents power applications like autonomous chatbots, data analysts, and code assistants.
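These four responsibilities map naturally onto a small program skeleton. The sketch below is purely illustrative (the class and method names are ours, not from any framework); the rest of this tutorial fills in each piece with LangChain components.

```python
# A bare-bones skeleton of the perceive -> decide -> act -> remember cycle.
# Names here are illustrative, not from LangChain.

class SimpleAgent:
    def __init__(self):
        self.memory: list[str] = []      # remembered context between steps

    def perceive(self, user_input: str) -> str:
        return user_input.strip()        # gather/clean the observation

    def decide(self, observation: str) -> str:
        # A real agent would call an LLM here; we hard-code a trivial rule.
        return "greet" if "hello" in observation.lower() else "echo"

    def act(self, action: str, observation: str) -> str:
        return "Hi there!" if action == "greet" else f"You said: {observation}"

    def remember(self, observation: str, response: str) -> None:
        self.memory.append(f"user: {observation} / agent: {response}")

    def step(self, user_input: str) -> str:
        observation = self.perceive(user_input)
        action = self.decide(observation)
        response = self.act(action, observation)
        self.remember(observation, response)
        return response

agent = SimpleAgent()
print(agent.step("Hello!"))
print(agent.step("What can you do?"))
```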
Key Components
- Language Model (LLM): provides the reasoning and language-generation capabilities (e.g., OpenAI's GPT-4)
- Prompt Template: defines how input and context are formatted for the LLM (see the short example after this list)
- Tool Interface: Python wrappers around APIs or local functions (e.g., search, calculator, file I/O)
- Memory/Context Store: persists conversation history or state (e.g., in Redis, SQLite, or in-memory)
- Agent Loop: orchestrates perception → decision → action steps until a termination condition is met
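Of these pieces, the prompt template is the easiest to see in isolation. As a quick illustration (assuming the classic `langchain.PromptTemplate` import, matching the import style used later), a template simply interpolates input and context into a string before it is sent to the LLM:

```python
from langchain import PromptTemplate

# A toy template: the agent's "decision prompt" is just formatted text
template = PromptTemplate(
    input_variables=["chat_history", "question"],
    template=(
        "You are a helpful agent.\n"
        "Conversation so far:\n{chat_history}\n"
        "User question: {question}\n"
        "Decide which tool to use, or answer directly."
    ),
)

print(template.format(chat_history="(empty)", question="What is 12 * 8?"))
```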
Step-by-Step Build
1. Install Dependencies
```bash
pip install langchain openai
```
2. Define Tools
Create simple tool functions—for example, a calculator and a weather lookup stub.
```python
# tools.py

def calculator(expression: str) -> str:
    try:
        result = eval(expression, {"__builtins__": {}})
        return str(result)
    except Exception as e:
        return f"Error: {e}"

def get_weather(location: str) -> str:
    # Placeholder for a real API call
    return f"Weather at {location}: Sunny, 72°F"
```
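Before wiring these into an agent, it's worth calling them directly. A quick sanity check (expected output shown in comments, assuming the stub above) could look like this:

```python
# quick_check.py -- verify the tools behave before handing them to the LLM
from tools import calculator, get_weather

print(calculator("12 * 8"))              # -> 96
print(calculator("__import__('os')"))    # -> Error: ... (builtins are disabled)
print(get_weather("Tokyo"))              # -> Weather at Tokyo: Sunny, 72°F
```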
3. Set Up the Agent with LangChain
```python
# agent.py
from langchain import OpenAI
from langchain.agents import Tool, initialize_agent
from tools import calculator, get_weather

# 1. Initialize the LLM
llm = OpenAI(model_name="gpt-4", temperature=0.3)

# 2. Wrap our functions as LangChain Tools
tools = [
    Tool(name="Calculator", func=calculator, description="Useful for math expressions"),
    Tool(name="Weather", func=get_weather, description="Gets weather by city name"),
]

# 3. Define the Agent
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent="zero-shot-react-description",
    verbose=True
)

def run_agent(user_input: str):
    return agent.run(user_input)

if __name__ == "__main__":
    print(run_agent("What is the result of 12 * 8?"))
    print(run_agent("What's the weather in Tokyo?"))
```
This code will:
- Load GPT-4 with a fixed prompt strategy (zero-shot-react-description)
- Wrap our Python functions so the LLM can call them as "tools"
- Execute a ReAct-style loop: Thought → Action → Observation, repeated until the model emits a final answer (a simplified version of this loop is sketched below)
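To make the loop concrete, here is a deliberately simplified, hand-rolled version of what `initialize_agent` does for you. The prompt wording, the `Action:`/`Action Input:` parsing, and the `MAX_STEPS` limit are illustrative assumptions, not LangChain's actual implementation.

```python
# react_sketch.py -- illustrative only; not LangChain's real agent executor
from tools import calculator, get_weather

TOOLS = {"Calculator": calculator, "Weather": get_weather}
MAX_STEPS = 5  # assumed safety limit to avoid infinite loops

def react_loop(llm, question: str) -> str:
    """Run a minimal Thought -> Action -> Observation loop.

    `llm` is any callable that maps a prompt string to a completion string,
    e.g. the LangChain OpenAI object from agent.py.
    """
    transcript = f"Question: {question}\n"
    for _ in range(MAX_STEPS):
        # 1. Reason: ask the model what to do next
        completion = llm(
            transcript
            + "Respond with either:\n"
            + "Action: <Calculator|Weather>\nAction Input: <input>\n"
            + "or\nFinal Answer: <answer>\n"
        )
        transcript += completion + "\n"

        # 2. Terminate if the model produced a final answer
        if "Final Answer:" in completion:
            return completion.split("Final Answer:")[-1].strip()

        # 3. Act: parse the chosen tool and call it
        if "Action:" in completion and "Action Input:" in completion:
            tool_name = completion.split("Action:")[1].split("\n")[0].strip()
            tool_input = completion.split("Action Input:")[1].split("\n")[0].strip()
            observation = TOOLS.get(tool_name, lambda x: "Unknown tool")(tool_input)

            # 4. Observe: feed the tool result back into the context
            transcript += f"Observation: {observation}\n"
    return "Stopped after reaching the step limit."
```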
4. Managing Memory
For multi-turn interactions, integrate a simple memory module.
```python
# memory.py
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# In agent initialization:
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent="conversational-react-description",
    memory=memory,
    verbose=True
)
```
Now the agent will include past exchanges in its prompt, preserving context across turns.
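If you want to confirm what the agent will see on the next turn, `ConversationBufferMemory` exposes its stored messages via `load_memory_variables`. A quick check might look like this (the printed content depends on whatever exchanges you have run):

```python
# Inspect what the memory currently holds (run after a few agent calls)
history = memory.load_memory_variables({})["chat_history"]
for message in history:
    # Each entry is a message object because return_messages=True
    print(type(message).__name__, ":", message.content)
```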
5. Testing Your Agent
Run interactive sessions to verify behavior:
```
python -i agent.py
>>> run_agent("Hi, who are you?")
>>> run_agent("Can you do 15 + 27 for me?")
>>> run_agent("Now remind me what we talked about earlier.")
```
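If you prefer a self-contained chat loop over the interactive interpreter, a small wrapper script (ours, not part of LangChain) works just as well:

```python
# chat.py -- minimal REPL around the agent (illustrative)
from agent import run_agent

if __name__ == "__main__":
    print("Type 'quit' to exit.")
    while True:
        user_input = input("> ")
        if user_input.strip().lower() in {"quit", "exit"}:
            break
        try:
            print(run_agent(user_input))
        except Exception as e:
            # Surface tool/LLM errors without killing the session
            print(f"[error] {e}")
```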
Further Reading & References
- A Practical Guide to Building Agents
- LangChain Agents Documentation
- "ReAct: Synergizing Rational Thought and Acting"
- OpenAI GPT-4 API Reference
By following these steps, you now have a working foundation for an AI agent. From here, you can extend its capabilities: integrate databases, add retrieval-augmented generation, or train custom decision policies. Happy building!
Cheers!
Yijie :)