AI Agents: Beginner to Advanced
AI agents are systems in which a model does more than answer a single question. It can reason in steps, call tools, gather information, and then return a final answer. This page starts with the basics and then moves on to intermediate and advanced implementation paths.
What Is an AI Agent?
An AI agent is a system where AI can understand a task, think about the next step, use a tool if needed, and then complete the task.
A normal chatbot mostly gives one direct answer. An AI agent can work step by step.
For example, if a user asks about the weather, an AI agent can call a weather API, read the live data, and answer with current conditions instead of guessing.
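The weather example can be sketched in plain Python to make the steps concrete. This is only an illustration under stated assumptions: `get_weather` is a hypothetical stand-in that returns canned data rather than calling a real API, and `understand_task` is a toy keyword parser standing in for the model's reasoning.

```python
from typing import Callable, Dict, Optional, Tuple

def get_weather(city: str) -> str:
    """Hypothetical tool: a real version would call a live weather API."""
    fake_data = {"paris": "18 degrees C and cloudy", "delhi": "34 degrees C and sunny"}
    return fake_data.get(city.lower(), "unavailable")

# Tools the agent is allowed to use, keyed by name.
TOOLS: Dict[str, Callable[[str], str]] = {"weather": get_weather}

def understand_task(question: str) -> Tuple[Optional[str], Optional[str]]:
    """Toy stand-in for the model: pick a tool name and its argument."""
    words = question.lower().rstrip("?").split()
    if "weather" in words and "in" in words:
        position = words.index("in") + 1
        if position < len(words):
            return "weather", words[position]
    return None, None

def run_agent(question: str) -> str:
    tool_name, argument = understand_task(question)  # understand the task
    if tool_name is None:
        return "I can answer that directly without a tool."
    result = TOOLS[tool_name](argument)              # use the tool, get the result
    return f"The weather in {argument.title()} is {result}."  # final answer

print(run_agent("What is the weather in Paris?"))
```

In a real agent the parsing step is done by the language model, but the control flow (decide, call, read, answer) stays the same.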
User Question
↓
AI Model
↓
Understand Task
↓
Use Tool / API
↓
Get Result
↓
Final Answer

Where AI Agents Are Used
AI agents are used in customer support, research assistants, coding tools, business automation, and task management systems.
They are useful when the task needs multiple steps instead of a single reply.
Simple Example Code
This example only shows the basic idea of sending a question to a model. In real agent systems, more tool logic is added.
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI()
question = "What is the capital of France?"
response = llm.predict(question)
print(response)

Intermediate: Tool-Calling Agents
A tool-calling agent decides, at runtime, which tool to use based on the user request. You define tools as Python functions and attach them to the agent.
The agent reads the user question, chooses a tool (like search, calculator, or an API), calls it, reads the result, and forms a final answer.
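The choose-call-read cycle can be shown without any framework. The sketch below is an assumption-laden illustration, not LangChain's internals: `register_tool`, `fake_model_decision`, and `dispatch` are hypothetical names, and the "model" is a hard-coded function that returns a JSON tool call the way a function-calling model might.

```python
import json
from typing import Callable, Dict

# Registry mapping tool names to plain Python functions.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {}

def register_tool(func: Callable[..., str]) -> Callable[..., str]:
    """Adds a function to the registry under its own name."""
    TOOL_REGISTRY[func.__name__] = func
    return func

@register_tool
def add_numbers(a: float, b: float) -> str:
    """Adds two numbers."""
    return str(a + b)

@register_tool
def shout(text: str) -> str:
    """Upper-cases the text."""
    return text.upper()

def fake_model_decision(question: str) -> str:
    """Hypothetical stand-in for the LLM: returns a JSON-encoded tool call."""
    if "add" in question.lower():
        return json.dumps({"tool": "add_numbers", "args": {"a": 2, "b": 3}})
    return json.dumps({"tool": "shout", "args": {"text": question}})

def dispatch(question: str) -> str:
    """Parses the model's tool call and runs the matching function."""
    call = json.loads(fake_model_decision(question))
    tool = TOOL_REGISTRY.get(call["tool"])
    if tool is None:
        return f"Unknown tool: {call['tool']}"
    return tool(**call["args"])

print(dispatch("Please add 2 and 3"))
```

Keeping the registry separate from the decision step means new tools can be added without touching the dispatch logic.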
This is the most common real-world agent pattern. Below is a working example using LangChain's tool decorator.
from langchain.chat_models import ChatOpenAI
from langchain.agents import tool, initialize_agent, AgentType

# Define a custom tool the agent can call
@tool
def get_word_count(text: str) -> str:
    """Returns the word count of the given text."""
    count = len(text.split())
    return f"The text has {count} words."

@tool
def reverse_text(text: str) -> str:
    """Reverses the given text."""
    return text[::-1]

# Create the language model
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Initialize the agent with tools
agent = initialize_agent(
    tools=[get_word_count, reverse_text],
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True
)

# Run the agent
result = agent.run("How many words are in 'The quick brown fox'?")
print(result)

Intermediate: Adding Memory to an Agent
Without memory, every agent call is stateless: the agent forgets the previous question. Memory lets the agent carry the conversation forward from one turn to the next.
There are two common memory types: ConversationBufferMemory (keeps all messages) and ConversationSummaryMemory (keeps a summary to save tokens).
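The trade-off between the two memory types can be sketched with two small classes. These are hypothetical illustrations, not LangChain's implementations: `BufferMemory` and `SummaryMemory` are made-up names, and the "summary" here is simple truncation, where a real system would ask a model to condense the history.

```python
from typing import List

class BufferMemory:
    """Keeps every message verbatim, like a full transcript."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def add(self, role: str, text: str) -> None:
        self.messages.append(f"{role}: {text}")

    def context(self) -> str:
        # Everything said so far is sent back to the model on each turn.
        return "\n".join(self.messages)

class SummaryMemory:
    """Keeps a bounded running summary instead of the full transcript."""

    def __init__(self, max_chars: int = 80) -> None:
        self.summary = ""
        self.max_chars = max_chars

    def add(self, role: str, text: str) -> None:
        # Placeholder summarizer: keep only the most recent characters.
        combined = f"{self.summary} {role}: {text}".strip()
        self.summary = combined[-self.max_chars:]

    def context(self) -> str:
        return self.summary

buffer = BufferMemory()
buffer.add("user", "My name is Arjun.")
buffer.add("assistant", "Nice to meet you, Arjun.")
print(buffer.context())
```

The buffer grows with every turn, while the summary stays within a fixed size at the cost of losing detail.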
Here is how to attach memory to a conversation chain.
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# Create the memory object
memory = ConversationBufferMemory()

# Create LLM
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Create conversation chain with memory
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

# First turn
response1 = conversation.predict(input="My name is Arjun.")
print("Turn 1:", response1)

# Second turn: the chain should remember the name
response2 = conversation.predict(input="What is my name?")
print("Turn 2:", response2)

Intermediate: Handling Tool Failures Gracefully
Tools can fail. An API might be down, a search might return no results, or a database query might time out. A production agent must handle these failures without crashing.
A common practice is to wrap the body of each tool in try/except and return a clear error string, so the agent can read the failure and decide what to do next.
This pattern prevents the whole agent from stopping just because one tool had an error.
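One way to avoid repeating try/except in every tool is a reusable decorator. The sketch below is a framework-free illustration; `safe_tool`, `flaky_lookup`, and `always_fails` are hypothetical names, and the retry count and delay are arbitrary assumptions rather than recommended values.

```python
import functools
import time
from typing import Callable

def safe_tool(retries: int = 2, delay_seconds: float = 0.0) -> Callable:
    """Decorator: retry on failure, then return an error string instead of raising."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        @functools.wraps(func)
        def wrapper(*args, **kwargs) -> str:
            last_error = None
            for attempt in range(retries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    last_error = exc
                    if attempt < retries:
                        time.sleep(delay_seconds)
            return (
                f"Tool '{func.__name__}' failed after "
                f"{retries + 1} attempts: {last_error}"
            )
        return wrapper
    return decorator

call_count = {"n": 0}

@safe_tool(retries=2)
def flaky_lookup(key: str) -> str:
    """Hypothetical tool that fails on its first two calls, then succeeds."""
    call_count["n"] += 1
    if call_count["n"] < 3:
        raise ConnectionError("API timeout")
    return f"value for {key}"

@safe_tool(retries=1)
def always_fails(key: str) -> str:
    """Hypothetical tool that never succeeds."""
    raise TimeoutError("database query timed out")
```

Calling `always_fails("x")` returns a readable error string instead of raising, so the agent loop keeps running and can choose another tool or report the problem to the user.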
import random

from langchain.agents import tool

@tool
def fetch_stock_price(ticker: str) -> str:
    """Fetches the current stock price for a given ticker symbol."""
    try:
        # Simulating an API call that might fail
        if random.random() < 0.3:
            raise ConnectionError("API timeout")
        price = round(random.uniform(100, 500), 2)
        return f"The price of {ticker} is ${price}"
    except ConnectionError as e:
        return f"Could not fetch price for {ticker}: {str(e)}. Please try again."
    except Exception as e:
        return f"Unexpected error: {str(e)}"

# Test the tool
print(fetch_stock_price.run("AAPL"))

Intermediate: Structuring Agent System Prompts
The system prompt is what tells the agent who it is and how it should behave. A weak system prompt produces inconsistent answers.
A good system prompt defines: the agent role, its goal, tone, constraints (what it should not do), and the output format expected.
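Those five parts can be kept as structured data and assembled into a prompt string, which makes it harder to forget one of them. This is a minimal sketch under assumptions: `PromptSpec` and `build_system_prompt` are hypothetical helpers, and the section labels are one possible layout, not a required format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PromptSpec:
    """The parts of a system prompt: role, goal, tone, constraints, output format."""
    role: str
    goal: str
    tone: str
    constraints: List[str] = field(default_factory=list)
    output_format: List[str] = field(default_factory=list)

def build_system_prompt(spec: PromptSpec) -> str:
    """Assembles the parts into a single prompt string, one section per part."""
    lines = [
        f"You are {spec.role}.",
        f"Goal: {spec.goal}",
        f"Tone: {spec.tone}",
        "Rules:",
        *[f"- {rule}" for rule in spec.constraints],
        "Output format:",
        *[f"- {item}" for item in spec.output_format],
    ]
    return "\n".join(lines)

spec = PromptSpec(
    role="an AI study assistant",
    goal="Answer student questions clearly and accurately",
    tone="Friendly and encouraging",
    constraints=["Do not provide answers to exam questions directly"],
    output_format=["Use bullet points for lists"],
)
print(build_system_prompt(spec))
```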
Below is a template system prompt pattern used in production agents.
from langchain.chat_models import ChatOpenAI
from langchain.schema import SystemMessage, HumanMessage
SYSTEM_PROMPT = """
You are an AI study assistant named StudyBot.
Your job is:
- Answer student questions clearly and accurately
- Give examples whenever possible
- Keep answers under 150 words
Rules:
- Do not provide answers to exam questions directly
- Always encourage the student to think first
- If unsure, say "I'm not certain, please verify this"
Output format:
- Use bullet points for lists
- Use plain language (no jargon unless explained)
"""
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.2)
messages = [
    SystemMessage(content=SYSTEM_PROMPT),
    HumanMessage(content="What is machine learning?"),
]
response = llm(messages)
print(response.content)

Advanced: Building a Multi-Agent System
In a multi-agent system, different agents handle different responsibilities. One agent might research, another might write, and a third might review quality.
Each agent has its own tools, memory, and instructions. A coordinator (also called an orchestrator) decides which agent to call.
This pattern is used in production systems where tasks are too complex for a single agent to handle reliably.
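The coordinator's decision step can be sketched with plain functions standing in for agents. Everything here is a simplified assumption: the three agents return canned strings instead of calling a model, and `choose_next` uses a fixed rule (run whichever role has not produced output yet), where a real orchestrator might ask a model to decide.

```python
from typing import Callable, Dict, Optional

def research_agent(task: str) -> str:
    """Hypothetical researcher: returns placeholder facts."""
    return f"facts about {task}"

def writer_agent(task: str) -> str:
    """Hypothetical writer: turns its input into a draft sentence."""
    return f"Draft based on {task}."

def reviewer_agent(task: str) -> str:
    """Hypothetical reviewer: a toy check that the draft ends with a period."""
    return "approved" if task.endswith(".") else "needs revision"

AGENTS: Dict[str, Callable[[str], str]] = {
    "research": research_agent,
    "writer": writer_agent,
    "reviewer": reviewer_agent,
}

def choose_next(state: Dict[str, str]) -> Optional[str]:
    """Orchestrator rule: pick the first role that has not run yet."""
    for name in ("research", "writer", "reviewer"):
        if name not in state:
            return name
    return None

def orchestrate(user_request: str) -> Dict[str, str]:
    """Runs agents one at a time, feeding each the previous agent's output."""
    state: Dict[str, str] = {}
    latest = user_request
    while True:
        name = choose_next(state)
        if name is None:
            break
        latest = AGENTS[name](latest)
        state[name] = latest
    return state

print(orchestrate("AI agents in education"))
```

Keeping each agent's output in a shared `state` dictionary gives the orchestrator a record it can inspect, log, or use to decide on a retry.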
            User Request
                 ↓
         Orchestrator Agent
          /      |       \
         ↓       ↓        ↓
    Research   Writer   Reviewer
     Agent     Agent     Agent
          \      |       /
           ↓     ↓      ↓
        Orchestrator merges
                 ↓
           Final Response

from langchain.chat_models import ChatOpenAI
from langchain.agents import tool, initialize_agent, AgentType

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# --- Research Agent ---
@tool
def research_topic(query: str) -> str:
    """Researches a topic and returns key facts."""
    # In real use, this would call a search API
    return f"Key facts about '{query}': It is widely used, growing fast, and important for modern AI."

research_agent = initialize_agent(
    tools=[research_topic],
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=False
)

# --- Writer Agent ---
@tool
def format_summary(facts: str) -> str:
    """Formats raw facts into a clean readable summary."""
    return f"Summary: {facts.strip()} This is a great area to explore further."

writer_agent = initialize_agent(
    tools=[format_summary],
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=False
)

# --- Orchestrator ---
# Runs the agents in sequence: research first, then writing.
# The reviewer agent from the diagram is omitted here to keep the example short.
def orchestrate(user_query: str):
    print("Step 1: Researching...")
    facts = research_agent.run(user_query)
    print("Step 2: Writing summary...")
    final = writer_agent.run(facts)
    print("\nFinal Output:")
    print(final)

orchestrate("AI agents in education")

Advanced: Agent Evaluation and Quality Scoring
Evaluation means checking if your agent is actually giving correct, relevant, and safe answers. Without evaluation, you are shipping code you cannot measure.
A basic evaluation pipeline has four steps: define test questions, run the agent, compare each output against expected criteria, and track scores over time.
You can use LangSmith's evaluation features or write your own grading functions.
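A hand-written grading function and a simple score history might look like the sketch below. It is an illustration under assumptions: `stub_agent` is a hypothetical stand-in that returns a fixed answer, keyword matching is a deliberately crude criterion, and the 0.05 regression tolerance is an arbitrary choice.

```python
from statistics import mean
from typing import Callable, Dict, List

def grade(response: str, required_terms: List[str]) -> float:
    """Fraction of required terms that appear in the response (case-insensitive)."""
    text = response.lower()
    found = sum(1 for term in required_terms if term.lower() in text)
    return found / len(required_terms)

def run_suite(agent: Callable[[str], str], cases: List[Dict]) -> float:
    """Runs every test case through the agent and returns the average score."""
    scores = [grade(agent(case["question"]), case["required_terms"]) for case in cases]
    return mean(scores)

def has_regressed(history: List[float], tolerance: float = 0.05) -> bool:
    """True when the latest run scores lower than the previous one by more than tolerance."""
    return len(history) >= 2 and history[-2] - history[-1] > tolerance

def stub_agent(question: str) -> str:
    """Hypothetical agent that always returns the same answer."""
    return "Supervised learning trains a model on labeled examples."

CASES = [
    {"question": "What is supervised learning?", "required_terms": ["label", "train"]},
    {"question": "What is overfitting?", "required_terms": ["memoriz", "generaliz"]},
]

history: List[float] = []
history.append(run_suite(stub_agent, CASES))
print(history, has_regressed(history))
```

Appending one score per run to `history` is the smallest form of tracking over time; storing it alongside a date or a prompt version makes regressions easier to trace.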
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Define test cases
test_cases = [
    {
        "question": "What is supervised learning?",
        "must_contain": ["label", "training", "predict"],
        "min_hits": 3,  # all three keywords are expected
    },
    {
        "question": "Name two Python AI libraries.",
        "must_contain": ["tensorflow", "pytorch", "scikit", "keras"],
        "min_hits": 2,  # any two of the listed libraries count as a full answer
    },
]

def evaluate_agent(test_cases):
    results = []
    for case in test_cases:
        response = llm([HumanMessage(content=case["question"])]).content.lower()
        # Check which expected keywords appear in the response
        hits = [kw for kw in case["must_contain"] if kw in response]
        # Score against the number of keywords required, capped at 100%
        score = min(len(hits) / case["min_hits"], 1.0)
        results.append({
            "question": case["question"],
            "score": round(score * 100),
            "found_keywords": hits,
            "response_preview": response[:100]
        })
    return results

results = evaluate_agent(test_cases)
for r in results:
    print(f"Q: {r['question']}")
    print(f" Score: {r['score']}% | Keywords found: {r['found_keywords']}")
    print()

Advanced: Adding Guardrails and Safety Checks
Guardrails prevent agents from responding to harmful, off-topic, or unsafe inputs. This is critical before deploying any agent to real users.
The simplest guardrail is a pre-check function that scans input for prohibited content before sending it to the model.
More advanced guardrails use a separate small model or rule set to classify intent, then block or redirect dangerous requests.
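A rule-set version of the classify-then-block-or-redirect idea can be sketched with regular expressions. The patterns, the `Verdict` enum, and the reply wording below are all illustrative assumptions; a production rule set would be broader and would usually be paired with a model-based classifier.

```python
import re
from enum import Enum
from typing import Callable

class Verdict(Enum):
    ALLOW = "allow"
    REDIRECT = "redirect"
    BLOCK = "block"

# Word-boundary patterns avoid false positives such as "hackathon".
BLOCK_PATTERNS = [r"\bhack(ing|s|ed)?\b", r"\bweapons?\b"]
REDIRECT_PATTERNS = [r"\bexam answers?\b", r"\bdo my homework\b"]

def classify_intent(user_input: str) -> Verdict:
    """Classifies input as blocked, redirected, or allowed using the rule set."""
    text = user_input.lower()
    if any(re.search(pattern, text) for pattern in BLOCK_PATTERNS):
        return Verdict.BLOCK
    if any(re.search(pattern, text) for pattern in REDIRECT_PATTERNS):
        return Verdict.REDIRECT
    return Verdict.ALLOW

def guarded_reply(user_input: str, answer_fn: Callable[[str], str]) -> str:
    """Calls answer_fn only when the input passes the guardrail."""
    verdict = classify_intent(user_input)
    if verdict is Verdict.BLOCK:
        return "I can't help with that request."
    if verdict is Verdict.REDIRECT:
        return "I can't give answers directly, but I can explain the underlying concept."
    return answer_fn(user_input)

print(guarded_reply("How do I hack a website?", lambda q: "model answer"))
```

Separating classification from the reply makes it possible to swap the regex rules for a small model later without changing the calling code.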
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# List of blocked topics for this education assistant
BLOCKED_KEYWORDS = ["hack", "exploit", "weapon", "illegal", "cheat exam"]

SAFE_SYSTEM_PROMPT = """
You are a student study assistant. Only answer questions about education,
study skills, science, math, history, and general knowledge.
If asked about anything else, politely decline and redirect.
"""

def input_guardrail(user_input: str) -> bool:
    """Returns True if input is safe, False if it should be blocked."""
    lower = user_input.lower()
    for keyword in BLOCKED_KEYWORDS:
        if keyword in lower:
            return False
    return True

def safe_agent_response(user_input: str) -> str:
    # Step 1: Pre-filter input
    if not input_guardrail(user_input):
        return "I'm only able to help with educational topics. Please ask something related to studying."
    # Step 2: Run agent with safe system prompt
    messages = [
        SystemMessage(content=SAFE_SYSTEM_PROMPT),
        HumanMessage(content=user_input)
    ]
    response = llm(messages)
    return response.content

# Test safe and unsafe inputs
print(safe_agent_response("Explain photosynthesis"))
print("---")
print(safe_agent_response("How do I hack a website?"))

Project Milestones by Level
Beginner Project: Build a single-tool agent that answers questions about one topic (e.g., a study assistant that looks up definitions).
Intermediate Project: Build a multi-tool agent that can search, calculate, and summarize, returns responses in a consistent format, and uses memory for multi-turn chat.
Advanced Project: Build a multi-agent pipeline where a researcher, writer, and reviewer each have defined roles, all connected by an orchestrator that tracks evaluation scores per session.
Frequently Asked Questions
What is the difference between a chatbot and an AI agent?
A chatbot usually gives one direct answer, while an AI agent can think in steps, use tools, and complete a task.
Do AI agents always use tools?
Not always, but many useful AI agents use tools like APIs, search, calculators, or databases.
How do I move from beginner to intermediate agent building?
Move on once you can build a clean single-agent workflow, explain each prompt step, and handle basic errors. Then add tool routing and memory in small increments.
What defines advanced AI agent work?
Advanced work focuses on reliability and scale: multi-agent coordination, guardrails, observability, evaluation metrics, and production debugging.