
LangChain: Beginner to Advanced

LangChain is a framework that helps developers build AI applications using prompts, language models, tools, and memory. This page covers beginner basics and then expands into intermediate and advanced usage.

Best for

Readers who want a practical, role-based learning guide with clear progression from fundamentals to advanced implementation.

Not ideal for

Visitors looking for a short definition page without examples, sections, or a guided learning path.

What Is LangChain?

LangChain helps connect prompts, AI models, tools, databases, and memory in one workflow.

Instead of hand-writing the glue code between these pieces, developers can assemble an application from LangChain's ready-made components.

It is useful when you want to create chatbots, assistants, document tools, or workflow-based AI apps.

User Question
      ↓
    Prompt
      ↓
     LLM
      ↓
 Tool / Memory / Database
      ↓
  Final Answer

Why LangChain Is Useful

It gives prompts, model calls, and output parsing a shared interface, so each step can be swapped or tested on its own.

It also makes it easier to add tools, memory, and structured chains to your app.

Simple LangChain Example

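This example shows a small prompt chain that explains a topic in simple language. It uses the older LLMChain class and the langchain.* import paths. Recent LangChain releases ship these classes in separate packages (langchain-openai, langchain-core, langchain-community), so adjust the imports on this page to match your installed version.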

from langchain.prompts import PromptTemplate
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} in simple language."
)

llm = ChatOpenAI()
chain = LLMChain(llm=llm, prompt=prompt)

result = chain.run("Artificial Intelligence")
print(result)
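
To make the template step concrete, the sketch below reproduces what happens to the prompt before the model is called, using only Python's built-in string formatting. The helper name render_prompt is hypothetical and not part of LangChain; it only illustrates how the {topic} placeholder is filled.

```python
# Plain-Python illustration of template filling (not LangChain code).
TEMPLATE = "Explain {topic} in simple language."


def render_prompt(template: str, **variables: str) -> str:
    """Fill named placeholders; raises KeyError if a variable is missing."""
    return template.format(**variables)


prompt_text = render_prompt(TEMPLATE, topic="Artificial Intelligence")
print(prompt_text)
```

The rendered string is what the model receives, which is why a missing or misspelled variable name fails before any API call is made.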

Intermediate: LCEL — LangChain Expression Language

LCEL is the modern way to build chains in LangChain. It uses the pipe operator | to chain components together cleanly.

The basic pattern is: prompt | model | output_parser. Each step transforms data and passes it to the next.

LCEL chains support streaming, batching, and async calls through methods such as stream, batch, and ainvoke, all of which matter for latency and throughput in production.

from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langchain.schema.output_parser import StrOutputParser

# Step 1: Define a prompt template
prompt = ChatPromptTemplate.from_template(
    "Explain {topic} in 3 bullet points for a beginner."
)

# Step 2: Create the LLM
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Step 3: Create an output parser
parser = StrOutputParser()

# Step 4: Build the chain using LCEL pipe syntax
chain = prompt | llm | parser

# Step 5: Run the chain
result = chain.invoke({"topic": "neural networks"})
print(result)

# You can also stream results token by token:
# for chunk in chain.stream({"topic": "neural networks"}):
#     print(chunk, end="", flush=True)
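
To show what the pipe syntax expresses, here is a small library-free sketch of pipe-style composition. It is an illustration of the idea, not LangChain's implementation: the Step class and fake_model are hypothetical names, and the model is replaced by a function that echoes its input.

```python
# Illustration only: a tiny stand-in for pipe-style composition.
# Step and fake_model are hypothetical names, not LangChain APIs.
from typing import Any, Callable


class Step:
    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # a | b returns a new Step that runs a, then feeds its result to b
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)


prompt = Step(lambda d: f"Explain {d['topic']} in 3 bullet points for a beginner.")
fake_model = Step(lambda text: {"content": f"[model reply to: {text}]"})
parser = Step(lambda message: message["content"])

chain = prompt | fake_model | parser
print(chain.invoke({"topic": "neural networks"}))
```

Each step takes the previous step's output as its only input, so any step can be replaced without touching the others as long as the input and output shapes stay the same.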

Intermediate: Retrieval-Augmented Generation (RAG)

RAG retrieves relevant passages from your own documents at query time and passes them to the model, so answers are grounded in that data rather than only in what the model learned during training.

The flow: load your documents → split them into chunks → embed chunks into a vector store → retrieve relevant chunks per question → send chunks + question to the model.

This is one of the most widely used LangChain patterns in real applications.

Your Documents (PDF, text)
         ↓
   Text Splitter
         ↓
   Embeddings Model
         ↓
   Vector Store (FAISS)
         ↓ (on user query)
   Similarity Search
         ↓
   Relevant Chunks
         ↓
  Prompt + LLM Answer
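
The splitting step in the flow above can be sketched without any library. This is a simplified fixed-window splitter, assuming plain character counts; it is not the separator-aware algorithm that RecursiveCharacterTextSplitter uses, and the function name split_with_overlap is hypothetical. It shows how chunk size and overlap interact.

```python
# Simplified character-window splitter (illustration, not LangChain's algorithm).
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


chunks = split_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.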

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

# Step 1: Load your document
loader = TextLoader("my_study_notes.txt")
documents = loader.load()

# Step 2: Split into small chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)

# Step 3: Create embeddings and index them in an in-memory FAISS vector store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(chunks, embeddings)

# Step 4: Create a retriever from the vector store
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# Step 5: Build the RAG chain
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True
)

# Step 6: Ask a question
result = qa_chain.invoke({"query": "What are the main topics covered in chapter 2?"})
print("Answer:", result["result"])
print("Sources:", [doc.metadata for doc in result["source_documents"]])
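
The similarity search behind the retriever can be illustrated with a toy example. The sketch below assumes word counts as a stand-in for embeddings and ranks chunks by cosine similarity; real embedding models produce dense vectors, and the function names embed, cosine, and top_k are hypothetical.

```python
# Toy retrieval: word-count vectors ranked by cosine similarity (illustration only).
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 3) -> list[str]:
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "mitosis is a type of cell division",
    "newton described the laws of motion",
    "cell membranes control what enters the cell",
]
best = top_k("how does cell division work", chunks, k=1)
print(best)
```

The k parameter plays the same role as search_kwargs={"k": 3} in the retriever above: it caps how many chunks are passed to the model.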

Intermediate: Memory Types in LangChain

Memory allows your chain to remember past turns in a conversation. LangChain offers several memory types depending on your use case.

ConversationBufferMemory keeps every message. ConversationSummaryMemory keeps a running summary written by the LLM, which saves tokens in long chats. ConversationBufferWindowMemory keeps only the last k exchanges.

Use buffer memory for short sessions. Use window memory when only recent context matters. Use summary memory for long sessions where token costs matter.

from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import (
    ConversationBufferMemory,
    ConversationSummaryMemory,
    ConversationBufferWindowMemory
)

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Option 1: Buffer Memory — stores all messages
buffer_memory = ConversationBufferMemory()

# Option 2: Window Memory — stores only last 3 exchanges
window_memory = ConversationBufferWindowMemory(k=3)

# Option 3: Summary Memory — keeps a running summary (good for long chats)
summary_memory = ConversationSummaryMemory(llm=llm)

# Build a conversation chain with window memory
chain = ConversationChain(llm=llm, memory=window_memory, verbose=True)

# Simulate a multi-turn conversation
chain.predict(input="I am studying for a biology exam.")
chain.predict(input="Tell me about cell division.")
chain.predict(input="What did I say I was studying?")  # Still within the 3-exchange window, so the model can answer

# Print current memory state
print(chain.memory.load_memory_variables({}))
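
The window behaviour can be sketched with a bounded queue from the standard library. This is an illustration of the idea rather than LangChain's implementation; the class name WindowMemory and its methods are hypothetical.

```python
# Illustration of window memory using a bounded deque (not LangChain code).
from collections import deque


class WindowMemory:
    """Keeps only the last k exchanges (one exchange = user + assistant message)."""

    def __init__(self, k: int):
        self.exchanges = deque(maxlen=k)  # older exchanges are dropped automatically

    def save(self, user: str, assistant: str) -> None:
        self.exchanges.append((user, assistant))

    def as_prompt_history(self) -> str:
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.exchanges)


memory = WindowMemory(k=2)
memory.save("I am studying for a biology exam.", "Good luck!")
memory.save("Tell me about cell division.", "Cells divide by mitosis or meiosis.")
memory.save("What is DNA?", "DNA stores genetic information.")
print(memory.as_prompt_history())
```

With k=2 the first exchange has already been dropped, so a later question about the exam could no longer be answered from memory. That trade-off is the reason to choose k with the longest expected back-reference in mind.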

Intermediate: Custom Tools and Structured Outputs

Custom tools allow your chain to call any Python function, including APIs, databases, or calculations. Structured output parsing validates the model's reply against a schema and converts it into typed data, raising an error when the reply does not match.

Using Pydantic models with structured output parsers makes downstream processing more reliable and avoids ad hoc string parsing.

This pattern is essential for apps where the chain output feeds into other systems.

from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field
from typing import List

# Define the expected output shape using Pydantic
class StudyPlan(BaseModel):
    subject: str = Field(description="The subject being studied")
    topics: List[str] = Field(description="List of key topics to cover")
    daily_hours: int = Field(description="Recommended hours per day")
    weeks_needed: int = Field(description="Estimated weeks to complete")

# Create a parser for this model
parser = PydanticOutputParser(pydantic_object=StudyPlan)

# Build a prompt that includes format instructions
prompt = ChatPromptTemplate.from_template(
    """Create a structured study plan for: {subject}
    
    {format_instructions}"""
)

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Build the chain
chain = prompt | llm | parser

# Run with format instructions
result = chain.invoke({
    "subject": "Python programming for beginners",
    "format_instructions": parser.get_format_instructions()
})

# Access structured fields
print(f"Subject: {result.subject}")
print(f"Topics: {result.topics}")
print(f"Daily hours: {result.daily_hours}")
print(f"Weeks needed: {result.weeks_needed}")
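
Custom tools follow the same principle of named, typed inputs. As a rough, library-free sketch, the code below registers a plain Python function under its name and dispatches a JSON request to it. The names register_tool, TOOLS, and run_tool_call, and the request shape with "tool" and "args" keys, are assumptions made for illustration; LangChain's own tool API differs.

```python
# Illustration of tool registration and dispatch (hypothetical names, not LangChain's API).
import json
from typing import Callable

TOOLS: dict[str, Callable[..., object]] = {}


def register_tool(fn):
    """Decorator that exposes a plain function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@register_tool
def add_numbers(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b


def run_tool_call(raw: str) -> object:
    """Dispatch a JSON request shaped like {"tool": ..., "args": {...}}."""
    call = json.loads(raw)
    name = call["tool"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["args"])


result = run_tool_call('{"tool": "add_numbers", "args": {"a": 2, "b": 3}}')
print(result)
```

Rejecting unknown tool names explicitly matters because the request text comes from a model and cannot be assumed to be well-formed.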

Advanced: Production Chain Architecture

A production LangChain system separates concerns: prompt definitions, chain logic, tool definitions, memory management, and the API layer each live in their own module.

This makes testing, updating, and debugging much easier. You can update a prompt without touching the chain logic.

Below is one reasonable folder structure for a production LangChain project.

project/
  ├── chains/
  │     ├── qa_chain.py       ← RAG chain definition
  │     ├── summary_chain.py  ← summarization chain
  │     └── base.py           ← shared LLM config
  ├── prompts/
  │     ├── qa_prompts.py     ← prompt templates
  │     └── system_prompts.py ← system messages
  ├── tools/
  │     ├── search_tool.py    ← search API wrapper
  │     └── calc_tool.py      ← calculator tool
  ├── memory/
  │     └── session_memory.py ← memory factory
  ├── evaluation/
  │     └── run_eval.py       ← evaluation runner
  └── api/
        └── main.py           ← FastAPI endpoint

# chains/base.py — shared LLM config for all chains
from langchain.chat_models import ChatOpenAI
from functools import lru_cache

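@lru_cache(maxsize=1)
def get_llm(model: str = "gpt-3.5-turbo", temperature: float = 0.0):
    """Cached LLM shared across chains.

    With maxsize=1 only the most recent (model, temperature) pair is kept,
    so call this with the same arguments everywhere to reuse one instance.
    """
    return ChatOpenAI(model=model, temperature=temperature)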

# ---

# chains/qa_chain.py — RAG Q&A chain
from langchain.chains import RetrievalQA
from .base import get_llm

def build_qa_chain(retriever):
    return RetrievalQA.from_chain_type(
        llm=get_llm(),
        retriever=retriever,
        return_source_documents=True
    )

# ---

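# api/main.py: FastAPI wrapper
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from chains.qa_chain import build_qa_chain
# Assumption: get_retriever is a project helper that returns the vector store
# retriever. It is not defined on this page and its module path is hypothetical.
from chains.retrievers import get_retriever

app = FastAPI()

class QueryRequest(BaseModel):
    question: str
    session_id: str  # not used below; available for per-session memory

@app.post("/ask")
async def ask(request: QueryRequest):
    try:
        chain = build_qa_chain(retriever=get_retriever())
        # Await the async API so the event loop is not blocked by the model call
        result = await chain.ainvoke({"query": request.question})
        return {"answer": result["result"]}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))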
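
The caching behaviour of get_llm in chains/base.py is easy to misread, so here is a small sketch that runs without an API key. FakeLLM is a hypothetical stand-in for the model client; the decorator and its arguments are the same as in the original.

```python
# Demonstrates lru_cache(maxsize=1) on a factory function, using a stand-in client.
from functools import lru_cache


class FakeLLM:
    """Hypothetical stand-in for a model client."""

    def __init__(self, model: str, temperature: float):
        self.model = model
        self.temperature = temperature


@lru_cache(maxsize=1)
def get_llm(model: str = "gpt-3.5-turbo", temperature: float = 0.0) -> FakeLLM:
    return FakeLLM(model, temperature)


first = get_llm()
second = get_llm()
print(first is second)   # True: same arguments, cached instance is reused

other = get_llm(model="gpt-4")   # different arguments evict the only cache slot
third = get_llm()
print(first is third)    # False: a new instance was built after the eviction
```

If several model configurations are used side by side, a larger maxsize keeps one instance per configuration instead of rebuilding clients on every switch.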

Advanced: Evaluating LangChain Outputs

Evaluation means running your chain against test cases and scoring how well it performs. LangChain provides the langchain.evaluation module for this.

You can evaluate factual correctness, relevance to the question, conciseness, and safety. Each criterion is judged separately by an LLM, which returns a score together with its reasoning.

Run evaluations before every deployment to catch regressions when prompts or models are updated.

from langchain.chat_models import ChatOpenAI
from langchain.evaluation import load_evaluator
from langchain.evaluation.schema import EvaluatorType

llm = ChatOpenAI(model="gpt-4", temperature=0)

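# Load a criteria-based evaluator. The labeled variant is used because the
# test cases below pass a reference answer; the plain CRITERIA evaluator
# does not use references.
evaluator = load_evaluator(
    EvaluatorType.LABELED_CRITERIA,
    llm=llm,
    criteria="correctness"
)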

# Define test cases
test_cases = [
    {
        "input": "What is photosynthesis?",
        "output": "Photosynthesis is the process by which plants use sunlight, water, and CO2 to produce glucose and oxygen.",
        "reference": "Photosynthesis converts light energy into chemical energy in plants."
    },
    {
        "input": "What is Newton's second law?",
        "output": "Force equals mass times acceleration (F = ma).",
        "reference": "Newton's second law states F = ma."
    }
]

# Run evaluation on each test case
for case in test_cases:
    result = evaluator.evaluate_strings(
        prediction=case["output"],
        input=case["input"],
        reference=case["reference"]
    )
    print(f"Q: {case['input']}")
    print(f"   Score: {result['score']} | Reason: {result['reasoning'][:80]}")
    print()
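
Running evaluations before each deployment usually means turning the scores into a pass or fail decision. The sketch below assumes binary scores (1 when the criterion is met, 0 otherwise) and a 0.9 threshold; the function names and the threshold are hypothetical choices, not LangChain features.

```python
# Illustrative deployment gate built on evaluator scores (hypothetical helper names).
def pass_rate(scores: list[int]) -> float:
    """Share of test cases whose score is 1 (criterion met)."""
    return sum(scores) / len(scores) if scores else 0.0


def deployment_exit_code(scores: list[int], threshold: float = 0.9) -> int:
    """Return 0 when the pass rate meets the threshold, 1 otherwise."""
    return 0 if pass_rate(scores) >= threshold else 1


scores = [1, 1, 0, 1]  # hypothetical evaluator scores
print(f"pass rate: {pass_rate(scores):.2f}")
print("exit code:", deployment_exit_code(scores))
```

Returning a non-zero exit code lets a CI pipeline stop the deployment when a prompt or model change lowers the pass rate.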

Advanced: Async Chains for Performance

A synchronous chain call blocks until the model responds, so a single worker handles one request at a time. For high-traffic applications, async chains let one worker handle many requests concurrently.

LangChain exposes async counterparts of its methods: ainvoke(), abatch(), and astream() on LCEL chains, plus arun() on legacy chains. Combine them with Python asyncio and FastAPI for an async path from request to response.

Async matters most when your chain calls several tools or APIs that do not depend on each other: they can run concurrently instead of one after another.

import asyncio
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langchain.schema.output_parser import StrOutputParser

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
parser = StrOutputParser()

explain_prompt = ChatPromptTemplate.from_template(
    "Explain {topic} in one paragraph for a high school student."
)

chain = explain_prompt | llm | parser

# Async function to process one topic
async def explain_topic(topic: str) -> str:
    result = await chain.ainvoke({"topic": topic})
    return f"{topic}: {result[:80]}..."

# Process multiple topics in parallel
async def explain_all(topics: list) -> list:
    tasks = [explain_topic(topic) for topic in topics]
    results = await asyncio.gather(*tasks)
    return results

# Run the async pipeline
topics = ["gravity", "photosynthesis", "machine learning"]
results = asyncio.run(explain_all(topics))

for r in results:
    print(r)
    print()
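
In practice, unbounded concurrency can run into provider rate limits. The sketch below shows one way to cap the number of in-flight calls with asyncio.Semaphore. It assumes a simulated model call: fake_chain_call is a hypothetical stand-in for chain.ainvoke that sleeps instead of contacting an API, and the limit of 2 is arbitrary.

```python
# Bounded concurrency with asyncio (fake_chain_call simulates a model call).
import asyncio
import time


async def fake_chain_call(topic: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for chain.ainvoke; sleeps instead of calling a model."""
    await asyncio.sleep(delay)
    return f"{topic}: done"


async def run_all(topics: list[str], max_concurrency: int = 2) -> list[str]:
    semaphore = asyncio.Semaphore(max_concurrency)

    async def limited(topic: str) -> str:
        async with semaphore:  # at most max_concurrency calls run at once
            return await fake_chain_call(topic)

    # gather returns results in the same order as the input topics
    return await asyncio.gather(*(limited(t) for t in topics))


start = time.perf_counter()
results = asyncio.run(run_all(["gravity", "photosynthesis", "machine learning", "entropy"]))
elapsed = time.perf_counter() - start
print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Four calls of 0.2 seconds each finish in roughly two rounds rather than four, which is the gain over running them sequentially while still respecting the limit.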

Project Milestones by Level

Beginner Project: One LCEL prompt chain that takes a topic and returns a simple structured explanation. Deploy it as a Python script.

Intermediate Project: RAG-based document Q&A app. Load your own text files, embed them, and answer questions based only on your documents. Add a conversation window memory.

Advanced Project: Full production LangChain system with modular folder structure, async FastAPI endpoint, Pydantic-typed outputs, and automated evaluation that runs before each deployment.

Frequently Asked Questions

Is LangChain used only for chatbots?

No. LangChain can also be used for agents, document apps, search workflows, and many other AI applications.

Do I need Python before learning LangChain?

Yes. The examples on this page assume you are comfortable with basic Python, including functions, classes, and imports.

When should I move to advanced LangChain topics?

Move when you can independently build a retrieval or tool-based app, debug errors, and explain how each component affects final output quality.