Q&A Agent Evaluation and Testing

A customer support Q&A agent with an evaluation suite that demonstrates LangSmith-powered testing patterns. The suite includes deterministic evaluators (keyword coverage, tool usage, hallucination detection), LLM-as-judge evaluators (correctness, tone), trajectory evaluation via agentevals, and regression detection across experiment runs.
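To make the idea of a deterministic evaluator concrete, here is a minimal sketch of a keyword coverage check. It is not the code in `evals.py`: the function name `keyword_coverage`, the returned dictionary shape, and the sample answer are all assumptions made for illustration.

```python
def keyword_coverage(answer: str, expected_keywords: list[str]) -> dict:
    """Return the fraction of expected keywords found in the answer.

    Hypothetical helper: matching is a plain case-insensitive substring test.
    """
    if not expected_keywords:
        # Nothing to look for, so treat coverage as complete.
        return {"key": "keyword_coverage", "score": 1.0, "missing": []}
    text = answer.lower()
    missing = [kw for kw in expected_keywords if kw.lower() not in text]
    score = 1 - len(missing) / len(expected_keywords)
    return {"key": "keyword_coverage", "score": score, "missing": missing}


# Illustrative input only; not taken from the repository's dataset.
result = keyword_coverage(
    "You can request a refund within 30 days of purchase.",
    ["refund", "30 days", "receipt"],
)
print(result["missing"])  # keywords the answer did not mention
```

Because the check involves no model call, it gives the same score on every run, which makes it a cheap first gate before the LLM-as-judge evaluators.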

Prerequisites

Setup

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your API keys

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| `ANTHROPIC_API_KEY` | Yes | Anthropic API key for Claude |
| `LANGSMITH_API_KEY` | Yes | LangSmith API key for tracing and evals |
| `LANGSMITH_TRACING` | Yes | Set to `true` to enable tracing |
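A quick way to confirm the environment is configured before running anything is to check for the three variables listed above. This is a sketch under assumptions: the helper name `missing_env_vars` is hypothetical, and it only checks that each variable is non-empty, not that the keys are valid.

```python
import os

# Variable names come from the table above.
REQUIRED_VARS = ("ANTHROPIC_API_KEY", "LANGSMITH_API_KEY", "LANGSMITH_TRACING")


def missing_env_vars(env=None) -> list[str]:
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Note that this reads the process environment only; if the values live in `.env`, they need to be loaded into the environment first.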

Running

python qa_agent.py

To run evaluations:

python evals.py
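Regression detection across experiment runs can be pictured as comparing per-evaluator mean scores between a baseline run and a candidate run. The sketch below illustrates that idea in plain Python; it does not call LangSmith and is not the logic in `evals.py`. The function name `detect_regressions`, the `tolerance` threshold, and the sample scores are assumptions.

```python
from statistics import mean


def detect_regressions(baseline: dict, candidate: dict, tolerance: float = 0.05) -> dict:
    """Return metrics whose mean score dropped by more than `tolerance`.

    Hypothetical helper: both arguments map an evaluator name to a list of scores.
    """
    regressions = {}
    for metric, base_scores in baseline.items():
        cand_scores = candidate.get(metric)
        if not base_scores or not cand_scores:
            continue  # skip metrics that are absent or empty in either run
        drop = mean(base_scores) - mean(cand_scores)
        if drop > tolerance:
            regressions[metric] = round(drop, 3)
    return regressions


# Made-up scores for illustration; not results from a real experiment.
baseline = {"correctness": [1.0, 1.0, 0.5, 1.0], "tone": [0.9, 0.8, 1.0, 0.9]}
candidate = {"correctness": [1.0, 0.5, 0.5, 0.5], "tone": [0.9, 0.9, 1.0, 0.9]}

regressions = detect_regressions(baseline, candidate)
print(regressions)
```

A fixed tolerance is the simplest choice; with small datasets a drop can come from noise, so the threshold should be tuned to how much run-to-run variation the evaluators show.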

Article

Evaluation and Testing for LangGraph Agents
