LangFuse Integration

LangFuse provides observability and evaluation for AI agents that use the generated MCP servers.

Features

  • Distributed Tracing: Track agent interactions across tool calls (see the tracing sketch after this list)
  • Evaluation Framework: Automated assessment of agent performance
  • Dataset Building: Interactive dataset creation for testing
  • Real-time Monitoring: Performance metrics and error tracking
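
As a concrete sketch of the tracing feature, the snippet below instruments a hypothetical agent and tool with the LangFuse Python SDK's observe decorator (v2-style langfuse.decorators import; the function names and logic are illustrative, not part of this project):

# Nested @observe() calls are recorded as child spans of a single trace
from langfuse.decorators import observe

@observe()  # stand-in for a real MCP tool call
def search_tool(query: str) -> str:
    return f"results for {query}"

@observe()  # the outermost decorated call opens the trace
def run_agent(task: str) -> str:
    return search_tool(task)

print(run_agent("What is MCP?"))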

Configuration

# LangFuse Configuration
LANGFUSE_HOST=http://localhost:3000
LANGFUSE_PUBLIC_KEY=pk-...
LANGFUSE_SECRET_KEY=sk-...
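
With these variables set, a LangFuse client can be constructed and verified as follows (a minimal sketch using the LangFuse Python SDK, assuming pip install langfuse):

# Build a client from the variables above and fail fast on bad credentials
import os
from langfuse import Langfuse

langfuse = Langfuse(
    host=os.environ["LANGFUSE_HOST"],
    public_key=os.environ["LANGFUSE_PUBLIC_KEY"],
    secret_key=os.environ["LANGFUSE_SECRET_KEY"],
)
assert langfuse.auth_check()  # returns True when the keys match the host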

Evaluation Workflow

# Build evaluation dataset interactively
make run-a2a-eval-mode

# Run automated evaluation
make eval
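
Conceptually, the interactive step records LangFuse dataset items, each pairing an agent input with an expected output. A hedged sketch using the LangFuse Python SDK's dataset API (the dataset name and contents are illustrative):

# Each item pairs an input with the expected output used for grading
from langfuse import Langfuse

langfuse = Langfuse()  # picks up the LANGFUSE_* environment variables

langfuse.create_dataset(name="agent-eval")  # illustrative dataset name
langfuse.create_dataset_item(
    dataset_name="agent-eval",
    input={"task": "Summarize the open issues"},        # what the agent is asked
    expected_output={"answer": "Two issues are open"},  # the reference answer
)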

Metrics

LangFuse tracks three evaluation metrics (a scoring sketch follows the list):

  • Correctness: Did the agent complete the task successfully?
  • Hallucination: Did the agent make up information?
  • Trajectory: Did the agent use tools efficiently?
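
In LangFuse these metrics are attached to traces as scores. A minimal sketch using the v2-style Python SDK score method (the trace id and values are illustrative):

# Record the three metrics above as numeric scores on a finished trace
from langfuse import Langfuse

langfuse = Langfuse()

for name, value in [
    ("correctness", 1.0),    # task completed successfully
    ("hallucination", 0.0),  # no fabricated information
    ("trajectory", 0.8),     # tool use was mostly efficient
]:
    langfuse.score(trace_id="trace-123", name=name, value=value)  # hypothetical id

langfuse.flush()  # send buffered events before the process exits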

The full LangFuse integration documentation is in progress. For working implementations, see the examples produced with the --generate-eval flag.