Langfuse is an open-source LLM engineering platform designed to help developers debug and improve their large language model applications. It provides comprehensive tools for:
- Observability: Capture complete traces of LLM applications and agents, enabling inspection of failures and building evaluation datasets. It's based on OpenTelemetry and supports popular LLM/agent libraries.
- Metrics: Track key performance indicators for LLM applications.
- Prompt Management: Manage and version prompts effectively.
- Playground: Experiment with LLMs and prompts.
- Evaluation: Facilitate both automated and human-in-the-loop evaluation of LLM outputs.
- Annotations: Enable human annotation for data labeling and feedback.
- Public API: Integrate Langfuse capabilities into existing workflows.
It offers Python and JS/TS SDKs and integrates with frameworks such as LangChain and LlamaIndex, model providers such as OpenAI, the LiteLLM proxy, and no-code builders such as Dify, Flowise, and Langflow. Langfuse aims to provide a robust solution for the entire LLM application lifecycle, from development to production.
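To make the observability idea above concrete, here is a minimal, library-free Python sketch of what trace capture looks like conceptually: each unit of work becomes a timed span, and nested calls become child spans. The `Tracer` class, span fields, and `fake_llm_call` helper are illustrative assumptions for this sketch, not the Langfuse SDK's actual API; real usage would go through the SDK's documented instrumentation.

```python
import time
from contextlib import contextmanager

class Tracer:
    """Toy tracer: records a tree of timed spans, one tree per top-level call."""
    def __init__(self):
        self.root_spans = []   # completed top-level spans
        self._stack = []       # currently open spans (the call hierarchy)

    @contextmanager
    def span(self, name, **metadata):
        record = {"name": name, "metadata": metadata, "children": []}
        start = time.perf_counter()
        # Attach to the enclosing span if one is open, else start a new trace.
        if self._stack:
            self._stack[-1]["children"].append(record)
        else:
            self.root_spans.append(record)
        self._stack.append(record)
        try:
            yield record
        finally:
            record["duration_s"] = time.perf_counter() - start
            self._stack.pop()

tracer = Tracer()

def fake_llm_call(prompt):
    # Stand-in for a model-provider call; a real integration would wrap
    # OpenAI/LangChain calls and also capture tokens, cost, and output.
    with tracer.span("generation", model="example-model", prompt=prompt) as s:
        s["metadata"]["completion"] = f"echo: {prompt}"
        return s["metadata"]["completion"]

def answer_question(question):
    with tracer.span("answer_question", question=question):
        return fake_llm_call(question)

answer_question("What is observability?")
trace = tracer.root_spans[0]
print(trace["name"], "->", trace["children"][0]["name"])
```

Inspecting `tracer.root_spans` after a run shows the nested structure (an `answer_question` span containing a `generation` child with its timing and metadata), which is the shape of data that makes failure inspection and evaluation-dataset building possible.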

