Arize Phoenix
Arize Phoenix is a source-available LLM observability and evaluation platform you can self-host. It is a LangSmith alternative that traces, evaluates, and debugs AI apps and agents on your own infrastructure, built on OpenTelemetry.
What is Arize Phoenix?
Arize Phoenix is a source-available AI observability and evaluation platform for debugging LLM and agent applications. It captures traces of every prompt, model call, retrieval step, and tool action, then lets you score and replay them, so you can see exactly where an app breaks instead of guessing from the final output. It’s built on OpenTelemetry and runs on your own infrastructure.
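To make the idea of span-level tracing concrete, here is a minimal sketch in plain Python. It is not Phoenix's API or the OpenTelemetry SDK: the Span and ToyTracer classes, the span names, and the simulated empty retriever are all hypothetical, invented for illustration. It only shows why a tree of spans points at the failing step when the final answer alone would not.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One timed step in a trace (hypothetical, simplified model)."""
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    attributes: dict = field(default_factory=dict)
    status: str = "OK"
    duration_ms: float = 0.0


class ToyTracer:
    """Collects spans in memory. A real setup would export them via OpenTelemetry."""

    def __init__(self):
        self.spans = []
        self._stack = []
        self._trace_id = uuid.uuid4().hex

    @contextmanager
    def span(self, name, **attributes):
        # The span on top of the stack becomes the parent of the new span.
        parent_id = self._stack[-1].span_id if self._stack else None
        current = Span(name, self._trace_id, uuid.uuid4().hex[:16], parent_id, dict(attributes))
        self._stack.append(current)
        start = time.perf_counter()
        try:
            yield current
        except Exception as exc:
            current.status = "ERROR"
            current.attributes["error"] = repr(exc)
            raise
        finally:
            current.duration_ms = (time.perf_counter() - start) * 1000
            self._stack.pop()
            self.spans.append(current)


def answer_question(tracer, question):
    with tracer.span("agent.run", question=question):
        with tracer.span("retrieval", query=question) as retrieval:
            documents = []  # simulate a retriever that found nothing
            retrieval.attributes["document_count"] = len(documents)
            if not documents:
                retrieval.status = "ERROR"
        with tracer.span("llm.call", model="hypothetical-model") as llm:
            llm.attributes["prompt"] = f"Context: {documents}\nQuestion: {question}"
            return "I don't know."  # canned reply standing in for a model call


tracer = ToyTracer()
answer = answer_question(tracer, "What license does Phoenix use?")
failed = [s.name for s in tracer.spans if s.status == "ERROR"]
print(answer, failed)
```

The final answer only says the app did not know. The recorded spans show that the retrieval step returned no documents, which is the kind of detail a trace view is meant to surface.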
What is Arize Phoenix best for?
Teams shipping LLM apps, RAG pipelines, and agents who need to debug, evaluate, and monitor them without sending production traces to a hosted-only vendor. It’s a strong fit when you want a turnkey eval platform you can self-host for free and keep sensitive data in your own environment.
What can Arize Phoenix do?
- Trace LLM calls, retrieval steps, and agent actions at the span level using OpenTelemetry
- Run LLM-as-a-judge, code-based, and human-label evaluations for relevance, toxicity, hallucination, and quality
- Manage prompts — version, store, and test prompt variants across datasets
- Run experiments to compare how prompt, model, or retrieval changes affect outputs
- Use the built-in playground to replay traced calls and compare models side by side
- Integrate with 40+ frameworks and providers — LangChain, LlamaIndex, DSPy, CrewAI, LangGraph, OpenAI, Anthropic, and more
- Deploy locally, in a Jupyter notebook, via Docker, or on Kubernetes with Helm
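As a rough illustration of the code-based evaluation and experiment ideas in the list above, the sketch below scores two prompt variants against a small dataset. Everything in it is hypothetical: the keyword_relevance scorer, the dataset, and the canned outputs stand in for real model calls, and none of it is Phoenix's evaluation API.

```python
from statistics import mean


def keyword_relevance(answer, expected_keywords):
    """Code-based evaluator: fraction of expected keywords found in the answer."""
    text = answer.lower()
    hits = sum(1 for keyword in expected_keywords if keyword.lower() in text)
    return hits / len(expected_keywords) if expected_keywords else 0.0


# Hypothetical dataset: each example lists keywords a good answer should mention.
DATASET = [
    {"question": "Which license covers Phoenix?", "expected_keywords": ["elastic", "license"]},
    {"question": "What standard is Phoenix built on?", "expected_keywords": ["opentelemetry"]},
]

# Canned outputs standing in for two prompt variants run against a model.
VARIANT_OUTPUTS = {
    "prompt_v1": ["It uses a license.", "It is built on OpenTelemetry."],
    "prompt_v2": ["It uses the Elastic License 2.0.", "It is built on OpenTelemetry."],
}


def run_experiment(outputs):
    """Score every output against its dataset example and average the scores."""
    scores = [
        keyword_relevance(output, example["expected_keywords"])
        for output, example in zip(outputs, DATASET)
    ]
    return mean(scores)


results = {name: run_experiment(outputs) for name, outputs in VARIANT_OUTPUTS.items()}
best = max(results, key=results.get)
print(results, best)
```

A platform like Phoenix adds storage, versioning, and LLM-as-a-judge scoring on top of this pattern, but the core loop is the same: fixed dataset, varying prompt or model, one score per variant.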
Is Arize Phoenix free?
Yes. Self-hosting Phoenix is free, and you only pay for your own infrastructure. The code is source-available under the Elastic License 2.0 rather than OSI open source. There is a managed Phoenix Cloud option and paid dedicated support from Arize, plus a separate enterprise product (Arize AX) with additional features, but the self-hosted platform itself costs nothing.
Where does Arize Phoenix fall short?
- It’s source-available under the Elastic License 2.0, not OSI open source — you can self-host and use it freely, but the license restricts offering it as a competing hosted service.
- The deepest tracing and eval integrations are Python-first, so non-Python stacks rely more on raw OpenTelemetry instrumentation.
- It focuses on LLM/agent observability, not general application monitoring — for infrastructure metrics and logs you’d still run a tool like Grafana or Datadog alongside it.
What does Arize Phoenix replace?
Arize Phoenix is a self-hosted alternative to LangSmith, LangChain’s hosted tracing and evaluation platform. It does the same trace-evaluate-debug job for LLM apps, but you run it on your own infrastructure, with no per-seat pricing, and your data never leaves your environment. Langfuse is an open source option in the same space.
FAQ
Is Arize Phoenix open source? It’s source-available under the Elastic License 2.0 (ELv2). The code is public and free to self-host, but the license stops you from reselling it as a competing managed service.
Can I self-host Arize Phoenix for free? Yes. Self-hosting is free — install it with pip install arize-phoenix or run the Docker image, and you only pay for the server it runs on. Phoenix Cloud and dedicated support are the paid, managed options.
Is Arize Phoenix a good LangSmith alternative? For teams that want to self-host and avoid per-seat pricing, yes — it covers tracing, evals, prompt management, and experiments. Because it’s built on OpenTelemetry, it’s also vendor-agnostic and not tied to LangChain.
What do I need to run Arize Phoenix? Python with the arize-phoenix package for a local or notebook setup, or Docker for a containerized deployment. For production you’d typically run the container with a Postgres backend, or deploy to Kubernetes via Helm.
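For the containerized setup with a Postgres backend, a small helper can assemble the docker run command. This is a sketch under assumptions: the image name arizephoenix/phoenix, the port 6006, and the PHOENIX_SQL_DATABASE_URL environment variable are not stated in this page and should be checked against the Phoenix self-hosting documentation. The database URL is a placeholder.

```python
import shlex


def phoenix_docker_command(database_url, image="arizephoenix/phoenix:latest", port=6006):
    """Build a `docker run` argument list for Phoenix backed by Postgres.

    Assumptions to verify: image name, container port 6006, and the
    PHOENIX_SQL_DATABASE_URL environment variable.
    """
    return [
        "docker", "run", "--rm",
        "-p", f"{port}:6006",  # host port mapped to the assumed container port
        "-e", f"PHOENIX_SQL_DATABASE_URL={database_url}",
        image,
    ]


# Placeholder connection string; replace host, user, and password with your own.
command = phoenix_docker_command("postgresql://phoenix:CHANGE_ME@db.internal:5432/phoenix")
print(shlex.join(command))
```

Building the command as a list keeps it safe to pass to subprocess.run without shell quoting issues. For a local or notebook trial, the pip install route mentioned above avoids Docker entirely.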