Chroma
Chroma is an open source vector database for AI applications — an Apache-2.0 Pinecone alternative that stores embeddings and runs similarity, hybrid, and full-text search, either embedded in your app or as a self-hosted server.
What is Chroma?
Chroma is an open source vector database — the retrieval layer for AI apps. You create a collection, add your documents, and query by similarity; Chroma handles the tokenizing, embedding, and indexing for you, or takes embeddings you’ve already computed. It stores documents and metadata alongside the vectors, so one system holds both your data and its search index.
What is Chroma best for?
Chroma is best for developers building RAG pipelines, semantic search, and AI prototypes who want the simplest possible embedding store. Running pip install chromadb gives you a working vector database in-process, with no server to run, which makes it well suited to shipping a retrieval feature fast and then graduating to a self-hosted server or Chroma Cloud as you grow. In a RAG stack, Chroma finds the relevant chunks and you feed them to whatever model you’re running, including one you serve yourself with vLLM.
What can Chroma do?
- Run dense vector, sparse, hybrid, and full-text/regex keyword search over your data
- Filter results at query time by metadata (where) or by document contents (where_document)
- Embed automatically with built-in functions for OpenAI, Cohere, Hugging Face, and sentence-transformers, or bring your own vectors
- Index and search multi-modal data: text, images, and audio
- Call it from Python or JavaScript/TypeScript SDKs, with integrations for LangChain and LlamaIndex
- Run three ways: embedded in your app, as a self-hosted client-server instance (chroma run), or on managed Chroma Cloud
Is Chroma free?
Yes — Chroma is free and fully open source under the Apache 2.0 license, so you can self-host it at no cost and only pay for your own server. Chroma Cloud is the optional managed, serverless layer: a Starter plan at $0/mo plus usage (with $5 in free credits), a Team plan at $250/mo, and custom Enterprise pricing. Cloud usage is metered by writes ($2.50/GiB), storage ($0.33/GiB per month), and queries.
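To make the metered rates above concrete, here is a small back-of-the-envelope estimate in Python. It only covers the two rates quoted in this section (writes and storage). Query charges, the plan fee, and the free credits are left out because no per-query rate is given here, so treat the function and its name as a hypothetical sketch rather than a billing calculator.

```python
# Rates quoted above for Chroma Cloud usage.
WRITE_USD_PER_GIB = 2.50
STORAGE_USD_PER_GIB_MONTH = 0.33


def estimate_monthly_usage_cost(written_gib: float, stored_gib: float) -> float:
    """Estimate one month's write and storage charges in USD.

    Query charges, plan fees, and free credits are not modelled.
    """
    write_cost = written_gib * WRITE_USD_PER_GIB
    storage_cost = stored_gib * STORAGE_USD_PER_GIB_MONTH
    return round(write_cost + storage_cost, 2)


# Example: write 10 GiB during the month and keep 10 GiB stored.
estimate = estimate_monthly_usage_cost(written_gib=10, stored_gib=10)
print(f"${estimate:.2f}")  # 10 * 2.50 + 10 * 0.33 = 28.30
```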
Where does Chroma fall short?
- It isn’t built for massive scale. Chroma shines from prototype up to a few million vectors; past that, recall and filtering get strained, and teams often migrate to a heavier engine like Qdrant, Weaviate, or pgvector.
- Embedded mode shares memory with your application, so beyond roughly 100K vectors you’ll want to run it as a dedicated server rather than in-process.
- The docs and Cloud dashboard are still maturing, and some corners of the SDKs (especially TypeScript error handling) are thinner than what an established managed service like Pinecone offers.
What does Chroma replace?
Chroma is an open source, self-hostable alternative to Pinecone, the managed serverless vector database. It does the same store-embeddings-and-search-by-similarity job, but you can run it on your own infrastructure under Apache 2.0 instead of paying Pinecone’s usage-based, per-query pricing. It’s also commonly compared to open source engines like Qdrant, Weaviate, and pgvector.
FAQ
Is Chroma open source? Yes — fully open source under the Apache 2.0 license, one of the most permissive there is. The code is public on GitHub and free to self-host, audit, and modify, with no source-available restrictions.
Can I self-host Chroma for free? Yes. Self-hosting is free under Apache 2.0; you only pay for the server it runs on. The managed Chroma Cloud is the optional paid, serverless alternative.
Is Chroma a good Pinecone alternative? For prototypes and small-to-medium workloads, yes — it’s free, open source, and famously quick to start with. If you need proven recall at tens or hundreds of millions of vectors with no tuning, a managed service like Pinecone (or a heavier open source engine) may serve you better.
What do I need to run Chroma? For local use, just pip install chromadb (or the npm package) — it runs in-process with no separate server. For shared or production use, run chroma run as a client-server instance on a machine you control, sized to your data.