~/ossroster.com

category

LLM Inference & Serving

Open source LLM inference and serving engines that run large language models on your own GPUs or CPUs. They provide high-throughput batching, OpenAI-compatible APIs, and quantization, so you can self-host open models instead of paying per token for a hosted API.

Tools

2 in this category

Ollama

Run open source LLMs locally with one command.

vLLM

High-throughput LLM inference and serving on your own GPUs.