Open WebUI: self-hosted LLM chat interface for your VPS
Open WebUI is the closest thing to a drop-in, self-hosted ChatGPT. It’s an open-source chat interface that runs on your own server and connects to either local models (via Ollama) or any OpenAI-compatible API. If you’ve been using hosted LLM chat interfaces and wishing you could run your own, Open WebUI is what you’re looking for.
This guide covers what Open WebUI does, why it’s a fit for a VPS, and how to deploy it in minutes with Docker Compose.
What Open WebUI does
Open WebUI is a polished web UI that looks and feels like ChatGPT. Features:
- Multiple model backends: Ollama for local models, or OpenAI, Anthropic, Groq, OpenRouter, or any other OpenAI-compatible API
- Multi-user support: accounts for your team, each with their own chat history
- Conversation history: search, export, delete
- Document chat (RAG): upload PDFs and text files, ask questions
- Image generation integration: Stable Diffusion, DALL·E
- Voice input/output: speech-to-text and text-to-speech
- Plugins and tools: extend with custom Python functions
- API access: use Open WebUI’s models from your own scripts
The project is actively developed on GitHub (open-webui/open-webui) with thousands of stars and frequent releases.
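That last bullet is worth a quick illustration: Open WebUI exposes an OpenAI-compatible chat endpoint of its own, authenticated with a per-user API key generated in the account settings. A sketch against a local instance (the model name and key are placeholders for your own setup):

curl -s http://localhost:3000/api/chat/completions \
  -H "Authorization: Bearer $OPEN_WEBUI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2:3b", "messages": [{"role": "user", "content": "Hello from a script"}]}'

Anything that speaks the OpenAI API, whether SDKs or shell scripts, can point at this endpoint instead of a hosted provider.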
Why self-host Open WebUI on a VPS
Privacy. With local models, conversations never leave your server; even with cloud backends, chat history and user accounts live on hardware you control. For companies handling sensitive data, trade secrets, legal documents, or customer PII, that’s a hard requirement.
Cost control. Running local models via Ollama means no per-token fees. You pay for the VPS, nothing else.
No per-seat fees. A single Open WebUI VPS supports your whole team without the per-user billing that hosted LLM services charge.
Model flexibility. Swap between GPT-4, Claude, Llama, Mistral, DeepSeek, or any local model on the fly. No vendor lock-in.
Always-on access. The service runs 24/7 on your VPS, and with local models there are no third-party rate limits. (External backends still enforce their own.)
What you need
- A Linux VPS with at least 2 GB RAM for cloud-API-only use, 8 GB+ RAM if you want to run local models with Ollama
- Docker and Docker Compose installed
- An API key for your chosen provider (or Ollama for local models)
For local models, RAM and CPU matter. An 8 GB VPS comfortably runs 7B-parameter models at usable speeds. Larger models (13B, 70B) need more RAM and benefit from a GPU.
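If Docker isn’t installed yet, Docker’s official convenience script covers most distributions (as always, skim a script before running it):

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh   # installs Docker Engine plus the compose plugin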
Open WebUI Docker Compose deployment
Here’s a minimal docker-compose.yml with persistent storage:
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "3000:8080"
    volumes:
      - ./data:/app/backend/data
    environment:
      - OPENAI_API_KEY=your_openai_key_here
      - WEBUI_AUTH=true
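One optional tweak before saving: rather than hard-coding the key, keep it in a .env file next to the compose file; Docker Compose reads it automatically and substitutes ${...} variables. A minimal sketch:

# .env (same directory as docker-compose.yml; keep it out of version control)
OPENAI_API_KEY=sk-your-key-here

Then change the environment line to - OPENAI_API_KEY=${OPENAI_API_KEY} and the secret never appears in the compose file itself.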
Save it to /opt/openwebui/docker-compose.yml, then:
cd /opt/openwebui
docker compose up -d
Open WebUI is now running on port 3000 (http://your-server-ip:3000). The first account created becomes the admin.
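Before moving on, a quick sanity check from the VPS itself; once the app finishes starting, the second command should print 200:

docker compose ps   # the open-webui container should show "running"
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000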
Adding Ollama for local models
To run models locally (Llama 3, Mistral, DeepSeek, etc.), add Ollama to the compose file:
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    volumes:
      - ./data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_AUTH=true

  ollama:
    image: ollama/ollama
    volumes:
      - ./ollama:/root/.ollama
    restart: unless-stopped
Pull a model (the container name follows the compose project, i.e. the directory name, so openwebui-ollama-1 here):
docker exec -it openwebui-ollama-1 ollama pull llama3.2:3b
Open WebUI auto-detects models available through Ollama and shows them in the model dropdown.
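A few other ollama subcommands are handy for housekeeping, run the same way through the container:

docker exec -it openwebui-ollama-1 ollama list            # show downloaded models
docker exec -it openwebui-ollama-1 ollama pull mistral    # fetch another model
docker exec -it openwebui-ollama-1 ollama rm llama3.2:3b  # remove one to free disk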
Serve Open WebUI behind HTTPS
Port 3000 is fine for personal use. For a team, put a real domain and HTTPS in front with Caddy:
chat.yourdomain.com {
    reverse_proxy localhost:3000
}
Caddy handles TLS automatically via Let’s Encrypt. Point chat.yourdomain.com at your VPS IP, drop the Caddyfile in place, start Caddy, done.
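If you’d rather keep everything in one stack, Caddy can run as a third compose service instead of on the host. A sketch, assuming ports 80/443 are free and the Caddyfile sits next to docker-compose.yml:

  caddy:
    image: caddy:latest
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - ./caddy_data:/data   # persists the Let's Encrypt certificates

Inside the Docker network the proxy target is the service name rather than localhost, so the Caddyfile becomes:

chat.yourdomain.com {
    reverse_proxy open-webui:8080
}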
Multi-user setup
After creating the admin account:
- Admin Settings > Users
- Create team accounts, or enable signup for self-registration
- Set per-user model access (admins can restrict which models each user sees)
Each user gets their own chat history, settings, and API keys. One caveat: by default, admins can also view user chats from the admin panel; set ENABLE_ADMIN_CHAT_ACCESS=false if conversations should stay private from admins too.
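If you do enable self-registration, two environment variables (both documented Open WebUI settings) keep it under control; add them under environment: in the compose file:

      - ENABLE_SIGNUP=true
      - DEFAULT_USER_ROLE=pending   # new accounts wait in "pending" until an admin approves them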
Backup
The ./data volume holds everything: chat history, users, settings, uploaded files. Back it up:
tar -czf openwebui-backup-$(date +%F).tar.gz /opt/openwebui/data
Schedule this nightly via cron and push the archive to object storage (Backblaze B2 is cheap).
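A minimal crontab entry for that (the b2 remote name is a placeholder for whatever you set up with rclone config; note the escaped % signs, which cron otherwise treats as line breaks):

0 3 * * * tar -czf /opt/backups/openwebui-$(date +\%F).tar.gz /opt/openwebui/data && rclone copy /opt/backups b2:your-bucket/openwebui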
Open WebUI VPS cost
A 2 GB ColossusCloud VPS is enough for Open WebUI with cloud model APIs. For Ollama + local models, step up to 8 GB.
With cloud APIs, you still pay per-token fees to OpenAI, Anthropic, or whichever provider you use. What you save is the per-seat ChatGPT Plus or Claude Pro subscriptions, and chat history stays on your server.
With fully local setups (Ollama only), you pay nothing beyond the VPS. Ideal for privacy-critical workloads or high-volume usage.
Why VPS over home server
A home server means worrying about uptime, power, internet, dynamic DNS, port forwarding, and security. A VPS handles all of that: your team can reach it from anywhere, it’s always online, and the upfront cost is low.
For production team use, a VPS is the right choice.
Deploy Open WebUI on a ColossusCloud VPS and get your own private LLM chat running in minutes.