Open WebUI: self-hosted LLM chat interface for your VPS
Open WebUI is the closest thing to a drop-in, self-hosted ChatGPT. It’s an open-source chat interface that runs on your own server and connects to either local models (via Ollama) or any OpenAI-compatible API. If you’ve been using hosted LLM chat interfaces and wishing you could run your own, Open WebUI is what you’re looking for.
This guide covers what Open WebUI does, why it’s a fit for a VPS, and how to deploy it in minutes with Docker Compose.
What Open WebUI does
Open WebUI is a polished web UI that looks and feels like ChatGPT. Features:
- Multiple model backends: Ollama for local models, or OpenAI, Anthropic, Groq, OpenRouter, or any other OpenAI-compatible API
- Multi-user support: accounts for your team, each with their own chat history
- Conversation history: search, export, delete
- Document chat (RAG): upload PDFs and text files, ask questions
- Image generation integration: Stable Diffusion, DALL·E
- Voice input/output: speech-to-text and text-to-speech
- Plugins and tools: extend with custom Python functions
- API access: use Open WebUI’s models from your own scripts
The project is actively developed on GitHub (open-webui/open-webui) with thousands of stars and frequent releases.
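That last bullet is worth a quick illustration: Open WebUI exposes an OpenAI-compatible chat endpoint of its own, authenticated with a per-user API key generated in the account settings. A sketch against a local instance (the model name and key are placeholders for your own setup):

curl -s http://localhost:3000/api/chat/completions \
  -H "Authorization: Bearer $OPEN_WEBUI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2:3b", "messages": [{"role": "user", "content": "Hello from a script"}]}'

Anything that speaks the OpenAI API, whether SDKs or shell scripts, can point at this endpoint instead of a hosted provider.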
Why self-host Open WebUI on a VPS
Privacy. With local models, conversations never leave your server; even with cloud backends, chat history and user accounts live on hardware you control. For companies handling sensitive data, trade secrets, legal documents, or customer PII, that’s a hard requirement.
Cost control. Running local models via Ollama means no per-token fees. You pay for the VPS, nothing else.
No per-seat fees. A single Open WebUI VPS supports your whole team without the per-user billing that hosted LLM services charge.
Model flexibility. Swap between GPT-4, Claude, Llama, Mistral, DeepSeek, or any local model on the fly. No vendor lock-in.
Always-on access. The service runs 24/7 on your VPS, and with local models there are no third-party rate limits. (External backends still enforce their own.)
What you need
- A Linux VPS with at least 2 GB RAM for cloud-API-only use, 8 GB+ RAM if you want to run local models with Ollama
- Docker and Docker Compose installed
- An API key for your chosen provider (or Ollama for local models)
For local models, RAM and CPU matter. An 8 GB VPS comfortably runs 7B-parameter models at usable speeds. Larger models (13B, 70B) need more RAM and benefit from a GPU.
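If Docker isn’t installed yet, Docker’s official convenience script covers most distributions (as always, skim a script before running it):

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh   # installs Docker Engine plus the compose plugin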
Open WebUI Docker Compose deployment
Here’s a minimal docker-compose.yml with persistent storage:
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "3000:8080"
    volumes:
      - ./data:/app/backend/data
    environment:
      - OPENAI_API_KEY=your_openai_key_here
      - WEBUI_AUTH=true
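One optional tweak before saving: rather than hard-coding the key, keep it in a .env file next to the compose file; Docker Compose reads it automatically and substitutes ${...} variables. A minimal sketch:

# .env (same directory as docker-compose.yml; keep it out of version control)
OPENAI_API_KEY=sk-your-key-here

Then change the environment line to - OPENAI_API_KEY=${OPENAI_API_KEY} and the secret never appears in the compose file itself.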
Save it to /opt/openwebui/docker-compose.yml, then:
cd /opt/openwebui
docker compose up -d
Open WebUI is now running on port 3000 (http://your-server-ip:3000). The first account created becomes the admin.
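Before moving on, a quick sanity check from the VPS itself; once the app finishes starting, the second command should print 200:

docker compose ps   # the open-webui container should show "running"
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000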
Adding Ollama for local models
To run models locally (Llama 3, Mistral, DeepSeek, etc.), add Ollama to the compose file:
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    volumes:
      - ./data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_AUTH=true

  ollama:
    image: ollama/ollama
    volumes:
      - ./ollama:/root/.ollama
    restart: unless-stopped
Pull a model (the container name follows the compose project, i.e. the directory name, so openwebui-ollama-1 here):
docker exec -it openwebui-ollama-1 ollama pull llama3.2:3b
Open WebUI auto-detects models available through Ollama and shows them in the model dropdown.
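A few other ollama subcommands are handy for housekeeping, run the same way through the container:

docker exec -it openwebui-ollama-1 ollama list            # show downloaded models
docker exec -it openwebui-ollama-1 ollama pull mistral    # fetch another model
docker exec -it openwebui-ollama-1 ollama rm llama3.2:3b  # remove one to free disk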
Serve Open WebUI behind HTTPS
Port 3000 is fine for personal use. For a team, put a real domain and HTTPS in front with Caddy:
chat.yourdomain.com {
    reverse_proxy localhost:3000
}
Caddy handles TLS automatically via Let’s Encrypt. Point chat.yourdomain.com at your VPS IP, drop the Caddyfile in place, start Caddy, done.
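If you’d rather keep everything in one stack, Caddy can run as a third compose service instead of on the host. A sketch, assuming ports 80/443 are free and the Caddyfile sits next to docker-compose.yml:

  caddy:
    image: caddy:latest
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - ./caddy_data:/data   # persists the Let's Encrypt certificates

Inside the Docker network the proxy target is the service name rather than localhost, so the Caddyfile becomes:

chat.yourdomain.com {
    reverse_proxy open-webui:8080
}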
Multi-user setup
After creating the admin account:
- Admin Settings > Users
- Create team accounts, or enable signup for self-registration
- Set per-user model access (admins can restrict which models each user sees)
Each user gets their own chat history, settings, and API keys. One caveat: by default, admins can also view user chats from the admin panel; set ENABLE_ADMIN_CHAT_ACCESS=false if conversations should stay private from admins too.
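If you do enable self-registration, two environment variables (both documented Open WebUI settings) keep it under control; add them under environment: in the compose file:

      - ENABLE_SIGNUP=true
      - DEFAULT_USER_ROLE=pending   # new accounts wait in "pending" until an admin approves them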
Backup
The ./data volume holds everything: chat history, users, settings, uploaded files. Back it up:
tar -czf openwebui-backup-$(date +%F).tar.gz /opt/openwebui/data
Schedule this nightly via cron and push the archive to object storage (Backblaze B2 is cheap).
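A minimal crontab entry for that (the b2 remote name is a placeholder for whatever you set up with rclone config; note the escaped % signs, which cron otherwise treats as line breaks):

0 3 * * * tar -czf /opt/backups/openwebui-$(date +\%F).tar.gz /opt/openwebui/data && rclone copy /opt/backups b2:your-bucket/openwebui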
Open WebUI VPS cost
A 2 GB ColossusCloud VPS is enough for Open WebUI with cloud model APIs. For Ollama + local models, step up to 8 GB.
With cloud APIs, you still pay per-token fees to OpenAI, Anthropic, or whichever provider you use. What you save is the per-seat ChatGPT Plus or Claude Pro subscriptions, and chat history stays on your server.
With fully local setups (Ollama only), you pay nothing beyond the VPS. Ideal for privacy-critical workloads or high-volume usage.
Why VPS over home server
A home server means worrying about uptime, power, internet, dynamic DNS, port forwarding, and security. A VPS handles all of that: your team can reach it from anywhere, it’s always online, and the upfront cost is low.
For production team use, a VPS is the right choice.
Deploy Open WebUI on a ColossusCloud VPS and get your own private LLM chat running in minutes.