Skip to content

Open WebUI

Open WebUI (126k★) is the most popular self-hosted chat interface for AI. With Hermes Agent’s built-in API server, you can use Open WebUI as a polished web frontend for your agent — complete with conversation management, user accounts, and a modern chat interface.

flowchart LR
A["Open WebUI<br/>browser UI<br/>port 3000"]
B["hermes-agent<br/>gateway API server<br/>port 8642"]
A -->|POST /v1/chat/completions| B
B -->|SSE streaming response| A

Open WebUI connects to Hermes Agent’s API server just like it would connect to OpenAI. Hermes handles the requests with its full toolset — terminal, file operations, web search, memory, skills — and returns the final response.

:::important Runtime location The API server is a Hermes agent runtime, not a pure LLM proxy. For each request, Hermes creates a server-side AIAgent on the API-server host. Tool calls run where that API server is running.

For example, if a laptop points Open WebUI or another OpenAI-compatible client at a Hermes API server on a remote machine, pwd, file tools, browser tools, local MCP tools, and other workspace tools run on the remote API-server host, not on the laptop. :::

Open WebUI talks to Hermes server-to-server, so you do not need API_SERVER_CORS_ORIGINS for this integration.

One-command local bootstrap (macOS/Linux, no Docker)

Section titled “One-command local bootstrap (macOS/Linux, no Docker)”

If you want Hermes + Open WebUI wired together locally with a reusable launcher, run:

Окно терминала
cd ~/.hermes/hermes-agent
bash scripts/setup_open_webui.sh

What the script does:

  • ensures ~/.hermes/.env contains API_SERVER_ENABLED, API_SERVER_HOST, API_SERVER_KEY, API_SERVER_PORT, and API_SERVER_MODEL_NAME
  • restarts the Hermes gateway so the API server comes up
  • installs Open WebUI into ~/.local/open-webui-venv
  • writes a launcher at ~/.local/bin/start-open-webui-hermes.sh
  • on macOS, installs a launchd user service; on Linux with systemd --user, installs a user service there

Defaults:

  • Hermes API: http://127.0.0.1:8642/v1
  • Open WebUI: http://127.0.0.1:8080
  • model name advertised to Open WebUI: Hermes Agent

Useful overrides:

Окно терминала
OPEN_WEBUI_NAME='My Hermes UI' \
OPEN_WEBUI_ENABLE_SIGNUP=true \
HERMES_API_MODEL_NAME='My Hermes Agent' \
bash scripts/setup_open_webui.sh

On Linux, automatic background service setup requires a working systemd --user session. If you are on a headless SSH box and want to skip service installation, run:

Окно терминала
OPEN_WEBUI_ENABLE_SERVICE=false bash scripts/setup_open_webui.sh
Окно терминала
hermes config set API_SERVER_ENABLED true
hermes config set API_SERVER_KEY your-secret-key

hermes config set auto-routes the flag to config.yaml and the secret to ~/.hermes/.env. If the gateway is already running, restart it so the change takes effect:

Окно терминала
hermes gateway stop && hermes gateway
Окно терминала
hermes gateway

You should see:

[API Server] API server listening on http://127.0.0.1:8642
Окно терминала
curl -s http://127.0.0.1:8642/health
# {"status": "ok", ...}
curl -s -H "Authorization: Bearer your-secret-key" http://127.0.0.1:8642/v1/models
# {"object":"list","data":[{"id":"hermes-agent", ...}]}

If /health fails, the gateway didn’t pick up API_SERVER_ENABLED=true — restart it. If /v1/models returns 401, your Authorization header doesn’t match API_SERVER_KEY.

Окно терминала
docker run -d -p 3000:8080 \
-e OPENAI_API_BASE_URL=http://host.docker.internal:8642/v1 \
-e OPENAI_API_KEY=your-secret-key \
-e ENABLE_OLLAMA_API=false \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main

ENABLE_OLLAMA_API=false suppresses the default Ollama backend, which would otherwise show up empty and clutter the model picker. Omit it if you actually have Ollama running alongside.

First launch takes 15–30 seconds: Open WebUI downloads sentence-transformer embedding models (~150MB) the first time it starts. Wait for docker logs open-webui to settle before opening the UI.

Go to http://localhost:3000. Create your admin account (the first user becomes admin). You should see your agent in the model dropdown (named after your profile, or hermes-agent for the default profile). Start chatting!

For a more permanent setup, create a docker-compose.yml:

services:
open-webui:
image: ghcr.io/open-webui/open-webui:main
ports:
- "3000:8080"
volumes:
- open-webui:/app/backend/data
environment:
- OPENAI_API_BASE_URL=http://host.docker.internal:8642/v1
- OPENAI_API_KEY=your-secret-key
- ENABLE_OLLAMA_API=false
extra_hosts:
- "host.docker.internal:host-gateway"
restart: always
volumes:
open-webui:

Then:

Окно терминала
docker compose up -d

If you prefer to configure the connection through the UI instead of environment variables:

  1. Log in to Open WebUI at http://localhost:3000
  2. Click your profile avatarAdmin Settings
  3. Go to Connections
  4. Under OpenAI API, click the wrench icon (Manage)
  5. Click + Add New Connection
  6. Enter:
    • URL: http://host.docker.internal:8642/v1
    • API Key: the exact same value as API_SERVER_KEY in Hermes
  7. Click the checkmark to verify the connection
  8. Save

Your agent model should now appear in the model dropdown (named after your profile, or hermes-agent for the default profile).

Environment variables only take effect on Open WebUI’s first launch. After that, connection settings are stored in its internal database. To change them later, use the Admin UI or delete the Docker volume and start fresh.

Open WebUI supports two API modes when connecting to a backend:

ModeFormatWhen to use
Chat Completions (default)/v1/chat/completionsRecommended. Works out of the box.
Responses (experimental)/v1/responsesFor server-side conversation state via previous_response_id.

This is the default and requires no extra configuration. Open WebUI sends standard OpenAI-format requests and Hermes Agent responds accordingly. Each request includes the full conversation history.

To use the Responses API mode:

  1. Go to Admin SettingsConnectionsOpenAIManage
  2. Edit your hermes-agent connection
  3. Change API Type from “Chat Completions” to “Responses (Experimental)”
  4. Save

With the Responses API, Open WebUI sends requests in the Responses format (input array + instructions), and Hermes Agent can preserve full tool call history across turns via previous_response_id. When stream: true, Hermes also streams spec-native function_call and function_call_output items, which enables custom structured tool-call UI in clients that render Responses events.

When you send a message in Open WebUI:

  1. Open WebUI sends a POST /v1/chat/completions request with your message and conversation history
  2. Hermes Agent creates a server-side AIAgent instance using the API server’s profile, model/provider config, memory, skills, and configured API-server toolsets
  3. The agent processes your request — it may call tools (terminal, file operations, web search, etc.) on the API-server host
  4. As tools execute, inline progress messages stream to the UI so you can see what the agent is doing (e.g. `💻 ls -la`, `🔍 Python 3.12 release`)
  5. The agent’s final text response streams back to Open WebUI
  6. Open WebUI displays the response in its chat interface

Your agent has access to the same tools and capabilities as that API-server Hermes instance. If the API server is remote, those tools are remote too.

If you need tools to run against your local workspace today, run Hermes locally and point it at a pure LLM provider or pure OpenAI-compatible model proxy (for example vLLM, LiteLLM, Ollama, llama.cpp, OpenAI, OpenRouter, etc.). A future split-runtime mode for “remote brain, local hands” is being tracked in #18715; it is not the behavior of the current API server.

:::tip Tool Progress With streaming enabled (the default), you’ll see brief inline indicators as tools run — the tool emoji and its key argument. These appear in the response stream before the agent’s final answer, giving you visibility into what’s happening behind the scenes. :::

VariableDefaultDescription
API_SERVER_ENABLEDfalseEnable the API server
API_SERVER_PORT8642HTTP server port
API_SERVER_HOST127.0.0.1Bind address
API_SERVER_KEY(required)Bearer token for auth. Match OPENAI_API_KEY.
VariableDescription
OPENAI_API_BASE_URLHermes Agent’s API URL (include /v1)
OPENAI_API_KEYMust be non-empty. Match your API_SERVER_KEY.
  • Check the URL has /v1 suffix: http://host.docker.internal:8642/v1 (not just :8642)
  • Verify the gateway is running: curl http://localhost:8642/health should return {"status": "ok"}
  • Check model listing: curl -H "Authorization: Bearer your-secret-key" http://localhost:8642/v1/models should return a list with hermes-agent
  • Docker networking: From inside Docker, localhost means the container, not your host. Use host.docker.internal or --network=host.
  • Empty Ollama backend shadowing the picker: If you omitted ENABLE_OLLAMA_API=false, Open WebUI shows an empty Ollama section above your Hermes models. Restart the container with -e ENABLE_OLLAMA_API=false or disable Ollama in Admin Settings → Connections.

This is almost always the missing /v1 suffix. Open WebUI’s connection test is a basic connectivity check — it doesn’t verify model listing works.

Hermes Agent may be executing multiple tool calls (reading files, running commands, searching the web) before producing its final response. This is normal for complex queries. The response appears all at once when the agent finishes.

Make sure your OPENAI_API_KEY in Open WebUI matches the API_SERVER_KEY in Hermes Agent.

Open WebUI persists OpenAI-compatible connection settings in its own database after first launch. If you accidentally saved a wrong key in the Admin UI, fixing the environment variables alone is not enough — update or delete the saved connection in Admin Settings → Connections, or reset the Open WebUI data directory / database.

To run separate Hermes instances per user — each with their own config, memory, and skills — use profiles. Each profile runs its own API server on a different port and automatically advertises the profile name as the model in Open WebUI.

1. Create profiles and configure API servers

Section titled “1. Create profiles and configure API servers”
Окно терминала
hermes profile create alice
hermes -p alice config set API_SERVER_ENABLED true
hermes -p alice config set API_SERVER_PORT 8643
hermes -p alice config set API_SERVER_KEY alice-secret
hermes profile create bob
hermes -p bob config set API_SERVER_ENABLED true
hermes -p bob config set API_SERVER_PORT 8644
hermes -p bob config set API_SERVER_KEY bob-secret
Окно терминала
hermes -p alice gateway &
hermes -p bob gateway &

In Admin SettingsConnectionsOpenAI APIManage, add one connection per profile:

ConnectionURLAPI Key
Alicehttp://host.docker.internal:8643/v1alice-secret
Bobhttp://host.docker.internal:8644/v1bob-secret

The model dropdown will show alice and bob as distinct models. You can assign models to Open WebUI users via the admin panel, giving each user their own isolated Hermes agent.

:::tip Custom Model Names The model name defaults to the profile name. To override it, set API_SERVER_MODEL_NAME in the profile’s .env:

Окно терминала
hermes -p alice config set API_SERVER_MODEL_NAME "Alice's Agent"

:::

On Linux without Docker Desktop, host.docker.internal doesn’t resolve by default. Options:

Окно терминала
# Option 1: Add host mapping
docker run --add-host=host.docker.internal:host-gateway ...
# Option 2: Use host networking
docker run --network=host -e OPENAI_API_BASE_URL=http://localhost:8642/v1 ...
# Option 3: Use Docker bridge IP
docker run -e OPENAI_API_BASE_URL=http://172.17.0.1:8642/v1 ...