Configuring Models
Hermes uses two kinds of model slots:
- Main model — what the agent thinks with. Every user message, every tool-call loop, every streamed response goes through this model.
- Auxiliary models — smaller side-jobs the agent offloads. Context compression, vision (image analysis), web-page summarization, session search, approval scoring, MCP tool routing, session-title generation, and skill search. Each has its own slot and can be overridden independently.
This page covers configuring both from the dashboard. If you prefer config files or the CLI, jump to Alternative methods at the bottom.
The Models page
Open the dashboard and click Models in the sidebar. You get two sections:
- Model Settings — the top panel, where you assign models to slots.
- Usage analytics — ranked cards showing every model that ran a session in the selected period, with token counts, cost, and capability badges.

The top card is the Model Settings panel. The main row always shows what the agent will spin up for new sessions. Click Change to open the picker.
Setting the main model
Click Change on the Main model row:

The picker has two columns:
- Left: authenticated providers. Only providers you’ve set up (API key set, OAuth’d, or defined as a custom endpoint) show up here. If a provider is missing, head to Keys and add its credential.
- Right: the curated model list for the selected provider. These are the agentic models Hermes recommends for that provider, not the raw /models dump (which on OpenRouter lists 400+ models, among them TTS, image generators, and rerankers).
Type in the filter box to narrow by provider name, slug, or model ID.
Pick a model, hit Switch, and Hermes writes it to ~/.hermes/config.yaml under the model section. This applies to new sessions only — any chat tab you already have open keeps running whatever model it started with. To hot-swap the current chat, use the /model slash command inside it.
Setting auxiliary models
Click Show auxiliary to reveal the eight task slots:

Every auxiliary task defaults to auto — meaning Hermes uses your main model for that job too. Override a specific task when you want a cheaper or faster model for a side-job.
Common override patterns
| Task | When to override |
|---|---|
| Title Gen | Almost always. A $0.10/M flash model writes session titles as well as Opus. Default config sets this to google/gemini-3-flash-preview on OpenRouter. |
| Vision | When your main model is a coding model without vision (e.g. Kimi, DeepSeek). Point it at google/gemini-2.5-flash or gpt-4o-mini. |
| Compression | When you’re burning reasoning tokens on Opus/M2.7 just to summarize context. A fast chat model does the job at 1/50th the cost. |
| Session Search | When recall queries fan out — default max_concurrency is 3. A cheap model keeps the bill predictable. |
| Approval | For approval_mode: smart — a fast/cheap model (haiku, flash, gpt-5-mini) decides whether to auto-approve low-risk commands. Expensive models here are waste. |
| Web Extract | When you use web_extract heavily. Same logic as compression — summarization doesn’t need reasoning. |
| Skills Hub | hermes skills search uses this. Usually fine at auto. |
| MCP | MCP tool routing. Usually fine at auto. |
Per-task override
Click Change on any auxiliary row. The same picker opens and behaves the same way: pick a provider and model, then hit Switch. The row updates to show provider · model instead of auto (use main model).
Reset all to auto
If you’ve over-tuned and want to start over, click Reset all to auto at the top of the auxiliary section. Every slot goes back to using your main model.
The “Use as” shortcut
Every model card on the page has a Use as dropdown. This is the fast path: pick a model you see in your analytics, click Use as, and assign it to the main slot or any specific auxiliary task in one click:

The dropdown has:
- Main model: same as clicking Change on the main row.
- All auxiliary tasks: assigns this model to all 8 aux slots at once. Useful when you just want every side-job on a cheap flash model.
- Individual task options: Vision, Web Extract, Compression, etc. The currently-assigned model for each task is marked current.
Cards are badged with main or aux · <task> when they’re currently assigned to something — so you can see at a glance which of your historical models are wired in where.
What gets written to config.yaml
When you save via the dashboard, Hermes writes to ~/.hermes/config.yaml:
Main model:

    model:
      provider: openrouter
      default: anthropic/claude-opus-4.7
      base_url: ''  # cleared on provider switch
      api_mode: chat_completions

Auxiliary override (example: vision on gemini-flash):

    auxiliary:
      vision:
        provider: openrouter
        model: google/gemini-2.5-flash
        base_url: ''
        api_key: ''
        timeout: 120
        extra_body: {}
        download_timeout: 30

Auxiliary on auto (default):

    auxiliary:
      compression:
        provider: auto
        model: ''
        base_url: ''
        # ... other fields unchanged

provider: auto with model: '' tells Hermes to use the main model for that task.
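The auto rule can be sketched in a few lines of Python. This illustrates the documented behavior and is not Hermes's actual resolver: the function name `resolve_task_model` is hypothetical, and the config is assumed to be already loaded into a plain dict shaped like the YAML above. This page does not say what happens when provider is auto but model is non-empty, so the sketch simply returns that slot unchanged.

```python
def resolve_task_model(config: dict, task: str) -> tuple[str, str]:
    """Return (provider, model) for an auxiliary task.

    Hypothetical helper: mirrors the documented rule that
    provider 'auto' with an empty model means "use the main model".
    """
    main = config["model"]
    slot = config.get("auxiliary", {}).get(task, {})
    provider = slot.get("provider", "auto")
    model = slot.get("model", "")
    if provider == "auto" and not model:
        # Slot is on auto (or missing entirely): use the main model.
        return main["provider"], main["default"]
    # Explicit override: use the slot as written.
    return provider, model


# Example config, mirroring the YAML fragments above.
config = {
    "model": {"provider": "openrouter", "default": "anthropic/claude-opus-4.7"},
    "auxiliary": {
        "vision": {"provider": "openrouter", "model": "google/gemini-2.5-flash"},
        "compression": {"provider": "auto", "model": ""},
    },
}
```

With this config, `resolve_task_model(config, "vision")` returns the override, while `"compression"` resolves to the main model.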
When does it take effect?
- CLI (hermes chat): next hermes chat invocation.
- Gateway (Telegram, Discord, Slack, etc.): next new session. Existing sessions keep their model. Restart the gateway (hermes gateway restart) if you want to force all sessions to pick up the change.
- Dashboard chat tab (/chat): next new PTY. The currently-open chat keeps its model; use /model inside it to hot-swap.
Changes never invalidate prompt caches on running sessions. That’s deliberate: swapping the main model inside a session requires a cache reset (the system prompt contains model-specific content), and we reserve that for the explicit /model slash command inside chat.
Troubleshooting
“No authenticated providers” in the picker
Hermes lists a provider only if it has a working credential. Check Keys in the sidebar. You should see one of: an API key, a successful OAuth, or a custom endpoint URL. If the provider you want isn’t there, run hermes setup to wire it up, or go to Keys and add the env var.
Main model didn’t change in my running chat
Expected. The dashboard writes config.yaml, which new sessions read. The currently-open chat is a live agent process, so it keeps whatever model it was spawned with. Use /model <name> inside the chat to hot-swap that specific session.
Auxiliary override “didn’t take effect”
Three things to check:
- Did you start a new session? Existing chats don’t re-read config.
- Is provider set to something other than auto? If the field shows auto, the task is still using your main model. Click Change and pick a real provider.
- Is the provider authenticated? If you assigned minimax to a task but don’t have a MiniMax API key, that task falls back to the openrouter default and logs a warning in agent.log.
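The last two checks can be automated against a loaded config. The sketch below is only an illustration: `diagnose_override` is a hypothetical helper, not part of Hermes. It assumes the config is a dict shaped like ~/.hermes/config.yaml and that the caller supplies the set of authenticated provider names. It cannot tell whether you started a new session.

```python
def diagnose_override(config: dict, task: str, authenticated: set[str]) -> list[str]:
    """List likely reasons an auxiliary override is not active.

    Hypothetical helper; `authenticated` is the set of provider names
    that have a working credential (supplied by the caller).
    """
    problems = []
    slot = config.get("auxiliary", {}).get(task, {})
    provider = slot.get("provider", "auto")
    if provider == "auto":
        # Slot was never overridden, so the main model is still in use.
        problems.append(f"{task}: provider is 'auto', so the main model is used")
    elif provider not in authenticated:
        # Per the docs, Hermes would fall back and log a warning here.
        problems.append(f"{task}: provider '{provider}' has no credential")
    return problems
```

An empty list means neither config-level problem was found, which leaves a stale session as the most likely cause.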
I picked a model but Hermes switched providers on me
On OpenRouter (or any aggregator), bare model names resolve within the aggregator first. So claude-sonnet-4 on OpenRouter becomes anthropic/claude-sonnet-4.6, staying on your OpenRouter auth. But if you typed claude-sonnet-4 on a native Anthropic auth, it would stay as claude-sonnet-4-6. If you see an unexpected provider switch, check that your current provider is what you expect; the picker always shows the current main at the top of the dialog.
Alternative methods
CLI slash command
Inside any hermes chat session:

    /model gpt-5.4 --provider openrouter            # session-only
    /model gpt-5.4 --provider openrouter --global   # also persists to config.yaml

--global does the same thing the dashboard’s Change button does, plus it switches the running session in-place.
Custom aliases
Define your own short names for models you reach for often, then use /model <alias> in the CLI or any messaging platform:

    model_aliases:
      fav:
        model: claude-sonnet-4.6
        provider: anthropic
      grok:
        model: grok-4
        provider: x-ai

Or from the shell (short form, provider/model):

    hermes config set model.aliases.fav anthropic/claude-opus-4.6
    hermes config set model.aliases.grok x-ai/grok-4

Then /model fav or /model grok in chat. User aliases shadow built-in short names (sonnet, kimi, opus, etc.). See Custom model aliases for the full reference.
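To make the short form and the shadowing rule concrete, here is a minimal Python sketch. Both function names are hypothetical, and splitting at the first slash is an assumption about how provider/model is parsed. The built-in alias table is passed in by the caller because this page does not list what the built-in short names point to.

```python
def parse_short_form(value: str) -> tuple[str, str]:
    """Split 'provider/model' at the first slash (assumed parsing rule)."""
    provider, sep, model = value.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model, got {value!r}")
    return provider, model


def resolve_alias(name: str, user_aliases: dict, builtin_aliases: dict):
    """Look up an alias; user-defined entries shadow built-in ones.

    Returns a (provider, model) tuple, or None if the name is unknown.
    """
    if name in user_aliases:
        return user_aliases[name]
    return builtin_aliases.get(name)
```

For example, `parse_short_form("x-ai/grok-4")` yields `("x-ai", "grok-4")`, and a user alias named `opus` would win over a built-in `opus`.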
hermes model subcommand
    hermes model list   # list authenticated providers + models
    hermes model set anthropic/claude-opus-4.7 --provider openrouter

Direct config edit
Edit ~/.hermes/config.yaml and restart whatever reads it. See the Configuration reference for the full schema.
REST API
The dashboard uses three endpoints. Useful for scripting:

    # List authenticated providers + curated model lists
    curl -H "X-Hermes-Session-Token: $TOKEN" http://localhost:PORT/api/model/options

    # Read current main + auxiliary assignments
    curl -H "X-Hermes-Session-Token: $TOKEN" http://localhost:PORT/api/model/auxiliary

    # Set the main model
    curl -X POST -H "Content-Type: application/json" -H "X-Hermes-Session-Token: $TOKEN" \
      -d '{"scope":"main","provider":"openrouter","model":"anthropic/claude-opus-4.7"}' \
      http://localhost:PORT/api/model/set

    # Override a single auxiliary task
    curl -X POST -H "Content-Type: application/json" -H "X-Hermes-Session-Token: $TOKEN" \
      -d '{"scope":"auxiliary","task":"vision","provider":"openrouter","model":"google/gemini-2.5-flash"}' \
      http://localhost:PORT/api/model/set

    # Assign one model to every auxiliary task
    curl -X POST -H "Content-Type: application/json" -H "X-Hermes-Session-Token: $TOKEN" \
      -d '{"scope":"auxiliary","task":"","provider":"openrouter","model":"google/gemini-2.5-flash"}' \
      http://localhost:PORT/api/model/set

    # Reset all auxiliary tasks to auto
    curl -X POST -H "Content-Type: application/json" -H "X-Hermes-Session-Token: $TOKEN" \
      -d '{"scope":"auxiliary","task":"__reset__","provider":"","model":""}' \
      http://localhost:PORT/api/model/set

The session token is injected into the dashboard HTML at startup and rotates on every server restart. Grab it from the browser devtools (window.__HERMES_SESSION_TOKEN__) if you’re scripting against a running dashboard.
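If you would rather script these calls from Python than from curl, the POST to /api/model/set can be built with the standard library alone. This is a sketch under stated assumptions: `build_set_request` is a hypothetical helper, the payload fields and header name are copied from the curl examples, and the response format is not documented here, so the sketch only builds the request and does not parse a reply.

```python
import json
import urllib.request
from typing import Optional


def build_set_request(base_url: str, token: str, scope: str,
                      provider: str, model: str,
                      task: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/model/set.

    Hypothetical helper. `task` is omitted for scope "main"; for scope
    "auxiliary" pass a task name, "" for all tasks, or "__reset__".
    """
    payload = {"scope": scope, "provider": provider, "model": model}
    if task is not None:
        payload["task"] = task
    return urllib.request.Request(
        f"{base_url}/api/model/set",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Hermes-Session-Token": token,
        },
        method="POST",
    )

# Usage (needs a running dashboard; substitute your own port and token):
#   req = build_set_request(base_url, token, "auxiliary",
#                           "openrouter", "google/gemini-2.5-flash", task="vision")
#   urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before anything is written to config.yaml.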