Skip to content

Google Gemini

Hermes Agent supports Google Gemini as a native provider using the Google AI Studio / Gemini API — not the OpenAI-compatible endpoint. This lets Hermes translate its internal OpenAI-shaped message and tool loop into Gemini’s native generateContent API while preserving tool calling, streaming, multimodal inputs, and Gemini-specific response metadata.

Hermes also supports a separate Google Gemini (OAuth) provider that uses the same Cloud Code Assist backend as Google’s Gemini CLI. Use the API-key provider (gemini) for the lowest-risk official API path.

  • Google AI Studio API key — create one at aistudio.google.com/apikey
  • Billing-enabled Google Cloud project — recommended for agent use. Gemini’s free tier is too small for long-running agent sessions because Hermes may make several model calls per user turn.
  • Hermes installed — no extra Python package is required for the native Gemini provider.

:::tip API key path Set GOOGLE_API_KEY or GEMINI_API_KEY. Hermes checks both names for the gemini provider. :::

Окно терминала
# Add your Gemini API key
echo "GOOGLE_API_KEY=..." >> ~/.hermes/.env
# Select Gemini as your provider
hermes model
# → Choose "More providers..." → "Google AI Studio"
# → Hermes checks your key tier and shows Gemini models
# → Select a model
# Start chatting
hermes chat

If you prefer direct config editing, use the native Gemini API base URL:

model:
default: gemini-3-flash-preview
provider: gemini
base_url: https://generativelanguage.googleapis.com/v1beta

After running hermes model, your ~/.hermes/config.yaml will contain:

model:
default: gemini-3-flash-preview
provider: gemini
base_url: https://generativelanguage.googleapis.com/v1beta

And in ~/.hermes/.env:

Окно терминала
GOOGLE_API_KEY=...

The recommended endpoint is:

https://generativelanguage.googleapis.com/v1beta

Hermes detects this endpoint and creates its native Gemini adapter. Internally, Hermes still keeps the agent loop in OpenAI-shaped messages, then translates each request to Gemini’s native schema:

  • messages[] → Gemini contents[]
  • system prompts → Gemini systemInstruction
  • tool schemas → Gemini functionDeclarations
  • tool results → Gemini functionResponse parts
  • streaming responses → OpenAI-shaped stream chunks for the Hermes loop

:::note Gemini 3 thought signatures For Gemini 3 tool use, Hermes preserves the thoughtSignature values attached to function-call parts and replays them on the next tool turn. That covers the validation-critical path for multi-step agent workflows.

Gemini 3 may also attach thought signatures to other response parts. Hermes’ native adapter is optimized for agent tool loops today, so it does not yet replay every non-tool-call signature with full part-level fidelity. :::

Google also exposes an OpenAI-compatible endpoint:

https://generativelanguage.googleapis.com/v1beta/openai/

For Hermes agent sessions, prefer the native Gemini endpoint above. Hermes includes a native Gemini adapter so it can map multi-turn tool use, tool-call results, streaming, multimodal inputs, and Gemini response metadata directly onto Gemini’s generateContent API. The OpenAI-compatible endpoint is still useful when you specifically need OpenAI API compatibility.

If you previously set GEMINI_BASE_URL to the /openai URL, remove it or change it:

Окно терминала
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta

Hermes also has a google-gemini-cli provider:

Окно терминала
hermes model
# → Choose "Google Gemini (OAuth)"

This uses browser PKCE login and the Cloud Code Assist backend. It can be useful for users who want Gemini CLI-style OAuth, but Hermes shows an explicit warning because Google may treat use of the Gemini CLI OAuth client from third-party software as a policy violation. For production or lowest-risk usage, prefer the API-key provider above.

The hermes model picker shows Gemini models maintained in Hermes’ provider registry. Common choices include:

ModelIDNotes
Gemini 3.1 Pro Previewgemini-3.1-pro-previewMost capable preview model when available
Gemini 3 Pro Previewgemini-3-pro-previewStrong reasoning and coding model
Gemini 3 Flash Previewgemini-3-flash-previewRecommended default balance of speed and capability
Gemini 3.1 Flash Lite Previewgemini-3.1-flash-lite-previewFastest / lowest-cost option when available

Model availability changes over time. If a model disappears or is not enabled for your key, run hermes model again and pick one from the current list.

:::info Model IDs Use Gemini’s native model IDs such as gemini-3-flash-preview, not OpenRouter-style IDs like google/gemini-3-flash-preview, when provider: gemini. :::

Google publishes moving aliases for the Pro and Flash Gemini families. gemini-pro-latest and gemini-flash-latest are useful when you want Google to advance the model automatically without changing your Hermes config.

AliasCurrently tracksNotes
gemini-pro-latestLatest Gemini Pro modelBest when you want Google’s current Pro default
gemini-flash-latestLatest Gemini Flash modelBest when you want Google’s current Flash default
model:
default: gemini-pro-latest
provider: gemini
base_url: https://generativelanguage.googleapis.com/v1beta

If you need strict reproducibility, prefer explicit model IDs such as gemini-3.1-pro-preview or gemini-3-flash-preview.

Google also exposes Gemma models through the Gemini API. Hermes recognizes these as Google models, but hides very low-throughput Gemma entries from the default model picker so new users do not accidentally select an evaluation-tier model for a long-running agent session.

Useful evaluation IDs include:

ModelIDNotes
Gemma 4 31B ITgemma-4-31b-itLarger Gemma model; useful for compatibility and quality evaluation
Gemma 4 26B A4B ITgemma-4-26b-a4b-itSmaller active-parameter variant when available

These models are best treated as evaluation options on Gemini API keys. Google’s Gemma API pricing is free-tier-only and the usage caps are low compared with production Gemini models, so sustained Hermes agent use should normally move to a paid Gemini model, a self-hosted deployment, or another provider with appropriate quota.

To use a Gemma model that is hidden from the picker, set it directly:

model:
default: gemma-4-31b-it
provider: gemini
base_url: https://generativelanguage.googleapis.com/v1beta

Use the /model command during a conversation:

/model gemini-3-flash-preview
/model gemini-flash-latest
/model gemini-3-pro-preview
/model gemini-pro-latest
/model gemma-4-31b-it
/model gemini-3.1-flash-lite-preview

If you have not configured Gemini yet, exit the session and run hermes model first. /model switches among already-configured providers and models; it does not collect new API keys.

Окно терминала
hermes doctor

The doctor checks:

  • Whether GOOGLE_API_KEY or GEMINI_API_KEY is available
  • Whether Gemini OAuth credentials exist for google-gemini-cli
  • Whether configured provider credentials can be resolved

For OAuth quota usage, run this inside a Hermes session:

/gquota

/gquota applies to the google-gemini-cli OAuth provider, not the AI Studio API-key provider.

Gemini works with all Hermes gateway platforms (Telegram, Discord, Slack, WhatsApp, LINE, Feishu, etc.). Configure Gemini as your provider, then start the gateway normally:

Окно терминала
hermes gateway setup
hermes gateway start

The gateway reads config.yaml and uses the same Gemini provider configuration.

”Gemini native client requires an API key”

Section titled “”Gemini native client requires an API key””

Hermes could not find a usable API key. Add one of these to ~/.hermes/.env:

Окно терминала
GOOGLE_API_KEY=...
# or
GEMINI_API_KEY=...

Then run hermes model again.

”This Google API key is on the free tier”

Section titled “”This Google API key is on the free tier””

Hermes probes Gemini API keys during setup. Free-tier quotas can be exhausted after a handful of agent turns because tool use, retries, compression, and auxiliary tasks may require multiple model calls.

Enable billing on the Google Cloud project attached to your key, regenerate the key if needed, then run:

Окно терминала
hermes model

The selected model is not available for your account, region, or key. Run hermes model again and pick another Gemini model from the current list.

Hermes may hide low-throughput Gemma models from the picker by default. If you intentionally want to evaluate one, set the model ID directly in ~/.hermes/config.yaml.

Gemma models exposed through the Gemini API are useful for evaluation, but their Gemini API free-tier caps are low. Use them for compatibility testing, then switch to a paid Gemini model or another provider for sustained agent sessions.

Check ~/.hermes/.env for:

Окно терминала
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai/

Change it to the native endpoint or remove the override:

Окно терминала
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta

The google-gemini-cli provider uses a Gemini CLI / Cloud Code Assist OAuth flow. Hermes warns before starting it because this is distinct from the official AI Studio API-key path. Use provider: gemini with GOOGLE_API_KEY for the official API-key integration.

Upgrade Hermes and rerun hermes model. The native Gemini adapter sanitizes tool schemas for Gemini’s stricter function-declaration format; older builds or custom endpoints may not.