Configure Models and Providers in Hermes Agent

Hermes Agent 🟢 Beginner ⏱️ 11 min read 📅 2026-05-05

Introduction

After installing Hermes Agent and verifying that the CLI responds correctly, the next step is crucial: choosing and configuring the AI model that powers the agent. The model directly determines response quality, tool usage capability, and the cost of every interaction. This article covers provider configuration, API key management, multi-provider routing, and best practices for a stable, performant setup.

The hermes model command: the central entry point

Hermes provides a single interactive command to manage all model-related configuration:

hermes model

This command opens an interactive menu that guides you through:

  • Choosing a provider (Anthropic, OpenRouter, DeepSeek, Nous Portal, GitHub Copilot, etc.)
  • Selecting a model from those available at the provider
  • Configuring authentication (API key or OAuth)
  • Setting up auxiliary models (vision, compression, web extraction)

You can switch providers at any time; there is no lock-in. Run hermes model again, select a different provider, and the configuration updates instantly.

Interactive menu details

The hermes model menu offers several entries:

  • Choose a provider: selection from all supported providers
  • Configure auxiliary models: configure secondary models (vision, web_extract, compression, etc.)
  • View current configuration: summary of the active configuration

This interactive approach eliminates the need to dig through configuration files manually, though direct configuration remains possible for advanced users.

Supported providers

Hermes Agent supports a broad provider ecosystem. Here are the main ones, organized by access type.

OAuth-authenticated providers (no API key)

These providers use a browser-based OAuth flow, ideal if you don't want to manage API keys:

  • Nous Portal: monthly subscription, zero configuration. Login via hermes model.
  • OpenAI Codex: uses Codex models with your ChatGPT account. Device code authentication via hermes model.
  • Anthropic (OAuth): requires a Claude Max plan + additional credits. Authentication routes through Claude Code.
  • MiniMax (OAuth): access to MiniMax-M2.7 models without an API key. Browser login via hermes model.
  • GitHub Copilot: uses your Copilot subscription to access GPT-5.x, Claude, Gemini, etc. OAuth via hermes model or COPILOT_GITHUB_TOKEN.

API key providers

These providers require an API key from their respective consoles:

  • Anthropic (ANTHROPIC_API_KEY): direct access to Claude models. Get your key at console.anthropic.com.
  • OpenRouter (OPENROUTER_API_KEY): the most versatile option, routing to dozens of models (Claude, GPT, Gemini, DeepSeek, Qwen, etc.). Key at openrouter.ai/keys.
  • DeepSeek (DEEPSEEK_API_KEY): direct access to DeepSeek models. Key at platform.deepseek.com.
  • Google AI Studio (GOOGLE_API_KEY or GEMINI_API_KEY): native Gemini models. Key at aistudio.google.com.
  • Hugging Face (HF_TOKEN): routes to 20+ open-source models via a unified endpoint. Token at huggingface.co/settings/tokens.
  • Alibaba Cloud / DashScope (DASHSCOPE_API_KEY): Qwen models.
  • Kimi / Moonshot (KIMI_API_KEY): Moonshot coding and chat models.
  • NVIDIA NIM (NVIDIA_API_KEY): Nemotron models.
  • xAI (XAI_API_KEY): Grok models.

Custom Endpoint (self-hosted)

For users running their own models via Ollama, vLLM, SGLang, or any OpenAI-compatible endpoint:

hermes model
# Select "Custom Endpoint"
# Enter the base URL and model name

Or via direct configuration with OPENAI_BASE_URL and OPENAI_API_KEY.
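
Before pointing Hermes at a self-hosted server, it can help to confirm that the endpoint answers. OpenAI-compatible servers generally list their models at GET {base_url}/models; the sketch below builds that request and extracts the model IDs from the response. The helper names (models_url, parse_model_ids, list_models) are illustrative and not part of Hermes, and whether a given server implements this route is an assumption to verify.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style {"data": [{"id": ...}]} response."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url: str, api_key: str, timeout: float = 5.0) -> list[str]:
    """Query the endpoint and return the model IDs it advertises."""
    request = urllib.request.Request(
        models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_model_ids(json.load(response))
```

With a local Ollama server running, list_models("http://localhost:11434/v1", "ollama") should return the names you can then pass to hermes config set model.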

Configuration storage: .env vs config.yaml

Hermes cleanly separates secrets from regular configuration. Everything lives in the ~/.hermes/ directory:

~/.hermes/
├── config.yaml    # Non-secret parameters (model, terminal, compression, etc.)
├── .env           # API keys and tokens (secrets)
├── auth.json      # OAuth credentials (Nous Portal, etc.)
└── ...

Fundamental rule

  • Secrets (API keys, tokens, passwords) → ~/.hermes/.env
  • Everything else (model, terminal backend, limits, toolsets) → ~/.hermes/config.yaml

The hermes config set command

This is the recommended method for modifying any configuration. It automatically routes the value to the correct file:

# Set the model (goes into config.yaml)
hermes config set model anthropic/claude-sonnet-4

# Set an API key (goes into .env)
hermes config set OPENROUTER_API_KEY sk-or-xxxxx

# Change the terminal backend (goes into config.yaml)
hermes config set terminal.backend docker

# View the full configuration
hermes config

# Open config.yaml in your editor
hermes config edit
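
The routing performed by hermes config set can be pictured with a small sketch. It assumes a simple rule inferred from the examples above: names written in UPPER_SNAKE_CASE (environment-variable style) go to .env, and everything else goes to config.yaml. This rule and the target_file helper are assumptions for illustration; the actual logic inside Hermes may differ.

```python
import re
from pathlib import Path

HERMES_HOME = Path.home() / ".hermes"

# Assumption: environment-variable style names (UPPER_SNAKE_CASE) belong in .env.
ENV_STYLE = re.compile(r"^[A-Z][A-Z0-9_]*$")


def target_file(key: str) -> Path:
    """Return the file a setting would be written to under the assumed rule."""
    if ENV_STYLE.match(key):
        return HERMES_HOME / ".env"
    return HERMES_HOME / "config.yaml"


for key in ("model", "OPENROUTER_API_KEY", "terminal.backend"):
    print(f"{key} -> {target_file(key).name}")
```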

Resolution priority

Parameters are resolved in this order (decreasing priority):

  1. CLI arguments: hermes chat --model anthropic/claude-sonnet-4 (per-invocation override)
  2. ~/.hermes/config.yaml: main configuration file
  3. ~/.hermes/.env: fallback for environment variables, required for secrets
  4. Default values: built-in safe defaults when nothing is configured
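
The priority order above amounts to "first source that defines the value wins". The following is a minimal sketch of that idea, not Hermes' implementation: the resolve function, the dictionary layout, and the placeholder default value are all assumptions.

```python
from typing import Any, Mapping, Optional, Sequence


def resolve(name: str, sources: Sequence[Mapping[str, Any]]) -> Optional[Any]:
    """Return the value from the first source that defines `name`."""
    for source in sources:
        if source.get(name) is not None:
            return source[name]
    return None


# Sources listed from highest to lowest priority.
cli_args = {"model": "anthropic/claude-sonnet-4"}
config_yaml = {"model": "deepseek/deepseek-chat", "terminal.backend": "docker"}
dotenv = {"OPENROUTER_API_KEY": "sk-or-xxxxx"}
defaults = {"terminal.backend": "local"}  # placeholder default, not a documented value
sources = [cli_args, config_yaml, dotenv, defaults]

print(resolve("model", sources))             # anthropic/claude-sonnet-4 (CLI wins)
print(resolve("terminal.backend", sources))  # docker (config.yaml beats the default)
```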

Concrete configuration per provider

Anthropic (Claude)

# Via hermes model (recommended)
hermes model
# → Choose "Anthropic"
# → Enter your API key

# Or manually
hermes config set model anthropic/claude-sonnet-4
hermes config set ANTHROPIC_API_KEY sk-ant-xxxxx

OpenRouter

# Get a key at https://openrouter.ai/keys
hermes config set OPENROUTER_API_KEY sk-or-xxxxx
hermes config set model anthropic/claude-opus-4

# Popular models on OpenRouter:
# - anthropic/claude-opus-4 (advanced reasoning)
# - anthropic/claude-sonnet-4 (fast, good value)
# - openai/gpt-4o (versatile)
# - google/gemini-2.5-flash (fast, affordable)

DeepSeek

hermes config set DEEPSEEK_API_KEY sk-xxxxx
hermes config set model deepseek/deepseek-chat

GitHub Copilot

Two authentication options:

# Option 1: OAuth via hermes model (recommended)
hermes model
# → Choose "GitHub Copilot"
# → Browser authentication

# Option 2: Existing GitHub token
hermes config set COPILOT_GITHUB_TOKEN gho_xxxxx
hermes config set model github/copilot

Nous Portal

hermes model
# → Choose "Nous Portal"
# → OAuth login, no API key required

Custom Endpoint (Ollama, vLLM, etc.)

# Example with local Ollama
hermes config set OPENAI_BASE_URL http://localhost:11434/v1
hermes config set OPENAI_API_KEY ollama
hermes config set model llama3.1:70b

# Example with vLLM
hermes config set OPENAI_BASE_URL http://localhost:8000/v1
hermes config set OPENAI_API_KEY empty
hermes config set model meta-llama/Llama-3.1-70B-Instruct

Minimum requirement: 64,000 context tokens

Hermes Agent requires a model with at least 64,000 context tokens. Models with smaller windows are rejected at startup.

Why this requirement? Hermes' multi-step workflows (tool calling, file reading, command execution) consume context rapidly, because each tool call and its output is added to the conversation. A model with a 32K-token window does not leave enough working memory to sustain a productive conversation with tool calls.
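
A startup check of this kind can be sketched as follows. The 64,000-token threshold comes from the text above; the function name, the exception type, and the error message are hypothetical and only illustrate the idea.

```python
MIN_CONTEXT_TOKENS = 64_000  # threshold stated in this article


class ContextTooSmallError(ValueError):
    """Raised when a model's context window is below the minimum."""


def check_context_window(model: str, context_tokens: int) -> None:
    """Reject models whose context window is smaller than the minimum."""
    if context_tokens < MIN_CONTEXT_TOKENS:
        raise ContextTooSmallError(
            f"{model}: {context_tokens} tokens is below the "
            f"{MIN_CONTEXT_TOKENS}-token minimum"
        )


check_context_window("llama3.1:70b", 65_536)  # passes silently

try:
    check_context_window("small-local-model", 32_768)
except ContextTooSmallError as error:
    print(error)
```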

Most hosted models (Claude, GPT, Gemini, Qwen, DeepSeek) far exceed this threshold. For local models, make sure to configure the context size:

# llama.cpp
--ctx-size 65536

# Ollama (set the context length when starting the server, or per model via the num_ctx parameter)
OLLAMA_CONTEXT_LENGTH=65536 ollama serve

Switching models in one command

Switching between providers or models is instant:

# Via hermes model (interactive)
hermes model

# Via hermes config set (direct)
hermes config set model anthropic/claude-sonnet-4

# Via /model command in session
/model
# → Interactive selection without leaving the conversation

No restart required. The change takes effect at the next interaction.

Configuring auxiliary models

Hermes uses secondary models for specific tasks (vision, web extraction, context compression). By default, auxiliary tasks use your main model. You can configure them independently to optimize costs:

hermes model
# → Choose "Configure auxiliary models"

The interactive menu lets you configure:

  • vision: model for image analysis
  • web_extract: model for web page summarization
  • session_search: model for searching past sessions
  • title_generation: model for generating session titles
  • compression: model for context compression

For example, to use a cheaper model for compression:

# In ~/.hermes/config.yaml
auxiliary:
  compression:
    provider: "openrouter"
    model: "google/gemini-2.5-flash"

Every auxiliary model follows the same pattern: provider, model, and optionally base_url.
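
The "use the main model unless an auxiliary override exists" behavior can be sketched like this. The dictionary layout, in particular the "main" key, and the auxiliary_settings helper are assumptions made for the example; only the auxiliary section mirrors the YAML shown above.

```python
from typing import Any, Mapping


def auxiliary_settings(task: str, config: Mapping[str, Any]) -> dict:
    """Return the override for `task`, or the main model settings if none is set."""
    override = config.get("auxiliary", {}).get(task)
    return dict(override) if override else dict(config["main"])


config = {
    # Hypothetical representation of the main model settings.
    "main": {"provider": "anthropic", "model": "anthropic/claude-sonnet-4"},
    "auxiliary": {
        "compression": {"provider": "openrouter", "model": "google/gemini-2.5-flash"},
    },
}

print(auxiliary_settings("compression", config)["model"])  # google/gemini-2.5-flash
print(auxiliary_settings("vision", config)["model"])       # anthropic/claude-sonnet-4
```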

Routing and multi-provider fallback

Automatic fallback

Configure a backup model that automatically takes over if the primary provider encounters an error:

# In ~/.hermes/config.yaml
fallback_model:
  provider: openrouter
  model: anthropic/claude-sonnet-4

Hermes tries the primary model first, then switches to the fallback after transient errors (rate limits, timeouts, 5xx).
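
The fallback behavior described above can be sketched as a try-then-switch wrapper. This is an illustration under assumptions, not Hermes' code: TransientProviderError stands in for rate limits, timeouts, and 5xx responses, and the two provider functions are stubs.

```python
from typing import Callable


class TransientProviderError(Exception):
    """Stand-in for rate limits, timeouts, and 5xx responses."""


def complete_with_fallback(
    prompt: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> str:
    """Try the primary provider; switch to the fallback on a transient error."""
    try:
        return primary(prompt)
    except TransientProviderError:
        return fallback(prompt)


def flaky_primary(prompt: str) -> str:
    raise TransientProviderError("429 rate limited")


def healthy_fallback(prompt: str) -> str:
    return f"fallback answered: {prompt}"


print(complete_with_fallback("hello", flaky_primary, healthy_fallback))
# fallback answered: hello
```

Only transient errors trigger the switch in this sketch; any other exception, such as an invalid API key, propagates so that the underlying problem stays visible.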

Provider routing

For advanced users who want fine-grained routing control:

# In ~/.hermes/config.yaml
provider_routing:
  sort: "price"           # "price", "throughput", or "latency"
  only: ["anthropic", "openai"]   # Limit to authorized providers
  ignore: ["deepseek"]             # Exclude providers
  order: ["anthropic", "openrouter", "openai"]  # Fallback order
  require_parameters: true        # Only use providers supporting all parameters
  data_collection: "deny"         # Exclude providers that store data
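
To make the effect of only, ignore, order, and sort concrete, here is a small sketch that filters and ranks a list of candidate providers. The candidate figures are invented, the route function is hypothetical, and the way order and sort combine (explicit order first, then the sort criterion) is an assumption; require_parameters and data_collection are not modeled.

```python
from typing import Iterable, Optional

LOWER_IS_BETTER = {"price", "latency"}  # throughput is ranked highest-first


def route(
    candidates: list[dict],
    only: Optional[Iterable[str]] = None,
    ignore: Iterable[str] = (),
    order: Iterable[str] = (),
    sort: str = "price",
) -> list[str]:
    """Return provider names in the order they would be tried."""
    allowed = set(only) if only is not None else None
    ignored = set(ignore)
    pool = [
        c for c in candidates
        if (allowed is None or c["name"] in allowed) and c["name"] not in ignored
    ]
    # Rank by the sort criterion first.
    pool.sort(key=lambda c: c[sort], reverse=sort not in LOWER_IS_BETTER)
    # Then move explicitly ordered providers to the front (stable sort keeps ties).
    rank = {name: position for position, name in enumerate(order)}
    pool.sort(key=lambda c: rank.get(c["name"], len(rank)))
    return [c["name"] for c in pool]


# Invented figures, for illustration only.
candidates = [
    {"name": "anthropic", "price": 3.0, "throughput": 80, "latency": 0.9},
    {"name": "openai", "price": 2.5, "throughput": 100, "latency": 0.7},
    {"name": "deepseek", "price": 0.3, "throughput": 40, "latency": 1.5},
    {"name": "openrouter", "price": 2.8, "throughput": 90, "latency": 0.8},
]

print(route(candidates, ignore=["deepseek"], sort="throughput"))
```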

Credential rotation

If you have multiple API keys for the same provider, configure the rotation strategy:

credential_pool_strategies:
  openrouter: round_robin    # Fair distribution
  anthropic: least_used      # Always the least-used key

Available options: fill_first (default), round_robin, least_used, random.
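
The four strategies can be illustrated with a toy key pool. This is a sketch under assumptions rather than Hermes' implementation: the CredentialPool class is hypothetical, and fill_first is interpreted here as "keep using the first key until it is marked exhausted".

```python
import itertools
import random


class CredentialPool:
    """Toy key pool illustrating the four rotation strategies."""

    def __init__(self, keys: list[str], strategy: str = "fill_first") -> None:
        if not keys:
            raise ValueError("at least one key is required")
        self.keys = list(keys)
        self.strategy = strategy
        self.uses = {key: 0 for key in self.keys}
        self.exhausted: set[str] = set()
        self._cycle = itertools.cycle(self.keys)

    def mark_exhausted(self, key: str) -> None:
        """Flag a key as unusable, for example after a rate-limit error."""
        self.exhausted.add(key)

    def pick(self) -> str:
        """Choose the next key according to the configured strategy."""
        if self.strategy == "fill_first":
            available = (k for k in self.keys if k not in self.exhausted)
            key = next(available, self.keys[0])
        elif self.strategy == "round_robin":
            key = next(self._cycle)
        elif self.strategy == "least_used":
            key = min(self.keys, key=lambda k: self.uses[k])
        elif self.strategy == "random":
            key = random.choice(self.keys)
        else:
            raise ValueError(f"unknown strategy: {self.strategy}")
        self.uses[key] += 1
        return key


pool = CredentialPool(["key-a", "key-b"], strategy="round_robin")
print([pool.pick() for _ in range(3)])  # ['key-a', 'key-b', 'key-a']
```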

Testing that a provider works

Before diving into complex tasks, verify that your configuration is functional:

Step 1: Diagnostic

hermes doctor

This command detects configuration issues, missing keys, and incompatibilities.

Step 2: Test conversation

Launch Hermes and send a simple, verifiable prompt:

hermes

Then test with:

List the files in the current directory and tell me what the main language of the project is.

Signs of success:

  • The startup banner displays your model and provider
  • Hermes responds without errors
  • Tools work (terminal, file reading, etc.)
  • The conversation continues normally over several turns

Step 3: Session resumption

hermes --continue   # or hermes -c

Verify that the previous session is properly resumed.

Quick troubleshooting

If something isn't working, follow this sequence:

hermes doctor
hermes model        # Re-check configuration
hermes setup        # Re-run full setup
hermes --continue   # Test resumption

VPS deployment

For production deployment (Telegram gateway, Discord, etc.), a dedicated VPS is recommended. A VPS with 2 vCPU and 4 GB RAM is sufficient to run Hermes with the gateway.

Hostinger, for example, offers VPS plans starting from a few euros per month that are suitable for hosting Hermes Agent in always-on mode. Set the terminal backend to Docker for isolation:

hermes config set terminal.backend docker

Essential environment variables

Here's a summary of the most common variables, to place in ~/.hermes/.env or set via hermes config set:

  • OPENROUTER_API_KEY: OpenRouter key
  • ANTHROPIC_API_KEY: Anthropic key
  • DEEPSEEK_API_KEY: DeepSeek key
  • OPENAI_API_KEY: key for custom OpenAI-compatible endpoint
  • OPENAI_BASE_URL: base URL for custom endpoint
  • GOOGLE_API_KEY: Google AI Studio / Gemini key
  • COPILOT_GITHUB_TOKEN: GitHub token for Copilot
  • HF_TOKEN: Hugging Face token
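
To check which of these variables a .env file actually defines, a small parser is enough. The sketch below assumes the common KEY=VALUE format with optional quotes and # comments; it is not the parser Hermes uses, and the function names are illustrative.

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(required: list[str], env: dict[str, str]) -> list[str]:
    """Return the required variables that are absent or empty."""
    return [name for name in required if not env.get(name)]


sample = """
# ~/.hermes/.env (sample values)
OPENROUTER_API_KEY=sk-or-xxxxx
OPENAI_BASE_URL="http://localhost:11434/v1"
"""

env = parse_dotenv(sample)
print(missing_keys(["OPENROUTER_API_KEY", "ANTHROPIC_API_KEY"], env))
# ['ANTHROPIC_API_KEY']
```

To run it against a real file, replace sample with the contents of ~/.hermes/.env, and avoid printing the values themselves so that keys do not end up in logs.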

Conclusion

Model and provider configuration in Hermes Agent is accessible through hermes model while remaining fully configurable by hand for advanced users. The clear separation between .env (secrets) and config.yaml (parameters), support for 25+ providers, and the fallback and routing mechanisms let you adapt the setup to your cost and reliability constraints.

Key takeaways:

  • Use hermes model for interactive configuration, hermes config set for quick adjustments
  • Ensure a minimum of 64K context tokens
  • Always test with hermes doctor and a simple conversation after each change
  • Configure a fallback for service continuity
  • Optimize costs by configuring cheaper auxiliary models

In the next article, we'll explore Hermes Agent's advanced CLI features and built-in tools, building on the installation and introduction basics.