Introduction
After installing Hermes Agent and verifying that the CLI responds correctly, the next step is crucial: choosing and configuring the AI model that powers the agent. The model directly determines the quality of the responses, the ability to use tools, and the cost of each interaction. This article covers in detail the configuration of providers, the management of API keys, multi-provider routing, and best practices for a stable and performant setup.
The hermes model command: the central entry point
Hermes provides a single interactive command to manage all model-related configuration:
hermes model
This command opens an interactive menu that guides you through:
- Choosing the provider (Anthropic, OpenRouter, DeepSeek, Nous Portal, GitHub Copilot, etc.)
- Selecting the model among those available from the provider
- Configuring authentication (API key or OAuth)
- Configuring auxiliary models (vision, compression, web extraction)
You can switch providers at any time — there is no lock-in. Relaunch hermes model, select another provider, and the configuration is updated instantly.
Interactive menu in detail
The hermes model menu offers several entries:
- Choose a provider: selection among all supported providers
- Configure auxiliary models: configure secondary models (vision, web_extract, compression, etc.)
- View current configuration: summary of the active configuration
This interactive approach eliminates the need to dig through configuration files manually, although direct configuration remains possible for advanced users.
Supported providers
Hermes Agent supports a broad ecosystem of providers. Here are the main ones, categorized by access type.
Providers with OAuth authentication (no API key)
These providers use an OAuth flow via the browser, ideal if you don't want to manage API keys:
- Nous Portal: monthly subscription, zero configuration. Log in via hermes model.
- OpenAI Codex: uses Codex models with your ChatGPT account. Device code authentication via hermes model.
- Anthropic (OAuth): requires a Claude Max plan + additional credits. Authentication routes through Claude Code.
- MiniMax (OAuth): access to MiniMax-M2.7 models without an API key. Browser login via hermes model.
- GitHub Copilot: uses your Copilot subscription to access GPT-5.x, Claude, Gemini, etc. OAuth via hermes model or a COPILOT_GITHUB_TOKEN token.
Providers with API key
These providers require an API key obtained from their respective consoles:
- Anthropic (ANTHROPIC_API_KEY): direct access to Claude models. Get your key on console.anthropic.com.
- OpenRouter (OPENROUTER_API_KEY): the most versatile, routes to dozens of models (Claude, GPT, Gemini, DeepSeek, Qwen, etc.). Key on openrouter.ai/keys.
- DeepSeek (DEEPSEEK_API_KEY): direct access to DeepSeek models. Get your key from the DeepSeek console.
- Google AI Studio (GOOGLE_API_KEY or GEMINI_API_KEY): native Gemini models. Key on aistudio.google.com.
- Hugging Face (HF_TOKEN): routes to 20+ open-source models via a unified endpoint. Get your token from your Hugging Face account settings.
- Alibaba Cloud / DashScope (DASHSCOPE_API_KEY): Qwen models.
- Kimi / Moonshot (KIMI_API_KEY): Moonshot coding and chat models.
- NVIDIA NIM (NVIDIA_API_KEY): Nemotron models.
- xAI (XAI_API_KEY): Grok models.
Custom Endpoint (self-hosted)
For users running their own models via Ollama, vLLM, SGLang, or any OpenAI-compatible endpoint, select "Custom Endpoint" in the hermes model menu, then provide the base URL and model name. You can also configure this manually with the OPENAI_BASE_URL and OPENAI_API_KEY variables.
Storing configurations: .env vs config.yaml
Hermes cleanly separates secrets from normal configuration. Everything is located in the ~/.hermes/ directory, which notably contains config.yaml for non-secret settings (model, terminal, compression), .env for API keys and tokens, as well as auth.json for OAuth credentials (Nous Portal, etc.).
Fundamental rule
- Secrets (API keys, tokens, passwords) → ~/.hermes/.env
- Everything else (model, terminal backend, limits, toolsets) → ~/.hermes/config.yaml
The hermes config set command
This is the recommended method for modifying any configuration. It automatically routes the value to the correct file:
hermes config set model anthropic/claude-sonnet-4
hermes config set OPENROUTER_API_KEY sk-or-v1-xxxxxxxx
This command automatically distinguishes secrets (which go into .env) from standard parameters (which go into config.yaml). Use hermes config to see the complete configuration, and hermes config edit to directly open the file in your editor.
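The routing between the two files can be pictured with a short sketch. The rule used here (names written like environment variables go to .env, everything else to config.yaml) and the function name target_file are assumptions made for illustration; how Hermes actually classifies a key is not documented in this article.

```python
import re
from pathlib import Path

HERMES_HOME = Path.home() / ".hermes"

# Assumption: a key written like an environment variable (upper-case letters,
# digits, underscores) is treated as a .env entry. This is a hypothetical
# heuristic, not Hermes's documented rule.
ENV_VAR_NAME = re.compile(r"[A-Z][A-Z0-9_]*")


def target_file(key: str) -> Path:
    """Return the file a setting would be written to under this heuristic."""
    if ENV_VAR_NAME.fullmatch(key):
        return HERMES_HOME / ".env"
    return HERMES_HOME / "config.yaml"
```

Under this heuristic, OPENROUTER_API_KEY lands in .env while model and terminal.backend land in config.yaml, which matches the examples given in this article.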
Resolution priority
Parameters are resolved in this order (decreasing priority):
- CLI arguments: hermes chat --model anthropic/claude-sonnet-4 (override by invocation)
- ~/.hermes/config.yaml: main configuration file
- ~/.hermes/.env: fallback for environment variables, required for secrets
- Default values: safe built-in defaults when nothing is configured
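The lookup order can be modeled as a walk over four sources that stops at the first hit. This is a minimal sketch of the idea, not Hermes's code: the resolve function, the dictionaries standing in for each source, and the default value are all hypothetical.

```python
# Hypothetical built-in default, used only when no other source has the key.
BUILTIN_DEFAULTS = {"model": "example/default-model"}


def resolve(key, cli_args, config_yaml, env_file, defaults=BUILTIN_DEFAULTS):
    """Return the first value found, walking sources in decreasing priority."""
    for source in (cli_args, config_yaml, env_file, defaults):
        if key in source and source[key] is not None:
            return source[key]
    return None
```

For example, resolve("model", {"model": "a"}, {"model": "b"}, {}) returns "a", because the CLI argument wins over config.yaml.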
Concrete configuration by provider
Anthropic (Claude)
Run hermes model, choose "Anthropic", then enter your API key. Alternatively, from the command line:
hermes config set model anthropic/claude-sonnet-4
hermes config set ANTHROPIC_API_KEY sk-ant-xxxxx
OpenRouter
Get your key on openrouter.ai/keys, then configure:
hermes config set OPENROUTER_API_KEY sk-or-v1-xxxxx
hermes config set model anthropic/claude-opus-4
Popular models on OpenRouter include anthropic/claude-opus-4 (advanced reasoning), anthropic/claude-sonnet-4 (fast, good value for money), openai/gpt-4o (versatile) and google/gemini-2.5-flash (fast, cheap).
DeepSeek
hermes config set DEEPSEEK_API_KEY sk-xxxxx
hermes config set model deepseek/deepseek-chat
GitHub Copilot
Two authentication options: via hermes model by choosing "GitHub Copilot" for an OAuth flow in the browser, or by manually configuring an existing token:
hermes config set COPILOT_GITHUB_TOKEN gho_xxxxx
hermes config set model github/copilot
Nous Portal
Run hermes model, choose "Nous Portal", then follow the OAuth login. No API key is required.
Custom Endpoint (Ollama, vLLM, etc.)
For local Ollama, set OPENAI_BASE_URL to http://localhost:11434/v1, OPENAI_API_KEY to ollama, then set the model (e.g.: llama3.1:70b). For vLLM, use http://localhost:8000/v1 as the base URL with a model like meta-llama/Llama-3.1-70B-Instruct.
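Before pointing Hermes at a local server, it can help to confirm that the endpoint answers. The sketch below assumes the server exposes the OpenAI-style model listing route (GET on the base URL followed by /models), which this article does not state; the helper names models_url and list_models are made up for the example.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def list_models(base_url: str, api_key: str, timeout: float = 5.0) -> list:
    """Return the model IDs the endpoint reports (performs a network call)."""
    request = urllib.request.Request(
        models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: the response follows the OpenAI shape {"data": [{"id": ...}]}.
    return [entry["id"] for entry in payload.get("data", [])]


# Example, with a local Ollama server running:
# print(list_models("http://localhost:11434/v1", "ollama"))
```

If the call raises a connection error or the expected model is missing from the list, the problem lies with the server or the base URL rather than with Hermes.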
Minimum requirement: 64,000 context tokens
Hermes Agent requires a model with at least 64,000 tokens of context window. Models with a smaller window are rejected at startup.
Why this requirement? Hermes' multi-step workflows (tool-calling, file reading, command execution) quickly consume context. A model with 32K tokens does not have enough working memory to maintain a productive conversation with tool calls.
Most hosted models (Claude, GPT, Gemini, Qwen, DeepSeek) far exceed this threshold. For local models via llama.cpp, pass the --ctx-size 65536 argument (short form: -c 65536). With Ollama, which has no -c flag, raise the context length through the num_ctx parameter or the OLLAMA_CONTEXT_LENGTH environment variable.
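The threshold itself is a simple comparison, sketched here so the numbers are explicit. The function name is hypothetical, and treating exactly 64,000 tokens as acceptable is an assumption; the article only says "at least 64,000".

```python
MIN_CONTEXT_TOKENS = 64_000  # threshold stated in this article


def meets_context_requirement(context_tokens: int) -> bool:
    """True when a model's context window reaches the stated minimum."""
    # Assumption: the bound is inclusive ("at least 64,000 tokens").
    return context_tokens >= MIN_CONTEXT_TOKENS
```

A local model started with a 65,536-token context passes the check, while a 32,768-token (32K) model does not.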
Switch models in a single command
Switching between providers or models is instantaneous:
- Via hermes model (interactive menu)
- Via hermes config set model anthropic/claude-sonnet-4 (direct)
- Via the /model command in a chat session for an interactive selection without leaving the conversation
No restart is required. The change takes effect at the next interaction.
Configuring auxiliary models
Hermes uses secondary models for specific tasks (vision, web extraction, context compression). By default, auxiliary tasks use your main model. You can configure them independently to optimize costs via the hermes model → "Configure auxiliary models" menu.
The interactive menu allows you to configure:
- vision: model for image analysis
- web_extract: model for web page summarization
- session_search: model for searching in past sessions
- title_generation: model for generating session titles
- compression: model for context compression
For example, to use a cheaper model for compression, add an auxiliary section in ~/.hermes/config.yaml with the compression key containing the provider (e.g., "openrouter") and the model (e.g., "google/gemini-2.5-flash"). The universal pattern for each auxiliary model is always the same: provider + model + optionally base_url.
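The "dedicated entry, otherwise main model" behavior can be sketched with a dictionary that mirrors the structure described above. The exact nesting of the auxiliary section in config.yaml is an assumption based on this description, and resolve_auxiliary is a hypothetical helper, not a Hermes function.

```python
def resolve_auxiliary(config: dict, task: str) -> dict:
    """Return provider/model settings for an auxiliary task.

    Falls back to the main model when the task has no dedicated entry.
    """
    entry = config.get("auxiliary", {}).get(task)
    if entry:
        return entry
    return {"model": config["model"]}


# Python mirror of a config.yaml with one auxiliary override (assumed layout).
config = {
    "model": "anthropic/claude-sonnet-4",
    "auxiliary": {
        "compression": {
            "provider": "openrouter",
            "model": "google/gemini-2.5-flash",
        },
    },
}
```

With this configuration, compression uses google/gemini-2.5-flash through OpenRouter, while vision, which has no entry, falls back to the main model.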
Multi-provider routing and fallback
Automatic fallback
Configure a fallback model that takes over automatically if the main provider encounters an error. In ~/.hermes/config.yaml, add a fallback_model section with the fallback provider and model. Hermes first tries the main model, then switches to the fallback after transient errors (rate limits, timeouts, 5xx).
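The fallback logic can be sketched as follows. It is an illustration under stated assumptions, not Hermes's implementation: ProviderError, is_transient, and complete_with_fallback are hypothetical names, and the sketch switches after the first transient error, whereas Hermes may retry the primary model first.

```python
class ProviderError(Exception):
    """Hypothetical error carrying the HTTP status returned by a provider."""

    def __init__(self, status_code: int):
        super().__init__(f"provider returned HTTP {status_code}")
        self.status_code = status_code


def is_transient(error: Exception) -> bool:
    """Rate limits (429), timeouts, and 5xx responses count as transient."""
    if isinstance(error, TimeoutError):
        return True
    if isinstance(error, ProviderError):
        return error.status_code == 429 or 500 <= error.status_code <= 599
    return False


def complete_with_fallback(prompt, call_primary, call_fallback):
    """Try the primary model; use the fallback only on transient errors."""
    try:
        return call_primary(prompt)
    except Exception as error:
        if not is_transient(error):
            raise  # e.g. an invalid API key should surface, not be masked
        return call_fallback(prompt)
```

Re-raising non-transient errors is a deliberate choice in this sketch: a misconfigured key on the primary provider stays visible instead of being silently absorbed by the fallback.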
Provider routing
For advanced users who want fine-grained control over routing, the provider_routing section in config.yaml offers several options: sort ("price", "throughput" or "latency"), only to limit to allowed providers, ignore to exclude some, order to define the fallback order, require_parameters to only use providers supporting all params, and data_collection (e.g., "deny") to exclude providers that store data.
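To make the effect of sort, only, and ignore concrete, here is a small sketch that filters a list of candidate providers and orders what remains. The provider names and metrics are invented, the route function is hypothetical, and the other options (order, require_parameters, data_collection) are not modeled.

```python
def route(candidates, sort=None, only=None, ignore=None):
    """Filter candidate providers, then order them by the chosen metric."""
    selected = [
        c for c in candidates
        if (only is None or c["name"] in only)
        and (ignore is None or c["name"] not in ignore)
    ]
    if sort == "price":
        selected.sort(key=lambda c: c["price"])  # cheapest first
    elif sort == "latency":
        selected.sort(key=lambda c: c["latency_ms"])  # quickest first
    elif sort == "throughput":
        selected.sort(key=lambda c: c["tokens_per_s"], reverse=True)
    return [c["name"] for c in selected]


# Made-up providers and metrics, for illustration only.
CANDIDATES = [
    {"name": "provider-a", "price": 3.0, "latency_ms": 400, "tokens_per_s": 80},
    {"name": "provider-b", "price": 1.0, "latency_ms": 900, "tokens_per_s": 40},
    {"name": "provider-c", "price": 2.0, "latency_ms": 250, "tokens_per_s": 120},
]
```

Calling route(CANDIDATES, sort="price") puts provider-b first, while adding ignore=["provider-b"] removes it from consideration entirely.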
Credential rotation
If you have multiple API keys for the same provider, configure the rotation strategy via the credential_pool_strategies section. The available options are fill_first (default), round_robin (fair distribution), least_used (always the least used key) and random.
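The four strategies differ only in how the next key is picked. The sketch below is one plausible reading of each name, not Hermes's implementation: the CredentialPool class is hypothetical, and fill_first is interpreted as "keep using the first key until it is marked exhausted".

```python
import random


class CredentialPool:
    """Illustrative key selector for the four strategy names."""

    def __init__(self, keys, strategy="fill_first"):
        self.keys = list(keys)
        self.strategy = strategy
        self.usage = {key: 0 for key in self.keys}
        self.exhausted = set()  # keys marked as rate-limited or spent
        self._cursor = 0        # next position for round_robin

    def mark_exhausted(self, key):
        self.exhausted.add(key)

    def next_key(self):
        available = [k for k in self.keys if k not in self.exhausted]
        if not available:
            raise RuntimeError("no usable credentials left")
        if self.strategy == "fill_first":
            key = available[0]
        elif self.strategy == "round_robin":
            key = available[self._cursor % len(available)]
            self._cursor += 1
        elif self.strategy == "least_used":
            key = min(available, key=lambda k: self.usage[k])
        elif self.strategy == "random":
            key = random.choice(available)
        else:
            raise ValueError(f"unknown strategy: {self.strategy}")
        self.usage[key] += 1
        return key
```

With three keys, round_robin cycles through them in order, while fill_first stays on the first key until mark_exhausted is called for it.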
Testing that a provider works
Before diving into complex tasks, check that your configuration is functional.
Step 1: Diagnostic
hermes doctor
This command detects configuration issues, missing keys, and incompatibilities.
Step 2: Test conversation
Launch hermes then send a simple, verifiable prompt, such as asking to list the files in the current directory and identify the main language of the project. Signs of success are: the startup banner displays your model and provider, Hermes responds without error, tools work (terminal, file reading, etc.), and the conversation continues normally over several turns.
Step 3: Session resumption
hermes --continue # or hermes -c
Verify that the previous session is properly resumed.
Quick troubleshooting
If something isn't working, follow this sequence: run hermes doctor to diagnose, then hermes model to double-check the configuration, hermes setup to rerun the full setup, and finally hermes --continue to test resumption.
Deploying on a VPS
For a production deployment (Telegram gateway, Discord, etc.), a dedicated VPS is recommended. A VPS with 2 vCPUs and 4 GB of RAM is enough to run Hermes with the gateway.
Hostinger offers high-performance VPSes starting from a few euros per month, ideal for hosting Hermes Agent in always-on mode. Configure the terminal backend in Docker for isolation:
hermes config set terminal.backend docker
Essential environment variables
Here is a summary of the most common variables, to be placed in ~/.hermes/.env or set via hermes config set:
- OPENROUTER_API_KEY: OpenRouter key
- ANTHROPIC_API_KEY: Anthropic key
- DEEPSEEK_API_KEY: DeepSeek key
- OPENAI_API_KEY: key for custom OpenAI-compatible endpoint
- OPENAI_BASE_URL: base URL for custom endpoint
- GOOGLE_API_KEY: Google AI Studio / Gemini key
- COPILOT_GITHUB_TOKEN: GitHub token for Copilot
- HF_TOKEN: Hugging Face token
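A quick way to see which of these variables are actually set is to read the .env file and report names only. The sketch assumes the file uses plain KEY=VALUE lines with optional # comments, which this article does not specify; parse_env and configured_variables are hypothetical helpers.

```python
from pathlib import Path

KNOWN_VARIABLES = [
    "OPENROUTER_API_KEY", "ANTHROPIC_API_KEY", "DEEPSEEK_API_KEY",
    "OPENAI_API_KEY", "OPENAI_BASE_URL", "GOOGLE_API_KEY",
    "COPILOT_GITHUB_TOKEN", "HF_TOKEN",
]


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_variables(env_path: Path) -> list:
    """Return the known variable names that have a non-empty value."""
    if not env_path.exists():
        return []
    values = parse_env(env_path.read_text())
    return [name for name in KNOWN_VARIABLES if values.get(name)]


# Example: print(configured_variables(Path.home() / ".hermes" / ".env"))
```

The helper returns variable names and never the values, so its output can be pasted into a bug report without leaking a key.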
Common errors
- "Context window too small": your model does not reach the required 64K tokens. Check the configured context size, especially for local models with Ollama or llama.cpp.
- Unrecognized API key: make sure you placed the key in ~/.hermes/.env via hermes config set, and not in config.yaml. Secrets must always go in the .env file.
- Timeout with a custom endpoint: check that the base URL ends with /v1 and that the model is actually accessible (test with a simple curl).
- Fallback not triggered: check that the fallback_model section is properly formatted in config.yaml with the provider and model keys.
FAQ
Can I use multiple providers at the same time?
Yes, via the provider routing and fallback system. You can define a primary provider and one or more fallbacks, or configure automatic routing based on price, latency, or throughput.
Do I need to restart Hermes after a model change?
No. A model change made via hermes config set, hermes model, or /model takes effect at the next interaction, without a restart.
How do I know which auxiliary model is being used?
Run hermes config to see the complete configuration, including auxiliary models. If none are explicitly configured, the main model is used for all tasks.
Are API keys stored in plain text?
Yes, in ~/.hermes/.env. Protect access to this file with restrictive permissions (chmod 600 ~/.hermes/.env).
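The same permission fix can be applied and verified from a script. This is a small sketch for POSIX systems using only the standard library; the two helper names are made up for the example.

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(path: Path) -> None:
    """Equivalent of chmod 600: the owner may read and write, nobody else."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def is_owner_only(path: Path) -> bool:
    """True when group and others have no permissions on the file."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Example:
# env_file = Path.home() / ".hermes" / ".env"
# if not is_owner_only(env_file):
#     restrict_to_owner(env_file)
```

Checking before changing makes the script safe to run repeatedly, for instance as part of a provisioning step on a VPS.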
Recommended tools
- Hermes Agent: the 68 available tools (complete guide), to understand all the tools your configured model will be able to invoke
- Context files: CLAUDE.md, AGENTS.md and beyond, to configure the context the model will receive at each session
- Persistent memory: how Hermes remembers, to combine your model configuration with long-term memory
Key takeaways
- Use hermes model for interactive configuration, hermes config set for quick adjustments
- Always separate secrets (.env) from parameters (config.yaml)
- Ensure a minimum of 64K context tokens
- Always test with hermes doctor and a simple conversation after each change
- Configure a fallback for service continuity
- Optimize costs by configuring cheaper auxiliary models for secondary tasks
Conclusion
Configuring models and providers in Hermes Agent is designed to be accessible via hermes model while remaining fully configurable manually for advanced users. The clear separation between .env (secrets) and config.yaml (settings), the support for over 25 providers in 2025, and the fallback and routing mechanisms offer remarkable flexibility.
To go further, discover the 68 tools available in Hermes Agent and learn how to structure the context of your sessions with CLAUDE.md and AGENTS.md files.