Hermes Agent: All 68 Built-in Tools — Complete Guide

Hermes Agent 🟡 Intermediate ⏱️ 12 min read 📅 2026-05-05

After [installing Hermes Agent](/article/hermes-agent-presentation-installation) and [configuring your models and providers](/article/hermes-agent-configurer-modeles-providers), it is time to explore what makes this agent powerful: its tools. Hermes Agent ships with 68 built-in tools organized into logical toolsets, covering a wide range of use cases, from web research and automation to smart home control.

This guide walks through each tool category, explains how to enable or disable them, and provides concrete usage examples. Whether you are a developer, DevOps engineer, or simply curious, you will find everything you need to make the most of the Hermes tool ecosystem.

Toolset architecture: how it works

Hermes tools are not enabled individually in an ad-hoc manner. They are grouped into toolsets — named bundles that control what the agent can do. This is the primary mechanism for configuring tool availability per platform, per user profile, or per use case.

Each platform has its own toolset preset. For example, hermes-telegram enables web, terminal, file, vision, todo, memory, and messaging by default, while hermes-cli enables a broader set including code_execution and delegation.

A toolset can consist of a single tool (like tts for text-to-speech) or multiple tools working together (like browser which groups 10 automation tools). There are also composite toolsets like debugging that aggregate file + terminal + web.
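The bundle idea can be pictured with a small Python sketch. The dictionaries and the resolve_toolset helper are hypothetical and only illustrate how a composite toolset could expand into its member tools; they are not Hermes's internal data structures. The tool names are the ones listed in this guide.

```python
# Hypothetical registry: toolset name -> tool names (taken from this guide).
TOOLSETS = {
    "file": ["read_file", "write_file", "search_files", "patch"],
    "terminal": ["terminal", "process"],
    "web": ["web_search", "web_extract"],
    "tts": ["text_to_speech"],
}

# Composite toolsets reference other toolsets instead of listing tools.
COMPOSITES = {
    "debugging": ["file", "terminal", "web"],
}


def resolve_toolset(name: str) -> list[str]:
    """Expand a toolset name into a flat, de-duplicated list of tool names."""
    if name in COMPOSITES:
        tools: list[str] = []
        for member in COMPOSITES[name]:
            for tool in resolve_toolset(member):
                if tool not in tools:
                    tools.append(tool)
        return tools
    return list(TOOLSETS[name])


print(resolve_toolset("debugging"))
```

Under these assumptions, resolving debugging yields the four file tools, the two terminal tools, and the two web tools.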

System tools: terminal and files

The operational core of Hermes relies on six fundamental tools: two for executing commands and managing processes, and four for interacting with the filesystem.

terminal and process

The terminal tool executes shell commands on the Linux environment. It supports seven execution backends:

local — direct execution on the host machine
docker — persistent isolated container (one shared container per session)
ssh — remote execution on a dedicated server
singularity — HPC containers for cluster computing
modal — serverless cloud execution
daytona — persistent cloud workspace
vercel_sandbox — cloud microVM with snapshot-based persistence

Long-running commands can be launched in the background with background=true. The process tool then manages these processes: list, poll, wait, retrieve logs, or kill them. PTY mode (pty=true) enables interactive CLI tools like Codex or Claude Code.

Example: launch a test suite in the background and get notified when it finishes

terminal(command="pytest -v tests/", background=true, notify_on_complete=true)
# Process runs in the background, Hermes notifies automatically when done
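For readers who want a mental model of the launch, poll, wait, and log-retrieval cycle, here is a rough plain-Python analogue built on the standard subprocess module. It is not how Hermes implements the process tool; the child command is a stand-in for a long test run.

```python
import subprocess
import sys

# Start a short-lived child process without blocking (stand-in for a long test run).
proc = subprocess.Popen(
    [sys.executable, "-c", "print('tests finished')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
)

# poll() returns None while the process is still running.
still_running = proc.poll() is None

# communicate() waits for exit and returns the captured output.
output, _ = proc.communicate(timeout=30)
exit_code = proc.returncode

print(f"exit code: {exit_code}, log: {output.strip()}")
```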

read_file, write_file, search_files, patch

These four tools replace classic shell commands with safer, smarter equivalents:

read_file — reads a file with line numbers and pagination. Cannot read images or binary files.
write_file — writes complete file content (full overwrite). Creates parent directories automatically.
search_files — content search by regex or file search by glob pattern, powered by ripgrep
patch — targeted find-and-replace with 9 fuzzy matching strategies, returns unified diff with automatic syntax checks

Example: fix a variable in a config file

patch(path="config.yaml", old_string="port: 3000", new_string="port: 8080")
# Returns a diff and automatically checks YAML syntax
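To make the "replace, then return a unified diff" behavior concrete, here is a minimal sketch using Python's difflib. It handles exact matches only and does not attempt the fuzzy matching strategies or syntax checks mentioned above. The patch_text helper and its rule of requiring exactly one match are assumptions made for illustration.

```python
import difflib


def patch_text(text: str, old: str, new: str, path: str = "config.yaml") -> tuple[str, str]:
    """Replace exactly one occurrence of `old` and return (new_text, unified_diff)."""
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match for {old!r}, found {count}")
    updated = text.replace(old, new, 1)
    diff = "".join(
        difflib.unified_diff(
            text.splitlines(keepends=True),
            updated.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )
    return updated, diff


original = "host: localhost\nport: 3000\n"
updated, diff = patch_text(original, "port: 3000", "port: 8080")
print(diff)
```

Refusing ambiguous matches is one way to keep a targeted edit from touching the wrong line; whether Hermes behaves the same way is not stated in this guide.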

Web tools: search and extraction

The web toolset includes two complementary tools for online information access.

web_search performs web searches returning up to 100 results with titles, URLs, and descriptions. It supports advanced search operators (site:, filetype:, intitle:, exact phrases). Multiple backends are supported: Exa, Parallel, Firecrawl, and Tavily.

web_extract retrieves URL content and converts it to markdown. It also works with PDFs — just pass the PDF URL directly. Pages under 5000 characters return full markdown; longer pages are LLM-summarized.

Example: search and extract API documentation

web_search(query="FastAPI middleware authentication site:fastapi.tiangolo.com")
web_extract(url="https://fastapi.tiangolo.com/tutorial/middleware/")
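The length rule for web_extract can be sketched as a simple branch. The 5000-character threshold comes from this guide; the function name, the summarize callback, and the truncating placeholder are hypothetical, and a real summarizer would call an LLM.

```python
from typing import Callable

FULL_TEXT_LIMIT = 5000  # character threshold stated in this guide


def extract_or_summarize(markdown: str, summarize: Callable[[str], str]) -> str:
    """Return short pages unchanged; pass longer ones through `summarize`."""
    if len(markdown) < FULL_TEXT_LIMIT:
        return markdown
    return summarize(markdown)


def truncate(text: str) -> str:
    """Placeholder summarizer: a real one would call an LLM."""
    return text[:200] + " [...]"


short_page = "# Title\n\nA short page."
long_page = "x" * 6000
print(len(extract_or_summarize(short_page, truncate)))
print(len(extract_or_summarize(long_page, truncate)))
```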

Browser tools: advanced web automation

The browser toolset is one of the largest, with 10 dedicated tools for interactive web automation:

browser_navigate — opens a URL and initializes the session
browser_snapshot — captures the page accessibility tree with reference IDs (@e1, @e2...) for interaction
browser_click — clicks an element by its reference ID
browser_type — types text into a field
browser_scroll — scrolls the page
browser_press — simulates a keyboard key
browser_back — navigates to the previous page
browser_get_images — lists all images on the page
browser_console — retrieves the JavaScript console
browser_vision — takes a screenshot and analyzes it with a vision model

A separate browser-cdp toolset (2 additional tools) activates automatically when a Chrome DevTools Protocol endpoint is detected, allowing raw CDP commands and native JavaScript dialog responses.

Example: navigate a site, fill a form, and visually verify the result

browser_navigate(url="https://example.com/login")
browser_snapshot()
browser_type(ref="@e5", text="my_username")
browser_type(ref="@e7", text="my_password")
browser_click(ref="@e9")
browser_vision(prompt="Verify that the dashboard page loaded correctly")

AI tools: vision, image generation, and TTS

Hermes Agent is not just a text agent — it has powerful multimodal tools:

vision_analyze

Image analysis via vision models. The agent can identify elements, read text (OCR), describe interfaces, or diagnose errors from screenshots.

image_generate

Text-to-image generation via FAL.ai (with optional OpenAI and xAI support). The default model is FLUX 2 Klein 9B, capable of generating an image in under one second.

text_to_speech

Text-to-speech conversion with native delivery per platform: voice bubble on Telegram, audio attachment on Discord and WhatsApp, file in ~/voice-memos/ in CLI. Voice and provider are configurable.

Example: generate a concept image and send it with a voice description

image_generate(prompt="A robot assistant in a modern office, flat illustration style")
text_to_speech(text="Here is the visual concept I generated for your project.")

Communication tools: messaging and Discord

send_message

The send_message tool allows Hermes to send messages to any connected platform (Telegram, Discord, Slack, WhatsApp, etc.) from within a session. Before sending, you must first list available targets with action="list".

discord and discord_admin

Two Discord-specific toolsets, available only on the hermes-discord platform:

discord — member search, message sending, reactions, channel reading and participation
discord_admin — moderation: role management, channels, timeouts, kicks, and bans (requires appropriate Discord permissions)

Productivity tools: todo, cron, skills, and memory

todo

Session task list management. Ideal for complex multi-step workflows. The agent can create, update, merge, and check off tasks automatically.

cronjob

Scheduled task manager with actions: create, list, update, pause, resume, run, and remove. Jobs can be attached to skills for sophisticated automations. Cron executions launch in fresh sessions with no current chat context.

Example: automate a weekly report

cronjob(action="create", name="weekly-report", schedule="0 9 * * 1",
        prompt="Generate a summary of last week's GitHub activity")
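The schedule string in the example is a standard five-field cron expression (minute, hour, day of month, month, day of week), so "0 9 * * 1" means every Monday at 09:00. The sketch below computes the next matching time. It is a deliberately simplified parser: it accepts only * and single numbers, requires all fields to match, and is unrelated to how Hermes schedules jobs.

```python
from datetime import datetime, timedelta


def matches(field: str, value: int) -> bool:
    """Match one cron field; only '*' and a single number are supported here."""
    return field == "*" or int(field) == value


def next_run(expr: str, after: datetime) -> datetime:
    """Return the first minute after `after` that satisfies a 5-field cron expression."""
    minute, hour, day, month, weekday = expr.split()
    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    # Scan minute by minute, giving up after roughly one year.
    for _ in range(366 * 24 * 60):
        cron_weekday = candidate.isoweekday() % 7  # cron: 0 = Sunday, 1 = Monday
        if (
            matches(minute, candidate.minute)
            and matches(hour, candidate.hour)
            and matches(day, candidate.day)
            and matches(month, candidate.month)
            and matches(weekday, cron_weekday)
        ):
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError(f"no run time found within a year for {expr!r}")


print(next_run("0 9 * * 1", datetime(2026, 5, 5, 12, 0)))
```

Starting from Tuesday 2026-05-05 at noon, the next run falls on Monday 2026-05-11 at 09:00.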

skills

The skills toolset includes three tools for managing the agent's procedural capabilities:

skills_list — lists available skills (name + description)
skill_view — loads full skill content and linked files (templates, scripts)
skill_manage — skill creation, update, and deletion

Skills are Hermes's procedural memory — reusable approaches for recurring task types, compatible with the agentskills.io standard and shareable via the community Skills Hub.

memory

The memory tool manages persistent cross-session memory. Important information is saved and injected into the system prompt at the start of each new session. This is how the agent remembers your preferences, environment, and context between conversations.

session_search

Search across all past session history. When the user says "we did this before" or "last time", this tool helps the agent retrieve context and summarize what happened.

Development tools: code execution and delegation

execute_code

The execute_code tool runs Python scripts that can programmatically call Hermes tools. Use it when you need 3+ tool calls with processing logic between them, or when you want to filter/reduce large tool outputs before they enter your context.

delegate_task

Spawn isolated subagents for parallel work. Each subagent gets its own conversation, terminal session, and toolset. Only the final summary is returned — intermediate results never pollute your context window. Ideal for dividing large projects into independent subtasks.
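The isolation pattern can be illustrated with a plain-Python analogue: each worker keeps its intermediate results local and hands back only a summary. The run_subagent function and the task names are hypothetical; real subagents run full conversations rather than a list comprehension.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(task: str) -> str:
    """Hypothetical subagent: works in its own scratch list, returns only a summary."""
    scratch = [f"step {i} of {task}" for i in range(3)]  # intermediate results stay local
    return f"{task}: done in {len(scratch)} steps"


tasks = ["audit dependencies", "write unit tests", "update changelog"]

# Run the independent subtasks in parallel; map() preserves input order.
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    summaries = list(pool.map(run_subagent, tasks))

for summary in summaries:
    print(summary)
```

The caller sees three short strings, never the scratch data, which mirrors why delegation keeps the parent context small.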

mixture_of_agents

The MOA tool routes a difficult problem through multiple collaborative LLMs. It makes 5 API calls (4 reference models + 1 aggregator) — reserve this for genuinely complex problems in mathematics, algorithms, or advanced reasoning.
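The call count follows directly from the structure: one call per reference model, plus one for the aggregator. Below is a minimal sketch of that flow with stub models that only record that they were called. The function names and the prompt layout are assumptions for illustration, not the tool's actual implementation.

```python
from typing import Callable

Model = Callable[[str], str]

call_log: list[str] = []  # records which stub model was called, in order


def make_stub(name: str) -> Model:
    """Build a fake model that logs each call and returns a canned answer."""
    def stub(prompt: str) -> str:
        call_log.append(name)
        return f"{name} answer"
    return stub


def mixture_of_agents(prompt: str, references: list[Model], aggregator: Model) -> str:
    """Ask every reference model, then let the aggregator combine their answers."""
    drafts = [model(prompt) for model in references]
    combined = prompt + "\n\nCandidate answers:\n" + "\n".join(drafts)
    return aggregator(combined)


references = [make_stub(f"ref{i}") for i in range(1, 5)]  # 4 reference models
answer = mixture_of_agents("Is 91 prime?", references, make_stub("aggregator"))
print(answer, len(call_log))
```

With four reference models the log ends up with five entries, which is why the tool is best reserved for problems that justify the extra cost.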

Integration tools: Home Assistant, Spotify, RL, MCP

Home Assistant (4 tools)

The homeassistant toolset provides complete smart home control:

ha_list_entities — lists entities (lights, switches, sensors...) with domain or area filtering
ha_get_state — detailed entity state (brightness, temperature, etc.)
ha_list_services — lists available actions for each device type
ha_call_service — executes an action on a device

Example: turn on living room lights and set the thermostat

ha_call_service(domain="light", service="turn_on", entity_id="light.living_room")
ha_call_service(domain="climate", service="set_temperature",
               entity_id="climate.living_room", kwargs='{"temperature": 22}')
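In the second call above, kwargs is written as a JSON string. Assuming the tool accepts any JSON-encoded object there, building the string with json.dumps avoids hand-quoting mistakes:

```python
import json

# Assumption: kwargs takes a JSON-encoded object, as in the example above.
service_data = {"temperature": 22}
kwargs = json.dumps(service_data)

print(kwargs)
```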

Spotify (7 tools)

Native Spotify control via the bundled plugin: playback, queue, search, playlists, albums, and library. Requires initial OAuth authorization via hermes spotify setup.

RL Training (10 tools)

Complete RL training management suite via Atropos: environment selection, configuration, training launch, WandB monitoring, stopping, and inference testing. Requires TINKER_API_KEY and WANDB_API_KEY.

MCP (Model Context Protocol)

Beyond the 68 built-in tools, Hermes can dynamically load tools from MCP servers. MCP tools appear with a server-name prefix (e.g., github_create_issue for the github server). This lets you extend the agent well beyond the built-in set by connecting any compatible MCP server.
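The naming pattern can be sketched in a few lines. The helper below is hypothetical and only mirrors the server-name prefix described above; the second tool name is made up for illustration.

```python
def prefixed_tool_names(server: str, tools: list[str]) -> dict[str, tuple[str, str]]:
    """Map each prefixed tool name back to its (server, original tool name) pair."""
    return {f"{server}_{tool}": (server, tool) for tool in tools}


# The tool list for the "github" server is illustrative.
registry = prefixed_tool_names("github", ["create_issue", "list_pull_requests"])
print(sorted(registry))
```

Prefixing keeps tools from different servers apart even when two servers expose a tool with the same short name.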

Other platforms: Feishu and Yuanbao

Specific toolsets exist for regional platforms:

feishu_doc (1 tool) — Feishu/Lark document reading
feishu_drive (4 tools) — Feishu file comment operations
yuanbao (5 tools) — DMs, groups, and stickers on Tencent Yuanbao platform

Configuring tools: the hermes tools command

Tool management is primarily done via the CLI:

# Browse toolsets, check their status, and enable or disable them per platform (interactive menu)
hermes tools

# Use specific toolsets in CLI
hermes chat --toolsets "web,terminal,file"

# Enable a toolset in config
hermes config set toolsets.enabled '["web","terminal","file","browser"]'

The hermes tools command without arguments launches an interactive menu to browse toolsets, see which tools they contain, and enable or disable them per platform.

The safe toolset: for restricted environments

The safe toolset is designed specifically for environments where security is paramount. It excludes every tool that can run commands or write files, leaving only:

web_search — web search
web_extract — content extraction
vision_analyze — image analysis
image_generate — image generation

No terminal access, no file write access, no code execution. Perfect for public messaging platforms or shared instances.

Security best practices

Choosing which tools to allow is not just a feature question — it is a security question. Here are the recommendations by platform:

Local CLI (hermes-cli) — Full profile. Enable all toolsets: terminal, file, code_execution, delegation, rl, browser. The environment is user-controlled.

Telegram / Discord / Slack — Balanced profile. Enable web, file (read-only if possible), vision, todo, memory, cronjob, messaging. Terminal and code_execution should only be enabled if the instance is private and secured. Use the Docker backend to isolate executions.

Public or shared instances — Use the safe toolset as a base. Add only strictly necessary tools. If terminal access is required, the SSH backend is recommended: commands run on a remote server, which prevents the agent from modifying its own code.

Production environments — Disable rl (training can consume significant resources), limit code_execution to the Docker backend with constrained resources (CPU, memory), and enable process quotas.

The Docker backend offers a good balance between flexibility and security: a persistent container with complete isolation, read-only root filesystem, and all Linux capabilities dropped.

Toolset summary

browser (10 tools) — interactive web automation
browser-cdp (2 tools) — Chrome DevTools Protocol commands
file (4 tools) — file reading, writing, searching, and patching
terminal (2 tools) — command execution and process management
web (2 tools) — web search and extraction
vision (1 tool) — image analysis
image_gen (1 tool) — image generation
tts (1 tool) — text-to-speech
todo (1 tool) — task management
cronjob (1 tool) — scheduled tasks
memory (1 tool) — persistent memory
session_search (1 tool) — history search
skills (3 tools) — skill management
messaging (1 tool) — cross-platform messaging
clarify (1 tool) — user clarification
code_execution (1 tool) — Python code execution
delegation (1 tool) — parallel subagents
moa (1 tool) — multi-model consensus
homeassistant (4 tools) — smart home control
discord (1 tool) — Discord actions
discord_admin (1 tool) — Discord moderation
spotify (7 tools) — Spotify control
rl (10 tools) — RL training
feishu_doc (1 tool) — Feishu documents
feishu_drive (4 tools) — Feishu comments
yuanbao (5 tools) — Yuanbao platform
safe (composite) — secure read-only profile
debugging (composite) — diagnostic bundle
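As a quick sanity check, the per-toolset counts in this summary can be added up in Python. The dictionary simply transcribes the list above; the composite toolsets are left out because they reuse tools from other toolsets.

```python
# Tool counts transcribed from the summary above (composites excluded).
TOOL_COUNTS = {
    "browser": 10, "browser-cdp": 2, "file": 4, "terminal": 2, "web": 2,
    "vision": 1, "image_gen": 1, "tts": 1, "todo": 1, "cronjob": 1,
    "memory": 1, "session_search": 1, "skills": 3, "messaging": 1,
    "clarify": 1, "code_execution": 1, "delegation": 1, "moa": 1,
    "homeassistant": 4, "discord": 1, "discord_admin": 1, "spotify": 7,
    "rl": 10, "feishu_doc": 1, "feishu_drive": 4, "yuanbao": 5,
}

total = sum(TOOL_COUNTS.values())
print(f"{len(TOOL_COUNTS)} toolsets, {total} tools")
```

The counts add up to the 68 built-in tools quoted throughout this guide.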

Conclusion

With its 68 built-in tools organized into configurable toolsets, Hermes Agent covers an impressive range of capabilities. The toolset system provides fine-grained configuration: you choose exactly what the agent can do, on which platform, and with what access level.

The strength of this architecture lies in its modularity. You do not have to enable everything. A minimal setup with just the safe toolset, which already includes the web tools, is sufficient for a research assistant. An advanced setup with terminal + code_execution + delegation transforms Hermes into a true co-developer. And MCP integration extends these capabilities further by connecting external tool servers.

In the next article in this series, we will explore Hermes's memory system — how the agent learns from your interactions and remembers your context across sessions.
