
Hermes Unlocks Self-Improving AI Agents, Powered by NVIDIA RTX PCs and DGX Spark

Hermes Agent, an open-source framework developed by Nous Research, has crossed 140,000 GitHub stars in under three months and is now the most-used agent on OpenRouter. Designed for reliability and self-improvement, Hermes is provider- and model-agnostic, optimized for always-on local use on NVIDIA RTX PCs, RTX PRO workstations, and DGX Spark systems.

What Hermes Does

Like other popular agents, Hermes integrates with messaging apps, accesses local files and applications, and runs 24/7. Four capabilities set it apart:

  • Self-Evolving Skills: Hermes writes and refines its own skills. When it encounters a complex task or receives feedback, it saves learnings as a skill, adapting and improving over time.
  • Contained Sub-Agents: Sub-agents are short-lived, isolated workers dedicated to a sub-task with a focused context and tool set. This keeps task organization tidy and allows Hermes to run with smaller context windows — ideal for local models.
  • Reliability by Design: Nous Research curates and stress-tests every skill, tool, and plug-in shipped with Hermes. The framework works reliably even with 30-billion-parameter-class local models, without the constant debugging required by other agent frameworks.
  • Same Model, Better Results: In developer comparisons running identical models across frameworks, Hermes consistently delivers stronger results. It is an active orchestration layer, not a thin wrapper, enabling persistent, on-device agents instead of task-by-task execution.
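The contained sub-agent pattern described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Hermes's actual API: the `SubAgent` and `Orchestrator` names are hypothetical, and the key assumption shown is that each worker starts with an empty, isolated context and returns only a compact result to the parent.

```python
from dataclasses import dataclass, field


@dataclass
class SubAgent:
    """A short-lived worker dedicated to one sub-task.

    Hypothetical sketch: it holds only a focused context and the
    tools the sub-task needs, keeping the parent's context small.
    """
    task: str
    tools: list           # restricted tool set for this sub-task only
    context: list = field(default_factory=list)  # isolated: starts empty

    def run(self) -> str:
        # A real framework would drive an LLM loop here; this sketch
        # just records the task and produces a compact summary.
        self.context.append(f"task: {self.task}")
        return f"done: {self.task}"


class Orchestrator:
    """Parent agent that delegates sub-tasks to disposable workers."""

    def delegate(self, task: str, tools: list) -> str:
        worker = SubAgent(task=task, tools=tools)
        result = worker.run()
        # Only the short result flows back to the parent context;
        # the worker and its context are discarded afterwards.
        return result
```

The design point is the discard step: because each worker's context dies with it, the parent never accumulates sub-task chatter, which is what allows smaller context windows on local models.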

Hardware Requirements

Hermes and the underlying LLM are built to run locally. NVIDIA RTX GPUs are purpose-built for this workload. The new Qwen 3.6 models from Alibaba are ideal for local agents like Hermes:

  • Qwen 3.6 35B: Runs on roughly 20GB of memory while surpassing 120-billion-parameter models that require 70GB+.
  • Qwen 3.6 27B: A dense model, with all of its parameters active on every token, matching the accuracy of 400-billion-parameter models like Qwen 3.5 397B at roughly one-fifteenth the size.
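A back-of-the-envelope calculation shows where these memory figures come from: weight memory is roughly parameters × bits per weight ÷ 8. The helper below (`est_weights_gb` is a hypothetical name, and the 4-bit quantization level is an assumption) covers weights only; KV cache and runtime overhead add several more gigabytes, which is why a ~17.5GB weight footprint lines up with the "roughly 20GB" figure above.

```python
def est_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough memory for model weights alone, in GB.

    1e9 params at b bits each = params_billions * b / 8 GB.
    Ignores KV cache and activation overhead.
    """
    return params_billions * bits_per_weight / 8


# 35B parameters at 4-bit quantization:
print(est_weights_gb(35, 4))    # 17.5 GB for weights alone
# 120B parameters at 4-bit quantization:
print(est_weights_gb(120, 4))   # 60.0 GB, hence the "70GB+" class
```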

NVIDIA DGX Spark, with 128GB of unified memory and 1 petaflop of AI performance, can run 120-billion-parameter mixture-of-experts models all day. The smaller Qwen 3.6 35B model runs faster still on DGX Spark, leaving headroom for concurrent workloads.

Getting Started

Visit the Hermes GitHub repository and pair it with your preferred local model and runtime. Hermes ships with LM Studio and Ollama support out of the box, and Qwen 3.6 also runs under llama.cpp.
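For orientation, the sketch below queries a local model through the OpenAI-compatible chat endpoint that both LM Studio (default port 1234) and Ollama (port 11434, under `/v1`) expose. The model name, URL, and helper names are placeholders for your local setup; this is not Hermes code, just the request shape an agent framework sends to a local runtime.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-compatible chat completion."""
    url = f"{base_url}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(payload).encode()


def ask(base_url: str, model: str, prompt: str) -> str:
    """Send the request to a locally running server and return the reply text."""
    url, body = build_chat_request(base_url, model, prompt)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Example (requires a running local server; model name is illustrative):
    # print(ask("http://localhost:1234", "qwen3.6-35b", "Hello"))
    pass
```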

Bottom Line

Hermes Agent offers a reliable, self-improving local AI agent framework that works well with current-generation open-weight models. The combination of Nous Research's curated skills and NVIDIA's local hardware provides a practical foundation for persistent, on-device agentic workflows.
