Show HN: Statewright – Visual state machines that make AI agents reliable

"Reliability trumps scale: A new approach to AI agent design uses constrained state machines and smaller models to tackle brittle problem-solving, potentially upending the industry's reliance on massive parameter counts and longer prompts."

Anthropic's Claude Code now ships with a plugin ecosystem covering specialized agent roles. Statewright, a visual state machine tool, plugs into that ecosystem and others: a workflow is defined once and enforced across multiple agents, including Claude Code, Codex, Cursor, opencode, and Pi. The tool restricts which tool calls are available in the current phase of the workflow, which keeps the model from flailing and improves reliability.

Overview

Statewright evaluates state machine definitions in a Rust engine. Evaluation is deterministic and involves no LLM. The tool integrates with coding agents over MCP (Model Context Protocol) and automatically enforces the tool restrictions attached to each state. Constraining the tool and solution spaces makes the problem smaller, so the model reasons in a focused context at each step.
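To make per-state enforcement concrete, here is a minimal sketch in Python. It is not Statewright's engine or definition format (the article says the engine is Rust and does not show the schema); the `Workflow` class, the state names, and the tool names are all hypothetical. It only illustrates the idea that a tool call is checked against the current state with a plain lookup, with no model in the loop.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """Hypothetical workflow: per-state tool allowlists plus legal transitions."""

    allowed_tools: dict[str, set[str]]
    transitions: dict[str, set[str]]
    current: str

    def check_tool(self, tool: str) -> bool:
        """Return True only if `tool` is allowed in the current state."""
        return tool in self.allowed_tools.get(self.current, set())

    def advance(self, target: str) -> None:
        """Move to `target`, or raise if that transition is not defined."""
        if target not in self.transitions.get(self.current, set()):
            raise ValueError(f"illegal transition: {self.current} -> {target}")
        self.current = target


def make_bugfix_workflow() -> Workflow:
    """Build an illustrative three-phase workflow (all names are made up)."""
    return Workflow(
        allowed_tools={
            "investigate": {"read_file", "grep"},
            "implement": {"read_file", "edit_file"},
            "verify": {"run_tests"},
            "done": set(),
        },
        transitions={
            "investigate": {"implement"},
            "implement": {"verify"},
            "verify": {"implement", "done"},
        },
        current="investigate",
    )
```

In this sketch, an agent in the `investigate` state can read and search but cannot edit files until the workflow advances to `implement`, and it cannot jump straight from `implement` to `done` without passing through `verify`.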

What the Plugin Does

The Statewright plugin for Claude Code provides a visual editor for defining workflows, which can be authored by hand or generated by agents. The plugin enforces tool restrictions per state, so the model cannot use a tool that is not allowed in the current phase. It also provides a guardrail system with per-state tool enforcement, Bash discernment, and environment scoping.
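The article does not define "Bash discernment". One plausible reading, assumed here, is that the guardrail inspects the shell command itself instead of allowing or denying the Bash tool wholesale. The sketch below illustrates that reading only; `ALLOWED_COMMANDS`, `bash_allowed`, and the state names are hypothetical and do not come from Statewright.

```python
import shlex

# Hypothetical per-state allowlists of executables; not Statewright's real config.
ALLOWED_COMMANDS = {
    "investigate": {"ls", "cat", "grep", "git"},
    "verify": {"pytest", "cargo"},
}

# Characters that can chain, pipe, redirect, or substitute commands.
# This sketch rejects them even inside quotes, which is deliberately conservative.
SHELL_METACHARACTERS = ";|&<>`$\n"


def bash_allowed(state: str, command: str) -> bool:
    """Return True if the command's executable is allowlisted for `state`."""
    if any(ch in command for ch in SHELL_METACHARACTERS):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: refuse rather than guess what the shell would run.
        return False
    if not tokens:
        return False
    return tokens[0] in ALLOWED_COMMANDS.get(state, set())
```

For example, `bash_allowed("investigate", "grep -rn TODO src")` passes, while `bash_allowed("investigate", "cat a;rm b")` is refused because of the `;`. A real guardrail would need a proper shell parser; the blanket metacharacter check trades flexibility for a simple rule that is hard to bypass.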

Tradeoffs

Statewright requires MCP support in the agent, and a workflow definition has to be written, by hand or by an agent, before anything can be enforced. Other limitations: workflow storage and run history live in a managed cloud unless you self-host, and enforcement in Cursor is advisory only.

When to Use It

Statewright is suitable for developers who want to improve the reliability of their AI agents by constraining the tool and solution spaces. The tool is particularly useful for tasks that require a high degree of precision and control, such as coding and debugging.

Pricing

Statewright is free for individual developers, with a managed cloud at statewright.ai handling workflow storage, run history, and the MCP gateway. The tool also offers a self-hosting option for single-developer and single-team use cases.

Bottom Line

Statewright provides a new approach to AI agent design, using constrained state machines and smaller models to improve reliability. The tool integrates with multiple agents, including Claude Code, Codex, Cursor, opencode, and Pi, and provides a guardrail system to prevent the model from flailing. While it has some limitations, Statewright is a valuable tool for developers who want to improve the reliability of their AI agents.

Practical Takeaway

For a developer whose agents flail on multi-step tasks, the practical move is to define the workflow once and let Statewright enforce it across agents. It is free for individual developers, and a self-hosting option covers single-developer and single-team use cases.
