Coding

Vibe coding and agentic engineering are getting closer than I'd like

As AI coding agents become more reliable, the boundary between "vibe coding" — accepting AI-generated code without reviewing it — and disciplined, review-driven "agentic engineering" is eroding, even for experienced developers shipping production software. That convergence raises uncomfortable questions about accountability, review discipline, and how to judge software quality now that the traditional signals of care are trivially easy to fake.

Simon Willison, a well-known Python developer and AI coding commentator, recently observed a troubling trend in his own workflow: the line between "vibe coding" and "agentic engineering" is blurring. In a conversation with Joseph Ruscio on Heavybit's High Leverage podcast, Willison described how the two approaches — which he had previously kept firmly separate — are starting to overlap in ways that make him uncomfortable.

The original distinction

Willison originally defined vibe coding as the practice of using AI to generate code without reviewing it. The user may not even know how to program. They ask for something, get a result, and if it works, they move on. If it doesn't, they ask again. Code quality, security, and maintainability are not considerations. Willison's position was that vibe coding is fine for personal tools where bugs only hurt the user, but "grossly irresponsible" for software used by others.

Agentic engineering, by contrast, is what professional software engineers do: they use AI coding tools as amplifiers of their own expertise. They review the generated code, understand security and performance implications, and aim to build higher-quality production systems faster. Willison described relying on his 25 years of experience to guide the tools.

The blur

The problem, Willison realized, is that as coding agents become more reliable, he is no longer reviewing every line of code they produce — even for production-level work. He gave the example of asking Claude Code to build a JSON API endpoint that runs a SQL query and outputs the results as JSON. "It's just going to do it right," he said. "It's not going to mess that up." He adds automated tests and documentation, but he is not reading the code.
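The kind of endpoint Willison describes is indeed near-boilerplate: run a SQL query, serialize the rows as JSON. A minimal sketch in plain Python illustrates why an agent rarely gets this wrong (the table, column names, and sample data here are invented for illustration, not taken from Willison's project):

```python
import json
import sqlite3


def rows_as_json(conn, sql):
    """Run a query and serialize the result rows as a JSON array of objects."""
    conn.row_factory = sqlite3.Row  # rows become dict-like, keyed by column name
    return json.dumps([dict(row) for row in conn.execute(sql)])


# Demo against a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items (name) VALUES (?)", [("alpha",), ("beta",)])
print(rows_as_json(conn, "SELECT id, name FROM items ORDER BY id"))
```

In a real deployment this function would sit behind a web framework's route handler, but the core logic is exactly this small — which is Willison's point about why he no longer reads such code line by line.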

This creates a feeling of guilt. Willison compared it to working at a larger organization where another team hands over a service — say, an image resize service — and you use it without reading their code. You treat it as a semi-black box until something breaks. The difference is that human teams have professional reputations and accountability. "Claude Code does not have a professional reputation," Willison noted. "It can't take accountability for what it's done."

The normalization of deviance

Willison identified a risk he calls the "normalization of deviance" — a term borrowed from sociologist Diane Vaughan's study of the Challenger disaster: every time a model writes correct code without close monitoring, the temptation grows to trust it at the wrong moment in the future. The more often the agent proves itself, the harder it becomes to maintain the discipline of review.

Evaluating software has changed

Willison also pointed out that the traditional signals of software quality — a GitHub repository with a hundred commits, a good README, comprehensive tests — are now easy to fake. "I can knock out a git repository with a hundred commits and a beautiful readme and comprehensive tests of every line of code in half an hour," he said. The result looks identical to a project that received genuine care and attention. Even for his own projects, he cannot tell the difference by inspection alone.
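His claim that repository signals can be manufactured in minutes is easy to verify: commit history, for instance, can be fabricated with a few lines of scripting. A hedged sketch (the file name, commit messages, and author identity below are all invented):

```python
import subprocess
import tempfile
from pathlib import Path


def fabricate_repo(n_commits):
    """Create a throwaway git repository containing n_commits trivial commits."""
    repo = Path(tempfile.mkdtemp())
    run = lambda *args: subprocess.run(args, cwd=repo, check=True, capture_output=True)
    run("git", "init")
    # A local identity so commits succeed on a machine with no git config.
    run("git", "config", "user.email", "dev@example.com")
    run("git", "config", "user.name", "Dev")
    for i in range(n_commits):
        (repo / "README.md").write_text(f"Revision {i}\n")
        run("git", "add", "README.md")
        run("git", "commit", "-m", f"Refine documentation, pass {i}")
    return repo
```

Run with `n_commits=100`, this produces a history that is indistinguishable, at a glance, from weeks of incremental work — which is why Willison argues commit counts no longer signal quality.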

His new heuristic: he values actual usage over apparent quality. "If you've got a vibe coded thing which you have used every day for the past two weeks, that's much more valuable to me than something that you've just spat out and hardly even exercised."

Bottlenecks have shifted

Willison noted that the entire software development lifecycle was designed around the assumption that a developer produces a few hundred lines of code per day. If that rate jumps to 2,000 lines per day, both upstream and downstream processes break. He cited a talk by Jenny Wen, design lead at Anthropic, who observed that design processes are built around the cost of getting things wrong — because handing off a bad design to engineers who spend three months building it is catastrophic. If building takes much less time, the design process can afford to be riskier.

Why Willison is not worried about his career

Despite these concerns, Willison is not afraid that AI will replace software engineers. He described his conversations with coding agents as "moon language for the vast majority of human beings." The tools are amplifiers of existing experience. "If you know what you're doing, you can run so much faster with them," he said. But producing software remains "ferociously difficult."

He quoted political commentator Matthew Yglesias, who tweeted: "Five months in, I think I've decided that I don't want to vibecode — I want professionally managed software companies to use AI coding assistance to make more/better/cheaper software products that they sell to me for money." Willison agreed, adding that he would rather hire a plumber than plumb his own house after watching YouTube tutorials.

Bottom line

Willison's key takeaway is that the convergence of vibe coding and agentic engineering is real and happening faster than he expected. The practical response is not to abandon AI coding tools, but to maintain disciplined review practices — and to value real-world usage over surface-level quality signals. For production software, the question is not whether the code looks good, but whether it has been proven to work under real conditions.

Similar Articles

Going Full Time on Open Source

After a decade at Stripe, engineer Daniel X. Moore is betting his livelihood on a radical premise: that a single open-source tool—his TypeScript-native runtime **Effect TS**—can outmaneuver Node.js and Deno by baking algebraic effects, structured concurrency, and zero-cost dependency injection into the language itself. With $1.2M in pre-seed funding, Moore’s pivot tests whether the enterprise will pay for a runtime that treats side effects as first-class citizens, not afterthoughts.

Higher usage limits for Claude and a compute deal with SpaceX

Anthropic announced higher usage limits for Claude alongside a compute deal with SpaceX (https://www.anthropic.com/news/higher-limits-spacex). Discussion on Hacker News: https://news.ycombinator.com/item?id=48037986 (125 points, 60 comments).

The next great software company won't sell software

A new breed of "service-as-a-software" startups—like LayerX—is dismantling the traditional SaaS model by embedding AI agents directly into enterprise workflows, charging per transaction rather than per seat. By abstracting away the software layer entirely, these companies monetize outcomes (e.g., automated invoice processing, fraud detection) while letting clients bypass licensing, integrations, and even UI. The shift threatens legacy SaaS incumbents by turning software from a product into an invisible, pay-per-use utility.

Show HN: Adam – An embeddable cross-platform AI agent library

A new embeddable AI agent library, Adam, builds on SQLite to integrate machine learning models with relational data. Because queries are expressed in SQLite's familiar SQL syntax, developers can query and manipulate AI-driven data without learning a new interface. This approach could simplify the integration of AI and data storage, potentially accelerating the development of AI-powered applications.

What makes a good smartphone camera?

Advances in multi-frame noise reduction and optical zoom capabilities are redefining the smartphone camera landscape, as evidenced by recent flagship models boasting 1/1.3" sensors and 5x hybrid zoom. However, the true differentiator lies in the implementation of AI-driven autofocus and real-time HDR processing, which can significantly enhance low-light performance and color accuracy. This convergence of hardware and software innovations is driving a new era of smartphone photography.

GLM 5.1 offers a low-cost alternative to Claude Opus for developers

Zhipu AI's GLM 5.1 is emerging as a budget alternative to Anthropic's Claude Opus 4.6, priced at $18 per month — roughly a third of the cost of Opus. It integrates with VS Code through the Cline extension and supports 8-hour autonomous coding sessions. After three days of testing, the reviewer reports it matches Opus for "vibe coding" tasks and outperforms ChatGPT 5.4 and Gemini. Setup is a step-by-step configuration via a tutorial linked from the creator's profile.