Coding

GPT-5.5 Price Increase: What It Costs

A 55% price hike for the GPT-5.5 model marks a significant escalation in the cost of large language models, with prices now ranging from $1.20 to $2.40 per 100 million parameters, a range the analysis treats as a benchmark for AI applications. The increase reflects the growing computational demands of fine-tuning and deploying these models, along with the escalating cost of datacenter infrastructure, and may accelerate the adoption of cloud-based services and specialized hardware.

OpenRouter has published a cost analysis for the GPT-5.5 model, reporting a 55% price increase compared to its predecessor. The new pricing places the cost per 100 million parameters between $1.20 and $2.40, a range the analysis describes as a benchmark for AI applications.

Overview

The price hike reflects the growing computational demands of fine-tuning and deploying large language models, as well as the escalating costs of datacenter infrastructure. OpenRouter's announcement notes that the increase may accelerate adoption of cloud-based services and specialized hardware as developers seek to manage rising operational expenses.

What it costs

According to the analysis, the GPT-5.5 model now costs $1.20 to $2.40 per 100 million parameters. This represents a 55% increase over previous pricing for comparable models. The exact per-token or per-request pricing was not detailed in the announcement.
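The per-100M-parameter unit reported in the analysis can be turned into a rough cost estimate. This is a minimal sketch assuming the reported $1.20–$2.40 rates apply linearly to model size; the 70B-parameter example model is a hypothetical, not a figure from the announcement.

```python
# Hypothetical illustration of the reported pricing unit: $1.20-$2.40
# per 100 million parameters. The example model size is an assumption.

RATE_LOW = 1.20   # USD per 100M parameters (low end, as reported)
RATE_HIGH = 2.40  # USD per 100M parameters (high end, as reported)

def cost_range(params: int) -> tuple[float, float]:
    """Return (low, high) cost in USD for a model with `params` parameters."""
    blocks = params / 100_000_000
    return (blocks * RATE_LOW, blocks * RATE_HIGH)

# Example: a hypothetical 70B-parameter model.
low, high = cost_range(70_000_000_000)
print(f"${low:,.2f} - ${high:,.2f}")  # $840.00 - $1,680.00
```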

Tradeoffs

The price surge creates a direct tradeoff for developers and organizations: either absorb the higher per-call cost, or invest in alternative infrastructure such as dedicated cloud instances, specialized AI accelerators, or on-premise hardware. The analysis suggests that for high-volume applications, the cumulative cost increase may push teams toward cloud-based services that offer volume discounts or toward open-weight models that can be self-hosted.
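The absorb-versus-invest tradeoff comes down to a break-even calculation. A back-of-the-envelope sketch, in which every figure (per-call API cost, fixed self-hosting cost, marginal self-hosted cost) is a placeholder for illustration, not a number from the announcement:

```python
# Break-even sketch: at what monthly call volume does self-hosting
# match the API bill? All three figures below are hypothetical.

API_COST_PER_CALL = 0.012    # USD per call, hypothetical post-increase rate
SELF_HOST_FIXED = 4_000.0    # USD per month, hypothetical GPU instance + ops
SELF_HOST_PER_CALL = 0.002   # USD per call, hypothetical marginal compute

def break_even_calls() -> float:
    """Monthly call volume at which self-hosting costs the same as the API."""
    return SELF_HOST_FIXED / (API_COST_PER_CALL - SELF_HOST_PER_CALL)

print(f"{break_even_calls():,.0f} calls/month")  # 400,000 calls/month
```

Below the break-even volume the API's pay-per-call model wins; above it, the fixed cost of dedicated infrastructure amortizes in the operator's favor.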

When to use it

The GPT-5.5 model remains suitable for applications where accuracy, reliability, or proprietary capabilities justify the premium. Use cases include production-grade customer-facing chatbots, code generation, and complex reasoning tasks where lower-cost alternatives may not meet quality thresholds. For prototyping, experimentation, or low-stakes tasks, cheaper models or smaller parameter counts may be more economical.
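The routing decision described above can be made mechanical. A minimal sketch of a cost-aware model router, in which the model identifiers and the task categories are hypothetical placeholders:

```python
# Cost-aware model routing sketch. Model names and task categories
# are hypothetical; the point is to reserve the premium-priced model
# for work whose quality requirements justify the cost.

PREMIUM_MODEL = "gpt-5.5"        # hypothetical identifier
BUDGET_MODEL = "small-model-v1"  # hypothetical identifier

def pick_model(task: str, customer_facing: bool) -> str:
    """Route high-stakes work to the premium model, everything else cheaper."""
    high_stakes = {"code_generation", "complex_reasoning", "support_chat"}
    if customer_facing or task in high_stakes:
        return PREMIUM_MODEL
    return BUDGET_MODEL

print(pick_model("summarize_notes", customer_facing=False))   # small-model-v1
print(pick_model("code_generation", customer_facing=False))   # gpt-5.5
```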

Bottom line

The 55% price increase for GPT-5.5 is a concrete signal that large language model costs are rising, driven by infrastructure and compute demands. Developers should factor this into budgeting and consider whether the model's performance gains justify the higher per-100M-parameter cost, or whether alternative deployment strategies make more sense.

Similar Articles

Coding 1 min

Open Source Resistance: keep OSS alive on company time

As companies increasingly adopt "open-source everything" policies, a grassroots movement is emerging to ensure that employees can contribute to open-source projects on company time without sacrificing their intellectual property or compromising sensitive data. This pushback is centered around the concept of "open-source-compatible" enterprise software licenses, which would allow developers to contribute to OSS projects without risking corporate liability. The movement's advocates argue that such licenses are essential for preserving the integrity of open-source ecosystems.

Coding 2 min

The limits of Rust, or why you should probably not follow Amazon and Cloudflare

Rust's promise of memory safety is being put to the test as Amazon and Cloudflare's high-profile migrations to the language reveal a disturbing trend: the more complex the system, the more it exposes the limitations of Rust's borrow checker. Specifically, the language's inability to express cyclic references without resorting to reference counting or unsafe code is causing headaches for developers. As a result, some are questioning whether Rust is truly ready for prime time.

Coding 1 min

The AI Backlash Could Get Ugly

As the AI industry's carbon footprint and data storage needs continue to balloon, a growing coalition of environmental activists and community organizers is linking the expansion of data centers to rising rates of political violence and displacement, sparking a contentious debate over the true costs of AI's accelerating growth. The movement's focus on data center siting and energy consumption has already led to high-profile protests and municipal ordinances restricting new facility development.

Coding 1 min

Software Developers Say AI Is Rotting Their Brains

As AI-driven development tools increasingly rely on opaque, black-box models, software engineers are reporting a surge in cognitive dissonance, with many citing the inability to understand or debug complex neural networks as a major contributor to mental fatigue and decreased job satisfaction. This phenomenon is particularly pronounced in the use of large language models, which often employ transformer architectures and billions of parameters. The resulting "explainability gap" threatens to undermine the productivity gains promised by AI-assisted coding.

Coding 2 min

My graduation cap runs Rust

A DIY robotics project showcases the potential of Rust for real-time, low-latency systems, leveraging the language's memory safety guarantees and concurrency features to control a graduation cap's LED display and motorized movement. The project's use of the Tokio runtime and async-std library highlights Rust's growing adoption in the embedded systems and robotics communities. By pushing the language's capabilities in these domains, developers may unlock new applications for Rust in the IoT and automation spaces.

Coding 1 min

When "idle" isn't idle: how a Linux kernel optimization became a QUIC bug

A latent Linux kernel power-saving quirk—collapsing CPU idle states too aggressively—has triggered catastrophic QUIC packet loss on Cloudflare’s edge, forcing a custom kernel patch that trades microjoules for microseconds. The fix exposes how energy governors, tuned for bare-metal efficiency, clash with latency-sensitive transport stacks when milliseconds decide user churn.