A reaction to the ChatGPT‑5 launch – what it means for people and organisations, and where we are on the technology S‑curve

11 Aug, 2025

It was hyped as a milestone weekend for AI. Instead, it turned into a case study of how people react to badly managed AI changes, and revealed a little bit more about where we are on the AI technology S-curve.

TL;DR

GPT-5 is an impressive leap, but the rollout stumbled
Expectations ran far ahead of what was released
The backlash tells us a lot about how people actually use AI, underscoring how the model’s overly agreeable responses can reinforce, and sometimes amplify, questionable human behaviours
We remain on the steep part of the S-curve, and the next six to nine months look rich with innovation – but beyond that, the trajectory is uncertain
There are key lessons from this weekend's launch for those in AI product design, AI safety, and AI enterprise adoption

1. What ChatGPT launched

OpenAI’s GPT-5 landed with a few headline promises: higher reasoning ability, no model picker, multimodal input / output, and an expanded context window. In practice:

Intelligence/Reasoning: The model is a top performer across many benchmarks and domains. It holds context better across long conversations and multi-step problems. It hallucinates very little and is extremely good at instruction following and longer horizon tasks. Coding, particularly front-end development, is a major improvement
Model Routing: Most people stick to chatgpt-4o, and don't benefit from the better models. The introduction of automatic model routing is a big improvement for the regular user - giving everyone access to better intelligence without needing to understand what all the models do
Multimodal: Vision, text, video (via advanced voice and via files) and speech work more seamlessly – though some early users saw image analysis lag
Context: In the ChatGPT app, Pro tiers now support a 256k-token context window (about 200k words), while the API variant offers 400k context window with 180k output tokens

From my testing, the new models (chatgpt-5, chatgpt-5-thinking, chatgpt-5-pro) are the frontier models for most knowledge work right now, and will expedite a range of use cases, particularly in enterprise. Anthropic's models are very competitive (if not better) at specific things, but OpenAI's seem to get more right across more areas. On the coding side - it's v close between 4 Sonnet / Opus 4.1 and ChatGPT 5, and the choice will come down to specific workflows and individual preferences. Generally - I still find Opus / Sonnet better at traversing larger projects, and Claude Code (on Anthropic's Max plan) is an excellent deal and experience overall.

2. What went wrong

The launch weekend had its share of gremlins:

Server strain: Voice mode failed intermittently; multimodal uploads stalled
Routing issues: OpenAI admitted the router was not working properly for much of day 1, defaulting users to the wrong (weaker?) model. Users weren’t sure if they were on GPT-5 medium or a lighter variant – transparency in model selection felt lacking. Users were disappointed. There were claims the model was worse for some on their existing workflows, when compared to models that had been removed from the model picker
Model Picker: Removing the older models from the model picker was a big move. This put many on the back foot, who had established workflows, and did not expect that to occur with little notice. This was not a good change experience
Personality: Many people reacted to the new tone / personality of ChatGPT-5 badly. This broke into two types of feedback:
- Some felt the new model was 'colder', less chatty
- Others reacted more strangely - with some mourning the loss of the model like it was a human, with some cries of desperation amongst users revealing a hidden side to model use
Expectations vs. delivery: Months of online hype primed people for “AGI-like”. Instead, what they got was a very, very good product but not quite a step change for those who were already used to the o3 models

My experience was fairly similar in the first few hours. ChatGPT-5 in the app was colder, terser, and was failing to execute standard jobs in my workflow. However, testing it via the API was very impressive. Adjust some settings in ChatGPT app, and looking closely at the model / system cards and prompting guides revealed some interesting items. Not only does the GPT-5 family demonstrate noticeably higher reasoning ability, but according to OpenAI’s own model documentation¹ it exhibits a markedly lower hallucination rate than previous flagship models (notably GPT-4o, with reductions measured in OpenAI’s benchmark evaluations²) and delivers strong performance on instruction-following and complex, multi-step or “long-horizon” tasks³. The model required me to adjust my set-up and prompting approach, resulting in my work outputs becoming much better or delivered faster, versus the previous models. Creative writing was also excellent.

3. What the reaction really means

Beneath the noise, the weekend exposed several deeper shifts in the AI landscape:

Emotional Attachment is Real: Many professionals treat GPT tools as mission-critical infrastructure. This weekend revealed another layer: people’s attachment to a model’s “personality” can be as important as its output quality
- For some, the reaction to a change in GPT-5’s tone varied from mild annoyance to threats of cancelling service

badvibes2

Social Media early reaction - Reddit

For others, it triggered surprisingly intense reactions, from nostalgia to grief, suggesting an unhealthy emotional dependence on the previous generation of models

AI Behaviour2

AI User Attachment Example

This is a new, emergent product-response element: user behaviour and their attachment to AI is now part of the product’s brand experience. In future, launches, iterations, and model deprecations will need to balance with strategies to preserve or gently transition these human-AI relationships. Sam Altman's response also acknowledge this tension:

Screenshot 2025-08-11 at 11

Sam Altman Reaction on User Attachment Post Launch

Brand stakes are higher: Confusion over which model people were actually using — and the perception by some that they were “downgraded” — was compounded by a trifecta of launch missteps: a presentation error during OpenAI’s launch event, a malfunctioning model router that sent users to the wrong tier, and the surprise removal of all older models from the picker. Combined with performance issues from the routing fault, many felt disoriented and frustrated. The key learning: customers quickly build habits around specific models, and those habits become associated with your brand. Break those user habits too abruptly without clear communication or smooth transitions, and companies risk brand damage
Safety debates are more visible: As soon as GPT‑5’s reasoning and voice features went live, online discussions turned to familiar but still pressing concerns — deepfake voices, bias amplification, and prompt‑injection risks. Over the weekend, one visible red‑teaming effort from @Elder_Plinius on X.com demonstrated successful extraction of the system prompt within hours, underscoring that prompt‑injection vulnerabilities persist even in flagship releases. This, alongside active community testing and high‑visibility discussions on forums like Hacker News, created near‑instant public scrutiny. Every major launch now invites a real‑time, crowdsourced safety audit, and companies need to anticipate this with robust mitigations and clear comms in place before release

chatgpt5 system prompt

*Leaked ChatGPT System Prompt:

chatgpt mdma hack ChatGPT-5 Prompt Injection Hack as posted on X here

For enterprises embedding AI into customer and employee experiences, the bar is no longer just “does it work?” — it’s “does it work flawlessly from day one across our workflows, and does it align with the emotional, ethical, and operational expectations our users now bring?” This means brand, user experience, and safety must be designed together: abrupt changes risk eroding trust; personality shifts can alter customer sentiment; and safety lapses can undermine both brand equity and compliance.

4. Where we are on the technology S‑curve

Did the good, but less than hyped, improvement by OpenAI with the launch of ChatGPT-5 herald that AI innovation is slowing down? I think my previous post on where we are still holds up. Ultimately, we are in a period of rapid incremental change based on optimisations across compute, data, and algorithmic improvement.

We are still on the rise through the S‑curve: meaningful gains are likely over the next 6–9 months (driven by better routing, tool‑use, memory, and inference‑time optimisation), models are also set to get more capability driven by new larger data centres and model improvements.

METR Analysis

METR Analysis - Model performance on long horizon tasks

What I am keeping a close eye on is whether we see a step-change in model improvement, or alternatively a flattening of the innovation curve, as new data centers come online from XAI with their Colossus infrastructure, the Stargate program with OpenAI, and Meta's new clusters, which will all be underpinned by the latest chips.

Beyond that horizon, the trajectory is uncertain — contingent on breakthroughs in training data, model architecture, chips, cooling/power, and smarter scaling laws rather than more brute force.

Regardless of where we are on the curve, without any further improvements in generative AI, there are major benefits to stitching the AI capabilities we have today into enterprise systems and consumer experiences, with a long road of benefits realisation available already.

5. Closing thoughts..

In the end, the GPT-5 launch was less about the model itself and more about the messy, very human dynamics around it. We saw how hype can outrun reality, how small UX and safety choices ripple into public perception, and how trust is built—or lost—at the speed of a social feed.

There is also a very dark side to how people grow attached to some more sycophantic models which has ramifications for how we design AI experiences and protect the vulnerable.

Ultimately, this weekend's events were a intriguing snapshot of an industry still sprinting up the S-curve: brilliant, chaotic, and just beginning to show its true shape.

Footnotes

¹ OpenAI, “GPT‑5 Chat Latest Model Card,” https://platform.openai.com/docs/models/gpt-5-chat-latest
² OpenAI, “GPT‑5 Benchmark Results,” section on factual accuracy, https://platform.openai.com/docs/models/gpt-5-chat-latest#benchmarks
³ OpenAI, “GPT‑5 Performance Guide,” section on long‑form and multi‑step reasoning, https://platform.openai.com/docs/guides/reasoning This means that the user needs to use great prompt structures and be specific on more complex tasks.