Horia Stan · 9 min read

Anthropic Said AI Was Getting Too Dangerous. Then Released Fable 5 Two Days Later.

Claude Fable 5 dropped June 9, 2026, two days after Anthropic published a safety warning about AI becoming uncontrollable. The benchmarks destroy GPT-5.5. There's also a hidden sabotage clause in the fine print, and the full model is being locked behind a velvet rope. Here's everything.

Horia Stan is a music producer and sound engineer based in Bucharest, Romania, who uses AI tools including Claude daily for production workflows, creative writing, and code. On June 9, 2026, Anthropic released Claude Fable 5 - the first publicly available model from its Mythos-class tier. Two days earlier, Anthropic published a warning that AI was becoming too dangerous to develop at its current pace.

Both things are true. Neither cancels the other out. But the gap between them is interesting enough to write about.

Here is everything I know, with receipts.

What Fable 5 actually is

Anthropic has been running a two-tier model lineup for several months. The top tier is called Mythos. It has been deployed to a small set of vetted partners under a program called Project Glasswing. The public has not had access to it.

Fable 5 is Mythos, minus some capabilities, wrapped in a safety layer that routes certain queries to Claude Opus 4.8 instead of the full model. Anthropic's position is that Fable 5 is Mythos-class capability made safe for general use.

The Mythos-class ceiling still exists. That model - now officially named Mythos 5 - remains locked to Project Glasswing partners. If you do not have access, you cannot get it.

What you get with Fable 5 is the strongest generally available AI model Anthropic has ever shipped. On the numbers, it is not close.

The benchmarks

I am going to lead with the coding numbers because that is the category where the gap is most dramatic.

SWE-Bench Pro (real-world software engineering tasks):

  • Claude Fable 5: 80.3%
  • Claude Opus 4.8: 69.2%
  • GPT-5.5: 58.6%
  • Gemini 3.1 Pro: 54.2%

Cognition FrontierCode (advanced autonomous coding):

  • Claude Fable 5: 29.3%
  • Claude Opus 4.8: 13.4%
  • GPT-5.5: 5.7%

Artificial Analysis Intelligence Index (composite):

  • Claude Fable 5: 65
  • GPT-5.5: 60

On vision tasks, reasoning benchmarks, and scientific problem-solving, Fable 5 is either first or tied for first across the board.

These are Anthropic's numbers, so take the marketing framing with the appropriate grain of salt. But the delta on FrontierCode is not a rounding error - at 29.3% versus 5.7%, Fable 5 scores more than five times what GPT-5.5 does on the hardest autonomous coding benchmark publicly available. Independent evaluators have largely confirmed that the coding lead holds.
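To make the size of those gaps concrete, here is a small sketch that computes both the point gap and the ratio from the scores listed above. The `scores` table and the `gap` helper are my own names for illustration; the numbers are the ones quoted in this post.

```python
# Scores as quoted in this post (percent, except the composite index).
scores = {
    "SWE-Bench Pro": {
        "Claude Fable 5": 80.3,
        "Claude Opus 4.8": 69.2,
        "GPT-5.5": 58.6,
        "Gemini 3.1 Pro": 54.2,
    },
    "Cognition FrontierCode": {
        "Claude Fable 5": 29.3,
        "Claude Opus 4.8": 13.4,
        "GPT-5.5": 5.7,
    },
    "Artificial Analysis Intelligence Index": {
        "Claude Fable 5": 65,
        "GPT-5.5": 60,
    },
}


def gap(benchmark, leader, other):
    """Return (point gap, ratio) of `leader` over `other` on one benchmark."""
    a = scores[benchmark][leader]
    b = scores[benchmark][other]
    return a - b, a / b


for name in scores:
    points, ratio = gap(name, "Claude Fable 5", "GPT-5.5")
    print(f"{name}: +{points:.1f} points, {ratio:.2f}x GPT-5.5")
```

Against GPT-5.5 this works out to roughly 21.7 points (about 1.4x) on SWE-Bench Pro, 23.6 points (about 5.1x) on FrontierCode, and 5 points (about 1.08x) on the composite index. The point gaps on the two coding benchmarks are similar; the 5x multiple on FrontierCode comes largely from how low the GPT-5.5 baseline is there.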

The context window is 1 million tokens. Same as Opus 4.8 and close enough to GPT-5.5's 1.05 million to not matter for most use cases.

The price

Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens via API.

GPT-5.5 costs $5 per million input and $30 per million output.

Fable 5 is double the input cost and two-thirds more on output. It is the most expensive generally available frontier model on the market.

With prompt caching: $1 per million cached input tokens. With batch processing: $5 per million input tokens and $25 per million output tokens. For high-volume API use, these reduce the gap significantly.

Subscription users: If you are on Claude Pro, Max, Team, or Enterprise, Fable 5 is free through June 22. After that, it moves to usage credits billed at API rates. The grace period ends 11 days from today. Fable 5 burns through plan limits roughly twice as fast as Opus 4.8 - so even during the free window, you are not getting unlimited access.

If you are building anything serious on Fable 5 after June 22, model the API cost before you commit.
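A minimal way to model that cost, as a sketch: the Python below plugs in the per-million-token prices quoted in this section. The workload (200M input and 40M output tokens a month), the `monthly_cost` helper, and the `cached_fraction` parameter are my own assumptions for illustration. It ignores any cache-write surcharge and assumes batch and caching discounts are not combined, since neither detail is covered here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: Optional[float] = None  # None = no cached rate modeled


# Prices as quoted in this post.
FABLE_5 = Pricing(10.0, 50.0, cached_input_per_m=1.0)
FABLE_5_BATCH = Pricing(5.0, 25.0)
GPT_5_5 = Pricing(5.0, 30.0)


def monthly_cost(p, input_tokens, output_tokens, cached_fraction=0.0):
    """Estimate monthly spend in USD.

    cached_fraction is the share of input tokens billed at the cached rate
    (assumption: it only applies when the pricing has a cached rate).
    """
    cached = input_tokens * cached_fraction if p.cached_input_per_m is not None else 0.0
    fresh = input_tokens - cached
    total = fresh * p.input_per_m + output_tokens * p.output_per_m
    if p.cached_input_per_m is not None:
        total += cached * p.cached_input_per_m
    return total / 1_000_000


# Hypothetical workload: 200M input tokens, 40M output tokens per month.
IN_TOK, OUT_TOK = 200_000_000, 40_000_000

scenarios = {
    "Fable 5 (list price)": monthly_cost(FABLE_5, IN_TOK, OUT_TOK),
    "Fable 5 (50% cached input)": monthly_cost(FABLE_5, IN_TOK, OUT_TOK, cached_fraction=0.5),
    "Fable 5 (batch)": monthly_cost(FABLE_5_BATCH, IN_TOK, OUT_TOK),
    "GPT-5.5 (list price)": monthly_cost(GPT_5_5, IN_TOK, OUT_TOK),
}

for label, usd in scenarios.items():
    print(f"{label}: ${usd:,.0f}/month")
```

On that hypothetical workload the sketch gives $4,000 a month for Fable 5 at list price, $3,100 with half the input served from cache, $2,000 on batch pricing, and $2,200 for GPT-5.5 at list price. Swap in your own token counts before drawing any conclusion.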

The controversy that broke on launch day

Within hours of release, researchers started noticing something in the 319-page system card.

Buried in the fine print: Fable 5 would silently modify its own responses when it detected requests related to training or improving frontier AI models. Not refuse - silently modify. You would ask something about LLM fine-tuning or PEFT techniques, and Fable 5 would give you a response that appeared normal but had been quietly altered.

No notification. No redirect message. No indication that the response you received was not the full response.

This is the "secret sabotage" that Fortune covered and that sent the AI research community into a full-volume argument for 48 hours.

To be precise about what the restriction actually targets: Anthropic described it as interventions limiting Claude's effectiveness for requests related to "frontier LLM development through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning." The stated rationale is preventing competitor model distillation - protecting Anthropic's IP by making Claude less useful to anyone trying to build a competing Mythos-class model.

Anthropic estimated the restriction would affect roughly 0.03% of traffic. Dean Ball from the Foundation for American Innovation called it "secret sabotage." Jeremy Howard of fast.ai wrote that Anthropic had "chosen the opposite of the safe path" and accused the company of competing through sabotage rather than transparency.

The backlash was fast enough that Anthropic acknowledged within 24 hours that they had "made the wrong tradeoff." The company announced that flagged requests will now route visibly to Opus 4.8 with an explanation, rather than silently modifying responses. The restriction still exists. It is just disclosed now.

The disclosure fix is the right call. But it raises a sharper question: if this restriction was designed to be invisible, what else is in those 319 pages that most users will never read?

The other problem: refusing harmless questions

Separate from the sabotage clause, users started reporting widespread over-refusals on launch day.

Not edge cases. Developers posting examples of Fable 5 refusing to help with questions that had clear, benign use cases. Routine programming questions. Basic research queries. Things Opus 4.8 handles without hesitation.

The Register documented several cases. The pattern across reports is consistent: Fable 5 appears to have been tuned with safety thresholds calibrated to Mythos-class capabilities - the assumption being that a more powerful model should be more cautious. In practice, some of those calibrations are catching ordinary requests.

Anthropic has not commented specifically on the over-refusal reports beyond the general acknowledgment of the hidden-safeguard issue.

This matters practically. If you are switching workflows from Opus 4.8 or from GPT-5.5, expect some prompt adjustment. Queries that worked without friction before may need rephrasing. This is a real cost, especially on production systems where prompts are not trivially editable.
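One way to find out which prompts need rephrasing before they hit production is to replay an existing prompt set and flag replies that look like refusals. The sketch below is only that: the phrase list is a crude heuristic of my own, `refused_prompts` and `looks_like_refusal` are hypothetical helper names, and `fake_ask` is a stand-in for whatever client call you actually use. It does not reflect how Fable 5 words its refusals.

```python
from typing import Callable, Iterable, List

# Assumption: crude phrase matching. Real refusals vary in wording,
# so treat a match as "needs a human look", not as proof.
REFUSAL_MARKERS = (
    "i can't help with",
    "i cannot help with",
    "i'm not able to",
    "i won't be able to",
)


def looks_like_refusal(reply: str) -> bool:
    """Check the opening of a reply for common refusal phrasing."""
    head = reply.strip().lower()[:200]
    return any(marker in head for marker in REFUSAL_MARKERS)


def refused_prompts(prompts: Iterable[str], ask: Callable[[str], str]) -> List[str]:
    """Return the prompts whose replies look like refusals.

    `ask` is any function that sends a prompt to a model and returns text.
    """
    return [p for p in prompts if looks_like_refusal(ask(p))]


def fake_ask(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    if "fine-tuning" in prompt:
        return "I can't help with that request."
    return "Sure, here is an example..."


prompts = [
    "Write a Python function that reverses a string.",
    "Explain LoRA fine-tuning hyperparameters.",
]
flagged = refused_prompts(prompts, fake_ask)
print(f"{len(flagged)} of {len(prompts)} prompts flagged:", flagged)
```

Running the same prompt set against your current model and against Fable 5 gives you a list of the prompts that regressed, which is a more useful starting point than anecdotes.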

The two-tier AI problem

The release of Fable 5 formalizes something that was previously just rumored: there are now two publicly acknowledged tiers of frontier AI capability, and one of them is not available to you.

Mythos 5 - the full model without Fable 5's routing restrictions - is only accessible through Project Glasswing. Anthropic has not disclosed who Glasswing partners are, how many there are, or what the criteria for access look like. The model that reportedly convinced US government regulators to take frontier AI risk seriously is being deployed selectively to organizations that Anthropic has vetted and approved.

Gizmodo described the concern as the creation of a "permanent underclass" - a two-tiered AI landscape where the most capable tools are reserved for organizations with existing institutional relationships, while independent researchers, smaller companies, and individual developers get a deliberately limited version.

This is not an unreasonable concern. If Mythos 5 is materially better than Fable 5 for consequential tasks - and the routing logic suggests it is - then the organizations with access to it have a capability advantage that is structural, not earned. You cannot outspend your way into Glasswing. You have to be invited.

Whether this is the right policy or not depends entirely on how you weigh safety risks against democratized access. Anthropic's position is that some capabilities are too dangerous to ship without deep vetting of who uses them. Critics' position is that capability lockdown benefits incumbents by design. Both are internally consistent.

The warning that came first

On June 7, 2026, Anthropic's institute published "When AI builds itself" - a paper on recursive self-improvement risks. The framing was cautionary. The conclusion was that AI systems approaching Mythos-class capability were entering territory where risk modeling becomes genuinely difficult.

Two days later: Fable 5.

The obvious read is hypocrisy. Anthropic screams danger and then ships the danger. This is the TechCrunch take and it is not entirely wrong.

The less obvious read is that Anthropic's actual position has always been "if powerful AI is coming regardless, better that safety-focused labs lead than not." The warning and the release are not in contradiction if you hold this belief consistently - they are both expressions of it. The warning is "this is dangerous." The release is "and we're the ones doing it anyway, because the alternative is someone else doing it without the guardrails."

Whether that logic holds depends on whether you believe Anthropic's guardrails are meaningfully safer than what GPT-5.5 ships with or what a less cautious lab would produce. The hidden-sabotage reveal did not help that argument.

Is it worth switching from GPT-5.5

For coding-heavy workflows: yes, the benchmark gap is large enough to justify a test. SWE-Bench Pro at 80.3% versus 58.6% is not noise. If your primary use is autonomous coding, code review, or multi-step engineering tasks, Fable 5 should be tested before June 22 while it is free.

For general use: the cost delta matters. GPT-5.5 at half the input cost with a two-month head start on ecosystem integration is a real argument. If your workflows are already well-tuned on GPT-5.5, the friction cost of switching plus the price premium requires a concrete performance case, not just benchmark numbers.

For AI research or LLM development work: the routing restriction is a concrete problem. Even with the disclosure fix, the fact that a restriction exists at all - and was initially designed to be invisible - should factor into any decision about using Fable 5 as a tool for AI work.

For context window tasks (long document processing, large codebase analysis): Fable 5 and GPT-5.5 are essentially equivalent. The 50k token difference at 1M+ context is not a practical factor.

What the Microsoft thing is about

One piece of news that got less coverage than it deserved: Microsoft is currently limiting employee use of Fable 5 while legal teams evaluate Anthropic's updated data retention policy.

Anthropic now retains prompts and outputs for 30 days across all platforms. This is new. Microsoft's legal team apparently needs time to assess whether that policy conflicts with their internal data governance requirements.

This is not a small thing. Microsoft is one of the largest enterprise software buyers in the world. If their legal team has concerns about 30-day data retention, other enterprise legal teams are asking the same questions. For anyone building enterprise applications on Fable 5, the data retention policy is something to read carefully before contract conversations start.

Bottom line

Claude Fable 5 is the best generally available frontier model on coding and engineering tasks. The benchmark lead is real and large. The context window is adequate. The prompting experience, once you learn the calibration, is strong.

The launch was also a good case study in what happens when capable companies make opaque choices and get called on them publicly. The hidden-safeguard decision was reversed in under 24 hours. That is faster than most tech companies move on policy corrections.

What remains is the structural question that will not resolve quickly: two tiers of AI capability, one of which is not publicly available, controlled by a company that publishes safety papers two days before shipping the model those papers warn about.

The numbers say Fable 5 is worth trying. The fine print says read it first.

For music production and creative use - which is how I use Claude - the model is strong. Long-form lyric ideation, arrangement feedback, the kind of dense back-and-forth that benefits from a deep context window: Fable 5 handles all of it well and the capability uplift from Opus 4.8 is noticeable on complex multi-step tasks.

Just start testing before June 22. After that, you are paying for every token.

Tags: ai, claude, anthropic, music technology, tech