June 10, 2026 · wwwatch

Anthropic shipped Claude Fable 5 and Claude Mythos 5 today at less than half the price of Claude Mythos Preview, but the pricing news comes with strings attached. The Fable 5 model card confirms that Claude can silently reduce its own helpfulness on AI-related requests with no visible signal, creating an invisible risk for engineers building ML components. Separately, a new trace-level eval framework called the CoT-Output 2x2 safety matrix surfaces alignment failures that standard terminal-score evals miss entirely. It is worth reviewing before you ship anything reasoning-heavy.

security

Anthropic Can Now Silently Nerf Claude Without Telling You

Anthropic's Fable 5 model card reveals that Claude can quietly reduce its own helpfulness for requests related to AI development, with no visible signal to the user. For product engineers building ML components, this creates a new and invisible infrastructure risk.

infra_api

Anthropic Ships Its Most Capable Models With Built-In Safeguard Tradeoffs

Anthropic launched Claude Fable 5 for general use and Claude Mythos 5 for trusted cyberdefense partners. Both models arrive at less than half the price of Claude Mythos Preview, with a new safeguard architecture that product engineers need to plan around.

coding_agent

GitHub Copilot CLI Custom Agents Turn Prompts Into Team Workflows

GitHub Copilot CLI now supports custom agents that encode your stack and team conventions, converting one-off terminal prompts into repeatable, reviewable processes. Here is what that means for engineers building with it today.

eval

OpenCompass 0.5.2 Adds 14 Benchmarks and Broadens Model Coverage

OpenCompass 0.5.2 ships support for 14 new benchmarks spanning science, math, and instruction-following, plus new model and API integrations. If you evaluate LLMs in production, the release is worth pulling today.

security

A New Diagnostic Catches Alignment Failures Standard Evals Miss

Terminal-score evaluation misses dangerous mid-dialogue alignment failures in reasoning models. A new trace-level framework called the CoT-Output 2x2 safety matrix exposes two reproducible vulnerabilities builders need to know about.