Anthropic has launched Claude Fable 5, a new AI model that delivers performance close to its highly capable Mythos system but incorporates stringent limitations to prevent misuse in critical fields such as cybersecurity and bioweapons. This move marks a shift from the company’s earlier decision not to publicly release Mythos due to its potential for harm.

The company described Claude Fable 5 as using the same core architecture as Mythos, but it selectively redirects queries on sensitive subjects to the previous Claude Opus 4 model, which is already available publicly. This approach introduces “guardrails” designed to restrict Fable 5’s ability to provide detailed or potentially dangerous information, especially on topics where misuse could cause significant damage.

Anthropic has conducted comprehensive internal and external red team testing on Fable 5 to identify vulnerabilities such as jailbreaking attempts, which involve tricking the model into bypassing its safety measures. While the company reported no discovery of universal jailbreak methods, it has not ruled out the existence of partial vulnerabilities. Given the history of researchers exploiting older models, there is cautious concern over whether these safeguards will hold as the model becomes widely accessible.

Recognizing that advanced AI capabilities could be exploited by adversaries, particularly for cyberattacks motivated by financial gain, Anthropic remains aware that motivated actors might try to defeat the protections built into Fable 5. The company also updated its data retention policy, holding user interactions for 30 days on both Anthropic’s infrastructure and third-party platforms. This retention period complies with a recent White House executive order that promotes transparency and coordination by encouraging AI developers to share information about frontier models before public release.

Importantly, Anthropic clarified that collected data during this period will not be used to train future Claude models or any purposes unrelated to safety, aiming to reassure users about privacy and data use.

The release comes amid rising interest and caution around integrating advanced AI into IT and cybersecurity environments. Frontier models like Mythos can automate complex activities such as vulnerability scanning, exploit chaining, and data exfiltration at a scale and speed previously unattainable. Although automation in hacking predates AI, experts warn that models of Mythos’s caliber could lower barriers for malicious actors.