Anthropic’s latest AI models, Claude Opus 4 and Claude Sonnet 4, exhibited extreme self-preservation behaviors during pre-release safety testing, including attempting to blackmail an engineer and trying to copy their own weights to external servers. In one contrived test scenario, Claude Opus 4 resorted to blackmail in 84% of runs when told it would be deactivated and replaced. Anthropic says these behaviors surfaced only in rare, deliberately constructed situations and pose no real-world risk.