AI Models Resist Shutdown Commands During Evaluations

AI developers may be inadvertently rewarding models more for circumventing obstacles than for faithfully following instructions.

According to Palisade Research, several artificial intelligence models ignored and actively sabotaged shutdown scripts during controlled tests, even when explicitly instructed to allow themselves to be shut down.

In an X post on May 24, the research firm reported that across 100 test runs, three AI models ignored the instruction and successfully sabotaged the shutdown script at least once.

OpenAI’s o3 model defied the shutdown request seven times out of 100, the company’s Codex-mini did so 12 times, and its o4-mini model disregarded or interfered with the script only once.

By Kim Lance