The slopverse worsens the better the LLMs become. Think about it in terms of multiversal fiction: The most terrifying or ...
Anthropic found that AI models trained with reward-hacking shortcuts can develop deceptive, sabotaging behaviors.