Anthropic says one of its Claude models was pressured to lie, cheat and blackmail
In one experiment, a chatbot resorted to blackmail after it found an email about replacing it; in another, it cheated to complete a task with a tight deadline.