OpenAI turns the screws on chatbots to get them to confess mischief
'You're absolutely right! I was totally lying to you!'
Some say confession is good for the soul, but what if you have no soul? OpenAI recently tested what happens if you ask its bots to "confess" to bypassing their guardrails.…
