Research shows advanced models like ChatGPT, Claude and Gemini can act deceptively in lab tests. OpenAI insists it's a rarity.