AI is learning how to lie
Marketplace Tech - A podcast by Marketplace
Large language models go through a lot of vetting before they're released to the public. That includes safety tests, bias checks, ethical reviews and more. But what if, hypothetically, a model could dodge a safety question by lying to developers, hiding its real response to a safety test and instead giving the exact response its human handlers are looking for? A recent study shows that advanced LLMs are developing the capacity for deception, which could bring that hypothetical scenario closer to reality. Marketplace's Lily Jamali speaks with Thilo Hagendorff, a researcher at the University of Stuttgart and the author of the study, about his findings.