“AI companies aren’t really using external evaluators” by Zach Stein-Perlman
EA Forum Podcast (All audio) - A podcast by EA Forum Team
From my new blog: AI Lab Watch. All posts will be crossposted to LessWrong. Subscribe on Substack.

Many AI safety folks think that METR is close to the labs, with ongoing relationships that grant it access to models before they are deployed. This is incorrect. METR (then called ARC Evals) did pre-deployment evaluation for GPT-4 and Claude 2 in the first half of 2023, but it seems to have had no special access since then.[1] Other model evaluators also seem to have little access before deployment.

Clarification: there are many kinds of audits. This post is about model evals for dangerous capabilities. But I'm not aware of the labs using other kinds of audits to prevent extreme risks, excluding normal security/compliance audits.

Frontier AI labs' pre-deployment risk assessment should involve external model evals for dangerous capabilities.[2] External evals can improve a lab's risk assessment and—if the evaluator can publish [...]

The original text contained 5 footnotes which were omitted from this narration.

---

First published: May 26th, 2024

Source: https://forum.effectivealtruism.org/posts/ZPyhxiBqupZXLxLNd/ai-companies-aren-t-really-using-external-evaluators-1

---

Narrated by TYPE III AUDIO.