OpenAI Quietly Removes AI Writing Detector
OpenAI has withdrawn its tool for detecting AI-written text six months after its release. A note at the top of the blog post announcing the AI classifier shared the news without any fanfare, citing accuracy issues as the reason for the decision, though it promised to continue working on a solution.
Classifying AI
OpenAI introduced the AI classifier with the claim that it was much better than previous versions at picking out AI-authored text. To its credit, the startup also made it clear that better didn’t mean good. The tool correctly identified 26% of AI-written texts but also mislabeled 9% of human-written text as coming from an AI. Inputting longer texts, more than 1,000 characters, raised the accuracy rate, though accuracy doesn’t scale in any simple way with text length. Six months later, OpenAI seems to have decided its approach isn’t working well enough, or at least not improving fast enough, for the company to continue supporting it as a publicly available tool.
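To see what those two rates mean in practice, here is a rough Python sketch that applies Bayes' rule to the 26% and 9% figures above. The share of submitted texts that are actually AI-written (the base rate) is not something OpenAI reported, so the values tried below are purely illustrative assumptions, and the function name is made up for this example.

```python
def flagged_precision(true_positive_rate, false_positive_rate, base_rate):
    """Share of flagged texts that really are AI-written.

    base_rate is the assumed fraction of all texts that are AI-written;
    it is a hypothetical input, not a figure from OpenAI.
    """
    flagged_ai = true_positive_rate * base_rate
    flagged_human = false_positive_rate * (1 - base_rate)
    total_flagged = flagged_ai + flagged_human
    if total_flagged == 0:
        return 0.0  # nothing gets flagged, so there is nothing to be right about
    return flagged_ai / total_flagged


# Rates reported for the classifier: 26% of AI text caught,
# 9% of human text wrongly flagged.
TPR, FPR = 0.26, 0.09

# Hypothetical base rates, chosen only to show how much the answer moves.
for base_rate in (0.5, 0.1):
    share = flagged_precision(TPR, FPR, base_rate)
    print(f"base rate {base_rate:.0%}: {share:.0%} of flagged texts are AI-written")
```

Under these assumptions, if half of all texts were AI-written, roughly three out of four flags would be correct. If only one in ten were, about three out of four flags would land on human writers, which helps explain why a 9% false positive rate is a serious problem.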
“As of July 20, 2023, the AI classifier is no longer available due to its low rate of accuracy,” OpenAI explained in the addendum to its announcement. “We are working to incorporate feedback and are currently researching more effective provenance techniques for text, and have made a commitment to develop and deploy mechanisms that enable users to understand if audio or visual content is AI-generated.”
OpenAI’s mention of focusing on identifying AI-generated audio and visual content is notable, as there’s been a recent rush among synthetic media developers to create AI detectors for sound and sight. Warnings about the power of deepfake videos haven’t stopped scams built on realistic videos of celebrities, and those scams have accelerated the detection work. Synthetic speech startup Resemble AI started testing an audio watermark feature a few months ago, followed more recently by ElevenLabs and Meta. While those watermarks detect deepfakes produced by those companies, Resemble AI took it a step further with the new Resemble Detect, which can spot deepfake voices from almost any source.
Companies and independent developers are experimenting with how best to trace AI text as well, which is why OpenAI’s admission that its text detection tool hasn’t worked out stands out. By comparison, Turnitin boasts nearly 100% accuracy in detecting AI writing, and GPTZero claims to accurately identify 99% of human-written articles and 85% of AI-generated ones.
“Its honesty is exactly what the market needs right now. The LLMs today are not instantly capable of all tasks. Some tasks may not be achievable with current techniques, and others may take years of new development to master,” Voicebot founder Bret Kinsella pointed out. “Not everyone understands the boundary capabilities of LLMs because they have so little experience with them, and most have never worked on their development but have only used the output. Very few people know what they don’t know about LLMs. This leads to too much projection of optimism that mistakes anomalous or random success with actual success.”