A.I. Is Getting More Powerful, but Its Hallucinations Are Getting Worse
A new wave of "reasoning" systems from companies like OpenAI is producing incorrect information more often. Even the companies don't know why.
https://www.nytimes.com/2025/05/05/technology/ai-hallucinations-chatgpt-google.html
By Cade Metz and Karen Weise
Published May 5, 2025. Updated May 6, 2025.
-snip-
Today's A.I. bots are based on complex mathematical systems that learn their skills by analyzing enormous amounts of digital data. They do not and cannot decide what is true and what is false. Sometimes, they just make stuff up, a phenomenon some A.I. researchers call "hallucinations." On one test, the hallucination rates of newer A.I. systems were as high as 79 percent.
-snip-
The company found that o3, its most powerful system, hallucinated 33 percent of the time when running its PersonQA benchmark test, which involves answering questions about public figures. That is more than twice the hallucination rate of OpenAI's previous reasoning system, called o1. The new o4-mini hallucinated at an even higher rate: 48 percent.
-snip-
Hannaneh Hajishirzi, a professor at the University of Washington and a researcher with the Allen Institute for Artificial Intelligence, is part of a team that recently devised a way of tracing a system's behavior back to the individual pieces of data it was trained on. But because systems learn from so much data, and because they can generate almost anything, this new tool can't explain everything. "We still don't know how these models work exactly," she said.
-snip-
Another issue is that reasoning models are designed to spend time thinking through complex problems before settling on an answer. As they try to tackle a problem step by step, they run the risk of hallucinating at each step. The errors can compound as they spend more time thinking.
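The compounding effect described above can be illustrated with a small calculation. This is only a sketch under a strong simplifying assumption that is not from the article: each reasoning step goes wrong independently with the same probability. The function name and the 5 percent per-step figure are hypothetical, chosen purely for illustration.

```python
def chance_of_any_error(per_step_error: float, steps: int) -> float:
    """Probability that at least one of `steps` steps goes wrong.

    Assumes (hypothetically) that every step fails independently
    with the same probability `per_step_error`.
    """
    if not 0.0 <= per_step_error <= 1.0:
        raise ValueError("per_step_error must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # All steps are correct with probability (1 - p) ** n,
    # so at least one error occurs with the complement.
    return 1.0 - (1.0 - per_step_error) ** steps


if __name__ == "__main__":
    # Illustrative per-step error rate of 5 percent (an assumption).
    for steps in (1, 5, 10, 20):
        print(steps, round(chance_of_any_error(0.05, steps), 3))
```

Under these assumptions, a small per-step error rate grows quickly with the number of steps. Real systems need not behave this way, since errors may be correlated or corrected along the way.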
-snip-
Cade Metz is a Times reporter who writes about artificial intelligence, driverless cars, robotics, virtual reality and other emerging areas of technology.
Karen Weise writes about technology for The Times and is based in Seattle. Her coverage focuses on Amazon and Microsoft, two of the most powerful companies in America.
"You spend a lot of time trying to figure out which responses are factual and which aren't," said Pratik Verma, co-founder and chief executive of Okahu, a company that helps businesses navigate the hallucination problem. "Not dealing with these errors properly basically eliminates the value of A.I. systems, which are supposed to automate tasks for you."
