Stop saying that ChatGPT “hallucinates”
- Large language models sometimes fabricate information and present it as fact — a phenomenon often called “hallucinating.”
- 13.8 columnist Adam Frank argues that this term risks conflating the operations of large language models with human cognitive processes.
- Frank argues that using the term “hallucinations” in this context stems from an unquestioned philosophical assumption that reduces experience to something akin to information processing.
We are at the dawn of a new era. Over the past year or so, new versions of artificial intelligence have emerged, leading some to claim machine sentience is around the corner. This is either taken to be a godsend that will lead humanity to a promised land of unimagined prosperity, or an existential threat ushering in unimaginable dangers. But behind both claims lies an often-unchallenged assumption about what we, as clearly sentient beings, are and what’s happening inside these technologies. Nowhere are these assumptions or hidden biases more on display than within claims that ChatGPT and similar programs “hallucinate.”
When people use the term “hallucination” regarding chatbots, they are referring to their remarkable capacity to “make up” answers to questions. If you ask a chatbot to write a paper on a topic and include references, some of the citations it provides might be complete fabrications. The sources simply don’t exist and never did exist. When biographical information is requested on a real person, a chatbot may return details that are simply false, as if it plucked them out of thin air and presented them as facts. This ability to generate incorrect information is what’s referred to as a hallucination. But the use of that term in that context betrays a dangerous misconception about the state of artificial intelligence and its relation to what human beings do when they hallucinate.
The preconditions for hallucinating
Query the web for a definition of hallucination and you’ll get something like this: “A hallucination is a false perception of objects or events involving your senses: sight, sound, smell, touch, and taste. Hallucinations seem real, but they’re not.” The important point here is the link to perception. To hallucinate means you must first perceive; you must already be embodied in the world. To use a term from phenomenology, you must already be embedded in a rich and seamless “lifeworld” that carries multiple contexts about how that world operates. But what’s happening within a large language model (LLM) chatbot is not even close to what’s happening in you.
The current version of these technologies is based on statistics, not embodied experience. Chatbots are trained on vast amounts of data, usually drawn from the internet, that allow them to develop statistical correlations between symbols, such as the letters that make up words. The computational power required to create these vast webs of statistical correlations and instantly draw on them to deliver answers to a question, like “Does Taylor Swift have a puppy?”, is mind-boggling. The fact that they work as well as they do is, absolutely, a technological achievement of the highest order. But that success masks an important limitation, one that remains hidden when their failures get labeled as hallucinations.
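To make the point concrete, here is a deliberately tiny sketch in Python. It is not how ChatGPT is actually built (modern models use deep neural networks with billions of parameters rather than a lookup table, and the toy corpus below is invented purely for illustration), but it shows what “predicting the next symbol from statistical correlations” means in practice, and why the output can be fluent without being true:

```python
# A toy "next-word predictor": it learns only which words tend to follow
# which, with no model of the world behind the words. Real LLMs are far
# more sophisticated, but the underlying principle, predicting the next
# symbol from statistical correlations in training text, is the same in
# spirit. The corpus here is made up for illustration.
import random
from collections import defaultdict, Counter

corpus = (
    "taylor swift has a cat . taylor swift has a tour . "
    "the dog has a ball . the cat has a toy ."
).split()

# Count how often each word follows each other word (bigram counts).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def generate(start, length=8):
    """Emit words by repeatedly sampling a statistically likely next word."""
    words = [start]
    for _ in range(length):
        candidates = following.get(words[-1])
        if not candidates:
            break
        choices, weights = zip(*candidates.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("taylor"))
# Possible output: "taylor swift has a ball . the cat has"
# Fluent-looking, statistically plausible, and entirely unmoored from any
# fact about whether Taylor Swift has a ball, a cat, or a puppy.
```

Nothing in that little program represents Taylor Swift, dogs, or puppies. It only represents which words tend to follow which, and that is the sense in which its confident answers are not statements about the world at all.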
An LLM is not hallucinating when it gives you a made-up reference in the paper on Roman history you just asked it to write. Its mistake is not a matter of making a false statement about the world because it doesn’t know anything about the world. There is no one in there to know anything about anything. Just as important: The LLM does not, in any way, inhabit a world in which there are mistakes to be made. It is simply spitting out symbol strings based on a blind statistical search through a vast hyperdimensional space of correlations. The best that can be said is that we make the mistake about the world when we use these machines we’ve created to answer questions.
Blind spots
So, why is calling these statistical errors hallucinations so misguided? Next month, physicist Marcelo Gleiser, philosopher Evan Thompson, and I will publish a new book called The Blind Spot: Why Science Cannot Ignore Human Experience. I’ll be writing more about its central argument over the coming months, but for today we can focus on that subtitle. To be human, to be alive, is to be embedded in experience. Experience is the precondition, the prerequisite, for everything else. It is the ground that allows all ideas, conceptions, and theories to even be possible. Just as important is that experience is irreducible. It is what’s given; the concrete. Experience is not simply “being an observer” — that comes way downstream when you have already abstracted away the embodied, lived quality of being embedded in a lifeworld.
Talking about chatbots hallucinating is exactly what we mean by the “blind spot.” It’s an unquestioned philosophical assumption that reduces experience to something along the lines of information processing. It substitutes an abstraction, made manifest in a technology (which is always the case), for what it actually means to be a subject capable of experience. As I have written before, you are not a meat computer. You are not a prediction machine based on statistical inferences. Using an abstraction like “information processing” may be useful in a long chain of other abstractions whose goal is isolating certain aspects of living systems. But you can’t jam the seamless totality of experience back into a thin, bloodless abstraction squeezed out of that totality.
There will be a place for AI in future societies we want to build. But if we are not careful, if we allow ourselves to be blinded to the richness of what we are, then what we build from AI will make us less human as we are forced to conform to its limitations. That is a far greater danger than robots waking up and taking over.