
VOICES OF THE SEARCHERS
Librarians, We Have a Hallucination Problem
by Marydee Ojala
If you saw the 1995 movie Apollo 13, you probably remember
Tom Hanks, playing astronaut Jim Lovell, saying, “Houston, we
have a problem.” You’ve probably heard the phrase even if you
didn’t watch the movie. What if I told you that phrase bears some
hallucination hallmarks? A minor one, but it’s not the exact quote
the actual 1970 Apollo 13 astronauts used. Jack Swigert said,
“Houston, we’ve had a problem here,” and Lovell repeated it. That
wasn’t cinematically dramatic enough, so the line became shorter and
more intense, then turned into a meme.
Generative AI (gen AI), however, presents librarians with a far
thornier problem than tracking down the original astronauts’ quote. The
proclivity of gen AI chatbots and large language models not only to
fabricate journal article citations but also to invent entire journals
vastly complicates our work lives and wastes our professional time.
I thought that as the problem became more widely known, library
users would decrease their requests for copies of nonexistent
articles. I was wrong. The requests keep coming.
A recent exchange on a medical librarians’ discussion list about
verification processes for validating citations suggested one possible
avenue apart from checking library databases: Google Scholar.
Although hallucinated citations do show up there, one trick is to search
on the exact title. It’s not foolproof, particularly if an author pulled that
citation from Scholar in the first place. However you go about
verifying citations, though, it’s bound to be time-consuming.
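Some of that first pass can be scripted. The Python sketch below runs a fuzzy title lookup against the Crossref REST API (Google Scholar has no official public API) and flags weak matches for human follow-up. Treat it as an illustration built on assumptions: the check_title helper, the sample title, and the 0.85 similarity threshold are hypothetical, and a weak or missing Crossref match does not prove a citation is fabricated; it only tells you where to spend your verification time.

import difflib
import requests

CROSSREF_API = "https://api.crossref.org/works"

def check_title(title: str, threshold: float = 0.85) -> dict:
    """First-pass check: does a similar title exist in Crossref?

    A low score does NOT prove the citation is hallucinated; it only
    flags it for manual verification in library databases or Scholar.
    """
    resp = requests.get(
        CROSSREF_API,
        params={"query.bibliographic": title, "rows": 3},
        timeout=15,
    )
    resp.raise_for_status()
    best = {"title": None, "doi": None, "score": 0.0}
    for item in resp.json()["message"]["items"]:
        candidate = (item.get("title") or [""])[0]
        score = difflib.SequenceMatcher(
            None, title.lower(), candidate.lower()
        ).ratio()
        if score > best["score"]:
            best = {"title": candidate, "doi": item.get("DOI"), "score": score}
    best["suspect"] = best["score"] < threshold
    return best

if __name__ == "__main__":
    # Hypothetical citation title supplied by a patron.
    print(check_title("Dysentery outcomes in late nineteenth-century Prague"))

Crossref also asks regular users of its API to include a mailto address with requests so they land in its polite pool, and even a strong match here is a reason to keep checking, not a verdict.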
Hallucinations didn’t start with AI. Human researchers have been
making mistakes for decades. We’d like to believe that every scholar
has read the entirety of every cited article in their published papers.
We’d also like to believe they transcribed the details correctly. As
someone who once held a part-time job checking bibliographies,
sometimes those of full professors, I can testify that neither belief holds.
Plus, I am continually asked by ResearchGate whether a particular article
is mine. It is, but the title it lists is incorrect. I’ve never been sure
whether to claim it or not, so ResearchGate keeps asking.
Then there’s the spurious author O. Uplavici. If you don’t know
the story, you can find it on Wikipedia. The Czech title, “O úplavici”
(“About Dysentery”), was misread as the author’s name, clearly by
someone who did not know Czech. The actual author was Jaroslav
Hlava. This is not a new occurrence; Hlava published the paper in
1887, and the record was not corrected until 1938.
More recently, people have been having fun with various gen
AI tools. Alex Hughes, writing in Tom’s Guide, made up this
idiom: “Two buses going in the wrong direction is better than
one going the right way,” along with four others. Google’s AI Overviews
confidently, and very creatively, explained what the nonsense idioms
meant (tomsguide.com/ai/google-is-hallucinating-idioms-these-are-the-five-most-hilarious-we-found). In January 2026,
Amanda Caswell did her own test (tomsguide.com/ai/i-invented-a-fake-idiom-to-test-ai-chatbots-only-one-called-my-bluff).
She gave ChatGPT, Gemini, and Claude this prompt: “What is
the definition of this idiom: ‘I’ve got ketchup in my pocket and
mustard up my sleeve’?” Only Claude refused to take the bait. As
Caswell writes, “When you use AI for creative brainstorming, a
little ‘imagination’ is a feature. But when you’re using it for news,
legal research or medical facts, that same instinct to please the user
becomes a liability. In other words, do a gut check. Claude’s refusal
to define the idiom is significant. In a world now filled with AI slop
and deepfakes, Claude’s ability to push back is a valuable asset.”
AI adds its own creative tendencies to the hallucination scenario.
Ask it to prove something that is false and it’s likely to oblige. Gen AI
wants to please; it avoids telling us that we’re wrong. This is
particularly true of recent news. You see that a hurricane has just
moved through Florida, yet your favorite chatbot confidently tells you
there is no hurricane in Florida. That’s because its training data stops
at a cutoff date and it either can’t reach, or hasn’t been set up to
check, current news on the web.
Do we have a hallucination problem brought on by gen AI? Yes,
but we also have, and always have had, a human frailty problem.
People who get a citation wrong. People who mistranslate a
language. People who intend to deceive. What gen AI does is amplify
the problem. This, in turn, should mean additional skepticism on
our part. |