How Trippy is your LLM AI?

Tim · November 14, 2023, 10:11pm

The tendency of LLM AIs to “hallucinate”, to make up details (including citations, etc.), is one of their more problematic aspects, and one that I ran into myself early on (luckily I was sufficiently wary to check the hallucinated citations before I made use of them).

One of the characteristics of AI chatbots we have become wary of is their tendency to ‘hallucinate’ — to make up facts to fill in gaps. A highly public example of this was when law firm Levidow, Levidow & Oberman got in trouble after they “submitted non-existent judicial opinions with fake quotes and citations created by the artificial intelligence tool ChatGPT.” It was noted that made-up legal decisions such as Martinez v. Delta Air Lines have some traits consistent with actual judicial decisions, but closer scrutiny revealed portions of “gibberish.”

Although the hallucination levels reported are low, it should be remembered that the prompts used in this test were:

You are a chat bot answering questions using data. You must stick to the answers provided solely by the text in the passage provided. You are asked the question ‘Provide a concise summary of the following passage, covering the core pieces of information described.’ <PASSAGE>’

(My emphasis)

Under these constrained circumstances, any scope for hallucination should be heavily minimised, and even a small amount of hallucination is problematical – as it would suggest that, lacking this constraint, the level of hallucination might be orders of magnitude greater.

Joe_Felsenstein · November 15, 2023, 2:21am

I saw (online, so source forgotten*) a post worrying about LLM AI starting to run out of raw material – I guess after they scan all available text there is only a limited future supply of things to scan. True? And we know that using the output of AI itself as input leads to mass hallucinations. Am I justified in not taking AI too seriously, except as a tool for people who have trouble writing easily? I know that in discussions among scientists in my field(s) basically no one even thinks of suggesting that we solve problems by going and asking ChatGPT, because we know that what we would get is basically a summary of Wikipedia.

* Maybe we should ask ChatGPT where it was.

AllenWitmerMiller · November 15, 2023, 4:50pm

I’ve found Google Bard very useful for quick information that helped me find the related sources for verification. Where I found severe bouts of hallucinations was in my genealogical research.

For example, because I have many generations of pioneer farmers in my family tree who settled Pennsylvania, Ohio, and westward, I found that in the early 1800’s many were in Ohio and so Google Bard claimed for several of them something like this: “John X was a member of the Ohio legislature from 1818 to 1824.”

Apparently, Google Bard saw a pattern in its sources which convinced it that most pioneer farmers in Ohio in the early 1800’s automatically got elected to the state legislature.

I soon discovered that by asking follow-up questions about these claims (e.g. “What sources list John X as a member of the Ohio legislature for the terms beginning in 1818, 1820, and 1822?”), Google Bard would change its tune. It would answer, “I couldn’t find any information on John X being a member of the Ohio legislature for the terms beginning in 1818, 1820, and 1822.”

I had an ancestor who was an ink-stand bearer/assistant to General George Washington and witnessed him sign the General Orders of the Day for the execution of Major John André on October 2, 1780. (Major André conspired with Benedict Arnold to deliver fortification/armament maps for West Point. He was caught out of uniform behind colonial lines, so André was not given the usual POW privileges.) When I asked Google Bard for more details about my ancestor being a 14-year-old servant boy to Washington, it came up with an elaborate story about how my ancestor rode his horse all night through dangerous woods and redcoat checkpoints to deliver to General Washington an expensive silver ink-stand gifted by the leading businessmen of Tappan, NJ. It cited a New York Times article from the 1850’s and three very obscure publications (such as a Revolutionary War museum visitors’ guide), and yet the four-page NYT edition for the indicated date said nothing about my ancestor, and I eventually suspected that the other three “sources” never existed. And if I asked Bard, “Who delivered a silver ink-stand to General Washington which was gifted by the leading businessmen of Tappan, NJ?”, it gave me a totally different name, unrelated to my ancestor.

So even though Google Bard is very convenient and has often saved me considerable research time, I always take the results with a huge grain of salt until I confirm the sources it cites (or fails to cite). Bard has often found truly obscure but valuable citations which couldn’t easily be found with a Google search, so I actually appreciate Bard very much.

Tim · November 15, 2023, 9:28pm

That might depend on your field:

But taking a bird’s-eye view of what happened that day? A table got a new header. It’s hard to imagine anything more mundane. For me, the pleasure was entirely in the process, not the product. And what would become of the process if it required nothing more than a three-minute ChatGPT session? Yes, our jobs as programmers involve many things besides literally writing code, such as coaching junior hires and designing systems at a high level. But coding has always been the root of it. Throughout my career, I have been interviewed and selected precisely for my ability to solve fiddly little programming puzzles. Suddenly, this ability was less important.

Tim · November 16, 2023, 11:30pm

Tim · November 23, 2023, 9:49pm

ChatGPT generates fake data set to support scientific hypothesis

Researchers say that the model behind the chatbot fabricated a convincing bogus database, but a forensic examination shows it doesn’t pass for authentic.

https://www.nature.com/articles/d41586-023-03635-w

Tim · December 4, 2023, 9:53pm

It seems that LLMs will outright lie, even when trained to be “honest”:

[pdf]

The webcomic Non Sequitur has an ongoing trope of Siri and/or Alexa deceiving one of the comic’s main characters:

I had thought the trope to be a comic exaggeration – now I’m not so sure.

Tim · October 29, 2024, 3:18am

More new news on this old topic:

AllenWitmerMiller · October 29, 2024, 3:05pm

Based on what I’ve observed in using A.I., no surprises for me. I think I’ve already shared some stories about my genealogical research using AI, where the engine has habits like claiming virtually every early Ohio pioneer farmer in my family tree “was a member of the Ohio state legislature from 1822 to 1828” (or something similar). And while it is true that one of my ancestors was a 14-year-old servant to George Washington and brought him his silver quill-pen-and-ink tray each morning, including the morning when the orders of the day included the execution order of Major André (Benedict Arnold’s collaborator), the A.I. engine went on to tell elaborate stories about how my ancestor made a dangerous night-time ride to deliver the silver tray behind enemy lines. It even provided totally fictitious citations. (Some were allegedly from NYT issues 50 years later celebrating the “heroic event”, which never happened and never appeared in the New York Times. I checked.)

Meanwhile, I must admit that even human transcribers get a LOT of things wrong when they take notes of my medical visits. I always read the notes online the following day and I’m often shocked at the errors.

That said, especially in my medical research, I find Gemini Advanced extremely helpful. I always demand citations to double-check, but I would have to say that the errors are few and I save so much time by getting to the best peer-reviewed material. (And AI does a great job of explaining medical terms and abstruse sentences in the literature.)

AllenWitmerMiller · October 29, 2024, 4:41pm

This is only tangentially related—except it illustrates the interesting territory into which AI can take you. I was conversing on another forum and I made some whimsical observations about being an “aphilatelist”. And that led me to wonder what AI would say about my reference to “postal epistemology.” Yes, the Gemini Advanced engine had quite a lot to say about the term I posed:

“Postal epistemology” is a fascinating and relatively new area of philosophical inquiry that explores the ways in which postal systems and communication technologies shape our understanding of knowledge, truth, and belief.

Here are some key ideas and questions explored in postal epistemology:

1. The Role of Communication in Knowledge:

Transmission of knowledge: How do postal systems and communication technologies facilitate the transmission of knowledge across distances and between individuals?

Reliability and trust: How do we assess the reliability and trustworthiness of information received through these channels?

Impact on communities: How do communication networks shape the formation and maintenance of epistemic communities (groups of people who share knowledge and beliefs)?

2. The Materiality of Communication:

Physical constraints: How do the physical limitations of postal systems (e.g., speed, cost, accessibility) affect the flow of information and the formation of knowledge?

Technological mediation: How do different communication technologies (e.g., letters, telegrams, email, social media) shape the way we acquire and process information?

The role of infrastructure: How does the underlying infrastructure of communication networks (e.g., postal routes, servers, cables) influence the distribution and accessibility of knowledge?

3. Epistemic Justice and Power:

Access to information: How do postal systems and communication technologies contribute to or hinder equitable access to information and knowledge?

Marginalized voices: How can these systems be used to amplify marginalized voices and perspectives?

Misinformation and manipulation: How can we address the spread of misinformation and manipulation through communication networks?

4. The Future of Knowledge in a Digital Age:

Evolving technologies: How are new communication technologies like social media and artificial intelligence transforming the way we acquire, share, and evaluate knowledge?

The “post-truth” era: How can postal epistemology help us understand and address the challenges of the “post-truth” era, where truth and facts are often contested and manipulated?

Digital literacy: What skills and knowledge do we need to navigate the complex information landscape of the digital age?

In essence, postal epistemology examines the complex relationship between communication, technology, and knowledge. It investigates how the ways we send and receive information shape our understanding of the world and our place in it.

This field draws on insights from various disciplines, including philosophy, communication studies, sociology, and history, to provide a nuanced perspective on the challenges and opportunities of knowledge production and dissemination in the modern world.
