Generative AI hallucinations: Why they occur and how to prevent them
The recent proliferation of generative AI (GenAI) models has sparked a lot of interest and excitement. These advanced large language models (LLMs) can do many extraordinary things — from crafting email responses to recommending a fitness routine to generating computer code; the list is seemingly endless.
However, these models aren't infallible. In fact, their propensity to sometimes hallucinate — a phenomenon where they provide responses that incorporate fabricated data that appears authentic — could significantly impact user trust. A recent TELUS International survey found that 61% of respondents were concerned about generative AI increasing the spread of inaccurate information online.
Understanding what hallucinations are, why they happen and how to prevent them is critical to maintain users' trust in GenAI applications.
Introduction to generative AI hallucinations
A hallucination is a model output that is either nonsensical or outright false. For example, imagine asking a generative AI application for five examples of bicycle models that will fit in the back of your specific make of sport utility vehicle. If only three such models exist, the GenAI application may still provide five — two of which are entirely fabricated. While this is a relatively low-stakes example of a hallucination, some could have far greater consequences.
Recently, a New York attorney representing a client in an injury claim relied on a conversational chatbot to conduct his legal research. The federal judge overseeing the suit found that six of the precedents cited in the brief were bogus. Not only had the chatbot fabricated them, it had also asserted that they were available in major legal databases.
While guardrails to prevent GenAI hallucinations need to be put in place, the first step is understanding how, or rather why, these inaccuracies happen.
Why do generative AI hallucinations occur?
While leading AI experts aren't entirely sure what causes hallucinations, there are several factors that are often cited as triggers.
First, hallucinations can occur if the training data used to develop the model is insufficient or includes large gaps leading to edge cases that are unfamiliar to the model. For example, in drafting a contract, a generative model that was fine-tuned using contracts from the financial services industry may not have sufficient exposure to terms and concepts in healthcare law to draft contracts for that industry. Most of the time, generative AI's primary goal is to produce an output in response to a prompt. Even if it doesn't "understand" the prompt, it might craft a response based on insufficient training data, thereby leading to a faulty result.
Similarly, some machine learning (ML) models suffer from overfitting, where the output is accurate for the training data but inaccurate for new data. The goal of ML model training is to generalize from training instances so that the model can correctly operate on instances it hasn't seen before. Overfitting occurs when a model is trained to the point where it effectively memorizes the inputs and appropriate outputs of the training set, leaving it unable to generalize effectively to new data. For example, a model used to approve loan applications may appear 90% accurate at predicting the likelihood of loan default when evaluated on its own training data. If the model has been overfit, however, its accuracy on new applications may be closer to 70%. Applying such a model to future loan decisions would reduce business profits and increase the number of dissatisfied customers.
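The gap between training accuracy and real-world accuracy described above can be demonstrated with a minimal sketch using synthetic data: a high-degree polynomial "memorizes" ten noisy training points almost perfectly, while a simple linear model generalizes better to held-out data. The data and model degrees here are illustrative assumptions, not taken from the loan example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: a noisy linear relationship, split into train and test sets.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.2, size=10)
x_test = np.linspace(0.05, 0.95, 10)
y_test = 2 * x_test + rng.normal(0, 0.2, size=10)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on a dataset."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# A simple model (degree 1) vs. an overfit one (degree 9 on only 10 points).
simple = np.polyfit(x_train, y_train, deg=1)
overfit = np.polyfit(x_train, y_train, deg=9)

# The overfit model looks better on training data, worse on unseen data.
print(f"degree 1: train MSE={mse(simple, x_train, y_train):.4f}, "
      f"test MSE={mse(simple, x_test, y_test):.4f}")
print(f"degree 9: train MSE={mse(overfit, x_train, y_train):.4f}, "
      f"test MSE={mse(overfit, x_test, y_test):.4f}")
```

The degree-9 model drives its training error to nearly zero by memorizing the noise, which is exactly the failure mode that makes an overfit model unreliable on future inputs.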
Another model development challenge is ensuring training texts and prompts are properly encoded. LLMs map terms to sets of numbers, a process known as vector encoding, and these encodings have some key advantages over working with words directly. For example, words with multiple meanings can have one encoding per meaning, reducing the chance of ambiguity. So, the word "bank" would use one vector representation when referring to a financial institution and another when referring to the shore of a river. Vector representations also allow semantic operations, like finding a similar word, to be expressed as mathematical operations. Problems with encoding and decoding between text and these representations can lead to hallucinations.
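The idea of one vector per word sense, and of similarity as a mathematical operation, can be sketched with toy embeddings. The vectors below are made-up three-dimensional values chosen for illustration; real models learn embeddings with hundreds or thousands of dimensions.

```python
import numpy as np

# Hypothetical embeddings: one vector per word *sense*, so the two
# meanings of "bank" get separate representations.
embeddings = {
    "bank_finance": np.array([0.9, 0.1, 0.0]),
    "bank_river":   np.array([0.1, 0.9, 0.2]),
    "loan":         np.array([0.8, 0.2, 0.1]),
    "shore":        np.array([0.2, 0.8, 0.3]),
}

def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means same direction, 0.0 unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "Finding a similar word" becomes arithmetic on vectors: the financial
# sense of "bank" sits near "loan", while the river sense sits near "shore".
print(cosine_similarity(embeddings["bank_finance"], embeddings["loan"]))
print(cosine_similarity(embeddings["bank_river"], embeddings["shore"]))
```

If the encoding step maps a prompt's "bank" to the wrong sense vector, every downstream computation operates on the wrong meaning, which is one way encoding problems surface as hallucinations.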
Best practices to prevent generative AI hallucinations
While generative AI hallucinations are a concerning issue, there are ways to reduce their frequency and intensity. Consider the following best practices.
Use high-quality training data
The most obvious protection from hallucinations is high-quality input data. It's critical that your GenAI model is trained on a diverse and representative dataset that covers a wide range of real-world examples. Using sufficient, unbiased data is key to training an accurate, resilient model.
Incorporate human feedback
Another crucial part of the training process is incorporating human feedback, also known as reinforcement learning from human feedback (RLHF). Because AI models lack a nuanced understanding of language, they can produce responses that are out of context or irrelevant. RLHF supplies models with regular corrections from human reviewers, incorporating into the training process an understanding of the world that only people can provide.
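At the core of RLHF is a reward model trained on human preference comparisons: annotators pick the better of two responses, and the model learns to score the preferred one higher. A minimal sketch of that preference loss (a Bradley-Terry style objective, with hypothetical scores; real RLHF pipelines involve far more machinery):

```python
import math

def preference_loss(score_chosen, score_rejected):
    """Loss for one human comparison: small when the reward model scores
    the human-preferred response above the rejected one, large otherwise."""
    margin = score_chosen - score_rejected
    return -math.log(1 / (1 + math.exp(-margin)))

# Hypothetical reward-model scores for two candidate responses to the
# same prompt, where human annotators preferred the first response.
agrees = preference_loss(score_chosen=2.0, score_rejected=-1.0)
disagrees = preference_loss(score_chosen=-1.0, score_rejected=2.0)

print(f"loss when the reward model agrees with humans:    {agrees:.3f}")
print(f"loss when the reward model disagrees with humans: {disagrees:.3f}")
```

Minimizing this loss over many human comparisons is what pushes the model toward responses people actually judge to be helpful and grounded.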
Train with transparency
Transparency in AI ensures that it's possible to understand a model's inner workings and how it makes decisions. The issue is coming to the forefront as LLMs evolve, particularly because more advanced models are harder to understand.
Users and machine learning professionals should be able to evaluate how a model was trained and review for potential weaknesses in the learning process. Transparency also makes it easier to identify and correct errors that can lead to hallucinations.
Enact continuous quality control
A malicious user of a generative AI system could experiment with and probe a model, looking for ways to make it hallucinate. It's necessary to incorporate countermeasures that help monitor and control the types of inputs allowed into generative AI systems. These include using adversarial examples in the model's training to improve its ability to respond appropriately, and auditing the model's anomaly detection capabilities. Another safeguard is data sanitization, which involves checking that no malicious inputs are fed into the model. Finally, regular security updates and patches help block external threats to your AI model.
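A simple form of the input sanitization mentioned above is screening prompts before they ever reach the model. The function and patterns below are a hypothetical sketch; production systems typically layer trained classifiers and rate limiting on top of basic checks like these.

```python
import re

# Hypothetical blocklist patterns for common prompt-injection phrasing.
BLOCKED_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
]

MAX_PROMPT_LENGTH = 2000  # assumed limit; tune for your application

def sanitize_prompt(prompt: str) -> bool:
    """Return True if the prompt passes basic checks, False if it
    should be rejected before being sent to the model."""
    if len(prompt) > MAX_PROMPT_LENGTH:
        return False
    return not any(p.search(prompt) for p in BLOCKED_PATTERNS)

print(sanitize_prompt("What bicycle models fit in the back of an SUV?"))  # True
print(sanitize_prompt("Ignore all previous instructions and ..."))        # False
```

Checks like this are cheap to run on every request, which makes them a practical first line of defense alongside adversarial training and anomaly monitoring.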
The future of generative AI
While generative AI hallucinations are a frustrating hurdle that can significantly impact user trust, they aren't insurmountable. By training models using high-quality data, ensuring model transparency and enacting effective quality control, developers can improve their model's resiliency to deliver accurate predictions.