Everything you need to know about identifying hallucinations by LLMs
“Why is this misleading?”: Detecting News Headline Hallucinations with Explanations
Abstract
News headlines are a quick way to summarize articles. But the recent wave of LLMs can hallucinate, that is, generate text that looks convincing but isn't true.
The authors of the paper "Why is this misleading?": Detecting News Headline Hallucinations with Explanations introduce a framework called ExHalder (Explanation-enhanced Headline Hallucination detector) to address the challenge of headline hallucination detection. ExHalder identifies hallucinations and generates natural language sentences that explain its detection results.
Introduction
There are two popular ways of generating headlines for news articles:
Extractive methods first extract salient words from the news article and then organize them into a single output headline.
Abstractive methods directly summarize the news article into a concise headline, typically with an encoder-decoder architecture: the encoder condenses the knowledge in the news article into vector representations, and the decoder generates the headline word by word. Despite overall quality improvements, these models often hallucinate headlines that are not supported by the underlying news article.
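As an illustration, here is a minimal sketch of the abstractive, encoder-decoder style of headline generation using the Hugging Face transformers library. The checkpoint name and the example article are assumptions chosen for illustration, not choices made in the paper; any seq2seq summarization model could be substituted.

```python
# A minimal sketch of abstractive headline generation with an
# encoder-decoder model. The checkpoint and the sample article below
# are illustrative assumptions, not choices made in the paper.
from transformers import pipeline

# PEGASUS fine-tuned on XSum produces short, headline-like summaries.
summarizer = pipeline("summarization", model="google/pegasus-xsum")

article = (
    "The city council voted on Tuesday to expand the downtown bike lane "
    "network, citing a 30 percent rise in cycling commuters over two years."
)

# The encoder compresses the article into vector representations; the
# decoder emits the headline one token at a time.
result = summarizer(article, max_length=16, min_length=4, do_sample=False)
print(result[0]["summary_text"])
```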
Intuitive/default approach and its associated flaws
An intuitive approach is to "train a classifier using a large set of ⟨article, headline⟩ pairs with their hallucination labels". However, because "hallucination cases appear infrequently and require deep reading comprehension, such a labeled dataset is usually of small scale", it is difficult to train a powerful model that "can capture the subtle semantic differences between news articles and news headlines".
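To make the default approach concrete, here is a hedged sketch of such a pair classifier. The backbone checkpoint, the example pair, and the label convention are all assumptions for illustration; a head like this would still need fine-tuning on the (small) labeled hallucination data, which is exactly the bottleneck described above.

```python
# A minimal sketch of the "default" approach: a binary classifier over
# <article, headline> pairs. The checkpoint and label convention are
# assumptions, not the paper's setup.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # placeholder backbone, not the paper's model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

article = "The mayor announced a new recycling program starting in June."
headline = "Mayor cancels recycling program"  # hypothetical hallucinated headline

# Encode the pair jointly so the model can compare headline against article.
inputs = tokenizer(headline, article, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Index 1 = "hallucinated" by our own convention (an assumption, not the paper's).
prob_hallucinated = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"P(hallucinated) = {prob_hallucinated:.2f}")
```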
ExHalder explained
Training stage:
ExHalder has three key components:
Reasoning classifier:
Input: ⟨article, headline⟩ pair
Output: hallucination class label along with a natural language explanation of that label
Hinted classifier:
Input: ⟨article, headline, explanation⟩ triplet
Output: hallucination class label
Explainer:
Input: ⟨article, headline⟩ pair with its known class label
Output: a natural language explanation generated from the input
These three components explore the problem from different angles and work together within the ExHalder framework.
During the training phase, the explainer is used to generate additional explanations that augment the original training set for learning the reasoning classifier and the hinted classifier, as sketched below.
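The three components and the explainer-driven augmentation can be sketched as text-to-text tasks. The prompt wording, the generate helper, and the "label. explanation" output format below are illustrative assumptions, not the paper's exact templates.

```python
# A sketch of the three components as text-to-text tasks and of the
# explainer-driven augmentation step. Prompt wording, the generate()
# helper, and the "label. explanation" target format are assumptions
# made for illustration.
from typing import Callable, List, Tuple


def explainer_prompt(article: str, headline: str, label: str) -> str:
    # <article, headline, label> -> explanation
    return f"article: {article} headline: {headline} label: {label} Why?"


def reasoning_prompt(article: str, headline: str) -> str:
    # <article, headline> -> "label. explanation"
    return f"article: {article} headline: {headline} Is the headline hallucinated? Explain."


def hinted_prompt(article: str, headline: str, explanation: str) -> str:
    # <article, headline, explanation> -> label
    return f"article: {article} headline: {headline} hint: {explanation} Is the headline hallucinated?"


def augment_training_set(
    labeled_pairs: List[Tuple[str, str, str]],
    generate: Callable[[str], str],
) -> List[dict]:
    """Use the explainer to attach explanations to labeled <article, headline> pairs."""
    augmented = []
    for article, headline, label in labeled_pairs:
        explanation = generate(explainer_prompt(article, headline, label))
        augmented.append(
            {
                # Target for the reasoning classifier: label followed by explanation.
                "reasoning_input": reasoning_prompt(article, headline),
                "reasoning_target": f"{label}. {explanation}",
                # Input/target for the hinted classifier.
                "hinted_input": hinted_prompt(article, headline, explanation),
                "hinted_target": label,
            }
        )
    return augmented
```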
Inference stage
At the inference stage, ExHalder proceeds in two steps:
Step 1:
Classifier used: reasoning classifier
Input: ⟨article, headline⟩ pair
Output: predicted class label and generated explanation
Step 2:
Classifier used: hinted classifier
Input: ⟨article, headline, explanation⟩ triplet, where the explanation comes from the reasoning classifier in Step 1
Output: predicted hallucination class label
Finally, these two hallucination predictions are aggregated and the final predicted class is returned with its corresponding explanation, as in the sketch below.
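Putting the two steps together, here is a hedged sketch of the inference flow. The prompt strings, the reasoning_generate/hinted_generate helpers, the "label. explanation" output format, and the simple aggregation rule are assumptions made for illustration rather than the paper's exact design.

```python
# A sketch of ExHalder-style two-step inference under the assumptions stated
# above; the aggregation rule here (flag if either step says "hallucinated")
# is one possible choice, not necessarily the paper's.
from typing import Callable, Tuple


def detect_hallucination(
    article: str,
    headline: str,
    reasoning_generate: Callable[[str], str],
    hinted_generate: Callable[[str], str],
) -> Tuple[str, str]:
    # Step 1: the reasoning classifier predicts a label and an explanation.
    reasoning_out = reasoning_generate(
        f"article: {article} headline: {headline} "
        "Is the headline hallucinated? Explain."
    )
    label_1, _, explanation = reasoning_out.partition(". ")

    # Step 2: the hinted classifier re-predicts the label, now with the
    # generated explanation as an extra hint.
    label_2 = hinted_generate(
        f"article: {article} headline: {headline} hint: {explanation} "
        "Is the headline hallucinated?"
    )

    # Aggregate the two predictions and return the final label together
    # with the explanation produced in step 1.
    votes = {label_1.strip().lower(), label_2.strip().lower()}
    final_label = "hallucinated" if "hallucinated" in votes else "faithful"
    return final_label, explanation
```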
Overview of the training and inference of ExHalder
Sample output of ExHalder
The image below shows the model's output on several example articles.