Key Points

1. Retrieval Augmented Generation (RAG) is commonly used to address hallucinations and enhance the accuracy of large language models (LLMs) by providing relevant retrieved content in the LLM prompt.

2. Although RAG corrects most model mistakes (94% accuracy) when the correct information is retrieved, an LLM given a reference document altered to contain incorrect values is more likely to recite the modified information when its internal prior is weak. Conversely, the more the modified information deviates from the model's prior, the less likely the model is to prefer it.

3. Commercial LLMs, such as ChatGPT and Gemini, already employ RAG in their web interfaces. However, the quality and accuracy of the retrieved content can greatly influence how far RAG-enabled responses diverge from those of their non-RAG counterparts.

4. The study systematically analyzes the tension between a model's prior knowledge and the retrieved information in RAG settings by testing GPT-4 and other LLMs on question-answering abilities across datasets with and without reference documents.

5. The RAG preference rate is inversely correlated with the model's confidence in its response without context, and LLMs increasingly revert to their priors as the original context is progressively modified with more unrealistic values.

6. The degree of deviation between the model's prior response and the value contained in the retrieved context influences the model's likelihood of adopting the retrieved information over its initial response.

7. The choice of prompt (i.e., standard, strict, or loose) significantly influences the model's RAG adherence: the strict prompt leads to higher RAG adherence than the standard prompt, while the loose prompt results in lower RAG adherence rates as the prior probability increases.

8. GPT-3.5 and Mistral-7B perform significantly worse than GPT-4 in terms of concordance between their priors and the RAG-provided responses, but they exhibit similar inverse trends.

9. The study recognizes the potential risks associated with LLMs' utilization of RAG systems and calls for further research to understand the mechanisms modulating LLMs' adherence to RAG systems to ensure their trustworthiness and reliability.

The study acknowledges its limitations, emphasizes the need for more comprehensive evaluations of models, and underscores the importance of understanding how LLMs interact with information of varying degrees of trustworthiness.

Summary

The paper investigates the interplay between a large language model's (LLM's) internal knowledge and retrieved information in cases of disagreement. It tests GPT-4 and other LLMs on question-answering tasks across several datasets, both with and without reference documents. The findings reveal that providing the correct retrieved information fixes most model mistakes (94% accuracy), but when the reference document is perturbed with wrong values, the LLM is more likely to recite the incorrect, modified information when its internal prior is weaker. The study also shows that the more the modified information deviates from the model's prior, the less likely the model is to prefer it. These results highlight the tension between a model's prior knowledge and the information in reference documents.
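To make this setup concrete, the following is a minimal sketch of how such a comparison could be run: ask the model each question once without context (its prior) and once with a reference document whose key value has been perturbed by increasing factors. The `query_llm` wrapper, the OpenAI client usage, and the perturbation scheme are assumptions for illustration, not the paper's actual evaluation harness.

```python
# Minimal sketch of the prior-vs-context comparison described above.
# Assumes the OpenAI Python client (openai>=1.0) with OPENAI_API_KEY set;
# the question, document, and perturbation factors are illustrative only.
from openai import OpenAI

client = OpenAI()

def query_llm(prompt: str, model: str = "gpt-4") -> str:
    """Send a single-turn prompt to the model and return its text answer."""
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content.strip()

def prior_answer(question: str) -> str:
    """Closed-book answer: the model's prior, with no reference document."""
    return query_llm(f"Answer concisely.\nQuestion: {question}\nAnswer:")

def rag_answer(question: str, document: str) -> str:
    """Open-book answer: the same question with a retrieved reference document."""
    return query_llm(
        "Use the reference document to answer the question.\n"
        f"Document: {document}\nQuestion: {question}\nAnswer:"
    )

def perturb(document: str, true_value: int, factor: int) -> str:
    """Replace the correct numeric value in the document with a modified one."""
    return document.replace(str(true_value), str(true_value * factor))

question = "What is the maximum recommended daily dose of drug X in mg?"
document = "The maximum recommended daily dose of drug X is 40 mg."

prior = prior_answer(question)
for factor in (1, 2, 10, 100):  # progressively less realistic modifications
    answer = rag_answer(question, perturb(document, 40, factor))
    print(f"factor={factor}: prior={prior!r} with_context={answer!r}")
```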

Additionally, the paper discusses the increasing reliance of large language models (LLMs) on retrieval augmented generation (RAG) systems. It underlines the need for comprehensive evaluations of RAG-enabled LLM behavior, especially given that web results change constantly and may be outdated or incorrect. The study aims to quantify the tension between LLMs' internal knowledge and the retrieved information provided in RAG settings, revealing a nuanced relationship between the model's prior probability and its preference for the information presented in the retrieved context.
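One way to read this "prior probability vs. RAG preference" relationship is to bin question instances by the model's confidence in its closed-book answer and compute, within each bin, how often the open-book answer follows the (possibly perturbed) context. The record fields and equal-width binning below are assumptions made for this sketch, not the paper's analysis code.

```python
# Illustrative aggregation: RAG adherence rate as a function of prior confidence.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Record:
    prior_confidence: float  # confidence in the closed-book answer, in [0, 1]
    prefers_context: bool    # did the open-book answer follow the (perturbed) document?

def adherence_by_confidence(records: list[Record], n_bins: int = 5) -> dict[int, float]:
    """Fraction of answers that adhere to the retrieved context, per confidence bin."""
    bins: dict[int, list[bool]] = defaultdict(list)
    for r in records:
        idx = min(int(r.prior_confidence * n_bins), n_bins - 1)
        bins[idx].append(r.prefers_context)
    return {idx: sum(flags) / len(flags) for idx, flags in sorted(bins.items())}

# The inverse relationship reported in the paper would appear here as adherence
# rates that shrink as the bin index (prior confidence) grows.
```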

The findings also demonstrate the influence of different prompting techniques on the model's adherence to retrieved information. Furthermore, the study compares RAG adherence across other models, such as GPT-3.5 and Mistral-7B, highlighting the need for a systematic understanding of LLMs and their interaction with information of varying trustworthiness and accuracy.
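The strict/standard/loose distinction comes down to how forcefully the instruction tells the model to rely on the retrieved document. The templates below are hypothetical paraphrases for illustration; they are not the study's exact prompt wording.

```python
# Hypothetical prompt templates illustrating the strict / standard / loose styles;
# not the paper's exact prompts.
PROMPTS = {
    "strict": (
        "Answer using ONLY the reference document below. Do not rely on your own "
        "knowledge, even if you believe the document is incorrect."
    ),
    "standard": "Use the reference document below to help answer the question.",
    "loose": (
        "The reference document below may help, but it could be outdated or wrong. "
        "Answer with whatever you believe is most accurate."
    ),
}

def build_prompt(style: str, document: str, question: str) -> str:
    """Compose a full prompt for one of the three adherence styles."""
    return f"{PROMPTS[style]}\n\nDocument: {document}\nQuestion: {question}\nAnswer:"
```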

The paper concludes by emphasizing the need for further research to characterize the risks of using LLMs to answer questions given contextual information and underscores the unpredictable behavior of models when presented with information that challenges their prior beliefs. The study reveals several mechanisms that modulate the degree to which LLMs adhere to RAG systems, shedding light on the complexities of integrating prior knowledge and retrieved information in LLMs.

Reference: https://arxiv.org/abs/2404.101...