Key Points

1. Dense retrieval has proven effective across many tasks and languages, but building effective zero-shot dense retrieval systems without relevance labels remains difficult.

2. The proposed Hypothetical Document Embeddings (HyDE) approach addresses this challenge by using a generative language model to create a hypothetical document that captures relevance patterns, and an unsupervised contrastively learned encoder to encode that document into an embedding vector for retrieval (a runnable sketch follows this list).

3. HyDE significantly outperforms the state-of-the-art unsupervised dense retriever Contriever and performs comparably to fine-tuned retrievers across various tasks and languages.

4. The paper discusses the challenges of zero-shot dense retrieval and the limitations of relying on large supervised datasets in transfer-learning setups.

5. Modern deep learning methods, such as large language models and text encoders pre-trained with contrastive objectives, are leveraged to address the lack of supervision in zero-shot dense retrieval systems.

6. HyDE builds on unsupervised contrastive learning and uses an instruction-following language model to generate hypothetical documents for retrieval, without relying on real relevance labels.

7. Experimental results show that HyDE outperforms baseline models such as Contriever and mContriever across various retrieval tasks and languages, even in low-resource settings.

8. The study evaluates the impact of different language models and fine-tuned encoders on the performance of HyDE and explores its practical use in real-world search scenarios.

9. The paper concludes by raising questions about the role of numerical relevance scores in retrieval models and suggesting that HyDE offers a new paradigm of interactions between language models and dense retrievers for practical use in search systems.
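To make points 2 and 6 concrete, here is a minimal Python sketch of the HyDE pipeline. The generate_hypothetical_docs and encode functions are toy stand-ins for an instruction-following language model (e.g., InstructGPT) and an unsupervised contrastive encoder (e.g., Contriever); the averaging of several sampled documents together with the query mirrors the paper's construction of the search vector, but everything else here is an illustrative assumption rather than the authors' code.

```python
import hashlib
import numpy as np

def generate_hypothetical_docs(query: str, n: int = 4) -> list[str]:
    """Stand-in for an instruction-following LM such as InstructGPT.

    A real system would sample n hypothetical answer passages from the
    model; canned strings keep this sketch self-contained and runnable.
    """
    return [f"Hypothetical passage {i} answering: {query}" for i in range(n)]

def encode(text: str, dim: int = 768) -> np.ndarray:
    """Stand-in for an unsupervised contrastive encoder such as Contriever.

    A real encoder maps text to a dense vector; here we derive a
    deterministic pseudo-random unit vector from a hash of the text.
    """
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

def hyde_search(query: str, corpus_vectors: np.ndarray, k: int = 10) -> np.ndarray:
    """Return indices of the top-k corpus documents for a query, HyDE-style."""
    # 1. Generative step: sample hypothetical documents that "answer" the query.
    hypo_docs = generate_hypothetical_docs(query)
    # 2. Encoding step: embed the hypothetical documents (and the query),
    #    then average them into a single search vector.
    vectors = [encode(d) for d in hypo_docs] + [encode(query)]
    search_vector = np.mean(vectors, axis=0)
    # 3. Document-document similarity: inner-product search against the
    #    precomputed corpus embeddings.
    scores = corpus_vectors @ search_vector
    return np.argsort(-scores)[:k]

# Toy corpus of 1000 documents with random unit embeddings.
corpus = np.random.default_rng(0).standard_normal((1000, 768))
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
print(hyde_search("what causes ocean tides?", corpus, k=5))
```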

Summary

Novel Approach to Building Effective Zero-Shot Dense Retrieval Systems
The research paper presents a novel approach to building fully zero-shot dense retrieval systems that require no relevance supervision, work out of the box, and generalize across tasks. The proposed method, named Hypothetical Document Embeddings (HyDE), decomposes dense retrieval into two tasks: a generative task performed by an instruction-following language model and a document-document similarity task performed by a contrastive encoder. HyDE significantly outperforms previous state-of-the-art systems on 11 query sets covering tasks such as Web Search, Question Answering, and Fact Verification, as well as multiple languages.
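As an illustration of the generative half of this decomposition, the instruction given to the language model can be a simple zero-shot prompt asking for a passage that answers the query; the wording below is a plausible example, not a verbatim prompt from the paper.

```python
def build_web_search_prompt(query: str) -> str:
    # Illustrative zero-shot instruction; the paper tailors the instruction
    # to each task (web search, question answering, fact verification, ...).
    return (
        "Please write a passage to answer the question.\n"
        f"Question: {query}\n"
        "Passage:"
    )

print(build_web_search_prompt("what causes ocean tides?"))
```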

Challenges of Creating Zero-Shot Dense Retrieval Systems

The paper acknowledges the difficulty of building effective fully zero-shot dense retrieval systems when no relevance labels are available. HyDE first instructs an instruction-following language model to generate a hypothetical document that captures relevance patterns. A contrastive encoder then encodes this hypothetical document into an embedding vector; the encoder's dense bottleneck filters out incorrect details, and similar real documents are retrieved based on vector similarity in the corpus embedding space.
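One intuition for why this works even though the generated document may contain hallucinated details: each hypothetical document behaves like a noisy version of an ideal relevant document, and averaging the embeddings of several samples cancels part of that noise. The toy numpy experiment below (an illustration, not from the paper) shows the effect.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_samples = 768, 8

def unit(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

# Embedding of an "ideal" relevant document.
target = unit(rng.standard_normal(dim))

def noisy_view() -> np.ndarray:
    # A hypothetical document: the ideal signal plus generation noise of
    # comparable magnitude (incorrect or hallucinated details).
    return unit(target + unit(rng.standard_normal(dim)))

samples = np.array([noisy_view() for _ in range(n_samples)])

single = float(samples[0] @ target)                    # one sample
averaged = float(unit(samples.mean(axis=0)) @ target)  # HyDE-style average

print(f"cosine(single sample, target)  = {single:.3f}")
print(f"cosine(average of {n_samples}, target) = {averaged:.3f}")
# The averaged vector points markedly closer to the ideal document's embedding.
```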
In experiments, HyDE, using InstructGPT and Contriever as backbone models, significantly outperformed the previous state-of-the-art unsupervised dense retriever Contriever and showed strong performance across a wide range of tasks and languages.

Addressing Challenges of Zero-Shot Dense Retrieval
The paper also discusses the challenges of zero-shot dense retrieval and the difficulty of training dense retrieval systems without relevance labels. It explores the use of unsupervised contrastive learning and generative language models to address these challenges, showing promising results compared to fine-tuned models across different retrieval tasks.
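For context on the "unsupervised contrastive learning" behind encoders like Contriever: such encoders are typically trained with an InfoNCE-style objective that pulls two views of the same document together and pushes other documents in the batch apart. The numpy sketch below computes that loss for precomputed embeddings; it is a generic illustration of this objective family, not Contriever's actual training code.

```python
import numpy as np

def info_nce_loss(queries: np.ndarray, keys: np.ndarray, tau: float = 0.05) -> float:
    """InfoNCE over a batch: queries[i] should match keys[i].

    queries, keys: (batch, dim) L2-normalized embeddings of two views
    (e.g., two random spans) of the same documents.
    """
    logits = (queries @ keys.T) / tau            # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal; the rest of each row are negatives.
    return float(-np.mean(np.diag(log_probs)))

# Toy usage: 16 documents, two noisy "views" each.
rng = np.random.default_rng(0)
base = rng.standard_normal((16, 64))
q = base + 0.1 * rng.standard_normal((16, 64))
k = base + 0.1 * rng.standard_normal((16, 64))
q /= np.linalg.norm(q, axis=1, keepdims=True)
k /= np.linalg.norm(k, axis=1, keepdims=True)
print(f"InfoNCE loss: {info_nce_loss(q, k):.3f}")
```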

New Paradigm of Interactions between Language Models and Dense Retrievers
Overall, the paper introduces a new paradigm of interactions between language models and dense encoders/retrievers, demonstrating that relevance modeling and instruction understanding can be delegated to more powerful and flexible language models, removing the need for relevance labels. The authors propose that HyDE can be used in practice at the beginning of a search system's life, offering performance comparable to fine-tuned models, before a supervised dense retriever is gradually rolled out as the search log grows.

Reference: https://arxiv.org/abs/2212.10496