Key Points

1. The study introduces Blending, an approach that integrates multiple chat AIs so that the combined system can match or outperform much larger models such as ChatGPT.

2. It examines the trend toward ever-larger large language models (LLMs) and the resulting computational demands, memory requirements, and access restrictions.

3. Blended is shown to outperform ChatGPT in user retention and engagement while requiring fewer computational resources.

4. The paper traces the development of chat AIs from rule-based algorithms to generative retrieval-based models and pre-trained transformer language models.

5. The importance of human feedback in training chat AIs and the methods used, such as reinforcement learning and reward models, are highlighted.

6. The paper proposes Blended as an innovative approach to combining outputs from black-box language models and compares different ensembling methods for generative language tasks.

7. A/B testing is used to compare Blended with its individual base chat AIs and with larger models such as GPT-3.5; Blended shows superior engagement and retention.

8. The study discusses the future potential of scaling up the number of component systems in Blended and of optimizing the selection distribution over those components.

9. The paper concludes that Blended improves chat AI quality without significantly increasing inference costs, making it a promising way to enhance chat AIs.

Summary

Research on Blended Language Models for Chat AI Systems
The research paper explores blending multiple large language models (LLMs) in chat AI systems as an alternative to a single large-scale LLM with a very high parameter count. The researchers introduce the "Blending" approach, which combines multiple moderately sized LLMs to match or potentially outperform much larger counterparts. Empirical evidence from large-scale A/B tests on the Chai research platform over thirty days shows that the blended ensemble outcompetes state-of-the-art LLMs in user retention, engagement, and entertainment, while requiring lower inference cost and memory overhead.
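
The following is a minimal sketch of how such a blended system could work, not the authors' implementation. It assumes that one component chat AI is drawn uniformly at random for each reply and that every component sees the full conversation history, including replies written by the other components. The names `make_stub`, `blended_reply`, and `run_conversation` are hypothetical, and the stub functions stand in for real moderately sized models.

```python
import random
from typing import Callable, List

# A chat AI is modeled as a function from conversation history to a reply.
ChatAI = Callable[[List[str]], str]


def make_stub(name: str) -> ChatAI:
    """Hypothetical stand-in for a real moderately sized chat model."""
    def respond(history: List[str]) -> str:
        return f"[{name}] reply to: {history[-1]}"
    return respond


def blended_reply(components: List[ChatAI], history: List[str],
                  rng: random.Random) -> str:
    """Pick one component uniformly at random and let it answer this turn."""
    model = rng.choice(components)
    return model(history)


def run_conversation(components: List[ChatAI], user_messages: List[str],
                     seed: int = 0) -> List[str]:
    """Alternate user messages and blended replies, sharing one history."""
    rng = random.Random(seed)
    history: List[str] = []
    for message in user_messages:
        history.append(message)
        # Only one component runs per turn, so the per-turn cost is that of
        # a single small model rather than the sum over all components.
        history.append(blended_reply(components, history, rng))
    return history


if __name__ == "__main__":
    bots = [make_stub("model_a"), make_stub("model_b"), make_stub("model_c")]
    for line in run_conversation(bots, ["Hi there", "Tell me a story"]):
        print(line)
```

Because the history is shared, each selected component conditions on what the others said earlier, which is one plausible reading of how the ensemble behaves as a single conversational partner.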

Traditional Trend in Conversational AI Research
Conversational AI research has traditionally focused on developing larger LLMs with more parameters, which increases computational demands and memory requirements. However, the study demonstrates that a combination of moderately sized LLMs using the Blending approach can offer comparable or better performance than a single large model. The researchers emphasize the practical implications of their findings: rather than scaling up systems to improve quality, blending multiple smaller open-source systems can significantly enhance a user's conversational experience without increasing inference costs.

Optimizing Selection Distribution with Deep-Learning Classifier
The study also proposes using a deep-learning classifier to optimize the selection distribution over the component chat AIs. This could yield a better-aligned distribution and allow new chat AIs to be added to the selection set without compromising the performance of the blended system. Overall, the paper demonstrates that the Blending approach offers a promising way to improve the quality of chat AIs while keeping inference costs at the level of smaller systems. The findings highlight the potential of collaboration among smaller LLMs to improve chat AI efficacy.
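
To illustrate what a non-uniform selection distribution means in practice, here is a small sketch under stated assumptions. The component names, the relative cost figures, and the probabilities are all hypothetical; in the setup described above the probabilities would come from a learned classifier, which is not modeled here. The functions `sample_component` and `expected_cost` are illustrative names, not part of any published code.

```python
import random
from typing import Dict


def sample_component(selection: Dict[str, float], rng: random.Random) -> str:
    """Draw one component name according to the selection weights."""
    names = list(selection)
    weights = [selection[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def expected_cost(selection: Dict[str, float], cost: Dict[str, float]) -> float:
    """Expected per-reply cost: the probability-weighted mean of component costs."""
    total = sum(selection.values())
    return sum(weight / total * cost[name] for name, weight in selection.items())


# Hypothetical relative inference costs per reply (arbitrary units).
cost = {"model_a": 1.0, "model_b": 2.0, "model_c": 2.0}

# Uniform selection versus a hypothetical tuned distribution.
uniform = {"model_a": 1.0, "model_b": 1.0, "model_c": 1.0}
tuned = {"model_a": 0.5, "model_b": 0.3, "model_c": 0.2}

if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_component(tuned, rng))
    print(expected_cost(uniform, cost))
    print(expected_cost(tuned, cost))
```

Whatever distribution is chosen, the expected cost is a weighted average of the component costs, so it can never exceed the cost of the most expensive component. A new chat AI can be trialed by giving it a small weight, which limits how much it affects the overall experience.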

Reference: https://arxiv.org/abs/2401.02994