Key Points
1. The paper establishes that models trained on synthetic data passively inherit properties of the data-generating model, and it systematically studies the consequences of integrating synthetic data.
2. The paper introduces a comprehensive toolkit for automatically monitoring LLMs' latent characteristics during training.
3. The paper finds that passive property inheritance from synthetic data impacts model behavior preferences when used as evaluators.
4. The paper proposes "active inheritance" as a mechanism for steering synthetic data curation towards desirable properties.
5. The paper demonstrates that strategic gathering and curation of synthetic data can significantly amplify desired characteristics like length and lexical diversity, and reduce undesired ones like toxicity.
6. The paper finds that multi-source sampling, where generations are sampled from multiple teacher models, is more effective than single-source sampling for steering model behavior.
7. The paper highlights that models are surprisingly sensitive to certain attributes in synthetic data, even when the prompts appear neutral.
8. The paper shows that training on data distilled from a model does not necessarily lead to replicating the model's profile, and can sometimes have the opposite effect.
9. The paper suggests that the success of active inheritance is limited by the quality of the sample pool, and that employing a diverse set of teacher models maximizes the chance of obtaining samples with the desired characteristics.
Summary
The research paper investigates the impact of synthetic data on the properties and behaviors of large language models (LLMs). The study characterizes passive inheritance of model properties by extensively analyzing the consequences of integrating synthetic data. The findings show that models are remarkably sensitive to certain attributes even when the synthetic data prompts appear "neutral." The research then asks whether this sensitivity can be exploited for positive purposes. To answer this question, the concept of "active inheritance" is introduced, wherein synthetic data is intentionally constrained according to a non-differentiable objective in order to steer the generation profiles of models towards desirable attributes.
Historical Challenges and Recent Efforts in Data Optimization
The paper discusses the historical challenges associated with high-quality labeled data, such as scarcity and financial cost, and how recent efforts have focused on optimizing existing data through data augmentation, auxiliary data fields, data weighting, data pruning, and curriculum learning. These methods, however, largely rely on enhancing existing "fixed" datasets and are limited in their ability to introduce new properties or explicitly optimize for task-specific metrics.
Understanding Passive Inheritance of Model Properties
The findings highlight the need to understand the passive inheritance of model properties and preferences when models are trained on synthetic data from various sources. The paper investigates how different attributes are transferred across models via synthetic data and how these changes manifest both in LLMs' generations and in their preferences when used as evaluators. The study demonstrates that models trained on synthetic data are sensitive to passive property inheritance and that this inheritance shapes their behavior as evaluators.
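To make the notion of a generation profile concrete, the sketch below shows one plausible way to measure a model's generations along simple attributes such as length and lexical diversity before and after finetuning on synthetic data. The metrics, function names, and the generate() helper are illustrative assumptions, not the paper's exact toolkit.

```python
# Minimal sketch: profile a model's generations along simple textual
# attributes (length, lexical diversity) to compare a base model against
# the same model after finetuning on synthetic data. The metrics here are
# crude stand-ins for the paper's fuller attribute suite.
from statistics import mean

def length_in_tokens(text: str) -> int:
    # Whitespace tokenization as a rough proxy for model tokenization.
    return len(text.split())

def type_token_ratio(text: str) -> float:
    # Simple lexical-diversity proxy: unique tokens / total tokens.
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def profile(generations: list[str]) -> dict[str, float]:
    # Aggregate attribute scores over a set of model generations.
    return {
        "avg_length": mean(length_in_tokens(g) for g in generations),
        "avg_lexical_diversity": mean(type_token_ratio(g) for g in generations),
    }

# Usage idea: compare the profile of a base model with that of the same model
# after finetuning on another model's synthetic data, to see which attributes
# were passively inherited (generate() is a hypothetical helper).
# base_profile  = profile(generate(base_model, eval_prompts))
# tuned_profile = profile(generate(finetuned_model, eval_prompts))
```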
Proposing Active Inheritance for Synthetic Data Curation
The concept of active inheritance is proposed as a mechanism for steering synthetic data curation towards desirable properties. The study shows that by strategically gathering and curating synthetic data, it is possible to significantly amplify desired characteristics and reduce undesired ones. The paper highlights successful instances of actively steering model behavior to amplify desired attributes, such as increasing length and lexical diversity while decreasing toxicity. Additionally, the study explores the implications of using synthetic data to mitigate undesirable characteristics.
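As a concrete illustration of active inheritance, the following sketch curates a finetuning set by targeted (best-of-n) sampling: candidate completions are drawn from one or more teacher models, and only the completion that best satisfies a non-differentiable objective is kept. The function names and signatures are assumptions made for illustration, not code from the paper.

```python
# Illustrative sketch of active inheritance via targeted sampling:
# for each prompt, draw several candidate completions from one or more
# teacher models, score them with a non-differentiable objective
# (e.g., length, lexical diversity, or negated toxicity), and keep only
# the best-scoring completion for finetuning.
from typing import Callable

def curate_targeted_dataset(
    prompts: list[str],
    teachers: list[Callable[[str], str]],   # each teacher maps prompt -> completion
    objective: Callable[[str], float],      # higher is better; need not be differentiable
    samples_per_teacher: int = 10,
) -> list[tuple[str, str]]:
    dataset = []
    for prompt in prompts:
        # Multi-source pool: sample candidates from every teacher model.
        candidates = [
            teacher(prompt)
            for teacher in teachers
            for _ in range(samples_per_teacher)
        ]
        # Keep the single candidate that best satisfies the target attribute.
        best = max(candidates, key=objective)
        dataset.append((prompt, best))
    return dataset

# Example objective: favor longer completions (one of the attributes the
# paper steers toward); negating a toxicity score would steer away from toxicity.
longer_is_better = lambda text: float(len(text.split()))
```

Sampling from several teachers enlarges the candidate pool, which is why multi-source sampling tends to be more effective than relying on a single teacher: the curated set can only be as good as the best samples available.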
The research also compares the active inheritance approach with established optimization methods for non-differentiable attributes, such as policy-gradient based reinforcement learning algorithms. It emphasizes that active inheritance does not require a reward model or access to log probabilities, distinguishing it from traditional reinforcement learning frameworks.
Comprehensive Insights and Acknowledgment of Limitations
Overall, the paper provides comprehensive insights into the unintended consequences of synthetic data usage and offers guidance on how to tailor models towards desirable generation profiles. The study acknowledges several limitations and the need for further exploration of modifications and complexities associated with the guided distillation framework.
Reference: https://arxiv.org/abs/2407.01490