Key Points

1. The paper introduces a new problem formulation for machine unlearning (MU) in generative AI models, where the forget set is redefined as the set of undesired model outputs rather than specific input-output mappings.

2. The paper proposes three key objectives for evaluating the effectiveness of GenAI MU techniques: Accuracy (the unlearned model should no longer generate outputs in the target forget set), Locality (it should maintain performance on the retain set), and Generalizability (the forgetting should extend to unseen samples related to the forget target); a sketch of how these might be measured follows this list.

3. The paper provides a comprehensive taxonomy of existing GenAI MU techniques, categorizing them into two main approaches: Parameter Optimization and In-Context Unlearning. It discusses the unique characteristics, advantages, and limitations of each category in detail.

4. The paper summarizes commonly used datasets and benchmarks for evaluating GenAI MU techniques across different applications, including safety alignment, copyright protection, hallucination elimination, privacy compliance, and bias/unfairness alleviation.

5. The paper discusses the real-world applications and practical significance of GenAI MU, highlighting use cases beyond just protecting individual data privacy, such as accelerating leave-one-out cross-validation, addressing catastrophic forgetting, and serving as a safety alignment tool.

6. The paper identifies key challenges in GenAI MU, such as maintaining consistency of unlearning targets, making unlearning approaches robust to adversarial attacks, and ensuring the reliability of LLMs as evaluators, before proposing promising future research directions.
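As referenced in point 2, the following is a minimal Python sketch of how the three objectives might be measured on held-out prompt sets. The helpers `generates_forgotten` and `score` are hypothetical placeholders for task-specific checks and are not drawn from the surveyed benchmarks.

```python
def evaluate_unlearning(model, forget_prompts, retain_set, unseen_forget_prompts,
                        generates_forgotten, score):
    # Accuracy: the unlearned model should no longer produce the targeted
    # outputs when prompted with the (seen) forget set.
    accuracy = sum(not generates_forgotten(model, p)
                   for p in forget_prompts) / len(forget_prompts)

    # Locality: utility on the retain set should stay close to the original model's.
    locality = sum(score(model, ex) for ex in retain_set) / len(retain_set)

    # Generalizability: forgetting should extend to unseen prompts that
    # elicit the same targeted knowledge.
    generalizability = sum(not generates_forgotten(model, p)
                           for p in unseen_forget_prompts) / len(unseen_forget_prompts)

    return {"accuracy": accuracy, "locality": locality,
            "generalizability": generalizability}
```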

Summary

This paper provides a comprehensive survey on machine unlearning (MU) techniques for generative AI (GenAI) models, which include generative image models, large language models (LLMs), and multimodal (large) language models (MLLMs). The authors first introduce a new problem formulation for GenAI unlearning, highlighting the key differences from traditional MU. They define three crucial objectives for effective GenAI unlearning: Accuracy, Locality, and Generalizability.

The authors then categorize existing GenAI unlearning approaches into two main groups: Parameter Optimization and In-Context Unlearning. Parameter Optimization techniques directly adjust model parameters to selectively forget targeted behaviors while preserving overall performance; this group includes gradient-based methods, knowledge distillation, data sharding, extra learnable layers, task vectors, and parameter-efficient module operations. In-Context Unlearning, by contrast, keeps the original model parameters fixed and manipulates the input context at inference time to achieve the unlearning effect.
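To make the Parameter Optimization category concrete, below is a minimal PyTorch-style sketch of the simplest gradient-based variant: gradient ascent on the forget set combined with standard descent on the retain set. It assumes a Hugging Face-style model that returns a `.loss` when given labels; the names `forget_batch`, `retain_batch`, and `retain_weight` are illustrative assumptions, not the survey's reference implementation.

```python
def unlearning_step(model, forget_batch, retain_batch, optimizer, retain_weight=1.0):
    """One gradient-based unlearning update (sketch).

    Ascends the loss on the forget batch so the targeted behavior is suppressed,
    while descending the loss on the retain batch to preserve locality.
    """
    model.train()
    optimizer.zero_grad()

    # Gradient ascent on the forget data: negate the usual training loss.
    forget_loss = -model(**forget_batch).loss

    # Gradient descent on the retain data: keep general capabilities intact.
    retain_loss = model(**retain_batch).loss

    (forget_loss + retain_weight * retain_loss).backward()
    optimizer.step()
    return forget_loss.item(), retain_loss.item()
```

The `retain_weight` term is the usual knob for trading off how aggressively the targeted behavior is erased against how much retained performance is preserved.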

The paper also provides a comprehensive overview of the datasets and benchmarks commonly used to evaluate GenAI unlearning, organized by their intended unlearning objectives, such as safety alignment, copyright protection, hallucination elimination, privacy compliance, and bias/unfairness alleviation. Furthermore, the authors discuss real-world applications of GenAI unlearning, including mitigating the generation of inappropriate content, ensuring privacy compliance, protecting copyrights, reducing hallucinations, and alleviating biases.

Finally, the paper identifies several critical challenges, such as theoretical analysis, knowledge entanglement, and the reliability of LLMs as evaluators, and highlights promising future research directions in this emerging field. Overall, this survey offers a thorough, structured, and insightful examination of the current state of machine unlearning techniques in the rapidly evolving domain of generative AI, serving as a valuable resource for both researchers and practitioners working towards more trustworthy and responsible AI systems.

Reference: https://arxiv.org/abs/2407.20516