Key Points
1. The paper investigates parameter-efficient fine-tuning methods for large language models (LLMs) and presents a new method, Robust Adaptation (RoSA), inspired by robust principal component analysis (robust PCA).
2. RoSA outperforms both Low-Rank Adaptation (LoRA) and pure sparse fine-tuning at the same parameter budget across generative tasks such as grade-school math and SQL query generation (a simple parameter-budget count appears after this list).
3. The paper provides system support for RoSA, including sparse GPU kernels, enabling memory- and compute-efficient training.
4. State-of-the-art LLMs demonstrate exceptional performance but come with high computational and memory costs; fine-tuning on limited task-specific data is therefore an effective way to improve their performance on specific tasks.
5. Parameter-Efficient Fine-Tuning (PEFT) methods allow optimization over a restricted set of parameters, offering partial accuracy recovery relative to full fine-tuning (FFT) at a fraction of its computational and memory cost.
6. LoRA methods train low-rank "adapter" layers for a selection of model layers but can fail to fully recover accuracy on more complex tasks, motivating methods that combine the practical efficiency of LoRA with the high accuracy of FFT (a minimal LoRA sketch follows this list).
7. The paper proposes RoSA, which jointly trains two adapters, one low-rank and one sparse, in a stable manner, yielding considerably higher accuracy than standard low-rank adapters at a comparable parameter budget (see the RoSA-style sketch after this list).
8. The paper discusses challenges and advances in system support for sparsity, the PyTorch implementation of RoSA, and extensions of robust PCA to large-scale matrices.
9. Experimental results show that RoSA significantly outperforms LoRA and sparse adaptation (SpA) at the same parameter budget and can match or even outperform FFT in various settings. The paper also examines the impact of different mask-generation methods and notes the need for better criteria for generating sparse-adaptation masks.
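As context for point 6, the following is a minimal sketch of a LoRA-style layer in PyTorch: a frozen pretrained weight plus a trainable low-rank correction. The class name, rank, and scaling values are illustrative assumptions, not the paper's reference code.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Minimal LoRA-style wrapper (illustrative, not the paper's implementation):
    a frozen dense layer plus a trainable low-rank update scale * (B @ A)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pretrained weight
            p.requires_grad = False
        out_f, in_f = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(out_f, r))         # up-projection, zero-init
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path plus the low-rank correction
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```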
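Point 7 describes RoSA as jointly training a low-rank and a sparse adapter, mirroring the robust-PCA split of a matrix into low-rank plus sparse parts. The sketch below illustrates that decomposition under simplifying assumptions: the sparse component uses a fixed random mask and is stored densely, whereas the paper uses better mask criteria and dedicated sparse GPU kernels. `RoSALinearSketch` and its hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn


class RoSALinearSketch(nn.Module):
    """Hedged sketch of a robust-PCA-style adapter: the weight update is the sum
    of a low-rank term B @ A and a sparse term S restricted to a fixed support.
    The mask here is random and the sparse term is dense, for clarity only."""

    def __init__(self, base: nn.Linear, r: int = 4, density: float = 0.01):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # base weights stay frozen
            p.requires_grad = False
        out_f, in_f = base.weight.shape
        # low-rank adapter factors
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, r))
        # sparse adapter: fixed support (mask), trainable values
        self.register_buffer("mask", (torch.rand(out_f, in_f) < density).float())
        self.S = nn.Parameter(torch.zeros(out_f, in_f))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.B @ self.A + self.S * self.mask   # low-rank + sparse update
        return self.base(x) + x @ delta.T
```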
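Several points compare methods "at the same parameter budget". The helper below is a back-of-the-envelope count of trainable adapter parameters for one linear layer; the function name, dimensions, and the rank/density split are assumptions chosen only to illustrate how a budget can be divided between low-rank and sparse components.

```python
def adapter_params(out_f: int, in_f: int, r: int, density: float = 0.0) -> int:
    """Illustrative trainable-parameter count for one linear layer:
    r * (in_f + out_f) low-rank parameters plus density * in_f * out_f
    sparse values (density = 0 recovers the plain LoRA count)."""
    return r * (in_f + out_f) + int(density * in_f * out_f)


# Hypothetical comparison for a 4096 x 4096 projection: a rank-16 LoRA adapter
# versus splitting a similar budget between rank 8 and ~0.1% sparse entries.
print(adapter_params(4096, 4096, r=16))                # 131072
print(adapter_params(4096, 4096, r=8, density=0.001))  # 82313
```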
Summary
Reference: https://arxiv.org/abs/2401.04679