Key Points

1. MODEL SWARMS is a collaborative search algorithm that adapts LLM experts via swarm intelligence, where each LLM expert is viewed as a "particle" that collaboratively searches the weight space.

2. MODEL SWARMS starts with a pool of LLM experts and a utility function that represents the adaptation objective. It then iteratively updates the velocity and location of each particle to explore the weight space and optimize the utility function.

3. The velocity update is influenced by inertia, personal best, global best, and global worst, enabling each expert to chart an independent search path while exploring good neighborhoods in the weight space.

4. Compared to existing model composition approaches, MODEL SWARMS offers tuning-free model adaptation, works in low-data regimes with as few as 200 examples, and does not require assumptions about the experts or how they should be composed.

5. Extensive experiments demonstrate that MODEL SWARMS can flexibly adapt LLM experts to single tasks, multi-task domains, reward models, and diverse human interests, improving over 12 model composition baselines by up to 21.0%.

6. Analysis reveals that MODEL SWARMS helps LLM experts discover previously unseen capabilities through the collaborative search process, evident in the 44.8% average "correctness emergence" across datasets.

7. The best-performing experts in MODEL SWARMS often did not start as the best, indicating that weak experts are not inherently less effective but can undergo a "weak-to-strong" transition through collaborative search.

8. The diversity of starting experts is crucial for the success of MODEL SWARMS, with more diverse experts leading to a 35.3% average improvement.

9. MODEL SWARMS can be generalized to work with heterogeneous model architectures by operating on token probability distributions instead of model weights, opening up possibilities for seamless model composition.

Summary


MODEL SWARMS is a collaborative search algorithm that adapts large language models (LLMs) through swarm intelligence. The approach starts with a pool of pre-trained LLM experts and a utility function representing the adaptation objective. Inspired by particle swarm optimization (PSO), MODEL SWARMS views each LLM expert as a "particle" with a location (model weights) and a velocity (direction in the weight space). The velocity of each particle is iteratively updated based on its personal best location, the global best location across all particles, and the global worst location, enabling the experts to independently explore the weight space while being guided towards good neighborhoods.
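The velocity and location update described above can be sketched as a small NumPy routine. This is a hypothetical illustration of the PSO-style update, not the paper's implementation: each particle is a flat weight vector, and the coefficient names (`inertia`, `c_p`, `c_g`, `c_w`) and their values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def swarm_step(particles, velocities, personal_best, global_best, global_worst,
               inertia=0.5, c_p=0.3, c_g=0.4, c_w=0.2):
    """One illustrative velocity/location update in the spirit of MODEL SWARMS.

    particles, velocities, personal_best: lists of flat weight vectors;
    global_best, global_worst: single flat weight vectors.
    Coefficients and randomness scheme are assumptions, not the paper's exact rule.
    """
    new_particles, new_velocities = [], []
    for x, v, p_best in zip(particles, velocities, personal_best):
        r_p, r_g, r_w = rng.random(3)          # per-step random scaling factors
        v_new = (inertia * v                    # keep part of the current direction
                 + c_p * r_p * (p_best - x)     # pull toward this particle's personal best
                 + c_g * r_g * (global_best - x)   # pull toward the swarm's global best
                 - c_w * r_w * (global_worst - x)) # push away from the global worst
        new_velocities.append(v_new)
        new_particles.append(x + v_new)         # move the particle in weight space
    return new_particles, new_velocities
```

In a full system, each particle's utility would be re-evaluated after the step to refresh the personal/global best and worst before the next iteration.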

Key advantages of MODEL SWARMS
Compared to existing model composition approaches, MODEL SWARMS offers several key advantages. It is a tuning-free method that can adapt LLM experts to various objectives, including single tasks, multi-task domains, reward models, and human interests, using as few as 200 examples. It does not require any assumptions about the available experts or how they should be composed. The collaborative search process enables the discovery of previously unseen capabilities in the initial LLM checkpoints, allowing weak experts to transition into strong adapted models.

Extensive experiments demonstrate the effectiveness of MODEL SWARMS: it outperforms 12 model composition baselines by up to 21.0% across tasks and contexts. On single tasks, it shows particular strength in reasoning-intensive contexts, achieving an average improvement of 21.0% over the baselines. In multi-task adaptation, it often produces Pareto-optimal experts that outperform models optimized for individual tasks. For reward model adaptation, MODEL SWARMS offers steerable experts that match or beat baselines by up to 14.6% in controllability, and it produces LLM experts preferred by humans in 85% of the evaluated interest domains.

Insights about the MODEL SWARMS search process
Further analysis reveals insights about the MODEL SWARMS search process. The diversity of starting experts is crucial, and the best ending models often did not start as the best. The collaborative search enables the discovery of previously "impossible" capabilities that were not present in the initial checkpoints. MODEL SWARMS could be accelerated through dropout-like techniques and seamlessly extended to handle experts with different model architectures through a token-based variant. Overall, the MODEL SWARMS framework presents a versatile approach to reimagine the potential of diverse open-source LLMs.
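The token-based variant for heterogeneous architectures can be sketched as follows. This is an illustrative assumption of how such a variant might look, not the paper's implementation: instead of interpolating weights, the search operates on mixing coefficients over each expert's next-token probability distribution, assuming the experts share a vocabulary. The function name and coefficients are hypothetical.

```python
import numpy as np

def mix_token_probs(expert_probs, weights):
    """Illustrative token-level composition for heterogeneous experts.

    expert_probs: (num_experts, vocab_size) next-token distributions,
    one row per expert; weights: mixing coefficients the swarm search
    would optimize instead of weight-space locations.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize mixing coefficients
    probs = np.asarray(expert_probs)      # shape: (num_experts, vocab_size)
    mixed = (w[:, None] * probs).sum(axis=0)  # weighted average per token
    return mixed / mixed.sum()            # renormalize to a valid distribution
```

Because only output distributions are combined, the experts' internal architectures never need to match, which is what enables composition across heterogeneous model families.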

Reference: https://arxiv.org/abs/2410.11163