Key Points

1. The paper introduces a parametric World Knowledge Model (WKM) to facilitate agent planning, addressing a key limitation of large language models (LLMs): their poor understanding of the "real" physical world leads to blind trial-and-error in global planning and hallucinatory actions in local planning.

2. The WKM synthesizes knowledge from expert and sampled trajectories, then provides prior task knowledge to guide global planning and dynamic state knowledge to assist local planning. Concretely, the agent model first self-synthesizes task knowledge and state knowledge, and the WKM is then trained on this synthesized knowledge (a sketch of the resulting planning loop appears after this list).

3. The WKM is evaluated on three real-world simulated planning tasks (ALFWorld, WebShop, and ScienceWorld) with state-of-the-art open-source LLMs (Mistral-7B, Gemma-7B, and Llama-3-8B), demonstrating superior performance compared to various strong baselines on both seen and unseen tasks.

4. The paper indicates that the WKM effectively reduces blind trial-and-error and hallucinatory actions, and that its instance-level task knowledge generalizes better to unseen tasks. The results also demonstrate the feasibility of a weak WKM guiding a strong agent model's planning.

5. The WKM's ability to generalize to unseen tasks is highlighted, and the paper explores the potential of multi-task unified WKM training, suggesting that a unified world knowledge model could generalize to guide various agent models, analogous to the concept of Artificial General Intelligence (AGI).

6. The impact of explicit state knowledge on agent planning performance is evaluated, indicating that implicit knowledge constraints are often more prudent than explicitly extending prompts with a large amount of natural language feedback.

7. The paper concludes by discussing potential future directions, including building a unified world knowledge model, learning to predict the world as a world model does, and applying the approach to multi-modal agent planning, while acknowledging limitations such as the ongoing challenge of determining what a language model knows and does not know.

8. The results further quantify the WKM's impact on LLM planning performance and its ability to mitigate blind trial-and-error and reduce hallucinatory actions, and they suggest that a unified world knowledge model could help various agent models and generalize to guide held-out agent models, which the authors view as a contribution toward Artificial General Intelligence (AGI).

9. Another key finding is that explicit state knowledge can hurt agent planning performance: blindly extending prompts with large amounts of explicit natural-language feedback may lose more than it gains.
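
To make the planning procedure in points 2, 6, and 9 concrete, the following is a minimal, hedged sketch of a WKM-guided planning loop. All identifiers (wkm.generate_task_knowledge, agent.propose_actions, kb.score_actions, the mixing weight alpha) are hypothetical placeholders rather than the paper's code; combining agent scores with state-knowledge scores is one plausible reading of how state knowledge can act as an implicit constraint instead of extra prompt text.

```python
# Hypothetical sketch of WKM-guided planning (names are illustrative, not the paper's code).
# Global guidance: task knowledge is generated once and prepended to the agent's context.
# Local guidance: state knowledge constrains action choice implicitly, by re-weighting
# the agent's candidate-action scores instead of appending long feedback to the prompt.

def wkm_guided_plan(task, agent, wkm, kb, env, alpha=0.5, max_steps=30):
    # 1) Global planning: prior task knowledge from the WKM.
    task_knowledge = wkm.generate_task_knowledge(task)
    history = [("task", task), ("knowledge", task_knowledge)]

    for _ in range(max_steps):
        # 2) Local planning: dynamic state knowledge for the current trajectory.
        state_knowledge = wkm.generate_state_knowledge(history)

        # Candidate next actions proposed and scored by the agent model.
        candidates = agent.propose_actions(history)
        agent_scores = agent.score_actions(history, candidates)    # score(a | history)
        kb_scores = kb.score_actions(state_knowledge, candidates)  # score(a | state knowledge)

        # Implicit constraint: weighted combination instead of prompt extension.
        best = max(candidates,
                   key=lambda a: alpha * agent_scores[a] + (1 - alpha) * kb_scores[a])

        observation, done = env.step(best)
        history += [("action", best), ("observation", observation)]
        if done:
            break
    return history
```

The design point mirrored here is that local guidance re-weights candidate actions rather than appending verbose feedback to the prompt, which points 6 and 9 report as the more robust choice.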

Summary

The paper introduces a parametric World Knowledge Model (WKM) to enhance agent planning for interactive tasks, addressing the limitations of large language models (LLMs) in global and local planning. A key component is a method by which the agent model self-synthesizes knowledge from expert and sampled trajectories. The resulting WKM provides prior task knowledge to guide global planning and dynamic state knowledge to assist local planning. The experimental results show that the WKM effectively alleviates blind trial-and-error and hallucinatory actions, giving the agent a stronger grounding in the world. Other findings include better generalization of instance-level task knowledge to unseen tasks, the feasibility of a weak WKM guiding a strong agent model's planning, and the potential for further development through unified WKM training.
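
As a rough illustration of the self-synthesis step described above, the sketch below shows how training pairs for the WKM might be constructed. The helper names (agent.rollout, agent.summarize_task_knowledge, agent.summarize_state_knowledge) and the record format are assumptions made for illustration, not the paper's implementation.

```python
# Hypothetical sketch of knowledge self-synthesis (function names are illustrative).
# The agent model itself generates the knowledge that the WKM is later fine-tuned on.

def synthesize_wkm_training_data(agent, tasks):
    records = []
    for task in tasks:
        expert_traj = task.expert_trajectory   # provided demonstration
        sampled_traj = agent.rollout(task)     # the agent's own (often flawed) attempt

        # Task knowledge: the agent contrasts the expert and sampled trajectories
        # and summarizes what it should have known before acting.
        task_knowledge = agent.summarize_task_knowledge(task, expert_traj, sampled_traj)
        records.append({"input": task.description, "target": task_knowledge,
                        "type": "task_knowledge"})

        # State knowledge: at each expert step, the agent summarizes the current state
        # to explain why the expert's next action is reasonable.
        for step in expert_traj.steps:
            state_knowledge = agent.summarize_state_knowledge(task, step.history)
            records.append({"input": step.history, "target": state_knowledge,
                            "type": "state_knowledge"})
    return records  # used to fine-tune the WKM with standard supervised training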

Evaluation of the WKM Performance
The paper evaluates the proposed method with three state-of-the-art open-source LLMs and demonstrates superior performance over various strong baselines. The experimental results show that the WKM effectively reduces blind trial-and-error and hallucinatory actions, and that model-generated instance-level knowledge generalizes better to unseen tasks. The study also explores whether a weak WKM can guide a strong agent model's planning and analyzes the effectiveness of multi-task unified WKM training.

Research Findings and Future Directions
The research findings suggest that the proposed WKM can reduce blind trial-and-error and invalid actions, generalize better to unseen tasks, achieve weak-guide-strong (a weaker WKM guiding a stronger agent model), and be effectively extended to unified world knowledge training. The paper also discusses potential future directions, including building a unified world knowledge model, learning to predict the world as a world model does, and applying the approach to multi-modal agent planning.

Acknowledgment of Limitations
Despite the positive outcomes, the paper acknowledges some limitations, such as the challenge of determining what a language model knows and doesn’t know, the current limitation of world knowledge to textual information, the inability of the world knowledge model to dynamically update with changes in the world and feedback from the agent, and the introduction of additional inference overhead during world knowledge generation.

In summary, the paper introduces a novel approach to agent planning using a parametric World Knowledge Model, and the experimental results demonstrate the effectiveness of the proposed method in addressing key challenges in global and local planning, with potential for further development and application in various agent models and tasks.

Reference: https://arxiv.org/abs/2405.142...