Key Points

1. Prior research on enhancing the reasoning capabilities of large language models (LLMs) has focused mainly on prompting techniques such as few-shot or zero-shot chain-of-thought (CoT) prompting, which often require labor-intensive manual prompt engineering.

2. This study asks whether LLMs can reason effectively without prompting and reveals that CoT reasoning paths can be elicited from pre-trained LLMs by altering the decoding process, bypassing the confounders of prompting.

3. CoT paths are frequently inherent among the top-k alternative tokens considered during decoding, and the presence of a CoT in a decoding path correlates with higher confidence in the model’s decoded answer.

4. CoT-decoding offers an alternative way to elicit reasoning capabilities from pre-trained LLMs without explicit prompting and substantially outperforms greedy decoding across various reasoning benchmarks.

5. The proposed CoT-decoding method extracts better decoding paths from LLMs, bypassing the need for specialized prompting and allowing a more accurate assessment of the models’ intrinsic reasoning abilities (a minimal sketch of the procedure follows this list).

6. CoT-decoding improves reasoning performance over existing prompting methods on math reasoning, natural language reasoning, and symbolic reasoning tasks across various LLM families.

7. On synthetic reasoning tasks, prompting-based CoT techniques play more of a "teaching" role, guiding how models solve a task; models primarily mimic the format of the prompts to generate accurate reasoning paths.

8. CoT-decoding reveals the intrinsic reasoning strategies of LLMs without being influenced by external prompts, and the generated CoT paths are stable and consistent, showing the model’s inherent capability to solve tasks effectively.

9. The study suggests that CoT-decoding can be leveraged for fine-tuning models to enhance their reasoning capabilities and explores the potential of branching at different decoding steps for even better results.

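The following is a minimal sketch of the CoT-decoding procedure summarized above, assuming a Hugging Face causal LM. It is not the authors' implementation: the model name is a placeholder, the helper names are illustrative, and the confidence margin is averaged over all generated tokens rather than only the answer span used in the paper.

```python
# Minimal sketch of CoT-decoding: branch on the top-k tokens at the first
# decoding step, continue each branch greedily, and rank the resulting
# paths by the model's token-level confidence (top-1 vs. top-2 margin).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; any causal LM checkpoint can be used
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def cot_decode(question: str, k: int = 10, max_new_tokens: int = 128):
    prompt_ids = tokenizer(question, return_tensors="pt").input_ids

    # Step 1: instead of committing to the single greedy token, take the
    # top-k candidates for the *first* decoded token.
    with torch.no_grad():
        first_logits = model(prompt_ids).logits[0, -1]
    top_k_tokens = torch.topk(first_logits, k).indices

    paths = []
    for token_id in top_k_tokens:
        branch_ids = torch.cat([prompt_ids, token_id.view(1, 1)], dim=-1)

        # Step 2: continue each branch with ordinary greedy decoding,
        # keeping per-step scores for the confidence computation.
        out = model.generate(
            branch_ids,
            max_new_tokens=max_new_tokens,
            do_sample=False,
            output_scores=True,
            return_dict_in_generate=True,
            pad_token_id=tokenizer.eos_token_id,
        )

        # Step 3: confidence = mean (p_top1 - p_top2) over generated tokens.
        # The paper restricts this average to the answer tokens; averaging
        # over all new tokens here is a simplification.
        margins = []
        for step_scores in out.scores:
            probs = torch.softmax(step_scores[0], dim=-1)
            top2 = torch.topk(probs, 2).values
            margins.append((top2[0] - top2[1]).item())
        confidence = sum(margins) / max(len(margins), 1)

        text = tokenizer.decode(
            out.sequences[0, prompt_ids.shape[-1]:], skip_special_tokens=True
        )
        paths.append((confidence, text))

    # Paths sorted by confidence; the branch that starts from the top-1
    # token reproduces standard greedy decoding.
    return sorted(paths, key=lambda p: p[0], reverse=True)
```

As a usage note, a call such as cot_decode("Q: I have 3 apples, my dad has 2 more apples than me, how many apples do we have in total? A:") returns the k continuations ranked by confidence; with a sufficiently capable model, the higher-confidence branches tend to be the ones that contain an explicit chain of thought rather than a direct answer.
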
Summary

The study explores the inherent reasoning capabilities of large language models (LLMs) without specialized prompting, focusing on eliciting chain-of-thought (CoT) reasoning paths during decoding. It shows that altering the decoding strategy to consider alternative top tokens uncovers CoT paths within pre-trained LLMs, and that these paths correlate with higher model confidence in the decoded answer.
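
As a clarifying note on what "confidence" means here, the paper ranks the decoded paths by an answer-level margin between the top two token probabilities, roughly of the following form (notation adapted for illustration):

```latex
% Delta: average gap between the top-1 and top-2 token probabilities,
% taken over the tokens x_t that make up the decoded answer in path k.
\Delta_{k,\mathrm{answer}}
  = \frac{1}{|\mathrm{answer}|}
    \sum_{x_t \in \mathrm{answer}}
    \left( p\!\left(x_t^{1} \mid x_{<t}\right)
         - p\!\left(x_t^{2} \mid x_{<t}\right) \right)
```

Here x_t^1 and x_t^2 denote the most and second-most probable tokens at step t; paths that contain a CoT tend to yield a noticeably larger margin on the answer tokens than paths that jump directly to an answer.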

The proposed CoT-decoding method significantly outperforms standard greedy decoding across various reasoning benchmarks, demonstrating the models' natural ability to reason effectively without explicit prompting. The study also compares CoT-decoding with existing prompting methods and identifies leveraging CoT-decoding paths for model fine-tuning as a promising direction for further enhancing reasoning capabilities.

The findings also highlight the challenge of branching at later decoding steps and the need for future exploration in this area. The study provides valuable insights into the natural reasoning abilities of LLMs and introduces a novel approach to elicit CoT reasoning, offering potential for further advancements in LLM reasoning capabilities.

Reference: https://arxiv.org/abs/2402.10200