ToolGen: Unified Tool Retrieval and Calling via Generation (AI summary)

Renxi Wang, Xudong Han, Lei Ji, Shu Wang, Timothy Baldwin, Haonan Li

When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1 (AI summary)

R. Thomas McCoy, Shunyu Yao, Dan Friedman, Mathew D. Hardy, Thomas L. Griffiths

Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models (AI summary)

Fei Wang, Xingchen Wan, Ruoxi Sun, Jiefeng Chen, Sercan Ö. Arık

RATIONALYST: Pre-training Process-Supervision for Improving Reasoning (AI summary)

Dongwei Jiang, Guoxuan Wang, Yining Lu, Andrew Wang, Jingyu Zhang, Chuyu Liu, Benjamin Van Durme, Daniel Khashabi

Differential Transformer (AI summary)

Tianzhu Ye, Li Dong, Yuqing Xia, Yutao Sun, Yi Zhu, Gao Huang, Furu Wei

Archon: An Architecture Search Framework for Inference-Time Techniques (AI summary)

Jon Saad-Falcon, Adrian Gamarra Lafuente, Shlok Natarajan, Nahum Maru, Hristo Todorov, Etash Guha, E. Kelly Buchanan, Mayee Chen, Neel Guha, Christopher Ré, Azalia Mirhoseini

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering (AI summary)

Jun Shern Chan, Neil Chowdhury, Oliver Jaffe, James Aung, Dane Sherburn, Evan Mays, Giulio Starace, Kevin Liu, Leon Maksin, Tejal Patwardhan, Lilian Weng, Aleksander Mądry

LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations (AI summary)

Hadas Orgad, Michael Toker, Zorik Gekhman, Roi Reichart, Idan Szpektor, Hadas Kotek, Yonatan Belinkov

Were RNNs All We Needed? (AI summary)

Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio, Hossein Hajimirsadeghi

Minstrel: Structural Prompt Generation with Multi-Agents Coordination for Non-AI Experts (AI summary)

Ming Wang, Yuanzhong Liu, Xiaoyu Liang, Yijie Huang, Daling Wang, Xiaocui Yang, Sijia Shen, Shi Feng, Xiaoming Zhang, Chaofeng Guan, Yifei Zhang
