AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents (AI summary)

Chang Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, Junxian He

Large Language Models as an Indirect Reasoner: Contrapositive and Contradiction for Automated Reasoning (AI summary)

Yanfang Zhang, Yiliu Sun, Yibing Zhan, Dapeng Tao, Dacheng Tao, Chen Gong

Fine-grained Hallucination Detection and Editing for Language Models (AI summary)

Abhika Mishra, Akari Asai, Vidhisha Balachandran, Yizhong Wang, Graham Neubig, Yulia Tsvetkov, Hannaneh Hajishirzi

ChatQA: Building GPT-4 Level Conversational QA Models (AI summary)

Zihan Liu, Wei Ping, Rajarshi Roy, Peng Xu, Chankyu Lee, Mohammad Shoeybi, Bryan Catanzaro

A phase transition between positional and semantic learning in a solvable model of dot-product attention (AI summary)

Hugo Cui, Freya Behrens, Florent Krzakala, Lenka Zdeborová

Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering (AI summary)

Tal Ridnik, Dedy Kredo, Itamar Friedman

Self-Rewarding Language Models (AI summary)

Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Sainbayar Sukhbaatar, Jing Xu, Jason Weston

AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls (AI summary)

Yu Du, Fangyun Wei, Hongyang Zhang

Consistency Models (AI summary)

Yang Song, Prafulla Dhariwal, Mark Chen, Ilya Sutskever

Language Models are Few-Shot Learners (AI summary)

Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
