Chinchilla Scaling: A replication attempt (AI summary)

Tamay Besiroglu, Ege Erdil, Matthew Barnett, Josh You

Best Practices and Lessons Learned on Synthetic Data for Language Models (AI summary)

Ruibo Liu, Jerry Wei, Fangyu Liu, Chenglei Si, Yanzhe Zhang, Jinmeng Rao, Steven Zheng, Daiyi Peng, Diyi Yang, Denny Zhou, Andrew M. Dai

Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought (AI summary)

Jooyoung Lee, Fan Yang, Thanh Tran, Qian Hu, Emre Barut, Kai-Wei Chang, Chengwei Su

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention (AI summary)

Tsendsuren Munkhdalai, Manaal Faruqui, Siddharth Gopal

Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models (AI summary)

Wenshan Wu, Shaoguang Mao, Yadong Zhang, Yan Xia, Li Dong, Lei Cui, Furu Wei

Long-context LLMs Struggle with Long In-context Learning (AI summary)

Tianle Li, Ge Zhang, Quy Duc Do, Xiang Yue, Wenhu Chen

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models (AI summary)

David Raposo, Sam Ritter, Blake Richards, Timothy Lillicrap, Peter Conway Humphreys, Adam Santoro

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement (AI summary)

Nicholas Lee, Thanakul Wattanawong, Sehoon Kim, Karttikeya Mangalam, Sheng Shen, Gopala Anumanchipalli, Michael W. Mahoney, Kurt Keutzer, Amir Gholami

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions (AI summary)

Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn Lawrie, Luca Soldaini

AIOS: LLM Agent Operating System (AI summary)

Kai Mei, Zelong Li, Shuyuan Xu, Ruosong Ye, Yingqiang Ge, Yongfeng Zhang
