The Illusion of State in State-Space Models (AI summary)

William Merrill, Jackson Petty, Ashish Sabharwal

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (AI summary)

DeepSeek-AI

A Survey on Retrieval-Augmented Text Generation for Large Language Models (AI summary)

Yizheng Huang, Jimmy Huang

How faithful are RAG models? Quantifying the tug-of-war between RAG and LLMs' internal prior (AI summary)

Kevin Wu, Eric Wu, James Zou

xLSTM: Extended Long Short-Term Memory (AI summary)

Maximilian Beck, Korbinian Pöppel, Markus Spanring, Andreas Auer, Oleksandra Prudnikova, Michael Kopp, Günter Klambauer, Johannes Brandstetter, Sepp Hochreiter

Chinchilla Scaling: A replication attempt (AI summary)

Tamay Besiroglu, Ege Erdil, Matthew Barnett, Josh You

Best Practices and Lessons Learned on Synthetic Data for Language Models (AI summary)

Ruibo Liu, Jerry Wei, Fangyu Liu, Chenglei Si, Yanzhe Zhang, Jinmeng Rao, Steven Zheng, Daiyi Peng, Diyi Yang, Denny Zhou, Andrew M. Dai

Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought (AI summary)

Jooyoung Lee, Fan Yang, Thanh Tran, Qian Hu, Emre Barut, Kai-Wei Chang, Chengwei Su

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention (AI summary)

Tsendsuren Munkhdalai, Manaal Faruqui, Siddharth Gopal

Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models (AI summary)

Wenshan Wu, Shaoguang Mao, Yadong Zhang, Yan Xia, Li Dong, Lei Cui, Furu Wei
