AlphaMath Almost Zero: Process Supervision without Process (AI summary)

Guoxin Chen, Minpeng Liao, Chengxi Li, Kai Fan

Reducing hallucination in structured outputs via Retrieval-Augmented Generation (AI summary)

Patrice Béchard, Orlando Marquez Ayala

The Illusion of State in State-Space Models (AI summary)

William Merrill, Jackson Petty, Ashish Sabharwal

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (AI summary)

DeepSeek-AI

A Survey on Retrieval-Augmented Text Generation for Large Language Models (AI summary)

Yizheng Huang, Jimmy Huang

How faithful are RAG models? Quantifying the tug-of-war between RAG and LLMs' internal prior (AI summary)

Kevin Wu, Eric Wu, James Zou

xLSTM: Extended Long Short-Term Memory (AI summary)

Maximilian Beck, Korbinian Pöppel, Markus Spanring, Andreas Auer, Oleksandra Prudnikova, Michael Kopp, Günter Klambauer, Johannes Brandstetter, Sepp Hochreiter

Chinchilla Scaling: A replication attempt (AI summary)

Tamay Besiroglu, Ege Erdil, Matthew Barnett, Josh You

Best Practices and Lessons Learned on Synthetic Data for Language Models (AI summary)

Ruibo Liu, Jerry Wei, Fangyu Liu, Chenglei Si, Yanzhe Zhang, Jinmeng Rao, Steven Zheng, Daiyi Peng, Diyi Yang, Denny Zhou, Andrew M. Dai

Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought (AI summary)

Jooyoung Lee, Fan Yang, Thanh Tran, Qian Hu, Emre Barut, Kai-Wei Chang, Chengwei Su
