DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data (AI summary)

Huajian Xin, Daya Guo, Zhihong Shao, Zhizhou Ren, Qihao Zhu, Bo Liu, Chong Ruan, Wenda Li, Xiaodan Liang

A Primer on the Inner Workings of Transformer-based Language Models (AI summary)

Javier Ferrando, Gabriele Sarti, Arianna Bisazza, Marta R. Costa-jussà

INDUS: Effective and Efficient Language Models for Scientific Applications (AI summary)

Bishwaranjan Bhattacharjee, Aashka Trivedi, Masayasu Muraoka, Muthukumaran Ramasubramanian, Takuma Udagawa, Iksha Gurung, Rong Zhang, Bharath Dandala, Rahul Ramachandran, Manil Maskey, Kaylin Bugbee, Mike Little, Elizabeth Fancher, Lauren Sanders, Sylvain Costes, Sergi Blanco-Cuaresma, Kelly Lockhart, Thomas Allen, Felix Grezes, Megan Ansdell, Alberto Accomazzi, Yousef El-Kurdi, Davis Wertheimer, Birgit Pfitzmann, Cesar Berrospi Ramis, Michele Dolfi, Rafael Teixeira de Lima, Panagiotis Vagenas, S. Karthik Mukkavilli, Peter Staar, Sanaz Vahidinia, Ryan McGranaghan, Armin Mehrabian, Tsendgar Lee

Self-Play Preference Optimization for Language Model Alignment (AI summary)

Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu

Lessons from the Trenches on Reproducible Evaluation of Language Models (AI summary)

Stella Biderman, Hailey Schoelkopf, Lintang Sutawika, Leo Gao, Jonathan Tow, Baber Abbasi, Alham Fikri Aji, Pawan Sasanka Ammanamanchi, Sidney Black, Jordan Clive, Anthony DiPofi, Julen Etxaniz, Benjamin Fattori, Jessica Zosa Forde, Charles Foster, Mimansa Jaiswal, Wilson Y. Lee, Haonan Li, Charles Lovering, Niklas Muennighoff, Ellie Pavlick, Jason Phang, Aviya Skowron, Samson Tan, Xiangru Tang, Kevin A. Wang, Genta Indra Winata, François Yvon, Andy Zou

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models (AI summary)

Seungone Kim, Juyoung Suk, Shayne Longpre, Bill Yuchen Lin, Jamin Shin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo

CAT3D: Create Anything in 3D with Multi-View Diffusion Models (AI summary)

Ruiqi Gao, Aleksander Holynski, Philipp Henzler, Arthur Brussee, Ricardo Martin-Brualla, Pratul Srinivasan, Jonathan T. Barron, Ben Poole

Layer-Condensed KV Cache for Efficient Inference of Large Language Models (AI summary)

Haoyi Wu, Kewei Tu

RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing (AI summary)

Yucheng Hu, Yuxing Lu

You Only Cache Once: Decoder-Decoder Architectures for Language Models (AI summary)

Yutao Sun, Li Dong, Yi Zhu, Shaohan Huang, Wenhui Wang, Shuming Ma, Quanlu Zhang, Jianyong Wang, Furu Wei
