Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation (AI summary)

Ofir Press, Noah A. Smith, Mike Lewis

Evaluating Large Language Models Trained on Code (AI summary)

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Josh Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, Wojciech Zaremba

RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation (AI summary)

Mahdi Nikdan, Soroush Tabesh, Dan Alistarh

Self-Consuming Generative Models Go MAD (AI summary)

Sina Alemohammad, Josue Casco-Rodriguez, Lorenzo Luzi, Ahmed Imtiaz Humayun, Hossein Babaei, Daniel LeJeune, Ali Siahkoohi, Richard G. Baraniuk

ReFT: Reasoning with Reinforced Fine-Tuning (AI summary)

Trung Quoc Luong, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, Hang Li

Transformers are Multi-State RNNs (AI summary)

Matanel Oren, Michael Hassid, Yossi Adi, Roy Schwartz

Training Compute-Optimal Large Language Models (AI summary)

Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, Tom Hennigan, Eric Noland, Katie Millican, George van den Driessche, Bogdan Damoc, Aurelia Guy, Simon Osindero, Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre

WizardCoder: Empowering Code Large Language Models with Evol-Instruct (AI summary)

Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang

A Closer Look at AUROC and AUPRC under Class Imbalance (AI summary)

Matthew B. A. McDermott (1), Lasse Hyldig Hansen (2), Haoran Zhang (3), Giovanni Angelotti (4), Jack Gallifant (3) ((1) Harvard Medical School, (2) Aarhus University, (3) Massachusetts Institute of Technology, (4) IRCCS Humanitas Research Hospital)

Tuning Language Models by Proxy (AI summary)

Alisa Liu, Xiaochuang Han, Yizhong Wang, Yulia Tsvetkov, Yejin Choi, Noah A. Smith
