Policy Improvement using Language Feedback Models (AI summary)

Victor Zhong, Dipendra Misra, Xingdi Yuan, Marc-Alexandre Côté

Scaling Laws for Fine-Grained Mixture of Experts (AI summary)

Jakub Krajewski, Jan Ludziejewski, Kamil Adamczewski, Maciej Pióro, Michał Krutul, Szymon Antoniak, Kamil Ciebiera, Krystian Król, Tomasz Odrzygóźdź, Piotr Sankowski, Marek Cygan, Sebastian Jaszczur

Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping (AI summary)

Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Zhong Zhang, Bingzhe Wu, Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao

ODIN: Disentangled Reward Mitigates Hacking in RLHF (AI summary)

Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro

Direct Language Model Alignment from Online AI Feedback (AI summary)

Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Rame, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel

Scaling Laws for Downstream Task Performance of Large Language Models (AI summary)

Berivan Isik, Natalia Ponomareva, Hussein Hazimeh, Dimitris Paparas, Sergei Vassilvitskii, Sanmi Koyejo

MOMENT: A Family of Open Time-series Foundation Models (AI summary)

Mononito Goswami, Konrad Szafer, Arjun Choudhry, Yifu Cai, Shuo Li, Artur Dubrawski

MobileVLM V2: Faster and Stronger Baseline for Vision Language Model (AI summary)

Xiangxiang Chu, Limeng Qiao, Xinyu Zhang, Shuang Xu, Fei Wei, Yang Yang, Xiaofei Sun, Yiming Hu, Xinyang Lin, Bo Zhang, Chunhua Shen

Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models (AI summary)

Jianyuan Guo, Hanting Chen, Chengcheng Wang, Kai Han, Chang Xu, Yunhe Wang

LiPO: Listwise Preference Optimization through Learning-to-Rank (AI summary)

Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, Rishabh Joshi, Yao Zhao, Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang
