Paper-Conference

Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
Jiaming Ji, Xinyu Chen, Rui Pan, Conghui Zhang, Han Zhu, Jiahao Li, Donghai Hong, ..., Chi-Min Chan, Yida Tang, Sirui Han, Yike Guo, Yaodong Yang
NeurIPS 2025
Safety Alignment, Robotics, Vision-Language-Action
SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning
Borong Zhang, Yuhao Zhang, Jiaming Ji, Yingshan Lei, Josef Dai, Yuanpei Chen, Yaodong Yang
NeurIPS 2025 Spotlight
Safety Alignment, Robotics, Vision-Language-Action
Generative RLHF-V: Learning Principles from Multi-modal Human Preference
Jiayi Zhou, Jiaming Ji, Boyuan Chen, Jiapeng Sun, Wenqi Chen, Donghai Hong, Sirui Han, Yike Guo, Yaodong Yang
NeurIPS 2025
Safety Alignment, Robotics, Vision-Language-Action
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback
Boyuan Chen, Donghai Hong, Jiaming Ji, Jiacheng Zheng, Bowen Dong, Jiayi Zhou, Kaile Wang, ..., Qirui Zheng, Wenxin Li, Sirui Han, Yike Guo, Yaodong Yang
NeurIPS 2025 Spotlight
Safety Alignment, Robotics, Vision-Language-Action
ProgressGym: Alignment with a Millennium of Moral Progress
Tianyi Qiu, Yang Zhang, Xuchuan Huang, Jasmine Xinze Li, Jiaming Ji, Yaodong Yang
ACL 2025 Findings
Safety Alignment, Robotics, Vision-Language-Action
Reward Generalization in RLHF: A Topological Perspective
Tianyi Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, Yaodong Yang
ACL 2025 Findings
Safety Alignment, Robotics, Vision-Language-Action
Aligner: Efficient Alignment by Learning to Correct
Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, Yaodong Yang
NeurIPS 2024 Oral
AI Alignment, AI Safety, NeurIPS