PKU-Alignment Group @Pair-Lab (under construction)
Paper-Conference
Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
Jiaming Ji, Xinyu Chen, Rui Pan, Conghui Zhang, Han Zhu, Jiahao Li, Donghai Hong, Boyuan Chen, Jiayi Zhou, Kaile Wang, Juntao Dai, Chi-Min Chan, Yida Tang, Sirui Han, Yike Guo, Yaodong Yang
NeurIPS 2025
Safety Alignment, Robotics, Vision-Language-Action
PDF
SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning
Borong Zhang, Yuhao Zhang, Jiaming Ji, Yingshan Lei, Josef Dai, Yuanpei Chen, Yaodong Yang
NeurIPS 2025 Spotlight
Safety Alignment, Robotics, Vision-Language-Action
PDF
Generative RLHF-V: Learning Principles from Multi-modal Human Preference
Jiayi Zhou, Jiaming Ji, Boyuan Chen, Jiapeng Sun, Wenqi Chen, Donghai Hong, Sirui Han, Yike Guo, Yaodong Yang
NeurIPS 2025
Safety Alignment, Robotics, Vision-Language-Action
PDF
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback
Boyuan Chen, Donghai Hong, Jiaming Ji, Jiacheng Zheng, Bowen Dong, Jiayi Zhou, Kaile Wang, Juntao Dai, Xuyao Wang, Wenqi Chen, Qirui Zheng, Wenxin Li, Sirui Han, Yike Guo, Yaodong Yang
NeurIPS 2025 Spotlight
Safety Alignment, Robotics, Vision-Language-Action
PDF
ProgressGym: Alignment with a Millennium of Moral Progress
Tianyi Qiu, Yang Zhang, Xuchuan Huang, Jasmine Xinze Li, Jiaming Ji, Yaodong Yang
ACL 2025 Findings
Safety Alignment, Robotics, Vision-Language-Action
PDF
Reward Generalization in RLHF: A Topological Perspective
Tianyi Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, Yaodong Yang
ACL 2025 Findings
Safety Alignment, Robotics, Vision-Language-Action
PDF
SAE-V: Interpreting Multimodal Models for Enhanced Alignment
Hantao Lou, Changye Li, Jiaming Ji, Yaodong Yang
ICML 2025
AI Alignment, Multimodal Models
PDF
Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction
Hantao Lou, Jiaming Ji, Kaile Wang, Yaodong Yang
AAAI 2025
AI Alignment
PDF
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback
Jiayi Zhou, Jiaming Ji, Juntao Dai, Yaodong Yang
AAAI 2025 Oral
AI Alignment
PDF
Aligner: Efficient Alignment by Learning to Correct
Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, Yaodong Yang
NeurIPS 2024 Oral
AI Alignment, AI Safety, NeurIPS
PDF
Code
Dataset