Our Mission

The PKU-Alignment Group, under the PKU Pair-Lab, is a pioneering research interest group dedicated to advancing the frontiers of AI safety and alignment. Our mission is to explore the fundamental algorithms and mechanisms that underpin AI alignment, driving both theoretical innovation and practical deployment.

We aim to ensure that AI systems remain consistently aligned with human goals. The team actively shares the latest advances in AI research while fostering the development and real-world adoption of safety and alignment practices. Our key research directions include:

  • Mechanisms and Interpretability in Alignment: Investigating whether large models can be effectively aligned, how resilient they are to misalignment, and how interpretable their alignment mechanisms are;
  • Reinforcement Learning and Post-training of Language Models: Designing more efficient and reliable post-alignment algorithms;
  • Safety Alignment and Superalignment: Addressing frontier-risk alignment challenges, such as deceptive alignment, scalable oversight, CBRN hazards, and interpretability, as well as value alignment issues, including regional value alignment and bidirectional value lock-in.

Latest News