Peng-Yuan Wang (王鹏远)
Currently, I am a first-year Ph.D. student at the School of Artificial Intelligence, Nanjing University, and a member of the LAMDA Group led by Professor Zhi-Hua Zhou.
I received my B.Sc. degree from the School of Computer Science, Northwestern Polytechnical University (NPU), China, in June 2022. In September 2022, I was admitted to pursue an M.Sc. degree at Nanjing University under the supervision of Professor Yang Yu.
My research focuses on algorithm design in reinforcement learning. Currently, I am working on large language models and reinforcement learning.
Feel free to contact me if you would like to discuss ideas.
Peng-Yuan Wang*, Tian-Shuo Liu*, Chenyang Wang, Ziniu Li, Yi-Di Wang, Shu Yan, Cheng-Xing Jia, Xu-Hui Liu, Xin-Wei Chen, Jia-Cheng Xu, Yang Yu. A Survey on Large Language Models for Mathematical Reasoning. arXiv:2506.08446 [PDF]
Chengxing Jia, Peng-Yuan Wang, Ziniu Li, Yi-Chen Li, Nan Tang, Yang Yu. BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation. arXiv:2405.17039 [PDF]
Zhilong Zhang*, Ruifeng Chen*, Junyin Ye*, Yihao Sun, Peng-Yuan Wang, Jingcheng Pang, Kaiyuan Li, Tianshuo Liu, Haoxin Lin, Yang Yu, Zhi-Hua Zhou. WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making. arXiv:2411.05619 [PDF]
Peng-Yuan Wang*, Jing-Cheng Pang*, Chen-Yang Wang*, Xuhui Liu, Tian-Shuo Liu, Si-Hang Yang, Hong Qian, Yang Yu. InCLET: Large Language Model In-context Learning can Improve Embodied Instruction-following. In Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS'25), 2025 [PDF]
Chengxing Jia, Ziniu Li, Peng-Yuan Wang, Yi-Chen Li, Zhenyu Hou, Yuxiao Dong, Yang Yu. Controlling Large Language Model with Latent Actions. In Proceedings of the 42nd International Conference on Machine Learning (ICML'25), 2025 [PDF]
Tian-Shuo Liu, Xu-Hui Liu, Ruifeng Chen, Lixuan Jin, Peng-Yuan Wang, Zhilong Zhang, Yang Yu. Semantic Skill Extraction via Vision-Language Model Guidance for Efficient Reinforcement Learning. In Proceedings of the 42nd International Conference on Machine Learning (ICML'25), 2025 [PDF]
Jing-Cheng Pang*, Peng-Yuan Wang*, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu. Language model self-improvement by reinforcement learning contemplation. In Proceedings of the 12th International Conference on Learning Representations (ICLR'24), 2024 [PDF]
Jing-Cheng Pang*, Heng-Bo Fan*, Peng-Yuan Wang*, Jia-Hao Xiao*, Nan Tang, Si-Hang Yang, Chengxing Jia, Sheng-Jun Huang, Yang Yu. Interactive Large Language Models for Reliable Answering under Incomplete Context. Transactions on Machine Learning Research (TMLR), 2025 [PDF]
* Equal Contribution
Email:
wangpy@lamda.nju.edu.cn
Laboratory:
Shaoyifu Building, Xianlin Campus of Nanjing University
Address:
Peng-Yuan Wang, National Key Laboratory for Novel Software Technology, Nanjing University, Xianlin Campus Mailbox 603, 163 Xianlin Avenue, Qixia District, Nanjing 210023, China
(南京市栖霞区仙林大道163号, 南京大学仙林校区603信箱, 软件新技术国家重点实验室, 210023.)