Fan-Ming Luo
Ph.D. Student, LAMDA Group
School of Artificial Intelligence
National Key Laboratory for Novel Software Technology
Nanjing University, Nanjing 210023, China

Supervisor: Prof. Yang Yu

Email: luofm {AT}
Laboratory: Shaoyifu Building, Xianlin Campus of Nanjing University


I am currently a third-year Ph.D. student in the School of Artificial Intelligence at Nanjing University and a member of the LAMDA Group, which is led by Professor Zhi-Hua Zhou.

I received my B.Sc. degree from the School of Physics at Nanjing University in June 2019. In September 2019, I was admitted, without entrance examination, to study for an M.Sc. degree at Nanjing University under the supervision of Prof. Yang Yu. In September 2021, I began my Ph.D. studies, also under the supervision of Prof. Yang Yu.

Research Interests

My research interest is reinforcement learning. In particular, I am interested in

Preprints


Rong-Jun Qin, Fan-Ming Luo, Hong Qian, Yang Yu. Unified Policy Optimization for Continuous-action Reinforcement Learning in Non-stationary Tasks and Games, CoRR abs/2208.09452, 2022. [paper]

Fan-Ming Luo, Xingchen Cao, Yang Yu. Transferable Reward Learning by Dynamics-Agnostic Discriminator Ensemble, CoRR abs/2206.00238, 2022. [paper]

Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu. Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning, CoRR abs/2310.05422, 2023. [paper]

Conference Papers

Fan-Ming Luo, Shengyi Jiang, Yang Yu, Zongzhang Zhang, Yi-Feng Zhang. Adapt to Environment Sudden Changes by Learning a Context Sensitive Policy. In: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI'22), Virtual Event, 2022. [paper] (Oral Presentation)

Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Tony Qin, Shang Wenjie, Jieping Ye. Offline Model-based Adaptable Policy Learning. In: Advances in Neural Information Processing Systems 34 (NeurIPS'21), Virtual Event, 2021. [paper]

Journal Papers

Fan-Ming Luo, Tian Xu, Hang Lai, Xiong-Hui Chen, Weinan Zhang, Yang Yu. A Survey on Model-based Reinforcement Learning, SCIENCE CHINA Information Sciences, in press. [paper]

Xiong-Hui Chen, Fan-Ming Luo, Yang Yu, Qingyang Li, Zhiwei Tony Qin, Shang Wenjie, Jieping Ye. Offline Model-Based Adaptable Policy Learning for Decision-Making in Out-of-Support Regions, IEEE Transactions on Pattern Analysis and Machine Intelligence, in press. [paper]

Yi-Feng Zhang, Fan-Ming Luo, Yang Yu. Improve Generated Adversarial Imitation Learning with Reward Variance Regularization. Machine Learning, 2022. [paper]

Teaching Assistant

  • Introduction to Reinforcement Learning (with Prof. Yang Yu; for both undergraduate and graduate students), Fall, 2021.

Awards & Honors

  • National Scholarship for Doctoral Students 2021 (博士国家奖学金 2021)
  • LAMDA Elite Award 2021 (LAMDA英才奖 2021)
  • Silver Award in the 7th "Internet+" Innovation and Entrepreneurship Competition (“互联网+”大学生创新创业大赛银奖 2021)
  • Second Place in KDD CUP 2020 Track 4
  • Meritorious Winner in MCM 2018

Address: National Key Laboratory for Novel Software Technology, Nanjing University, Xianlin Campus Mailbox 603, 163 Xianlin Avenue, Qixia District, Nanjing 210023, China
(南京市栖霞区仙林大道163号, 南京大学仙林校区603信箱, 软件新技术国家重点实验室, 210023.)