Yu-Cen Wang @ LAMDA, NJU-AI

汪钰岑
Yu-Cen Wang

Ph.D. Student
LAMDA Group
School of Artificial Intelligence
National Key Laboratory for Novel Software Technology
Nanjing University, Nanjing 210023, China

Email: wangyc@lamda.nju.edu.cn


Short Biography

Main Research Interests

My research interests include Machine Learning and Data Mining, especially Reinforcement Learning and World Models.

Publications

  • Yucen Wang*, Shenghua Wan*, Le Gan, Shuai Feng, De-Chuan Zhan. AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors. In Proceedings of the 41st International Conference on Machine Learning (ICML-2024), Vienna, Austria, 2024. [Paper] [Code] [Website]

  • We propose the Implicit Action Generator (IAG) to learn the implicit actions of visual distractors, and present AD3, which leverages the actions inferred by IAG to train separated world models. Implicit actions help distinguish task-irrelevant components, so the agent can optimize its policy in the task-relevant space. AD3 achieves superior performance on various visual control tasks featuring both heterogeneous and homogeneous distractors.

  • Shenghua Wan, Yucen Wang, Minghao Shao, Ruying Chen, De-Chuan Zhan. SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models. In Proceedings of the 40th International Conference on Machine Learning (ICML-2023), Honolulu, Hawaii, USA, 2023. [Paper] [Code]

  • Existing model-based imitation learning algorithms are easily misled by task-irrelevant information, especially moving distractors in videos. To tackle this problem, we propose SeMAIL, which decouples the environment dynamics into two parts according to task-relevant dependency, as determined by agent actions, and trains the two parts separately. Our method achieves near-expert performance on various visual imitation tasks with complex observations.

Preprints

  • Yucen Wang, Rui Yu, Shenghua Wan, Le Gan, De-Chuan Zhan. FOUNDER: Grounding Foundation Models in World Models for Open-Ended Embodied Decision Making. [Paper] [Code]

  • We propose FOUNDER, a framework that integrates the generalizable knowledge embedded in foundation models (FMs) with the dynamics modeling capabilities of world models (WMs) to enable open-ended decision making in embodied environments in a reward-free manner. FOUNDER demonstrates superior performance on various multi-task offline visual control benchmarks. It excels at capturing the deep-level semantics of tasks specified by text or videos, particularly in scenarios involving complex observations or domain gaps where prior methods struggle.

  • Rui Yu, Shenghua Wan, Yucen Wang, Chen-Xiao Gao, Le Gan, Zongzhang Zhang, De-Chuan Zhan. Reward Models in Deep Reinforcement Learning: A Survey. [Paper] [Code]

  • In this survey, we provide a comprehensive review of reward modeling techniques in the RL literature. We present an overview of recent reward modeling approaches, categorizing them by the source, the mechanism, and the reward learning paradigm. The survey covers both established and emerging methods, filling the gap left by the lack of a systematic review of reward models in the current literature.

Awards & Honors

LAMDA Excellent Student Award, 2022.

Teaching Assistant

Advanced Machine Learning. (For undergraduate students, Autumn, 2023)

Computing Methods. (For undergraduate students, Spring, 2024)

Correspondence

Email: wangyc@lamda.nju.edu.cn
Office: Room 201, Yi-fu Building, Xianlin Campus of Nanjing University
Address: National Key Laboratory for Novel Software Technology
                 Nanjing University, Xianlin Campus
                 163 Xianlin Avenue, Qixia District, Nanjing 210023, China