Reinforcement learning

([MainPage|Back to main page])

===Conference Papers===

* __Yang Yu__ and Qing Da. ''PolicyBoost: Functional policy gradient with ranking-based reward objective''. In: '''Proceedings of AAAI Workshop on AI and Robotics (AIRob'14)''', Quebec City, Canada, 2014, pp.57-62. ([{UP}papers/airob14-pb.pdf|PDF])

* Qing Da, __Yang Yu__, and Zhi-Hua Zhou. ''Napping for functional representation of policy''. In: '''Proceedings of the 2014 International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'14)''', Paris, France, 2014, pp.189-196. ([{UP}papers/aamas14-nap.pdf|PDF])

* Qing Da, __Yang Yu__, and Zhi-Hua Zhou. ''Self-practice imitation learning from weak policy''. In: '''Proceedings of the 2nd IAPR International Workshop on Partially Supervised Learning (PSL'13)''', Nanjing, China, 2013, pp.9-20.

* Wang-Zhou Dai, __Yang Yu__, and Zhi-Hua Zhou. ''Lifted-rollout for approximate policy iteration of Markov decision process''. In: '''Proceedings of the International Workshop on Learning and Data Mining for Robotics (LEMIR'11)''', in conjunction with ICDM'11, Vancouver, Canada, 2011.