Reinforcement learning

Modified: 2015/01/19 19:09 by admin - Uncategorized

Conference Papers

Yang Yu and Qing Da. PolicyBoost: Functional policy gradient with ranking-based reward objective. In: Proceedings of AAAI Workshop on AI and Robotics (AIRob'14), Quebec City, Canada, 2014, pp.57-62.

Qing Da, Yang Yu, and Zhi-Hua Zhou. Napping for functional representation of policy. In: Proceedings of the 2014 International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'14), Paris, France, 2014, pp.189-196.

Qing Da, Yang Yu, and Zhi-Hua Zhou. Self-practice imitation learning from weak policy. In: Proceedings of the 2nd IAPR International Workshop on Partially Supervised Learning (PSL'13), Nanjing, China, 2013, pp.9-20.

Wang-Zhou Dai, Yang Yu, and Zhi-Hua Zhou. Lifted-rollout for approximate policy iteration of Markov decision process. In: Proceedings of the International Workshop on Learning and Data Mining for Robotics (LEMIR'11), in conjunction with ICDM'11, Vancouver, Canada, 2011.
