Reinforcement learning
Modified: 2015/01/19 19:09 by admin
Conference Papers
Yang Yu and Qing Da. PolicyBoost: Functional policy gradient with ranking-based reward objective. In: Proceedings of AAAI Workshop on AI and Robotics (AIRob'14), Quebec City, Canada, 2014, pp.57-62.
Qing Da, Yang Yu, and Zhi-Hua Zhou. Napping for functional representation of policy. In: Proceedings of the 2014 International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'14), Paris, France, 2014, pp.189-196.
Qing Da, Yang Yu, and Zhi-Hua Zhou. Self-practice imitation learning from weak policy. In: Proceedings of the 2nd IAPR International Workshop on Partially Supervised Learning (PSL'13), Nanjing, China, 2013, pp.9-20.
Wang-Zhou Dai, Yang Yu, and Zhi-Hua Zhou. Lifted-rollout for approximate policy iteration of Markov decision process. In: Proceedings of the International Workshop on Learning and Data Mining for Robotics (LEMIR'11), in conjunction with ICDM'11, Vancouver, Canada, 2011.