Description
: This package includes the Python code of the EDO-CS algorithm [1] for finding a set of policies that have both high rewards and diverse behaviors in reinforcement learning. In each iteration, the policies are divided into several clusters based on their behaviors, and a high-quality policy is selected from each cluster for reproduction. EDO-CS also adaptively balances the importance of quality and diversity in the reproduction process. Experiments on continuous MuJoCo locomotion tasks from the OpenAI Gym library [2] show the superior performance of EDO-CS. README files are included in the package, showing how to use the code.
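The clustering-based selection step described above can be illustrated with a small sketch. This is not the code shipped in the package: the function names (`kmeans`, `select_parents`), the use of plain k-means as the clustering method, and the toy behavior vectors and rewards are all assumptions made for illustration. Please refer to the paper [1] and the README files in the package for the actual procedure.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed clustering method, for illustration).

    Returns one cluster label per row of `points`, taken from the last
    assignment step.
    """
    rng = np.random.default_rng(seed)
    # Initialise the centres with k distinct data points (fancy indexing copies).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every centre, shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:  # keep the old centre if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels

def select_parents(behaviors, rewards, k, seed=0):
    """Cluster policies by behavior, then pick the highest-reward policy
    in each non-empty cluster. Returns the indices of the selected policies."""
    labels = kmeans(behaviors, k, seed=seed)
    parents = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        parents.append(int(idx[rewards[idx].argmax()]))
    return parents

# Hypothetical toy data: six policies, each with a 2-D behavior descriptor.
behaviors = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                      [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
rewards = np.array([1.0, 5.0, 2.0, 7.0, 3.0, 4.0])

parents = select_parents(behaviors, rewards, k=2)
print(sorted(parents))
```

Selecting one parent per cluster, rather than the top policies overall, keeps each behavioral region represented in the next generation even when one region dominates in reward.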
References: [1] Yutong Wang, Ke Xue, and Chao Qian. Evolutionary Diversity Optimization with Clustering-based Selection for Reinforcement Learning. In: Proceedings of the 10th International Conference on Learning Representations (ICLR'22), Virtual, 2022.
[2] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. OpenAI Gym. CoRR abs/1606.01540, 2016.
ATTN: This package is free for academic usage. You can run it at your own risk. For other purposes, please contact Dr. Chao Qian (qianc@lamda.nju.edu.cn).
Requirement: The package was developed with Python.
ATTN2: This package was developed by Ms. Yutong Wang (wangyt@lamda.nju.edu.cn). For any problem concerning the code, please feel free to contact Ms. Wang.
Download:
code (1MB)