Description: This package contains the Python code for the EDO-CS algorithm [1], which finds a set of policies that have both high rewards and diverse behaviors in reinforcement learning. In each iteration, the policies are divided into several clusters based on their behaviors, and a high-quality policy is selected from each cluster for reproduction. EDO-CS also adaptively balances quality and diversity during reproduction. Experiments on continuous MuJoCo locomotion tasks from the OpenAI Gym library [2] show the superior performance of EDO-CS. README files in the package explain how to use the code.
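
The clustering-based selection step described above can be illustrated with a minimal sketch. It is not the code shipped in this package. It assumes each policy is summarized by a fixed-length behavior vector and a scalar reward, uses plain k-means as a stand-in clustering method, and takes "high-quality" to mean the highest reward within a cluster. The function names `kmeans_labels` and `select_parents` are hypothetical, and the adaptive quality-diversity balance of EDO-CS is not modeled here.

```python
import numpy as np


def kmeans_labels(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns one label per row."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points (fancy indexing copies).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old centroid if a cluster is empty.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels


def select_parents(behaviors, rewards, k, seed=0):
    """Return the index of the highest-reward policy in each non-empty cluster."""
    rewards = np.asarray(rewards, dtype=float)
    labels = kmeans_labels(behaviors, k, seed=seed)
    parents = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if len(members) > 0:
            parents.append(int(members[rewards[members].argmax()]))
    return parents


# Hypothetical population: 12 policies with 2-D behavior vectors and rewards.
demo_rng = np.random.default_rng(0)
demo_behaviors = demo_rng.normal(size=(12, 2))
demo_rewards = demo_rng.normal(size=12)
print(select_parents(demo_behaviors, demo_rewards, k=3))
```

The returned indices identify the policies that would be handed to the reproduction step. Because one parent is taken per cluster, the policy with the highest overall reward is always among them, while the remaining parents come from behaviorally different regions.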

References:
[1] Yutong Wang, Ke Xue, and Chao Qian. Evolutionary Diversity Optimization with Clustering-based Selection for Reinforcement Learning. In: Proceedings of the 10th International Conference on Learning Representations (ICLR'22), Virtual, 2022.
[2] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. OpenAI Gym. CoRR abs/1606.01540, 2016.

ATTN: This package is free for academic use. Run it at your own risk. For other purposes, please contact Dr. Chao Qian (qianc@lamda.nju.edu.cn).

Requirement: The package was developed with Python.

ATTN2: This package was developed by Ms. Yutong Wang (wangyt@lamda.nju.edu.cn). For any problem concerning the code, please feel free to contact Ms. Wang.

Download: code (1MB)