List of papers
252 papers are listed.
You can click on each title to display more information, including the authors, a URL to the PDF, the abstract, and the BibTeX entry.
2024: 6 Papers
-
Bayesian Illumination: Inference and Quality-Diversity Accelerate Generative Molecular Models
Authors:
- Jonas Verhellen
Abstract:
In recent years, there have been considerable academic and industrial research efforts to develop novel generative models for high-performing, small molecules. Traditional, rules-based algorithms such as genetic algorithms [Jensen, Chem. Sci., 2019, 12, 3567-3572] have, however, been shown to rival deep learning approaches in terms of both efficiency and potency. In previous work, we showed that the addition of a quality-diversity archive to a genetic algorithm resolves stagnation issues and substantially increases search efficiency [Verhellen, Chem. Sci., 2020, 42, 11485-11491]. In this work, we expand on these insights and leverage the availability of bespoke kernels for small molecules [Griffiths, Adv. Neural. Inf. Process. Syst., 2024, 36] to integrate Bayesian optimisation into the quality-diversity process. This novel generative model, which we call Bayesian Illumination, produces a larger diversity of high-performing molecules than standard quality-diversity optimisation methods. In addition, we show that Bayesian Illumination further improves search efficiency compared to previous generative models for small molecules, including deep learning approaches, genetic algorithms, and standard quality-diversity methods.
Links:
Bibtex:
@article{Verhellen2024Bayesian, title={Bayesian Illumination: Inference and Quality-Diversity Accelerate Generative Molecular Models}, author={Verhellen, Jonas}, year={2024} }
-
Multi-Objective Quality-Diversity for Crystal Structure Prediction
Authors:
- Hannah Janmohamed
- Marta Wolinska
- Shikha Surana
- Thomas Pierrot
- Aron Walsh
- Antoine Cully
Abstract:
Crystal structures are indispensable across various domains, from batteries to solar cells, and extensive research has been dedicated to predicting their properties based on their atomic configurations. However, prevailing Crystal Structure Prediction methods focus on identifying the most stable solutions that lie at the global minimum of the energy function. This approach overlooks other potentially interesting materials that lie in neighbouring local minima and have different material properties such as conductivity or resistance to deformation. By contrast, Quality-Diversity algorithms provide a promising avenue for Crystal Structure Prediction as they aim to find a collection of high-performing solutions that have diverse characteristics. However, it may also be valuable to optimise for the stability of crystal structures alongside other objectives such as magnetism or thermoelectric efficiency. Therefore, in this work, we harness the power of Multi-Objective Quality-Diversity algorithms in order to find crystal structures which have diverse features and achieve different trade-offs of objectives. We analyse our approach on 5 crystal systems and demonstrate that it is not only able to re-discover known real-life structures, but also find promising new ones. Moreover, we propose a method for illuminating the objective space to gain an understanding of what trade-offs can be achieved.
Links:
Bibtex:
@article{janmohamed2024multi, title={Multi-Objective Quality-Diversity for Crystal Structure Prediction}, author={Janmohamed, Hannah and Wolinska, Marta and Surana, Shikha and Pierrot, Thomas and Walsh, Aron and Cully, Antoine}, journal={arXiv preprint arXiv:2403.17164}, year={2024}}
-
Quality with Just Enough Diversity in Evolutionary Policy Search
Authors:
- Paul Templier
- Luca Grillotti
- Emmanuel Rachelson
- Dennis G. Wilson
- Antoine Cully
Abstract:
Evolution Strategies (ES) are effective gradient-free optimization methods that can be competitive with gradient-based approaches for policy search. ES only rely on the total episodic scores of solutions in their population, from which they estimate fitness gradients for their update with no access to true gradient information. However, this makes them sensitive to deceptive fitness landscapes, and they tend to only explore one way to solve a problem. Quality-Diversity methods such as MAP-Elites introduced additional information with behavior descriptors (BD) to return a population of diverse solutions, which helps exploration but leads to a large part of the evaluation budget not being focused on finding the best performing solution. Here we show that behavior information can also be leveraged to find the best policy by identifying promising search areas which can then be efficiently explored with ES. We introduce the framework of Quality with Just Enough Diversity (JEDi) which learns the relationship between behavior and fitness to focus evaluations on solutions that matter. When trying to reach higher fitness values, JEDi outperforms both QD and ES methods on hard exploration tasks like mazes and on complex control problems with large policies.
Links:
Bibtex:
@inproceedings{Templier_2024, series={GECCO ’24}, title={Quality with Just Enough Diversity in Evolutionary Policy Search}, url={http://dx.doi.org/10.1145/3638529.3654047}, DOI={10.1145/3638529.3654047}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference}, publisher={ACM}, author={Templier, Paul and Grillotti, Luca and Rachelson, Emmanuel and Wilson, Dennis and Cully, Antoine}, year={2024}, month=jul, pages={105–113}, collection={GECCO ’24} }
-
Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics
Authors:
- Luca Grillotti
- Maxence Faldor
- Borja G. León
- Antoine Cully
Abstract:
A key aspect of intelligence is the ability to demonstrate a broad spectrum of behaviors for adapting to unexpected situations. Over the past decade, advancements in deep reinforcement learning have led to groundbreaking achievements to solve complex continuous control tasks. However, most approaches return only one solution specialized for a specific problem. We introduce Quality-Diversity Actor-Critic (QDAC), an off-policy actor-critic deep reinforcement learning algorithm that leverages a value function critic and a successor features critic to learn high-performing and diverse behaviors. In this framework, the actor optimizes an objective that seamlessly unifies both critics using constrained optimization to (1) maximize return, while (2) executing diverse skills. Compared with other Quality-Diversity methods, QDAC achieves significantly higher performance and more diverse behaviors on six challenging continuous control locomotion tasks. We also demonstrate that we can harness the learned skills to adapt better than other baselines to five perturbed environments. Finally, qualitative analyses showcase a range of remarkable behaviors: adaptive-intelligent-robotics.github.io/QDAC.
Links:
Bibtex:
@article{Grillotti2024Quality-Diversity, title={Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics}, author={Grillotti, Luca and Faldor, Maxence and G. León, Borja and Cully, Antoine}, journal={arXiv preprint arXiv:2403.09930v3}, year={2024} }
-
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Authors:
- Mikayel Samvelyan
- Sharath Chandra Raparthy
- Andrei Lupu
- Eric Hambro
- Aram H. Markosyan
- Manish Bhatt
- Yuning Mao
- Minqi Jiang
- Jack Parker-Holder
- Jakob Foerster
- Tim Rocktäschel
- Roberta Raileanu
Abstract:
As large language models (LLMs) become increasingly prevalent across many real-world applications, understanding and enhancing their robustness to user inputs is of paramount importance. Existing methods for identifying adversarial prompts tend to focus on specific domains, lack diversity, or require extensive human annotations. To address these limitations, we present Rainbow Teaming, a novel approach for producing a diverse collection of adversarial prompts. Rainbow Teaming casts adversarial prompt generation as a quality-diversity problem, and uses open-ended search to generate prompts that are both effective and diverse. It can uncover a model's vulnerabilities across a broad range of domains including, in this paper, safety, question answering, and cybersecurity. We also demonstrate that fine-tuning on synthetic data generated by Rainbow Teaming improves the safety of state-of-the-art LLMs without hurting their general capabilities and helpfulness, paving the path to open-ended self-improvement.
Links:
Bibtex:
@article{Samvelyan2024Rainbow, title={Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts}, author={Samvelyan, Mikayel and Chandra Raparthy, Sharath and Lupu, Andrei and Hambro, Eric and H. Markosyan, Aram and Bhatt, Manish and Mao, Yuning and Jiang, Minqi and Parker-Holder, Jack and Foerster, Jakob and Rocktäschel, Tim and Raileanu, Roberta}, journal={arXiv preprint arXiv:2402.16822v1}, year={2024} }
-
Speeding up 6-DoF Grasp Sampling with Quality-Diversity
Authors:
- Johann Huber
- François Hélénon
- Mathilde Kappel
- Elie Chelly
- Mahdi Khoramshahi
- Faïz Ben Amar
- Stéphane Doncieux
Abstract:
Recent advances in AI have led to significant results in robotic learning, including natural language-conditioned planning and efficient optimization of controllers using generative models. However, the interaction data remains the bottleneck for generalization. Getting data for grasping is a critical challenge, as this skill is required to complete many manipulation tasks. Quality-Diversity (QD) algorithms optimize a set of solutions to get diverse, high-performing solutions to a given problem. This paper investigates how QD can be combined with priors to speed up the generation of diverse grasp poses in simulation compared to standard 6-DoF grasp sampling schemes. Experiments conducted on 4 grippers with 2-to-5 fingers on standard objects show that QD outperforms commonly used methods by a large margin. Further experiments show that QD optimization automatically finds some efficient priors that are usually hard coded. The deployment of generated grasps on a 2-finger gripper and an Allegro hand shows that the diversity produced maintains sim-to-real transferability. We believe these results to be a significant step toward the generation of large datasets that can lead to robust and generalizing robotic grasping policies.
Links:
Bibtex:
@article{Huber2024Speeding, title={Speeding up 6-DoF Grasp Sampling with Quality-Diversity}, author={Huber, Johann and Hélénon, François and Kappel, Mathilde and Chelly, Elie and Khoramshahi, Mahdi and Ben Amar, Faïz and Doncieux, Stéphane}, journal={arXiv preprint arXiv:2403.06173v1}, year={2024} }
2023: 51 Papers
-
Automated Test Suite Generation for Software Product Lines based on Quality-Diversity Optimisation
Authors:
- Yi Xiang
- Han Huang
- Sizhe Li
- Miqing Li
- Chuan Luo
- Xiaowei Yang
Abstract:
A Software Product Line (SPL) is a set of software products that are built from a variability model. Real-world SPLs typically involve a vast number of valid products, making it impossible to individually test each of them. This arises the need for automated test suite generation, which was previously modeled as either a single-objective or a multi-objective optimisation problem considering only objective functions. This article provides a completely different mathematical model by exploiting the benefits of Quality-Diversity (QD) optimisation that is composed of not only an objective function (e.g., t-wise coverage or test suite diversity) but also a user-defined behavior space (e.g., the space with test suite size as its dimension). We argue that the new model is more suitable and generic than the two alternatives because it provides at a time a large set of diverse (measured in the behavior space) and high-performing solutions that can ease the decision-making process. We apply MAP-Elites, one of the most popular QD algorithms, to solve the model. The results of the evaluation, on both realistic and artificial SPLs, are promising, with MAP-Elites significantly and substantially outperforming both single- and multi-objective approaches, and also several state-of-the-art SPL testing tools. In summary, this paper provides a new and promising perspective on the test suite generation for SPLs.
Bibtex:
"@article{10.1145/3628158, author = {Xiang, Yi and Huang, Han and Li, Sizhe and Li, Miqing and Luo, Chuan and Yang, Xiaowei}, title = {Automated Test Suite Generation for Software Product Lines Based on Quality-Diversity Optimisation}, year = {2023}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, issn = {1049-331X}, url = {https://doi.org/10.1145/3628158}, doi = {10.1145/3628158}, note = {Just Accepted}, journal = {ACM Trans. Softw. Eng. Methodol.}, month = {oct}, keywords = {automated test suite generation, quality-diversity (QD) optimisation, Software product line} } "
-
BOP-Elites, a Bayesian Optimisation Approach to Quality Diversity Search with Black-Box descriptor functions
Authors:
- Paul Kent
- Adam Gaier
- Jean-Baptiste Mouret
- Juergen Branke
Abstract:
Quality Diversity (QD) algorithms such as MAP-Elites are a class of optimisation techniques that attempt to find many high performing points that all behave differently according to a user-defined behavioural metric. In this paper we propose the Bayesian Optimisation of Elites (BOP-Elites) algorithm. Designed for problems with expensive black-box fitness and behaviour functions, it is able to return a QD solution-set with excellent final performance already after a relatively small number of samples. BOP-Elites models both fitness and behavioural descriptors with Gaussian Process (GP) surrogate models and uses Bayesian Optimisation (BO) strategies for choosing points to evaluate in order to solve the quality-diversity problem. In addition, BOP-Elites produces high quality surrogate models which can be used after convergence to predict solutions with any behaviour in a continuous range. An empirical comparison shows that BOP-Elites significantly outperforms other state-of-the-art algorithms without the need for problem-specific parameter tuning.
Links:
Bibtex:
@article{Kent2023BOP-Elites, title={BOP-Elites, a Bayesian Optimisation Approach to Quality Diversity Search with Black-Box descriptor functions}, author={Kent, Paul and Gaier, Adam and Mouret, Jean-Baptiste and Branke, Juergen}, journal={arXiv preprint arXiv:2307.09326v2}, year={2023} }
-
Bayesian Quality Diversity Search with Interactive Illumination
Authors:
- Paul Kent
- Juergen Branke
Links:
Bibtex:
@inproceedings{Kent_2023, doi = {10.1145/3583131.3590486}, url = {https://doi.org/10.1145%2F3583131.3590486}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Paul Kent and Juergen Branke}, title = {Bayesian Quality Diversity Search with Interactive Illumination}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
Bayesian Quality-Diversity approaches for constrained optimization problems with mixed continuous, discrete and categorical variables
Authors:
- Loic Brevault
- Mathieu Balesdent
Abstract:
Complex system design problems, such as those involved in aerospace engineering, require the use of numerically costly simulation codes in order to predict the performance of the system to be designed. In this context, these codes are often embedded into an optimization process to provide the best design while satisfying the design constraints. Recently, new approaches, called Quality-Diversity, have been proposed in order to enhance the exploration of the design space and to provide a set of optimal diversified solutions with respect to some feature functions. These functions are interesting to assess trade-offs. Furthermore, complex design problems often involve mixed continuous, discrete, and categorical design variables allowing to take into account technological choices in the optimization problem. Existing Bayesian Quality-Diversity approaches suited for intensive high-fidelity simulations are not adapted to mixed variables constrained optimization problems. In order to overcome these limitations, a new Quality-Diversity methodology based on mixed variables Bayesian optimization strategy is proposed in the context of limited simulation budget. Using adapted covariance models and dedicated enrichment strategy for the Gaussian processes in Bayesian optimization, this approach allows to reduce the computational cost up to two orders of magnitude, with respect to classical Quality-Diversity approaches while dealing with discrete choices and the presence of constraints. The performance of the proposed method is assessed on a benchmark of analytical problems as well as on two aerospace system design problems highlighting its efficiency in terms of speed of convergence. The proposed approach provides valuable trade-offs for decision-makers for complex system design.
Links:
Bibtex:
@article{Brevault_2024, title={Bayesian Quality-Diversity approaches for constrained optimization problems with mixed continuous, discrete and categorical variables}, volume={133}, ISSN={0952-1976}, url={http://dx.doi.org/10.1016/j.engappai.2024.108118}, DOI={10.1016/j.engappai.2024.108118}, journal={Engineering Applications of Artificial Intelligence}, publisher={Elsevier BV}, author={Brevault, Loïc and Balesdent, Mathieu}, year={2024}, month=jul, pages={108118} }
-
Benchmark tasks for Quality-Diversity applied to Uncertain domains
Authors:
- Manon Flageat
- Luca Grillotti
- Antoine Cully
Abstract:
While standard approaches to optimisation focus on producing a single high-performing solution, Quality-Diversity (QD) algorithms allow large diverse collections of such solutions to be found. If QD has proven promising across a large variety of domains, it still struggles when faced with uncertain domains, where quantification of performance and diversity are non-deterministic. Previous work in Uncertain Quality-Diversity (UQD) has proposed methods and metrics designed for such uncertain domains. In this paper, we propose a first set of benchmark tasks to analyse and estimate the performance of UQD algorithms. We identify the key uncertainty properties to easily define UQD benchmark tasks: the uncertainty location, the type of distribution and its parameters. By varying the nature of those key UQD components, we introduce a set of 8 easy-to-implement and lightweight tasks, split into 3 main categories. All our tasks build on the Redundant Arm: a common QD environment that is lightweight and easily replicable. Each one of these tasks highlights one specific limitation that arises when considering UQD domains. With this first benchmark, we hope to facilitate later advances in UQD.
Links:
Bibtex:
@article{Flageat2023Benchmark, title={Benchmark tasks for Quality-Diversity applied to Uncertain domains}, author={Flageat, Manon and Grillotti, Luca and Cully, Antoine}, journal={arXiv preprint arXiv:2304.12454v2}, year={2023} }
-
Beyond Playing to Win: Creating a Team of Agents with Distinct Behaviours for Automated Gameplay
Authors:
- Cristina Guerrero-Romero
- Simon Lucas
- Diego Perez-Liebana
Links:
Bibtex:
@article{Guerrero_Romero_2023, doi = {10.1109/tg.2023.3241864}, url = {https://doi.org/10.1109%2Ftg.2023.3241864}, year = 2023, publisher = {Institute of Electrical and Electronics Engineers ({IEEE})}, pages = {1--14}, author = {Cristina Guerrero-Romero and Simon Lucas and Diego Perez-Liebana}, title = {Beyond Playing to Win: Creating a Team of Agents with Distinct Behaviours for Automated Gameplay}, journal = {{IEEE} Transactions on Games} }
-
CHSEL: Producing Diverse Plausible Pose Estimates from Contact and Free Space Data
Authors:
- Sheng Zhong
- Nima Fazeli
- Dmitry Berenson
Abstract:
This paper proposes a novel method for estimating the set of plausible poses of a rigid object from a set of points with volumetric information, such as whether each point is in free space or on the surface of the object. In particular, we study how pose can be estimated from force and tactile data arising from contact. Using data derived from contact is challenging because it is inherently less information-dense than visual data, and thus the pose estimation problem is severely under-constrained when there are few contacts. Rather than attempting to estimate the true pose of the object, which is not tractable without a large number of contacts, we seek to estimate a plausible set of poses which obey the constraints imposed by the sensor data. Existing methods struggle to estimate this set because they are either designed for single pose estimates or require informative priors to be effective. Our approach to this problem, Constrained pose Hypothesis Set Elimination (CHSEL), has three key attributes: 1) It considers volumetric information, which allows us to account for known free space; 2) It uses a novel differentiable volumetric cost function to take advantage of powerful gradient-based optimization tools; and 3) It uses methods from the Quality Diversity (QD) optimization literature to produce a diverse set of high-quality poses. To our knowledge, QD methods have not been used previously for pose registration. We also show how to update our plausible pose estimates online as more data is gathered by the robot. Our experiments suggest that CHSEL shows large performance improvements over several baseline methods for both simulated and real-world data.
Links:
Bibtex:
@inproceedings{Zhong_2023, doi = {10.15607/rss.2023.xix.077}, url = {https://doi.org/10.15607%2Frss.2023.xix.077}, year = 2023, month = {jul}, publisher = {Robotics: Science and Systems Foundation}, author = {Sheng Zhong and Dmitry Berenson and Nima Fazeli}, title = {{CHSEL}: Producing Diverse Plausible Pose Estimates from Contact and Free Space Data}, booktitle = {Robotics: Science and Systems {XIX}} }
-
Can the Problem-Solving Benefits of Quality Diversity Be Obtained Without Explicit Diversity Maintenance?
Authors:
- Ryan Boldi
- Lee Spector
Abstract:
When using Quality Diversity (QD) optimization to solve hard exploration or deceptive search problems, we assume that diversity is extrinsically valuable. This means that diversity is important to help us reach an objective, but is not an objective in itself. Often, in these domains, practitioners benchmark their QD algorithms against single objective optimization frameworks. In this paper, we argue that the correct comparison should be made to multi-objective optimization frameworks. This is because single objective optimization frameworks rely on the aggregation of sub-objectives, which could result in decreased information that is crucial for maintaining diverse populations automatically. In order to facilitate a fair comparison between quality diversity and multi-objective optimization, we present a method that utilizes dimensionality reduction to automatically determine a set of behavioral descriptors for an individual, as well as a set of objectives for an individual to solve. Using the former, one can generate solutions using standard quality diversity optimization techniques, and using the latter, one can generate solutions using standard multi-objective optimization techniques. This allows for a level comparison between these two classes of algorithms, without requiring domain and algorithm specific modifications to facilitate a comparison.
Links:
Bibtex:
"@inproceedings{boldi2023QDBench, title = {Can the Problem-Solving Benefits of Quality Diversity Be Obtained Without Explicit Diversity Maintenance?}, year = {2023}, isbn = {9798400701207}, author = {Boldi, Ryan and Spector, Lee}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference Companion}, series = {GECCO '23}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3583133.3596336}, doi = {10.1145/3583133.3596336}, numpages={6}, location={Lisbon, Portugal} } "
-
Controllable Exploration of a Design Space via Interactive Quality Diversity
Authors:
- Konstantinos Sfikas
- Antonios Liapis
- Georgios N. Yannakakis
Abstract:
This paper introduces a user-driven evolutionary algorithm based on Quality Diversity (QD) search. During a design session, the user iteratively selects among presented alternatives and their selections affect the upcoming results. We aim to address two major concerns of interactive evolution: (a) the user must be presented with few alternatives, to reduce cognitive load; (b) presented alternatives should be diverse but similar to the previous user selection, to reduce user fatigue. To address these concerns, we implement a variation of the MAP-Elites algorithm where the presented alternatives are sampled from a small region (window) of the behavioral space. After a user selection, the window is centered on the selected individual's behavior characterization, evolution selects parents from within this window to produce offspring, and new alternatives are sampled. Essentially we define an adaptive system of local QD, where the user's selections guide the search towards specific regions of the behavioral space. The system is tested on the generation of architectural layouts, a constrained optimization task, leveraging QD through a two-archive approach. Results show that while global exploration is not as pronounced as in MAP-Elites, the system finds more appropriate solutions to the user's taste, based on experiments with controllable artificial users.
Links:
Bibtex:
@inproceedings{Sfikas_2023, doi = {10.1145/3583133.3590616}, url = {https://doi.org/10.1145%2F3583133.3590616}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Konstantinos Sfikas and Antonios Liapis and Georgios N. Yannakakis}, title = {Controllable Exploration of a Design Space via Interactive Quality Diversity}, booktitle = {Proceedings of the Companion Conference on Genetic and Evolutionary Computation} }
-
Covariance Matrix Adaptation MAP-Annealing
Authors:
- Matthew C. Fontaine
- Stefanos Nikolaidis
Abstract:
Single-objective optimization algorithms search for the single highest-quality solution with respect to an objective. Quality diversity (QD) optimization algorithms, such as Covariance Matrix Adaptation MAP-Elites (CMA-ME), search for a collection of solutions that are both high-quality with respect to an objective and diverse with respect to specified measure functions. However, CMA-ME suffers from three major limitations highlighted by the QD community: prematurely abandoning the objective in favor of exploration, struggling to explore flat objectives, and having poor performance for low-resolution archives. We propose a new quality diversity algorithm, Covariance Matrix Adaptation MAP-Annealing (CMA-MAE), that addresses all three limitations. We provide theoretical justifications for the new algorithm with respect to each limitation. Our theory informs our experiments, which support the theory and show that CMA-MAE achieves state-of-the-art performance and robustness.
Links:
Bibtex:
"@inproceedings{10.1145/3583131.3590389, author = {Fontaine, Matthew and Nikolaidis, Stefanos}, title = {Covariance Matrix Adaptation MAP-Annealing}, year = {2023}, isbn = {9798400701191}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3583131.3590389}, doi = {10.1145/3583131.3590389}, abstract = {Single-objective optimization algorithms search for the single highest-quality solution with respect to an objective. Quality diversity (QD) optimization algorithms, such as Covariance Matrix Adaptation MAP-Elites (CMA-ME), search for a collection of solutions that are both high-quality with respect to an objective and diverse with respect to specified measure functions. However, CMA-ME suffers from three major limitations highlighted by the QD community: prematurely abandoning the objective in favor of exploration, struggling to explore flat objectives, and having poor performance for low-resolution archives. We propose a new quality diversity algorithm, Covariance Matrix Adaptation MAP-Annealing (CMA-MAE), that addresses all three limitations. We provide theoretical justifications for the new algorithm with respect to each limitation. Our theory informs our experiments, which support the theory and show that CMA-MAE achieves state-of-the-art performance and robustness.}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}, pages = {456–465}, numpages = {10}, location = {Lisbon, Portugal}, series = {GECCO '23} } "
-
Domain Randomization for Sim2real Transfer of Automatically Generated Grasping Datasets
Authors:
- Johann Huber
- François Hélénon
- Hippolyte Watrelot
- Faiz Ben Amar
- Stéphane Doncieux
Abstract:
Robotic grasping refers to making a robotic system pick an object by applying forces and torques on its surface. Many recent studies use data-driven approaches to address grasping, but the sparse reward nature of this task made the learning process challenging to bootstrap. To avoid constraining the operational space, an increasing number of works propose grasping datasets to learn from. But most of them are limited to simulations. The present paper investigates how automatically generated grasps can be exploited in the real world. More than 7000 reach-and-grasp trajectories have been generated with Quality-Diversity (QD) methods on 3 different arms and grippers, including parallel fingers and a dexterous hand, and tested in the real world. Conducted analysis on the collected measure shows correlations between several Domain Randomization-based quality criteria and sim-to-real transferability. Key challenges regarding the reality gap for grasping have been identified, stressing matters on which researchers on grasping should focus in the future. A QD approach has finally been proposed for making grasps more robust to domain randomization, resulting in a transfer ratio of 84% on the Franka Research 3 arm.
Links:
Bibtex:
@article{Huber2023Domain, title={Domain Randomization for Sim2real Transfer of Automatically Generated Grasping Datasets}, author={Huber, Johann and Hélénon, François and Watrelot, Hippolyte and Ben Amar, Faiz and Doncieux, Stéphane}, journal={arXiv preprint arXiv:2310.04517v1}, year={2023} }
-
Don't Bet on Luck Alone: Enhancing Behavioral Reproducibility of Quality-Diversity Solutions in Uncertain Domains
Authors:
- Luca Grillotti
- Manon Flageat
- Bryan Lim
- Antoine Cully
Abstract:
Quality-Diversity (QD) algorithms are designed to generate collections of high-performing solutions while maximizing their diversity in a given descriptor space. However, in the presence of unpredictable noise, the fitness and descriptor of the same solution can differ significantly from one evaluation to another, leading to uncertainty in the estimation of such values. Given the elitist nature of QD algorithms, they commonly end up with many degenerate solutions in such noisy settings. In this work, we introduce Archive Reproducibility Improvement Algorithm (ARIA); a plug-and-play approach that improves the reproducibility of the solutions present in an archive. We propose it as a separate optimization module, relying on natural evolution strategies, that can be executed on top of any QD algorithm. Our module mutates solutions to (1) optimize their probability of belonging to their niche, and (2) maximize their fitness. The performance of our method is evaluated on various tasks, including a classical optimization problem and two high-dimensional control tasks in simulated robotic environments. We show that our algorithm enhances the quality and descriptor space coverage of any given archive by at least 50%.
Links:
Bibtex:
@inproceedings{Grillotti_2023, doi = {10.1145/3583131.3590498}, url = {https://doi.org/10.1145%2F3583131.3590498}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Luca Grillotti and Manon Flageat and Bryan Lim and Antoine Cully}, title = {Don{\textquotesingle}t Bet on Luck Alone: Enhancing Behavioral Reproducibility of Quality-Diversity Solutions in Uncertain Domains}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
Effects of compliant and structural parts in evolved modular robots
Authors:
- Emma Stensby Norstein
- Frank Veenstra
- Kai Olav Ellefsen
- Tønnes Nygaard
- Kyrre Glette
Links:
Bibtex:
@inproceedings{Norstein_2023, series={ALIFE 2023}, title={Effects of compliant and structural parts in evolved modular robots}, url={http://dx.doi.org/10.1162/isal_a_00689}, DOI={10.1162/isal_a_00689}, booktitle={The 2023 Conference on Artificial Life}, publisher={MIT Press}, author={Norstein, Emma Stensby and Veenstra, Frank and Ellefsen, Kai Olav and Nygaard, Tønnes and Glette, Kyrre}, year={2023}, collection={ALIFE 2023} }
-
Efficient Exploration using Model-Based Quality-Diversity with Gradients
Authors:
- Bryan Lim
- Manon Flageat
- Antoine Cully
Links:
Bibtex:
@inproceedings{Lim_2023, series={ALIFE 2023}, title={Efficient Exploration using Model-Based Quality-Diversity with Gradients}, url={http://dx.doi.org/10.1162/isal_a_00566}, DOI={10.1162/isal_a_00566}, booktitle={The 2023 Conference on Artificial Life}, publisher={MIT Press}, author={Lim, Bryan and Flageat, Manon and Cully, Antoine}, year={2023}, collection={ALIFE 2023} }
-
Efficient Quality Diversity Optimization of 3D Buildings through 2D Pre-optimization
Authors:
- Alexander Hagg
- Martin L. Kliemank
- Alexander Asteroth
- Dominik Wilde
- Mario C. Bedrunka
- Holger Foysi
- Dirk Reith
Abstract:
Quality diversity algorithms can be used to efficiently create a diverse set of solutions to inform engineers' intuition. But quality diversity is not efficient in very expensive problems, needing 100,000s of evaluations. Even with the assistance of surrogate models, quality diversity needs 100s or even 1000s of evaluations, which can make its use infeasible. In this study we try to tackle this problem by using a pre-optimization strategy on a lower-dimensional optimization problem and then map the solutions to a higher-dimensional case. For a use case to design buildings that minimize wind nuisance, we show that we can predict flow features around 3D buildings from 2D flow features around building footprints. For a diverse set of building designs, by sampling the space of 2D footprints with a quality diversity algorithm, a predictive model can be trained that is more accurate than when trained on a set of footprints that were selected with a space-filling algorithm like the Sobol sequence. Simulating only 16 buildings in 3D, a set of 1024 building designs with low predicted wind nuisance is created. We show that we can produce better machine learning models by producing training data with quality diversity instead of using common sampling techniques. The method can bootstrap generative design in a computationally expensive 3D domain and allow engineers to sweep the design space, understanding wind nuisance in early design phases.
Links:
Bibtex:
@article{Hagg_2023, doi = {10.1162/evco_a_00326}, url = {https://doi.org/10.1162%2Fevco_a_00326}, year = 2023, month = {apr}, publisher = {{MIT} Press}, pages = {1--21}, author = {Alexander Hagg and Martin L. Kliemank and Alexander Asteroth and Dominik Wilde and Mario C. Bedrunka and Holger Foysi and Dirk Reith}, title = {Efficient Quality Diversity Optimization of 3D Buildings through 2D Pre-optimization}, journal = {Evolutionary Computation} }
-
Efficient Quality-Diversity Optimization through Diverse Quality Species
Authors:
- Ryan Wickman
- Bibek Poudel
- Michael Villarreal
- Xiaofei Zhang
- Weizi Li
Abstract:
A prevalent limitation of optimizing over a single objective is that it can be misguided, becoming trapped in local optimum. This can be rectified by Quality-Diversity (QD) algorithms, where a population of high-quality and diverse solutions to a problem is preferred. Most conventional QD approaches, for example, MAP-Elites, explicitly manage a behavioral archive where solutions are broken down into predefined niches. In this work, we show that a diverse population of solutions can be found without the limitation of needing an archive or defining the range of behaviors in advance. Instead, we break down solutions into independently evolving species and use unsupervised skill discovery to learn diverse, high-performing solutions. We show that this can be done through gradient-based mutations that take on an information theoretic perspective of jointly maximizing mutual information and performance. We propose Diverse Quality Species (DQS) as an alternative to archive-based QD algorithms. We evaluate it over several simulated robotic environments and show that it can learn a diverse set of solutions from varying species. Furthermore, our results show that DQS is more sample-efficient and performant when compared to other QD algorithms. Relevant code and hyper-parameters are available at: https://github.com/rwickman/NEAT_RL.
Links:
Bibtex:
@inproceedings{Wickman_2023, doi = {10.1145/3583133.3590581}, url = {https://doi.org/10.1145%2F3583133.3590581}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Ryan Wickman and Bibek Poudel and Taylor Michael Villarreal and Xiaofei Zhang and Weizi Li}, title = {Efficient Quality-Diversity Optimization through Diverse Quality Species}, booktitle = {Proceedings of the Companion Conference on Genetic and Evolutionary Computation} }
-
Evolving Flying Machines in Minecraft Using Quality Diversity
Authors:
- Alejandro Medina
- Melanie Richey
- Mark Mueller
- Jacob Schrum
Abstract:
Minecraft is a great testbed for human creativity that has inspired the design of various structures and even functioning machines, including flying machines. EvoCraft is an API for programmatically generating structures in Minecraft, but the initial work in this domain was not capable of evolving flying machines. This paper applies fitness-based evolution and quality diversity search in order to evolve flying machines. Although fitness alone can occasionally produce flying machines, thanks in part to a more sophisticated fitness function than was used previously, the quality diversity algorithm MAP-Elites is capable of discovering flying machines much more reliably, at least when an appropriate behavior characterization is used to guide the search for diverse solutions.
Links:
Bibtex:
"@inproceedings{10.1145/3583131.3590352, author = {Medina, Alejandro and Richey, Melanie and Mueller, Mark and Schrum, Jacob}, title = {Evolving Flying Machines in Minecraft Using Quality Diversity}, year = {2023}, isbn = {9798400701191}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3583131.3590352}, doi = {10.1145/3583131.3590352}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}, pages = {1418–1426}, numpages = {9}, keywords = {quality diversity, MAP-elites, minecraft}, location = {Lisbon, Portugal}, series = {GECCO '23}} "
-
Evolving Populations of Diverse RL Agents with MAP-Elites
Authors:
- Thomas Pierrot
- Arthur Flajolet
Abstract:
Quality Diversity (QD) has emerged as a powerful alternative optimization paradigm that aims at generating large and diverse collections of solutions, notably with its flagship algorithm MAP-ELITES (ME) which evolves solutions through mutations and crossovers. While very effective for some unstructured problems, early ME implementations relied exclusively on random search to evolve the population of solutions, rendering them notoriously sample-inefficient for high-dimensional problems, such as when evolving neural networks. Follow-up works considered exploiting gradient information to guide the search in order to address these shortcomings through techniques borrowed from either Black-Box Optimization (BBO) or Reinforcement Learning (RL). While mixing RL techniques with ME unlocked state-of-the-art performance for robotics control problems that require a good amount of exploration, it also plagued these ME variants with limitations common among RL algorithms that ME was free of, such as hyperparameter sensitivity, high stochasticity as well as training instability, including when the population size increases as some components are shared across the population in recent approaches. Furthermore, existing approaches mixing ME with RL tend to be tied to a specific RL algorithm, which effectively prevents their use on problems where the corresponding RL algorithm fails. To address these shortcomings, we introduce a flexible framework that allows the use of any RL algorithm and alleviates the aforementioned limitations by evolving populations of agents (whose definition include hyperparameters and all learnable parameters) instead of just policies. We demonstrate the benefits brought about by our framework through extensive numerical experiments on a number of robotics control problems, some of which with deceptive rewards, taken from the QD-RL literature.
Links:
Bibtex:
@article{Pierrot2023Evolving, title={Evolving Populations of Diverse RL Agents with MAP-Elites}, author={Pierrot, Thomas and Flajolet, Arthur}, journal={arXiv preprint arXiv:2303.12803v2}, year={2023} }
-
Fast generation of centroids for MAP-Elites
Authors:
- Jean-Baptiste Mouret
Links:
Bibtex:
@inproceedings{Mouret_2023, doi = {10.1145/3583133.3590726}, url = {https://doi.org/10.1145%2F3583133.3590726}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Jean-Baptiste Mouret}, title = {Fast generation of centroids for {MAP}-Elites}, booktitle = {Proceedings of the Companion Conference on Genetic and Evolutionary Computation} }
-
Generating diverse and discriminatory knapsack instances by searching for novelty in variable dimensions of feature-space
Authors:
- Alejandro Marrero
- Eduardo Segredo
- Emma Hart
- Jakob Bossek
- Aneta Neumann
Links:
Bibtex:
@inproceedings{Marrero_2023, doi = {10.1145/3583131.3590504}, url = {https://doi.org/10.1145%2F3583131.3590504}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Alejandro Marrero and Eduardo Segredo and Emma Hart and Jakob Bossek and Aneta Neumann}, title = {Generating diverse and discriminatory knapsack instances by searching for novelty in variable dimensions of feature-space}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
Generative Meta-Learning Robust Quality-Diversity Portfolio
Authors:
- Kamer Ali Yuksel
Abstract:
This paper proposes a novel meta-learning approach to optimize a robust portfolio ensemble. The method uses a deep generative model to generate diverse and high-quality sub-portfolios combined to form the ensemble portfolio. The generative model consists of a convolutional layer, a stateful LSTM module, and a dense network. During training, the model takes a randomly sampled batch of Gaussian noise and outputs a population of solutions, which are then evaluated using the objective function of the problem. The weights of the model are updated using a gradient-based optimizer. The convolutional layer transforms the noise into a desired distribution in latent space, while the LSTM module adds dependence between generations. The dense network decodes the population of solutions. The proposed method balances maximizing the performance of the sub-portfolios with minimizing their maximum correlation, resulting in a robust ensemble portfolio against systematic shocks. The approach was effective in experiments where stochastic rewards were present. Moreover, the results (Fig. 1) demonstrated that the ensemble portfolio obtained by taking the average of the generated sub-portfolio weights was robust and generalized well. The proposed method can be applied to problems where diversity is desired among co-optimized solutions for a robust ensemble. The source-codes and the dataset are in the supplementary material.
Links:
Bibtex:
"@inproceedings{10.1145/3583133.3590729, author = {Yuksel, Kamer Ali}, title = {Generative Meta-Learning Robust Quality-Diversity Portfolio}, year = {2023}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3583133.3590729}, doi = {10.1145/3583133.3590729}, abstract = {This paper proposes a novel meta-learning approach to optimize a robust portfolio ensemble. The method uses a deep generative model to generate diverse and high-quality sub-portfolios combined to form the ensemble portfolio. The generative model consists of a convolutional layer, a stateful LSTM module, and a dense network. During training, the model takes a randomly sampled batch of Gaussian noise and outputs a population of solutions, which are then evaluated using the objective function of the problem. The weights of the model are updated using a gradient-based optimizer. The convolutional layer transforms the noise into a desired distribution in latent space, while the LSTM module adds dependence between generations. The dense network decodes the population of solutions. The proposed method balances maximizing the performance of the sub-portfolios with minimizing their maximum correlation, resulting in a robust ensemble portfolio against systematic shocks. The approach was effective in experiments where stochastic rewards were present. Moreover, the results (Fig. 1) demonstrated that the ensemble portfolio obtained by taking the average of the generated sub-portfolio weights was robust and generalized well. The proposed method can be applied to problems where diversity is desired among co-optimized solutions for a robust ensemble. The source-codes and the dataset are in the supplementary material.}, booktitle = {Genetic and Evolutionary Computation Conference (GECCO '23) Companion}, location = {Lisbon, Portugal}, series = {GECCO '23} } "
-
Gradient-Informed Quality Diversity for the Illumination of Discrete Spaces
Authors:
- Raphael Boige
- Guillaume Richard
- Jérémie Dona
- Thomas Pierrot
- Antoine Cully
Abstract:
Quality Diversity (QD) algorithms have been proposed to search for a large collection of both diverse and high-performing solutions instead of a single set of local optima. While early QD algorithms view the objective and descriptor functions as black-box functions, novel tools have been introduced to use gradient information to accelerate the search and improve overall performance of those algorithms over continuous input spaces. However, a broad range of applications involve discrete spaces, such as drug discovery or image generation. Exploring those spaces is challenging as they are combinatorially large and gradients cannot be used in the same manner as in continuous spaces. We introduce MAP-Elites with a Gradient-Informed Discrete Emitter (ME-GIDE), which extends QD optimisation with differentiable functions over discrete search spaces. ME-GIDE leverages the gradient information of the objective and descriptor functions with respect to its discrete inputs to propose gradient-informed updates that guide the search towards a diverse set of high quality solutions. We evaluate our method on challenging benchmarks including protein design and discrete latent space illumination and find that our method outperforms state-of-the-art QD algorithms in all benchmarks.
Links:
Bibtex:
@inproceedings{Boige_2023, doi = {10.1145/3583131.3590407}, url = {https://doi.org/10.1145%2F3583131.3590407}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Raphaël Boige and Guillaume Richard and J{\'{e}}r{\'{e}}mie Dona and Thomas Pierrot and Antoine Cully}, title = {Gradient-Informed Quality Diversity for the Illumination of Discrete Spaces}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
Hybrid Encoding For Generating Large Scale Game Level Patterns With Local Variations
Authors:
- Jacob Schrum
- Benjamin Capps
- Kirby Steckel
- Vanessa Volz
- Sebastian Risi
Abstract:
Generative Adversarial Networks (GANs) are a powerful indirect genotype-to-phenotype mapping for evolutionary search. Much previous work applying GANs to level generation focuses on fixed-size segments combined into a whole level, but individual segments may not fit together cohesively. In contrast, segments in human designed levels are often repeated, directly or with variation, and organized into patterns (the symmetric eagle in Level 1 of The Legend of Zelda, or repeated pipe motifs in Super Mario Bros). Such patterns can be produced with Compositional Pattern Producing Networks (CPPNs). CPPNs define latent vector GAN inputs as a function of geometry, organizing segments output by a GAN into complete levels. However, collections of latent vectors can also be evolved directly, producing more chaotic levels. We propose a hybrid approach that evolves CPPNs first, but allows latent vectors to evolve later, combining the benefits of both approaches. These approaches are evaluated in Super Mario Bros. and The Legend of Zelda. We previously demonstrated via a Quality-Diversity algorithm that CPPNs better cover the space of possible levels than directly evolved levels. Here, we show that the hybrid approach (1) covers areas that neither of the other methods can, and (2) achieves comparable or superior QD scores.
Links:
Bibtex:
"@article{schrum2023hybridcppn2gan, title={Hybrid Encoding For Generating Large Scale Game Level Patterns With Local Variations}, author={Schrum, Jacob and Capps, Benjamin and Steckel, Kirby and Volz, Vanessa and Risi, Sebastian}, journal={IEEE Transactions on Games}, year={2023}, volume={15}, number={1}, pages={46-55}, doi={10.1109/TG.2022.3170730}}"
-
Improving the Data Efficiency of Multi-Objective Quality-Diversity through Gradient Assistance and Crowding Exploration
Authors:
- Hannah Janmohamed
- Thomas Pierrot
- Antoine Cully
Abstract:
Quality-Diversity (QD) algorithms have recently gained traction as optimisation methods due to their effectiveness at escaping local optima and capability of generating wide-ranging and high-performing solutions. Recently, Multi-Objective MAP-Elites (MOME) extended the QD paradigm to the multi-objective setting by maintaining a Pareto front in each cell of a map-elites grid. MOME achieved a global performance that competed with NSGA-II and SPEA2, two well-established Multi-Objective Evolutionary Algorithms (MOEA), while also acquiring a diverse repertoire of solutions. However, MOME is limited by non-directed genetic search mechanisms which struggle in high-dimensional search spaces. In this work, we present Multi-Objective MAP-Elites with Policy-Gradient Assistance and Crowding-based Exploration (MOME-PGX): a new QD algorithm that extends MOME to improve its data efficiency and performance. MOME-PGX uses gradient-based optimisation to efficiently drive solutions towards higher performance. It also introduces crowding-based mechanisms to create an improved exploration strategy and to encourage uniformity across Pareto fronts. We evaluate MOME-PGX in four simulated robot locomotion tasks and demonstrate that it converges faster and to a higher performance than all other baselines. We show that MOME-PGX is between 4.3 and 42 times more data-efficient than MOME and doubles the performance of MOME, NSGA-II and SPEA2 in challenging environments.
Links:
Bibtex:
@article{Janmohamed2023Improving, title={Improving the Data Efficiency of Multi-Objective Quality-Diversity through Gradient Assistance and Crowding Exploration}, author={Janmohamed, Hannah and Pierrot, Thomas and Cully, Antoine}, journal={arXiv preprint arXiv:2302.12668v2}, year={2023} }
-
LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization
Authors:
- Muhammad U. Nasir
- Sam Earle
- Julian Togelius
- Steven James
- Christopher Cleghorn
Abstract:
Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks. Their abilities span numerous areas, and one area where they have made a significant impact is in the domain of code generation. In this context, we view LLMs as mutation and crossover tools. Meanwhile, Quality-Diversity (QD) algorithms are known to discover diverse and robust solutions. By merging the code-generating abilities of LLMs with the diversity and robustness of QD solutions, we introduce LLMatic, a Neural Architecture Search (NAS) algorithm. While LLMs struggle to conduct NAS directly through prompts, LLMatic uses a procedural approach, leveraging QD for prompts and network architecture to create diverse and highly performant networks. We test LLMatic on the CIFAR-10 image classification benchmark, demonstrating that it can produce competitive networks with just 2,000 searches, even without prior knowledge of the benchmark domain or exposure to any previous top-performing models for the benchmark.
Links:
Bibtex:
"@article{nasir2023llmatic, ttitle={LLMatic Neural Architecture Search via Large Language Models and Quality Diversity Optimization}, author={U. Nasir, Muhammad and Earle, Sam and Togelius, Julian and James, Steven and Cleghorn, Christopher}, journal={arXiv preprint arXiv:2306.01102}, year={2023} } "
-
MAP-Elites with Descriptor-Conditioned Gradients and Archive Distillation into a Single Policy
Authors:
- Maxence Faldor
- Félix Chalumeau
- Manon Flageat
- Antoine Cully
Links:
Bibtex:
@inproceedings{Faldor_2023, doi = {10.1145/3583131.3590503}, url = {https://doi.org/10.1145%2F3583131.3590503}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Maxence Faldor and F{\'{e}}lix Chalumeau and Manon Flageat and Antoine Cully}, title = {{MAP}-Elites with Descriptor-Conditioned Gradients and Archive Distillation into a Single Policy}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
MarioGPT: Open-Ended Text2Level Generation through Large Language Models
Authors:
- Shyam Sudhakaran
- Miguel González-Duque
- Claire Glanois
- Matthias Freiberger
- Elias Najarro
- Sebastian Risi
Abstract:
Procedural Content Generation (PCG) algorithms provide a technique to generate complex and diverse environments in an automated way. However, while generating content with PCG methods is often straightforward, generating meaningful content that reflects specific intentions and constraints remains challenging. Furthermore, many PCG algorithms lack the ability to generate content in an open-ended manner. Recently, Large Language Models (LLMs) have shown to be incredibly effective in many diverse domains. These trained LLMs can be fine-tuned, re-using information and accelerating training for new tasks. In this work, we introduce MarioGPT, a fine-tuned GPT2 model trained to generate tile-based game levels, in our case Super Mario Bros levels. We show that MarioGPT can not only generate diverse levels, but can be text-prompted for controllable level generation, addressing one of the key challenges of current PCG techniques. As far as we know, MarioGPT is the first text-to-level model. We also combine MarioGPT with novelty search, enabling it to generate diverse levels with varying play-style dynamics (i.e. player paths). This combination allows for the open-ended generation of an increasingly diverse range of content.
Links:
Bibtex:
" @misc{sudhakaran2023mariogpt, title={MarioGPT: Open-Ended Text2Level Generation through Large Language Models}, author={Shyam Sudhakaran and Miguel González-Duque and Claire Glanois and Matthias Freiberger and Elias Najarro and Sebastian Risi}, year={2023}, eprint={2302.05981}, archivePrefix={arXiv}, primaryClass={cs.AI} } "
-
Mix-ME: Quality-Diversity for Multi-Agent Learning
Authors:
- Garðar Ingvarsson
- Mikayel Samvelyan
- Bryan Lim
- Manon Flageat
- Antoine Cully
- Tim Rocktäschel
Abstract:
In many real-world systems, such as adaptive robotics, achieving a single, optimised solution may be insufficient. Instead, a diverse set of high-performing solutions is often required to adapt to varying contexts and requirements. This is the realm of Quality-Diversity (QD), which aims to discover a collection of high-performing solutions, each with their own unique characteristics. QD methods have recently seen success in many domains, including robotics, where they have been used to discover damage-adaptive locomotion controllers. However, most existing work has focused on single-agent settings, despite many tasks of interest being multi-agent. To this end, we introduce Mix-ME, a novel multi-agent variant of the popular MAP-Elites algorithm that forms new solutions using a crossover-like operator by mixing together agents from different teams. We evaluate the proposed methods on a variety of partially observable continuous control tasks. Our evaluation shows that these multi-agent variants obtained by Mix-ME not only compete with single-agent baselines but also often outperform them in multi-agent settings under partial observability.Links:
Bibtex:
"@article{ingvarsson2023mixme, title={Mix-ME: Quality-Diversity for Multi-Agent Learning}, author={Gar{\dh}ar Ingvarsson and Mikayel Samvelyan and Bryan Lim and Manon Flageat and Antoine Cully and Tim Rockt{\"a}schel}, journal={arXiv preprint arXiv:2311.01829}, year={2023} }"
-
Multi-Robot Coordination and Layout Design for Automated Warehousing
Authors:
- Yulun Zhang
- Matthew Fontaine
- Varun Bhatt
- Stefanos Nikolaidis
- Jiaoyang Li
Abstract:
With the rapid progress in Multi-Agent Path Finding (MAPF), researchers have studied how MAPF algorithms can be deployed to coordinate hundreds of robots in large automated warehouses. While most works try to improve the throughput of such warehouses by developing better MAPF algorithms, we focus on improving the throughput by optimizing the warehouse layout. We show that, even with state-of-the-art MAPF algorithms, commonly used human-designed layouts can lead to congestion for warehouses with large numbers of robots and thus have limited scalability. We extend existing automatic scenario generation methods to optimize warehouse layouts. Results show that our optimized warehouse layouts (1) reduce traffic congestion and thus improve throughput, (2) improve the scalability of the automated warehouses by doubling the number of robots in some cases, and (3) are capable of generating layouts with user-specified diversity measures. We include the source code at: https://github.com/lunjohnzhang/warehouse_env_gen_public.Links:
Bibtex:
"@article{zhang2023multi, title={Multi-Robot Coordination and Layout Design for Automated Warehousing}, author={Zhang, Yulun and Fontaine, Matthew C and Bhatt, Varun and Nikolaidis, Stefanos and Li, Jiaoyang}, journal={The International Joint Conference on Artificial Intelligence}, year={2023} } "
-
Multi-Task Multi-Behavior MAP-Elites
Authors:
- Timothée Anne
- Jean-Baptiste Mouret
Links:
Bibtex:
@inproceedings{Anne_2023, doi = {10.1145/3583133.3590730}, url = {https://doi.org/10.1145%2F3583133.3590730}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Timoth{\'{e}}e Anne and Jean-Baptiste Mouret}, title = {Multi-Task Multi-Behavior {MAP}-Elites}, booktitle = {Proceedings of the Companion Conference on Genetic and Evolutionary Computation} }
-
Multiple Hands Make Light Work: Enhancing Quality and Diversity using MAP-Elites with Multiple Parallel Evolution Strategies
Authors:
- Manon Flageat
- Bryan Lim
- Antoine Cully
Abstract:
With the development of hardware accelerators and their corresponding tools, evaluations have become more affordable through fast and massively parallel evaluations in some applications. This advancement has drastically sped up the runtime of evolution-inspired algorithms such as Quality-Diversity optimization, creating tremendous potential for algorithmic innovation through scale. In this work, we propose MAP-Elites-Multi-ES (MEMES), a novel QD algorithm based on Evolution Strategies (ES) designed for fast parallel evaluations. ME-Multi-ES builds on top of the existing MAP-Elites-ES algorithm, scaling it by maintaining multiple independent ES threads with massive parallelization. We also introduce a new dynamic reset procedure for the lifespan of the independent ES to autonomously maximize the improvement of the QD population. We show experimentally that MEMES outperforms existing gradient-based and objective-agnostic QD algorithms when compared in terms of generations. We perform this comparison on both black-box optimization and QD-Reinforcement Learning tasks, demonstrating the benefit of our approach across different problems and domains. Finally, we also find that our approach intrinsically enables optimization of fitness locally around a niche, a phenomenon not observed in other QD algorithms.Links:
Bibtex:
@article{Flageat2023Multiple, title={Multiple Hands Make Light Work: Enhancing Quality and Diversity using MAP-Elites with Multiple Parallel Evolution Strategies}, author={Flageat, Manon and Lim, Bryan and Cully, Antoine}, journal={arXiv preprint arXiv:2303.06137v1}, year={2023} }
-
On Evolvability and Behavior Landscapes in Neuroevolutionary Divergent Search
Authors:
- Bruno Gašperov
- Marko Đurasević
Abstract:
Evolvability refers to the ability of an individual genotype (solution) to produce offspring with mutually diverse phenotypes. Recent research has demonstrated that divergent search methods, particularly novelty search, promote evolvability by implicitly creating selective pressure for it. The main objective of this paper is to provide a novel perspective on the relationship between neuroevolutionary divergent search and evolvability. In order to achieve this, several types of walks from the literature on fitness landscape analysis are first adapted to this context. Subsequently, the interplay between neuroevolutionary divergent search and evolvability under varying amounts of evolutionary pressure and under different diversity metrics is investigated. To this end, experiments are performed on Fetch Pick and Place, a robotic arm task. Moreover, the performed study in particular sheds light on the structure of the genotype-phenotype mapping (the behavior landscape). Finally, a novel definition of evolvability that takes into account the evolvability of offspring and is appropriate for use with discretized behavior spaces is proposed, together with a Markov-chain-based estimation method for it.Links:
Bibtex:
@inproceedings{Ga_perov_2023, doi = {10.1145/3583131.3590427}, url = {https://doi.org/10.1145%2F3583131.3590427}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Bruno Ga{\v{s}}perov and Marko {\DJ}urasevi{\'{c}}}, title = {On Evolvability and Behavior Landscapes in Neuroevolutionary Divergent Search}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
On the Suitability of Representations for Quality Diversity Optimization of Shapes
Authors:
- Ludovico Scarton
- Alexander Hagg
Abstract:
The representation, or encoding, utilized in evolutionary algorithms has a substantial effect on their performance. Examination of the suitability of widely used representations for quality diversity optimization (QD) in robotic domains has yielded inconsistent results regarding the most appropriate encoding method. Given the domain-dependent nature of QD, additional evidence from other domains is necessary. This study compares the impact of several representations, including direct encoding, a dictionary-based representation, parametric encoding, compositional pattern producing networks, and cellular automata, on the generation of voxelized meshes in an architecture setting. The results reveal that some indirect encodings outperform direct encodings and can generate more diverse solution sets, especially when considering full phenotypic diversity. The paper introduces a multi-encoding QD approach that incorporates all evaluated representations in the same archive. Species of encodings compete on the basis of phenotypic features, leading to an approach that demonstrates similar performance to the best single-encoding QD approach. This is noteworthy, as it does not always require the contribution of the best-performing single encoding.Links:
Bibtex:
@inproceedings{Scarton_2023, doi = {10.1145/3583131.3590381}, url = {https://doi.org/10.1145%2F3583131.3590381}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Ludovico Scarton and Alexander Hagg}, title = {On the Suitability of Representations for Quality Diversity Optimization of Shapes}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
Open–Ended Library Learning in Unsupervised Program Synthesis
Authors:
- Claire Glanois
- Shyam Sudhakaran
- Elias Najarro
- Sebastian Risi
Links:
Bibtex:
@inproceedings{Claire_2023, series={ALIFE 2023}, title={Open-Ended Library Learning in Unsupervised Program Synthesis}, url={http://dx.doi.org/10.1162/isal_a_00685}, DOI={10.1162/isal_a_00685}, booktitle={The 2023 Conference on Artificial Life}, publisher={MIT Press}, author={Claire, Glanois and Sudhakaran, Shyam and Najarro, Elias and Risi, Sebastian}, year={2023}, collection={ALIFE 2023} }
-
Overcoming Deceptive Rewards with Quality-Diversity
Authors:
- Arno Feiden
- Jochen Garcke
Links:
Bibtex:
@inproceedings{Feiden_2023, doi = {10.1145/3583133.3590741}, url = {https://doi.org/10.1145%2F3583133.3590741}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Arno Feiden and Jochen Garcke}, title = {Overcoming Deceptive Rewards with Quality-Diversity}, booktitle = {Proceedings of the Companion Conference on Genetic and Evolutionary Computation} }
-
QDax: A Library for Quality–Diversity and Population–based Algorithms with Hardware Acceleration
Authors:
- Felix Chalumeau
- Bryan Lim
- Raphael Boige
- Maxime Allard
- Luca Grillotti
- Manon Flageat
- Valentin Macé
- Arthur Flajolet
- Thomas Pierrot
- Antoine Cully
Abstract:
QDax is an open-source library with a streamlined and modular API for Quality-Diversity (QD) optimization algorithms in Jax. The library serves as a versatile tool for optimization purposes, ranging from black-box optimization to continuous control. QDax offers implementations of popular QD, Neuroevolution, and Reinforcement Learning (RL) algorithms, supported by various examples. All the implementations can be just-in-time compiled with Jax, facilitating efficient execution across multiple accelerators, including GPUs and TPUs. These implementations effectively demonstrate the framework's flexibility and user-friendliness, easing experimentation for research purposes. Furthermore, the library is thoroughly documented and tested with 95\% coverage.Links:
Bibtex:
@article{Chalumeau2023QDax, title={QDax: A Library for Quality–Diversity and Population–based Algorithms with Hardware Acceleration}, author={Chalumeau, Felix and Lim, Bryan and Boige, Raphael and Allard, Maxime and Grillotti, Luca and Flageat, Manon and Macé, Valentin and Flajolet, Arthur and Pierrot, Thomas and Cully, Antoine}, journal={arXiv preprint arXiv:2308.03665v1}, year={2023} }
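To make the abstract above concrete, the sketch below shows the repertoire-initialisation and emit-and-update loop that a QD library of this kind wraps up for the user. It deliberately uses made-up names (archive_fit, archive_sol, evaluate, add) on a toy problem rather than QDax's real API; consult the library's documentation for the actual class and function names.

```python
import numpy as np

# Schematic of the repertoire / emit / update loop that a QD library organises.
# All names are illustrative; this is not the QDax API.
rng = np.random.default_rng(0)
GRID = (16, 16)                        # descriptor grid resolution
archive_fit = np.full(GRID, -np.inf)   # best fitness seen in each cell
archive_sol = np.zeros(GRID + (8,))    # best 8-dimensional genotype in each cell

def evaluate(x):
    fitness = -np.sum(x ** 2)                                # toy objective
    descriptor = np.clip((x[:2] + 1.0) / 2.0, 0.0, 0.999)    # toy 2-D descriptor in [0, 1)
    return fitness, descriptor

def add(x):
    fitness, descriptor = evaluate(x)
    cell = tuple((descriptor * np.array(GRID)).astype(int))
    if fitness > archive_fit[cell]:
        archive_fit[cell], archive_sol[cell] = fitness, x

for x in rng.uniform(-1.0, 1.0, size=(64, 8)):               # random initialisation
    add(x)
for _ in range(10_000):                                      # emit-and-update loop
    filled = np.argwhere(np.isfinite(archive_fit))
    parent = archive_sol[tuple(filled[rng.integers(len(filled))])]
    add(parent + 0.05 * rng.standard_normal(8))              # Gaussian variation
```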
-
Quality Diversity Evolutionary Learning of Decision Trees
Authors:
- Andrea Ferigo
- Leonardo Lucio Custode
- Giovanni Iacca
Links:
Bibtex:
@inproceedings{Ferigo_2023, doi = {10.1145/3555776.3577591}, url = {https://doi.org/10.1145%2F3555776.3577591}, year = 2023, month = {mar}, publisher = {{ACM}}, author = {Andrea Ferigo and Leonardo Lucio Custode and Giovanni Iacca}, title = {Quality Diversity Evolutionary Learning of Decision Trees}, booktitle = {Proceedings of the 38th {ACM}/{SIGAPP} Symposium on Applied Computing} }
-
Quality Diversity optimization using the IsoLineDD operator: forward and backward directions are equally important
Authors:
- Konstantinos Christou
- Chris Christodoulou
- Vassilis Vassiliades
Links:
Bibtex:
@inproceedings{Christou_2023, doi = {10.1145/3583133.3590737}, url = {https://doi.org/10.1145%2F3583133.3590737}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Konstantinos Christou and Chris Christodoulou and Vassilis Vassiliades}, title = {Quality Diversity optimization using the {IsoLineDD} operator: forward and backward directions are equally important}, booktitle = {Proceedings of the Companion Conference on Genetic and Evolutionary Computation} }
-
Quality Diversity under Sparse Reward and Sparse Interaction: Application to Grasping in Robotics
Authors:
- Johann Huber
- François Hélénon
- Miranda Coninx
- Faiz Ben Amar
- Stéphane Doncieux
Abstract:
Quality-Diversity (QD) methods are algorithms that aim to generate a set of diverse and high-performing solutions to a given problem. Originally developed for evolutionary robotics, most QD studies are conducted on a limited set of domains - mainly applied to locomotion, where the fitness and the behavior signal are dense. Grasping is a crucial task for manipulation in robotics. Despite the efforts of many research communities, this task is yet to be solved. Grasping cumulates unprecedented challenges in QD literature: it suffers from reward sparsity, behavioral sparsity, and behavior space misalignment. The present work studies how QD can address grasping. Experiments have been conducted on 15 different methods on 10 grasping domains, corresponding to 2 different robot-gripper setups and 5 standard objects. An evaluation framework that distinguishes the evaluation of an algorithm from its internal components has also been proposed for a fair comparison. The obtained results show that MAP-Elites variants that select successful solutions in priority outperform all the compared methods on the studied metrics by a large margin. We also found experimental evidence that sparse interaction can lead to deceptive novelty. To our knowledge, the ability to efficiently produce examples of grasping trajectories demonstrated in this work has no precedent in the literature.Links:
Bibtex:
@article{Huber2023Quality, title={Quality Diversity under Sparse Reward and Sparse Interaction: Application to Grasping in Robotics}, author={Huber, J. and Hélénon, F. and Coninx, M. and Ben Amar, F. and Doncieux, S.}, journal={arXiv preprint arXiv:2308.05483v2}, year={2023} }
-
Quality-Diversity Optimisation on a Physical Robot Through Dynamics-Aware and Reset-Free Learning
Authors:
- Simón C. Smith
- Bryan Lim
- Hannah Janmohamed
- Antoine Cully
Links:
Bibtex:
@inproceedings{Smith_2023, doi = {10.1145/3583133.3590625}, url = {https://doi.org/10.1145%2F3583133.3590625}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Sim{\'{o}}n C. Smith and Bryan Lim and Hannah Janmohamed and Antoine Cully}, title = {Quality-Diversity Optimisation on a Physical Robot Through Dynamics-Aware and Reset-Free Learning}, booktitle = {Proceedings of the Companion Conference on Genetic and Evolutionary Computation} }
-
Quality-diversity in dissimilarity spaces
Authors:
- Steve Huntsman
Links:
Bibtex:
@inproceedings{Huntsman_2023, doi = {10.1145/3583131.3590409}, url = {https://doi.org/10.1145%2F3583131.3590409}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Steve Huntsman}, title = {Quality-diversity in dissimilarity spaces}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
Runtime Analysis of Quality Diversity Algorithms
Authors:
- Jakob Bossek
- Dirk Sudholt
Abstract:
Quality diversity~(QD) is a branch of evolutionary computation that gained increasing interest in recent years. The Map-Elites QD approach defines a feature space, i.e., a partition of the search space, and stores the best solution for each cell of this space. We study a simple QD algorithm in the context of pseudo-Boolean optimisation on the ``number of ones'' feature space, where the $i$th cell stores the best solution amongst those with a number of ones in $[(i-1)k, ik-1]$. Here $k$ is a granularity parameter $1 \leq k \leq n+1$. We give a tight bound on the expected time until all cells are covered for arbitrary fitness functions and for all $k$ and analyse the expected optimisation time of QD on \textsc{OneMax} and other problems whose structure aligns favourably with the feature space. On combinatorial problems we show that QD finds a ${(1-1/e)}$-approximation when maximising any monotone sub-modular function with a single uniform cardinality constraint efficiently. Defining the feature space as the number of connected components of a connected graph, we show that QD finds a minimum spanning tree in expected polynomial time.Links:
Bibtex:
@inproceedings{Bossek_2023, doi = {10.1145/3583131.3590383}, url = {https://doi.org/10.1145%2F3583131.3590383}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Jakob Bossek and Dirk Sudholt}, title = {Runtime Analysis of Quality Diversity Algorithms}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
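The feature space analysed above is simple to state in code: with granularity parameter k, a bit string with m ones falls into cell floor(m / k) (a zero-indexed variant of the paper's [(i-1)k, ik-1] intervals), and the archive keeps the fittest string per cell. The sketch below is a minimal illustration of such a QD algorithm on OneMax, not the exact process analysed in the paper.

```python
import random

def onemax(x):
    return sum(x)

def qd_number_of_ones(n=30, k=5, evaluations=20_000, seed=0):
    """Minimal MAP-Elites-style QD on bit strings: cell floor(m / k) keeps the best
    solution among those with m ones (zero-indexed variant of the paper's cells)."""
    rng = random.Random(seed)
    archive = {}                                     # cell index -> (fitness, bit string)

    def insert(x):
        cell, fit = sum(x) // k, onemax(x)
        if cell not in archive or fit > archive[cell][0]:
            archive[cell] = (fit, x)

    insert([rng.randint(0, 1) for _ in range(n)])    # random initial solution
    for _ in range(evaluations):
        _, parent = archive[rng.choice(list(archive))]          # uniform parent selection
        child = [bit ^ (rng.random() < 1.0 / n) for bit in parent]  # standard bit flips
        insert(child)
    return archive
```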
-
Surrogate Assisted Generation of Human-Robot Interaction Scenarios
Authors:
- Varun Bhatt
- Heramb Nemlekar
- Matthew C. Fontaine
- Bryon Tjanaka
- Hejia Zhang
- Ya-Chuan Hsu
- Stefanos Nikolaidis
Abstract:
As human-robot interaction (HRI) systems advance, so does the difficulty of evaluating and understanding the strengths and limitations of these systems in different environments and with different users. To this end, previous methods have algorithmically generated diverse scenarios that reveal system failures in a shared control teleoperation task. However, these methods require directly evaluating generated scenarios by simulating robot policies and human actions. The computational cost of these evaluations limits their applicability in more complex domains. Thus, we propose augmenting scenario generation systems with surrogate models that predict both human and robot behaviors. In the shared control teleoperation domain and a more complex shared workspace collaboration task, we show that surrogate assisted scenario generation efficiently synthesizes diverse datasets of challenging scenarios. We demonstrate that these failures are reproducible in real-world interactions.Links:
Bibtex:
@article{Bhatt2023Surrogate, title={Surrogate Assisted Generation of Human-Robot Interaction Scenarios}, author={Bhatt, Varun and Nemlekar, Heramb and C. Fontaine, Matthew and Tjanaka, Bryon and Zhang, Hejia and Hsu, Ya-Chuan and Nikolaidis, Stefanos}, journal={arXiv preprint arXiv:2304.13787v2}, year={2023} }
-
The Quality-Diversity Transformer: Generating Behavior-Conditioned Trajectories with Decision Transformers
Authors:
- Valentin Macé
- Raphaël Boige
- Felix Chalumeau
- Thomas Pierrot
- Guillaume Richard
- Nicolas Perrin-Gilbert
Links:
Bibtex:
@inproceedings{Mac__2023, doi = {10.1145/3583131.3590433}, url = {https://doi.org/10.1145%2F3583131.3590433}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Valentin Mac{\'{e}} and Raphaël Boige and Felix Chalumeau and Thomas Pierrot and Guillaume Richard and Nicolas Perrin-Gilbert}, title = {The Quality-Diversity Transformer: Generating Behavior-Conditioned Trajectories with Decision Transformers}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
Towards open–ended evolution based on CVT–MAP–Elites with dynamic switching between feature spaces
Authors:
- Koki Usui
- Reiji Suzuki
- Takaya Arita
Links:
Bibtex:
@inproceedings{Usui_2023, series={ALIFE 2023}, title={Towards open-ended evolution based on CVT-MAP-Elites with dynamic switching between feature spaces}, url={http://dx.doi.org/10.1162/isal_a_00617}, DOI={10.1162/isal_a_00617}, booktitle={The 2023 Conference on Artificial Life}, publisher={MIT Press}, author={Usui, Koki and Suzuki, Reiji and Arita, Takaya}, year={2023}, collection={ALIFE 2023} }
-
Training Diverse High-Dimensional Controllers by Scaling Covariance Matrix Adaptation MAP-Annealing
Authors:
- Bryon Tjanaka
- Matthew C. Fontaine
- David H. Lee
- Aniruddha Kalkar
- Stefanos Nikolaidis
Abstract:
Pre-training a diverse set of neural network controllers in simulation has enabled robots to adapt online to damage in robot locomotion tasks. However, finding diverse, high-performing controllers requires expensive network training and extensive tuning of a large number of hyperparameters. On the other hand, Covariance Matrix Adaptation MAP-Annealing (CMA-MAE), an evolution strategies (ES)-based quality diversity algorithm, does not have these limitations and has achieved state-of-the-art performance on standard QD benchmarks. However, CMA-MAE cannot scale to modern neural network controllers due to its quadratic complexity. We leverage efficient approximation methods in ES to propose three new CMA-MAE variants that scale to high dimensions. Our experiments show that the variants outperform ES-based baselines in benchmark robotic locomotion tasks, while being comparable with or exceeding state-of-the-art deep reinforcement learning-based quality diversity algorithms.Links:
Bibtex:
@ARTICLE{10243102, author={Tjanaka, Bryon and Fontaine, Matthew C. and Lee, David H. and Kalkar, Aniruddha and Nikolaidis, Stefanos}, journal={IEEE Robotics and Automation Letters}, title={Training Diverse High-Dimensional Controllers by Scaling Covariance Matrix Adaptation MAP-Annealing}, year={2023}, volume={8}, number={10}, pages={6771-6778}, doi={10.1109/LRA.2023.3313012}}
-
Uncertain Quality-Diversity: Evaluation methodology and new methods for Quality-Diversity in Uncertain Domains
Authors:
- Manon Flageat
- Antoine Cully
Abstract:
Quality-Diversity optimisation (QD) has proven to yield promising results across a broad set of applications. However, QD approaches struggle in the presence of uncertainty in the environment, as it impacts their ability to quantify the true performance and novelty of solutions. This problem has been highlighted multiple times independently in previous literature. In this work, we propose to uniformise the view on this problem through four main contributions. First, we formalise a common framework for uncertain domains: the Uncertain QD setting, a special case of QD in which fitness and descriptors for each solution are no longer fixed values but distribution over possible values. Second, we propose a new methodology to evaluate Uncertain QD approaches, relying on a new per-generation sampling budget and a set of existing and new metrics specifically designed for Uncertain QD. Third, we propose three new Uncertain QD algorithms: Archive-sampling, Parallel-Adaptive-sampling and Deep-Grid-sampling. We propose these approaches taking into account recent advances in the QD community toward the use of hardware acceleration that enable large numbers of parallel evaluations and make sampling an affordable approach to uncertainty. Our final and fourth contribution is to use this new framework and the associated comparison methods to benchmark existing and novel approaches. We demonstrate once again the limitation of MAP-Elites in uncertain domains and highlight the performance of the existing Deep-Grid approach, and of our new algorithms. The goal of this framework and methods is to become an instrumental benchmark for future works considering Uncertain QD.Links:
Bibtex:
@article{Flageat_2023, doi = {10.1109/tevc.2023.3273560}, url = {https://doi.org/10.1109%2Ftevc.2023.3273560}, year = 2023, publisher = {Institute of Electrical and Electronics Engineers ({IEEE})}, pages = {1--1}, author = {Manon Flageat and Antoine Cully}, title = {Uncertain Quality-Diversity: Evaluation methodology and new methods for Quality-Diversity in Uncertain Domains}, journal = {{IEEE} Transactions on Evolutionary Computation} }
-
Understanding the Synergies between Quality-Diversity and Deep Reinforcement Learning
Authors:
- Bryan Lim
- Manon Flageat
- Antoine Cully
Abstract:
The synergies between Quality-Diversity (QD) and Deep Reinforcement Learning (RL) have led to powerful hybrid QD-RL algorithms that have shown tremendous potential, and brings the best of both fields. However, only a single deep RL algorithm (TD3) has been used in prior hybrid methods despite notable progress made by other RL algorithms. Additionally, there are fundamental differences in the optimization procedures between QD and RL which would benefit from a more principled approach. We propose Generalized Actor-Critic QD-RL, a unified modular framework for actor-critic deep RL methods in the QD-RL setting. This framework provides a path to study insights from Deep RL in the QD-RL setting, which is an important and efficient way to make progress in QD-RL. We introduce two new algorithms, PGA-ME (SAC) and PGA-ME (DroQ) which apply recent advancements in Deep RL to the QD-RL setting, and solves the humanoid environment which was not possible using existing QD-RL algorithms. However, we also find that not all insights from Deep RL can be effectively translated to QD-RL. Critically, this work also demonstrates that the actor-critic models in QD-RL are generally insufficiently trained and performance gains can be achieved without any additional environment evaluations.Links:
Bibtex:
@inproceedings{Lim_2023, doi = {10.1145/3583131.3590388}, url = {https://doi.org/10.1145%2F3583131.3590388}, year = 2023, month = {jul}, publisher = {{ACM}}, author = {Bryan Lim and Manon Flageat and Antoine Cully}, title = {Understanding the Synergies between Quality-Diversity and Deep Reinforcement Learning}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
Workstation Suitability Maps: Generating Ergonomic Behaviors on a Population of Virtual Humans With Multi–Task Optimization
Authors:
- Jacques Zhong
- Vincent Weistroffer
- Jean-Baptiste Mouret
- Francis Colas
- Pauline Maurice
Links:
Bibtex:
@article{Zhong_2023, doi = {10.1109/lra.2023.3318191}, url = {https://doi.org/10.1109%2Flra.2023.3318191}, year = 2023, month = {nov}, publisher = {Institute of Electrical and Electronics Engineers ({IEEE})}, volume = {8}, number = {11}, pages = {7384--7391}, author = {Jacques Zhong and Vincent Weistroffer and Jean-Baptiste Mouret and Francis Colas and Pauline Maurice}, title = {Workstation Suitability Maps: Generating Ergonomic Behaviors on a Population of Virtual Humans With Multi-Task Optimization}, journal = {{IEEE} Robotics and Automation Letters} }
-
pyribs: A Bare-Bones Python Library for Quality Diversity Optimization
Authors:
- Bryon Tjanaka
- Matthew C. Fontaine
- David H. Lee
- Yulun Zhang
- Nivedit Reddy Balam
- Nathaniel Dennler
- Sujay S. Garlanka
- Nikitas Dimitri Klapsis
- Stefanos Nikolaidis
Abstract:
Recent years have seen a rise in the popularity of quality diversity (QD) optimization, a branch of optimization that seeks to find a collection of diverse, high-performing solutions to a given problem. To grow further, we believe the QD community faces two challenges: developing a framework to represent the field's growing array of algorithms, and implementing that framework in software that supports a range of researchers and practitioners. To address these challenges, we have developed pyribs, a library built on a highly modular conceptual QD framework. By replacing components in the conceptual framework, and hence in pyribs, users can compose algorithms from across the QD literature; equally important, they can identify unexplored algorithm variations. Furthermore, pyribs makes this framework simple, flexible, and accessible, with a user-friendly API supported by extensive documentation and tutorials. This paper overviews the creation of pyribs, focusing on the conceptual framework that it implements and the design principles that have guided the library's development. Pyribs is available at https://pyribs.orgLinks:
Bibtex:
"@inproceedings{10.1145/3583131.3590374, author = {Tjanaka, Bryon and Fontaine, Matthew C and Lee, David H and Zhang, Yulun and Balam, Nivedit Reddy and Dennler, Nathaniel and Garlanka, Sujay S and Klapsis, Nikitas Dimitri and Nikolaidis, Stefanos}, title = {Pyribs: A Bare-Bones Python Library for Quality Diversity Optimization}, year = {2023}, isbn = {9798400701191}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3583131.3590374}, doi = {10.1145/3583131.3590374}, abstract = {Recent years have seen a rise in the popularity of quality diversity (QD) optimization, a branch of optimization that seeks to find a collection of diverse, high-performing solutions to a given problem. To grow further, we believe the QD community faces two challenges: developing a framework to represent the field's growing array of algorithms, and implementing that framework in software that supports a range of researchers and practitioners. To address these challenges, we have developed pyribs, a library built on a highly modular conceptual QD framework. By replacing components in the conceptual framework, and hence in pyribs, users can compose algorithms from across the QD literature; equally important, they can identify unexplored algorithm variations. Furthermore, pyribs makes this framework simple, flexible, and accessible, with a user-friendly API supported by extensive documentation and tutorials. This paper overviews the creation of pyribs, focusing on the conceptual framework that it implements and the design principles that have guided the library's development. Pyribs is available at https://pyribs.org}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}, pages = {220–229}, numpages = {10}, keywords = {framework, software library, quality diversity}, location = {Lisbon, Portugal}, series = {GECCO '23} } "
2022: 40 Papers
-
A Case Study of Quality-Diversity Search in Human Computation
Authors:
- Seth Cooper
Links:
Bibtex:
@article{Cooper_2022, doi = {10.15346/hc.v9i1.135}, url = {https://doi.org/10.15346%2Fhc.v9i1.135}, year = 2022, month = {sep}, publisher = {Human Computation Institute}, volume = {9}, number = {1}, pages = {58--65}, author = {Seth Cooper}, title = {A Case Study of Quality-Diversity Search in Human Computation}, journal = {Human Computation} }
-
A Preliminary Study of Multi-task MAP-Elites with Knowledge Transfer for Robotic Arm Design
Authors:
- Peng Liu
- Zheng Guo
- Hua Yu
- Han Linghu
- Yijing Li
- Yaqing Hou
- Hongwei Ge
- Qiang Zhang
Links:
Bibtex:
@inproceedings{Liu_2022, doi = {10.1109/cec55065.2022.9870374}, url = {https://doi.org/10.1109%2Fcec55065.2022.9870374}, year = 2022, month = {jul}, publisher = {{IEEE}}, author = {Peng Liu and Zheng Guo and Hua Yu and Han Linghu and Yijing Li and Yaqing Hou and Hongwei Ge and Qiang Zhang}, title = {A Preliminary Study of Multi-task {MAP}-Elites with Knowledge Transfer for Robotic Arm Design}, booktitle = {2022 {IEEE} Congress on Evolutionary Computation ({CEC})} }
-
A collection of quality diversity optimization problems derived from hyperparameter optimization of machine learning models
Authors:
- Lennart Schneider
- Florian Pfisterer
- Janek Thomas
- Bernd Bischl
Links:
Bibtex:
@inproceedings{Schneider_2022, doi = {10.1145/3520304.3534003}, url = {https://doi.org/10.1145%2F3520304.3534003}, year = 2022, month = {jul}, publisher = {{ACM}}, author = {Lennart Schneider and Florian Pfisterer and Janek Thomas and Bernd Bischl}, title = {A collection of quality diversity optimization problems derived from hyperparameter optimization of machine learning models}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference Companion} }
-
A discretization-free metric for assessing quality diversity algorithms
Authors:
- Paul Kent
- Juergen Branke
- Adam Gaier
- Jean-Baptiste Mouret
Links:
Bibtex:
@inproceedings{Kent_2022, doi = {10.1145/3520304.3534018}, url = {https://doi.org/10.1145%2F3520304.3534018}, year = 2022, month = {jul}, publisher = {{ACM}}, author = {Paul Kent and Juergen Branke and Adam Gaier and Jean-Baptiste Mouret}, title = {A discretization-free metric for assessing quality diversity algorithms}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference Companion} }
-
Accelerated Quality-Diversity through Massive Parallelism
Authors:
- Bryan Lim
- Maxime Allard
- Luca Grillotti
- Antoine Cully
Links:
-
Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning
Authors:
- Bryon Tjanaka
- Matthew Fontaine
- Julian Togelius
- Stefanos Nikolaidis
Abstract:
Consider a walking agent that must adapt to damage. To approach this task, we can train a collection of policies and have the agent select a suitable policy when damaged. Training this collection may be viewed as a quality diversity (QD) optimization problem, where we search for solutions (policies) which maximize an objective (walking forward) while spanning a set of measures (measurable characteristics). Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available for the objective and measures. However, such gradients are typically unavailable in RL settings due to non-differentiable environments. To apply DQD in RL settings, we propose to approximate objective and measure gradients with evolution strategies and actor-critic methods. We develop two variants of the DQD algorithm CMA-MEGA, each with different gradient approximations, and evaluate them on four simulated walking tasks. One variant achieves comparable performance (QD score) with the state-of-the-art PGA-MAP-Elites in two tasks. The other variant performs comparably in all tasks but is less efficient than PGA-MAP-Elites in two tasks. These results provide insight into the limitations of CMA-MEGA in domains that require rigorous optimization of the objective and where exact gradients are unavailable.Links:
Bibtex:
"@inproceedings{10.1145/3512290.3528705, author = {Tjanaka, Bryon and Fontaine, Matthew C. and Togelius, Julian and Nikolaidis, Stefanos}, title = {Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning}, year = {2022}, isbn = {9781450392372}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3512290.3528705}, doi = {10.1145/3512290.3528705}, abstract = {Consider the problem of training robustly capable agents. One approach is to generate a diverse collection of agent polices. Training can then be viewed as a quality diversity (QD) optimization problem, where we search for a collection of performant policies that are diverse with respect to quantified behavior. Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available. However, agent policies typically assume that the environment is not differentiable. To apply DQD algorithms to training agent policies, we must approximate gradients for performance and behavior. We propose two variants of the current state-of-the-art DQD algorithm that compute gradients via approximation methods common in reinforcement learning (RL). We evaluate our approach on four simulated locomotion tasks. One variant achieves results comparable to the current state-of-the-art in combining QD and RL, while the other performs comparably in two locomotion tasks. These results provide insight into the limitations of current DQD algorithms in domains where gradients must be approximated. Source code is available at https://github.com/icaros-usc/dqd-rl}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}, pages = {1102–1111}, numpages = {10}, keywords = {neuroevolution, reinforcement learning, quality diversity}, location = {Boston, Massachusetts}, series = {GECCO '22} } "
-
Automatic Acquisition of a Repertoire of Diverse Grasping Trajectories through Behavior Shaping and Novelty Search
Authors:
- Aurelien Morel
- Yakumo Kunimoto
- Alex Coninx
- Stephane Doncieux
Links:
Bibtex:
@inproceedings{Morel_2022, doi = {10.1109/icra46639.2022.9811837}, url = {https://doi.org/10.1109%2Ficra46639.2022.9811837}, year = 2022, month = {may}, publisher = {{IEEE}}, author = {Aurelien Morel and Yakumo Kunimoto and Alex Coninx and Stephane Doncieux}, title = {Automatic Acquisition of a Repertoire of Diverse Grasping Trajectories through Behavior Shaping and Novelty Search}, booktitle = {2022 International Conference on Robotics and Automation ({ICRA})} }
-
Automatic Acquisition of a Repertoire of Diverse Grasping Trajectories through Behavior Shaping and Novelty Search
Authors:
- Aurélien Morel
- Yakumo Kunimoto
- Alex Coninx
- Stephane Doncieux
Abstract:
Grasping a particular object may require a dedicated grasping movement that may also be specific to the robot end-effector. No generic and autonomous method does exist to generate these movements without making hypotheses on the robot or on the object. Learning methods could help to autonomously discover relevant grasping movements, but they face an important issue: grasping movements are so rare that a learning method based on exploration has little chance to ever observe an interesting movement, thus creating a bootstrap issue. We introduce an approach to generate diverse grasping movements in order to solve this problem. The movements are generated in simulation, for particular object positions. We test it on several simulated robots: Baxter, Pepper and a Kuka Iiwa arm. Although we show that generated movements actually work on a real Baxter robot, the aim is to use this method to create a large dataset to bootstrap deep learning methods.Links:
Bibtex:
" @article{morel2022automatic, title={Automatic Acquisition of a Repertoire of Diverse Grasping Trajectories through Behavior Shaping and Novelty Search}, author={Morel, Aur{\'e}lien and Kunimoto, Yakumo and Coninx, Alex and Doncieux, St{\'e}phane}, journal={arXiv preprint arXiv:2205.08189}, year={2022} } "
-
Balancing teams with quality-diversity for heterogeneous multiagent coordination
Authors:
- Gaurav Dixit
- Kagan Tumer
Links:
Bibtex:
@inproceedings{Dixit_2022, doi = {10.1145/3520304.3529062}, url = {https://doi.org/10.1145%2F3520304.3529062}, year = 2022, month = {jul}, publisher = {{ACM}}, author = {Gaurav Dixit and Kagan Tumer}, title = {Balancing teams with quality-diversity for heterogeneous multiagent coordination}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference Companion} }
-
Benchmarking Quality-Diversity Algorithms on Neuroevolution for Reinforcement Learning
Authors:
- Manon Flageat*
- Bryan Lim*
- Luca Grillotti
- Maxime Allard
- Simón C. Smith
- Antoine Cully
Links:
-
Deep Surrogate Assisted Generation of Environments
Authors:
- Varun Bhatt
- Bryon Tjanaka
- Matthew C. Fontaine
- Stefanos Nikolaidis
Abstract:
Recent progress in reinforcement learning (RL) has started producing generally capable agents that can solve a distribution of complex environments. These agents are typically tested on fixed, human-authored environments. On the other hand, quality diversity (QD) optimization has been proven to be an effective component of environment generation algorithms, which can generate collections of high-quality environments that are diverse in the resulting agent behaviors. However, these algorithms require potentially expensive simulations of agents on newly generated environments. We propose Deep Surrogate Assisted Generation of Environments (DSAGE), a sample-efficient QD environment generation algorithm that maintains a deep surrogate model for predicting agent behaviors in new environments. Results in two benchmark domains show that DSAGE significantly outperforms existing QD environment generation algorithms in discovering collections of environments that elicit diverse behaviors of a state-of-the-art RL agent and a planning agent. Our source code and videos are available at https://dsagepaper.github.io/Links:
-
Deep Surrogate Assisted MAP-Elites for Automated Hearthstone Deckbuilding
Authors:
- Yulun Zhang
- Matthew Fontaine
- Amy Hoover
- Stefanos Nikolaidis
Abstract:
We study the problem of efficiently generating high-quality and diverse content in games. Previous work on automated deckbuilding in Hearthstone shows that the quality diversity algorithm MAP-Elites can generate a collection of high-performing decks with diverse strategic gameplay. However, MAP-Elites requires a large number of expensive evaluations to discover a diverse collection of decks. We propose assisting MAP-Elites with a deep surrogate model trained online to predict game outcomes with respect to candidate decks. MAP-Elites discovers a diverse dataset to improve the surrogate model accuracy, while the surrogate model helps guide MAP-Elites towards promising new content. In a Hearthstone deckbuilding case study, we show that our approach improves the sample efficiency of MAP-Elites and outperforms a model trained offline with random decks, as well as a linear surrogate model baseline, setting a new state-of-the-art for quality diversity approaches in automated Hearthstone deckbuilding. We include the source code for all the experiments at: https://github.com/icaros-usc/EvoStone2.Links:
Bibtex:
"@inproceedings{10.1145/3512290.3528718, author = {Zhang, Yulun and Fontaine, Matthew C. and Hoover, Amy K. and Nikolaidis, Stefanos}, title = {Deep Surrogate Assisted MAP-Elites for Automated Hearthstone Deckbuilding}, year = {2022}, isbn = {9781450392372}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3512290.3528718}, doi = {10.1145/3512290.3528718}, abstract = {We study the problem of efficiently generating high-quality and diverse content in games. Previous work on automated deckbuilding in Hearthstone shows that the quality diversity algorithm MAP-Elites can generate a collection of high-performing decks with diverse strategic gameplay. However, MAP-Elites requires a large number of expensive evaluations to discover a diverse collection of decks. We propose assisting MAP-Elites with a deep surrogate model trained online to predict game outcomes with respect to candidate decks. MAP-Elites discovers a diverse dataset to improve the surrogate model accuracy while the surrogate model helps guide MAP-Elites towards promising new content. In a Hearthstone deck-building case study, we show that our approach improves the sample efficiency of MAP-Elites and outperforms a model trained offline with random decks, as well as a linear surrogate model baseline, setting a new state-of-the-art for quality diversity approaches in automated Hearthstone deckbuilding. We include the source code for all the experiments at: https://github.com/icaros-usc/EvoStone2.}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}, pages = {158–167}, numpages = {10}, keywords = {deep neural networks, surrogate modeling, MAP-elites}, location = {Boston, Massachusetts}, series = {GECCO '22} } "
-
Discovering Unsupervised Behaviours from Full–State Trajectories
Authors:
- Luca Grillotti
- Antoine Cully
Abstract:
Improving open-ended learning capabilities is a promising approach to enable robots to face the unbounded complexity of the real-world. Among existing methods, the ability of Quality-Diversity algorithms to generate large collections of diverse and high-performing skills is instrumental in this context. However, most of those algorithms rely on a hand-coded behavioural descriptor to characterise the diversity, hence requiring prior knowledge about the considered tasks. In this work, we propose an additional analysis of Autonomous Robots Realising their Abilities; a Quality-Diversity algorithm that autonomously finds behavioural characterisations. We evaluate this approach on a simulated robotic environment, where the robot has to autonomously discover its abilities from its full-state trajectories. All algorithms were applied to three tasks: navigation, moving forward with a high velocity, and performing half-rolls. The experimental results show that the algorithm under study discovers autonomously collections of solutions that are diverse with respect to all tasks. More specifically, the analysed approach autonomously finds policies that make the robot move to diverse positions, but also utilise its legs in diverse ways, and even perform half-rolls.Links:
Bibtex:
@article{Grillotti2022Discovering, title={Discovering Unsupervised Behaviours from Full–State Trajectories}, author={Grillotti, Luca and Cully, Antoine}, journal={arXiv preprint arXiv:2211.15451v1}, year={2022} }
-
Diversity policy gradient for sample efficient quality-diversity optimization
Authors:
- Thomas Pierrot
- Valentin Macé
- Felix Chalumeau
- Arthur Flajolet
- Geoffrey Cideron
- Karim Beguir
- Antoine Cully
- Olivier Sigaud
- Nicolas Perrin-Gilbert
Links:
Bibtex:
@inproceedings{Pierrot_2022, doi = {10.1145/3512290.3528845}, url = {https://doi.org/10.1145%2F3512290.3528845}, year = 2022, month = {jul}, publisher = {{ACM}}, author = {Thomas Pierrot and Valentin Mac{\'{e}} and Felix Chalumeau and Arthur Flajolet and Geoffrey Cideron and Karim Beguir and Antoine Cully and Olivier Sigaud and Nicolas Perrin-Gilbert}, title = {Diversity policy gradient for sample efficient quality-diversity optimization}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
-
Effects of encodings and quality-diversity on evolving 2D virtual creatures
Authors:
- Frank Veenstra
- Martin Herring Olsen
- Kyrre Glette
Links:
Bibtex:
@inproceedings{Veenstra_2022, doi = {10.1145/3520304.3529053}, url = {https://doi.org/10.1145%2F3520304.3529053}, year = 2022, month = {jul}, publisher = {{ACM}}, author = {Frank Veenstra and Martin Herring Olsen and Kyrre Glette}, title = {Effects of encodings and quality-diversity on evolving 2D virtual creatures}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference Companion} }
-
Efficient Learning of Locomotion Skills through the Discovery of Diverse Environmental Trajectory Generator Priors
Authors:
- Shikha Surana*
- Bryan Lim*
- Antoine Cully
Links:
-
Empirical analysis of PGA-MAP-Elites for Neuroevolution in Uncertain Domains
Authors:
- Manon Flageat
- Felix Chalumeau
- Antoine Cully
Abstract:
Quality-Diversity algorithms, among which MAP-Elites, have emerged as powerful alternatives to performance-only optimisation approaches as they enable generating collections of diverse and high-performing solutions to an optimisation problem. However, they are often limited to low-dimensional search spaces and deterministic environments. The recently introduced Policy Gradient Assisted MAP-Elites (PGA-MAP-Elites) algorithm overcomes this limitation by pairing the traditional Genetic operator of MAP-Elites with a gradient-based operator inspired by Deep Reinforcement Learning. This new operator guides mutations toward high-performing solutions using policy-gradients. In this work, we propose an in-depth study of PGA-MAP-Elites. We demonstrate the benefits of policy-gradients on the performance of the algorithm and the reproducibility of the generated solutions when considering uncertain domains. We first prove that PGA-MAP-Elites is highly performant in both deterministic and uncertain high-dimensional environments, decorrelating the two challenges it tackles. Secondly, we show that in addition to outperforming all the considered baselines, the collections of solutions generated by PGA-MAP-Elites are highly reproducible in uncertain environments, approaching the reproducibility of solutions found by Quality-Diversity approaches built specifically for uncertain applications. Finally, we propose an ablation and in-depth analysis of the dynamic of the policy-gradients-based variation. We demonstrate that the policy-gradient variation operator is determinant to guarantee the performance of PGA-MAP-Elites but is only essential during the early stage of the process, where it finds high-performing regions of the search space.Links:
Bibtex:
@article{Flageat_2023, doi = {10.1145/3577203}, url = {https://doi.org/10.1145%2F3577203}, year = 2023, month = {mar}, publisher = {Association for Computing Machinery ({ACM})}, volume = {3}, number = {1}, pages = {1--32}, author = {Manon Flageat and F{\'{e}}lix Chalumeau and Antoine Cully}, title = {Empirical analysis of {PGA}-{MAP}-Elites for Neuroevolution in Uncertain Domains}, journal = {{ACM} Transactions on Evolutionary Learning and Optimization} }
-
Evaluating Human–Robot Interaction Algorithms in Shared Autonomy via Quality Diversity Scenario Generation
Authors:
- Matthew C. Fontaine
- Stefanos Nikolaidis
Links:
Bibtex:
@article{Fontaine_2022, doi = {10.1145/3476412}, url = {https://doi.org/10.1145%2F3476412}, year = 2022, month = {sep}, publisher = {Association for Computing Machinery ({ACM})}, volume = {11}, number = {3}, pages = {1--30}, author = {Matthew C. Fontaine and Stefanos Nikolaidis}, title = {Evaluating Human{\textendash}Robot Interaction Algorithms in Shared Autonomy via Quality Diversity Scenario Generation}, journal = {{ACM} Transactions on Human-Robot Interaction} }
-
Evolutionary Diversity Optimization with Clustering-based Selection for Reinforcement Learning
Authors:
- Yutong Wang
- Ke Xue
- Chao Qian
Abstract:
Reinforcement Learning (RL) has achieved significant successes, which aims to obtain a single policy maximizing the expected cumulative rewards for a given task. However, in many real-world scenarios, e.g., navigating in complex environments and controlling robots, one may need to find a set of policies having both high rewards and diverse behaviors, which can bring better exploration and robust few-shot adaptation. Recently, some methods have been developed by using evolutionary techniques, including iterative reproduction and selection of policies. However, due to the inefficient selection mechanisms, these methods cannot fully guarantee both high quality and diversity. In this paper, we propose EDO-CS, a new Evolutionary Diversity Optimization algorithm with Clustering-based Selection. In each iteration, the policies are divided into several clusters based on their behaviors, and a high-quality policy is selected from each cluster for reproduction. EDO-CS also adaptively balances the importance between quality and diversity in the reproduction process. Experiments on various (i.e., deceptive and multi-modal) continuous control tasks, show the superior performance of EDO-CS over previous methods, i.e., EDO-CS can achieve a set of policies with both high quality and diversity efficiently while previous methods cannot.Links:
Bibtex:
" @inproceedings{ wang2022evolutionary, title={Evolutionary Diversity Optimization with Clustering-based Selection for Reinforcement Learning}, author={Yutong Wang and Ke Xue and Chao Qian}, booktitle={International Conference on Learning Representations}, year={2022}, url={https://openreview.net/forum?id=74x5BXs4bWD}} "
-
Few-Shot Quality-Diversity Optimization
Authors:
- Achkan Salehi
- Alexandre Coninx
- Stephane Doncieux
Abstract:
In the past few years,a considerable amount of research has been dedicated to the exploitation of previous learning experiences and the design of Few-shot and Meta Learning approaches, in problem domains ranging from Computer Vision to Reinforcement Learning based control. A notable exception, where to the best of our knowledge, little to no effort has been made in this direction is Quality-Diversity (QD) optimization. QD methods have been shown to be effective tools in dealing with deceptive minima and sparse rewards in Reinforcement Learning. However, they remain costly due to their reliance on inherently sample inefficient evolutionary processes. We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population, which when used to initialize QD methods in unseen environments, allows for few-shot adaptation. Our proposed method does not require backpropagation. It is simple to implement and scale, and furthermore, it is agnostic to the underlying models that are being trained. Experiments carried in both sparse and dense reward settings using robotic manipulation and navigation benchmarks show that it considerably reduces the number of generations that are required for QD optimization in these environments.Links:
Bibtex:
" @article{salehi2022few, title={Few-Shot Quality-Diversity Optimization}, author={Salehi, Achkan and Coninx, Alexandre and Doncieux, Stephane}, journal={IEEE Robotics and Automation Letters}, volume={7}, number={2}, pages={4424--4431}, year={2022}, publisher={IEEE} } "
-
Geodesics, Non-linearities and the Archive of Novelty Search
Authors:
- Achkan Salehi
- Alexandre Coninx
- Stephane Doncieux
Abstract:
The Novelty Search (NS) algorithm was proposed more than a decade ago. However, the mechanisms behind its empirical success are still not well formalized/understood. This short note focuses on the effects of the archive on exploration. Experimental evidence from a few application domains suggests that archive-based NS performs in general better than when Novelty is solely computed with respect to the population. An argument that is often encountered in the literature is that the archive prevents exploration from backtracking or cycling, i.e. from revisiting previously encountered areas in the behavior space. We argue that this is not a complete or accurate explanation as backtracking —beside often being desirable —can actually be enabled by the archive. Through low-dimensional/analytical examples, we show that a key effect of the archive is that it counterbalances the exploration biases that result, among other factors, from the use of inadequate behavior metrics and the non-linearities of the behavior mapping. Our observations seem to hint that attributing a more active role to the archive in sampling can be beneficial.Links:
Bibtex:
" @article{salehi2022geodesics, title={Geodesics, Non-linearities and the Archive of Novelty Search}, author={Salehi, Achkan and Coninx, Alexandre and Doncieux, Stephane}, journal={In Genetic and Evolutionary Computation Conference Companion (GECCO’22 Companion), July 9–13, 2022, Boston, MA, USA. ACM, New York, NY, USA}, year={2022} } "
-
Guided Safe Shooting: model based reinforcement learning with safety constraints
Authors:
- Giuseppe Paolo
- Jonas Gonzalez-Billandon
- Albert Thomas
- Balázs Kégl
Abstract:
In the last decade, reinforcement learning successfully solved complex control tasks and decision-making problems, like the Go board game. Yet, there are few success stories when it comes to deploying those algorithms to real-world scenarios. One of the reasons is the lack of guarantees when dealing with and avoiding unsafe states, a fundamental requirement in critical control engineering systems. In this paper, we introduce Guided Safe Shooting (GuSS), a model-based RL approach that can learn to control systems with minimal violations of the safety constraints. The model is learned on the data collected during the operation of the system in an iterated batch fashion, and is then used to plan for the best action to perform at each time step. We propose three different safe planners, one based on a simple random shooting strategy and two based on MAP-Elites, a more advanced divergent-search algorithm. Experiments show that these planners help the learning agent avoid unsafe situations while maximally exploring the state space, a necessary aspect when learning an accurate model of the system. Furthermore, compared to model-free approaches, learning a model allows GuSS reducing the number of interactions with the real-system while still reaching high rewards, a fundamental requirement when handling engineering systems.Links:
Bibtex:
"@article{paolo2022guided, title={Guided Safe Shooting: model based reinforcement learning with safety constraints}, author={Paolo, Giuseppe and Gonzalez-Billandon, Jonas and Thomas, Albert and K{\'e}gl, Bal{\'a}zs}, journal={arXiv preprint arXiv:2206.09743}, year={2022} } "
-
Heterogeneous Multi-agent Zero-Shot Coordination by Coevolution
Authors:
- Ke Xue
- Yutong Wang
- Lei Yuan
- Cong Guan
- Chao Qian
- Yang Yu
Abstract:
Generating agents that can achieve Zero-Shot Coordination (ZSC) with unseen partners is a new challenge in cooperative Multi-Agent Reinforcement Learning (MARL). Recently, some studies have made progress in ZSC by exposing the agents to diverse partners during the training process. They usually involve self-play when training the partners, implicitly assuming that the tasks are homogeneous. However, many real-world tasks are heterogeneous, and hence previous methods may fail. In this paper, we study the heterogeneous ZSC problem for the first time and propose a general method based on coevolution, which coevolves two populations of agents and partners through three sub-processes: pairing, updating and selection. Experimental results on a collaborative cooking task show the necessity of considering the heterogeneous setting and illustrate that our proposed method is a promising solution for heterogeneous cooperative MARL.Links:
Bibtex:
" @article{maze, title={Heterogeneous Multi-agent Zero-Shot Coordination by Coevolution}, author={Ke Xue and Yutong Wang and Lei Yuan and Cong Guan and Chao Qian and Yang Yu}, journal={arXiv preprint arXiv:2208.04957}, year={2022}} "
-
Hierarchical Quality-Diversity for Online Damage Recovery Robotics
Authors:
- Maxime Allard
- Simón C. Smith
- Konstantinos Chatzilygeroudis
- Antoine Cully
Abstract:
Adaptation capabilities, like damage recovery, are crucial for the deployment of robots in complex environments. Several works have demonstrated that using repertoires of pre-trained skills can enable robots to adapt to unforeseen mechanical damages in a few minutes. These adaptation capabilities are directly linked to the behavioural diversity in the repertoire. The more alternatives the robot has to execute a skill, the better are the chances that it can adapt to a new situation. However, solving complex tasks, like maze navigation, usually requires multiple different skills. Finding a large behavioural diversity for these multiple skills often leads to an intractable exponential growth of the number of required solutions. In this paper, we introduce the Hierarchical Trial and Error algorithm, which uses a hierarchical behavioural repertoire to learn diverse skills and leverages them to make the robot more adaptive to different situations. We show that the hierarchical decomposition of skills enables the robot to learn more complex behaviours while keeping the learning of the repertoire tractable. The experiments with a hexapod robot show that our method solves maze navigation tasks with 20% less actions in the most challenging scenarios than the best baseline while having 57% less complete failures.Links:
Bibtex:
" @inproceedings{10.1145/3512290.3528751, author = {Allard, Maxime and Smith, Sim\'{o}n C. and Chatzilygeroudis, Konstantinos and Cully, Antoine}, title = {Hierarchical Quality-Diversity for Online Damage Recovery}, year = {2022}, isbn = {9781450392372}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3512290.3528751}, doi = {10.1145/3512290.3528751}, abstract = {Adaptation capabilities, like damage recovery, are crucial for the deployment of robots in complex environments. Several works have demonstrated that using repertoires of pre-trained skills can enable robots to adapt to unforeseen mechanical damages in a few minutes. These adaptation capabilities are directly linked to the behavioural diversity in the repertoire. The more alternatives the robot has to execute a skill, the better are the chances that it can adapt to a new situation. However, solving complex tasks, like maze navigation, usually requires multiple different skills. Finding a large behavioural diversity for these multiple skills often leads to an intractable exponential growth of the number of required solutions. In this paper, we introduce the Hierarchical Trial and Error algorithm, which uses a hierarchical behavioural repertoire to learn diverse skills and leverages them to make the robot more adaptive to different situations. We show that the hierarchical decomposition of skills enables the robot to learn more complex behaviours while keeping the learning of the repertoire tractable. The experiments with a hexapod robot show that our method solves maze navigation tasks with 20% less actions in the most challenging scenarios than the best baseline while having 57% less complete failures.}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}, pages = {58–67}, numpages = {10}, keywords = {robotics, hierarchical learning, quality-diversity}, location = {Boston, Massachusetts}, series = {GECCO '22} } "
-
Illuminating Diverse Neural Cellular Automata for Level Generation Games
Authors:
- Sam Earle
- Justin Snider
- Matthew Fontaine
- Stefanos Nikolaidis
- Julian Togelius
Abstract:
We present a method of generating diverse collections of neural cellular automata (NCA) to design video game levels. While NCAs have so far only been trained via supervised learning, we present a quality diversity (QD) approach to generating a collection of NCA level generators. By framing the problem as a QD problem, our approach can train diverse level generators, whose output levels vary based on aesthetic or functional criteria. To efficiently generate NCAs, we train generators via Covariance Matrix Adaptation MAP-Elites (CMA-ME), a quality diversity algorithm which specializes in continuous search spaces. We apply our new method to generate level generators for several 2D tile-based games: a maze game, Sokoban, and Zelda. Our results show that CMA-ME can generate small NCAs that are diverse yet capable, often satisfying complex solvability criteria for deterministic agents. We compare against a Compositional Pattern-Producing Network (CPPN) baseline trained to produce diverse collections of generators and show that the NCA representation yields a better exploration of level-space.Links:
Bibtex:
"@inproceedings{10.1145/3512290.3528754, author = {Earle, Sam and Snider, Justin and Fontaine, Matthew C. and Nikolaidis, Stefanos and Togelius, Julian}, title = {Illuminating Diverse Neural Cellular Automata for Level Generation}, year = {2022}, isbn = {9781450392372}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3512290.3528754}, doi = {10.1145/3512290.3528754}, abstract = {We present a method of generating diverse collections of neural cellular automata (NCA) to design video game levels. While NCAs have so far only been trained via supervised learning, we present a quality diversity (QD) approach to generating a collection of NCA level generators. By framing the problem as a QD problem, our approach can train diverse level generators, whose output levels vary based on aesthetic or functional criteria. To efficiently generate NCAs, we train generators via Covariance Matrix Adaptation MAP-Elites (CMA-ME), a quality diversity algorithm which specializes in continuous search spaces. We apply our new method to generate level generators for several 2D tile-based games: a maze game, Sokoban, and Zelda. Our results show that CMA-ME can generate small NCAs that are diverse yet capable, often satisfying complex solvability criteria for deterministic agents. We compare against a Compositional Pattern-Producing Network (CPPN) baseline trained to produce diverse collections of generators and show that the NCA representation yields a better exploration of level-space.}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}, pages = {68–76}, numpages = {9}, keywords = {cellular automata, evolutionary strategies, procedural content generation, neural networks}, location = {Boston, Massachusetts}, series = {GECCO '22} } "
-
Illuminating the Space of Dungeon Maps, Locked-door Missions and Enemy Placement Through MAP-Elites
Authors:
- Breno M. F. Viana
- Leonardo T. Pereira
- Claudio F. M. Toledo
Abstract:
Procedural Content Generation (PCG) methods are valuable tools to speed up the game development process. Moreover, PCG may also present in games as features, such as the procedural dungeon generation (PDG) in Moonlighter (Digital Sun, 2018). This paper introduces an extended version of an evolutionary dungeon generator by incorporating a MAP-Elites population. Our dungeon levels are discretized with rooms that may have locked-door missions and enemies within them. We encoded the dungeons through a tree structure to ensure the feasibility of missions. We performed computational and user feedback experiments to evaluate our PDG approach. They show that our approach accurately converges almost the whole MAP-Elite population for most executions. Finally, players' feedback indicates that they enjoyed the generated levels, and they could not indicate an algorithm as a level generator.Links:
Bibtex:
@article{viana2022illuminating, title={Illuminating the Space of Dungeon Maps, Locked-door Missions and Enemy Placement Through MAP-Elites}, author={Viana, Breno M. F. and Pereira, Leonardo T. and Toledo, Claudio F. M.}, journal={arXiv preprint arXiv:2202.09301v2}, year={2022} }
-
Learning to Walk Autonomously via Reset-Free Quality-Diversity Robotics
Authors:
- Bryan Lim*
- Alexander Reichenbach*
- Antoine Cully
Links:
-
Looking For Novelty in Search-Based Software Product Line Testing
Authors:
- Yi Xiang
- Han Huang
- Miqing Li
- Sizhe Li
- Xiaowei Yang
Abstract:
Testing software product lines (SPLs) is difficult due to a huge number of possible products to be tested. Recently, there has been a growing interest in similarity-based testing of SPLs, where similarity is used as a surrogate metric for the t-wise coverage. In this context, one of the primary goals is to sample, by optimizing similarity metrics using search-based algorithms, a small subset of test cases (i.e., products) as dissimilar as possible, thus potentially making more t-wise combinations covered. Prior work has shown, by means of empirical studies, the great potential of current similarity-based testing approaches. However, the rationale of this testing technique deserves a more rigorous exploration. To this end, we perform correlation analyses to investigate how similarity metrics are correlated with the t-wise coverage. We find that similarity metrics generally have significantly positive correlations with the t-wise coverage. This well explains why similarity-based testing works, as the improvement on similarity metrics will potentially increase the t-wise coverage. Moreover, we explore, for the first time, the use of the novelty search (NS) algorithm for similarity-based SPL testing. The algorithm rewards “novel” individuals, i.e., those being different from individuals discovered previously, and this well matches the goal of similarity-based SPL testing. We find that the novelty score used in NS has (much) stronger positive correlations with the t-wise coverage than previous approaches relying on a genetic algorithm (GA) with a similarity-based fitness function. Experimental results on 31 software product lines validate the superiority of NS over GA, as well as other state-of-the-art approaches, concerning both t-wise coverage and fault detection capacity. Finally, we investigate whether it is useful to combine two satisfiability solvers when generating new individuals in NS, and how the performance of NS is affected by its key parameters. In summary, looking for novelty provides a promising way of sampling diverse test cases for SPLs.Bibtex:
"@article{9350184, author={Xiang, Yi and Huang, Han and Li, Miqing and Li, Sizhe and Yang, Xiaowei}, journal={IEEE Transactions on Software Engineering}, title={Looking For Novelty in Search-Based Software Product Line Testing}, year={2022}, volume={48}, number={7}, pages={2317-2338}, doi={10.1109/TSE.2021.3057853} } "
-
Minimize Surprise MAP-Elites: a Task-Independent MAP-Elites Variant for Swarms Robotics
Authors:
- Tanja Katharina Kaiser
- Heiko Hamann
Abstract:
Swarm robotics controllers are often automatically generated using methods of evolutionary computation with a task-specific fitness function to guide the optimization process. By contrast, our minimize surprise approach uses a task-independent fitness function to generate diverse behaviors over several independent evolutionary runs. Alternatives are divergent search algorithms rewarding behavioral novelty, such as novelty search, and quality-diversity algorithms generating diverse high-quality solutions, such as MAP-Elites. These approaches usually rely on task-dependent measures. We propose Minimize Surprise MAP-Elites, a task-independent MAP-Elites variant that combines MAP-Elites with our minimize surprise approach. Our first experiments result in high-quality solutions that lead to behavioral diversity across tasks and within tasks.Links:
Bibtex:
" @inproceedings{10.1145/3520304.3528773, author = {Kaiser, Tanja Katharina and Hamann, Heiko}, title = {Minimize Surprise MAP-Elites: A Task-Independent MAP-Elites Variant for Swarms}, year = {2022}, isbn = {9781450392686}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3520304.3528773}, doi = {10.1145/3520304.3528773}, series = {GECCO '22}} "
-
Multi-objective quality diversity optimization
Authors:
- Thomas Pierrot
- Guillaume Richard
- Karim Beguir
- Antoine Cully
Links:
Bibtex:
@inproceedings{Pierrot_2022, doi = {10.1145/3512290.3528823}, url = {https://doi.org/10.1145%2F3512290.3528823}, year = 2022, month = {jul}, publisher = {{ACM}}, author = {Thomas Pierrot and Guillaume Richard and Karim Beguir and Antoine Cully}, title = {Multi-objective quality diversity optimization}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference} }
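A common way to realise multi-objective quality-diversity, as in the paper above, is to replace the single elite of each MAP-Elites cell with a bounded Pareto front of mutually non-dominated solutions. The cell-update sketch below assumes maximisation of all objectives; the descriptor grid, truncation rule and toy objectives are illustrative only:

```python
import numpy as np

def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (maximisation)."""
    return bool(np.all(f_a >= f_b) and np.any(f_a > f_b))

def insert(archive, cell, solution, objectives, max_front=8):
    """Add `solution` to the cell's Pareto front if it is not dominated there."""
    front = archive.setdefault(cell, [])
    objectives = np.asarray(objectives)
    if any(dominates(f, objectives) for _, f in front):
        return False
    front[:] = [(s, f) for s, f in front if not dominates(objectives, f)]
    front.append((solution, objectives))
    if len(front) > max_front:
        front.pop(0)   # naive truncation; a real implementation would use a crowding measure
    return True

# Toy usage: two conflicting objectives, a 10-bin descriptor grid on the first gene.
rng = np.random.default_rng(0)
archive = {}
for _ in range(2000):
    x = rng.uniform(-1, 1, size=3)
    objs = (-np.sum(x ** 2), -np.sum((x - 0.5) ** 2))
    cell = min(int((x[0] + 1) / 2 * 10), 9)
    insert(archive, cell, x, objs)
print({cell: len(front) for cell, front in sorted(archive.items())})
```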
-
Procedural Content Generation using Neuroevolution and Novelty Search for Diverse Video Game Levels Games
Authors:
- Michael Beukman
- Christopher Cleghorn
- Steven James
Abstract:
Procedurally generated video game content has the potential to drastically reduce the content creation budget of game developers and large studios. However, adoption is hindered by limitations such as slow generation, as well as low quality and diversity of content. We introduce an evolutionary search-based approach for evolving level generators using novelty search to procedurally generate diverse levels in real time, without requiring training data or detailed domain-specific knowledge. We test our method on two domains, and our results show an order of magnitude speedup in generation time compared to existing methods while obtaining comparable metric scores. We further demonstrate the ability to generalise to arbitrary-sized levels without retraining.Links:
Bibtex:
"@inproceedings{beukman2022Procedural, author = {Beukman, Michael and Cleghorn, Christopher W and James, Steven}, title = {Procedural Content Generation Using Neuroevolution and Novelty Search for Diverse Video Game Levels}, year = {2022}, isbn = {9781450392372}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3512290.3528701}, doi = {10.1145/3512290.3528701}, abstract = {Procedurally generated video game content has the potential to drastically reduce the content creation budget of game developers and large studios. However, adoption is hindered by limitations such as slow generation, as well as low quality and diversity of content. We introduce an evolutionary search-based approach for evolving level generators using novelty search to procedurally generate diverse levels in real time, without requiring training data or detailed domain-specific knowledge. We test our method on two domains, and our results show an order of magnitude speedup in generation time compared to existing methods while obtaining comparable metric scores. We further demonstrate the ability to generalise to arbitrary-sized levels without retraining.}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}, pages = {1028–1037}, numpages = {10}, keywords = {neuroevolution, novelty search, procedural content generation}, location = {Boston, Massachusetts}, series = {GECCO '22} } "
-
Quality-Diversity Meta-Evolution: Customizing Behavior Spaces to a Meta-Objective
Authors:
- David M. Bossens
- Danesh Tarapore
Links:
Bibtex:
@article{Bossens_2022, doi = {10.1109/tevc.2022.3152384}, url = {https://doi.org/10.1109%2Ftevc.2022.3152384}, year = 2022, month = {oct}, publisher = {Institute of Electrical and Electronics Engineers ({IEEE})}, volume = {26}, number = {5}, pages = {1171--1181}, author = {David M. Bossens and Danesh Tarapore}, title = {Quality-Diversity Meta-Evolution: Customizing Behavior Spaces to a Meta-Objective}, journal = {{IEEE} Transactions on Evolutionary Computation} }
-
Relevance-guided Unsupervised Discovery of Abilities with Quality-Diversity Algorithms
Authors:
- Luca Grillotti
- Antoine Cully
Abstract:
Quality-Diversity algorithms provide efficient mechanisms to generate large collections of diverse and high-performing solutions, which have shown to be instrumental for solving downstream tasks. However, most of those algorithms rely on a behavioural descriptor to characterise the diversity that is hand-coded, hence requiring prior knowledge about the considered tasks. In this work, we introduce Relevance-guided Unsupervised Discovery of Abilities; a Quality-Diversity algorithm that autonomously finds a behavioural characterisation tailored to the task at hand. In particular, our method introduces a custom diversity metric that leads to higher densities of solutions near the areas of interest in the learnt behavioural descriptor space. We evaluate our approach on a simulated robotic environment, where the robot has to autonomously discover its abilities based on its full sensory data. We evaluated the algorithms on three tasks: navigation to random targets, moving forward with a high velocity, and performing half-rolls. The experimental results show that our method manages to discover collections of solutions that are not only diverse, but also well-adapted to the considered downstream task.Links:
Bibtex:
@inproceedings{Grillotti_2022, series={GECCO ’22}, title={Relevance-guided unsupervised discovery of abilities with quality-diversity algorithms}, url={http://dx.doi.org/10.1145/3512290.3528837}, DOI={10.1145/3512290.3528837}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference}, publisher={ACM}, author={Grillotti, Luca and Cully, Antoine}, year={2022}, month=jul, collection={GECCO ’22} }
-
SafeAPT: Safe Simulation-to-Real Robot Learning Using Diverse Policies Learned in Simulation
Authors:
- Rituraj Kaushik
- Karol Arndt
- Ville Kyrki
Links:
Bibtex:
@article{Kaushik_2022, doi = {10.1109/lra.2022.3177294}, url = {https://doi.org/10.1109%2Flra.2022.3177294}, year = 2022, month = {jul}, publisher = {Institute of Electrical and Electronics Engineers ({IEEE})}, volume = {7}, number = {3}, pages = {6838--6845}, author = {Rituraj Kaushik and Karol Arndt and Ville Kyrki}, title = {{SafeAPT}: Safe Simulation-to-Real Robot Learning Using Diverse Policies Learned in Simulation}, journal = {{IEEE} Robotics and Automation Letters} }
-
Search-based Diverse Sampling from Real-world Software Product Lines
Authors:
- Yi Xiang
- Han Huang
- Yuren Zhou
- Sizhe Li
- Chuan Luo
- Qingwei Lin
- Miqing Li
- Xiaowei Yang
Abstract:
Real-world software product lines (SPLs) often encompass enormous valid configurations that are impossible to enumerate. To understand properties of the space formed by all valid configurations, a feasible way is to select a small and valid sample set. Even though a number of sampling strategies have been proposed, they either fail to produce diverse samples with respect to the number of selected features (an important property to characterize behaviors of configurations), or achieve diverse sampling but with limited scalability (the handleable configuration space size is limited to 10^13). To resolve this dilemma, we propose a scalable diverse sampling strategy, which uses a distance metric in combination with the novelty search algorithm to produce diverse samples in an incremental way. The distance metric is carefully designed to measure similarities between configurations, and further diversity of a sample set. The novelty search incrementally improves diversity of samples through the search for novel configurations. We evaluate our sampling algorithm on 39 real-world SPLs. It is able to generate the required number of samples for all the SPLs, including those which cannot be counted by sharpSAT, a state-of-the-art model counting solver. Moreover, it performs better than or at least competitively to state-of-the-art samplers regarding diversity of the sample set. Experimental results suggest that only the proposed sampler (among all the tested ones) achieves scalable diverse sampling.Links:
Bibtex:
"@inproceedings{10.1145/3510003.3510053, author = {Xiang, Yi and Huang, Han and Zhou, Yuren and Li, Sizhe and Luo, Chuan and Lin, Qingwei and Li, Miqing and Yang, Xiaowei}, title = {Search-Based Diverse Sampling from Real-World Software Product Lines}, year = {2022}, isbn = {9781450392211}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3510003.3510053}, doi = {10.1145/3510003.3510053}, booktitle = {Proceedings of the 44th International Conference on Software Engineering}, pages = {1945-1957}, numpages = {13}, keywords = {software product lines, distance metric, novelty search, diverse sampling}, location = {Pittsburgh, Pennsylvania}, series = {ICSE '22} } "
-
Synaptic pruning with MAP-elites
Authors:
- Federico Da Rold
- Olaf Witkovski
- Nathanael Aubert-Kato
Links:
Bibtex:
@inproceedings{Rold_2022, doi = {10.1145/3520304.3528813}, url = {https://doi.org/10.1145%2F3520304.3528813}, year = 2022, month = {jul}, publisher = {{ACM}}, author = {Federico Da Rold and Olaf Witkovski and Nathanael Aubert-Kato}, title = {Synaptic pruning with {MAP}-elites}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference Companion} }
-
T-DominO: Exploring Multiple Criteria with Quality-Diversity and the Tournament Dominance Objective
Authors:
- Adam Gaier
- James Stoddart
- Lorenzo Villaggi
- Peter J Bentley
Abstract:
Real-world design problems are a messy combination of constraints, objectives, and features. Exploring these problem spaces can be defined as a Multi-Criteria Exploration (MCX) problem, whose goals are to produce a set of diverse solutions with high performance across many objectives, while avoiding low performance across any objectives. Quality-Diversity algorithms produce the needed design variation, but typically consider only a single objective. We present a new ranking, T-DominO, specifically designed to handle multiple objectives in MCX problems. T-DominO ranks individuals relative to other solutions in the archive, favoring individuals with balanced performance over those which excel at a few objectives at the cost of the others. Keeping only a single balanced solution in each MAP-Elites bin maintains the visual accessibility of the archive -- a strong asset for design exploration. We illustrate our approach on a set of easily understood benchmarks, and showcase its potential in a many-objective real-world architecture case study.Links:
Bibtex:
@article{gaier2022tdomino, title={T-DominO: Exploring Multiple Criteria with Quality-Diversity and the Tournament Dominance Objective}, author={Gaier, Adam and Stoddart, James and Villaggi, Lorenzo and Bentley, Peter J}, journal={arXiv preprint arXiv:2207.01439v1}, year={2022} }
-
Towards QD-suite: developing a set of benchmarks for Quality-Diversity algorithms
Authors:
- Achkan Salehi
- Stephane Doncieux
Abstract:
While the field of Quality-Diversity (QD) has grown into a distinct branch of stochastic optimization, a few problems, in particular locomotion and navigation tasks, have become de facto standards. Are such benchmarks sufficient? Are they representative of the key challenges faced by QD algorithms? Do they provide the ability to focus on one particular challenge by properly disentangling it from others? Do they have much predictive power in terms of scalability and generalization? Existing benchmarks are not standardized, and there is currently no MNIST equivalent for QD. Inspired by recent works on Reinforcement Learning benchmarks, we argue that the identification of challenges faced by QD methods and the development of targeted, challenging, scalable but affordable benchmarks is an important step. As an initial effort, we identify three problems that are challenging in sparse reward settings, and propose associated benchmarks: (1) Behavior metric bias, which can result from the use of metrics that do not match the structure of the behavior space. (2) Behavioral Plateaus, with varying characteristics, such that escaping them would require adaptive QD algorithms and (3) Evolvability Traps, where small variations in genotype result in large behavioral changes. The environments that we propose satisfy the properties listed above.Links:
Bibtex:
" @article{salehi2022toward, title={Towards QD-suite: developing a set of benchmarks for Quality-Diversity algorithms}, author={Salehi, Achkan and Doncieux, Stephane}, journal={arXiv preprint arXiv:2205.03207}, year={2022} } "
-
Training Diverse High-Dimensional Controllers by Scaling Covariance Matrix Adaptation MAP-Annealing
Authors:
- Bryon Tjanaka
- Matthew C. Fontaine
- David H. Lee
- Aniruddha Kalkar
- Stefanos Nikolaidis
Abstract:
Pre-training a diverse set of neural network controllers in simulation has enabled robots to adapt online to damage in robot locomotion tasks. However, finding diverse, high-performing controllers requires expensive network training and extensive tuning of a large number of hyperparameters. On the other hand, Covariance Matrix Adaptation MAP-Annealing (CMA-MAE), an evolution strategies (ES)-based quality diversity algorithm, does not have these limitations and has achieved state-of-the-art performance on standard QD benchmarks. However, CMA-MAE cannot scale to modern neural network controllers due to its quadratic complexity. We leverage efficient approximation methods in ES to propose three new CMA-MAE variants that scale to high dimensions. Our experiments show that the variants outperform ES-based baselines in benchmark robotic locomotion tasks, while being comparable with or exceeding state-of-the-art deep reinforcement learning-based quality diversity algorithms.Links:
Bibtex:
@article{Tjanaka2022Training, title={Training Diverse High-Dimensional Controllers by Scaling Covariance Matrix Adaptation MAP-Annealing}, author={Tjanaka, Bryon and Fontaine, Matthew C. and Lee, David H. and Kalkar, Aniruddha and Nikolaidis, Stefanos}, journal={arXiv preprint arXiv:2210.02622v2}, year={2022} }
-
Unsupervised Behavior Discovery With Quality-Diversity Optimization
Authors:
- Luca Grillotti
- Antoine Cully
Links:
Bibtex:
@article{Grillotti_2022, doi = {10.1109/tevc.2022.3159855}, url = {https://doi.org/10.1109%2Ftevc.2022.3159855}, year = 2022, month = {dec}, publisher = {Institute of Electrical and Electronics Engineers ({IEEE})}, volume = {26}, number = {6}, pages = {1539--1552}, author = {Luca Grillotti and Antoine Cully}, title = {Unsupervised Behavior Discovery With Quality-Diversity Optimization}, journal = {{IEEE} Transactions on Evolutionary Computation} }
2021: 24 Papers
-
A Quality Diversity Approach to Automatically Generating Human-Robot Interaction Scenarios in Shared Autonomy Robotics
Authors:
- Matthew Fontaine
- Stefanos Nikolaidis
Abstract:
The growth of scale and complexity of interactions between humans and robots highlights the need for new computational methods to automatically evaluate novel algorithms and applications. Exploring the diverse scenarios of interaction between humans and robots in simulation can improve understanding of the system and avoid potentially costly failures in real-world settings. We formulate this as a quality diversity (QD) problem, where the goal is to discover diverse failure scenarios by simultaneously exploring both environments and human actions. We focus on the shared autonomy domain, where the robot attempts to infer the goal of a human operator, and adopt the QD algorithm MAP-Elites to generate scenarios for two published algorithms in this domain: shared autonomy via hindsight optimization and linear policy blending. Some of the generated scenarios confirm previous theoretical findings, while others are surprising and bring about a new understanding of state-of-the-art implementations. Our experiments show that MAP-Elites outperforms Monte-Carlo simulation and optimization based methods in effectively searching the scenario space, highlighting its promise for automatic evaluation of algorithms in shared autonomy.Links:
Bibtex:
"@article{fontaine2020quality, title={A Quality Diversity Approach to Automatically Generating Human-Robot Interaction Scenarios in Shared Autonomy}, author={Fontaine, Matthew and Nikolaidis, Stefanos}, journal={arXiv preprint arXiv:2012.04283}, year={2020} } "
-
ARCH-Elites: Quality-Diversity for Urban Design
Authors:
- Theodoros Galanos
- Antonios Liapis
- Georgios N. Yannakakis
- Reinhard Koenig
Abstract:
This paper introduces ARCH-Elites, a MAP-Elites implementation that can reconfigure large-scale urban layouts at real-world locations via a pre-trained surrogate model instead of costly simulations. In a series of experiments, we generate novel urban designs for two real-world locations in Boston, Massachusetts. Combining the exploration of a possibility space with real-time performance evaluation creates a powerful new paradigm for architectural generative design that can extract and articulate design intelligence.Links:
Bibtex:
"@article{galanos2021arch, title={ARCH-Elites: Quality-Diversity for Urban Design}, author={Galanos, Theodoros and Liapis, Antonios and Yannakakis, Georgios N and Koenig, Reinhard}, journal={arXiv preprint arXiv:2104.08774}, year={2021} } "
-
AutoAlpha: an Efficient Hierarchical Evolutionary Algorithm for Mining Alpha Factors in Quantitative Investment
Authors:
- Tianping Zhang
- Yuanqi Li
- Yifei Jin
- Jian Li
Abstract:
The multi-factor model is a widely used model in quantitative investment. The success of a multi-factor model is largely determined by the effectiveness of the alpha factors used in the model. This paper proposes a new evolutionary algorithm called AutoAlpha to automatically generate effective formulaic alphas from massive stock datasets. Specifically, first we discover an inherent pattern of the formulaic alphas and propose a hierarchical structure to quickly locate the promising part of space for search. Then we propose a new Quality Diversity search based on the Principal Component Analysis (PCA-QD) to guide the search away from the well-explored space for more desirable results. Next, we utilize the warm start method and the replacement method to prevent the premature convergence problem. Based on the formulaic alphas we discover, we propose an ensemble learning-to-rank model for generating the portfolio. The backtests in the Chinese stock market and the comparisons with several baselines further demonstrate the effectiveness of AutoAlpha in mining formulaic alphas for quantitative trading.Links:
Bibtex:
"@article{zhang2020autoalpha, title={AutoAlpha: an Efficient Hierarchical Evolutionary Algorithm for Mining Alpha Factors in Quantitative Investment}, author={Zhang, Tianping and Li, Yuanqi and Jin, Yifei and Li, Jian}, journal={arXiv preprint arXiv:2002.08245}, year={2020} } "
-
BR-NS: an Archive-less Approach to Novelty Search Robotics
Authors:
- Achkan Salehi
- Alexandre Coninx
- Stephane Doncieux
Abstract:
As open-ended learning based on divergent search algorithms such as Novelty Search (NS) draws more and more attention from the research community, it is natural to expect that its application to increasingly complex real-world problems will require the exploration to operate in higher dimensional Behavior Spaces which will not necessarily be Euclidean. Novelty Search traditionally relies on k-nearest neighbours search and an archive of previously visited behavior descriptors which are assumed to live in a Euclidean space. This is problematic because of a number of issues. On one hand, Euclidean distance and Nearest-neighbour search are known to behave differently and become less meaningful in high dimensional spaces. On the other hand, the archive has to be bounded since, memory considerations aside, the computational complexity of finding nearest neighbours in that archive grows linearithmically with its size. A sub-optimal bound can result in 'cycling' in the behavior space, which inhibits the progress of the exploration. Furthermore, the performance of NS depends on a number of algorithmic choices and hyperparameters, such as the strategies to add or remove elements to the archive and the number of neighbours to use in k-nn search. In this paper, we discuss an alternative approach to novelty estimation, dubbed Behavior Recognition based Novelty Search (BR-NS), which does not require an archive, makes no assumption on the metrics that can be defined in the behavior space and does not rely on nearest neighbours search. We conduct experiments to gain insight into its feasibility and dynamics as well as potential advantages over archive-based NS in terms of time complexity.Links:
Bibtex:
"@inproceedings{salehi2021br, title={BR-NS: an Archive-less Approach to Novelty Search}, author={Salehi, Achkan and Coninx, Alexandre and Doncieux, Stephane}, booktitle={Genetic and Evolutionary Computation Conference}, year={2021} } "
-
Dynamics-Aware Quality-Diversity for Efficient Learning of Skill Repertoires Robotics
Authors:
- Bryan Lim
- Luca Grillotti
- Lorenzo Bernasconi
- Antoine Cully
Links:
-
Ensemble Feature Extraction for Multi-Container Quality-Diversity Algorithms
Authors:
- Leo Cazenille
Abstract:
Quality-Diversity algorithms search for large collections of diverse and high-performing solutions, rather than just for a single solution like typical optimisation methods. They are specially adapted for multi-modal problems that can be solved in many different ways, such as complex reinforcement learning or robotics tasks. However, these approaches are highly dependent on the choice of feature descriptors (FDs) quantifying the similarity in behaviour of the solutions. While FDs usually needs to be hand-designed, recent studies have proposed ways to define them automatically by using feature extraction techniques, such as PCA or Auto-Encoders, to learn a representation of the problem from previously explored solutions. Here, we extend these approaches to more complex problems which cannot be efficiently explored by relying only on a single representation but require instead a set of diverse and complementary representations. We describe MC-AURORA, a Quality-Diversity approach that optimises simultaneously several collections of solutions, each with a different set of FDs, which are, in turn, defined automatically by an ensemble of modular auto-encoders. We show that this approach produces solutions that are more diverse than those produced by single-representation approaches.Links:
Bibtex:
"@inproceedings{cazenille2021ensemble, title={Ensemble Feature Extraction for Multi-Container Quality-Diversity Algorithms}, author={Cazenille, Leo}, booktitle={Genetic and Evolutionary Computation Conference}, year={2021} } "
-
Expressivity of Parameterized and Data-driven Representations in Quality Diversity Search
Authors:
- Alexander Hagg
- Sebastian Berns
- Alexander Asteroth
- Simon Colton
- Thomas Bäck
Abstract:
We consider multi-solution optimization and generative models for the generation of diverse artifacts and the discovery of novel solutions. In cases where the domain's factors of variation are unknown or too complex to encode manually, generative models can provide a learned latent space to approximate these factors. When used as a search space, however, the range and diversity of possible outputs are limited to the expressivity and generative capabilities of the learned model. We compare the output diversity of a quality diversity evolutionary search performed in two different search spaces: 1) a predefined parameterized space and 2) the latent space of a variational autoencoder model. We find that the search on an explicit parametric encoding creates more diverse artifact sets than searching the latent space. A learned model is better at interpolating between known data points than at extrapolating or expanding towards unseen examples. We recommend using a generative model's latent space primarily to measure similarity between artifacts rather than for search and generation. Whenever a parametric encoding is obtainable, it should be preferred over a learned representation as it produces a higher diversity of solutions.Links:
Bibtex:
"@inproceedings{hagg2021expressivity, title={Expressivity of Parameterized and Data-driven Representations in Quality Diversity Search}, author={Hagg, Alexander and Berns, Sebastian and Asteroth, Alexander and Colton, Simon and B{\"a}ck, Thomas}, booktitle={Genetic and Evolutionary Computation Conference}, year={2021} } "
-
Generating Diverse and Competitive Play-Styles for Strategy Games Games
Authors:
- Diego Perez-Liebana
- Cristina Guerrero-Romero
- Alexander Dockhorn
- Dominik Jeurissen
- Linjie Xu
Abstract:
Designing agents that are able to achieve different play-styles while maintaining a competitive level of play is a difficult task, especially for games for which the research community has not found super-human performance yet, like strategy games. These require the AI to deal with large action spaces, long-term planning and partial observability, among other well-known factors that make decision-making a hard problem. On top of this, achieving distinct play-styles using a general algorithm without reducing playing strength is not trivial. In this paper, we propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes) and show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play. Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.Links:
Bibtex:
"@article{perez2021generating, title={Generating Diverse and Competitive Play-Styles for Strategy Games}, author={Perez-Liebana, Diego and Guerrero-Romero, Cristina and Dockhorn, Alexander and Jeurissen, Dominik and Xu, Linjie}, journal={arXiv preprint arXiv:2104.08641}, year={2021} } "
-
Generating and Blending Game Levels via Quality-Diversity in the Latent Space of a Variational Autoencoder Games
Authors:
- Anurag Sarkar
- Seth Cooper
Abstract:
Several recent works have demonstrated the use of variational autoencoders (VAEs) for both generating levels in the style of existing games as well as blending levels across different games. Additionally, quality-diversity (QD) algorithms have also become popular for generating varied game content by using evolution to explore a search space while focusing on both variety and quality. In order to reap the benefits of both these approaches, we present a level generation and game blending approach that combines the use of VAEs and QD algorithms. Specifically, we train VAEs on game levels and then run the MAP-Elites QD algorithm using the learned latent space of the VAE as the search space. The latent space captures the properties of the games whose levels we want to generate and blend, while MAP-Elites searches this latent space to find a diverse set of levels optimizing a given objective such as playability. We test our method using models for 5 different platformer games as well as a blended domain spanning 3 of these games. Our results show that using MAP-Elites in conjunction with VAEs enables the generation of a diverse set of playable levels not just for each individual game but also for the blended domain while illuminating game-specific regions of the blended latent space.Links:
Bibtex:
"@article{sarkar2021generating, title={Generating and Blending Game Levels via Quality-Diversity in the Latent Space of a Variational Autoencoder}, author={Sarkar, Anurag and Cooper, Seth}, journal={arXiv preprint arXiv:2102.12463}, year={2021} } "
-
Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network Games
Authors:
- Matthew C. Fontaine
- Ruilin Liu
- Ahmed Khalifa
- Jignesh Modi
- Julian Togelius
- Amy K. Hoover
- Stefanos Nikolaidis
Abstract:
Generative adversarial networks (GANs) are now a ubiquitous approach to procedurally generating video game levels. While GAN generated levels are stylistically similar to human-authored examples, human designers often want to explore the generative design space of GANs to extract interesting levels. However, human designers find latent vectors opaque and would rather explore along dimensions the designer specifies, such as number of enemies or obstacles. We propose using state-of-the-art quality diversity algorithms designed to optimize continuous spaces, i.e. MAP-Elites with a directional variation operator and Covariance Matrix Adaptation MAP-Elites, to efficiently explore the latent space of a GAN to extract levels that vary across a set of specified gameplay measures. In the benchmark domain of Super Mario Bros, we demonstrate how designers may specify gameplay measures to our system and extract high-quality (playable) levels with a diverse range of level mechanics, while still maintaining stylistic similarity to human authored examples. An online user study shows how the different mechanics of the automatically generated levels affect subjective ratings of their perceived difficulty and appearance.Links:
Bibtex:
"@inproceedings{fontaine2020illuminating, title={Illuminating mario scenes in the latent space of a generative adversarial network}, author={Fontaine, Matthew C and Liu, Ruilin and Togelius, Julian and Hoover, Amy K and Nikolaidis, Stefanos}, booktitle={AAAI Conference on Artificial Intelligence}, year={2021}} "
-
Illuminating the Space of Beatable Lode Runner Levels Produced By Various Generative Adversarial Networks Games
Authors:
- Kirby Steckel
- Jacob Schrum
Abstract:
Generative Adversarial Networks (GANs) are capable of generating convincing imitations of elements from a training set, but the distribution of elements in the training set affects to difficulty of properly training the GAN and the quality of the outputs it produces. This paper looks at six different GANs trained on different subsets of data from the game Lode Runner. The quality diversity algorithm MAP-Elites was used to explore the set of quality levels that could be produced by each GAN, where quality was defined as being beatable and having the longest solution path possible. Interestingly, a GAN trained on only 20 levels generated the largest set of diverse beatable levels while a GAN trained on 150 levels generated the smallest set of diverse beatable levels, thus challenging the notion that more is always better when training GANs.Links:
Bibtex:
"@article{steckel2021illuminating, title={Illuminating the Space of Beatable Lode Runner Levels Produced By Various Generative Adversarial Networks}, author={Steckel, Kirby and Schrum, Jacob}, journal={arXiv preprint arXiv:2101.07868}, year={2021} } "
-
MAP-Elites enables Powerful Stepping Stones and Diversity for Modular Robotics Robotics
Authors:
- Jørgen Nordmoen
- Frank Veenstra
- Kai Olav Ellefsen
- Kyrre Glette
Abstract:
In modular robotics, modules can be reconfigured to change the morphology of the robot, making it able to adapt for specific tasks. However, optimizing both the body and control is a difficult challenge due to the intricate relationship between fine-tuning control and morphological changes that can invalidate such optimizations. To solve this challenge we compare three different Evolutionary Algorithms on their capacity to optimize morphologies in modular robotics. We compare two objective-based search algorithms, with MAP-Elites. To understand the benefit of diversity we transition the evolved populations into two difficult environments to see if diversity can have an impact on solving complex environments. In addition, we analyse the genealogical ancestry to shed light on the notion of stepping stones as key to enable high performance. The results show that MAP-Elites is capable of evolving the highest performing solutions in addition to generating the largest morphological diversity. For the transition between environments the results show that MAP-Elites is better at regaining performance by promoting morphological diversity. With the analysis of genealogical ancestry we show that MAP-Elites produces more diverse and higher performing stepping stones than the other objective-based search algorithms. Transitioning the populations to more difficult environments show the utility of morphological diversity, while the analysis of stepping stones show a strong correlation between diversity of ancestry and maximum performance on the locomotion task. The paper shows the advantage of promoting diversity for solving a locomotion task in different environments for modular robotics. By showing that the quality and diversity of stepping stones in Evolutionary Algorithms is an important factor for overall performance we have opened up a new area of analysis and results.Links:
Bibtex:
"@article{nordmoen2021map, title={MAP-Elites enables Powerful Stepping Stones and Diversity for Modular Robotics}, author={Nordmoen, J{\o}rgen and Veenstra, Frank and Ellefsen, Kai Olav and Glette, Kyrre}, journal={Frontiers in Robotics and AI}, volume={8}, year={2021}, publisher={Frontiers Media SA} } "
-
MAP-Elites for Constrained Optimization
Authors:
- Stefano Fioravanzo
- Giovanni Iacca
Abstract:
Constrained optimization problems are often characterized by multiple constraints that, in the practice, must be satisfied with different tolerance levels. While some constraints are hard and as such must be satisfied with zero-tolerance, others may be soft, such that non-zero violations are acceptable. Here, we evaluate the applicability of MAP-Elites to “illuminate” constrained search spaces by mapping them into feature spaces where each feature corresponds to a different constraint. On the one hand, MAP-Elites implicitly preserves diversity, thus allowing a good exploration of the search space. On the other hand, it provides an effective visualization that facilitates a better understanding of how constraint violations correlate with the objective function. We demonstrate the feasibility of this approach on a large set of benchmark problems, in various dimensionalities, and with different algorithmic configurations. As expected, numerical results show that a basic version of MAP-Elites cannot compete on all problems (especially those with equality constraints) with state-of-the-art algorithms that use gradient information or advanced constraint handling techniques. Nevertheless, it has a higher potential at finding constraint violations versus objectives trade-offs and providing new problem information. As such, it could be used in the future as an effective building-block for designing new constrained optimization algorithms.Bibtex:
"@article{fioravanzomap, title={MAP-Elites for Constrained Optimization}, author={Fioravanzo, Stefano and Iacca, Giovanni}, journal={Constraint Handling in Metaheuristics and Applications}, pages={151}, publisher={Springer} } "
-
Monte Carlo Elites: Quality-Diversity Selection as a Multi-Armed Bandit Problem Robotics
Authors:
- Konstantinos Sfikas
- Antonios Liapis
- Georgios N. Yannakakis
Abstract:
A core challenge of evolutionary search is the need to balance between exploration of the search space and exploitation of highly fit regions. Quality-diversity search has explicitly walked this tightrope between a population's diversity and its quality. This paper extends a popular quality-diversity search algorithm, MAP-Elites, by treating the selection of parents as a multi-armed bandit problem. Using variations of the upper-confidence bound to select parents from under-explored but potentially rewarding areas of the search space can accelerate the discovery of new regions as well as improve its archive's total quality. The paper tests an indirect measure of quality for parent selection: the survival rate of a parent's offspring. Results show that maintaining a balance between exploration and exploitation leads to the most diverse and high-quality set of solutions in three different testbeds.Links:
Bibtex:
"@inproceedings{sfikas2021monte, title={Monte Carlo Elites: Quality-Diversity Selection as a Multi-Armed Bandit Problem}, author={Sfikas, Konstantinos and Liapis, Antonios and Yannakakis, Georgios N}, booktitle={Genetic and Evolutionary Computation Conference}, year={2021} } "
-
Multi-Emitter MAP-Elites: Improving quality, diversity and convergence speed with heterogeneous sets of emitters Robotics
Authors:
- Antoine Cully
Abstract:
Quality-Diversity (QD) optimisation is a new family of learning algorithms that aims at generating collections of diverse and high-performing solutions. Among those algorithms, the recently introduced Covariance Matrix Adaptation MAP-Elites (CMA-ME) algorithm proposes the concept of emitters, which uses a predefined heuristic to drive the algorithm’s exploration. This algorithm was shown to outperform MAP-Elites, a popular QD algorithm that has demonstrated promising results in numerous applications. In this paper, we introduce Multi-Emitter MAP-Elites (ME-MAP-Elites), an algorithm that directly extends CMA-ME and improves its quality, diversity and data efficiency. It leverages the diversity of a heterogeneous set of emitters, in which each emitter type improves the optimisation process in different ways. A bandit algorithm dynamically finds the best selection of emitters depending on the current situation. We evaluate the performance of ME-MAP-Elites on six tasks, ranging from standard optimisation problems (in 100 dimensions) to complex locomotion tasks in robotics. Our comparisons against CMA-ME and MAP-Elites show that ME-MAP-Elites is faster at providing collections of solutions that are significantly more diverse and higher performing. Moreover, in cases where no fruitful synergy can be found between the different emitters, ME-MAP-Elites is equivalent to the best of the compared algorithms.Links:
Bibtex:
"@inproceedings{cully2020multi, title={Multi-Emitter MAP-Elites: Improving quality, diversity and convergence speed with heterogeneous sets of emitters}, author={Cully, Antoine}, booktitle={Genetic and Evolutionary Computation Conference}, year={2020} } "
-
On the Importance of Environments in Human-Robot Coordination Robotics
Authors:
- Matthew Fontaine*
- Ya-Chuan Hsu*
- Yulun Zhang*
- Bryon Tjanaka
- Stefanos Nikolaidis
Abstract:
When studying robots collaborating with humans, much of the focus has been on robot policies that coordinate fluently with human teammates in collaborative tasks. However, less emphasis has been placed on the effect of the environment on coordination behaviors. To thoroughly explore environments that result in diverse behaviors, we propose a framework for procedural generation of environments that are (1) stylistically similar to human-authored environments, (2) guaranteed to be solvable by the human-robot team, and (3) diverse with respect to coordination measures. We analyze the procedurally generated environments in the Overcooked benchmark domain via simulation and an online user study. Results show that the environments result in qualitatively different emerging behaviors and statistically significant differences in collaborative fluency metrics, even when the robot runs the same planning algorithm.Links:
Bibtex:
"@INPROCEEDINGS{FontaineB-RSS-21, AUTHOR = {Matthew Fontaine AND Ya-Chuan Hsu AND Yulun Zhang AND Bryon Tjanaka AND Stefanos Nikolaidis}, TITLE = {{On the Importance of Environments in Human-Robot Coordination}}, BOOKTITLE = {Proceedings of Robotics: Science and Systems}, YEAR = {2021}, ADDRESS = {Virtual}, MONTH = {July}, DOI = {10.15607/RSS.2021.XVII.038} } "
-
On the use of feature-maps and parameter control for improved quality-diversity meta-evolution Robotics
Authors:
- David M. Bossens
- Danesh Tarapore
Abstract:
In Quality-Diversity (QD) algorithms, which evolve a behaviourally diverse archive of high-performing solutions, the behaviour space is a difficult design choice that should be tailored to the target application. In QD meta-evolution, one evolves a population of QD algorithms to optimise the behaviour space based on an archive-level objective, the meta-fitness. This paper proposes an improved meta-evolution system such that (i) the database used to rapidly populate new archives is reformulated to prevent loss of quality-diversity; (ii) the linear transformation of base-features is generalised to a feature-map, a function of the base-features parametrised by the meta-genotype; and (iii) the mutation rate of the QD algorithm and the number of generations per meta-generation are controlled dynamically. Experiments on an 8-joint planar robot arm compare feature-maps (linear, non-linear, and feature-selection), parameter control strategies (static, endogenous, reinforcement learning, and annealing), and traditional MAP-Elites variants, for a total of 49 experimental conditions. Results reveal that non-linear and feature-selection feature-maps yield a 15-fold and 3-fold improvement in meta-fitness, respectively, over linear feature-maps. Reinforcement learning ranks among top parameter control methods. Finally, our approach allows the robot arm to recover a reach of over 80% for most damages and at least 60% for severe damages.Links:
Bibtex:
"@article{bossens2021use, title={On the use of feature-maps and parameter control for improved quality-diversity meta-evolution}, author={Bossens, David M and Tarapore, Danesh}, journal={arXiv preprint arXiv:2105.10317}, year={2021} } "
-
Policy Gradient Assisted MAP-Elites Robotics
Authors:
- Olle Nilsson
- Antoine Cully
Abstract:
Quality-Diversity optimization algorithms such as MAP-Elites, aim to generate collections of both diverse and high-performing solutions to an optimization problem. MAP-Elites has shown promising results in a variety of applications. In particular in evolutionary robotics tasks targeting the generation of behavioral repertoires that highlight the versatility of robots. However, for most robotics applications MAP-Elites is limited to using simple open-loop or low-dimensional controllers. Here we present Policy Gradient Assisted MAP-Elites (PGA-MAP-Elites), a novel algorithm that enables MAP-Elites to efficiently evolve large neural network controllers by introducing a gradient-based variation operator inspired by Deep Reinforcement Learning. This operator leverages gradient estimates obtained from a critic neural network to rapidly find higher-performing solutions and is paired with a traditional genetic variation to maintain a divergent search behavior. The synergy of these operators makes PGA-MAP-Elites an efficient yet powerful algorithm for finding diverse and high-performing behaviors. We evaluate our method on four different tasks for building behavioral repertoires that use deep neural network controllers. The results show that PGA-MAP-Elites significantly improves the quality of the generated repertoires compared to existing methods.Links:
Bibtex:
"@inproceedings{nilsson2021policy, title={Policy Gradient Assisted MAP-Elites}, author={Nilsson, Olle and Cully, Antoine}, booktitle={Genetic and Evolutionary Computation Conference}, year={2021} } "
-
Policy Manifold Search: Exploring the Manifold Hypothesis for Diversity-based Neuroevolution Robotics
Authors:
- Nemanja Rakicevic
- Antoine Cully
- Petar Kormushev
Abstract:
Neuroevolution is an alternative to gradient-based optimisation that has the potential to avoid local minima and allows parallelisation. The main limiting factor is that usually it does not scale well with parameter space dimensionality. Inspired by recent work examining neural network intrinsic dimension and loss landscapes, we hypothesise that there exists a low-dimensional manifold, embedded in the policy network parameter space, around which a high-density of diverse and useful policies are located. This paper proposes a novel method for diversity-based policy search via Neuroevolution, that leverages learned representations of the policy network parameters, by performing policy search in this learned representation space. Our method relies on the Quality-Diversity (QD) framework which provides a principled approach to policy search, and maintains a collection of diverse policies, used as a dataset for learning policy representations. Further, we use the Jacobian of the inverse-mapping function to guide the search in the representation space. This ensures that the generated samples remain in the high-density regions, after mapping back to the original space. Finally, we evaluate our contributions on four continuous-control tasks in simulated environments, and compare to diversity-based baselines.Links:
Bibtex:
"@inproceedings{rakicevic2021policy, title={Policy Manifold Search: Exploring the Manifold Hypothesis for Diversity-based Neuroevolution}, author={Rakicevic, Nemanja and Cully, Antoine and Kormushev, Petar}, booktitle={Genetic and Evolutionary Computation Conference}, year={2021} } "
-
Quality Diversity Genetic Programming for Learning Decision Tree Ensembles
Authors:
- Stephen Boisvert
- John W. Sheppard
Abstract:
Quality Diversity (QD) algorithms are a class of population-based evolutionary algorithms designed to generate sets of solutions that are both fit and diverse. In this paper, we describe a strategy for applying QD concepts to the generation of decision tree ensembles by optimizing collections of trees for both individually accurate and collectively diverse predictive behavior. We compare three variants of this QD strategy with two existing ensemble generation strategies over several classification data sets. We then briefly highlight the effect of the evolutionary algorithm at the core of the strategy. The examined algorithms generate ensembles with distinct predictive behaviors as measured by classification accuracy and intrinsic diversity. The plotted behaviors hint at highly data-dependent relationships between these metrics. QD-based strategies are suggested as a means to optimize classifier ensembles along this performance curve along with other suggestions for future work.Links:
Bibtex:
"@inproceedings{boisvert2021quality, title={Quality Diversity Genetic Programming for Learning Decision Tree Ensembles.}, author={Boisvert, Stephen and Sheppard, John W}, booktitle={EuroGP}, pages={3--18}, year={2021} } "
-
Quality-Diversity Optimization: A Novel Branch of Stochastic Optimization Review
Authors:
- Konstantinos Chatzilygeroudis
- Antoine Cully
- Vassilis Vassiliades
- Jean-Baptiste Mouret
Abstract:
Traditional optimization algorithms search for a single global optimum that maximizes (or minimizes) the objective function. Multimodal optimization algorithms search for the highest peaks in the search space, of which there can be more than one. Quality-Diversity algorithms are a recent addition to the evolutionary computation toolbox that do not only search for a single set of local optima, but instead try to illuminate the search space. In effect, they provide a holistic view of how high-performing solutions are distributed throughout a search space. The main differences with multimodal optimization algorithms are that (1) Quality-Diversity typically works in the behavioral space (or feature space), and not in the genotypic (or parameter) space, and (2) Quality-Diversity attempts to fill the whole behavior space, even if the niche is not a peak in the fitness landscape. In this chapter, we provide a gentle introduction to Quality-Diversity optimization, discuss the main representative algorithms, and the main current topics under consideration in the community. Throughout the chapter, we also discuss several successful applications of Quality-Diversity algorithms, including deep learning, robotics, and reinforcement learning.Links:
Bibtex:
"@incollection{chatzilygeroudis2021quality, title={Quality-Diversity Optimization: a novel branch of stochastic optimization}, author={Chatzilygeroudis, Konstantinos and Cully, Antoine and Vassiliades, Vassilis and Mouret, Jean-Baptiste}, booktitle={Black Box Optimization, Machine Learning, and No-Free Lunch Theorems}, pages={109--135}, year={2021}, publisher={Springer} } "
-
Seeking Quality Diversity in Evolutionary Co-design of Morphology and Control of Soft Tensegrity Modular Robots Robotics
Authors:
- Enrico Zardini
- Davide Zappetti
- Davide Zambrano
- Giovanni Iacca
- Dario Floreano
Abstract:
Designing optimal soft modular robots is difficult, due to non-trivial interactions between morphology and controller. Evolutionary algorithms (EAs), combined with physical simulators, represent a valid tool to overcome this issue. In this work, we investigate algorithmic solutions to improve the Quality Diversity of co-evolved designs of Tensegrity Soft Modular Robots (TSMRs) for two robotic tasks, namely goal reaching and squeezing through a narrow passage. To this aim, we use three different EAs, i.e., MAP-Elites and two custom algorithms: one based on Viability Evolution (ViE) and NEAT (ViE-NEAT), the other named Double Map MAP-Elites (DM-ME) and devised to seek diversity while co-evolving robot morphologies and neural network (NN)-based controllers. In detail, DM-ME extends MAP-Elites in that it uses two distinct feature maps, referring to morphologies and controllers respectively, and integrates a mechanism to automatically define the NN-related feature descriptor. Considering the fitness, in the goal-reaching task ViE-NEAT outperforms MAP-Elites and performs equivalently to DM-ME. Instead, when considering diversity in terms of 'illumination' of the feature space, DM-ME outperforms the other two algorithms on both tasks, providing a richer pool of possible robotic designs, whereas ViE-NEAT shows comparable performance to MAP-Elites on goal reaching, although it does not exploit any map.Links:
Bibtex:
"@inproceedings{zardini2021seeking, title={Seeking Quality Diversity in Evolutionary Co-design of Morphology and Control of Soft Tensegrity Modular Robots}, author={Zardini, Enrico and Zappetti, Davide and Zambrano, Davide and Iacca, Giovanni and Floreano, Dario}, booktitle={Genetic and Evolutionary Computation Conference}, year={2021} } "
-
Sparse Reward Exploration via Novelty Search and Emitters Robotics
Authors:
- Giuseppe Paolo
- Alexandre Coninx
- Stephane Doncieux
- Alban Laflaquière
Abstract:
Reward-based optimization algorithms require both exploration, to find rewards, and exploitation, to maximize performance. The need for efficient exploration is even more significant in sparse reward settings, in which performance feedback is given sparingly, thus rendering it unsuitable for guiding the search process. In this work, we introduce the SparsE Reward Exploration via Novelty and Emitters (SERENE) algorithm, capable of efficiently exploring a search space, as well as optimizing rewards found in potentially disparate areas. Contrary to existing emitters-based approaches, SERENE separates the search space exploration and reward exploitation into two alternating processes. The first process performs exploration through Novelty Search, a divergent search algorithm. The second one exploits discovered reward areas through emitters, i.e. local instances of population-based optimization algorithms. A meta-scheduler allocates a global computational budget by alternating between the two processes, ensuring the discovery and efficient exploitation of disjoint reward areas. SERENE returns both a collection of diverse solutions covering the search space and a collection of high-performing solutions for each distinct reward area. We evaluate SERENE on various sparse reward environments and show it compares favorably to existing baselines.Links:
Bibtex:
"@inproceedings{paolo2021sparse, title={Sparse Reward Exploration via Novelty Search and Emitters}, author={Paolo, Giuseppe and Coninx, Alexandre and Doncieux, St{\'e}phane and Laflaqui{\`e}re, Alban}, booktitle={Genetic and Evolutionary Computation Conference}, year={2021} } "
-
Unsupervised Behaviour Discovery with Quality-Diversity Optimisation Robotics
Authors:
- Luca Grillotti
- Antoine Cully
Abstract:
Quality-Diversity algorithms refer to a class of evolutionary algorithms designed to find a collection of diverse and high-performing solutions to a given problem. In robotics, such algorithms can be used for generating a collection of controllers covering most of the possible behaviours of a robot. To do so, these algorithms associate a behavioural descriptor to each of these behaviours. Each behavioural descriptor is used for estimating the novelty of one behaviour compared to the others. In most existing algorithms, the behavioural descriptor needs to be hand-coded, thus requiring prior knowledge about the task to solve. In this paper, we introduce: Autonomous Robots Realising their Abilities, an algorithm that uses a dimensionality reduction technique to automatically learn behavioural descriptors based on raw sensory data. The performance of this algorithm is assessed on three robotic tasks in simulation. The experimental results show that it performs similarly to traditional hand-coded approaches without the requirement to provide any hand-coded behavioural descriptor. In the collection of diverse and high-performing solutions, it also manages to find behaviours that are novel with respect to more features than its hand-coded baselines. Finally, we introduce a variant of the algorithm which is robust to the dimensionality of the behavioural descriptor space.Links:
Bibtex:
"@article{grillotti2021unsupervised, title={Unsupervised Behaviour Discovery with Quality-Diversity Optimisation}, author={Grillotti, Luca and Cully, Antoine}, journal={arXiv preprint arXiv:2106.05648}, year={2021} } "
2020: 43 Papers
-
A Framework for Automatic Behavior Generation in Multi-Function Swarms Robotics
Authors:
- Sondre A. Engebraaten
- Jonas Moen
- Oleg A. Yakimenko
- Kyrre Glette
Abstract:
Multi-function swarms are swarms that solve multiple tasks at once. For example, a quadcopter swarm could be tasked with exploring an area of interest while simultaneously functioning as ad-hoc relays. With this type of multi-function comes the challenge of handling potentially conflicting requirements simultaneously. Using the Quality-Diversity algorithm MAP-elites in combination with a suitable controller structure, a framework for automatic behavior generation in multi-function swarms is proposed. The framework is tested on a scenario with three simultaneous tasks: exploration, communication network creation and geolocation of RF emitters. A repertoire is evolved, consisting of a wide range of controllers, or behavior primitives, with different characteristics and trade-offs in the different tasks. This repertoire would enable the swarm to transition between behavior trade-offs online, according to the situational requirements. Furthermore, the effect of noise on the behavior characteristics in MAP-elites is investigated. A moderate number of re-evaluations is found to increase the robustness while keeping the computational requirements relatively low. A few selected controllers are examined, and the dynamics of transitioning between these controllers are explored. Finally, the study develops a methodology for analyzing the makeup of the resulting controllers. This is done through a parameter variation study where the importance of individual inputs to the swarm controllers is assessed and analyzed.Links:
Bibtex:
"@article{engebraaten2020framework, title={A Framework for Automatic Behavior Generation in Multi-Function Swarms}, author={Engebraaten, Sondre A and Moen, Jonas and Yakimenko, Oleg A and Glette, Kyrre}, journal={arXiv preprint arXiv:2007.08656}, year={2020} } "
-
An Analysis of Phenotypic Diversity in Multi-Solution Optimization
Authors:
- Alexander Hagg
- Mike Preuss
- Alexander Asteroth
- Thomas Bäck
Abstract:
More and more, optimization methods are used to find diverse solution sets. We compare solution diversity in multi-objective optimization, multimodal optimization, and quality diversity in a simple domain. We show that multiobjective optimization does not always produce much diversity, multimodal optimization produces higher fitness solutions, and quality diversity is not sensitive to genetic neutrality and creates the most diverse set of solutions. An autoencoder is used to discover phenotypic features automatically, producing an even more diverse solution set with quality diversity. Finally, we make recommendations about when to use which approach.Links:
Bibtex:
"@inproceedings{hagg2020analysis, title={An Analysis of Phenotypic Diversity in Multi-solution Optimization}, author={Hagg, Alexander and Preuss, Mike and Asteroth, Alexander and B{\"a}ck, Thomas}, booktitle={International Conference on Bioinspired Methods and Their Applications}, pages={43--55}, year={2020}, organization={Springer} } "
-
BOP-Elites, a Bayesian Optimisation algorithm for Quality-Diversity search
Authors:
- Paul Kent
- Juergen Branke
Abstract:
Quality Diversity (QD) algorithms such as MAP-Elites are a class of optimisation techniques that attempt to find a set of high-performing points from an objective function while enforcing behavioural diversity of the points over one or more interpretable, user chosen, feature functions. In this paper we propose the Bayesian Optimisation of Elites (BOP-Elites) algorithm that uses techniques from Bayesian Optimisation to explicitly model both quality and diversity with Gaussian Processes. By considering user defined regions of the feature space as 'niches' our task is to find the optimal solution in each niche. We propose a novel acquisition function to intelligently choose new points that provide the highest expected improvement to the ensemble problem of identifying the best solution in every niche. In this way each function evaluation enriches our modelling and provides insight to the whole problem, naturally balancing exploration and exploitation of the search space. The resulting algorithm is very effective in identifying the parts of the search space that belong to a niche in feature space, and finding the optimal solution in each niche. It is also significantly more sample efficient than simpler benchmark approaches. BOP-Elites goes further than existing QD algorithms by quantifying the uncertainty around our predictions and offering additional illumination of the search space through surrogate models.Links:
Bibtex:
"@article{kent2020bop, title={Bop-elites, a bayesian optimisation algorithm for quality-diversity search}, author={Kent, Paul and Branke, Juergen}, journal={arXiv preprint arXiv:2005.04320}, year={2020} } "
-
Baba is Y'all: Collaborative Mixed-Initiative Level Design
Authors:
- Megan Charity
- Ahmed Khalifa
- Julian Togelius
Abstract:
We present a collaborative mixed-initiative system for building levels for the puzzle game 'Baba is You'. Unlike previous mixed-initiative systems, Baba is Y'all is designed for collaborative asynchronous creation by multiple users over the internet. The system includes several AI-assisted features to help designers, including a level evolver and an automated player for playtesting. The level archive catalogues levels according to which mechanics are implemented and not implemented, allowing the system to ask users to design levels with specific combinations of mechanics. We describe the operation of the system and the results of a small-scale informal user test, and discuss future development paths for this system as well as for collaborative mixed-initiative systems in general.Links:
Bibtex:
"@article{charity2020baba, title={Baba is Y'all: Collaborative Mixed-Initiative Level Design}, author={Charity, Megan and Khalifa, Ahmed and Togelius, Julian}, journal={arXiv preprint arXiv:2003.14294}, year={2020}}"
-
Behavioral Repertoires for Soft Tensegrity Robots Robotics
Authors:
- Kyle Doney
- Aikaterini Petridou
- Jacob Karaul
- Ali Khan
- Geoffrey Liu
- John Rieffel
Abstract:
Mobile soft robots offer compelling applications in fields ranging from urban search and rescue to planetary exploration. A critical challenge of soft robotic control is that the nonlinear dynamics imposed by soft materials often result in complex behaviors that are counter-intuitive and hard to model or predict. As a consequence, most behaviors for mobile soft robots are discovered through empirical trial and error and hand-tuning. A second challenge is that soft materials are difficult to simulate with high fidelity, leading to a significant reality gap when trying to discover or optimize new behaviors. In this work we employ a Quality Diversity Algorithm running model-free on a physical soft tensegrity robot that autonomously generates a behavioral repertoire with no a priori knowledge of the robot’s dynamics, and minimal human intervention. The resulting behavior repertoire displays a diversity of unique locomotive gaits useful for a variety of tasks. These results help provide a road map for increasing the behavioral capabilities of mobile soft robots through real-world automation.Links:
Bibtex:
"@article{doney2020behavioral, title={Behavioral Repertoires for Soft Tensegrity Robots}, author={Doney, Kyle and Petridou, Aikaterini and Karaul, Jacob and Khan, Ali and Liu, Geoffrey and Rieffel, John}, journal={arXiv preprint arXiv:2009.10864}, year={2020}} "
-
CPPN2GAN: Combining Compositional Pattern Producing Networks and GANs for Large-scale Pattern Generation
Authors:
- Jacob Schrum
- Vanessa Volz
- Sebastian Risi
Abstract:
Generative Adversarial Networks (GANs) are proving to be a powerful indirect genotype-to-phenotype mapping for evolutionary search, but they have limitations. In particular, GAN output does not scale to arbitrary dimensions, and there is no obvious way of combining multiple GAN outputs into a cohesive whole, which would be useful in many areas, such as the generation of video game levels. Game levels often consist of several segments, sometimes repeated directly or with variation, organized into an engaging pattern. Such patterns can be produced with Compositional Pattern Producing Networks (CPPNs). Specifically, a CPPN can define latent vector GAN inputs as a function of geometry, which provides a way to organize level segments output by a GAN into a complete level. This new CPPN2GAN approach is validated in both Super Mario Bros. and The Legend of Zelda. Specifically, divergent search via MAP-Elites demonstrates that CPPN2GAN can better cover the space of possible levels. The layouts of the resulting levels are also more cohesive and aesthetically consistent.Links:
Bibtex:
"@article{schrum2020cppn2gan, title={CPPN2GAN: Combining Compositional Pattern Producing Networks and GANs for Large-scale Pattern Generation}, author={Schrum, Jacob and Volz, Vanessa and Risi, Sebastian}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, year={2020}, organization={ACM}}"
-
Competitiveness of MAP-Elites against Proximal Policy Optimization on locomotion tasks in deterministic simulations Robotics
Authors:
- Szymon Brych
- Antoine Cully
Abstract:
The increasing importance of robots and automation creates a demand for learnable controllers which can be obtained through various approaches such as Evolutionary Algorithms (EAs) or Reinforcement Learning (RL). Unfortunately, these two families of algorithms have mainly developed independently and there are only a few works comparing modern EAs with deep RL algorithms. We show that Multidimensional Archive of Phenotypic Elites (MAP-Elites), which is a modern EA, can deliver better-performing solutions than one of the state-of-the-art RL methods, Proximal Policy Optimization (PPO) in the generation of locomotion controllers for a simulated hexapod robot. Additionally, extensive hyper-parameter tuning shows that MAP-Elites displays greater robustness across seeds and hyper-parameter sets. Generally, this paper demonstrates that EAs combined with modern computational resources display promising characteristics and have the potential to contribute to the state-of-the-art in controller learning.Links:
Bibtex:
"@article{brych2020competitiveness, title={Competitiveness of MAP-Elites against Proximal Policy Optimization on locomotion tasks in deterministic simulations}, author={Brych, Szymon and Cully, Antoine}, journal={arXiv preprint arXiv:2009.08438}, year={2020} } "
-
Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space Games
Authors:
- Matthew C. Fontaine
- Julian Togelius
- Stefanos Nikolaidis
- Amy K. Hoover
Abstract:
We focus on the challenge of finding a diverse collection of quality solutions on complex continuous domains. While quality diversity (QD) algorithms like Novelty Search with Local Competition (NSLC) and MAP-Elites are designed to generate a diverse range of solutions, these algorithms require a large number of evaluations for exploration of continuous spaces. Meanwhile, variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are among the best-performing derivative-free optimizers in single-objective continuous domains. This paper proposes a new QD algorithm called Covariance Matrix Adaptation MAP-Elites (CMA-ME). Our new algorithm combines the self-adaptation techniques of CMA-ES with archiving and mapping techniques for maintaining diversity in QD. Results from experiments based on standard continuous optimization benchmarks show that CMA-ME finds better-quality solutions than MAP-Elites; similarly, results on the strategic game Hearthstone show that CMA-ME finds both a higher overall quality and broader diversity of strategies than both CMA-ES and MAP-Elites. Overall, CMA-ME more than doubles the performance of MAP-Elites using standard QD performance metrics. These results suggest that QD algorithms augmented by operators from state-of-the-art optimization algorithms can yield high-performing methods for simultaneously exploring and optimizing continuous search spaces, with significant applications to design, testing, and reinforcement learning among other domains.Links:
Bibtex:
"@article{fontaine2019covariance, title={Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space}, author={Fontaine, Matthew C and Togelius, Julian and Nikolaidis, Stefanos and Hoover, Amy K}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, year={2020}, organization={ACM}}"
-
DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics Robotics
Authors:
- Stephane Doncieux
- Nicolas Bredeche
- Léni Le Goff
- Benoît Girard
- Alexandre Coninx
- Olivier Sigaud
- Mehdi Khamassi
- Natalia Díaz-Rodríguez
- David Filliat
- Timothy Hospedales
- A. Eiben
- Richard Duro
Abstract:
Robots are still limited to controlled conditions that the robot designer knows in enough detail to endow the robot with the appropriate models or behaviors. Learning algorithms add some flexibility with the ability to discover the appropriate behavior given either some demonstrations or a reward to guide its exploration with a reinforcement learning algorithm. Reinforcement learning algorithms rely on the definition of state and action spaces that define reachable behaviors. Their adaptation capability critically depends on the representations of these spaces: small and discrete spaces result in fast learning while large and continuous spaces are challenging and either require a long training period or prevent the robot from converging to an appropriate behavior. Besides the operational cycle of policy execution and the learning cycle, which works at a slower time scale to acquire new policies, we introduce the redescription cycle, a third cycle working at an even slower time scale to generate or adapt the required representations to the robot, its environment and the task. We introduce the challenges raised by this cycle and we present DREAM (Deferred Restructuring of Experience in Autonomous Machines), a developmental cognitive architecture to bootstrap this redescription process stage by stage, build new state representations with appropriate motivations, and transfer the acquired knowledge across domains or tasks or even across robots. We describe results obtained so far with this approach and conclude with a discussion of the questions it raises in Neuroscience.Links:
Bibtex:
"@article{doncieux2020dream, title={Dream architecture: a developmental approach to open-ended learning in robotics}, author={Doncieux, Stephane and Bredeche, Nicolas and Goff, L{\'e}ni Le and Girard, Beno{\^\i}t and Coninx, Alexandre and Sigaud, Olivier and Khamassi, Mehdi and D{\'\i}az-Rodr{\'\i}guez, Natalia and Filliat, David and Hospedales, Timothy and others}, journal={arXiv preprint arXiv:2005.06223}, year={2020} } "
-
Discovering Representations for Black-box Optimization
Authors:
- Adam Gaier
- Alexander Asteroth
- Jean-Baptiste Mouret
Abstract:
The encoding of solutions in black-box optimization is a delicate, handcrafted balance between expressiveness and domain knowledge: between exploring a wide variety of solutions, and ensuring that those solutions are useful. Our main insight is that this process can be automated by generating a dataset of high-performing solutions with a quality diversity algorithm (here, MAP-Elites), then learning a representation with a generative model (here, a Variational Autoencoder) from that dataset. Our second insight is that this representation can be used to scale quality diversity optimization to higher dimensions, but only if we carefully mix solutions generated with the learned representation and those generated with traditional variation operators. We demonstrate these capabilities by learning a low-dimensional encoding for the inverse kinematics of a thousand-joint planar arm. The results show that learned representations make it possible to solve high-dimensional problems with orders of magnitude fewer evaluations than the standard MAP-Elites, and that, once solved, the produced encoding can be used for rapid optimization of novel, but similar, tasks. The presented techniques not only scale up quality diversity algorithms to high dimensions, but show that black-box optimization encodings can be automatically learned, rather than hand designed.Links:
Bibtex:
"@inproceedings{gaier:hal-02555221, TITLE = {{Discovering Representations for Black-box Optimization}}, AUTHOR = {Gaier, Adam and Asteroth, Alexander and Mouret, Jean-Baptiste}, URL = {https://hal.inria.fr/hal-02555221}, BOOKTITLE = {{GECCO'20 - Genetic and Evolutionary Computation Conference}}, ADDRESS = {Cancun, Mexico}, VOLUME = {11}, YEAR = {2020}, MONTH = Jul, DOI = {10.1145/3377930.3390221}, PDF = {https://hal.inria.fr/hal-02555221/file/dde_gecco.pdf}, HAL_ID = {hal-02555221}, HAL_VERSION = {v1}}"
-
Diversity Preservation in Minimal Criterion Coevolution through Resource Limitation.
Authors:
- Jonathan C. Brant
- Kenneth O. Stanley
Links:
Bibtex:
"@inproceedings{Brant2020, author = {Brant, Jonathan C. and Stanley, Kenneth O.}, title = {Diversity Preservation in Minimal Criterion Coevolution through Resource Limitation.}, year = {2020}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://drive.google.com/file/d/1DV0PtS-U8R3YpiqUlGN5A9yGLts3LI5k/view}, booktitle = {Genetic and Evolutionary Computation Conference (GECCO '19) Companion}, series = {GECCO '20} } "
-
Diversity-based Design Assist for Large Legged Robots
Authors:
- David Howard
- Thomas Lowe
- Wade Geles
Abstract:
We combine MAP-Elites and highly parallelisable simulation to explore the design space of a class of large legged robots, which stand at around 2m tall and whose design and construction is not well-studied. The simulation is modified to account for factors such as motor torque and weight, and presents a reasonable fidelity search space. A novel robot encoding allows for bio-inspired features such as legs scaling along the length of the body. The impact of three possible control generation schemes is assessed in the context of body-brain co-evolution, showing that even constrained problems benefit strongly from coupling-promoting mechanisms. A two-stage process is implemented. In the first stage, a library of possible robots is generated, treating user requirements as constraints. In the second stage, the most promising robot niches are analysed and a suite of human-understandable design rules generated related to the values of their feature variables. These rules, together with the library, are then ready to be used by a (human) robot designer as a Design Assist tool.Links:
Bibtex:
"@article{howard2020diversity, title={Diversity-based Design Assist for Large Legged Robots}, author={Howard, David and Lowe, Thomas and Geles, Wade}, journal={arXiv preprint arXiv:2004.08057}, year={2020}}"
-
EGAD! an Evolved Grasping Analysis Dataset for diversity and reproducibility in robotic manipulation
Authors:
- Douglas Morrison
- Peter Corke
- Jürgen Leitner
Abstract:
We present the Evolved Grasping Analysis Dataset (EGAD), comprising over 2000 generated objects aimed at training and evaluating robotic visual grasp detection algorithms. The objects in EGAD are geometrically diverse, filling a space ranging from simple to complex shapes and from easy to difficult to grasp, compared to other datasets for robotic grasping, which may be limited in size or contain only a small number of object classes. Additionally, we specify a set of 49 diverse 3D-printable evaluation objects to encourage reproducible testing of robotic grasping systems across a range of complexity and difficulty. The dataset, code and videos can be found at this URLLinks:
Bibtex:
"@article{morrison2020egad, title={EGAD! an Evolved Grasping Analysis Dataset for diversity and reproducibility in robotic manipulation}, author={Morrison, Douglas and Corke, Peter and Leitner, Jurgen}, journal={IEEE Robotics and Automation Letters}, year={2020}, publisher={IEEE}}"
-
Effective Diversity in Population Based Reinforcement Learning
Authors:
- Jack Parker-Holder
- Aldo Pacchiano
- Krzysztof Choromanski
- Stephen Roberts
Abstract:
Exploration is a key problem in reinforcement learning, since agents can only learn from data they acquire in the environment. With that in mind, maintaining a population of agents is an attractive method, as it allows data to be collected with a diverse set of behaviors. This behavioral diversity is often boosted via multi-objective loss functions. However, those approaches typically leverage mean field updates based on pairwise distances, which makes them susceptible to cycling behaviors and increased redundancy. In addition, explicitly boosting diversity often has a detrimental impact on optimizing already fruitful behaviors for rewards. As such, the reward-diversity trade-off typically relies on heuristics. Finally, such methods require behavioral representations, often handcrafted and domain specific. In this paper, we introduce an approach to optimize all members of a population simultaneously. Rather than using pairwise distance, we measure the volume of the entire population in a behavioral manifold, defined by task-agnostic behavioral embeddings. In addition, our algorithm, Diversity via Determinants (DvD), adapts the degree of diversity during training using online learning techniques. We introduce both evolutionary and gradient-based instantiations of DvD and show they effectively improve exploration without reducing performance when better exploration is not required.Links:
Bibtex:
"@article{2020arXiv200200632P, author = {{Parker-Holder}, Jack and {Pacchiano}, Aldo and {Choromanski}, Krzysztof and {Roberts}, Stephen}, title = "{Effective Diversity in Population Based Reinforcement Learning}", journal = {arXiv e-prints}, keywords = {Computer Science - Machine Learning, Statistics - Machine Learning}, year = 2020, month = feb, eid = {arXiv:2002.00632}, pages = {arXiv:2002.00632}, archivePrefix = {arXiv}, eprint = {2002.00632}, primaryClass = {cs.LG}, adsurl = {https://ui.adsabs.harvard.edu/abs/2020arXiv200200632P}, adsnote = {Provided by the SAO/NASA Astrophysics Data System} }"
-
Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions Robotics
Authors:
- Rui Wang
- Joel Lehman
- Aditya Rawal
- Jiale Zhi
- Yulun Li
- Jeff Clune
- Kenneth O. Stanley
Abstract:
Creating open-ended algorithms, which generate their own never-ending stream of novel and appropriately challenging learning opportunities, could help to automate and accelerate progress in machine learning. A recent step in this direction is the Paired Open-Ended Trailblazer (POET), an algorithm that generates and solves its own challenges, and allows solutions to goal-switch between challenges to avoid local optima. However, the original POET was unable to demonstrate its full creative potential because of limitations of the algorithm itself and because of external issues including a limited problem space and lack of a universal progress measure. Importantly, both limitations pose impediments not only for POET, but for the pursuit of open-endedness in general. Here we introduce and empirically validate two new innovations to the original algorithm, as well as two external innovations designed to help elucidate its full potential. Together, these four advances enable the most open-ended algorithmic demonstration to date. The algorithmic innovations are (1) a domain-general measure of how meaningfully novel new challenges are, enabling the system to potentially create and solve interesting challenges endlessly, and (2) an efficient heuristic for determining when agents should goal-switch from one problem to another (helping open-ended search better scale). Outside the algorithm itself, to enable a more definitive demonstration of open-endedness, we introduce (3) a novel, more flexible way to encode environmental challenges, and (4) a generic measure of the extent to which a system continues to exhibit open-ended innovation. Enhanced POET produces a diverse range of sophisticated behaviors that solve a wide range of environmental challenges, many of which cannot be solved through other means.Links:
Bibtex:
"@inproceedings{wang2020enhanced, title={Enhanced POET: Open-ended reinforcement learning through unbounded invention of learning challenges and their solutions}, author={Wang, Rui and Lehman, Joel and Rawal, Aditya and Zhi, Jiale and Li, Yulun and Clune, Jeffrey and Stanley, Kenneth}, booktitle={International Conference on Machine Learning}, pages={9940--9951}, year={2020}, organization={PMLR} } "
-
Environmental Adaptation of Robot Morphology and Control through Real-world Evolution
Authors:
- Tønnes F. Nygaard
- Charles P. Martin
- David Howard
- Jim Torresen
- Kyrre Glette
Abstract:
Robots operating in the real world will experience a range of different environments and tasks. It is essential for the robot to have the ability to adapt to its surroundings to work efficiently in changing conditions. Evolutionary robotics aims to solve this by optimizing both the control and body (morphology) of a robot, allowing adaptation to internal, as well as external factors. Most work in this field has been done in physics simulators, which are relatively simple and not able to replicate the richness of interactions found in the real world. Solutions that rely on the complex interplay between control, body, and environment are therefore rarely found. In this paper, we rely solely on real-world evaluations and apply evolutionary search to yield combinations of morphology and control for our mechanically self-reconfiguring quadruped robot. We evolve solutions on two very different physical surfaces and analyze the results in terms of both control and morphology. We then transition to two previously unseen surfaces to demonstrate the generality of our method. We find that the evolutionary search adapts both control and body to the different physical environments, yielding significantly different morphology-controller configurations. Moreover, we observe that the solutions found by our method work well on previously unseen terrains.Links:
Bibtex:
"@article{nygaard2020environmental, title={Environmental Adaptation of Robot Morphology and Control through Real-world Evolution}, author={Nygaard, T{\o}nnes F and Martin, Charles P and Howard, David and Torresen, Jim and Glette, Kyrre}, journal={arXiv preprint arXiv:2003.13254}, year={2020}}"
-
Evolving the behavior of machines: from micro to macroevolution Review
Authors:
- Jean-Baptiste Mouret
Abstract:
Evolution gave rise to creatures that are arguably more sophisticated than the greatest human-designed systems. This feat has inspired computer scientists since the advent of computing and led to optimization tools that can evolve complex neural networks for machines—an approach known as “neuroevolution.” After a few successes in designing evolvable representations for high-dimensional artifacts, the field has been recently revitalized by going beyond optimization: to many, the wonder of evolution is less in the perfect optimization of each species than in the creativity of such a simple iterative process, that is, in the diversity of species. This modern view of artificial evolution is moving the field away from microevolution, following a fitness gradient in a niche, to macroevolution, filling many niches with highly different species. It already opened promising applications, like evolving gait repertoires, video game levels for different tastes, and diverse designs for aerodynamic bikes.Links:
Bibtex:
"@article{mouret2020evolving, title={Evolving the behavior of machines: from micro to macroevolution}, author={Mouret, Jean-Baptiste}, journal={Iscience}, pages={101731}, year={2020}, publisher={Elsevier} }"
-
Exploring the BipedalWalker benchmark with MAP-Elites and Curiosity-driven A3C
Authors:
- Vikas Gupta
- Nathanael Aubert-Kato
- Leo Cazenille
Bibtex:
"@article{gupta2020exploring, title={Exploring the BipedalWalker benchmark with MAP-Elites and Curiosity-driven A3C}, author={Gupta, Vikas and Aubert-Kato, Nathanael and Cazenille, Leo}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, year={2020}, organization={ACM}}"
-
Exploring the Evolution of GANs through Quality Diversity
Authors:
- Victor Costa
- Nuno Lourenço
- Joao Correia
- Penousal Machado
Bibtex:
"@inproceedings{justesen2019map, title={Exploring the Evolution of GANs through Quality Diversity}, author={Costa, Victor and Lourenço, Nuno and Correia, Joao and Machado, Penousal}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, year={2020}, organization={ACM}}"
-
Fast and stable MAP-Elites in noisy domains using deep grids
Authors:
- Manon Flageat
- Antoine Cully
Abstract:
Quality-Diversity optimisation algorithms enable the evolution of collections of both high-performing and diverse solutions. These collections offer the possibility to quickly adapt and switch from one solution to another in case it is not working as expected. Quality-Diversity optimisation therefore finds many applications in real-world problems such as robotic control. However, QD algorithms, like most optimisation algorithms, are very sensitive to uncertainty not only in the fitness function, but also in the behavioural descriptors. Yet, such uncertainties are frequent in real-world applications. Few works have explored this issue in the specific case of QD algorithms, and, inspired by the literature in Evolutionary Computation, they mainly focus on using sampling to approximate the true value of the performances of a solution. However, sampling approaches require a high number of evaluations, which in many applications such as robotics can quickly become impractical. In this work, we propose Deep-Grid MAP-Elites, a variant of the MAP-Elites algorithm that uses an archive of similar previously encountered solutions to approximate the performance of a solution. We compare our approach to previously explored ones on three noisy tasks: a standard optimisation task, the control of a redundant arm and a simulated Hexapod robot. The experimental results show that this simple approach is significantly more resilient to noise on the behavioural descriptors, while achieving competitive performances in terms of fitness optimisation, and being more sample-efficient than other existing approaches.Links:
Bibtex:
"@article{flageat2020fast, title={Fast and stable MAP-Elites in noisy domains using deep grids}, author={Flageat, Manon and Cully, Antoine}, booktitle={Proceedings of the 2020 Conference on Artificial Life}, year={2020}}"
-
Finding Game Levels with the Right Difficulty in a Few Trials through Intelligent Trial-and-Error
Authors:
- Miguel González-Duque
- Rasmus Berg Palm
- David Ha
- Sebastian Risi
Abstract:
Methods for dynamic difficulty adjustment allow games to be tailored to particular players to maximize their engagement. However, current methods often only modify a limited set of game features such as the difficulty of the opponents, or the availability of resources. Other approaches, such as experience-driven Procedural Content Generation (PCG), can generate complete levels with desired properties such as levels that are neither too hard nor too easy, but require many iterations. This paper presents a method that can generate and search for complete levels with a specific target difficulty in only a few trials. This advance is enabled through an Intelligent Trial-and-Error algorithm, originally developed to allow robots to adapt quickly. Our algorithm first creates a large variety of different levels that vary across predefined dimensions such as leniency or map coverage. The performance of an AI playing agent on these maps gives a proxy for how difficult the level would be for another AI agent (e.g. one that employs Monte Carlo Tree Search instead of Greedy Tree Search); using this information, a Bayesian Optimization procedure is deployed, updating the difficulty of the prior map to reflect the ability of the agent. The approach can reliably find levels with a specific target difficulty for a variety of planning agents in only a few trials, while maintaining an understanding of their skill landscape.Links:
Bibtex:
"@article{gonzalez2020finding, title={Finding Game Levels with the Right Difficulty in a Few Trials through Intelligent Trial-and-Error}, author={Gonz{\'a}lez-Duque, Miguel and Palm, Rasmus Berg and Ha, David and Risi, Sebastian}, journal={arXiv preprint arXiv:2005.07677}, year={2020}}"
-
Generating and Adapting to Diverse Ad-Hoc Cooperation Agents in Hanabi
Authors:
- Rodrigo Canaan
- Xianbo Gao
- Julian Togelius
- Andy Nealen
- Stefan Menzel
Abstract:
Hanabi is a cooperative game that brings the problem of modeling other players to the forefront. In this game, coordinated groups of players can leverage pre-established conventions to great effect, but playing in an ad-hoc setting requires agents to adapt to their partner's strategies with no previous coordination. Evaluating an agent in this setting requires a diverse population of potential partners, but so far, the behavioral diversity of agents has not been considered in a systematic way. This paper proposes Quality Diversity algorithms as a promising class of algorithms to generate diverse populations for this purpose, and generates a population of diverse Hanabi agents using MAP-Elites. We also postulate that agents can benefit from a diverse population during training and implement a simple 'meta-strategy' for adapting to an agent's perceived behavioral niche. We show this meta-strategy can work better than generalist strategies even outside the population it was trained with if its partner's behavioral niche can be correctly inferred, but in practice a partner's behavior depends on and interferes with the meta-agent's own behavior, suggesting an avenue for future research in characterizing another agent's behavior during gameplay.Links:
Bibtex:
"@article{canaan2020generating, title={Generating and Adapting to Diverse Ad-Hoc Cooperation Agents in Hanab}, author={Canaan, Rodrigo and Gao, Xianbo and Togelius, Julian and Nealen, Andy and Menzel, Stefan}, journal={arXiv preprint arXiv:2004.13710}, year={2020}}"
-
Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity Robotics
Authors:
- Tanmay Gangwani
- Jian Peng
- Yuan Zhou
Abstract:
Quality-Diversity (QD) is a concept from Neuroevolution with some intriguing applications to Reinforcement Learning. It facilitates learning a population of agents where each member is optimized to simultaneously accumulate high task-returns and exhibit behavioral diversity compared to other members. In this paper, we build on a recent kernel-based method for training a QD policy ensemble with Stein variational gradient descent. With kernels based on f-divergence between the stationary distributions of policies, we convert the problem to that of efficient estimation of the ratio of these stationary distributions. We then study various distribution ratio estimators used previously for off-policy evaluation and imitation and re-purpose them to compute the gradients for policies in an ensemble such that the resultant population is diverse and of high-quality.Links:
Bibtex:
"@inproceedings{gangwani2020harnessing, title={Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity}, author={Gangwani, Tanmay and Peng, Jian and Zhou, Yuan}, booktitle={Proceedings of the CoRL 2020}, year={2020} } "
-
Illuminating Elite Patches of Chemical Space
Authors:
- Jonas Verhellen
- Jeriek Van den Abeele
Abstract:
In the past few years, there has been considerable activity in both academic and industrial research to develop innovative machine learning approaches to locate novel, high-performing molecules in chemical space. Here we describe a new and fundamentally different type of approach that provides a holistic overview of how high-performing molecules are distributed throughout a search space. Based on an open-source, graph-based implementation [Jensen, Chem. Sci., 2019, 12, 3567-3572] of a traditional genetic algorithm for molecular optimisation, and influenced by state-of-the-art concepts from soft robot design [Mouret et al., IEEE Trans. Evolut. Comput., 2016, 22, 623-630], we provide an algorithm that (i) produces a large diversity of high-performing, yet qualitatively different molecules, (ii) illuminates the distribution of optimal solutions, and (iii) improves search efficiency compared to both machine learning and traditional genetic algorithm approaches.Links:
Bibtex:
"@article{Verhellen2020, author = "Jonas Verhellen and Jeriek Van den Abeele", title = "{Illuminating Elite Patches of Chemical Space}", year = "2020", month = "7", url = "https://chemrxiv.org/articles/preprint/Illuminating_Elite_Patches_of_Chemical_Space/12608228", doi = "10.26434/chemrxiv.12608228.v1"}"
-
Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network
Authors:
- Matthew C. Fontaine
- Ruilin Liu
- Ahmed Khalifa
- Jignesh Modi
- Julian Togelius
- Amy K. Hoover
- Stefanos Nikolaidis
Abstract:
Generative adversarial networks (GANs) are now a ubiquitous approach to procedurally generating video game levels. While GAN generated levels are stylistically similar to human-authored examples, human designers often want to explore the generative design space of GANs to extract interesting levels. However, human designers find latent vectors opaque and would rather explore along dimensions the designer specifies, such as number of enemies or obstacles. We propose using state-of-the-art quality diversity algorithms designed to optimize continuous spaces, i.e. MAP-Elites with a directional variation operator and Covariance Matrix Adaptation MAP-Elites, to efficiently explore the latent space of a GAN to extract levels that vary across a set of specified gameplay measures. In the benchmark domain of Super Mario Bros, we demonstrate how designers may specify gameplay measures to our system and extract high-quality (playable) levels with a diverse range of level mechanics, while still maintaining stylistic similarity to human authored examples. An online user study shows how the different mechanics of the automatically generated levels affect subjective ratings of their perceived difficulty and appearance.Links:
Bibtex:
"@misc{fontaine2020illuminating, title={Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network}, author={Matthew C. Fontaine and Ruilin Liu and Ahmed Khalifa and Jignesh Modi and Julian Togelius and Amy K. Hoover and Stefanos Nikolaidis}, year={2020}, eprint={2007.05674}, archivePrefix={arXiv}, primaryClass={cs.AI} }"
-
Illuminating Super Mario Bros - Quality-Diversity Within Platformer Level Generation
Authors:
- Oliver Henry Withington
Bibtex:
"@article{Withington2020illuminating, title={Illuminating Super Mario Bros - Quality-Diversity Within Platformer Level Generation}, author={Withington, Oliver Henry}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, year={2020}, organization={ACM}}"
-
Interactive Constrained MAP-Elites Analysis and Evaluation of the Expressiveness of the Feature Dimensions
Authors:
- Alberto Alvarez
- Steve Dahlskog
- Jose Font
- Julian Togelius
Abstract:
We propose the Interactive Constrained MAP-Elites, a quality-diversity solution for game content generation, implemented as a new feature of the Evolutionary Dungeon Designer: a mixed-initiative co-creativity tool for designing dungeons. The feature uses the MAP-Elites algorithm, an illumination algorithm that segregates the population among several cells depending on their scores with respect to different behavioral dimensions. Users can flexibly and dynamically alternate between these dimensions anytime, thus guiding the evolutionary process in an intuitive way, and then incorporate suggestions produced by the algorithm in their room designs. At the same time, any modifications performed by the human user will feed back into MAP-Elites, closing a circular workflow of constant mutual inspiration. This paper presents the algorithm followed by an in-depth analysis of its behaviour, with the aims of evaluating the expressive range of all possible dimension combinations in several scenarios, as well as discussing their influence in the fitness landscape and in the overall performance of the mixed-initiative procedural content generation.Links:
Bibtex:
"@article{alvarez2020interactive, title={Interactive Constrained MAP-Elites Analysis and Evaluation of the Expressiveness of the Feature Dimensions}, author={Alvarez, Alberto and Dahlskog, Steve and Font, Jose and Togelius, Julian}, journal={arXiv preprint arXiv:2003.03377}, year={2020}}"
-
Interactive Constrained MAP-Elites: Analysis and Evaluation of the Expressiveness of the Feature Dimensions Games
Authors:
- Alberto Alvarez
- Steve Dahlskog
- Jose Font
- Julian Togelius
Abstract:
We propose the Interactive Constrained MAP-Elites, a quality-diversity solution for game content generation, implemented as a new feature of the Evolutionary Dungeon Designer: a mixed-initiative co-creativity tool for designing dungeons. The feature uses the MAP-Elites algorithm, an illumination algorithm that segregates the population among several cells depending on their scores with respect to different behavioral dimensions. Users can flexibly and dynamically alternate between these dimensions anytime, thus guiding the evolutionary process in an intuitive way, and then incorporate suggestions produced by the algorithm in their room designs. At the same time, any modifications performed by the human user will feed back into MAP-Elites, closing a circular workflow of constant mutual inspiration. This paper presents the algorithm followed by an in-depth analysis of its behaviour, with the aims of evaluating the expressive range of all possible dimension combinations in several scenarios, as well as discussing their influence in the fitness landscape and in the overall performance of the mixed-initiative procedural content generation.Links:
Bibtex:
"@article{alvarez2020interactive, title={Interactive constrained map-elites: Analysis and evaluation of the expressiveness of the feature dimensions}, author={Alvarez, Alberto and Fernandez, Jose Maria Maria Font and Dahlskog, Steve and Togelius, Julian}, journal={IEEE Transactions on Games}, year={2020}, publisher={IEEE} } "
-
Learning a Behavioral Repertoire from Demonstrations Games
Authors:
- Niels Justesen
- Miguel Gonzalez Duque
- Daniel Cabarcas Jaramillo
- Jean-Baptiste Mouret
- Sebastian Risi
Abstract:
Imitation Learning (IL) is a machine learning approach to learn a policy from a dataset of demonstrations. IL can be useful to kick-start learning before applying reinforcement learning (RL) but it can also be useful on its own, e.g. to learn to imitate human players in video games. However, a major limitation of current IL approaches is that they learn only a single 'average' policy based on a dataset that possibly contains demonstrations of numerous different types of behaviors. In this paper, we propose a new approach called Behavioral Repertoire Imitation Learning (BRIL) that instead learns a repertoire of behaviors from a set of demonstrations by augmenting the state-action pairs with behavioral descriptions. The outcome of this approach is a single neural network policy conditioned on a behavior description that can be precisely modulated. We apply this approach to train a policy on 7,777 human replays to perform build-order planning in StarCraft II. Principal Component Analysis (PCA) is applied to construct a low-dimensional behavioral space from the high-dimensional army unit composition of each demonstration. The results demonstrate that the learned policy can be effectively manipulated to express distinct behaviors. Additionally, by applying the UCB1 algorithm, we are able to adapt the behavior of the policy - in-between games - to reach a performance beyond that of the traditional IL baseline approach.Links:
Bibtex:
"@inproceedings{justesen2020learning, title={Learning a Behavioral Repertoire from Demonstrations}, author={Justesen, Niels and Gonz{\'a}lez-Duque, Miguel and Cabarcas, Daniel and Mouret, Jean-Baptiste and Risi, Sebastian}, booktitle={2020 IEEE Conference on Games (CoG)}, pages={383--390}, year={2020}, organization={IEEE} } "
-
Learning behaviour-performance maps with meta-evolution
Authors:
- David Bossens
- Jean-Baptiste Mouret
- Danesh Tarapore
Abstract:
The MAP-Elites quality-diversity algorithm has been successful in robotics because it can create a behaviorally diverse set of solutions that later can be used for adaptation, for instance to unanticipated damages. In MAP-Elites, the choice of the behaviour space is essential for adaptation, the recovery of performance in unseen environments, since it defines the diversity of the solutions. Current practice is to hand-code a set of behavioural features; however, given the large space of possible behaviour-performance maps, the designer does not know a priori which behavioural features maximise a map's adaptation potential. We introduce a new meta-evolution algorithm that discovers those behavioural features that maximise future adaptations. The proposed method applies Covariance Matrix Adaptation Evolution Strategy to evolve a population of behaviour-performance maps to maximise a meta-fitness function that rewards adaptation. The method stores solutions found by MAP-Elites in a database which allows new behaviour-performance maps to be rapidly constructed on-the-fly. To evaluate this system, we study the gait of the RHex robot as it adapts to a range of damages sustained on its legs. When compared to MAP-Elites with user-defined behaviour spaces, we demonstrate that the meta-evolution system learns high-performing gaits with or without damages injected to the robot.Links:
Bibtex:
"@inproceedings{bossens:hal-02555231, TITLE = {{Learning behaviour-performance maps with meta-evolution}}, AUTHOR = {Bossens, David M and Mouret, Jean-Baptiste and Tarapore, Danesh}, URL = {https://hal.inria.fr/hal-02555231}, BOOKTITLE = {{GECCO'20 - Genetic and Evolutionary Computation Conference}}, ADDRESS = {Cancun, Mexico}, YEAR = {2020}, MONTH = Jul, KEYWORDS = {Learning paradigms ; quality-diversity algorithms ; behavioural diversity ; meta-learning ; evolutionary robotics ; damage recovery}, PDF = {https://hal.inria.fr/hal-02555231/file/pap349s3-file1.pdf}, HAL_ID = {hal-02555231}, HAL_VERSION = {v1}}"
-
Learning the Designer's Preferences to Drive Evolution
Authors:
- Alberto Alvarez
- Jose Font
Abstract:
This paper presents the Designer Preference Model, a data-driven solution that aims to learn from user-generated data in a Quality-Diversity Mixed-Initiative Co-Creativity (QD MI-CC) tool, with the aim of modelling the user's design style to better assess the tool's procedurally generated content with respect to that user's preferences. Through this approach, we aim to increase the user's agency over the generated content in a way that neither stalls the user-tool reciprocal stimuli loop nor fatigues the user with periodical suggestion handpicking. We describe the details of this novel solution, as well as its implementation in the MI-CC tool the Evolutionary Dungeon Designer. We present and discuss our findings from the initial tests carried out, identifying the open challenges for this combined line of research that integrates MI-CC with Procedural Content Generation through Machine Learning.Links:
Bibtex:
"@inproceedings{alvarez2020learning, title={Learning the designer’s preferences to drive evolution}, author={Alvarez, Alberto and Font, Jose}, booktitle={International Conference on the Applications of Evolutionary Computation (Part of EvoStar)}, pages={431--445}, year={2020}, organization={Springer}}"
-
Mech-Elites: Illuminating the Mechanic Space of GVGAI Games
Authors:
- Megan Charity
- Michael Cerny Green
- Ahmed Khalifa
- Julian Togelius
Abstract:
This paper introduces a fully automatic method of mechanic illumination for general video game level generation. Using the Constrained MAP-Elites algorithm and the GVG-AI framework, this system generates the simplest tile based levels that contain specific sets of game mechanics and also satisfy playability constraints. We apply this method to illuminate mechanic space for 4 different games in GVG-AI: Zelda, Solarfox, Plants, and RealPortals.Links:
Bibtex:
"@inproceedings{charity2020mech, title={Mech-Elites: Illuminating the Mechanic Space of GVG-AI}, author={Charity, Megan and Green, Michael Cerny and Khalifa, Ahmed and Togelius, Julian}, booktitle={International Conference on the Foundations of Digital Games}, pages={1--10}, year={2020} } "
-
Model-Based Quality-Diversity Search for Efficient Robot Learning
Authors:
- Leon Keller
- Daniel Tanneberg
- Svenja Stark
- Jan Peters
Abstract:
Despite recent progress in robot learning, it still remains a challenge to program a robot to deal with open-ended object manipulation tasks. One approach that was recently used to autonomously generate a repertoire of diverse skills is a novelty-based Quality-Diversity (QD) algorithm. However, like most evolutionary algorithms, QD suffers from sample-inefficiency and, thus, it is challenging to apply it in real-world scenarios. This paper tackles this problem by integrating a neural network that predicts the behavior of the perturbed parameters into a novelty-based QD algorithm. In the proposed Model-based Quality-Diversity search (M-QD), the network is trained concurrently with the repertoire and is used to avoid executing unpromising actions in the novelty search process. Furthermore, it is used to adapt the skills of the final repertoire in order to generalize the skills to different scenarios. Our experiments show that enhancing a QD algorithm with such a forward model improves the sample-efficiency and performance of the evolutionary process and the skill adaptation.Links:
Bibtex:
"@inproceedings{keller2021, title={Model-Based Quality-Diversity Search for Efficient Robot Learning}, author={Keller, Leon and Tanneberg, Daniel and Stark, Svenja and Peters, Jan}, year={2020}, booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)} }"
-
Path Towards Multilevel Evolution of Robots Robotics
Authors:
- Shelvin Chand
- David Howard
Abstract:
Multi-level evolution is a bottom-up robotic design paradigm which decomposes the design problem into layered sub-tasks that involve concurrent search for appropriate materials, component geometry and overall morphology. Each of the three layers operates with the goal of building a library of diverse candidate solutions, which will be used either as building blocks for the layer above or provided to the decision maker for final use. In this paper we provide a theoretical discussion on the concepts and technologies that could potentially be used as building blocks for this framework.Links:
Bibtex:
"@inproceedings{chand2020path, title={Path towards multilevel evolution of robots}, author={Chand, Shelvin and Howard, David}, booktitle={Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion}, pages={1381--1382}, year={2020} } "
-
Policy Manifold Search for Improving Diversity-based Neuroevolution Robotics
Authors:
- Nemanja Rakicevic
- Antoine Cully
- Petar Kormushev
Abstract:
Diversity-based approaches have recently gained popularity as an alternative paradigm to performance-based policy search. A popular approach from this family, Quality-Diversity (QD), maintains a collection of high-performing policies separated in the diversity-metric space, defined based on policies' rollout behaviours. When policies are parameterised as neural networks, i.e. Neuroevolution, QD tends to not scale well with parameter space dimensionality. Our hypothesis is that there exists a low-dimensional manifold embedded in the policy parameter space, containing a high density of diverse and feasible policies. We propose a novel approach to diversity-based policy search via Neuroevolution, that leverages learned latent representations of the policy parameters which capture the local structure of the data. Our approach iteratively collects policies according to the QD framework, in order to (i) build a collection of diverse policies, (ii) use it to learn a latent representation of the policy parameters, (iii) perform policy search in the learned latent space. We use the Jacobian of the inverse transformation (i.e. the reconstruction function) to guide the search in the latent space. This ensures that the generated samples remain in the high-density regions of the original space, after reconstruction. We evaluate our contributions on three continuous control tasks in simulated environments, and compare to diversity-based baselines. The findings suggest that our approach yields a more efficient and robust policy search process.Links:
Bibtex:
"@article{rakicevic2020policy, title={Policy Manifold Search for Improving Diversity-based Neuroevolution}, author={Rakicevic, Nemanja and Cully, Antoine and Kormushev, Petar}, journal={Beyond Backpropagation: Novel Ideas for Training Neural Architectures Workshop at NeurIPS 2020}, year={2020} } "
-
QED: using Quality-Environment-Diversity to evolve resilient robot swarms
Authors:
- David M. Bossens
- Danesh Tarapore
Abstract:
In swarm robotics, any of the robots in a swarm may be affected by different faults, resulting in significant performance declines. To allow fault recovery from randomly injected faults to different robots in a swarm, a model-free approach may be preferable due to the accumulation of faults in models and the difficulty of predicting the behaviour of neighbouring robots. One model-free approach to fault recovery involves two phases: during simulation, a quality-diversity algorithm evolves a behaviourally diverse archive of controllers; during the target application, a search for the best controller is initiated after fault injection. In quality-diversity algorithms, the choice of the behavioural descriptor is a key design choice that determines the quality of the evolved archives, and therefore the fault recovery performance. Although the environment is an important determinant of behaviour, the impact of environmental diversity is often ignored in the choice of a suitable behavioural descriptor. This study compares different behavioural descriptors, including two generic descriptors that work on a wide range of tasks, one hand-coded descriptor which fits the domain of interest, and one novel type of descriptor based on environmental diversity, which we call Quality-Environment-Diversity (QED). Results demonstrate that the above-mentioned model-free approach to fault recovery is feasible in the context of swarm robotics, reducing the fault impact by a factor of 2-3. Further, the environmental diversity obtained with QED yields a unique behavioural diversity profile that allows it to recover from high-impact faults.Links:
Bibtex:
"@article{bossens2020qed, title={QED: using Quality-Environment-Diversity to evolve resilient robot swarms}, author={Bossens, David M and Tarapore, Danesh}, journal={arXiv preprint arXiv:2003.02341}, year={2020}}"
-
Quality Diversity for Multi-task Optimization
Authors:
- Jean-Baptiste Mouret
- Glenn Maguire
Abstract:
Quality Diversity (QD) algorithms are a recent family of optimization algorithms that search for a large set of diverse but high-performing solutions. In some specific situations, they can solve multiple tasks at once. For instance, they can find the joint positions required for a robotic arm to reach a set of points, which can also be solved by running a classic optimizer for each target point. However, they cannot solve multiple tasks when the fitness needs to be evaluated independently for each task (e.g., optimizing policies to grasp many different objects). In this paper, we propose an extension of the MAP-Elites algorithm, called Multi-task MAP-Elites, that solves multiple tasks when the fitness function depends on the task. We evaluate it on a simulated parameterized planar arm (10-dimensional search space; 5000 tasks) and on a simulated 6-legged robot with legs of different lengths (36-dimensional search space; 2000 tasks). The results show that in both cases our algorithm outperforms the optimization of each task separately with the CMA-ES algorithm.Links:
Bibtex:
"@article{mouret2020quality, title={Quality Diversity for Multi-task Optimization}, author={Mouret, Jean-Baptiste and Maguire, Glenn}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, year={2020}, organization={ACM}}"
-
Quality and Diversity in Evolutionary Modular Robotics Robotics
Authors:
- Jørgen Nordmoen
- Frank Veenstra
- Kai Olav Ellefsen
- Kyrre Glette
Abstract:
In Evolutionary Robotics a population of solutions is evolved to optimize robots that solve a given task. However, in traditional Evolutionary Algorithms, the population of solutions tends to converge to local optima when the problem is complex or the search space is large, a problem known as premature convergence. Quality Diversity algorithms try to overcome premature convergence by introducing additional measures that reward solutions for being different while not necessarily performing better. In this paper we compare a single-objective Evolutionary Algorithm with two diversity-promoting search algorithms: a Multi-Objective Evolutionary Algorithm and MAP-Elites, a Quality Diversity algorithm, for the difficult problem of evolving control and morphology in modular robotics. We compare their ability to produce high-performing solutions, in addition to analyzing the evolved morphological diversity. The results show that all three search algorithms are capable of evolving high-performing individuals. However, the Quality Diversity algorithm is more adept at filling all niches with high-performing solutions. This confirms that Quality Diversity algorithms are well suited for evolving modular robots and can be an important means of generating repertoires of high-performing solutions that can be exploited both at design- and runtime.Links:
Bibtex:
"@inproceedings{nordmoen2020quality, title={Quality and Diversity in Evolutionary Modular Robotics}, author={Nordmoen, J{\o}rgen and Veenstra, Frank and Ellefsen, Kai Olav and Glette, Kyrre}, booktitle={2020 IEEE Symposium Series on Computational Intelligence (SSCI)}, pages={2109--2116}, year={2020}, organization={IEEE}} "
-
Rapidly adapting robot swarms with Swarm Map-based Bayesian Optimisation Robotics
Authors:
- David M. Bossens
- Danesh Tarapore
Abstract:
Rapid performance recovery from unforeseen environmental perturbations remains a grand challenge in swarm robotics. To solve this challenge, we investigate a behaviour adaptation approach, where one searches an archive of controllers for potential recovery solutions. To apply behaviour adaptation in swarm robotic systems, we propose two algorithms: (i) Swarm Map-based Optimisation (SMBO), which selects and evaluates one controller at a time, for a homogeneous swarm, in a centralised fashion; and (ii) Swarm Map-based Optimisation Decentralised (SMBO-Dec), which performs an asynchronous batch-based Bayesian optimisation to simultaneously explore different controllers for groups of robots in the swarm. We set up foraging experiments with a variety of disturbances: injected faults to proximity sensors, ground sensors, and the actuators of individual robots, with 100 unique combinations for each type. We also investigate disturbances in the operating environment of the swarm, where the swarm has to adapt to drastic changes in the number of resources available in the environment, and to one of the robots behaving disruptively towards the rest of the swarm, with 30 unique conditions for each such perturbation. The viability of SMBO and SMBO-Dec is demonstrated, comparing favourably to variants of random search and gradient descent, and various ablations, and improving performance up to 80% compared to the performance at the time of fault injection within at most 30 evaluations.Links:
Bibtex:
"@article{bossens2020rapidly, title={Rapidly adapting robot swarms with Swarm Map-based Bayesian Optimisation}, author={Bossens, David M and Tarapore, Danesh}, journal={arXiv preprint arXiv:2012.11444}, year={2020} } "
-
Reality-assisted evolution of soft robots through large-scale physical experimentation: a review Robotics
Authors:
- Toby Howison
- Simon Hauser
- Josie Hughes
- Fumiya Iida
Abstract:
In this review we introduce the framework of reality-assisted evolution to summarize a growing trend towards combining model-based and model-free approaches to improve the design of physically embodied soft robots. In silico, data-driven models build, adapt and improve representations of the target system using real-world experimental data. By simulating huge numbers of virtual robots using these data-driven models, optimization algorithms can illuminate multiple design candidates for transference to the real world. In reality, large-scale physical experimentation facilitates the fabrication, testing and analysis of multiple candidate designs. Automated assembly and reconfigurable modular systems enable significantly higher numbers of real-world design evaluations than previously possible. Large volumes of ground-truth data gathered via physical experimentation can be returned to the virtual environment to improve data-driven models and guide optimization. Grounding the design process in physical experimentation ensures the complexity of virtual robot designs does not outpace the model limitations or available fabrication technologies. We outline key developments in the design of physically embodied soft robots under the framework of reality-assisted evolution.Links:
Bibtex:
"@article{howison2020reality, title={Reality-assisted evolution of soft robots through large-scale physical experimentation: a review}, author={Howison, Toby and Hauser, Simon and Hughes, Josie and Iida, Fumuiya}, journal={arXiv preprint arXiv:2009.13960}, year={2020} } "
-
Scaling MAP-Elites to Deep Neuroevolution
Authors:
- Cédric Colas
- Joost Huizinga
- Vashisht Madhavan
- Jeff Clune
Abstract:
Quality-Diversity (QD) algorithms, and MAP-Elites (ME) in particular, have proven very useful for a broad range of applications including enabling real robots to recover quickly from joint damage, solving strongly deceptive maze tasks or evolving robot morphologies to discover new gaits. However, present implementations of MAP-Elites and other QD algorithms seem to be limited to low-dimensional controllers with far fewer parameters than modern deep neural network models. In this paper, we propose to leverage the efficiency of Evolution Strategies (ES) to scale MAP-Elites to high-dimensional controllers parameterized by large neural networks. We design and evaluate a new hybrid algorithm called MAP-Elites with Evolution Strategies (ME-ES) for post-damage recovery in a difficult high-dimensional control task where traditional ME fails. Additionally, we show that ME-ES performs efficient exploration, on par with state-of-the-art exploration algorithms in high-dimensional control tasks with strongly deceptive rewards.Links:
Bibtex:
"@article{colas2020scaling, title={Scaling MAP-Elites to Deep Neuroevolution}, author={Colas, C{\'e}dric and Huizinga, Joost and Madhavan, Vashisht and Clune, Jeff}, journal={arXiv preprint arXiv:2003.01825}, year={2020}}"
-
Using MAP-Elites to support policymaking around Workforce Scheduling and Routing
Authors:
- Neil Urquhart
- Emma Hart
- William Hutcheson
Abstract:
Algorithms such as MAP-Elites provide a means of allowing users to explore a solution space by returning an archive of high-performing solutions. Such an archive can allow the user an overview of the solution space, which may be useful when formulating policy around the problem itself. The number of solutions that can potentially be returned by MAP-Elites is controlled by a parameter d that discretises the user-defined features into 'bins'. For a fixed evaluation budget, increasing the number of bins increases user choice, but at the same time may lead to a reduction in the overall quality of solutions. We undertake a study of the application of MAP-Elites to a Workforce Scheduling and Routing problem, using a set of realistic instances based in London.Links:
Bibtex:
"@article{urquhart2020using, title={Using MAP-Elites to support policy making around Workforce Scheduling and Routing}, author={Urquhart, Neil and Hart, Emma and Hutcheson, William}, journal={at-Automatisierungstechnik}, volume={68}, number={2}, pages={110--117}, year={2020}, publisher={De Gruyter Oldenbourg} } "
2019: 29 Papers
-
Adaptive Prior Selection for Repertoire-based Online Adaptation in Robotics Robotics
Authors:
- Rituraj Kaushik
- Pierre Desreumaux
- Jean-Baptiste Mouret
Abstract:
Repertoire-based learning is a data-efficient adaptation approach based on a two-step process in which (1) a large and diverse set of policies is learned in simulation, and (2) a planning or learning algorithm chooses the most appropriate policies according to the current situation (e.g., a damaged robot, a new object, etc.). In this paper, we relax the assumption of previous works that a single repertoire is enough for adaptation. Instead, we generate repertoires for many different situations (e.g., with a missing leg, on different floors, etc.) and let our algorithm select the most useful prior. Our main contribution is an algorithm, APROL (Adaptive Prior selection for Repertoire-based Online Learning), to plan the next action by incorporating these priors when the robot has no information about the current situation. We evaluate APROL on two simulated tasks: (1) pushing unknown objects of various shapes and sizes with a robotic arm and (2) a goal reaching task with a damaged hexapod robot. We compare with "Reset-free Trial and Error" (RTE) and various single repertoire-based baselines. The results show that APROL solves both tasks in less interaction time than the baselines. Additionally, we demonstrate APROL on a real, damaged hexapod that quickly learns to pick compensatory policies to reach a goal by avoiding obstacles in the path.Links:
Bibtex:
"@article{kaushik2019, author = {Rituraj Kaushik and Pierre Desreumaux and Jean-Baptiste Mouret}, title = {Adaptive Prior Selection for Repertoire-based Online Learning in Robotics}, journal = {arXiV}, volume = {abs/1907.07029}, year = {2019}, url = {http://arxiv.org/abs/1907.07029}}"
-
Alphastar: An evolutionary computation perspective Games
Authors:
- Kai Arulkumaran
- Antoine Cully
- Julian Togelius
Abstract:
In January 2019, DeepMind revealed AlphaStar to the world: the first artificial intelligence (AI) system to beat a professional player at the game of StarCraft II, representing a milestone in the progress of AI. AlphaStar draws on many areas of AI research, including deep learning, reinforcement learning, game theory, and evolutionary computation (EC). In this paper we analyze AlphaStar primarily through the lens of EC, presenting a new look at the system and relating it to many concepts in the field. We highlight some of its most interesting aspects: the use of Lamarckian evolution, competitive co-evolution, and quality diversity. In doing so, we hope to provide a bridge between the wider EC community and one of the most significant AI systems developed in recent times.Links:
Bibtex:
"@article{arulkumaran2019alphastar, title={Alphastar: An evolutionary computation perspective}, author={Arulkumaran, Kai and Cully, Antoine and Togelius, Julian}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, year={2019}, organization={ACM} }"
-
An illumination algorithm approach to solving the micro-depot routing problem
Authors:
- Neil Urquhart
- Silke Hohl
- Emma Hart
Bibtex:
"@inproceedings{urquhart2019illumination, title={An illumination algorithm approach to solving the micro-depot routing problem}, author={Urquhart, Neil and H{\"o}hl, Silke and Hart, Emma}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference}, pages={1347--1355}, year={2019}, organization={ACM} }"
-
Are quality diversity algorithms better at generating stepping stones than objective-based search?
Authors:
- Adam Gaier
- Alexander Asteroth
- Jean-Baptiste Mouret
Bibtex:
"@inproceedings{gaier2019quality, title={Are quality diversity algorithms better at generating stepping stones than objective-based search?}, author={Gaier, Adam and Asteroth, Alexander and Mouret, Jean-Baptiste}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, pages={115--116}, year={2019}, organization={ACM} }"
-
Automatic Calibration of Artificial Neural Networks for Zebrafish Collective Behaviours Using a Quality Diversity Algorithm Robotics
Authors:
- Leo Cazenille
- Nicolas Bredeche
- Jose Halloy
Abstract:
During the last two decades, various models have been proposed for fish collective motion. These models are mainly developed to decipher the biological mechanisms of social interaction between animals. They consider very simple homogeneous unbounded environments and it is not clear that they can accurately simulate the collective trajectories. Moreover, when the models are more accurate, the question of their scalability to either larger groups or more elaborate environments remains open. This study deals with learning how to simulate realistic collective motion of collectives of zebrafish, using real-world tracking data. The objective is to devise an agent-based model that can be implemented on an artificial robotic fish that can blend into a collective of real fish. We present a novel approach that uses Quality Diversity algorithms, a class of algorithms that emphasise exploration over pure optimisation. In particular, we use CVT-MAP-Elites, a variant of the state-of-the-art MAP-Elites algorithm for high-dimensional search spaces. Results show that Quality Diversity algorithms not only outperform classic evolutionary reinforcement learning methods at the macroscopic level (i.e. group behaviour), but are also able to generate more realistic biomimetic behaviours at the microscopic level (i.e. individual behaviour).Links:
Bibtex:
"@inproceedings{cazenille2019automatic, title={Automatic Calibration of Artificial Neural Networks for Zebrafish Collective Behaviours Using a Quality Diversity Algorithm}, author={Cazenille, Leo and Bredeche, Nicolas and Halloy, Jos{\'e}}, booktitle={Conference on Biomimetic and Biohybrid Systems}, pages={38--50}, year={2019}, organization={Springer} }"
-
Autonomous Skill Discovery with Quality-diversity and Unsupervised Descriptors Robotics
Authors:
- Antoine Cully
Abstract:
Quality-Diversity optimization is a new family of optimization algorithms that, instead of searching for a single optimal solution to a task, searches for a large collection of solutions that all solve the task in a different way. This approach is particularly promising for learning behavioral repertoires in robotics, as such a diversity of behaviors enables robots to be more versatile and resilient. However, these algorithms require the user to manually define behavioral descriptors, which are used to determine whether two solutions are different or similar. The choice of a behavioral descriptor is crucial, as it completely changes the solution types that the algorithm derives. In this paper, we introduce a new method to automatically define this descriptor by combining Quality-Diversity algorithms with unsupervised dimensionality reduction algorithms. This approach enables robots to autonomously discover the range of their capabilities while interacting with their environment. The results from two experimental scenarios demonstrate that robots can autonomously discover a large range of possible behaviors, without any prior knowledge about their morphology and environment. Furthermore, these behaviors are deemed to be similar to handcrafted solutions that use domain knowledge and significantly more diverse than when using existing unsupervised methods.Links:
Bibtex:
"@inproceedings{cully2019Autonomous, author = {Cully, Antoine}, title = {Autonomous Skill Discovery with Quality-diversity and Unsupervised Descriptors}, booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}, year = {2019}, location = {Prague, Czech Republic}, pages = {81--89}, doi = {10.1145/3321707.3321804}, publisher = {ACM} }"
-
Behavioral Repertoire via Generative Adversarial Policy Networks Robotics
Authors:
- Marija Jegorova
- Stephane Doncieux
- Timothy Hospedales
Abstract:
Learning algorithms are enabling robots to solve increasingly challenging real-world tasks. These approaches often rely on demonstrations and reproduce the behavior shown. Unexpected changes in the environment may require using different behaviors to achieve the same effect, for instance to reach and grasp an object in changing clutter. An emerging paradigm addressing this robustness issue is to learn a diverse set of successful behaviors for a given task, from which a robot can select the most suitable policy when faced with a new environment. In this paper, we explore a novel realization of this vision by learning a generative model over policies. Rather than learning a single policy, or a small fixed repertoire, our generative model for policies compactly encodes an unbounded number of policies and allows novel controller variants to be sampled. Leveraging our generative policy network, a robot can sample novel behaviors until it finds one that works for a new environment. We demonstrate this idea with an application of robust ball-throwing in the presence of obstacles. We show that this approach achieves a greater diversity of behaviors than an existing evolutionary approach, while maintaining good efficacy of sampled behaviors, allowing a Baxter robot to hit targets more often when ball throwing in the presence of obstacles.Links:
Bibtex:
"@article{jegorovabehavioral, title={Behavioral Repertoire via Generative Adversarial Policy Networks}, author={Jegorova, Marija and Doncieux, St{\'e}phane and Hospedales, Timothy M} }"
-
Benchmarking Open-Endedness in Minimal Criterion Coevolution.
Authors:
- Jonathan C. Brant
- Kenneth O. Stanley
Links:
Bibtex:
"@inproceedings{Brant2019, author = {Brant, Jonathan C. and Stanley, Kenneth O.}, title = {Benchmarking Open-Endedness in Minimal Criterion Coevolution.}, year = {2019}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://drive.google.com/file/d/1n_YuID1N1jznZ2y35Z_y0lzP0mPdEDMu/view}, booktitle = {Genetic and Evolutionary Computation Conference (GECCO '19) Companion}, series = {GECCO '19} } "
-
Comparing reliability of grid-based Quality-Diversity algorithms using artificial landscapes
Authors:
- Leo Cazenille
Abstract:
Quality-Diversity (QD) algorithms are a recent type of optimisation methods that search for a collection of both diverse and high performing solutions. They can be used to effectively explore a target problem according to features defined by the user. However, the field of QD still does not possess extensive methodologies and reference benchmarks to compare these algorithms. We propose a simple benchmark to compare the reliability of QD algorithms by optimising the Rastrigin function, an artificial landscape function often used to test global optimisation methods.Links:
Bibtex:
"@inproceedings{cazenille2019comparing, title={Comparing reliability of grid-based Quality-Diversity algorithms using artificial landscapes}, author={Cazenille, Leo}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, pages={249--250}, year={2019}, organization={ACM} }"
-
Designing neural networks through neuroevolution
Authors:
- Kenneth Stanley
- Jeff Clune
- Joel Lehman
- Risto Miikkulainen
Abstract:
Much of recent machine learning has focused on deep learning, in which neural network weights are trained through variants of stochastic gradient descent. An alternative approach comes from the field of neuroevolution, which harnesses evolutionary algorithms to optimize neural networks, inspired by the fact that natural brains themselves are the products of an evolutionary process. Neuroevolution enables important capabilities that are typically unavailable to gradient-based approaches, including learning neural network building blocks (for example activation functions), hyperparameters, architectures and even the algorithms for learning themselves. Neuroevolution also differs from deep learning (and deep reinforcement learning) by maintaining a population of solutions during search, enabling extreme exploration and massive parallelization. Finally, because neuroevolution research has (until recently) developed largely in isolation from gradient-based neural network research, it has developed many unique and effective techniques that should be effective in other machine learning areas too. This Review looks at several key aspects of modern neuroevolution, including large-scale computing, the benefits of novelty and diversity, the power of indirect encoding, and the field’s contributions to meta-learning and architecture search. Our hope is to inspire renewed interest in the field as it meets the potential of the increasing computation available today, to highlight how many of its ideas can provide an exciting resource for inspiration and hybridization to the deep learning, deep reinforcement learning and machine learning communities, and to explain how neuroevolution could prove to be a critical tool in the long-term pursuit of artificial general intelligence.Links:
Bibtex:
"@article{stanley2019designing, title={Designing neural networks through neuroevolution}, author={Stanley, Kenneth O and Clune, Jeff and Lehman, Joel and Miikkulainen, Risto}, journal={Nature Machine Intelligence}, volume={1}, number={1}, pages={24--35}, year={2019} }"
-
Empowering Quality Diversity in Dungeon Design with Interactive Constrained MAP-Elites Games
Authors:
- Alberto Alvarez
- Steve Dahlskog
- Jose Font
- Julian Togelius
Abstract:
We propose the use of quality-diversity algorithms for mixed-initiative game content generation. This idea is implemented as a new feature of the Evolutionary Dungeon Designer, a system for mixed-initiative design of the type of levels you typically find in computer role playing games. The feature uses the MAP-Elites algorithm, an illumination algorithm which divides the population into a number of cells depending on their values along several behavioral dimensions. Users can flexibly and dynamically choose relevant dimensions of variation, and incorporate suggestions produced by the algorithm in their map designs. At the same time, any modifications performed by the human feed back into MAP-Elites, and are used to generate further suggestions.Links:
Bibtex:
"@article{alvarez2019empowering, title={Empowering Quality Diversity in Dungeon Design with Interactive Constrained MAP-Elites}, author={Alvarez, Alberto and Dahlskog, Steve and Font, Jose and Togelius, Julian}, journal={arXiv preprint arXiv:1906.05175}, year={2019} }"
-
Evaluating MAP-Elites on Constrained Optimization Problems
Authors:
- Stefano Fioravanzo
- Giovanni Iacca
Abstract:
Constrained optimization problems are often characterized by multiple constraints that, in practice, must be satisfied with different tolerance levels. While some constraints are hard and as such must be satisfied with zero tolerance, others may be soft, such that non-zero violations are acceptable. Here, we evaluate the applicability of MAP-Elites to "illuminate" constrained search spaces by mapping them into feature spaces where each feature corresponds to a different constraint. On the one hand, MAP-Elites implicitly preserves diversity, thus allowing a good exploration of the search space. On the other hand, it provides an effective visualization that facilitates a better understanding of how constraint violations correlate with the objective function. We demonstrate the feasibility of this approach on a large set of benchmark problems, in various dimensionalities, and with different algorithmic configurations. As expected, numerical results show that a basic version of MAP-Elites cannot compete on all problems (especially those with equality constraints) with state-of-the-art algorithms that use gradient information or advanced constraint handling techniques. Nevertheless, it has a higher potential for finding constraint-violation vs. objective trade-offs and providing new problem information. As such, it could be used in the future as an effective building block for designing new constrained optimization algorithms.Links:
Bibtex:
"@article{fioravanzo2019evaluating, title={Evaluating MAP-Elites on Constrained Optimization Problems}, author={Fioravanzo, Stefano and Iacca, Giovanni}, journal={arXiv preprint arXiv:1902.00703}, year={2019} }"
-
Evolving embodied intelligence from materials to machines Robotics
Authors:
- David Howard
- Agoston Eiben
- Danielle Frances Kennedy
- Jean-Baptiste Mouret
- Philip Valencia
- Dave Winkler
Links:
Bibtex:
"@article{howard2019evolving, title={Evolving embodied intelligence from materials to machines}, author={Howard, David and Eiben, Agoston E and Kennedy, Danielle Frances and Mouret, Jean-Baptiste and Valencia, Philip and Winkler, Dave}, journal={Nature Machine Intelligence}, volume={1}, number={1}, pages={12--19}, year={2019}, publisher={Nature Publishing Group} }"
-
Exploration and Exploitation in Symbolic Regression using Quality-Diversity and Evolutionary Strategies Algorithms
Authors:
- J.-P. Bruneton
- L. Cazenille
- A. Douin
- V. Reverdy
Abstract:
By combining Genetic Programming, MAP-Elites and Covariance Matrix Adaptation Evolution Strategy, we demonstrate very high success rates in Symbolic Regression problems. MAP-Elites is used to improve exploration while preserving diversity and avoiding premature convergence and bloat. Then, a Covariance Matrix Adaptation-Evolution Strategy is used to evaluate free scalars through a non-gradient-based black-box optimizer. Although this evaluation approach is not computationally scalable to high dimensional problems, our algorithm is able to find exactly most of the 31 targets extracted from the literature on which we evaluate it.Links:
Bibtex:
"@article{bruneton2019exploration, title={Exploration and Exploitation in Symbolic Regression using Quality-Diversity and Evolutionary Strategies Algorithms}, author={Bruneton, J-P and Cazenille, L and Douin, A and Reverdy, V}, journal={arXiv preprint arXiv:1906.03959}, year={2019 }"
-
Exploring Self-Assembling Behaviors in a Swarm of Bio-micro-robots using Surrogate-Assisted MAP-Elites Robotics
Authors:
- Leo Cazenille
- Nicolas Bredeche
- Nathanael Aubert-Kato
Abstract:
Swarms of molecular robots are a promising approach to create specific shapes at the microscopic scale through self-assembly. However, controlling their behavior is a challenging problem as it involves complex non-linear dynamics and high experimental variability. Hand-crafting a molecular controller will often be time-consuming and give sub-optimal results. Optimization methods, like the bioNEAT algorithm, were previously employed to partially overcome these difficulties, but they still had to cope with deceptive high-dimensional search spaces and computationally expensive simulations. Here, we describe a novel approach to solve this problem by using MAP-Elites, an algorithm that searches for both high-performing and diverse solutions. We then apply it to a molecular robotic framework we recently introduced that allows sensing, signaling and self-assembly at the micro-scale and show that MAP-Elites outperforms previous approaches. Additionally, we propose a surrogate model of micro-robots physics and chemical reaction dynamics to reduce the computational costs of simulation. We show that the resulting methodology is capable of optimizing controllers with similar accuracy as when using only a full-fledged realistic model, with half the computational budget.Links:
Bibtex:
"@article{cazenille2019exploring, title={Exploring Self-Assembling Behaviors in a Swarm of Bio-micro-robots using Surrogate-Assisted MAP-Elites}, author={Cazenille, Leo and Bredeche, Nicolas and Aubert-Kato, Nathanael}, journal={arXiv preprint arXiv:1910.00230}, year={2019} }"
-
Exploring genetic programming systems with MAP-Elites
Authors:
- Emily Dolson
- Alexander Lalejini
- Charles Ofria
Links:
Bibtex:
"@incollection{dolson2019exploring, title={Exploring genetic programming systems with MAP-Elites}, author={Dolson, Emily and Lalejini, Alexander and Ofria, Charles}, booktitle={Genetic Programming Theory and Practice XVI}, pages={1--16}, year={2019}, publisher={Springer} }"
-
From exploration to control: learning object manipulation skills through novelty search and local adaptation
Authors:
- Seungsu Kim
- Alexandre Coninx
- Stephane Doncieux
Abstract:
Programming a robot to deal with open-ended tasks remains a challenge, in particular if the robot has to manipulate objects. Launching, grasping, pushing or any other object interaction can be simulated but the corresponding models are not reversible and the robot behavior thus cannot be directly deduced. These behaviors are hard to learn without a demonstration as the search space is large and the reward sparse. We propose a method to autonomously generate a diverse repertoire of simple object interaction behaviors in simulation. Our goal is to bootstrap a robot learning and development process with limited information about what the robot has to achieve and how. This repertoire can be exploited to solve different tasks in reality thanks to a proposed adaptation method or could be used as a training set for data-hungry algorithms. The proposed approach relies on the definition of a goal space and generates a repertoire of trajectories to reach attainable goals, thus allowing the robot to control this goal space. The repertoire is built with an off-the-shelf simulation thanks to a quality diversity algorithm. The result is a set of solutions tested in simulation only. It may result in two different problems: (1) as the repertoire is discrete and finite, it may not contain the trajectory to deal with a given situation or (2) some trajectories may lead to a behavior in reality that differs from simulation because of a reality gap. We propose an approach to deal with both issues by using a local linearization between the motion parameters and the observed effects. Furthermore, we present an approach to update the existing solution repertoire with the tests done on the real robot. The approach has been validated in two different experiments on the Baxter robot: a ball launching task and a joystick manipulation task.Links:
Bibtex:
"@article{kim2019exploration, title={From exploration to control: learning object manipulation skills through novelty search and local adaptation}, author={Kim, Seungsu and Coninx, Alexandre and Doncieux, St{\'e}phane}, journal={arXiv preprint arXiv:1901.00811}, year={2019}}"
-
Go-Explore: a New Approach for Hard-Exploration Problems Games
Authors:
- Adrien Ecoffet
- Joost Huizinga
- Joel Lehman
- Kenneth Stanley
- Jeff Clune
Abstract:
A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. To address this shortfall, we introduce a new algorithm called Go-Explore. It exploits the following principles: (1) remember previously visited states, (2) first return to a promising state (without exploration), then explore from it, and (3) solve simulated environments through any available means (including by introducing determinism), then robustify via imitation learning. The combined effect of these principles is a dramatic performance improvement on hard-exploration problems. On Montezuma's Revenge, Go-Explore scores a mean of over 43k points, almost 4 times the previous state of the art. Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge. Its max performance of nearly 18 million surpasses the human world record, meeting even the strictest definition of "superhuman" performance. On Pitfall, Go-Explore with domain knowledge is the first algorithm to score above zero. Its mean score of almost 60k points exceeds expert human performance. Because Go-Explore produces high-performing demonstrations automatically and cheaply, it also outperforms imitation learning work where humans provide solution demonstrations. Go-Explore opens up many new research directions into improving it and weaving its insights into current RL algorithms. It may also enable progress on previously unsolvable hard-exploration problems in many domains, especially those that harness a simulator during training (e.g. robotics).Links:
Bibtex:
"@article{ecoffet2019go, title={Go-Explore: a New Approach for Hard-Exploration Problems}, author={Ecoffet, Adrien and Huizinga, Joost and Lehman, Joel and Stanley, Kenneth O and Clune, Jeff}, journal={arXiv preprint arXiv:1901.10995}, year={2019} }"
-
MAP-Elites for Noisy Domains by Adaptive Sampling
Authors:
- Niels Justesen
- Sebastian Risi
- Jean-Baptiste Mouret
Abstract:
Quality Diversity algorithms (QD) evolve a set of high-performing phenotypes that each behaves as differently as possible. However, current algorithms are all elitist, which makes them unable to cope with stochastic fitness functions and behavior evaluations. In fact, many of the promising applications of QD algorithms, for instance, games and robotics, are stochastic. Here we propose two new extensions to the QD algorithm MAP-Elites, adaptive sampling and drifting-elites, and demonstrate empirically that these extensions increase the quality of solutions in a noisy artificial test function and the behavioral diversity in a 2D bipedal walker environment.Links:
Bibtex:
"@inproceedings{justesen2019map, title={MAP-Elites for noisy domains by adaptive sampling}, author={Justesen, Niels and Risi, Sebastian and Mouret, Jean-Baptiste}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, pages={121--122}, year={2019}, organization={ACM}}"
-
Mapping Hearthstone Deck Spaces with Map-Elites with Sliding Boundaries Games
Authors:
- Matthew Fontaine
- Lee Scott
- Lisa Soros
- Fernando De Mesentier Silva
- Julian Togelius
- Amy Hoover
Abstract:
Quality diversity (QD) algorithms such as MAP-Elites have emerged as a powerful alternative to traditional single-objective optimization methods. They were initially applied to evolutionary robotics problems such as locomotion and maze navigation, but have yet to see widespread application. We argue that these algorithms are perfectly suited to the rich domain of video games, which contains many relevant problems with a multitude of successful strategies and often also multiple dimensions along which solutions can vary. This paper introduces a novel modification of the MAP-Elites algorithm called MAP-Elites with Sliding Boundaries (MESB) and applies it to the design and rebalancing of Hearthstone, a popular collectible card game chosen for its number of multidimensional behavior features relevant to particular styles of play. To avoid overpopulating cells with conflated behaviors, MESB slides the boundaries of cells based on the distribution of evolved individuals. Experiments in this paper demonstrate the performance of MESB in Hearthstone. Results suggest MESB finds diverse ways of playing the game well along the selected behavioral dimensions. Further analysis of the evolved strategies reveals common patterns that recur across behavioral dimensions and explores how MESB can help rebalance the game.Links:
Bibtex:
"@inproceedings{fontaine2019, title={Mapping Hearthstone Deck Spaces with Map-Elites with Sliding Boundaries}, author={Matthew C. Fontaine and Scott Lee and L. B. Soros and Fernando De Mesentier Silva and Julian Togelius and Amy K. Hoover}, booktitle={Proceedings of The Genetic and Evolutionary Computation Conference}, year={2019}, organization={ACM} }"
-
Modeling user selection in quality diversity
Authors:
- Alexander Hagg
- Alexander Asteroth
- Thomas Back
Abstract:
The initial phase in real-world engineering optimization and design is a process of discovery in which not all requirements can be made in advance, or are hard to formalize. Quality diversity algorithms, which produce a variety of high-performing solutions, provide a unique chance to support engineers and designers in the search for what is possible and high performing. In this work we begin to answer the question of how a user can interact with quality diversity and turn it into an interactive innovation aid. By modeling a user's selection it can be determined whether the optimization is drifting away from the user's preferences. The optimization is then constrained by adding a penalty to the objective function. We present an interactive quality diversity algorithm that can take into account the user's selection. The approach is evaluated in a new multimodal optimization benchmark that allows various optimization tasks to be performed. The user selection drift of the approach is compared to a state-of-the-art alternative on both a planning and a neuroevolution control task, thereby showing its limits and possibilities.Links:
Bibtex:
"@inproceedings{hagg2019modeling, title={Modeling user selection in quality diversity}, author={Hagg, Alexander and Asteroth, Alexander and B{\"a}ck, Thomas}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference}, pages={116--124}, year={2019}, organization={ACM} }"
-
Novelty Search for Deep Reinforcement Learning Policy Network Weights by Action Sequence Edit Metric Distance Games
Authors:
- Ethan Jackson
- Mark Daley
Abstract:
Reinforcement learning (RL) problems often feature deceptive local optima, and learning methods that optimize purely for reward signal often fail to learn strategies for overcoming them. Deep neuroevolution and novelty search have been proposed as effective alternatives to gradient-based methods for learning RL policies directly from pixels. In this paper, we introduce and evaluate the use of novelty search over agent action sequences by string edit metric distance as a means for promoting innovation. We also introduce a method for stagnation detection and population resampling inspired by recent developments in the RL community that uses the same mechanisms as novelty search to promote and develop innovative policies. Our methods extend a state-of-the-art method for deep neuroevolution using a simple-yet-effective genetic algorithm (GA) designed to efficiently learn deep RL policy network weights. Experiments using four games from the Atari 2600 benchmark were conducted. Results provide further evidence that GAs are competitive with gradient-based algorithms for deep RL. Results also demonstrate that novelty search over action sequences is an effective source of selection pressure that can be integrated into existing evolutionary algorithms for deep RL.Links:
Bibtex:
"@article{jackson2019novelty, title={Novelty Search for Deep Reinforcement Learning Policy Network Weights by Action Sequence Edit Metric Distance}, author={Jackson, Ethan C and Daley, Mark}, journal={arXiv preprint arXiv:1902.03142}, year={2019} }"
-
Novelty search: a theoretical perspective
Authors:
- Stephane Doncieux
- Alban Laflaquiere
- Alexandre Coninx
Bibtex:
"@inproceedings{doncieux2019novelty, title={Novelty search: a theoretical perspective}, author={Doncieux, Stephane and Laflaquiere, Alban and Coninx, Alexandre}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference}, pages={99--106}, year={2019}, organization={ACM} }"
-
POET: open-ended coevolution of environments and their optimized solutions
Authors:
- Rui Wang
- Joel Lehman
- Jeff Clune
- Kenneth Stanley
Bibtex:
"@inproceedings{wang2019poet, title={POET: open-ended coevolution of environments and their optimized solutions}, author={Wang, Rui and Lehman, Joel and Clune, Jeff and Stanley, Kenneth O}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference}, pages={142--151}, year={2019}, organization={ACM} }"
-
PlayMapper: Illuminating Design Spaces of Platform Games
Authors:
- Vivek R. Warriar
- Carmen Ugarte
- John R. Woodward
- Laurissa Tokarchuk
Abstract:
In this paper, we present PlayMapper, a novel variant of the MAP-Elites algorithm that has been adapted to map the level design space of the Super Mario Bros game. Our approach uses player and level based features to create a map of playable levels. We conduct an experiment to compare the effect of different sets of input features on the range of levels generated using this technique. In this work, we show that existing search-based techniques for PCG can be improved to allow for more control and creative freedom for designers. Current limitations of the system and directions for future work are also discussed.Links:
Bibtex:
"@INPROCEEDINGS{8848118, author={V. R. {Warriar} and C. {Ugarte} and J. R. {Woodward} and L. {Tokarchuk}}, booktitle={2019 IEEE Conference on Games (CoG)}, title={PlayMapper: Illuminating Design Spaces of Platform Games}, year={2019}, volume={}, number={}, pages={1-4}}"
-
Policy search in continuous action domains: An overview
Authors:
- Olivier Sigaud
- Freek Stulp
Abstract:
Continuous action policy search is currently the focus of intensive research, driven both by the recent success of deep reinforcement learning algorithms and the emergence of competitors based on evolutionary algorithms. In this paper, we present a broad survey of policy search methods, providing a unified perspective on very different approaches, including also Bayesian Optimization and directed exploration methods. The main message of this overview is in the relationship between the families of methods, but we also outline some factors underlying sample efficiency properties of the various approaches.Links:
Bibtex:
"@article{SIGAUD201928, title = "Policy search in continuous action domains: An overview", journal = "Neural Networks", volume = "113", pages = "28 - 40", year = "2019", issn = "0893-6080", doi = "https://doi.org/10.1016/j.neunet.2019.01.011", url = "https://hal.sorbonne-universite.fr/hal-02182466/document", author = "Olivier Sigaud and Freek Stulp", keywords = "Policy search, Sample efficiency, Deep reinforcement learning, Deep neuroevolution"}"
-
Procedural content generation through quality diversity Games
Authors:
- Daniele Gravina
- Ahmed Khalifa
- Antonios Liapis
- Julian Togelius
- Georgios Yannakakis
Abstract:
Quality-diversity (QD) algorithms search for a set of good solutions which cover a space as defined by behavior metrics. This simultaneous focus on quality and diversity with explicit metrics sets QD algorithms apart from standard single- and multi-objective evolutionary algorithms, as well as from diversity preservation approaches such as niching. These properties open up new avenues for artificial intelligence in games, in particular for procedural content generation. Creating multiple systematically varying solutions allows new approaches to creative human-AI interaction as well as adaptivity. In the last few years, a handful of applications of QD to procedural content generation and game playing have been proposed; we discuss these and propose challenges for future work.Links:
Bibtex:
"@article{gravina2019procedural, title={Procedural content generation through quality diversity}, author={Gravina, Daniele and Khalifa, Ahmed and Liapis, Antonios and Togelius, Julian and Yannakakis, Georgios N}, journal={arXiv preprint arXiv:1907.04053}, year={2019} }"
-
Quantifying the Effects of Increasing User Choice in MAP-Elites Applied to a Workforce Scheduling and Routing Problem
Authors:
- Neil Urquhart
- Emma Hart
- William Hutcheson
Bibtex:
"@inproceedings{urquhart2019quantifying, title={Quantifying the Effects of Increasing User Choice in MAP-Elites Applied to a Workforce Scheduling and Routing Problem}, author={Urquhart, Neil and Hart, Emma and Hutcheson, William}, booktitle={International Conference on the Applications of Evolutionary Computation (Part of EvoStar)}, pages={49--63}, year={2019}, organization={Springer} }"
-
Unsupervised Learning and Exploration of Reachable Outcome Space Robotics
Authors:
- Giuseppe Paolo
- Alban Laflaquiere
- Alexandre Coninx
- Stephane Doncieux
Abstract:
Performing Reinforcement Learning in sparse rewards settings, with very little prior knowledge, is a challenging problem since there is no signal to properly guide the learning process. In such situations, a good search strategy is fundamental. At the same time, not having to adapt the algorithm to every single problem is very desirable. Here we introduce TAXONS, a Task Agnostic eXploration of Outcome spaces through Novelty and Surprise algorithm. Based on a population-based divergent-search approach, it learns a set of diverse policies directly from high-dimensional observations, without any task-specific information. TAXONS builds a repertoire of policies while training an autoencoder on the high-dimensional observation of the final state of the system to build a low-dimensional outcome space. The learned outcome space, combined with the reconstruction error, is used to drive the search for new policies. Results show that TAXONS can find a diverse set of controllers, covering a good part of the ground-truth outcome space, while having no information about such space.Links:
Bibtex:
"@article{paolo2019unsupervised, title={Unsupervised Learning and Exploration of Reachable Outcome Space}, author={Paolo, Giuseppe and Laflaquiere, Alban and Coninx, Alexandre and Doncieux, Stephane}, journal={algorithms}, volume={24}, pages={25}, year={2019} }"
2018: 24 Papers
-
An approach to evolve and exploit repertoires of general robot behaviours Robotics
Authors:
- Jorge Gomes
- Sancho Moura Oliveira
- Anders Lyhne Christensen
Abstract:
Recent works in evolutionary robotics have shown the viability of evolution driven by behavioural novelty and diversity. These evolutionary approaches have been successfully used to generate repertoires of diverse and high-quality behaviours, instead of driving evolution towards a single, task-specific solution. Having repertoires of behaviours can enable new forms of robotic control, in which high-level controllers continually decide which behaviour to execute. To date, however, only the use of repertoires of open-loop locomotion primitives has been studied. We propose EvoRBC-II, an approach that enables the evolution of repertoires composed of general closed-loop behaviours that can respond to the robot's sensory inputs. The evolved repertoire is then used as a basis to evolve a transparent higher-level controller that decides when and which behaviours of the repertoire to execute. Relying on experiments in a simulated domain, we show that the evolved repertoires are composed of highly diverse and useful behaviours. The same repertoire contains sufficiently diverse behaviours to solve a wide range of tasks, and the EvoRBC-II approach can yield a performance that is comparable to the standard tabula-rasa evolution. EvoRBC-II enables automatic generation of hierarchical control through a two-step evolutionary process, thus opening doors for the further exploration of the advantages that can be brought by hierarchical control.Links:
Bibtex:
"@article{gomes2018approach, title={An approach to evolve and exploit repertoires of general robot behaviours}, author={Gomes, Jorge and Oliveira, Sancho Moura and Christensen, Anders Lyhne}, journal={Swarm and Evolutionary Computation}, year={2018}, publisher={Elsevier} }"
-
Bayesian optimization with automatic prior selection for data-efficient direct policy search Robotics
Authors:
- Remi Pautrat
- Konstantinos Chatzilygeroudis
- Jean-Baptiste Mouret
Abstract:
One of the most interesting features of Bayesian optimization for direct policy search is that it can leverage priors (e.g., from simulation or from previous tasks) to accelerate learning on a robot. In this paper, we are interested in situations for which several priors exist but we do not know in advance which one fits best the current situation. We tackle this problem by introducing a novel acquisition function, called Most Likely Expected Improvement (MLEI), that combines the likelihood of the priors and the expected improvement. We evaluate this new acquisition function on a transfer learning task for a 5-DOF planar arm and on a possibly damaged, 6-legged robot that has to learn to walk on flat ground and on stairs, with priors corresponding to different stairs and different kinds of damages. Our results show that MLEI effectively identifies and exploits the priors, even when there is no obvious match between the current situations and the priors.Links:
Bibtex:
"@inproceedings{pautrat2018bayesian, title={Bayesian optimization with automatic prior selection for data-efficient direct policy search}, author={Pautrat, R{\'e}mi and Chatzilygeroudis, Konstantinos and Mouret, Jean-Baptiste}, booktitle={2018 IEEE International Conference on Robotics and Automation (ICRA)}, pages={7571--7578}, year={2018}, organization={IEEE} }"
-
Combining map-elites and incremental evolution to generate gaits for a mammalian quadruped robot Robotics
Authors:
- Jorgen Nordmoen
- Kai Olav Ellefsen
- Kyrre Glette
Abstract:
Four-legged mammals are capable of showing a great variety of movement patterns, ranging from a simple walk to more complex movement such as trots and gallops. Imbuing this diversity to quadruped robots is of interest in order to improve both mobility and reach. Within the field of Evolutionary Robotics, Quality Diversity techniques have shown a remarkable ability to produce not only effective, but also highly diverse solutions. When applying this approach to four-legged robots an initial problem is to create viable movement patterns that do not fall. This difficulty stems from the challenging fitness gradient due to the mammalian morphology. In this paper we propose a solution to overcome this problem by implementing incremental evolution within the Quality Diversity framework. This allows us to evolve controllers that become more complex while at the same time utilizing the diversity produced by Quality Diversity. We show that our approach is able to generate high fitness solutions early in the search process, keep these solutions and perform a more open-ended search towards the end of evolution.Links:
Bibtex:
"@inproceedings{nordmoen2018combining, title={Combining map-elites and incremental evolution to generate gaits for a mammalian quadruped robot}, author={Nordmoen, Jorgen and Ellefsen, Kai Olav and Glette, Kyrre}, booktitle={International Conference on the Applications of Evolutionary Computation}, pages={719--733}, year={2018}, organization={Springer} }"
-
Comparing Approaches for Evolving High-Level Robot Control Based on Behaviour Repertoires Robotics
Authors:
- Jorge Gomes
- Anders Lyhne Christensen
Bibtex:
"@inproceedings{gomes2018comparing, title={Comparing Approaches for Evolving High-Level Robot Control Based on Behaviour Repertoires}, author={Gomes, Jorge and Christensen, Anders Lyhne}, booktitle={2018 IEEE Congress on Evolutionary Computation (CEC)}, pages={1--6}, year={2018}, organization={IEEE} }"
-
Data-Efficient Design Exploration through Surrogate-Assisted Illumination
Authors:
- Adam Gaier
- Alexander Asteroth
- Jean-Baptiste Mouret
Abstract:
Design optimization techniques are often used at the beginning of the design process to explore the space of possible designs. In these domains illumination algorithms, such as MAP-Elites, are promising alternatives to classic optimization algorithms because they produce diverse, high-quality solutions in a single run, instead of only a single near-optimal solution. Unfortunately, these algorithms currently require a large number of function evaluations, limiting their applicability. In this article we introduce a new illumination algorithm, Surrogate-Assisted Illumination (SAIL), that leverages surrogate modeling techniques to create a map of the design space according to user-defined features while minimizing the number of fitness evaluations. On a 2-dimensional airfoil optimization problem SAIL produces hundreds of diverse but high-performing designs with several orders of magnitude fewer evaluations than MAP-Elites or CMA-ES. We demonstrate that SAIL is also capable of producing maps of high-performing designs in realistic 3-dimensional aerodynamic tasks with an accurate flow simulation. Data-efficient design exploration with SAIL can help designers understand what is possible, beyond what is optimal, by considering more than pure objective-based optimization.Links:
Bibtex:
"@article{gaier2018data, title={Data-Efficient Design Exploration through Surrogate-Assisted Illumination}, author={Gaier, Adam and Asteroth, Alexander and Mouret, Jean-Baptiste}, journal={Evolutionary computation}, pages={1--30}, year={2018}, publisher={MIT Press} }"
-
Discovering the Elite Hypervolume by Leveraging Interspecies Correlation Robotics
Authors:
- Vassilis Vassiliades
- Jean-Baptiste Mouret
Abstract:
Evolution has produced an astonishing diversity of species, each filling a different niche. Algorithms like MAP-Elites mimic this divergent evolutionary process to find a set of behaviorally diverse but high-performing solutions, called the elites. Our key insight is that species in nature often share a surprisingly large part of their genome, in spite of occupying very different niches; similarly, the elites are likely to be concentrated in a specific "elite hypervolume" whose shape is defined by their common features. In this paper, we first introduce the elite hypervolume concept and propose two metrics to characterize it: the genotypic spread and the genotypic similarity. We then introduce a new variation operator, called "directional variation", that exploits interspecies (or inter-elites) correlations to accelerate the MAP-Elites algorithm. We demonstrate the effectiveness of this operator in three problems (a toy function, a redundant robotic arm, and a hexapod robot)Links:
Bibtex:
"@article{vassiliades2018discovering, title={Discovering the Elite Hypervolume by Leveraging Interspecies Correlation}, author={Vassiliades, Vassilis and Mouret, Jean-Baptiste}, journal={Proceedings of the Genetic and Evolutionary Computation Conference }, year={2018}, organization={ACM} }"
-
Dynamic mutation in MAP-Elites for robotic repertoire generation Robotics
Authors:
- Jorgen Nordmoen
- Eivind Samuelsen
- Kai Olav Ellefsen
- Kyrre Glette
Abstract:
One of the core functions in most Evolutionary Algorithms is mutation. In complex search spaces, which are common in Evolutionary Robotics, mutation is often used both for optimizing existing solutions, described as exploitation, and for spanning the search space, called exploration. This presents a difficult challenge for researchers as mutation parameters must be selected with care in order to balance the two, often contradictory, effects. Strategies that vary mutation during the search often try to estimate these effects in order to modify the mutation parameters. In this regard MAP-Elites, a Quality Diversity algorithm, presents an interesting opportunity. Because factors related to exploration and exploitation are readily available during the search, optimization based on these factors could be utilized to improve the search. In this paper we study how online adaptation of mutation rate, dynamic mutation, affects MAP-Elites in order to gain insight into how the search process is affected by the mutation rate. Our study compares fixed and dynamic mutation parameters for two different complex gait controllers. The results show that dynamic mutation combines favorably with MAP-Elites and that there is a strong relation between mutation parameters and exploration that can be utilized.Links:
Bibtex:
"@inproceedings{nordmoen2018dynamic, title={Dynamic mutation in MAP-Elites for robotic repertoire generation}, author={Nordmoen, Jorgen and Samuelsen, Eivind and Ellefsen, Kai Olav and Glette, Kyrre}, booktitle={Artificial Life Conference Proceedings}, pages={598--605}, year={2018}, organization={MIT Press} }"
-
Evolution of a Functionally Diverse Swarm via a Novel Decentralised Quality-Diversity Algorithm Robotics
Authors:
- Emma Hart
- Andreas Steyven
- Ben Paechter
Abstract:
The presence of functional diversity within a group has been demonstrated to lead to greater robustness, higher performance and increased problem-solving ability in a broad range of studies that includes insect groups, human groups and swarm robotics. Evolving group diversity however has proved challenging within Evolutionary Robotics, requiring reproductive isolation and careful attention to population size and selection mechanisms. To tackle this issue, we introduce a novel, decentralised, variant of the MAP-Elites illumination algorithm which is hybridised with a well-known distributed evolutionary algorithm (mEDEA). The algorithm simultaneously evolves multiple diverse behaviours for multiple robots, with respect to a simple token-gathering task. Each robot in the swarm maintains a local archive defined by two pre-specified functional traits which is shared with robots it comes into contact with. We investigate four different strategies for sharing, exploiting and combining local archives and compare results to mEDEA. Experimental results show that in contrast to previous claims, it is possible to evolve a functionally diverse swarm without geographical isolation, and that the new method outperforms mEDEA in terms of the diversity, coverage and precision of the evolved swarm.Links:
Bibtex:
"@article{hart2018evolution, title={Evolution of a Functionally Diverse Swarm via a Novel Decentralised Quality-Diversity Algorithm}, author={Hart, Emma and Steyven, Andreas SW and Paechter, Ben}, journal={Proceedings of the Genetic and Evolutionary Computation Conference }, year={2018}, organization={ACM} }"
-
Evolution of repertoire-based control for robots with complex locomotor systems Robotics
Authors:
- Miguel Duarte
- Jorge Gomes
- Sancho Moura Oliveira
- Anders Lyhne Christensen
Bibtex:
"@article{duarte2018evolution, title={Evolution of repertoire-based control for robots with complex locomotor systems}, author={Duarte, Miguel and Gomes, Jorge and Oliveira, Sancho Moura and Christensen, Anders Lyhne}, journal={IEEE Transactions on Evolutionary Computation}, volume={22}, number={2}, pages={314--328}, year={2018}, publisher={IEEE} }"
-
Evolving Multimodal Robot Behavior via Many Stepping Stones with the Combinatorial Multi-Objective Evolutionary Algorithm Robotics
Authors:
- Joost Huizinga
- Jeff Clune
Abstract:
An important challenge in reinforcement learning, including evolutionary robotics, is to solve multimodal problems, where agents have to act in qualitatively different ways depending on the circumstances. Because multimodal problems are often too difficult to solve directly, it is helpful to take advantage of staging, where a difficult task is divided into simpler subtasks that can serve as stepping stones for solving the overall problem. Unfortunately, choosing an effective ordering for these subtasks is difficult, and a poor ordering can reduce the speed and performance of the learning process. Here, we provide a thorough introduction and investigation of the Combinatorial Multi-Objective Evolutionary Algorithm (CMOEA), which avoids ordering subtasks by allowing all combinations of subtasks to be explored simultaneously. We compare CMOEA against two algorithms that can similarly optimize on multiple subtasks simultaneously: NSGA-II and Lexicase Selection. The algorithms are tested on a multimodal robotics problem with six subtasks as well as a maze navigation problem with a hundred subtasks. On these problems, CMOEA either outperforms or is competitive with the controls. Separately, we show that adding a linear combination over all objectives can improve the ability of NSGA-II to solve these multimodal problems. Lastly, we show that, in contrast to NSGA-II and Lexicase Selection, CMOEA can effectively leverage secondary objectives to achieve state-of-the-art results on the robotics task. In general, our experiments suggest that CMOEA is a promising, state-of-the-art algorithm for solving multimodal problems.Links:
Bibtex:
"@article{huizinga2018evolving, title={Evolving Multimodal Robot Behavior via Many Stepping Stones with the Combinatorial Multi-Objective Evolutionary Algorithm}, author={Huizinga, Joost and Clune, Jeff}, journal={arXiv preprint arXiv:1807.03392}, year={2018} }"
-
Evolving a Repertoire of Controllers for a Multi-function Swarm Robotics
Authors:
- Sondre Engebraten
- Jonas Moen
- Oleg Yakimenko
- Kyrre Glette
Abstract:
Automated design of swarm behaviors with a top-down approach is a challenging research question that has not yet been fully addressed in the robotic swarm literature. This paper seeks to explore the possibility of using an evolutionary algorithm to evolve, rather than hand code, a wide repertoire of behavior primitives enabling more effective control of a large group or swarm of unmanned systems. We use the MAP-elites algorithm to generate a repertoire of controllers with varying abilities and behaviors allowing the swarm to adapt to user-defined preferences by selection of a new appropriate controller. To test the proposed method we examine two example applications: perimeter surveillance and network creation. Perimeter surveillance requires agents to explore, while network creation requires them to disperse without losing connectivity. These are distinct applications that have drastically different requirements on agent behavior, and are a good benchmark for our swarm controller optimization framework. We show a performance comparison between a simple weighted controller and a parametric controller. Evolving controllers allows for specifying desired behaviors top-down, in terms of objectives to solve, rather than bottom-up.Links:
Bibtex:
"@inproceedings{engebraaten2018evolving, title={Evolving a Repertoire of Controllers for a Multi-function Swarm}, author={Engebr{\aa}ten, Sondre A and Moen, Jonas and Yakimenko, Oleg and Glette, Kyrre}, booktitle={International Conference on the Applications of Evolutionary Computation}, pages={734--749}, year={2018}, organization={Springer} }"
-
Exploring genetic programming systems with MAP-Elites
Authors:
- Emily Dolson
- Alexander Lalejini
- Charles Ofria
Links:
Bibtex:
"@article{dolson2018exploring, title={Exploring genetic programming systems with MAP-Elites}, author={Dolson, Emily and Lalejini, Alexander and Ofria, Charles}, journal={PeerJ Preprints}, volume={6}, pages={e27154v1}, year={2018}, publisher={PeerJ Inc. San Francisco, USA} }"
-
Hierarchical Behavioral Repertoires with Unsupervised Descriptors Robotics
Authors:
- Antoine Cully
- Yiannis Demiris
Abstract:
Enabling artificial agents to automatically learn complex, versatile and high-performing behaviors is a long-lasting challenge. This paper presents a step in this direction with hierarchical behavioral repertoires that stack several behavioral repertoires to generate sophisticated behaviors. Each repertoire of this architecture uses the lower repertoires to create complex behaviors as sequences of simpler ones, while only the lowest repertoire directly controls the agent's movements. This paper also introduces a novel approach to automatically define behavioral descriptors thanks to an unsupervised neural network that organizes the produced high-level behaviors. The experiments show that the proposed architecture enables a robot to learn how to draw digits in an unsupervised manner after having learned to draw lines and arcs. Compared to traditional behavioral repertoires, the proposed architecture reduces the dimensionality of the optimization problems by orders of magnitude and provides behaviors with a twice better fitness. More importantly, it enables the transfer of knowledge between robots: a hierarchical repertoire evolved for a robotic arm to draw digits can be transferred to a humanoid robot by simply changing the lowest layer of the hierarchy. This enables the humanoid to draw digits although it has never been trained for this task.Links:
Bibtex:
"@article{cully2018hierarchical, title={Hierarchical Behavioral Repertoires with Unsupervised Descriptors}, author={Cully, Antoine and Demiris, Yiannis}, journal={Proceedings of the Genetic and Evolutionary Computation Conference }, year={2018}, organization={ACM} }"
-
Mapping structural diversity in networks sharing a given degree distribution and global clustering: Adaptive resolution grid search evolution with Diophantine equation-based mutations
Authors:
- Peter Overbury
- Istvan Kiss
- Luc Berthouze
Abstract:
Methods that generate networks sharing a given degree distribution and global clustering can induce changes in structural properties other than that controlled for. Diversity in structural properties, in turn, can affect the outcomes of dynamical processes operating on those networks. Since exhaustive sampling is not possible, we propose a novel evolutionary framework for mapping this structural diversity. The three main features of this framework are: (a) subgraph-based encoding of networks, (b) exact mutations based on solving systems of Diophantine equations, and (c) heuristic diversity-driven mechanism to drive resolution changes in the MAP-Elites algorithm. We show that our framework can elicit networks with diversity in their higher-order structure and that this diversity affects the behaviour of the complex contagion model. Through a comparison with state of the art clustered network generation methods, we demonstrate that our approach can uncover a comparably diverse range of networks without needing computationally unfeasible mixing times. Further, we suggest that the subgraph-based encoding provides greater confidence in the diversity of higher-order network structure for low numbers of samples and is the basis for explaining our results with complex contagion model. We believe that this framework could be applied to other complex landscapes that cannot be practically mapped via exhaustive sampling.Links:
Bibtex:
"@article{overbury2018mapping, title={Mapping structural diversity in networks sharing a given degree distribution and global clustering: Adaptive resolution grid search evolution with Diophantine equation-based mutations}, author={Overbury, Peter and Kiss, Istv{\'a}n Z and Berthouze, Luc}, journal={arXiv preprint arXiv:1809.06293}, year={2018} }"
-
Multi-objective Analysis of MAP-Elites Performance Robotics
Authors:
- Eivind Samuelsen
- Kyrre Glette
Abstract:
In certain complex optimization tasks, it becomes necessary to use multiple measures to characterize the performance of different algorithms. This paper presents a method that combines ordinal effect sizes with Pareto dominance to analyze such cases. Since the method is ordinal, it can also generalize across different optimization tasks even when the performance measurements are differently scaled. Through a case study, we show that this method can discover and quantify relations that would be difficult to deduce using a conventional measure-by-measure analysis. This case study applies the method to the evolution of robot controller repertoires using the MAP-Elites algorithm. Here, we analyze the search performance across a large set of parametrizations; varying mutation size and operator type, as well as map resolution, across four different robot morphologies. We show that the average magnitude of mutations has a bigger effect on outcomes than their precise distributions.Links:
Bibtex:
"@article{samuelsen2018multi, title={Multi-objective Analysis of MAP-Elites Performance}, author={Samuelsen, Eivind and Glette, Kyrre}, journal={arXiv preprint arXiv:1803.05174}, year={2018} }"
-
Open-ended evolution with multi-containers QD Robotics
Authors:
- Stephane Doncieux
- Alexandre Coninx
Bibtex:
"@inproceedings{doncieux2018open, title={Open-ended evolution with multi-containers QD}, author={Doncieux, Stephane and Coninx, Alexandre}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, pages={107--108}, year={2018}, organization={ACM} }"
-
Optimisation and Illumination of a Real-World Workforce Scheduling and Routing Application (WSRP) via Map-Elites
Authors:
- Neil Urquhart
- Emma Hart
Abstract:
Workforce Scheduling and Routing Problems (WSRP) are very common in many practical domains, and usually, have a number of objectives. Illumination algorithms such as Map-Elites (ME) have recently gained traction in application to design problems, in providing multiple diverse solutions as well as illuminating the solution space in terms of user-defined characteristics, but typically require significant computational effort to produce the solution archive. We investigate whether ME can provide an effective approach to solving WSRP, a repetitive problem in which solutions have to be produced quickly and often. The goals of the paper are two-fold. The first is to evaluate whether ME can provide solutions of competitive quality to an Evolutionary Algorithm (EA) in terms of a single objective function, and the second to examine its ability to provide a repertoire of solutions that maximise user choice. We find that very small computational budgets favour the EA in terms of quality, but ME outperforms the EA at larger budgets, provides a more diverse array of solutions, and lends insight to the end-user.Links:
Bibtex:
"@inproceedings{urquhart2018optimisation, title={Optimisation and Illumination of a Real-World Workforce Scheduling and Routing Application (WSRP) via Map-Elites}, author={Urquhart, Neil and Hart, Emma}, booktitle={International Conference on Parallel Problem Solving from Nature}, pages={488--499}, year={2018}, organization={Springer} }"
-
Prototype discovery using quality-diversity
Authors:
- Alexander Hagg
- Alexander Asteroth
- Thomas Back
Abstract:
An iterative computer-aided ideation procedure is introduced, building on recent quality-diversity algorithms, which search for diverse as well as high-performing solutions. Dimensionality reduction is used to define a similarity space, in which solutions are clustered into classes. These classes are represented by prototypes, which are presented to the user for selection. In the next iteration, quality-diversity focuses on searching within the selected class. A quantitative analysis is performed on a 2D airfoil, and a more complex 3D side view mirror domain shows how computer-aided ideation can help to enhance engineers' intuition while allowing their design decisions to influence the design process.
Bibtex:
"@inproceedings{hagg2018prototype, title={Prototype discovery using quality-diversity}, author={Hagg, Alexander and Asteroth, Alexander and B{\"a}ck, Thomas}, booktitle={International Conference on Parallel Problem Solving from Nature}, pages={500--511}, year={2018}, organization={Springer} }"
-
Quality Diversity Through Surprise Robotics
Authors:
- Daniele Gravina
- Antonios Liapis
- Georgios Yannakakis
Abstract:
Quality diversity is a recent family of evolutionary search algorithms which focus on finding several well-performing (quality) yet different (diversity) solutions with the aim to maintain an appropriate balance between divergence and convergence during search. While quality diversity has already delivered promising results in complex problems, the capacity of divergent search variants for quality diversity remains largely unexplored. Inspired by the notion of surprise as an effective driver of divergent search and its orthogonal nature to novelty this paper investigates the impact of the former to quality diversity performance. For that purpose we introduce three new quality diversity algorithms which employ surprise as a diversity measure, either on its own or combined with novelty, and compare their performance against novelty search with local competition, the state of the art quality diversity algorithm. The algorithms are tested in a robot navigation task across 60 highly deceptive mazes. Our findings suggest that allowing surprise and novelty to operate synergistically for divergence and in combination with local competition leads to quality diversity algorithms of significantly higher efficiency, speed and robustness.Links:
Bibtex:
"@article{gravina2018quality, title={Quality Diversity Through Surprise}, author={Gravina, Daniele and Liapis, Antonios and Yannakakis, Georgios N}, journal={arXiv preprint arXiv:1807.02397}, year={2018} }"
-
Quality and diversity optimization: A unifying modular framework Review Robotics
Authors:
- Antoine Cully
- Yiannis Demiris
Abstract:
The optimization of functions to find the best solution according to one or several objectives has a central role in many engineering and research fields. Recently, a new family of optimization algorithms, named Quality-Diversity optimization, has been introduced, and contrasts with classic algorithms. Instead of searching for a single solution, Quality-Diversity algorithms are searching for a large collection of both diverse and high-performing solutions. The role of this collection is to cover the range of possible solution types as much as possible, and to contain the best solution for each type. The contribution of this paper is threefold. Firstly, we present a unifying framework of Quality-Diversity optimization algorithms that covers the two main algorithms of this family (Multi-dimensional Archive of Phenotypic Elites and the Novelty Search with Local Competition), and that highlights the large variety of variants that can be investigated within this family. Secondly, we propose algorithms with a new selection mechanism for Quality-Diversity algorithms that outperforms all the algorithms tested in this paper. Lastly, we present a new collection management that overcomes the erosion issues observed when using unstructured collections. These three contributions are supported by extensive experimental comparisons of Quality-Diversity algorithms on three different experimental scenarios.Links:
Bibtex:
"@article{cully2018quality, title={Quality and diversity optimization: A unifying modular framework}, author={Cully, Antoine and Demiris, Yiannis}, journal={IEEE Transactions on Evolutionary Computation}, volume={22}, number={2}, pages={245--259}, year={2018}, publisher={IEEE} }"
-
Reset-free trial-and-error learning for robot damage recovery Robotics
Authors:
- Konstantinos Chatzilygeroudis
- Vassilis Vassiliades
- Jean-Baptiste Mouret
Abstract:
The high probability of hardware failures prevents many advanced robots (e.g., legged robots) from being confidently deployed in real-world situations (e.g., post-disaster rescue). Instead of attempting to diagnose the failures, robots could adapt by trial-and-error in order to be able to complete their tasks. In this situation, damage recovery can be seen as a Reinforcement Learning (RL) problem. However, the best RL algorithms for robotics require the robot and the environment to be reset to an initial state after each episode, that is, the robot is not learning autonomously. In addition, most of the RL methods for robotics do not scale well with complex robots (e.g., walking robots) and either cannot be used at all or take too long to converge to a solution (e.g., hours of learning). In this paper, we introduce a novel learning algorithm called “Reset-free Trial-and-Error” (RTE) that (1) breaks the complexity by pre-generating hundreds of possible behaviors with a dynamics simulator of the intact robot, and (2) allows complex robots to quickly recover from damage while completing their tasks and taking the environment into account. We evaluate our algorithm on a simulated wheeled robot, a simulated six-legged robot, and a real six-legged walking robot that are damaged in several ways (e.g., a missing leg, a shortened leg, faulty motor, etc.) and whose objective is to reach a sequence of targets in an arena. Our experiments show that the robots can recover most of their locomotion abilities in an environment with obstacles, and without any human intervention.Links:
Bibtex:
"@article{chatzilygeroudis2018reset, title={Reset-free trial-and-error learning for robot damage recovery}, author={Chatzilygeroudis, Konstantinos and Vassiliades, Vassilis and Mouret, Jean-Baptiste}, journal={Robotics and Autonomous Systems}, volume={100}, pages={236--250}, year={2018}, publisher={Elsevier}}"
-
Talakat: Bullet Hell Generation through Constrained Map-Elites Games
Authors:
- Ahmed Khalifa
- Scott Lee
- Andy Nealen
- Julian Togelius
Abstract:
We describe a search-based approach to generating new levels for bullet hell games, which are action games characterized by and requiring avoidance of a very large amount of projectiles. Levels are represented using a domain-specific description language, and search in the space defined by this language is performed by a novel variant of the Map-Elites algorithm which incorporates a feasible-infeasible approach to constraint satisfaction. Simulation-based evaluation is used to gauge the fitness of levels, using an agent based on best-first search. The performance of the agent can be tuned according to the two dimensions of strategy and dexterity, making it possible to search for level configurations that require a specific combination of both. As far as we know, this paper describes the first generator for this game genre, and includes several algorithmic innovations.Links:
Bibtex:
"@article{khalifa2018talakat, title={Talakat: Bullet Hell Generation through Constrained Map-Elites}, author={Khalifa, Ahmed and Lee, Scott and Nealen, Andy and Togelius, Julian}, journal={Proceedings of the Genetic and Evolutionary Computation Conference }, year={2018}, organization={ACM} }"
-
The surprising creativity of digital evolution: A collection of anecdotes from the evolutionary computation and artificial life research communities
Authors:
- Joel Lehman
- Jeff Clune
- Dusan Misevic
- and many others
Abstract:
Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution's creativity is not limited to nature. Indeed, many researchers in the field of digital evolution have observed their evolving algorithms and organisms subverting their intentions, exposing unrecognized bugs in their code, producing unexpected adaptations, or exhibiting outcomes uncannily convergent with ones in nature. Such stories routinely reveal creativity by evolution in these digital worlds, but they rarely fit into the standard scientific narrative. Instead they are often treated as mere obstacles to be overcome, rather than results that warrant study in their own right. The stories themselves are traded among researchers through oral tradition, but that mode of information transmission is inefficient and prone to error and outright loss. Moreover, the fact that these stories tend to be shared only among practitioners means that many natural scientists do not realize how interesting and lifelike digital organisms are and how natural their evolution can be. To our knowledge, no collection of such anecdotes has been published before. This paper is the crowd-sourced product of researchers in the fields of artificial life and evolutionary computation who have provided first-hand accounts of such cases. It thus serves as a written, fact-checked collection of scientifically important and even entertaining stories. In doing so we also present here substantial evidence that the existence and importance of evolutionary surprises extends beyond the natural world, and may indeed be a universal property of all complex evolving systems.Links:
Bibtex:
"@article{lehman2018surprising, title={The surprising creativity of digital evolution: A collection of anecdotes from the evolutionary computation and artificial life research communities}, author={Lehman, Joel and Clune, Jeff and Misevic, Dusan and others}, journal={arXiv preprint arXiv:1803.03453}, year={2018} }"
-
Using Centroidal Voronoi Tessellations to Scale Up the Multidimensional Archive of Phenotypic Elites Algorithm
Authors:
- Vassilis Vassiliades
- Konstantinos Chatzilygeroudis
- Jean-Baptiste Mouret
Abstract:
The recently introduced Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) is an evolutionary algorithm capable of producing a large archive of diverse, high-performing solutions in a single run. It works by discretizing a continuous feature space into unique regions according to the desired discretization per dimension. While simple, this algorithm has a main drawback: it cannot scale to high-dimensional feature spaces since the number of regions increases exponentially with the number of dimensions. In this paper, we address this limitation by introducing a simple extension of MAP-Elites that has a constant, pre-defined number of regions irrespective of the dimensionality of the feature space. Our main insight is that methods from computational geometry could partition a high-dimensional space into well-spread geometric regions. In particular, our algorithm uses a centroidal Voronoi tessellation (CVT) to divide the feature space into a desired number of regions; it then places every generated individual in its closest region, replacing a less fit one if the region is already occupied. We demonstrate the effectiveness of the new "CVT-MAP-Elites" algorithm in high-dimensional feature spaces through comparisons against MAP-Elites in maze navigation and hexapod locomotion tasks.Links:
Bibtex:
"@article{vassiliades2018using, title={Using Centroidal Voronoi Tessellations to Scale Up the Multidimensional Archive of Phenotypic Elites Algorithm}, author={Vassiliades, Vassilis and Chatzilygeroudis, Konstantinos and Mouret, Jean-Baptiste}, journal={IEEE Transactions on Evolutionary Computation}, volume={22}, number={4}, pages={623--630}, year={2018}, publisher={IEEE} }"
2017: 8 Papers
-
A comparison of illumination algorithms in unbounded spaces
Authors:
- Vassilis Vassiliades
- Konstantinos Chatzilygeroudis
- Jean-Baptiste Mouret
Abstract:
Illumination algorithms are a new class of evolutionary algorithms capable of producing large archives of diverse and high-performing solutions. Examples of such algorithms include Novelty Search with Local Competition (NSLC), the Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) and the newly introduced Centroidal Voronoi Tessellation (CVT) MAP-Elites. While NSLC can be used in unbounded behavioral spaces, MAP-Elites and CVT-MAP-Elites require the user to manually specify the bounds. In this study, we introduce variants of these algorithms that expand their bounds based on the discovered solutions. In addition, we introduce a novel algorithm called "Cluster-Elites" that can adapt its bounds to non-convex spaces. We compare all algorithms in a maze navigation problem and illustrate that Cluster-Elites and the expansive variants of MAP-Elites and CVT-MAP-Elites have comparable or better performance than NSLC, MAP-Elites and CVT-MAP-Elites.Links:
Bibtex:
"@inproceedings{vassiliades2017comparison, title={A comparison of illumination algorithms in unbounded spaces}, author={Vassiliades, Vassilis and Chatzilygeroudis, Konstantinos and Mouret, Jean-Baptiste}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, pages={1578--1581}, year={2017}, organization={ACM} }"
-
Aerodynamic design exploration through surrogate-assisted illumination
Authors:
- Adam Gaier
- Alexander Asteroth
- Jean-Baptiste Mouret
Abstract:
A new method for design space exploration and optimization, Surrogate-Assisted Illumination (SAIL), is presented. Inspired by robotics techniques designed to produce diverse repertoires of behaviors for use in damage recovery, SAIL produces diverse designs that vary according to features specified by the designer. By producing high-performing designs with varied combinations of user-defined features a map of the design space is created. This map illuminates the relationship between the chosen features and performance, and can aid designers in identifying promising design concepts. SAIL is designed for use with computationally expensive design problems, such as fluid or structural dynamics, and integrates approximative models and intelligent sampling of the objective function to minimize the number of function evaluations required. On a 2D airfoil optimization problem SAIL is shown to produce hundreds of diverse designs which perform competitively with those found by state-of-the-art black box optimization. Its capabilities are further illustrated in a more expensive 3D aerodynamic optimization task.Links:
Bibtex:
"@inproceedings{gaier2017aerodynamic, title={Aerodynamic design exploration through surrogate-assisted illumination}, author={Gaier, Adam and Asteroth, Alexander and Mouret, Jean-Baptiste}, booktitle={18th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference}, pages={3330}, year={2017} }"
-
Comparing multimodal optimization and illumination Robotics
Authors:
- Vassilis Vassiliades
- Konstantinos Chatzilygeroudis
- Jean-Baptiste Mouret
Abstract:
Illumination algorithms are a recent addition to the evolutionary computation toolbox that allows the generation of many diverse and high-performing solutions in a single run. Nevertheless, traditional multimodal optimization algorithms also search for diverse and high-performing solutions: could some multimodal optimization algorithms be better at illumination than illumination algorithms? In this study, we compare two illumination algorithms (Novelty Search with Local Competition (NSLC), MAP-Elites) with two multimodal optimization ones (Clearing, Restricted Tournament Selection) in a maze navigation task. The results show that Clearing can have comparable performance to MAP-Elites and NSLC.Links:
Bibtex:
"@inproceedings{vassiliades2017comparing, title={Comparing multimodal optimization and illumination}, author={Vassiliades, Vassilis and Chatzilygeroudis, Konstantinos and Mouret, Jean-Baptiste}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, pages={97--98}, year={2017}, organization={ACM} }"
-
Data-efficient exploration, optimization, and modeling of diverse designs through surrogate-assisted illumination
Authors:
- Adam Gaier
- Alexander Asteroth
- Jean-Baptiste Mouret
Abstract:
The MAP-Elites algorithm produces a set of high-performing solutions that vary according to features defined by the user. This technique to 'illuminate' the problem space through the lens of chosen features has the potential to be a powerful tool for exploring design spaces, but is limited by the need for numerous evaluations. The Surrogate-Assisted Illumination (SAIL) algorithm, introduced here, integrates approximative models and intelligent sampling of the objective function to minimize the number of evaluations required by MAP-Elites. The ability of SAIL to efficiently produce both accurate models and diverse high-performing solutions is illustrated on a 2D airfoil design problem. The search space is divided into bins, each holding a design with a different combination of features. In each bin SAIL produces a better performing solution than MAP-Elites, and requires several orders of magnitude fewer evaluations. The CMA-ES algorithm was used to produce an optimal design in each bin: with the same number of evaluations required by CMA-ES to find a near-optimal solution in a single bin, SAIL finds solutions of similar quality in every bin.Links:
Bibtex:
"@inproceedings{gaier2017data, title={Data-efficient exploration, optimization, and modeling of diverse designs through surrogate-assisted illumination}, author={Gaier, Adam and Asteroth, Alexander and Mouret, Jean-Baptiste}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference}, pages={99--106}, year={2017}, organization={ACM} }"
-
Discovering evolutionary stepping stones through behavior domination
Authors:
- Elliot Meyerson
- Risto Miikkulainen
Abstract:
Behavior domination is proposed as a tool for understanding and harnessing the power of evolutionary systems to discover and exploit useful stepping stones. Novelty search has shown promise in overcoming deception by collecting diverse stepping stones, and several algorithms have been proposed that combine novelty with a more traditional fitness measure to refocus search and help novelty search scale to more complex domains. However, combinations of novelty and fitness do not necessarily preserve the stepping stone discovery that novelty search affords. In several existing methods, competition between solutions can lead to an unintended loss of diversity. Behavior domination defines a class of algorithms that avoid this problem, while inheriting theoretical guarantees from multiobjective optimization. Several existing algorithms are shown to be in this class, and a new algorithm is introduced based on fast non-dominated sorting. Experimental results show that this algorithm outperforms existing approaches in domains that contain useful stepping stones, and its advantage is sustained with scale. The conclusion is that behavior domination can help illuminate the complex dynamics of behavior-driven search, and can thus lead to the design of more scalable and robust algorithms.Links:
Bibtex:
"@inproceedings{meyerson2017discovering, title={Discovering evolutionary stepping stones through behavior domination}, author={Meyerson, Elliot and Miikkulainen, Risto}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference}, pages={139--146}, year={2017}, organization={ACM} }"
-
Feature space modeling through surrogate illumination
Authors:
- Adam Gaier
- Alexander Asteroth
- Jean-Baptiste Mouret
Abstract:
The MAP-Elites algorithm produces a set of high-performing solutions that vary according to features defined by the user. This technique has the potential to be a powerful tool for design space exploration, but is limited by the need for numerous evaluations. The Surrogate-Assisted Illumination algorithm (SAIL), introduced here, integrates approximative models and intelligent sampling of the objective function to minimize the number of evaluations required by MAP-Elites. The ability of SAIL to efficiently produce both accurate models and diverse high performing solutions is illustrated on a 2D airfoil design problem. The search space is divided into bins, each holding a design with a different combination of features. In each bin SAIL produces a better performing solution than MAP-Elites, and requires several orders of magnitude fewer evaluations. The CMA-ES algorithm was used to produce an optimal design in each bin: with the same number of evaluations required by CMA-ES to find a near-optimal solution in a single bin, SAIL finds solutions of similar quality in every bin.Links:
Bibtex:
"@inproceedings{gaier2017feature, title={Feature space modeling through surrogate illumination}, author={Gaier, Adam and Asteroth, Alexander and Mouret, Jean-Baptiste}, booktitle={Proc. of GECCO}, year={2017} }"
-
Learning highly diverse robot throwing movements through quality diversity search Robotics
Authors:
- Seungsu Kim
- Stephane Doncieux
Links:
Bibtex:
"@inproceedings{kim2017learning, title={Learning highly diverse robot throwing movements through quality diversity search}, author={Kim, Seungsu and Doncieux, St{\'e}phane}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion}, pages={1177--1178}, year={2017}, organization={ACM} }"
-
Minimal Criterion Coevolution: A New Approach to Open-Ended Search.
Authors:
- Jonathan C. Brant
- Kenneth O. Stanley
Links:
Bibtex:
"@inproceedings{Brant2017, author = {Brant, Jonathan C. and Stanley, Kenneth O.}, title = {Minimal Criterion Coevolution: A New Approach to Open-Ended Search.}, year = {2017}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://drive.google.com/file/d/1qdLWKsOzU7HgnO9hyymkeOp_xcDuWrx9/view}, booktitle = {Genetic and Evolutionary Computation Conference (GECCO '17) Companion}, series = {GECCO '17} } "
2016: 12 Papers
-
An Extended Study of Quality Diversity Algorithms
Authors:
- Justin Pugh
- Lisa Soros
- Kenneth Stanley
Abstract:
In a departure from conventional optimization where the goal is to find the best possible solution, a new class of evolutionary algorithms instead search for quality diversity (QD) – a maximally diverse collection of individuals in which each member is as high-performing as possible. In QD, diversity of behaviors or phenotypes is defined by a behavior characterization (BC) that is typically unaligned with (i.e. orthogonal to) the notion of quality. As experiments in a difficult maze task reinforce, QD algorithms driven by such an unaligned BC are unable to discover the best solutions on sufficiently deceptive problems. This study comprehensively surveys known QD algorithms and introduces several novel variants thereof, including a method for successfully confronting deceptive QD landscapes: driving search with multiple BCs simultaneously.Links:
Bibtex:
"@inproceedings{pugh2016extended, title={An Extended Study of Quality Diversity Algorithms}, author={Pugh, Justin K and Soros, Lisa B and Stanley, Kenneth O}, booktitle={Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion}, pages={19--20}, year={2016}, organization={ACM} }"
-
EvoRBC: evolutionary repertoire-based control for robots with arbitrary locomotion complexity Robotics
Authors:
- Miguel Duarte
- Jorge Gomes
- Sancho Moura Oliveira
- Anders Lyhne Christensen
Bibtex:
"@inproceedings{duarte2016evorbc, title={EvoRBC: evolutionary repertoire-based control for robots with arbitrary locomotion complexity}, author={Duarte, Miguel and Gomes, Jorge and Oliveira, Sancho Moura and Christensen, Anders Lyhne}, booktitle={Proceedings of the Genetic and Evolutionary Computation Conference 2016}, pages={93--100}, year={2016}, organization={ACM} }"
-
Gaining insight into quality diversity
Authors:
- Joshua Auerbach
- Giovanni Iacca
- Dario Floreano
Abstract:
Recently there has been a growing movement of researchers that believes innovation and novelty creation, rather than pure optimization, are the true strengths of evolutionary algorithms relative to other forms of machine learning. This idea also provides one possible explanation for why evolutionary processes may exist in nervous systems on top of other forms of learning. One particularly exciting corollary of this, is that evolutionary algorithms may be used to produce what Pugh et al have dubbed Quality Diversity (QD): as many as possible different solutions (according to some characterization), which are all as fit as possible. While the notion of QD implies choosing the dimensions on which to measure diversity and performance, we propose that it may be possible (and desirable) to free the evolutionary process from requiring defining these dimensions. Toward that aim, we seek to understand more about QD in general by investigating how algorithms informed by different measures of diversity (or none at all) create QD, when that QD is measured in a diversity of ways.Links:
Bibtex:
"@inproceedings{auerbach2016gaining, title={Gaining insight into quality diversity}, author={Auerbach, Joshua E and Iacca, Giovanni and Floreano, Dario}, booktitle={Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion}, pages={1061--1064}, year={2016}, organization={ACM} }"
-
How Do Different Encodings Influence the Performance of the MAP-Elites Algorithm? Robotics
Authors:
- Danesh Tarapore
- Jeff Clune
- Antoine Cully
- Jean-Baptiste Mouret
Abstract:
The recently introduced Intelligent Trial and Error algorithm (IT&E) both improves the ability to automatically generate controllers that transfer to real robots, and enables robots to creatively adapt to damage in less than 2 minutes. A key component of IT&E is a new evolutionary algorithm called MAP-Elites, which creates a behavior-performance map that is provided as a set of "creative" ideas to an online learning algorithm. To date, all experiments with MAP-Elites have been performed with a directly encoded list of parameters: it is therefore unknown how MAP-Elites would behave with more advanced encodings, like HyperNeat and SUPG. In addition, because we ultimately want robots that respond to their environments via sensors, we investigate the ability of MAP-Elites to evolve closed-loop controllers, which are more complicated, but also more powerful. Our results show that the encoding critically impacts the quality of the results of MAP-Elites, and that the differences are likely linked to the locality of the encoding (the likelihood of generating a similar behavior after a single mutation). Overall, these results improve our understanding of both the dynamics of the MAP-Elites algorithm and how to best harness MAP-Elites to evolve effective and adaptable robotic controllers.Links:
Bibtex:
"@inproceedings{tarapore2016different, title={How Do Different Encodings Influence the Performance of the MAP-Elites Algorithm?}, author={Tarapore, Danesh and Clune, Jeff and Cully, Antoine and Mouret, Jean-Baptiste}, booktitle={Genetic and Evolutionary Computation Conference}, year={2016} }"
-
How the Strictness of the Minimal Criterion Impacts Open–Ended Evolution
Authors:
- L. B. Soros
- Nick Cheney
- Kenneth O. Stanley
Links:
Bibtex:
@inproceedings{Soros_2016, series={ALIFE 2016}, title={How the Strictness of the Minimal Criterion Impacts Open-Ended Evolution}, url={http://dx.doi.org/10.7551/978-0-262-33936-0-ch040}, DOI={10.7551/978-0-262-33936-0-ch040}, booktitle={Proceedings of the Artificial Life Conference 2016}, publisher={MIT Press}, author={Soros, L. B. and Cheney, Nick and Stanley, Kenneth O.}, year={2016}, collection={ALIFE 2016} }
-
On the critical role of divergent selection in evolvability Robotics
Authors:
- Joel Lehman
- Bryan Wilder
- Kenneth Stanley
Abstract:
An ambitious goal in evolutionary robotics (ER) is to evolve increasingly complex robotic behaviors with minimal human design effort. Reaching this goal requires evolutionary algorithms that can unlock from genetic encodings their latent potential for evolvability. One issue clouding this goal is conceptual confusion about evolvability that often obscures important or desirable aspects of evolvability. The danger from such confusion is that it may establish unrealistic goals for evolvability that prove unproductive in practice. An important issue separate from conceptual confusion is the common misalignment between selection and evolvability in ER. While more expressive encodings can represent higher-level adaptations (e.g. sexual reproduction or developmental systems) that increase long-term evolutionary potential (i.e. evolvability), realizing such potential requires gradients of fitness and evolvability to align. In other words, selection is often a critical factor limiting increasing evolvability. Thus, drawing from a series of recent papers, this article seeks to both (1) clarify and focus the ways in which the term evolvability is used within artificial evolution and (2) argue for the importance of one type of selection, i.e. divergent selection, for enabling evolvability. The main argument is that there is a fundamental connection between divergent selection and evolvability (on both the individual and population level) that does not hold for typical goal-oriented selection. The conclusion is that selection pressure plays a critical role in realizing the potential for evolvability and that divergent selection in particular provides a principled mechanism for encouraging evolvability in artificial evolution.Links:
Bibtex:
"@article{lehman2016critical, title={On the critical role of divergent selection in evolvability}, author={Lehman, Joel and Wilder, Bryan and Stanley, Kenneth O}, journal={Frontiers in Robotics and AI}, volume={3}, pages={45}, year={2016}, publisher={Frontiers} }"
-
Quality Diversity: A New Frontier for Evolutionary Computation Robotics
Authors:
- Justin Pugh
- Lisa Soros
- Kenneth Stanley
Abstract:
While evolutionary computation and evolutionary robotics take inspiration from nature, they have long focused mainly on problems of performance optimization. Yet, evolution in nature can be interpreted as more nuanced than a process of simple optimization. In particular, natural evolution is a divergent search that optimizes locally within each niche as it simultaneously diversifies. This tendency to discover both quality and diversity at the same time differs from many of the conventional algorithms of machine learning, and also thereby suggests a different foundation for inferring the approach of greatest potential for evolutionary algorithms. In fact, several recent evolutionary algorithms called quality diversity (QD) algorithms (e.g., novelty search with local competition and MAP-Elites) have drawn inspiration from this more nuanced view, aiming to fill a space of possibilities with the best possible example of each type of achievable behavior. The result is a new class of algorithms that return an archive of diverse, high-quality behaviors in a single run. The aim in this paper is to study the application of QD algorithms in challenging environments (in particular complex mazes) to establish their best practices for ambitious domains in the future. In addition to providing insight into cases when QD succeeds and fails, a new approach is investigated that hybridizes multiple views of behaviors (called behavior characterizations) in the same run, which succeeds in overcoming some of the challenges associated with searching for QD with respect to a behavior characterization that is not necessarily sufficient for generating both quality and diversity at the same time.Links:
Bibtex:
"@article{pugh2016quality, title={Quality Diversity: A New Frontier for Evolutionary Computation}, author={Pugh, Justin K and Soros, Lisa B and Stanley, Kenneth O}, journal={Frontiers in Robotics and AI}, volume={3}, pages={40}, year={2016}, publisher={Frontiers} }"
-
Rapid Phenotypic Landscape Exploration Through Hierarchical Spatial Partitioning
Authors:
- Davy Smith
- Laurissa Tokarchuk
- Geraint Wiggins
Abstract:
Exploration of the search space through the optimisation of phenotypic diversity is of increasing interest within the field of evolutionary robotics. Novelty search and the more recent MAP-Elites are two state of the art evolutionary algorithms which diversify low dimensional phenotypic traits for divergent exploration. In this paper we introduce a novel alternative for rapid divergent search of the feature space. Unlike previous phenotypic search procedures, our proposed Spatial, Hierarchical, Illuminated Neuro-Evolution (SHINE) algorithm utilises a tree structure for the maintenance and selection of potential candidates. SHINE penalises previous solutions in more crowded areas of the landscape. Our experimental results show that SHINE significantly outperforms novelty search and MAP-Elites in both performance and exploration. We conclude that the SHINE algorithm is a viable method for rapid divergent search of low dimensional, phenotypic landscapes.Links:
Bibtex:
"@inproceedings{smith2016rapid, title={Rapid phenotypic landscape exploration through hierarchical spatial partitioning}, author={Smith, Davy and Tokarchuk, Laurissa and Wiggins, Geraint}, booktitle={International conference on parallel problem solving from nature}, pages={911--920}, year={2016}, organization={Springer}}"
-
Safety-aware robot damage recovery using constrained bayesian optimization and simulated priors Robotics
Authors:
- Vaios Papaspyros
- Konstantinos Chatzilygeroudis
- Vassilis Vassiliades
- Jean-Baptiste Mouret
Abstract:
The recently introduced Intelligent Trial-and-Error (IT&E) algorithm showed that robots can adapt to damage in a matter of a few trials. The success of this algorithm relies on two components: prior knowledge acquired through simulation with an intact robot, and Bayesian optimization (BO) that operates on-line, on the damaged robot. While IT&E leads to fast damage recovery, it does not incorporate any safety constraints that prevent the robot from attempting harmful behaviors. In this work, we address this limitation by replacing the BO component with a constrained BO procedure. We evaluate our approach on a simulated damaged humanoid robot that needs to crawl as fast as possible, while performing as few unsafe trials as possible. We compare our new "safety-aware IT&E" algorithm to IT&E and a multi-objective version of IT&E in which the safety constraints are treated as separate objectives. Our results show that our algorithm outperforms the other approaches, both in crawling speed within the safe regions and in the number of unsafe trials.Links:
Bibtex:
"@inproceedings{papaspyros2016safety, title={Safety-aware robot damage recovery using constrained bayesian optimization and simulated priors}, author={Papaspyros, Vaios and Chatzilygeroudis, Konstantinos and Vassiliades, Vassilis and Mouret, Jean-Baptiste}, booktitle={ Bayesian Optimization: Black-box Optimization and Beyond (workshop at NIPS), 2016, Barcelone, Spain.}, year={2016} }"
-
Searching for quality diversity when diversity is unaligned with quality
Authors:
- Justin Pugh
- Lisa Soros
- Kenneth Stanley
Abstract:
Inspired by natural evolution’s affinity for discovering a wide variety of successful organisms, a new evolutionary search paradigm has emerged wherein the goal is not to find the single best solution but rather to collect a diversity of unique phenotypes where each variant is as good as it can be. These quality diversity (QD) algorithms therefore must explore multiple promising niches simultaneously. A QD algorithm’s diversity component, formalized by specifying a behavior characterization (BC), not only generates diversity but also promotes quality by helping to overcome deception in the fitness landscape. However, some BCs (particularly those that are unaligned with the notion of quality) do not adequately mitigate deception, rendering QD algorithms unable to discover the best-performing solutions on difficult problems. This paper introduces a solution that enables QD algorithms to pursue arbitrary notions of diversity without compromising their ability to solve hard problems: driving search with multiple BCs simultaneously.Links:
Bibtex:
"@inproceedings{pugh2016searching, title={Searching for quality diversity when diversity is unaligned with quality}, author={Pugh, Justin K and Soros, Lisa B and Stanley, Kenneth O}, booktitle={International Conference on Parallel Problem Solving from Nature}, pages={880--889}, year={2016}, organization={Springer} }"
-
Towards semi-episodic learning for robot damage recovery Robotics
Authors:
- Konstantinos Chatzilygeroudis
- Antoine Cully
- Jean-Baptiste Mouret
Abstract:
The recently introduced Intelligent Trial and Error algorithm (IT&E) enables robots to creatively adapt to damage in a matter of minutes by combining an off-line evolutionary algorithm and an on-line learning algorithm based on Bayesian Optimization. We extend the IT&E algorithm to allow for robots to learn to compensate for damages while executing their task(s). This leads to a semi-episodic learning scheme that increases the robot's lifetime autonomy and adaptivity. Preliminary experiments on a toy simulation and a 6-legged robot locomotion task show promising results.Links:
Bibtex:
"@inproceedings{chatzilygeroudis2016towards, title={Towards semi-episodic learning for robot damage recovery}, author={Chatzilygeroudis, Konstantinos and Cully, Antoine and Mouret, Jean-Baptiste}, booktitle={IEEE International Conference on Robotics and Automation (ICRA) Workshop on AI for Long-Term Autonomy}, year={2016} }"
-
Understanding innovation engines: Automated creativity and improved stochastic optimization via deep learning
Authors:
- Anh Nguyen
- Jason Yosinski
- Jeff Clune
Abstract:
The Achilles Heel of stochastic optimization algorithms is getting trapped on local optima. Novelty Search mitigates this problem by encouraging exploration in all interesting directions by replacing the performance objective with a reward for novel behaviors. This reward for novel behaviors has traditionally required a human-crafted, behavioral distance function. While Novelty Search is a major conceptual breakthrough and outperforms traditional stochastic optimization on certain problems, it is not clear how to apply it to challenging, high-dimensional problems where specifying a useful behavioral distance function is difficult. For example, in the space of images, how do you encourage novelty to produce hawks and herons instead of endless pixel static? Here we propose a new algorithm, the Innovation Engine, that builds on Novelty Search by replacing the human-crafted behavioral distance with a Deep Neural Network (DNN) that can recognize interesting differences between phenotypes. The key insight is that DNNs can recognize similarities and differences between phenotypes at an abstract level, wherein novelty means interesting novelty. For example, a DNN-based novelty search in the image space does not explore in the low-level pixel space, but instead creates a pressure to create new types of images (e.g. churches, mosques, obelisks, etc.). Here we describe the long-term vision for the Innovation Engine algorithm, which involves many technical challenges that remain to be solved. We then implement a simplified version of the algorithm that enables us to explore some of the algorithm’s key motivations. Our initial results, in the domain of images, suggest that Innovation Engines could ultimately automate the production of endless streams of interesting solutions in any domain: e.g. producing intelligent software, robot controllers, optimized physical components, and art.Links:
Bibtex:
"@article{nguyen2016understanding, title={Understanding innovation engines: Automated creativity and improved stochastic optimization via deep learning}, author={Nguyen, Anh and Yosinski, Jason and Clune, Jeff}, journal={Evolutionary computation}, volume={24}, number={3}, pages={545--572}, year={2016}, publisher={MIT Press} }"
2015: 8 Papers
-
Confronting the challenge of quality diversity
Authors:
- Justin Pugh
- Lisa Soros
- Paul Szerlip
- Kenneth Stanley
Abstract:
In contrast to the conventional role of evolution in evolutionary computation (EC) as an optimization algorithm, a new class of evolutionary algorithms has emerged in recent years that instead aim to accumulate as diverse a collection of discoveries as possible, yet where each variant in the collection is as fit as it can be. Often applied in both neuroevolution and morphological evolution, these new quality diversity (QD) algorithms are particularly well-suited to evolution’s inherent strengths, thereby offering a promising niche for EC within the broader field of machine learning. However, because QD algorithms are so new, until now no comprehensive study has yet attempted to systematically elucidate their relative strengths and weaknesses under different conditions. Taking a first step in this direction, this paper introduces a new benchmark domain designed specifically to compare and contrast QD algorithms. It then shows how the degree of alignment between the measure of quality and the behavior characterization (which is an essential component of all QD algorithms to date) impacts the ultimate performance of different such algorithms. The hope is that this initial study will help to stimulate interest in QD and begin to unify the disparate ideas in the area.Links:
Bibtex:
"@inproceedings{pugh2015confronting, title={Confronting the challenge of quality diversity}, author={Pugh, Justin K and Soros, LB and Szerlip, Paul A and Stanley, Kenneth O}, booktitle={Proceedings of the 2015 on Genetic and Evolutionary Computation Conference}, pages={967--974}, year={2015}, organization={ACM} }"
-
Constrained novelty search: A study on game content generation Games
Authors:
- Antonios Liapis
- Georgios Yannakakis
- Julian Togelius
Abstract:
Novelty search is a recent algorithm geared towards exploring search spaces without regard to objectives. When the presence of constraints divides a search space into feasible space and infeasible space, interesting implications arise regarding how novelty search explores such spaces. This paper elaborates on the problem of constrained novelty search and proposes two novelty search algorithms which search within both the feasible and the infeasible space. Inspired by the FI-2pop genetic algorithm, both algorithms maintain and evolve two separate populations, one with feasible and one with infeasible individuals, while each population can use its own selection method. The proposed algorithms are applied to the problem of generating diverse but playable game levels, which is representative of the larger problem of procedural game content generation. Results show that the two-population constrained novelty search methods can create, under certain conditions, larger and more diverse sets of feasible game levels than current methods of novelty search, whether constrained or unconstrained. However, the best algorithm is contingent on the particularities of the search space and the genetic operators used. Additionally, the proposed enhancement of offspring boosting is shown to enhance performance in all cases of two-population novelty search.Links:
Bibtex:
"@article{liapis2015constrained, title={Constrained novelty search: A study on game content generation}, author={Liapis, Antonios and Yannakakis, Georgios N and Togelius, Julian}, journal={Evolutionary computation}, volume={23}, number={1}, pages={101--129}, year={2015}, publisher={MIT Press} }"
-
Enhancing divergent search through extinction events Robotics
Authors:
- Joel Lehman
- Risto Miikkulainen
Abstract:
A challenge in evolutionary computation is to create representations as evolvable as those in natural evolution. This paper hypothesizes that extinction events, i.e. mass extinctions, can significantly increase evolvability, but only when combined with a divergent search algorithm, i.e. a search driven towards diversity (instead of optimality). Extinctions amplify diversity-generation by creating unpredictable evolutionary bottlenecks. Persisting through multiple such bottlenecks is more likely for lineages that diversify across many niches, resulting in indirect selection pressure for the capacity to evolve. This hypothesis is tested through experiments in two evolutionary robotics domains. The results show that combining extinction events with divergent search increases evolvability, while combining them with convergent search offers no similar benefit. The conclusion is that extinction events may provide a simple and effective mechanism to enhance performance of divergent search algorithms.Links:
Bibtex:
"@inproceedings{lehman2015enhancing, title={Enhancing divergent search through extinction events}, author={Lehman, Joel and Miikkulainen, Risto}, booktitle={Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation}, pages={951--958}, year={2015}, organization={ACM} }"
-
Evolving a behavioral repertoire for a walking robot Robotics
Authors:
- Antoine Cully
- Jean-Baptiste Mouret
Abstract:
Numerous algorithms have been proposed to allow legged robots to learn to walk. However, the vast majority of these algorithms is devised to learn to walk in a straight line, which is not sufficient to accomplish any real-world mission. Here we introduce the Transferability-based Behavioral Repertoire Evolution algorithm (TBR-Evolution), a novel evolutionary algorithm that simultaneously discovers several hundreds of simple walking controllers, one for each possible direction. By taking advantage of solutions that are usually discarded by evolutionary processes, TBR-Evolution is substantially faster than independently evolving each controller. Our technique relies on two methods: (1) novelty search with local competition, which searches for both high-performing and diverse solutions, and (2) the transferability approach, which combines simulations and real tests to evolve controllers for a physical robot. We evaluate this new technique on a hexapod robot. Results show that with only a few dozen short experiments performed on the robot, the algorithm learns a repertoire of controllers that allows the robot to reach every point in its reachable space. Overall, TBR-Evolution opens a new kind of learning algorithm that simultaneously optimizes all the achievable behaviors of a robot.Links:
Bibtex:
"@article{cully2015evolving, title={Evolving a behavioral repertoire for a walking robot}, author={Cully, Antoine and Mouret, Jean-Baptiste}, journal={Evolutionary Computation}, year={2015}, publisher={MIT Press} }"
-
Extinction events can accelerate evolution
Authors:
- Joel Lehman
- Risto Miikkulainen
Abstract:
Extinction events impact the trajectory of biological evolution significantly. They are often viewed as upheavals to the evolutionary process. In contrast, this paper supports the hypothesis that although they are unpredictably destructive, extinction events may in the long term accelerate evolution by increasing evolvability. In particular, if extinction events extinguish indiscriminately many ways of life, indirectly they may select for the ability to expand rapidly through vacated niches. Lineages with such an ability are more likely to persist through multiple extinctions. Lending computational support for this hypothesis, this paper shows how increased evolvability will result from simulated extinction events in two computational models of evolved behavior. The conclusion is that although they are destructive in the short term, extinction events may make evolution more prolific in the long term.Links:
Bibtex:
"@article{lehman2015extinction, title={Extinction events can accelerate evolution}, author={Lehman, Joel and Miikkulainen, Risto}, journal={PloS one}, volume={10}, number={8}, pages={e0132886}, year={2015}, publisher={Public Library of Science} }"
-
Illuminating search spaces by mapping elites Robotics
Authors:
- Jean-Baptiste Mouret
- Jeff Clune
Abstract:
Many fields use search algorithms, which automatically explore a search space to find high-performing solutions: chemists search through the space of molecules to discover new drugs; engineers search for stronger, cheaper, safer designs, scientists search for models that best explain data, etc. The goal of search algorithms has traditionally been to return the single highest-performing solution in a search space. Here we describe a new, fundamentally different type of algorithm that is more useful because it provides a holistic view of how high-performing solutions are distributed throughout a search space. It creates a map of high-performing solutions at each point in a space defined by dimensions of variation that a user gets to choose. This Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) algorithm illuminates search spaces, allowing researchers to understand how interesting attributes of solutions combine to affect performance, either positively or, equally of interest, negatively. For example, a drug company may wish to understand how performance changes as the size of molecules and their cost-to-produce vary. MAP-Elites produces a large diversity of high-performing, yet qualitatively different solutions, which can be more helpful than a single, high-performing solution. Interestingly, because MAP-Elites explores more of the search space, it also tends to find a better overall solution than state-of-the-art search algorithms. We demonstrate the benefits of this new algorithm in three different problem domains ranging from producing modular neural networks to designing simulated and real soft robots. Because MAP-Elites (1) illuminates the relationship between performance and dimensions of interest in solutions, (2) returns a set of high-performing, yet diverse solutions, and (3) improves finding a single, best solution, it will advance science and engineering.Links:
Bibtex:
"@article{mouret2015illuminating, title={Illuminating search spaces by mapping elites}, author={Mouret, Jean-Baptiste and Clune, Jeff}, journal={arXiv preprint arXiv:1504.04909}, year={2015} }"
-
Innovation engines: Automated creativity and improved stochastic optimization via deep learning
Authors:
- Anh Nguyen
- Jason Yosinski
- Jeff Clune
Abstract:
The Achilles Heel of stochastic optimization algorithms is getting trapped on local optima. Novelty Search avoids this problem by encouraging a search in all interesting directions. That occurs by replacing a performance objective with a reward for novel behaviors, as defined by a human-crafted, and often simple, behavioral distance function. While Novelty Search is a major conceptual breakthrough and outperforms traditional stochastic optimization on certain problems, it is not clear how to apply it to challenging, high-dimensional problems where specifying a useful behavioral distance function is difficult. For example, in the space of images, how do you encourage novelty to produce hawks and herons instead of endless pixel static? Here we propose a new algorithm, the Innovation Engine, that builds on Novelty Search by replacing the human-crafted behavioral distance with a Deep Neural Network (DNN) that can recognize interesting differences between phenotypes. The key insight is that DNNs can recognize similarities and differences between phenotypes at an abstract level, wherein novelty means interesting novelty. For example, a novelty pressure in image space does not explore in the low-level pixel space, but instead creates a pressure to create new types of images (e.g. churches, mosques, obelisks, etc.). Here we describe the long-term vision for the Innovation Engine algorithm, which involves many technical challenges that remain to be solved. We then implement a simplified version of the algorithm that enables us to explore some of the algorithm’s key motivations. Our initial results, in the domain of images, suggest that Innovation Engines could ultimately automate the production of endless streams of interesting solutions in any domain: e.g. producing intelligent software, robot controllers, optimized physical components, and art.Links:
Bibtex:
"@inproceedings{nguyen2015innovation, title={Innovation engines: Automated creativity and improved stochastic optimization via deep learning}, author={Nguyen, Anh Mai and Yosinski, Jason and Clune, Jeff}, booktitle={Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation}, pages={959--966}, year={2015}, organization={ACM} }"
-
Robots that can adapt like animals Robotics
Authors:
- Antoine Cully
- Jeff Clune
- Danesh Tarapore
- Jean-Baptiste Mouret
Abstract:
As robots leave the controlled environments of factories to autonomously function in more complex, natural environments, they will have to respond to the inevitable fact that they will become damaged. However, while animals can quickly adapt to a wide variety of injuries, current robots cannot "think outside the box" to find a compensatory behavior when damaged: they are limited to their pre-specified self-sensing abilities, can diagnose only anticipated failure modes, and require a pre-programmed contingency plan for every type of potential damage, an impracticality for complex robots. Here we introduce an intelligent trial and error algorithm that allows robots to adapt to damage in less than two minutes, without requiring self-diagnosis or pre-specified contingency plans. Before deployment, a robot exploits a novel algorithm to create a detailed map of the space of high-performing behaviors: This map represents the robot's intuitions about what behaviors it can perform and their value. If the robot is damaged, it uses these intuitions to guide a trial-and-error learning algorithm that conducts intelligent experiments to rapidly discover a compensatory behavior that works in spite of the damage. Experiments reveal successful adaptations for a legged robot injured in five different ways, including damaged, broken, and missing legs, and for a robotic arm with joints broken in 14 different ways. This new technique will enable more robust, effective, autonomous robots, and suggests principles that animals may use to adapt to injury.Links:
Bibtex:
"@article{cully2015robots, title={Robots that can adapt like animals}, author={Cully, Antoine and Clune, Jeff and Tarapore, Danesh and Mouret, Jean-Baptiste}, journal={Nature}, volume={521}, number={7553}, pages={503--507}, year={2015}, publisher={Nature Publishing Group} }"
2013: 2 Papers
-
Behavioral repertoire learning in robotics Robotics
Authors:
- Antoine Cully
- Jean-Baptiste Mouret
Abstract:
Learning in robotics typically involves choosing a simple goal (e.g. walking) and assessing the performance of each controller with regard to this task (e.g. walking speed). However, learning advanced, input-driven controllers (e.g. walking in each direction) requires testing each controller on a large sample of the possible input signals. This costly process makes it difficult to learn useful low-level controllers in robotics. Here we introduce BR-Evolution, a new evolutionary learning technique that generates a behavioral repertoire by taking advantage of the candidate solutions that are usually discarded. Instead of evolving a single, general controller, BR-Evolution thus evolves a collection of simple controllers, one for each variant of the target behavior; to distinguish similar controllers, it uses a performance objective that allows it to produce a collection of diverse but high-performing behaviors. We evaluated this new technique by evolving gait controllers for a simulated hexapod robot. Results show that a single run of the EA quickly finds a collection of controllers that allows the robot to reach each point of the reachable space. Overall, BR-Evolution opens a new kind of learning algorithm that simultaneously optimizes all the achievable behaviors of a robot.Links:
Bibtex:
"@inproceedings{cully2013behavioral, title={Behavioral repertoire learning in robotics}, author={Cully, Antoine and Mouret, Jean-Baptiste}, booktitle={Proceedings of the 15th annual conference on Genetic and Evolutionary Computation}, pages={175--182}, year={2013}, organization={ACM} }"
-
Transforming exploratory creativity with DeLeNoX Games
Authors:
- Antonios Liapis
- Hector Martinez
- Julian Togelius
- Georgios Yannakakis
Abstract:
We introduce DeLeNoX (Deep Learning Novelty Explorer), a system that autonomously creates artifacts in constrained spaces according to its own evolving interestingness criterion. DeLeNoX proceeds in alternating phases of exploration and transformation. In the exploration phases, a version of novelty search augmented with constraint handling searches for maximally diverse artifacts using a given distance function. In the transformation phases, a deep learning autoencoder learns to compress the variation between the found artifacts into a lower-dimensional space. The newly trained encoder is then used as the basis for a new distance function, transforming the criteria for the next exploration phase. In the current paper, we apply DeLeNoX to the creation of spaceships suitable for use in two-dimensional arcade-style computer games, a representative problem in procedural content generation in games. We also situate DeLeNoX in relation to the distinction between exploratory and transformational creativity, and in relation to Schmidhuber’s theory of creativity through the drive for compression progress.Links:
Bibtex:
"@inproceedings{liapis2013transforming, title={Transforming exploratory creativity with DeLeNoX}, author={Liapis, Antonios and Mart{\i}nez, H{\'e}ctor P and Togelius, Julian and Yannakakis, Georgios N}, booktitle={Proceedings of the Fourth International Conference on Computational Creativity}, pages={56--63}, year={2013}, organization={AAAI Press} }"
2011: 3 Papers
-
Abandoning objectives: Evolution through the search for novelty alone Robotics
Authors:
- Joel Lehman
- Kenneth Stanley
Abstract:
In evolutionary computation, the fitness function normally measures progress towards an objective in the search space, effectively acting as an objective function. Through deception, such objective functions may actually prevent the objective from being reached. While methods exist to mitigate deception, they leave the underlying pathology untreated: Objective functions themselves may actively misdirect search towards dead ends. This paper proposes an approach to circumventing deception that also yields a new perspective on open-ended evolution: Instead of either explicitly seeking an objective or modeling natural evolution to capture open-endedness, the idea is to simply search for behavioral novelty. Even in an objective-based problem, such novelty search ignores the objective. Because many points in the search space collapse to a single behavior, the search for novelty is often feasible. Furthermore, because there are only so many simple behaviors, the search for novelty leads to increasing complexity. By decoupling open-ended search from artificial life worlds, the search for novelty is applicable to real world problems. Counterintuitively, in the maze navigation and biped walking tasks in this paper, novelty search significantly outperforms objective-based search, suggesting the strange conclusion that some problems are best solved by methods that ignore the objective. The main lesson is the inherent limitation of the objective-based paradigm and the unexploited opportunity to guide search through other means.Links:
Bibtex:
"@article{lehman2011abandoning, title={Abandoning objectives: Evolution through the search for novelty alone}, author={Lehman, Joel and Stanley, Kenneth O}, journal={Evolutionary computation}, volume={19}, number={2}, pages={189--223}, year={2011}, publisher={MIT Press} }"
-
Evolving a diversity of virtual creatures through novelty search and local competition Robotics
Authors:
- Joel Lehman
- Kenneth Stanley
Abstract:
An ambitious challenge in artificial life is to craft an evolutionary process that discovers a wide diversity of well-adapted virtual creatures within a single run. Unlike in nature, evolving creatures in virtual worlds tend to converge to a single morphology because selection therein greedily rewards the morphology that is easiest to exploit. However, novelty search, a technique that explicitly rewards diverging, can potentially mitigate such convergence. Thus in this paper an existing creature evolution platform is extended with multi-objective search that balances drives for both novelty and performance. However, there are different ways to combine performance-driven search and novelty search. The suggested approach is to provide evolution with both a novelty objective that encourages diverse morphologies and a local competition objective that rewards individuals outperforming those most similar in morphology. The results in an experiment evolving locomoting virtual creatures show that novelty search with local competition discovers more functional morphological diversity within a single run than models with global competition, which are more predisposed to converge. The conclusions are that novelty search with local competition may complement recent advances in evolving virtual creatures and may in general be a principled approach to combining novelty search with pressure to achieve.Links:
Bibtex:
"@inproceedings{lehman2011evolving, title={Evolving a diversity of virtual creatures through novelty search and local competition}, author={Lehman, Joel and Stanley, Kenneth O}, booktitle={Proceedings of the 13th annual conference on Genetic and evolutionary computation}, pages={211--218}, year={2011}, organization={ACM} }"
-
Novelty-based multiobjectivization Robotics
Authors:
- Jean-Baptiste Mouret
Abstract:
Novelty search is a recent and promising approach to evolve neurocontrollers, especially to drive robots. The main idea is to maximize the novelty of behaviors instead of the efficiency. However, abandoning the efficiency objective(s) may be too radical in many contexts. In this paper, a Pareto-based multi-objective evolutionary algorithm is employed to reconcile novelty search with objective-based optimization by following a multiobjectivization process. Several multiobjectivizations based on behavioral novelty and on behavioral diversity are compared on a maze navigation task. Results show that the bi-objective variant “Novelty + Fitness” is better at fine-tuning behaviors than basic novelty search, while keeping a comparable number of iterations to converge.Links:
Bibtex:
"@incollection{mouret2011novelty, title={Novelty-based multiobjectivization}, author={Mouret, Jean-Baptiste}, booktitle={New horizons in evolutionary robotics}, pages={139--154}, year={2011}, publisher={Springer} }"
2010: 1 Paper
-
Revising the Evolutionary Computation Abstraction: Minimal Criteria Novelty Search.
Authors:
- Joel Lehman
- Kenneth O. Stanley
Links:
Bibtex:
"@inproceedings{lehman2010, author = {Lehman, Joel. and Stanley, Kenneth O.}, title = {Revising the Evolutionary Computation Abstraction: Minimal Criteria Novelty Search}, year = {2010}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://drive.google.com/file/d/1p7GdqyOwn-beyKoyuA0T5eQxgKoai21M/view}, booktitle = {Genetic and Evolutionary Computation Conference (GECCO '10) Companion}, series = {GECCO '10} } "
Undated: 1 Paper
-
Toward a Plug-and-Play Vision-Based Grasping Module for Robotics Robotics
Authors:
- François Hélénon
- Johann Huber
- Faïz Ben Amar
- Stéphane Doncieux
Abstract:
Despite recent advancements in AI for robotics, grasping remains a partially solved challenge, hindered by the lack of benchmarks and reproducibility constraints. This paper introduces a vision-based grasping framework that can easily be transferred across multiple manipulators. Leveraging Quality-Diversity (QD) algorithms, the framework generates diverse repertoires of open-loop grasping trajectories, enhancing adaptability while maintaining a diversity of grasps. This framework addresses two main issues: the lack of an off-the-shelf vision module for detecting object pose and the generalization of QD trajectories to the whole robot operational space. The proposed solution combines multiple vision modules for 6DoF object detection and tracking while rigidly transforming QD-generated trajectories into the object frame. Experiments on a Franka Research 3 arm and a UR5 arm with a SIH Schunk hand demonstrate comparable performance when the real scene aligns with the simulation used for grasp generation. This work represents a significant stride toward building a reliable vision-based grasping module transferable to new platforms, while being adaptable to diverse scenarios without further training iterations.Links:
Bibtex:
@article{Hélénon2023Toward, title={Toward a Plug-and-Play Vision-Based Grasping Module for Robotics}, author={Hélénon, François and Huber, Johann and Ben Amar, Faïz and Doncieux, Stéphane}, journal={arXiv preprint arXiv:2310.04349v2}, year={2023} }