EReLELA: Exploration in Reinforcement Learning via Emergent Language Abstractions

Abstract

Instruction-following from prompts in Natural Languages (NLs) is an important benchmark for Human-AI collaboration. Training Embodied AI agents for instruction-following with Reinforcement Learning (RL) poses a strong exploration challenge. Previous works have shown that NL-based state abstractions can help address the exploration versus exploitation trade-off in RL. However, NL descriptions are not always readily available and are expensive to collect. We therefore propose to use the Emergent Communication paradigm, in which artificial agents are free to learn an emergent language (EL) via referential games, to bridge this gap. ELs constitute cheap and readily available abstractions, as they result from an unsupervised learning approach. In this paper, we investigate (i) how EL-based state abstractions compare to NL-based ones for RL in hard-exploration, procedurally generated environments, and (ii) how properties of the referential games used to learn ELs impact the quality of RL exploration and learning. Results indicate that the EL-guided agent, namely EReLELA, achieves performance similar to that of its NL-based counterparts without their limitations. Our work shows that Embodied RL agents can leverage unsupervised emergent abstractions to greatly improve their exploration skills in sparse-reward settings, thus opening new research avenues between Embodied AI and Emergent Communication.

Kevin Denamganaï
Independent Researcher

My research investigates the conditions under which AI systems acquire and deploy structured symbolic representations — towards in-context grounding of novel atomic symbols and their systematic recombination into unseen configurations — spanning Compositional Generalisation, Formal Mathematics, Differentiable Language Models, and Physical Simulation. I have also investigated Language Emergence & Grounding (Emergent Communication), Unsupervised Representation Learning, Natural Language Processing, and Multi-Agent Deep Reinforcement Learning.
