Instruction-following from prompts in Natural Languages (NLs) is an important benchmark for Human-AI collaboration. Training Embodied AI agents for instruction-following with Reinforcement Learning (RL) poses a strong exploration challenge. Previous works have shown that NL-based state abstractions can help address the exploitation versus exploration trade-off in RL. However, NLs descriptions are not always readily available and are expensive to collect. We therefore propose to use the Emergent Communication paradigm, where artificial agents are free to learn an emergent language (EL) via referential games, to bridge this gap. ELs constitute cheap and readily-available abstractions, as they are the result of an unsupervised learning approach. In this paper, we investigate (i) how EL-based state abstractions compare to NL-based ones for RL in hard-exploration, procedurally-generated environments, and (ii) how properties of the referential games used to learn ELs impact the quality of the RL exploration and learning. Results indicate that the EL-guided agent, namely EReLELA, achieves similar performance as its NL-based counterparts without its limitations. Our work shows that Embodied RL agents can leverage unsupervised emergent abstractions to greatly improve their exploration skills in sparse reward settings, thus opening new research avenues between Embodied AI and Emergent Communication.