Multi-Agent Reinforcement Learning with a Hierarchy of Reward Machines

The dataset used in the paper is a hierarchical structure of propositions, in which each higher-level proposition is a temporal abstraction of lower-level propositions. Each proposition represents a subtask and is assigned to a group of agents that coordinate to make it true.
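
The structure described above can be pictured as a tree of subtasks. The following is a minimal Python sketch, not the paper's actual data format: the class `Proposition`, its fields, and the example names (`press_a`, `press_b`, `open_door`, `agent_1`, `agent_2`) are all hypothetical. It also assumes, purely for illustration, that a higher-level proposition becomes true once all of its lower-level propositions are true; the paper's temporal abstraction may be richer than that.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Proposition:
    """A subtask node in the hierarchy (hypothetical representation)."""

    name: str
    agents: frozenset[str]  # group of agents assigned to this subtask
    children: list[Proposition] = field(default_factory=list)

    def holds(self, achieved: set[str]) -> bool:
        # Leaf proposition: true once its name appears among the achieved events.
        if not self.children:
            return self.name in achieved
        # Assumption for this sketch: a higher-level proposition is true
        # once every lower-level proposition beneath it is true.
        return all(child.holds(achieved) for child in self.children)

    def depth(self) -> int:
        # Number of levels in the hierarchy rooted at this proposition.
        return 1 + max((child.depth() for child in self.children), default=0)


# Two low-level subtasks, each handled by a single agent (illustrative names).
press_a = Proposition("press_a", frozenset({"agent_1"}))
press_b = Proposition("press_b", frozenset({"agent_2"}))

# A higher-level subtask that abstracts over both and needs both agents.
open_door = Proposition(
    "open_door", frozenset({"agent_1", "agent_2"}), [press_a, press_b]
)
```

Under these assumptions, `open_door.holds({"press_a"})` is false because only one lower-level subtask has been completed, while `open_door.holds({"press_a", "press_b"})` is true.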

BibTeX: