Claude Formanek, Asad Jeewa, Jonathan Shock, Arnu Pretorius


Feb 2023


Being able to harness the power of large, static datasets for developing autonomous multi-agent systems could unlock enormous value for real-world applications. Many important industrial systems are multi-agent in nature and are difficult to model using bespoke simulators. However, in industry, distributed system processes can often be recorded during operation, and large quantities of demonstrative data can be stored. Offline multi-agent reinforcement learning (MARL) provides a promising paradigm for building effective online controllers from static datasets. However, offline MARL is still in its infancy, and, therefore, lacks standardised benchmarks, baselines and evaluation protocols typically found in more mature subfields of RL. This deficiency makes it difficult for the community to sensibly measure progress. In this work, we aim to fill this gap by releasing off-the-grid MARL (OG-MARL): a framework for generating offline MARL datasets and algorithms. We release an initial set of datasets and baselines for cooperative offline MARL, created using the framework, along with a standardised evaluation protocol. Our datasets provide settings that are characteristic of real-world systems, including complex dynamics, non-stationarity, partial observability, suboptimality and sparse rewards, and are generated from popular online MARL benchmarks. We hope that OG-MARL will serve the community and help steer progress in offline MARL, while also providing an easy entry point for researchers new to the field.