Adversarially Guided Self-Play for Adopting Social Conventions

Reading time: 2 minutes
...

๐Ÿ“ Original Info

  • Title: Adversarially Guided Self-Play for Adopting Social Conventions
  • ArXiv ID: 2001.05994
  • Date: 2020-10-09
  • Authors: Mycal Tucker, Yilun Zhou, Julie Shah

๐Ÿ“ Abstract

Robotic agents must adopt existing social conventions in order to be effective teammates. These social conventions, such as driving on the right or left side of the road, are arbitrary choices among optimal policies, but all agents on a successful team must use the same convention. Prior work has identified a method of combining self-play with paired input-output data gathered from existing agents in order to learn their social convention without interacting with them. We build upon this work by introducing a technique called Adversarial Self-Play (ASP) that uses adversarial training to shape the space of possible learned policies and substantially improves learning efficiency. ASP only requires the addition of unpaired data: a dataset of outputs produced by the social convention without associated inputs. Theoretical analysis reveals how ASP shapes the policy space and the circumstances (when behaviors are clustered or exhibit some other structure) under which it offers the greatest benefits. Empirical results across three domains confirm ASP's advantages: it produces models that more closely match the desired social convention when given as few as two paired datapoints.
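The abstract describes the mechanism only at a high level: self-play plus a small paired dataset, with an adversarial term driven by unpaired outputs of the target convention. The snippet below is a minimal, hypothetical sketch of that kind of objective in PyTorch, not the authors' released code: a discriminator learns to tell the learner's outputs from the unpaired convention outputs, while the learner is trained to fool the discriminator and to fit the few paired datapoints. All names, dimensions, data, and loss weights here are placeholder assumptions.

```python
# Hypothetical sketch of an ASP-style objective (assumed, not the paper's code).
import torch
import torch.nn as nn

obs_dim, msg_dim = 8, 4

# Learner ("speaker") maps observations to outputs; discriminator scores outputs.
speaker = nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, msg_dim))
discriminator = nn.Sequential(nn.Linear(msg_dim, 32), nn.ReLU(), nn.Linear(32, 1))

opt_s = torch.optim.Adam(speaker.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

# Placeholder data: a couple of paired (input, output) examples of the target
# convention, plus unpaired convention outputs with no associated inputs.
paired_x, paired_y = torch.randn(2, obs_dim), torch.randn(2, msg_dim)
unpaired_y = torch.randn(64, msg_dim)

for step in range(1000):
    x = torch.randn(32, obs_dim)   # observations encountered during self-play
    fake_y = speaker(x)            # outputs produced by the learner

    # Discriminator: unpaired convention outputs are "real", learner outputs "fake".
    d_loss = bce(discriminator(unpaired_y), torch.ones(64, 1)) + \
             bce(discriminator(fake_y.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Learner: fool the discriminator (adversarial term shaping the policy space)
    # and match the few paired datapoints (supervised term). A task/self-play
    # reward and loss weights would be added in practice; omitted for brevity.
    adv_loss = bce(discriminator(speaker(x)), torch.ones(32, 1))
    sup_loss = ((speaker(paired_x) - paired_y) ** 2).mean()
    s_loss = adv_loss + sup_loss
    opt_s.zero_grad(); s_loss.backward(); opt_s.step()
```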

📄 Full Content

📸 Image Gallery

auto_asp.png auto_base.png commnet_arch.png commnet_comms0.png commnet_comms1.png commnet_comms8.png commnet_early.png commnet_graph.png mnist0.png mnist1.png mnist_asp_latent.png mnist_asp_latent_good.png mnist_latent.png particles.jpg particles.png particles_bad.png particles_graph.png particles_new.jpg

Reference

This content is AI-processed based on open access ArXiv data.
