# What is Emergent Communication and Why You Should Care

For the past two years, I have been fascinated by the field of emergent communication; I even made it the topic of my master's thesis. This year, I am co-organizing the event that sparked my interest: [the Workshop on Emergent Communication at NeurIPS](https://sites.google.com/view/emecom2019). To give others a look into the field (and also to make my thesis writing more fun), I will be writing a series of blog posts on emergent communication that are meant to be accessible and interesting. I'm hoping to communicate some of the magic that has captivated me. This is the first post in the series, and my goal is to explain the basic idea and give a couple of different exciting motivations for studying it.

Update: this post was also translated to [Russian](https://habr.com/ru/post/496830/) by [Anton Alexeyev](https://habr.com/en/users/alexeyev/).

## What is Emergent Communication?

Emergent Communication (EC) is the study of learning "communication protocols in order to share information that is needed to solve [some] task" [(Foerster et al, 2016)](https://arxiv.org/abs/1605.06676), between two or more agents in an environment[^1]. In its modern incarnation, EC sits within deep learning, specifically as a subfield of deep multi-agent reinforcement learning (MARL), but it also has ties to language. The common setup has agents in an environment with the ability to send messages on some linguistic channel that other agents can see. Agents are usually initialized without any prior agreement on what that communication should look like, and they learn with RL to agree on a protocol. That protocol allows them to coordinate with each other and transfer information effectively. This description is quite abstract, so to make it clear, let's look at one of the simplest setups.

### Sender-Receiver Games

__Sender-Receiver games__ are basic games between two players: a sender and a receiver[^2].
The sender has some information that the receiver doesn't know, and the goal of the game is for the sender to communicate that information and the receiver to understand it.

![Basic Sender Receiver Game](https://i.imgur.com/nfq5lvU.png)

For example, the sender can be given one of three shapes: triangle, square, pentagon. The receiver knows that the answer is one of these shapes but doesn't know which one. We choose a vocabulary for the sender to use, and the sender then creates a message to send to the receiver. The receiver must use that message to guess which of the shapes was given to the sender, and then both agents get a reward based on whether the receiver's answer is correct. Through iterations of training, they can learn to agree on which messages correspond to which shapes. If, for example, a message is a single token from a vocabulary of three possible tokens, you can see how the two agents could learn to map each shape to a token[^3]: 0 = square, 1 = pentagon, 2 = triangle.

What is important to note here is that there is usually no single "correct" protocol that we are trying to learn; square could just as well be mapped to token 1 or 2, or even to a distribution over tokens that the receiver guesses with some probability. EC is not about supervised learning of a specific protocol but about learning _any_ protocol that will allow us to solve the game. If the game is such that we don't receive any reward for guessing square correctly, then we may not care about its mapping at all. In short, EC is not just about communicating _all_ the information, but about communicating information that's useful for that game. Since usefulness is defined by the game and the rewards, the key idea is that we __learn a communication protocol to optimize playing the game__, whatever it may be.

### More Complex Games

EC isn't restricted to such simple games, though.
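Before moving on, here is a minimal sketch of the three-shape game described above, to make the training loop concrete. It is not taken from any of the cited papers: the tabular softmax policies, the plain REINFORCE update without a baseline, and the names `train`, `greedy_protocol`, `lr`, and `episodes` are all assumptions chosen for illustration. Real EC work typically uses neural network agents.

```python
import math
import random

SHAPES = ["triangle", "square", "pentagon"]
NUM_TOKENS = 3  # vocabulary size; each message is a single token


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_step(logits, action, reward, lr):
    """REINFORCE for a softmax policy: logits += lr * reward * grad log pi(action)."""
    probs = softmax(logits)
    for i in range(len(logits)):
        grad = (1.0 if i == action else 0.0) - probs[i]
        logits[i] += lr * reward * grad


def train(episodes=5000, lr=0.5, seed=0):
    """Play the game repeatedly and return the learned logit tables."""
    rng = random.Random(seed)
    n = len(SHAPES)
    # No prior agreement: both agents start with uniform (all-zero) logits.
    sender = [[0.0] * NUM_TOKENS for _ in range(n)]    # shape -> token logits
    receiver = [[0.0] * n for _ in range(NUM_TOKENS)]  # token -> guess logits
    for _ in range(episodes):
        shape = rng.randrange(n)                       # shown only to the sender
        token = sample(softmax(sender[shape]), rng)    # the message
        guess = sample(softmax(receiver[token]), rng)  # the receiver's answer
        reward = 1.0 if guess == shape else 0.0        # shared by both agents
        reinforce_step(sender[shape], token, reward, lr)
        reinforce_step(receiver[token], guess, reward, lr)
    return sender, receiver


def greedy_protocol(sender, receiver):
    """Read off the most likely token per shape and the most likely guess per token."""
    def argmax(xs):
        return max(range(len(xs)), key=xs.__getitem__)

    speak = {SHAPES[s]: argmax(sender[s]) for s in range(len(SHAPES))}
    listen = {t: SHAPES[argmax(receiver[t])] for t in range(NUM_TOKENS)}
    accuracy = sum(listen[speak[name]] == name for name in SHAPES) / len(SHAPES)
    return speak, listen, accuracy


if __name__ == "__main__":
    speak, listen, accuracy = greedy_protocol(*train())
    print("sender:", speak)
    print("receiver:", listen)
    print("greedy accuracy:", accuracy)
```

Running this with different values of `seed` may produce different shape-to-token mappings, which is the point made above: nothing in the reward prefers one mapping over another. A simple learner like this one is also not guaranteed to find a perfect protocol on every run; it can settle on one where two shapes share a token.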
A natural extension is to use more complex data such as images [(Lazaridou et al, 2017)](https://arxiv.org/abs/1612.07182) or a complicated game environment [(Resnick et al, 2018)](https://www.pommerman.com/). Our messages can be variable length [(Havrylov et al, 2017)](https://arxiv.org/abs/1705.11192), and we can have multiple rounds of back-and-forth communication [(Das, Kottur et al 2017)](https://arxiv.org/abs/1703.06585). We can make both agents able to send and receive, and therefore look at 3 or more agents all talking and acting simultaneously [(Foerster et al, 2016)](https://arxiv.org/abs/1605.06676). Even the cooperative nature of the game is not necessary, and we can look at large-scale games with many players and competing preferences [(Leibo et al, 2017)](https://arxiv.org/abs/1702.03037).

<!-- It is important, though, to be careful in measuring communication as our environments get more complex. One reason is that agents can be implicitly communicating through an action space (e.g. running towards a target instead of saying "I'm going towards the target"). Another issue, pointed out by [Lowe et al (2019)](), is that agents can essentially be just saying what they're doing without adding any extra information (e.g. "I am at the target" while the agent is visibly at the target) and certain metrics can mistake this for informative communication. Lowe et al suggest a simple way to measure informative communication is just to look at the reward: if the game with a communication channel allows agents to get higher reward than the exact same game without a communication channel, we can be certain that informative communication is occurring.[^4] -->

## Why Study Emergent Communication?

So now that we've described _what_ EC is, the question is _why_ we are doing it in the first place. There is no single answer, and the motivation depends on the exact research question studied, but it seems to me there are a few schools of thought that most papers fall into.
<!-- Here, I describe the main approaches I've noticed and arguments for and against each perspective. -->

### Modelling Human Language Emergence

Clearly, RL agents and environments cannot encompass all the complexity of human learning and the world we live in, but there is still value in simplified models. EC can be used as a model of "how-possibly" human language emerged. Using simplified versions of real scenarios, we can run empirical tests of whether certain pressures can cause fundamental parts of human language to emerge[^5]. A hot topic in recent years is how to learn a language that is [*compositional*](https://plato.stanford.edu/entries/compositionality/), with many EC papers looking at different hypotheses. Achieving compositionality would not just be an interesting scientific discovery but could also be essential to research in language understanding, specifically the problem of systematically generalizing outside the training distribution [(Lake and Baroni, 2018)](https://arxiv.org/abs/1711.00350). In this way, research in reproducing facets of human language can still be applicable to engineering and linguistics beyond pure scientific interest.

<!-- I can see criticism of this approach from two different perspectives: anthropologists and engineers. The anthropological criticism is that "how-possibly" is not a good enough model. This perspective would argue that research into human language should look at humans and their actual history, using methods tied closer to reality than modelling. The engineering criticism is completely different, and would claim that there is little practical use in reproducing human language (with some exceptions). This perspective is mostly about seeing AI research as building tools, not making contributions to anthropology. -->

### Learning Better Protocols

But EC does not necessarily need to have connections to human language; the basic setup is _computer agents_ communicating.
The modern world is already made up of networks of computers communicating using various protocols, from TCP/IP to Bluetooth. The difference is that many existing protocols have never been computationally optimized for their use cases; they are just rule sets, or _fixed_ protocols. This approach makes the bet that learned protocols will be more efficient, more resilient, and better suited to their tasks than fixed protocols. For example, learned protocols could implicitly account for the distribution of messages and allow for more efficient messages on average [(Kraska et al, 2018)](https://arxiv.org/abs/1712.01208). Even more exciting, protocols are learned by optimizing a loss function, which means that any specific requirements or needs can be incorporated into the loss function and optimized.

These ideas can be combined with the possibility of learning a protocol together with a policy. Self-driving cars are quickly becoming a possibility, but should coordination between them be limited to the same communication human drivers have? It seems reasonable that self-driving cars could also learn to communicate amongst themselves to better coordinate their actions: from warning cars far behind them of a crash to letting other drivers know their route. The main point is that communication protocols between machines don't need to be limited to the rule sets humans can come up with.

<!-- Criticism of this and similar approaches seems to focus on interpretability and practical applicability. For one, emergent protocols are inherently less interpretable than fixed, designed protocols and this can contribute to the guarantees we can make and the trust we have. TCP is guaranteed to send all packets until the receiver is satisfied but an emergent protocol may not. Even worse, it may not be clear whether it would or adversarial examples could cause it to fail.
-->

<!-- As for the superiority of learned vs fixed, a similar argument has already been made for index structures [(Kraska et al, 2018)](). Critics have argued that the added complexity of a neural network would eliminate any practical efficiency improvements. -->

### Modelling and Coordinating with Other Agents

One of the main rules of telling a good joke is knowing your audience. Similarly, good communication requires understanding and modelling the agents you are interacting with. A good sender must understand the difference between what they know themselves and what their receiver knows, then from that difference extract the most essential pieces of information the receiver should get in that moment. So the most effective communication requires opponent modelling as well as good contextual understanding. This is clear in games like Hanabi [(Bard et al, 2019)](https://arxiv.org/abs/1902.00506) that have implicit communication, but it is even more important for explicit strategic communication in competitive games. By using communicative games as a test bed, we can look to improve our communication by improving our opponent modelling.

### Bottom-Up Natural Language

The final view is something of a moonshot steeped in philosophy: EC as natural language understanding/generation. If we follow [Wittgenstein (1953)](https://en.wikipedia.org/wiki/Philosophical_Investigations), it isn't just our protocols that get their meaning from use in an environment but human natural language as well. We can consider regular language understanding to be "top-down": looking at text in context and trying to derive the meanings. In contrast, EC seeks to learn "bottom-up": if we learn a language in an environment resembling the real world, our emerged language could be equivalent to natural language. One idea is that if humans and agents are given similar environments, then an emergent language learned in the environment should be mappable to the language humans use in it.
If we manage to learn that mapping, then together with the emergent language we should have a system that understands words by grounding them in their true meaning: how they are used. This could be the right approach to language understanding and potentially more effective than current approaches that understand words by reading text.

<!-- The criticism here is also the idea: this is a moonshot. There are a couple assumptions both practical and philosophical and there are no guarantees that they will all pan out. -->

## Conclusion

I've introduced the basics of emergent communication and hopefully given a taste of some of the possible research directions. My explanations are not meant to be exhaustive but illustrative, to illuminate this nascent field full of interesting directions and promise. But many of the ideas here are not new, and it is important to look at previous work in fields such as signalling, game theory, and information theory. Look forward to the next blog post on my [website](https://mnoukhov.github.io/), which will delve into the past and some of the progress made so far!

If you've found any errors (even minor) or have commentary, please email me: `mnoukhov` at `gmail.com`. I would love to get feedback and improve this for others, so I appreciate all feedback! If you'd like to talk in person, then you can find me at NeurIPS this year or, even better, come to the [Workshop on Emergent Communication](https://sites.google.com/view/emecom2019)!

## Acknowledgements

Thank you to my friend Christine Xu for feedback and editing, and to my coauthor [Travis Lacroix](https://travislacroix.github.io/) and [my workshop co-organizers](https://sites.google.com/view/emecom2019#h.p_tzr6nLBshF3H) for read-throughs and support.

## References

Bard, Nolan, Jakob N. Foerster, Sarath Chandar, Neil Burch, Marc Lanctot, H. Francis Song, Emilio Parisotto, Vincent Dumoulin, Subhodeep Moitra, Edward Hughes, Iain Dunning, Shibl Mourad, Hugo Larochelle, Marc G. Bellemare, Michael Bowling.
"The Hanabi Challenge: A New Frontier for AI Research." Artificial Intelligence Nov 2019

Das, Abhishek, Satwik Kottur, José M. F. Moura, Stefan Lee, Dhruv Batra. "Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning." ICCV 2017

Foerster, Jakob N., Yannis M. Assael, Nando de Freitas and Shimon Whiteson. "Learning to Communicate with Deep Multi-Agent Reinforcement Learning." NIPS 2016

Havrylov, Serhii, Ivan Titov. "Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols." NIPS 2017

Kraska, Tim, Alex Beutel, Ed H. Chi, Jeffrey Dean and Neoklis Polyzotis. "The Case for Learned Index Structures." SIGMOD 2018

Lake, Brenden M., Marco Baroni. "Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks." ICML 2018

Lazaridou, Angeliki, Karl Moritz Hermann, Karl Tuyls and Stephen Clark. "Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input." ICLR 2018

Lazaridou, Angeliki, Alexander Peysakhovich and Marco Baroni. "Multi-Agent Cooperation and the Emergence of (Natural) Language." ICLR 2017

Leibo, Joel Z., Vinícius Flores Zambaldi, Marc Lanctot, Janusz Marecki and Thore Graepel. "Multi-agent Reinforcement Learning in Sequential Social Dilemmas." AAMAS 2017

Mordatch, Igor and Pieter Abbeel. "Emergence of Grounded Compositional Language in Multi-Agent Populations." AAAI 2018

Resnick, Cinjon, Wes Eldridge, David Ha, Denny Britz, Jakob Foerster, Julian Togelius, Kyunghyun Cho, Joan Bruna. "Pommerman: A Multi-Agent Playground." Arxiv 2018

Sutton, Richard S. and Andrew G. Barto. "Reinforcement Learning: An Introduction." 1998

Wittgenstein, Ludwig. "Philosophical Investigations." 1953
### Cite This

```
@misc{noukhovitch2019emergentblogwhy,
  author = {Michael Noukhovitch},
  title = {What is Emergent Communication and Why You Should Care},
  year = {2019}
}
```

### Footnotes

[^1]: This is meant to be a simplified description and overlooks some related fields and their history (e.g. signalling), but we will expand on those in future posts. Careful readers will notice that this description is also vague with regard to _where_ the communication takes place.

[^2]: There are many names for this type of game, from "signalling games" ([Skyrms, 2010](https://www.oxfordscholarship.com/view/10.1093/acprof:oso/9780199580828.001.0001/acprof-9780199580828)) to "referential games" ([Lazaridou et al, 2016](https://arxiv.org/abs/1612.07182)) and others.

[^3]: This mapping of objects in the environment to linguistic symbols is known as "grounding" in machine learning. For a good discussion on this term, see [Chris Manning's talk at VIGiL @ NeurIPS 2018](https://bluejeans.com/playback/s/jftkhICjhUnEbcglGD4qWWpHsvunBNISIZNdGdUo2AD7vD9nAq5aI2yXus70immP).

<!-- [^4]: This is true for _cooperative_ games but not so clear in competitive games. Shameless plug of [our paper]() for a discussion on communication vs manipulation (aka "cue-reading") -->

[^5]: This is bolstered by reinforcement learning being more than just a powerful search method and also having connections to biological learning from experience [(Sutton and Barto, 1998)](http://incompleteideas.net/book/the-book.html).
