mnoukhov
# What is Emergent Communication and Why You Should Care

For the past two years, I have been fascinated by the field of emergent communication; I even made it the topic of my master's thesis. This year, I am co-organizing the event that sparked my interest: [the Workshop on Emergent Communication at NeurIPS](https://sites.google.com/view/emecom2019). To give others a look into the field (and also to make my thesis writing more fun), I will be writing a series of blog posts on emergent communication that are meant to be accessible and interesting. I'm hoping to communicate some of the magic that has captivated me.

This is the first post in the series, and my goal is to explain the basic idea and give a couple of different exciting motivations for studying it.

Update: this post was also translated to [Russian](https://habr.com/ru/post/496830/) by [Anton Alexeyev](https://habr.com/en/users/alexeyev/).

## What is Emergent Communication?

Emergent Communication (EC) is the study of learning "communication protocols in order to share information that is needed to solve [some] task" [(Foerster et al, 2016)](https://arxiv.org/abs/1605.06676), between two or more agents in an environment[^1]. In its modern incarnation, EC sits within deep learning, specifically as a subfield of deep multi-agent reinforcement learning (MARL), but it also has ties to language.

The common setup has agents in an environment with the ability to send messages over some linguistic channel that other agents can see. Agents are usually initialized without any prior agreement on what that communication should look like, and they learn with RL to agree on a protocol. That protocol allows them to coordinate with each other and transfer information effectively.

This description is quite abstract, so to make it clear, let's look at one of the simplest setups.

### Sender-Receiver Games

__Sender-Receiver games__ are basic games between two players: a sender and a receiver[^2].
The sender has some information that the receiver doesn't know, and the goal of the game is for the sender to communicate that information and the receiver to understand it.

![Basic Sender Receiver Game](https://i.imgur.com/nfq5lvU.png)

For example, the sender can be given one of three shapes: triangle, square, pentagon. The receiver knows only that the answer is one of these shapes, not which one. We choose a vocabulary for the sender to use, and the sender then creates a message to send to the receiver. The receiver must use that message to guess which of the shapes was given to the sender, and then both agents get a reward based on whether the receiver's answer is correct. Through iterations of training, they can learn to agree on which messages correspond to which shapes. If, for example, a message is a single token from a vocabulary of three possible tokens, you can see how the two agents could learn to map each shape to a token[^3]: 0 = square, 1 = pentagon, 2 = triangle.

What is important to note here is that there is usually no single "correct" protocol that we are trying to learn; square could just as well be mapped to token 1 or 2, or even to a distribution over tokens that the receiver guesses with some probability. EC is not about supervised learning of a specific protocol but about learning _any_ protocol that will allow us to solve the game. If the game is such that we don't receive any reward for guessing square correctly, then we may not care about its mapping at all. In short, EC is not just about communicating _all_ the information, but about communicating information that's useful for that game. Since usefulness is defined by the game and the rewards, the key idea is that we __learn a communication protocol to optimize playing the game__, whatever it may be.

### More Complex Games

EC isn't restricted to such simple games, though.
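
As a reference point before moving on, the three-shape game described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the setup used in the cited papers: instead of deep RL it uses simple tabular reinforcement (successful choices get their weight increased, in the style of Roth-Erev learning from the signalling-games literature), and the names `train`, `sample`, and `greedy_protocol` are hypothetical helpers made up for this example.

```python
import random

SHAPES = ["triangle", "square", "pentagon"]
VOCAB_SIZE = 3  # a message is a single token out of three possible tokens


def sample(weights, rng):
    """Draw an index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights)[0]


def train(rounds=20000, seed=0):
    """Play the game repeatedly, reinforcing choices that earned a reward."""
    rng = random.Random(seed)
    # sender[shape][token] and receiver[token][shape] start uniform:
    # the agents have no prior agreement on what any token means.
    sender = [[1.0] * VOCAB_SIZE for _ in SHAPES]
    receiver = [[1.0] * len(SHAPES) for _ in range(VOCAB_SIZE)]
    successes = 0
    for _ in range(rounds):
        shape = rng.randrange(len(SHAPES))    # shape shown to the sender
        token = sample(sender[shape], rng)    # sender picks a message
        guess = sample(receiver[token], rng)  # receiver guesses a shape
        if guess == shape:                    # shared reward for a correct guess
            sender[shape][token] += 1.0
            receiver[token][guess] += 1.0
            successes += 1
    return sender, receiver, successes


def greedy_protocol(sender):
    """Most heavily weighted token for each shape under the learned sender."""
    return {
        SHAPES[s]: max(range(VOCAB_SIZE), key=lambda t: sender[s][t])
        for s in range(len(SHAPES))
    }


if __name__ == "__main__":
    sender, receiver, successes = train()
    print("correct guesses:", successes)
    print("protocol:", greedy_protocol(sender))
```

Running this with different seeds will typically produce different shape-to-token mappings, which is the point made above: nothing in the reward prefers one protocol over another. This simple learner can also occasionally settle on a protocol where two shapes share a token, so it should be read as a toy illustration rather than a reliable training method.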
A natural extension is using more complex data such as images [(Lazaridou et al, 2017)](https://arxiv.org/abs/1612.07182) or a complicated game environment [(Resnick et al, 2018)](https://www.pommerman.com/). Our messages can be variable length [(Havrylov et al, 2017)](https://arxiv.org/abs/1705.11192) and we can have multiple rounds of back-and-forth communication [(Das, Kottur et al, 2017)](https://arxiv.org/abs/1703.06585). We can make both agents able to send and receive and therefore look at 3 or more agents all talking and acting simultaneously [(Foerster et al, 2016)](https://arxiv.org/abs/1605.06676). Even the cooperative nature of the game is not necessary, and we can look at large-scale games with many players and competing preferences [(Leibo et al, 2017)](https://arxiv.org/abs/1702.03037).

<!-- It is important, though, to be careful in measuring communication as our environments get more complex. One reason is that agents can be implicitly communicating through an action space (e.g. running towards a target instead of saying "I'm going towards the target"). Another issue, pointed out by [Lowe et al (2019)](), is that agents can essentially be just saying what they're doing without adding any extra information (e.g. "I am at the target" while the agent is visibly at the target) and certain metrics can mistake this for informative communication. Lowe et al suggest a simple way to measure informative communication is just to look at the reward: if the game with a communication channel allows agents to get higher reward than the exact same game without a communication channel, we can be certain that informative communication is occurring.[^4] -->

## Why Study Emergent Communication?

Now that we've described _what_ EC is, the question is _why_ we are doing it in the first place. There is no single answer, and the motivation depends on the exact research question studied, but it seems to me there are a few schools of thought that most papers fall into.
<!-- Here, I describe the main approaches I've noticed and arguments for and against each perspective. -->

### Modelling Human Language Emergence

Clearly, RL agents and environments cannot encompass all the complexity of human learning and the world we live in, but there is still value in simplified models. EC can be used as a model of "how-possibly" human language emerged. Using simplified versions of real scenarios, we can empirically test whether certain pressures can cause fundamental parts of human language to emerge[^5]. A hot topic in recent years is how to learn a language that is [*compositional*](https://plato.stanford.edu/entries/compositionality/), with many EC papers looking at different hypotheses. Achieving compositionality would not just be an interesting scientific discovery but could also be essential to research in language understanding, and specifically to the problem of systematically generalizing outside the training distribution [(Lake and Baroni, 2018)](https://arxiv.org/abs/1711.00350). In this way, research in reproducing facets of human language can be applicable to engineering and linguistics, beyond its pure scientific interest.

<!-- I can see criticism of this approach from two different perspectives: anthropologists and engineers. The anthropological criticism is that "how-possibly" is not a good enough model. This perspective would argue that research into human language should look at humans and their actual history, using methods tied closer to reality than modelling. The engineering criticism is completely different, and would claim that there is little practical use in reproducing human language (with some exceptions). This perspective is mostly about seeing AI research as building tools not making contributions to anthropology. -->

### Learning Better Protocols

But EC does not necessarily need to have connections to human language; the basic setup is _computer agents_ communicating.
The modern world is already made up of networks of computers communicating using various protocols, from TCP/IP to Bluetooth. The difference is that many existing protocols have never been computationally optimized for their use cases; they are just rule sets, or _fixed_ protocols. This approach makes the bet that learned protocols will be more efficient, more resilient, and better suited to their tasks compared to fixed protocols. For example, learned protocols could implicitly account for the distribution of messages and allow for more efficient messages on average [(Kraska et al, 2018)](https://arxiv.org/abs/1712.01208). Even more exciting, protocols are learned by optimizing a loss function, which means that any specific requirements or needs can be incorporated into the loss function and optimized.

These ideas can be combined with the possibility of learning a protocol together with a policy. Self-driving cars are quickly becoming a possibility, but should coordination between them be limited to the same communication human drivers have? It seems reasonable that self-driving cars could also learn to communicate amongst themselves to better coordinate their actions: from warning cars far behind them of a crash to letting other drivers know their route. The main point is that communication protocols between machines don't need to be limited to the rule sets humans can come up with.

<!-- Criticism of this and similar approaches seems to focus on interpretability and practical applicability. For one, emergent protocols are inherently less interpretable than fixed, designed protocols and this can affect the guarantees we can make and the trust we have. TCP is guaranteed to send all packets until the receiver is satisfied but an emergent protocol may not. Even worse, it may not be clear whether it would or adversarial examples could cause it to fail.
-->

<!-- As for the superiority of learned vs fixed, a similar argument has already been made for index structures [(Kraska et al, 2018)](). Critics have argued that the added complexity of a neural network would eliminate any practical efficiency improvements. -->

### Modelling and Coordinating with Other Agents

One of the main rules of telling a good joke is knowing your audience. Similarly, good communication requires understanding and modelling the agents you are interacting with. A good sender must understand the difference between what they know themselves and what their receiver knows, then from that difference extract the most essential pieces of information the receiver should get in that moment. So the most effective communication requires opponent modelling as well as good contextual understanding. This is clear in games like Hanabi [(Bard et al, 2019)](https://arxiv.org/abs/1902.00506) that have implicit communication, but it is even more important for explicit strategic communication in competitive games. By using communicative games as a test bed, we can look to improve our communication by improving our opponent modelling.

### Bottom-Up Natural Language

The final view is something of a moonshot steeped in philosophy: EC as natural language understanding/generation. If we follow [Wittgenstein (1953)](https://en.wikipedia.org/wiki/Philosophical_Investigations), it isn't just our protocols that get their meaning from use in an environment but human natural language as well. We can consider regular language understanding to be "top-down": looking at text in context and trying to derive the meanings. In contrast, EC seeks to learn "bottom-up": if we learn a language in an environment resembling the real world, our emerged language could be equivalent to natural language. One idea is that if humans and agents are given similar environments, then an emergent language learned in the environment should be mappable to the language humans use in it.
If we manage to learn that mapping, then together with the emergent language we should have a system that understands words by grounding them in their true meaning: how they are used. This could be the right approach to language understanding and potentially more effective than current approaches that understand words by reading text.

<!-- The criticism here is also the idea: this is a moonshot. There are a couple assumptions both practical and philosophical and there are no guarantees that they will all pan out. -->

## Conclusion

I've introduced the basics of emergent communication and hopefully given a taste of some of the possible research directions. My explanations are not meant to be exhaustive but illustrative, to illuminate this nascent field full of interesting directions and promise. But many of the ideas here are not new, and it is important to look at previous work in fields such as signalling, game theory, and information theory. Look forward to the next blog post on my [website](https://mnoukhov.github.io/), which will delve into the past and some of the progress made so far!

If you've found any errors (even minor) or have commentary, please email me: `mnoukhov` at `gmail.com`. I would love to get feedback and improve this for others, so I appreciate all feedback! If you'd like to talk in person, then you can find me at NeurIPS this year or, even better, come to the [Workshop on Emergent Communication](https://sites.google.com/view/emecom2019)!

## Acknowledgements

Thank you to my friend Christine Xu for feedback and editing, and to my coauthor [Travis Lacroix](https://travislacroix.github.io/) and [my workshop co-organizers](https://sites.google.com/view/emecom2019#h.p_tzr6nLBshF3H) for read-throughs and support.

## References

Bard, Nolan, Jakob N. Foerster, Sarath Chandar, Neil Burch, Marc Lanctot, H. Francis Song, Emilio Parisotto, Vincent Dumoulin, Subhodeep Moitra, Edward Hughes, Iain Dunning, Shibl Mourad, Hugo Larochelle, Marc G. Bellemare, Michael Bowling.
"The Hanabi Challenge: A New Frontier for AI Research." Artificial Intelligence Nov 2019

Das, Abhishek, Satwik Kottur, José M. F. Moura, Stefan Lee, Dhruv Batra. "Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning." ICCV 2017

Foerster, Jakob N., Yannis M. Assael, Nando de Freitas and Shimon Whiteson. "Learning to Communicate with Deep Multi-Agent Reinforcement Learning." NIPS 2016

Havrylov, Serhii, Ivan Titov. "Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols." NIPS 2017

Kraska, Tim, Alex Beutel, Ed H. Chi, Jeffrey Dean and Neoklis Polyzotis. "The Case for Learned Index Structures." SIGMOD 2018

Lake, Brenden M., Marco Baroni. "Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks." ICML 2018

Lazaridou, Angeliki, Karl Moritz Hermann, Karl Tuyls and Stephen Clark. "Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input." ICLR 2018

Lazaridou, Angeliki, Alexander Peysakhovich and Marco Baroni. "Multi-Agent Cooperation and the Emergence of (Natural) Language." ICLR 2017

Leibo, Joel Z., Vinícius Flores Zambaldi, Marc Lanctot, Janusz Marecki and Thore Graepel. "Multi-agent Reinforcement Learning in Sequential Social Dilemmas." AAMAS 2017

Mordatch, Igor and Pieter Abbeel. "Emergence of Grounded Compositional Language in Multi-Agent Populations." AAAI 2018

Resnick, Cinjon, Wes Eldridge, David Ha, Denny Britz, Jakob Foerster, Julian Togelius, Kyunghyun Cho, Joan Bruna. "Pommerman: A Multi-Agent Playground." Arxiv 2018

Sutton, Richard S. and Andrew G. Barto. "Reinforcement Learning: An Introduction." 1998

Wittgenstein, Ludwig. "Philosophical Investigations." 1953
### Cite This

```
@misc{noukhovitch2019emergentblogwhy,
  author = {Michael Noukhovitch},
  title = {What is Emergent Communication and Why You Should Care},
  year = {2019}
}
```

### Footnotes

[^1]: This is meant to be a simplified description and overlooks some related fields and their history (e.g. signalling) but we will expand on those in future posts. Careful readers will notice that this description is also vague in regards to _where_ the communication takes place.

[^2]: There are many names for this type of game, from "signalling games" ([Skyrms, 2010](https://www.oxfordscholarship.com/view/10.1093/acprof:oso/9780199580828.001.0001/acprof-9780199580828)) to "referential games" ([Lazaridou et al, 2017](https://arxiv.org/abs/1612.07182)) and others.

[^3]: This mapping of objects in the environment to linguistic symbols is known as "grounding" in machine learning. For a good discussion on this term, see [Chris Manning's talk at VIGiL @ NeurIPS 2018](https://bluejeans.com/playback/s/jftkhICjhUnEbcglGD4qWWpHsvunBNISIZNdGdUo2AD7vD9nAq5aI2yXus70immP).

<!-- [^4]: This is true for _cooperative_ games but not so clear in competitive games. Shameless plug of [our paper]() for a discussion on communication vs manipulation (aka "cue-reading") -->

[^5]: This is bolstered by reinforcement learning being more than just a powerful search method and also having connections to biological learning from experience [(Sutton and Barto, 1998)](http://incompleteideas.net/book/the-book.html).
