
Case study 1 - the JCVI debate - what can we learn about online science communication?

Or “a case study in degenerating science discourse”

The last 48 hours of the JCVI debate form an interesting case study in the micro-dynamics of how things go wrong. We document this to spark ideas for building the information environment we need for effective scientific discourse and communication.

N.B. DISCLAIMER: it is not our focus whether the JCVI decision was right in substance. The point of this case study is to highlight salient features of online science discourse in its current form, in order to help us think about building better tools.

1. The problem with collective attribution

Quoted tweet:

[Screenshot of the quoted tweet not available]

Retweeted with this comment:

This tweet (one of the most viewed of the exchange) raises multiple issues regarding collectives:

  • In what sense does a group (in this case JCVI) “want” something (discussion point in minutes vs. decision criterion, etc.)?
  • How does one/should one deal with attributions across collectives (i.e., an individual tweets something like “iSAGE is doing/believes x, y, z”)?
  • How can this be managed among informal collectives, i.e. the kind of collective intelligence we might be envisioning?
  • When should we take scientists as speaking for themselves and when as (at least partly) speaking for others?
  • What rules/responsibilities/safeguards can one put in place in a context that will see extensive probes for conflicts of interest and/or bias?

2. The slide from actual words to attributed implication

(a) is a core feature of Twitter, e.g.,

(b) leading to:

and

and

  • Do initial speakers bear some responsibility for this?
  • Should this/could this have been foreseen?
  • Is it ever possible to avoid this on a platform that mixes academics and lay people in the same discourse, given different levels of training and sensitivity to language?
  • Is it fuelled by the focus on individual speakers? Brevity?

3. Chinese whispers/elision…

  • Which design features of Twitter foster this problem (e.g., character limits, hashtags)?
  • Could this be stopped via (automated) argument/content aggregation that pointed (and, in particular, pointed back) to other instances of the claim?
  • Is NLP good enough to do at least some of this? Is it desirable? (A rough sketch follows this list.)
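
To make the aggregation idea concrete: a minimal sketch, assuming the sentence-transformers library and a hypothetical stream of posts; the model name and similarity threshold are illustrative choices, not a tested design.

```python
# A rough sketch, NOT a tested design: flag when a new post closely
# restates an earlier claim, so the platform can point back to prior
# instances. Assumes the sentence-transformers library; the model name
# and threshold below are illustrative choices.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

seen_posts = []              # (text, embedding) for claims already posted
SIMILARITY_THRESHOLD = 0.75  # assumed cutoff; would need empirical tuning

def link_back(new_post: str) -> list[str]:
    """Return earlier posts that the new post appears to restate."""
    new_emb = model.encode(new_post, convert_to_tensor=True)
    matches = [
        text for text, emb in seen_posts
        if util.cos_sim(new_emb, emb).item() >= SIMILARITY_THRESHOLD
    ]
    seen_posts.append((new_post, new_emb))
    return matches
```

Whether embeddings of this kind are sensitive enough to the hedges and distinctions at issue here is exactly the open question; the sketch only shows that the mechanics are cheap.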

Other, related problems:

  • Taking statements out of context?
  • Will such issues persist even in the absence of bad-faith actors? Are there platform “solutions” (see linking/aggregation)?

4. Is it even humanly possible to remain neutral and calm?

This space is a placeholder for the hundreds of unbelievably abusive tweets a high-profile figure will receive.

Is it humanly possible/desirable to carry on in such an environment? This will also almost inevitably lead to overreaction somewhere down the road, which will be hard for observers to contextualise as everyone is “seeing” different versions of the conversation.

Abuse seems easy enough to fix on a platform with the right incentives and control over entry.

However, there are also interesting questions raised by this about the respective design features of Twitter vs. Reddit: should what you see in a debate be customizable (see also “muting”, “blocking”), i.e., specific to the individual? If yes, why? If no, why not?
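
A toy illustration of the question, using hypothetical Post/User structures rather than any real platform’s API: viewers with different mute/block lists see literally different versions of the same thread.

```python
# A toy illustration with hypothetical Post/User structures (not any
# real platform's API): per-viewer filtering forks the shared record.
from dataclasses import dataclass, field

@dataclass
class Post:
    author: str
    text: str

@dataclass
class User:
    name: str
    muted: set = field(default_factory=set)
    blocked: set = field(default_factory=set)

def visible_thread(thread: list[Post], viewer: User) -> list[Post]:
    """Drop posts whose authors the viewer has muted or blocked."""
    hidden = viewer.muted | viewer.blocked
    return [p for p in thread if p.author not in hidden]

thread = [Post("alice", "claim"), Post("bob", "abuse"), Post("carol", "reply")]
print([p.author for p in visible_thread(thread, User("dan", muted={"bob"}))])
# -> ['alice', 'carol']: dan's version of the debate omits bob entirely
```

Every mute decision forks the shared record, which is part of why observers later struggle to contextualise an overreaction: the provoking posts may simply be invisible to them.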

5. Layperson weighs in:

6. The amplification of academic aggression

Stylistically, this plays well on Twitter, but it does not add to the argument. It seems problematic that it will be rewarded (in particular by non-scientists). The problems of Twitter specifically as an “outrage machine” are well known and much discussed, but they don’t just concern claims; they also concern balance/visibility within argumentative threads.

More generally, this raises the question of the right incentives for promoting high-quality discourse, and of shielding academics from perverse incentives that undermine science (seen at its most extreme elsewhere in the COVID debates, where some academics have seemingly thrown away academic respectability in favour of “influencer” status…). It also raises the question of who should reward whom for what, if “likes” of some sort are to be a currency for shaping discourse.

Would more fine-grained “like” options be of use? (content, clarity, style, novelty, cogency…)
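
As a rough sketch of what this might look like as a data structure (the five dimensions are just the ones listed above; nothing here reflects an existing platform):

```python
# A rough sketch of fine-grained reactions as a data structure; the
# dimension names are the ones suggested above, purely illustrative.
from collections import Counter

DIMENSIONS = {"content", "clarity", "style", "novelty", "cogency"}

class Reactions:
    """Per-post reaction tallies, one counter per dimension."""
    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def react(self, dimension: str) -> None:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        self.counts[dimension] += 1

post = Reactions()
post.react("cogency")
post.react("cogency")
post.react("style")
print(post.counts)  # Counter({'cogency': 2, 'style': 1})
```

A feed could then rank on cogency or content while discounting style alone, i.e., reward the argument rather than the zinger.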

7. General ramping up

8. Friendly fire

vs.

9. But Twitter still, amidst all of this, offers timely, substantive argument, even in this debate.

So, what can we build that preserves that and loses the problems?


Some first, summarizing thoughts:

(from here)

The current JCVI minutes debate clearly illustrates the problems with Twitter and scientific debate:

  • meaning is glossed, hedges and distinctions are left behind, and claims about arguments are conflated with claims about people, giving way to ramped-up, emotive soundbites.

  • important nuance is lost through repeated transmission of messages via actors who do not understand the subtlety in the language, and actors who intentionally ignore it.

  • in no time, everyone is outraged, and discussion has degenerated into exchanges about “the other side”, away from the actual issues that we should be debating.

  • it’s not new, but it’s depressing every time, and when the stakes are so high, we really need something better.

So how can we build a platform that avoids this?

Some suggested ingredients:

  • social norms and content promotion that reward carefully worded material.
  • calling out (and sanctioning?) misrepresentation.
  • onsite training/support to help people appreciate the kinds of linguistic distinctions that matter to science.
  • algorithms for content aggregation and visualisation that help link connected pieces across the unfolding debate, both to promote accuracy and to undercut bad-faith “flooding the zone” (a sketch follows below).
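
One way to picture the aggregation ingredient: a minimal sketch, assuming the networkx library and a hypothetical same_claim() matcher (e.g., the embedding check sketched earlier), that links each post back to earlier instances of the claim it restates.

```python
# A minimal sketch of debate-level aggregation, assuming networkx and a
# hypothetical same_claim() matcher. Each post links back to earlier
# posts it restates, so restatement chains stay attached to their origin.
import networkx as nx

def build_claim_graph(posts, same_claim):
    g = nx.DiGraph()
    for i, post in enumerate(posts):
        g.add_node(i, text=post)
        for j in range(i):                    # compare to earlier posts only
            if same_claim(post, posts[j]):
                g.add_edge(i, j, relation="restates")
    return g

g = build_claim_graph(
    ["the minutes record X", "so they want X!", "everyone now claims X"],
    same_claim=lambda a, b: "X" in a and "X" in b,  # toy stand-in matcher
)
print(list(g.edges))  # [(1, 0), (2, 0), (2, 1)]
```

Dense clusters around a single origin would then make “flooding the zone” visible rather than effective.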

Discussion Points (in no particular order)

  1. One discourse for scientists and non-scientists, or two separate (but interrelated) platforms? (See, e.g., SciBeh’s original set of reddits, which had a scientist-only space and a scientist-public interaction space.)

  2. Could such spaces be built such that integration/separation is dynamic and can be controlled by the user? (i.e., I can switch between seeing just scientists and seeing scientist-public interaction.) This seems technically possible, but is it desirable?

  3. Should there be “entry requirements”? If yes, what?

  4. How (and by whom?) is gatekeeping handled?

  5. If we had a suitable ecosystem, would we still need Twitter, or need to engage on Twitter, given its relationship to other systems?

  6. Can we build desirable systems out of or on top of Twitter, or to interact with Twitter in other ways? (Particularly if the answer to question 5 is yes.)

  7. Twitter pandemic science dialogue mixes science and policy recommendations. Could/should these be separated? Is this a concern in times of normal science?

  8. Twitter mixes personal and scientific. Is this helpful/harmful?

  9. Tools for highlighting conflicts of interest/bias/etc? Good idea/bad idea?

  10. Algorithmic rewiring, content promotion: what kind of content promotion does science discourse need? What is possible? What do we know? N.B., there will always be an organizing principle for how material appears; there is no ‘neutral’ here.

  11. Organisation by arguer vs. organisation by argument. Twitter organises material by “arguer” and only indirectly (and only very loosely) by “argument” (via #). Reddit, because it has persistent structure, gives slightly more prominence to a claim, but does not amalgamate identical claims across sources. Do we want/need something radically different, i.e., discourse organized (or organizable) by “argument” not by “source”. Is current NLP good enough for this?