Notes: GGG298 Discussion - Week 1

Paper: A preliminary review of influential works in data-driven discovery. Stalzer & Mentzel (2016)

Range of Topics Among People in GGG298

  • genomics
  • gene expression
  • endangered species
    • intellectual property
  • biodiversity
  • plant biology
  • livestock
  • microbiome
  • adaptive evolution
  • genetics
  • breed diversity
  • human disease
  • developmental biology
  • translational genetics
  • engineering & biomechanics
  • modeling

A bit about the paper

The authors used the term "data-driven discovery" and wanted to know what exactly it was or meant. They also wanted to know what tools everyone was using. To find out, they opened a $1.5M grant competition asking researchers what they believed data-driven discovery to be.

How biased was the call?

They completely missed the social sciences. Home department was not a good indication of what the researcher did. There were sets of domains within those who submitted, particularly in astronomy & biology.

The 8 Clusters

"The dataset" contained

Big data = any datasets you don't have the tools to handle

"New instruments are also showing that data-driven discovery is not just about the volume of data, but also the “velocity”."
The amount of sequencing data is outpacing the amount of available compute.

Domain Sciences: These disciplines, at the time, were the most challenged by big data.

  • Astronomy
  • Genomics

Methodologies

  • Foundational theory
    • Information Theory
    • Bayes' Theorem
    • The limits of extracting data
  • Classical statistical methods
  • Machine Learning
    • Can I look at a dataset and find interesting features?
    • Deep learning
      • wildly uncertain learning. Give no input and
  • GOOGLE
  • General tools
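
For reference, Bayes' Theorem from the foundational theory list above is usually stated as follows. The symbols are not from the discussion and are assumed here: H is a hypothesis and D is the observed data.

```latex
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
```

Read this way, the data D can only update beliefs about hypotheses that were considered in the first place.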

You can't infer things that you haven't sampled.
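
A minimal sketch of this point, using made-up toy data (the site names, species, and the `estimate_frequencies` helper are all hypothetical, not from the paper or the discussion). A survey that visits only some sites gives no estimate at all for a species that lives only at an unvisited site.

```python
from collections import Counter

# Hypothetical toy data: species observed at each site (invented for illustration).
SITES = {
    "site_a": ["finch", "finch", "sparrow"],
    "site_b": ["finch", "sparrow", "sparrow"],
    "site_c": ["owl", "owl", "finch"],  # never visited in the survey below
}


def estimate_frequencies(sites, visited):
    """Estimate species frequencies using only the visited sites."""
    counts = Counter()
    for name in visited:
        counts.update(sites[name])
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()}


# Survey only two of the three sites.
estimate = estimate_frequencies(SITES, visited=["site_a", "site_b"])

# Species present somewhere in the full data but absent from the estimate.
all_species = {s for observations in SITES.values() for s in observations}
missing = all_species - set(estimate)

print(estimate)
print("never sampled:", missing)
```

No amount of analysis on the two visited sites can recover the owl. The only fix is to sample differently.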

Two questions:

  1. Is the answer in the data to begin with?
  2. Do I have the tools, today, to work with this data?

Homework

Submit 3-4 sentences describing either a problem that you tackled with one of the techniques described in this paper, or a problem that you want to tackle.