Data Science Help Desk

Most of this content has been moved to https://osf.io/h9vqt/wiki/home/ and https://osf.io/h9vqt/wiki/Weekly Office Hours/

At this week's writing hour, all are invited to collaborate on envisioning and drafting a document that defines the data science help desk.

2020-11-06

Definition of the Data Science Help and Consulting Desk

Action Items

Phase 0: Get ready
  • Need short descriptive blurb of what we're doing (Enable science in CALS), for whom we're doing it, and how we're doing it
  • Name - is "help desk" the right term?
    • consulting desk?
    • data science consulting for researchers
    • Focused on computational and data-intensive (CDI) research
Phase 1: Start engaging and planning
  • (Ongoing) Engage with CALS data science ambassadors
  • (Ongoing) Office hours
    • Ensure that somebody from DIAG or a CALS Data Science Ambassador is at ResBaz Coffee&Code and/or Hacky Hour (roster/volunteer schedule)
    • Add Coffee&Code & Hacky Hour to Data Science Institute events calendar
    • Tweet every week about this
  • (Ongoing, regularly) Evaluate how the 'funnel' is working, and refine
  • (Ongoing) Keep track of time spent on different projects/'engagements'
    • Evaluate the relative effort different projects require, so that a few of them don't consume more of our resources than their benefits justify
    • Use data to justify more resources (We could help a larger number of promising projects)
  • (Ongoing) Knowledge base: Start curating answers to common problems
    • Idea: Can start with an OSF page, and eventually publish using something like ZenDesk
  • Plan Workshop/webinar for end of Q1 2021
  • CALS/Parker: Present plan
    • Define metrics
    • The future is uncertain: explain how we plan to adapt and learn
Phase 2: Commit and spread the word
Phase 3: Spread the word even more widely
Phase 4: Workshop
  • Present workshop/webinar (end of Q1 2021)
    • Use workshop as way to create new ongoing relationships, while spreading word further

Mission/Goals/Outcomes of the Help Desk

  • Mission: Enable science in CALS
  • Need short descriptive blurb of what we're doing

Metrics

  • Co-authorship and acknowledgments in publications
  • Data publications
  • Co-PI roles on grants
  • Grants that we helped write
  • Number of ad-hoc consultations
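
One possible way to support the time-tracking action item and the counts above is a small engagement log. The sketch below is illustrative only: the `Engagement` record, its fields, and the helper function names are hypothetical rather than an existing DIAG tool, and any entries fed to it would be our own logged data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Engagement:
    """One logged piece of help desk work (hypothetical record layout)."""
    project: str   # label for the project or research group helped
    kind: str      # e.g. "ad-hoc consultation" or "grant writing"
    hours: float   # time spent on this entry


def hours_by_project(log):
    """Sum the logged hours for each project."""
    totals = defaultdict(float)
    for entry in log:
        totals[entry.project] += entry.hours
    return dict(totals)


def effort_share(log):
    """Return each project's fraction of all logged hours."""
    totals = hours_by_project(log)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {project: hours / grand_total for project, hours in totals.items()}


def count_kind(log, kind):
    """Count log entries of one kind, e.g. ad-hoc consultations."""
    return sum(1 for entry in log if entry.kind == kind)
```

Reviewing `effort_share` regularly would show whether a few projects are taking a disproportionate share of time, and `count_kind` gives the consultation count directly.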

Expectations for acknowledgments etc

We offer many services. Shorter engagements typically don't warrant co-authorship; we will discuss this when defining a scope of work and revisit it as the work evolves. In these cases, we would appreciate it if you would acknowledge our help and cite our work, and let us help you publish (and potentially co-author) software and data as standalone research objects. It also helps if you let us and others know how we have enabled your research: what we made possible that you would not otherwise have been able to do, whether due to lack of knowledge, skills, time, or other resources.

We use the PLOS definitions of contributions to set expectations for authorship. However, developing software and providing data don't require co-authorship if you cite the software or data separately in your paper.

Target audience (CALS / AES specific)

  • Grad students
  • Postdocs
  • Faculty
  • USDA

Outreach

Copied these to Action Items
julianp-diag

Mechanisms

  • Mentoring & Apprenticeships
  • Workshops
  • Webinars
  • Grant writing support
  • Office hours aligned with ResBaz
  • Engage with CALS data science ambassador

RADICAL (Rutgers) - Mentions using GitHub to manage projects (manage expectations) and keeping track of effort

Roles and responsibilities from Computing Success for Scientists:

Roles and responsibilities: each role is not necessarily exclusive of another and an individual may act in multiple roles.

  1. Trainer: provide webinar, walkthrough, tutorial, and video solutions to address how to use available data with computing and visualizing resources; also provides training outreach
  2. SME: knowledgeable in domain-specific areas, including but not restricted to: data capture, data storage, data retention, data classification, data extraction, data transformation, computing workflows, data visualization, attribution, and domain-specific knowledge (such as agriculture, medicine, etc.).
  3. Technical: has knowledge of what a technical solution needs as it relates to scientific endeavors; part of teams developing technical solutions to ensure scientific needs and issues are met and appropriately addressed; translates scientific needs into technical requirements
  4. Outreach: provides articles and publications; works with groups to develop best practices; works to increase the profile of the organization; identifies new areas to investigate for potential further development

Background research

Make a list of user types and paths to engagement for Data Science Helpdesk #339

Find existing personas & pathways for DIAG & ResBaz

  • OSF (@emily)
  • GitHub (@emily)
  • OLS (@emily)
  • HackMD (@emily)
  • ResBaz Google Docs (@julianp-diag)
  • CALS mission & vision (@julianp-diag)
  • DIAG mission & vision (@julianp-diag)

Existing Resources

Reference: CALS Strategic Goals

Research

  • Build on existing strengths and identify strategic new investment areas to maximize research achievement.
  • Optimize CALS research infrastructure to support the CALS research mission.
  • Define and measure resource generation for research.
  • Expand communication on research activities.
  • Build tech transfer, IP development, external business relations/development.

Reference: DIAG Mission & Vision

Mission

Provide open software, data, and computing to enable productive and sustainable agriculture.
The mission of the UA ag data group is to provide scientists and engineers with open software, data, and computing that will allow more efficient discovery and invention so that we can engineer crops and manage sustainable agricultural landscapes that produce food, energy, and ecosystem services.

Vision

Faster and more collaborative agricultural science and engineering through shared software and data
Our software will be used and collaboratively developed by researchers at major land grant universities, global agricultural research centers, and industry. We will enable scientists to spend less time engineering bespoke pipelines and collecting redundant data so that they can spend more time developing algorithms, augmenting existing data with strategic data collection, and analyzing data.

Reference: UX@UA resources

Reference: ResBazAZ Personas

Mariko

Mariko is a first-year computer science PhD student who obtained an undergraduate degree in engineering in Japan. During her first semester, Mariko struggled to learn several programming concepts due to unclear and confusing instructions written in English, which is her second language. She doesn't feel comfortable asking her classmates for help either, because they are quite competitive. One day in her computer science course, there is an announcement for a hands-on Python meetup on functional programming, which is exactly what she is trying to study. She brings her computer to the meetup and follows along with the exercise. Since many people are asking questions, she feels comfortable asking questions as well, and receives some help with preparing her virtual environment. At the meetup, she also learns about ResBazAZ's PhTea and Hacky Hour events. Since then, she has attended PhTea weekly to work on her homework assignments, and feels safe asking for help when the English instructions aren't clear. Mariko feels much more confident in her learning when others can help her.

Peter

Peter is a 5th-year PhD student studying Spanish literature who has never programmed before, because he has always struggled with math and assumes programming will be quite hard. For his dissertation, Peter is studying Don Quixote. One day his advisor sends him an interesting link from the Stanford Digital Humanities Lab, about a project that uses a beautiful data visualization to show the moods associated with various cities in England in 19th-century British literature. Peter has been documenting the moods and themes associated with different Spanish cities in Don Quixote, but the process has been painful and slow. Peter also has no idea how to make a data visualization for his dissertation like the one in the Stanford paper. While looking for resources around campus, Peter finds the ResBazAZ Twitter page, which has several tweets every week inviting people to come get help with coding. He decides to give it a try and comes to Hacky Hour, where he is surprised to find that many people are interested in his project. With the help of other friendly researchers, Peter is able to search through Don Quixote more effectively and to visualize the data. After the visualization is done, he frequently goes to ShutUpAndWrite for focused writing time on his dissertation.

Rahim

Rahim is an MBA student in Eller who is interested in tech startups, specifically those dealing with AI. For a business class, Rahim is assigned to research workers in his field of interest, and he chooses to talk to AI researchers on campus, who would likely go to a tech startup. After searching around, Rahim hears about a Tucson Data Science meetup where the topic will be reinforcement learning. At the meeting, he sees code run in a Jupyter notebook for the first time and it looks really cool! He wants to try it as well. At the meetup, Rahim hears the ResBazAZ announcement and thinks that Hacky Hour would be a good place to interview the AI engineers, and perhaps also ask them how he can get started. After the meeting he asks if he can come to Hacky Hour for this, and receives a warm welcome and an invitation to bring his computer and start learning some hands-on deep learning! In the end, Rahim completes his assignment of researching AI engineers, and receives a lot of praise from his professor. Moreover, Rahim has a new interest in learning to program for himself, which he feels will help him be a better manager if he pursues a career in tech startups.
