---
title: ResBaz AZ Annual Festival 2023 - Track A
description: "ResB[az]2's annual data science festival where researchers come together to level-up in digital research tools and skills."
image: https://github.com/resbazaz/resbazaz_promotional_materials/blob/master/resbazaz_logos/ResBazAZ_square.png?raw=true
robots: noindex, nofollow
---
## ResBaz AZ '23 - Track A
:::warning
### Welcome to ResBaz AZ 2023! :tada:
:page_facing_up: Use this doc to get info on sessions you want to attend, see notes on sessions you can't attend, collaborate, help archive our conference for future learners, and much more!
Edit this doc by clicking ":pencil2: Edit" at the top and typing in the black editing box. You can switch back to preview mode with :eye: and view this page's table of contents.
#### :video_camera: [**Zoom Link**](https://arizona.zoom.us/j/87241009518)
#### :train: [HackMD - Track B](https://hackmd.io/@hidyverse/ResBazAZ23_TrackB)
:::
# Monday
## Workshop: Intro to machine learning in R (HPC)
> Apr 17, 1:00-2:30 PM (PDT) [color=#244c7b]
> Chris Reidy
> Location: Zoom
::: info
Get a brief overview of machine learning, see the implementation of RStudio on HPC, and participate in hands-on exercises
:::
###### tags: `beginner`, `high performance computing`, `R`, `machine learning`
:::spoiler Session Notes
### :wrench: Getting Started
<!-- Add any session prerequisites here) -->
There will be hands-on exercises that assume no more than basic knowledge of RStudio. It would be helpful to install the VIM package beforehand with `install.packages("VIM")`.
### :goal_net: Learning Objectives
<!-- add learning objectives here -->
Gain takeaway knowledge of machine learning concepts, along with reusable exercises.
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** Chris Reidy; UArizona University Information Technology Services; chrisreidy@arizona.edu
- :hugging_face: **Helper:** Viviana Zapata
-
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
- Link to the github repo: https://github.com/resbazaz/festival2023/tree/main/introToMachineLearningInR
- On GitHub, go to **Code** and download the files
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
## Speaker Session: Distributed Computing (DISCOVER Platform)
> Apr 18, 3:00-3:30 PM (PDT) [color=#EA5A2A]
> [Morgan Vigil-Hayes](https://www.discoverccri.org/index.html)
> Location: Zoom
::: info
Learn about research infrastructure for distributed sensing and computing over sparse environments through the DISCOVER Research Infrastructure project hosted at NAU.
The DISCOVER cyberinfrastructure, funded through the NSF CISE Community Research Infrastructure Program, allows researchers to study how Internet of Things (IoT) devices and networks can be designed to work in technically challenging rural or remote areas.
:::
###### tags: `speaker`, `infrastructure`
:::spoiler Session Notes
### :card_index: Speaker Bio
<!-- Add speaker bio) -->
[Morgan Vigil-Hayes](https://www.canis-lab.com) is an assistant professor of computer science in the School of Informatics, Computing, and Cyber Systems at NAU.
Her research focuses on human-centered aspects of computer networking with the goal of innovating networked architectures, services, and policies for underserved communities.
She is a Co-PI on the DISCOVER CCRI Project.
### :spider_web: [DISCOVER Website](https://www.discoverccri.org/index.html)
### :computer: [DISCOVER Experimenter Portal](https://discover-dev.rc.nau.edu/)
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
1. What is the DISCOVER Platform?
2. DISCOVER Sites
3. DISCOVER Infrastructure
4. DISCOVER Experimenter Portal
-
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
## Workshop: Intro to text mining for humanists and social scientists
> Apr 17, 3:30-5:00 PM (PDT) [color=#244c7b]
> [Anuj Gupta](https://seed-radish-681.notion.site/UXR-Portfolio-2bbad02946c74a9384f708ab65c8b480)
> Location: Zoom
::: info
Text mining is a valuable tool for researchers in the humanities and social sciences. By using computer languages, like Python, to analyze large textual datasets, researchers can discover new patterns and insights that were previously hidden.
This virtual workshop is designed for advanced undergraduate and graduate students who want to **learn how to use text mining tools for their research**. The workshop will cover a range of techniques, including distant reading, close reading, and data visualization.
:::
###### tags: `beginner`, `python`, `text`
:::spoiler Session Notes
### :wrench: Getting Started
<!-- Add any session prerequisites here) -->
No prior experience with programming or text mining is necessary. Participants will need a computer, a good internet connection, and familiarity with using Zoom.
### :goal_net: Learning Objectives
<!-- add learning objectives here -->
By completing this workshop, you will:
**Goal 1:** EXPLORE iconic studies in Humanities and Social Sciences that have used text mining techniques
**Goal 2:** PRACTICE using computational methods in Python to analyse textual data with distant-reading (frequency distributions, dispersions) and close-reading (collocations, concordances) methods.
**Goal 3:** BRAINSTORM how some of those techniques could be applied to your own work
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** Anuj Gupta; UArizona English department; anujgupta@arizona.edu
- :hugging_face: **Helper:** Tina L. Johnson
- :hugging_face: **Helper:** Chen Chen
-
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
- Click link and let it load the computational notebook: https://mybinder.org/v2/gh/mettalrose/DS2F_Workshop_Text_Mining/HEAD?labpath=DS2F_draft.ipynb
- Sign in via Google Form: https://docs.google.com/forms/d/e/1FAIpQLSfEsN2evjhe9eThuvkG6lwd1pjNcCtBIpE2TPhpc2sPRXB9Tg/viewform?usp=sf_link
- Click on CC in Zoom to enable closed caption for yourself
- Download: a library called `nltk`, the Natural Language Toolkit (© 2022, NLTK Project), created by these amazing folks (https://www.nltk.org/team.html) to help us do text mining.
- Technique Type 1: Distant Reading (macro-analysis)
- Datasets:
    - https://www.kaggle.com/datasets
    - https://crow.corporaproject.org/
    - https://elicorpora.info/main
    - https://dataverse.harvard.edu/dataverse/gwu-libraries
    - https://catalog.docnow.io/
- Let Anuj know about today's session: https://forms.gle/NzVpUdCx8KqHauL36
- Heather Froehlich is the UAL resource; feel welcome to reach out to her with questions too: froehlich@arizona.edu
-
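The two technique families named in the learning objectives can be sketched without any downloads. This is a minimal illustration using only the Python standard library, not the workshop notebook's code and not the `nltk` API: the sample text and the helper names (`tokenize`, `frequency_distribution`, `concordance`) are made up for this example.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for illustration only.
SAMPLE = (
    "The river was wide. The river was slow. "
    "We crossed the river at dawn and followed the road."
)

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_distribution(tokens):
    """Distant reading: count how often each token occurs."""
    return Counter(tokens)

def concordance(tokens, keyword, width=2):
    """Close reading: list each keyword hit with `width` words of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + [tok.upper()] + right))
    return lines

tokens = tokenize(SAMPLE)
freq = frequency_distribution(tokens)
print(freq.most_common(3))
for line in concordance(tokens, "river"):
    print(line)
```

The frequency table gives the bird's-eye view of the whole text, while the concordance lines bring you back to individual passages for interpretation.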
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
# Tuesday
## Workshop: Intro to text analysis
> Apr 18, 9:00-10:00 AM (PDT) [color=#244c7b]
> [Rongbo Jin](https://www.rongbojin.com)
> Location: Zoom
::: info
This workshop will cover basic concepts and steps in text analysis for social scientists in `R`.
:::
###### tags: `beginner`, `R`, `text`
:::spoiler Session Notes
### :wrench: Getting Started
<!-- Add any session prerequisites here) -->
### :goal_net: Learning Objectives
<!-- add learning objectives here -->
Learners will leave with an understanding of:
- Procedures of text analyses in `R`
- Basic models
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** Rongbo Jin; UArizona School of Government and Public Policy; rongbojin@arizona.edu
- :hugging_face: **Helper:**
-
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
-
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
## Workshop: Making your first R package
> Apr 18, 10:00 AM - 12:00 PM (PDT) [color=#244c7b]
> [Eric Scott](https://www.ericrscott.com/)
> Location: Zoom
###### tags: `intermediate`, `R`, `packages`
:::info
In this workshop, build a functional, installable `R` package and learn the fundamentals of what is involved in taking your package to the next level by submitting it to a repository like CRAN.
:::
::: spoiler Session Notes
### :wrench: Getting Started
For this workshop you'll need:
* Some familiarity with writing code in `R` and using RStudio
* Recent versions of R and RStudio
* A (free) GitHub account
* Some familiarity with git and GitHub is useful (check out Heidi's [workshop](https://hackmd.io/6rbeXjRATjW_I209qNaL1A?view#Workshop-Refresher-on-version-control-with-git) on Monday)
### :goal_net: Learning Objectives
Learners will:
* Understand the basic components of an R package
* Understand what a unit test is and why it's important
* Learn to write a simple function in R
* Learn to write documentation for functions with roxygen2
* Learn to install a development package from GitHub
* Gain familiarity with submitting to package repositories, like CRAN
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** Eric Scott; UArizona Communications & Cyber Technologies Data Science; ericrscott@arizona.edu
- :hugging_face: **Helper:** Viviana Freire Zapata, 3rd-year PhD student in Environmental Science, working with omics data
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
[Notes](https://cct-datascience.quarto.pub/making-your-first-r-pkg-notes/)
1. [**Intro Slides**](https://cct-datascience.quarto.pub/making-your-first-r-pkg-slides/#/title-slide)
2. [**Coding notes**](https://cct-datascience.quarto.pub/making-your-first-r-pkg-notes/#convert-into-a-package)
3. [**Suggested book**](https://r-pkgs.org/)
4. **Project Setup**
5. **Git/GitHub Setup**
6. **Convert to a Package**
7. **Write a Function**
8. **Add Some Data**
9. **Write Tests and Automate**
10. **Share Your Package**
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
## Workshop: Using geospatial tools for big data processing in R + GDAL
> Apr 18, 1:00-3:00 PM (PDT) [color=#244c7b]
> Ivan Gonzalez
> Location: NAU - Du Bois South Union, Juniper room
###### tags: `intermediate`, `R`
Hi all! We need to install the programs and libraries in the following order: QGIS (OSGeo4W), R, Rtools, RStudio, and R libraries.
This will take a long time, so please consider installing them a day before.
Here is the folder with the executables: https://drive.google.com/drive/folders/1CLhh_Quiwt78hDk2b5AUW7ezkZhgp3Nk?usp=share_link
- QGIS: 01_QGIS-OSGeo4W-3.22.6-1.msi
- R: 02_R-4.2.0-win.exe
- Rtools: 03_rtools42-5253-5107.exe
- RStudio: 04_RStudio-2022.02.2-485.exe
- R libraries: 05_Install_libs_R.R
::: spoiler Session Notes
### :wrench: Getting Started
<!-- add workshop prerequisites -->
### :goal_net: Learning Objectives
<!-- add learning objectives here or other matierals as needed! -->
- Understand how GDAL/OGR work
- Compare R and GDAL/OGR performance
- Run practical cases for improving speed and file sizes
- Run GDAL external programs from R
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** Ivan Gonzalez
- :hugging_face: **Helper:**
-
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
## Workshop: GPU acceleration: General approaches to implementing GPUs in computation
> Apr 18, 3:00-4:00 PM (PDT) [color=#244c7b]
> Gil Speyer
> Location: ASU - Goldwater Center for Science and Engineering, room 487
###### tags: `intermediate`, `graphics processing`
:::info
GPUs can potentially offer order-of-magnitude scaling, but it is important to understand the kinds of computations that lend themselves to this.
This session will survey the various methods of exploiting GPUs in computing.
:::
::: spoiler Session Notes
### :wrench: Getting Started
<!-- add workshop prerequisites -->
It will be helpful for learners to have an understanding of programming concepts (no specific language required).
### :goal_net: Learning Objectives
<!-- add learning objectives here or other matierals as needed! -->
Learners will improve their general understanding of GPUs.
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** Gil Speyer; Arizona State University Research Technology Office; speyer@asu.edu
- :hugging_face: **Helper:**
-
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
## Workshop: HPC and GPU approaches with Matlab
> Apr 18, 4:00-5:00 PM (PDT) [color=#244c7b]
> Gil Speyer
> Location: ASU - Goldwater Center for Science and Engineering, room 487
###### tags: `intermediate`, `graphics processing`, `high performance computing`
:::info
This workshop will focus on approaches to porting MATLAB applications to a larger-scale or accelerated compute resource, such as a supercomputer or a GPU.
This is not an introductory MATLAB course; learners should have a general understanding of MATLAB programming.
:::
::: spoiler Session Notes
### :wrench: Getting Started
<!-- add workshop prerequisites -->
### :goal_net: Learning Objectives
<!-- add learning objectives here or other matierals as needed! -->
Learners will gain a better understanding of computational possibilities with MATLAB.
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** Gil Speyer; Arizona State University Research Technology Office; speyer@asu.edu
- :hugging_face: **Helper:**
-
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
-
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
## Social: Tucson Python Meetup :pizza:
> Apr 18, 6:00 PM - 8:00 PM (PDT) [color=#5992d2]
> Tucson Python Meetup community
> Location: UArizona Libraries CATalyst Studio, room 254
> Zoom link: https://roche.zoom.us/j/92517258720
> Passcode: 632128
###### tags: `python`, `social`
:::info
[Tucson Python Meetup @ ResBaz AZ](https://www.meetup.com/tucson-python-meetup/events/xrmtvsyfcgbxb/): Poppy Argus talks about "Web in Python with Flask"
:::
# Wednesday
## Speaker Session: Data Publication for Reproducible Research
> Apr 19, 9:00-10:00 AM (PDT) [color=#EA5A2A]
> [Fernando Rios, PhD](https://fernandorios.net)
> Location: Zoom
###### tags: `beginner`, `data publishing`
:::info
How do you publish data to make it findable and reusable? :mag_right:
Learn how to curate and prepare your data for publication in data repositories like ReDATA.
:::
:::spoiler Session Notes
### :card_index: Speaker Bio
<!-- Add speaker bio) -->
Fernando Rios is situated within the Libraries and provides data management and curation services for UA researchers.
His academic background is in physics, geography, and computational hydrogeology.
Slides: https://osf.io/d8kfh
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
- Tedersoo et al. 2021 [10.1038/s41597-021-00981-0](https://doi.org/10.1038/s41597-021-00981-0)
- https://www.fosteropenscience.eu/
- [re3data.org](https://re3data.org)
- https://redata.arizona.edu
- [Deposit Guidelines](https://osf.io/dyu7m/)
- [Deposit Checklist](https://osf.io/ad8jc/)
- [Sensitive Data](https://osf.io/4xa72/)
- [License selection matrix](https://osf.io/f57nz/)
- https://orcid.arizona.edu
- [Tidy Data principles](https://r4ds.had.co.nz/tidy-data.html)
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
## Workshop: Basic probability theory for statistics
> Apr 19, 10:00 AM - 12:00 PM (PDT) [color=#244c7b]
> [MD Nafis Ul Alam](https://orcid.org/0000-0003-3088-5279)
> Location: UArizona Libraries CATalyst Studio, Zoom
###### tags: `beginner`, `intermediate`, `statistics`, `R`
:::info
We will begin with basic concepts in probability theory such as random variables, expectation values and probability distributions.
We will then look at some properties of commonly used distributions like the Binomial, Poisson, Normal and Gamma. We will inspect the central limit theorem and observe how statistical tests compute p-values.
We will end with conditional probabilities and the basics of Bayesian estimation. We will practice some intuitive methods for parameter estimation using maximum likelihood and Markov Chain Monte Carlo.
:::
::: spoiler Session Notes
### :wrench: Getting Started
<!-- add workshop prerequisites -->
Slides: https://tinyurl.com/RezBaz23slides
Code: https://tinyurl.com/RezBaz23code
As we cover theory and concepts, we will follow along with examples and exercises in RStudio. Prior knowledge of R is not a prerequisite, but it will be helpful to have RStudio installed on your computer so you can execute the provided scripts, play around with the code and try solving practice problems.
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** MD Nafis Ul Alam; UArizona Plant Sciences, Arizona Genomics Institute; mdalam@arizona.edu
- :hugging_face: **Helper:** Qiuyu Jiang; UArizona Ecology and Evolutionary Biology; qiuyujiang@arizona.edu
### :goal_net: Learning Objectives
<!-- add learning objectives here or other matierals as needed! -->
Learners will:
- Build familiarity with commonly used probability distributions
- Learn the theory behind statistical tests and p-values
- Understand the idea behind sophisticated parameter estimation methods such as maximum likelihood and MCMC
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
#### Content headers
Section 1: Fundamental concepts
- Prelude to probability: the birthday problem
- Probability space and events
- Random variables, expectation values and linearity of expectation
    - Random variables take numerical values
    - Expectation value: the average value of the random variable, weighted by probability
    - Even if random variables are dependent on each other, linearity of expectation always holds
    - Resource: https://www.youtube.com/@rmcelreath (Statistical Rethinking)
- Probability distributions
    - Discrete random variables
    - Normal distribution: mean and standard deviation
    - Continuous distributions: `P(X = x) = 0`; probability is calculated across a range
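As a worked example for Section 1, here is a small Python sketch (the session itself used R) of the birthday problem and of linearity of expectation. It assumes 365 equally likely birthdays; the function names are made up for illustration.

```python
from math import comb

def p_shared_birthday(n, days=365):
    """Exact probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

def expected_shared_pairs(n, days=365):
    """Expected number of pairs sharing a birthday, via linearity of expectation.

    Each of the comb(n, 2) pairs matches with probability 1/days. The pair
    indicators are not mutually independent, yet their expectations still add.
    """
    return comb(n, 2) / days

print(round(p_shared_birthday(23), 4))
print(round(expected_shared_pairs(23), 4))
```

With 23 people the probability of a shared birthday is already just above one half, which is why this problem makes a good prelude: intuition about probability is often off.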
Section 2: Applications of probability distributions
- Distribution functions in R
    - `dfunc` (density); `pfunc` (cumulative probability); `qfunc` (quantile); `rfunc` (simulating random variables)
- Discrete distributions: Binomial and Poisson
    - Binomial
    - Poisson: expectation = variance = λ (the rate)
    - The Binomial converges to the Poisson when there are many trials, each with a small success probability; when a rate goes below 1, we can treat it as a probability. Poisson counts have no upper bound.
- Continuous distribution
    - Normal distribution
- The central limit theorem and standard error of the mean
- Continuous distributions: Normal, Gamma, Chi2, T, F
    - Gamma: the gamma function extends the factorial to fractions
    - Sum of squares: how spread out the data are; model comparison
    - T test: whether a particular mean is significantly different from the population mean
    - F test:
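For readers following along outside R, the sketch below shows rough Python standard-library counterparts of the four distribution-function roles for the normal distribution, then checks numerically how close a Binomial with many trials and a small success probability is to a Poisson with the same rate. The values of `n` and `p` are arbitrary choices for illustration, not numbers from the session.

```python
from math import comb, exp, factorial
from statistics import NormalDist

# Standard normal: Python counterparts of R's dnorm / pnorm / qnorm / rnorm.
z = NormalDist(mu=0.0, sigma=1.0)
density = z.pdf(0.0)            # like dnorm(0)
cumulative = z.cdf(1.96)        # like pnorm(1.96)
quantile = z.inv_cdf(0.975)     # like qnorm(0.975)
draws = z.samples(5, seed=1)    # like rnorm(5)

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Many trials, small per-trial probability, rate lam = n * p (illustrative values).
n, p = 1000, 0.003
lam = n * p
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))

print(density, cumulative, quantile)
print(f"largest pmf difference for k = 0..20: {max_gap:.6f}")
```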
Section 3: Parameter estimation
- Prelude to simulations: the three-door problem
- Conditional probabilities and Bayes’ theorem
- Maximum likelihood estimation
- Monte Carlo estimations and MCMC sampling
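A simulation is a quick way to check the answer to the three-door problem. The sketch below is in Python rather than the session's R. It assumes the standard Monty Hall rules (the host knows where the prize is and always opens a non-prize door that the contestant did not pick), and the function names are made up for illustration.

```python
import random

def play(switch, rng):
    """Play one three-door game; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the only door that is neither the first pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials=100_000, seed=0):
    """Monte Carlo estimate of the winning probability for one strategy."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

The estimates should land near 1/3 for staying and 2/3 for switching, matching what Bayes' theorem gives under the same assumptions.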
### :question: Session Questions
<!-- add good questions that came up during your session here -->
- fill in question here
:::
## Workshop: Reproducibility in Python: intro to testing
> Apr 19, 1:00-2:00 PM (PDT) [color=#244c7b]
> Ken Youens-Clark
> Location: Zoom
###### tags: `beginner`, `python`
:::info
TBA
:::
::: spoiler Session Notes
### :wrench: Getting Started
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** Ken Youens-Clark, MS; DNAnexus; kyclark@gmail.com
- :hugging_face: **Helper:**
-
### :goal_net: Learning Objectives
<!-- add learning objectives here or other matierals as needed! -->
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
-
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
## Workshop: Blender for cinematic visualization
> Apr 19, 2:00-3:00 PM (PDT) [color=#244c7b]
> [Devin Bayly](https://rtdatavis.github.io)
> Location: Zoom
###### tags: `beginner`, `3D animation`
:::info
This workshop will be an introduction to using Blender in the domain of scientific visualization.
We will cover bringing in CSV data and associating it with geometry that we render for high-quality visualizations.
Open to all learners!
:::
--> Please download Blender ahead of time: https://www.blender.org/download/
--> Then get the release files here: https://github.com/DevinBayly/blender_CSV_resbaz
::: spoiler Session Notes {state="open"}
### :wrench: Getting Started
<!-- add workshop prerequisites -->
Please make sure to install the latest version of Blender (3.5) from https://www.blender.org/download/
### :goal_net: Learning Objectives
<!-- add learning objectives here or other matierals as needed! -->
Participants will become familiar with Blender in the following ways:
- navigating the user interface and 3D viewport
- importing CSV data
- exporting rendered images
- basic tasks in the geometry nodes editor
- positioning cameras for still frame and animated renders
### :wave: Introductions
<!-- If you want, have your learners introduce themselves here so they get a chance to learn HackMD before starting -->
- :mega: **Instructor:** Devin Bayly; UArizona Libraries; baylyd@arizona.edu
- :hugging_face: **Helper:** Ben Kruse; Undergraduate CS/Math/**Allstar**
-
### :spiral_note_pad: Session Notes
<!-- Add your agenda bones here and encourage learners to fill in notes as you go -->
- step 1 placing points
- step 2 instancing
- step 3 adding color
- step 4 creating source and target
- step 5 making curve lines
- step 7 full lines
- step 8 sticks not lines
- step 9 applying to B
- step 10 adding camera control
### :question: Session Questions
<!-- add good questions that came up during your session here -->
-
:::
# Thursday
## [Event - separate registration] Women in Data Science - Tucson
> Apr 20, 8:30 AM - 5:00 PM (PDT) [color=#5992d2]
> UArizona Student Union Kachina Room
###### tags: `data science`, `speaker sessions`
:::info
[Women in Data Science (WiDS) Tucson](https://widstucson.org/) is an independent event organized by the University of Arizona to coincide with the Global WiDS Conference held at Stanford University and an estimated 200+ locations worldwide annually.
:::
## Social: Hacky Hour
> Apr 20, 4:00 PM - 7:00 PM (PDT) [color=#5992d2]
> [Snakes and Lattes Tempe](https://www.snakesandlattes.com/tempe), 20 West 6th St, **Tempe**, AZ
> [Mother Road Brewing Company](https://www.motherroadbeer.com/), 7 S Mikes Pike St, **Flagstaff**, AZ
> [Snakes & Lattes](https://www.snakesandlattes.com/tucson), 988 E University Blvd, **Tucson**, AZ
###### tags: `social`
:::info
Celebrate the end of ResBaz AZ 2023! :tada:
:::