Chia Shen Tsai

@deeper747

Joined on Nov 26, 2019

  • Speed ML V1 Authors: Jia-Shen Tsai, Wendy Wen, Zhengqi Jiao, Miaojun Pang, Alexander Yoshizumi. When run, the script creates the random forest model and results for the given dataset. rm(list=ls()) library(ggplot2) library(dplyr) library(randomForest) # devtools::install_github("MI2DataLab/randomForestExplainer")
  • ENVIRON710 Tutorial TOC Tutorials Tutorial 1: Introduction to R Notebook and R Markdown. Tutorial 2: Summary Statistics; working through the Bonus Example: a PhD student's approach to new code; Tutorial 2 Bonus Solution. Tutorial 3: Histogram. Tutorial 4: Importing and joining data
  • Tutorial 12: GLR For this lab on logistic regression, you will work through each of the chunks and respond to the questions. You should knit the lab to html (not Word). Please submit your knitted html and Rmd to Sakai by noon on November 21st. We are using data from the General Social Survey (GSS), a survey that occurs every few years and gives researchers and policymakers a better understanding of Americans' views on a variety of policy issues. Please read more about the GSS here: https://gss.norc.org/About-The-GSS. The central purpose of our modeling is to determine if there is a relationship between time spent in nature, access to nature, and beliefs about spending on the environment. Response Variable natenvir is a variable that measures the response to the question:
  • Tutorial 10: Simple Linear Regression This tutorial will walk you through the steps of running a simple linear regression (SLR) in R. I encourage you to read about Yale's Environmental Performance Index (EPI) here: https://epi.envirocenter.yale.edu/. Variables: GDP and Environmental Performance Index (EPI) For the tutorial we will explore the relationship between GDP per capita (GDPpc) and the Yale Environmental Performance Index. The Environmental Performance Index (EPI.new) is an index for each nation on its environmental performance, composed of numerous weighted measures. Read more about the data on the Yale website and the technical appendix on Sakai (Wolf et al., 2022).
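The SLR workflow previewed above can be sketched in a few lines of R. The simulated GDPpc/EPI.new values below are stand-ins, since the tutorial's actual EPI dataset lives on Sakai:

```r
# Minimal SLR sketch: simulated data in place of the tutorial's EPI file.
set.seed(42)
epi.df <- data.frame(GDPpc = runif(50, 1000, 60000))       # fake GDP per capita
epi.df$EPI.new <- 30 + 0.0005 * epi.df$GDPpc + rnorm(50, sd = 5)  # fake EPI scores

slr <- lm(EPI.new ~ GDPpc, data = epi.df)  # fit EPI on GDP per capita
summary(slr)$coefficients                  # estimate, SE, t value, p value
```

With real data, the only change is reading the EPI csv instead of simulating the data frame.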
  • Tutorial 9: Paired t-test library(moments) #package for skewness library(knitr) #package for making tables (kable) library(tidyverse) #multiple packages for data wrangling library(gt) # a package to make tables library(lubridate) # a package to manipulate dates rm(list=ls()) #removing objects airquality.df<-read_csv("airquality.csv") #reading data
  • Lab Assignment 2 rm(list = ls()) library(moments) #package for skewness library(knitr) #package for making tables (kable) library(tidyverse) #multiple packages for data wrangling library(gt) # a package to make tables library(lubridate) # a package to manipulate dates PEC <- read.csv('../input/PEC.csv') glimpse(PEC) PEC$year <- as.factor(PEC$year)
  • Tutorial 8: Extract values from Welch Output and Make into a Lovely Table This tutorial goes over how to extract test statistics from the Welch's t test results that R gives you. We will use the airquality.csv New Jersey January and July data. library(dplyr) library(readr) We will use the airquality.csv data again. Last time, I hope. airquality.df <- read_csv("~/🪅Master/04_Study/Fall 2023/ENV 710 TA/R for stats/input/airquality.csv") airquality.df <- airquality.df %>% #making airquality data frame
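The extraction idea in the Welch tutorial above can be sketched as follows; the `t.test()` result is a list, so each statistic can be pulled out by name (the ozone values here are made up, not the airquality.csv data):

```r
# Sketch: extracting statistics from Welch's t-test output for a table.
x <- c(5.1, 6.3, 5.8, 7.0, 6.1)    # stand-ins for January ozone values
y <- c(8.2, 7.9, 9.1, 8.5, 7.6)    # stand-ins for July ozone values

welch  <- t.test(x, y)             # Welch's t-test is R's default two-sample test
t.stat <- unname(welch$statistic)  # t value
df.est <- unname(welch$parameter)  # Welch-adjusted degrees of freedom
p.val  <- welch$p.value            # p value
round(c(t = t.stat, df = df.est, p = p.val), 3)
```

The named vector can then be passed to kable() or gt() to make the table the tutorial builds.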
  • Tutorial 6: Data Wrangling, the t-test and the Log Transformation Hi All! Welcome to Tutorial 6! You do NOT need to turn in anything as a part of this tutorial, but it is critical that you understand this tutorial to be successful in your Group Lab that is due on October 2nd. In this tutorial, we will work on what we call data wrangling (managing data). This is often a critical step before any analysis can occur. You will need to install a couple of new packages. Please do so in the console if you have not already. install.packages("tidyverse") # a package of packages used to manipulate data install.packages("gt") # we will try to make fancier tables today with the gt package! After you install these packages, please run the library chunk to make sure all of these packages are loaded. Don't load the package papeR today. You can read about tidyverse here: https://www.tidyverse.org/packages/. ggplot2 is part of tidyverse.
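The log-transformation step mentioned in the tutorial above can be sketched with simulated data (all values below are made up): right-skewed data often become closer to symmetric on the log scale, which suits the t-test better.

```r
# Sketch: a right-skewed variable before and after a log transformation.
set.seed(1)
raw    <- rlnorm(200, meanlog = 2, sdlog = 0.8)  # simulated skewed data
logged <- log(raw)                               # transformed values

mean(raw) > median(raw)       # TRUE: mean pulled right by the skew
round(mean(logged), 2)        # roughly the meanlog used to simulate
```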
  • Tutorial 2 Bonus Load your libraries library(wbstats) # a package that enables us to import data from the World Bank. library(moments) # allows us to calculate skewness and kurtosis library(dplyr) # a package that helps us wrangle/manage data library(tidyr) # a package that allows us to pivot the data Load your data wb_data <- # pull the country data down from the World Bank - five indicators wb_data(
  • Tutorial 3: Histogram Congratulations! You have made it to Tutorial 3! In this tutorial we will work on developing a histogram! We will need to install a few new packages. Be sure to do so in the console below. You should already have the other packages we need (and listed in the first chunk) installed and ready to load. ggplot2: This is a data visualization package that we will use throughout the semester. The ggplot2 package enables you to develop all sorts of graphs and visualizations including histograms, bar charts, and scatterplots. ggthemes: Provides settings to make visualizations consistent and attractive. library(wbstats) # a package that enables us to import data from the World Bank. library(ggplot2) # a data visualization package. library(ggthemes) # a package of themes for visualizations. themes are settings to make our visualizations consistent and attractive. library(moments) # allows us to calculate skewness and kurtosis
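The histogram the tutorial above builds can be sketched with ggplot2. The simulated values stand in for the World Bank indicator the tutorial pulls via wbstats:

```r
library(ggplot2)

# Sketch: a ggplot2 histogram on simulated, right-skewed values.
set.seed(7)
gdp.df <- data.frame(gdp_pc = rlnorm(500, meanlog = 9, sdlog = 1))

p <- ggplot(gdp.df, aes(x = gdp_pc)) +
  geom_histogram(bins = 30, fill = "steelblue", colour = "white") +
  labs(x = "GDP per capita (simulated)", y = "Count") +
  theme_minimal()
# print(p)  # draws the histogram in an interactive session
```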
  • Target Actor 2023 read all data from source rm(list = ls()) setwd('~/🪅Master/12_internship/2023 Global Climate Action report/Raw_data_formatted/data_2023/CDP Cities') library(tidyverse) x <- list.dirs("..")[grepl("clean data", list.dirs(".."), ignore.case = T)] get_dirs <- function(path){ # Takes a path to the folder containing the cleaned data folders and
  • rm(list = ls()) library(dplyr) library(tidyr) library(ggplot2) library(countrycode) ta2022 <- read.csv('target_actor2022.csv') ta2023 <- read.csv('target_actor2023.csv') str(ta2022) unique(ta2022$target_year)
  • Response Variable: deforestation.percent; Unit: %; Description: Percent of forest lost from 2010-2021 by country. Explanatory Variables
  • Introduction The central question Spotswood et al. investigated is the existence of an association between COVID-19 cases, nature accessibility, and other sociodemographic characteristics. They attempted to utilize ZIP-Code-scale data to examine whether the negative association between COVID-19 case rates and greenness shown with county-level data in the United States still holds at the finer geospatial level. The variables are all at the scale of ZIP Code Tabulation Areas (ZCTAs). The primary response variable is the number of COVID cases per 100,000 people between March and September 2020 (fetched on 1 October 2020). The explanatory variables include greenness, such as the Normalized Difference Vegetation Index (NDVI) and park access, and sociodemographic features, such as the proportion of White people and people of colour (POC), age, income, and population density. Notably, Spotswood et al. excluded rural areas from their analysis while we kept the rural data in our model. Furthermore, the "Urban" variable is the only binary variable in our dataset. We hypothesized that more green space and higher income would lead to lower COVID-19 case rates. In order to test this hypothesis, we used COVID-19 case rates as our response variable, and selected median income, proportion of white residents, NDVI, percent parks, median age, and aridity, as well as the dummy variable urban, as our explanatory variables. We selected Arizona, Florida, and Maine as the states on which to conduct our analysis. Summary statistics We chose Florida (n=912), Arizona (n=353), and Maine (n=376) as the analysis targets. The distributions of variables varied across states (see Table 1). Overall, Florida had the highest COVID-19 case rates (mean = 3064.34, median = 2611.34), Arizona the second highest (mean = 2645.94, median = 1968.03), and Maine the lowest (mean = 238.479, median = 149.115). Greenness also differs across states (means: AZ 0.32, FL 0.58, ME 0.78), but the small standard deviations (AZ 0.09, FL 0.10, ME 0.04) show that the variation within each state is relatively small.
  • title: "Corn and the Great Depression" output: html_notebook author: Ina Liao, Chia Shen Tsai For this example, you need to install the package "Sleuth3" (This is from the Statistical Sleuth textbook (one of my favorites, but no one else seems to love it.)) library(Sleuth3) library(ggplot2) We will use the data from ex0195. The data include annual rainfall in inches, Yield (average corn production in bushels per acre), and Year. The corn production was measured in six Midwest states (Go Midwest!).
  • library(ggplot2) library(dplyr) library(wbstats) library(r2symbols) rm(list=ls()) Check out the list of World Bank indicators here: https://data.worldbank.org/indicator?tab=all. To see information about an indicator (including ID), click on the indicator name and in the line graphic, click on the Details button on the upper right corner. You should see information about the indicator and the ID code. Select one variable of interest as your response variable and one variable as your explanatory variable. It is best if both of the variables are continuous.
  • [TOC] Code Setup the environment rm(list=ls()) options(scipen = 999) library(moments) library(knitr)
  • Agriculture and industry should be zoned separately; irrigation and drainage should be kept separate. Data section: illegal factory data, background knowledge, pollution map, static pollution data, potential pollution sources, pollution lookup system, pollution pathways
  • Preparation :::info The following questions must be answered before tools and points of attention can be decided ::: Speakers: Are they gathering in person in one place, or all participating online? Are they from domestic or foreign locations? Do they have slides to share?
  • tags: renewable energy, solar photovoltaics, fishery-solar co-location, farmland solar, environmental and social checks Intro Background information slido questions Session-by-session notes [5/5 Day 1]