Errors, warnings, issues in R programming

###### tags: `R` `error` `bash` `n` `glm()` `hoslem.test()` `complete.cases` `cat` `merge` `dplyr::filter` `dplyr::group_by_` `dplyr::summarize` `read.table(fill=TRUE)` `tryCatch()` `.libPaths()` `R.home()` `Sys.getenv()` `ifelse()` `unloadNamespace()` `is.na()` `dplyr::mutate()` `dplyr::case_when()` `dplyr::summarise_at()` `dplyr` `rlang` `attributes()` `foreach()` `%dopar%` `%:%` `%do%` `devtools::install_github()` `install.packages()`

# Errors, warnings, issues in R programming

---
**Issue**
Graph color does not match the legend color. `activity.type='Strength & Stability Workout'` is shown in gray, not pink. Using a different color palette doesn't make a difference.

![Screen Shot 12-02-23 at 11.17 AM](https://hackmd.io/_uploads/Bkz6BW_B6.png)

```r!
tail(activities.2023.11, n=3)
#    start.date.local moving.time.hour                activity.type
#23        2023-11-15         2.182778                         Ride
#24        2023-11-15         1.505278 Strength & Stability Workout
#25        2023-11-16         1.158056                          Run
```

---
**Issue**
`R files shown as 0 KB. R files reopened as empty in RStudio.`

**Solution**
I had a similar issue with older R files that opened as empty. It turned out that RStudio didn't use the correct encoding by default and therefore wasn't able to read the file (it presented the file as empty). You can make sure that you are using the correct encoding as follows:

0. Copy the R code to a text file
1. Open the file in RStudio as you normally would (the file will be empty)
2. Navigate to File -> Reopen with Encoding...
3. Select UTF-8 and click OK. This will clear the content in the R file
4. Copy the R code back to the R file

UTF-8 will most likely be the encoding you need. You can also choose to set this as the default for all source files.
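Outside RStudio, the encoding of a script can also be checked from the R console. The following is a minimal sketch, not part of the original answer: the file is a throwaway temp file created only for illustration, and Latin-1 is an assumed source encoding.

```r!
# Hypothetical file, created only for this illustration
demo.path <- tempfile(fileext = ".R")
# Write one line containing a Latin-1 encoded e-acute (byte 0xE9)
writeBin(c(charToRaw("x <- 'caf"), as.raw(0xE9), charToRaw("'\n")), demo.path)

# Read the raw lines and test whether their bytes are valid UTF-8
lines.raw <- readLines(demo.path, warn = FALSE)
validUTF8(lines.raw)    # FALSE here, because 0xE9 alone is not valid UTF-8

# Convert from the assumed source encoding (Latin-1) to UTF-8
lines.fixed <- iconv(lines.raw, from = "latin1", to = "UTF-8")
validUTF8(lines.fixed)  # TRUE after conversion
```

If `validUTF8()` already returns `TRUE` for every line, the encoding is probably not the cause and the file may really be empty.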
[.Rmd files open as completely empty](https://stackoverflow.com/questions/44298161/rmd-files-open-as-completely-empty)

---
**Error**
`Warning in install.packages : 'lib = "C:/Program Files/R/R-4.0.3/library"' is not writable Error in install.packages : unable to install packages`

**Solution**
Close RStudio/R, then run it as administrator.
["not writable" error when installing new packages on 3.2.0](https://github.com/RevolutionAnalytics/RRO/issues/144)

---
**Error**
devtools::install_github("DillonHammill/CytoExploreRData")
Downloading GitHub repo DillonHammill/CytoExploreRData@HEAD
Error in utils::download.file(url, path, method = method, quiet = quiet, : cannot open URL 'https://api.github.com/repos/DillonHammill/CytoExploreRData/tarball/HEAD'

**Solution**
In RStudio, go to Tools > Global Options > Packages and check the "Use secure download method for HTTP" box.
[having trouble getting devtools::install_github() to work in R on Win 7 64bit machine](https://stackoverflow.com/questions/53845962/having-trouble-getting-devtoolsinstall-github-to-work-in-r-on-win-7-64bit-ma)
[Secure Package Downloads for R](https://support.rstudio.com/hc/en-us/articles/206827897-Secure-Package-Downloads-for-R)

![Secure Package Downloads for R](https://i.imgur.com/zT7K9Gv.png)

```r!
# Before checking the Use secure download method for HTTP box in RStudio
devtools::install_github("DillonHammill/CytoExploreRData")
# Downloading GitHub repo DillonHammill/CytoExploreRData@HEAD
# Error in utils::download.file(url, path, method = method, quiet = quiet, :
#   cannot open URL 'https://api.github.com/repos/DillonHammill/CytoExploreRData/tarball/HEAD'

# After checking the Use secure download method for HTTP box in RStudio
devtools::install_github("DillonHammill/CytoExploreRData", lib=dir.R.packages)
# Downloading GitHub repo DillonHammill/CytoExploreRData@HEAD
# √ checking for file 'C:\Users\lunC\AppData\Local\Temp\Rtmp4erTUM\remotes2d6c53fb506c\DillonHammill-CytoExploreRData-488edf0/DESCRIPTION'
# - preparing 'CytoExploreRData': (1s)
# √ checking DESCRIPTION meta-information ...
# - checking for LF line-endings in source and make files and shell scripts
# - checking for empty or unneeded directories
# NB: this package now depends on R (>= 3.5.0)
# WARNING: Added dependency on R >= 3.5.0 because serialized objects in serialize/load version 3 cannot be read in older versions of R. File(s) containing such objects: 'CytoExploreRData/data/Activation.rda'
# WARNING: Added dependency on R >= 3.5.0 because serialized objects in serialize/load version 3 cannot be read in older versions of R. File(s) containing such objects: 'CytoExploreRData/data/Activation_gatingTemplate.rda'
# WARNING: Added dependency on R >= 3.5.0 because serialized objects in serialize/load version 3 cannot be read in older versions of R. File(s) containing such objects: 'CytoExploreRData/data/Compensation.rda'
# WARNING: Added dependency on R >= 3.5.0 because serialized objects in serialize/load version 3 cannot be read in older versions of R. File(s) containing such objects: 'CytoExploreRData/data/Compensation_gatingTemplate.rda'
# - building 'CytoExploreRData_1.0.3.tar.gz'
# * installing *source* package 'CytoExploreRData' ...
# ** using staged installation
# ** R
# ** data
# *** moving datasets to lazyload DB
# ** inst
# ** byte-compile and prepare package for lazy loading
# ** help
# *** installing help indices
# converting help for package 'CytoExploreRData'
# finding HTML links ... done
# Activation html
# Activation_gatingTemplate html
# Compensation html
# Compensation_gatingTemplate html
# CytoExploreRData html
# ** building package indices
# ** installing vignettes
# ** testing if installed package can be loaded from temporary location
# *** arch - i386
# *** arch - x64
# ** testing if installed package can be loaded from final location
# *** arch - i386
# *** arch - x64
# ** testing if installed package keeps a record of temporary installation path
# * DONE (CytoExploreRData)
```

---
**Error**
The procedure entry point EXTPTR_PTR could not be located in the dynamic link library

![](https://i.imgur.com/RExVKph.png)

**Solution**
Use an R version that is no older than the one the package was built under. For instance, to load the package ‘magick’ that was built under R version 4.0.2, use R version 4.0.2. Using a version older than R-4.0.2 could result in the error.

I experienced that the "Entry Point" error message (concerning "rlang.dll", "Rcpp.dll" or another .dll) occurs when using a lower R version in RStudio with packages of a higher build, i.e. using R 3.4 with packages built under 3.5, e.g. after running update.packages(..., checkBuilt = TRUE) on the user library of the lower R version 3.4. After getting rid of the packages of the higher build, or re-installing them under the lower build in the user library, no "Entry Point" error messages occur anymore when using the lower R version 3.4.
[Entry point Not Found #416](https://github.com/r-lib/rlang/issues/416)

```r!
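# Added sketch (an assumption, not from the original note): compare the running R
# version with the R version a package was built under. "stats" is used only because
# it ships with every R installation; substitute the package of interest.
R.version.string
built.under <- packageDescription("stats")$Built
built.under   # a string that starts with the R version used for the build
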
# Not working
dir.R.packages <- "C:/Program Files/R/R-4.0.0/library" #"C:/Program Files/R/R-3.6.2/library"
install.packages("magick",lib=dir.R.packages)
library(magick)
# Error: package or namespace load failed for ‘magick’ in inDL(x, as.logical(local), as.logical(now), ...):
#  unable to load shared object 'C:/Program Files/R/R-4.0.0/library/Rcpp/libs/x64/Rcpp.dll':
#  LoadLibrary failure: The specified procedure could not be found.
# In addition: Warning message:
# package ‘magick’ was built under R version 4.0.2

# Working
# RStudio > Tools > Global Options > General > R version > [64 bit] C:\Program Files\R\R-4.0.2
dir.R.packages <- "C:/Program Files/R/R-4.0.2/library" #"C:/Program Files/R/R-3.6.2/library"
install.packages("magick",lib=dir.R.packages)
library(magick)
# Linking to ImageMagick 6.9.9.14
# Enabled features: cairo, freetype, fftw, ghostscript, lcms, pango, rsvg, webp
# Disabled features: fontconfig, x11
```

---
**WARNING**: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding: https://cran.rstudio.com/bin/windows/Rtools/
Warning in install.packages : package ‘edgeR’ is not available (for R version 3.6.2)

**Solution**
* Installing Rtools35.exe resolved the warning [Building R for Windows](https://cran.r-project.org/bin/windows/Rtools/history.html)
* Install Bioconductor packages using `BiocManager::install(pkgs=)`

```r!
# Not working
install.packages("edgeR", lib="C:/Program Files/R/R-3.6.2/library")
# WARNING: Rtools is required to build R packages but is not currently installed.
# Please download and install the appropriate version of Rtools before proceeding: https://cran.rstudio.com/bin/windows/Rtools/
# Warning in install.packages :
#   package ‘edgeR’ is not available (for R version 3.6.2)

# Working
dir.R.packages <- "C:/Program Files/R/R-3.6.2/library"
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(pkgs = "Glimma", lib=dir.R.packages)
BiocManager::install(pkgs = "Mus.musculus", lib=dir.R.packages)
```

---
**Warning message:** `funs()` is deprecated as of dplyr 0.8.0. Please use a list of either functions or lambdas:
**Error**: `.vars` must be a character/numeric vector or a `vars()` object, not a `formula` object.

**Solution**
Specify the functions used to summarise or rename multiple columns as
`vars("column1","column2","column3"), ~functionName()`
`vars("column1","column2","column3"), list(~function1(), ~function2())`
[Warning message for dplyr package when trying to utilize the summarise_each function](https://community.rstudio.com/t/warning-message-for-dplyr-package-when-trying-to-utilize-the-summarise-each-function/55576/2)

```r!
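# Added sketch (an assumption, not part of the original note): in dplyr >= 1.0.0,
# across() supersedes the summarise_at() pattern. `demo.cores` is a hypothetical
# data frame used only for illustration.
library(dplyr)
demo.cores <- data.frame(
  patient     = c("p1", "p1", "p2"),
  TMA.core.ID = c("A1", NA, "B2"),
  specimenID  = c("s1", "s2", "s3"),
  stringsAsFactors = FALSE)
# Collapse the non-missing values of each selected column into one string per patient
demo.collapsed <- demo.cores %>%
  dplyr::group_by(patient) %>%
  dplyr::summarise(dplyr::across(
    dplyr::all_of(c("TMA.core.ID", "specimenID")),
    ~ paste(na.omit(.x), collapse = ",")))
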
# Deprecated as of dplyr 0.8.0 (fragment used inside a pipe; the data is piped in)
dplyr::summarise_at( .vars= c("TMA.core.ID","specimenID","shortID")
                    ,.funs= funs(paste(na.omit(.), collapse = ",")))

# Working with 1 function to pass to 3 columns
dplyr::summarise_at( vars(c("TMA.core.ID","specimenID","shortID"))
                    , ~ paste(na.omit(.), collapse = ","))

# Working code with 2 functions applied to every column whose name matches "delay"
library(tidyverse)
#> Warning: package 'forcats' was built under R version 3.6.3
library(nycflights13)
#> Warning: package 'nycflights13' was built under R version 3.6.3
flights %>%
  group_by(carrier) %>%
  summarise_at(vars(matches("delay")), list(~min(., na.rm = TRUE), ~max(., na.rm = TRUE)))
#> # A tibble: 1
```

---
**Error**: "Error in unserialize(socklist[[n]]) : error reading from connection"
In addition: Warning messages:
1: In .Internal(gc(verbose, reset, full)) closing unused connection 9 (<-QIMR18447.adqimr.ad.lan:11372)
2: In .Internal(gc(verbose, reset, full)) closing unused connection 8 (<-QIMR18447.adqimr.ad.lan:11372)
3: In .Internal(gc(verbose, reset, full)) closing unused connection 7 (<-QIMR18447.adqimr.ad.lan:11372)
4: In .Internal(gc(verbose, reset, full)) closing unused connection 6 (<-QIMR18447.adqimr.ad.lan:11372)
5: In .Internal(gc(verbose, reset, full)) closing unused connection 5 (<-QIMR18447.adqimr.ad.lan:11372)
6: In .Internal(gc(verbose, reset, full)) closing unused connection 4 (<-QIMR18447.adqimr.ad.lan:11372)
7: In .Internal(gc(verbose, reset, full))

---
**Error**: Error in foreach(i = 1:length(Exp08.categorised.predictors.list), .combine = "rbind", : "%:%" was passed an illegal right operand

**Solution**: Nest foreach loops correctly as shown in the code below.
[Outer loop variable in nested R foreach loop](https://stackoverflow.com/questions/9674530/outer-loop-variable-in-nested-r-foreach-loop)

```r!
library(foreach)
X <- c("A", "B")
Y <- 1:3

## (1) EITHER merge two 'foreach' objects using '%:%' ...
foreach (j = X, .combine = c) %:% foreach(i = Y, .combine = c) %do% {
    paste(j, i, sep = "")
}
# [1] "A1" "A2" "A3" "B1" "B2" "B3"

## (2) ... OR Nest two 'foreach' objects using a pair of '%do%' operators ...
foreach(j = X, .combine = c) %do% {
    foreach(i = Y, .combine = c) %do% {
        paste(j, i, sep = "")
    }
}
# [1] "A1" "A2" "A3" "B1" "B2" "B3"

## (3) ... BUT DON'T use a hybrid of the approaches.
foreach(j = X, .combine = c) %:% {
    foreach(i = Y, .combine = c) %do% {
        paste(j, i, sep = "")
    }
}
# Error in foreach(j = X, .combine = c) %:% { :
#   "%:%" was passed an illegal right operand
```

---
**Error**: `x` must be a vector, not a `data.frame/surv_categorize` object.

**Solution**
The object carries two classes (`data.frame` and `surv_categorize`), and this causes the error. Setting the class to `data.frame` alone gets rid of the error raised when base functions (e.g., nrow(), names()) run within the dplyr package.

```r!
# This generates the error `x` must be a vector, not a `data.frame/surv_categorize` object.
Exp26.predictors.cat.RFS.tumor <- dplyr::left_join(
   x= .Exp26.predictors.cat.RFS.tumor
  ,y= Exp26.survival.markers.tumor[,c(ID.columns, "RFS_years","RFS_censor", Exp26.covariates.RFS)]
  ,by=c( "row.numb"="row.numb"
        ,"RFS_years"="RFS_years"
        ,"RFS_censor"="RFS_censor")) %>%
  # Reorder columns
  dplyr::select_(.dots = c(ID.columns
                           ,"RFS_years","RFS_censor"
                           , sort(grep(pattern = "^number", x= names(.), value = TRUE))
                           , sort(grep(pattern = "^CD", x= names(.), value = TRUE)) ))
# Error: `x` must be a vector, not a `data.frame/surv_categorize` object.
# This is working
# Change the class attribute to "data.frame" only, rather than two classes ("data.frame", "surv_categorize")
attributes(.Exp26.predictors.cat.RFS.tumor)$class <- c("data.frame")

# Merge categorised predictors back with predictor value columns
## table x : dim(.Exp26.predictors.cat.RFS.tumor) 29 7
## table y : dim(Exp26.survival.markers.tumor[,c(ID.columns, "RFS_years","RFS_censor", Exp26.covariates.RFS)]) 29 10
Exp26.predictors.cat.RFS.tumor <- dplyr::left_join(
   x= .Exp26.predictors.cat.RFS.tumor
  ,y= Exp26.survival.markers.tumor[,c(ID.columns, "RFS_years","RFS_censor", Exp26.covariates.RFS)]
  ,by=c( "row.numb"="row.numb"
        ,"RFS_years"="RFS_years"
        ,"RFS_censor"="RFS_censor")) %>%
  # Reorder columns
  dplyr::select_(.dots = c(ID.columns
                           ,"RFS_years","RFS_censor"
                           , sort(grep(pattern = "^number", x= names(.), value = TRUE))
                           , sort(grep(pattern = "^CD", x= names(.), value = TRUE)) ))
# dim(Exp26.predictors.cat.RFS.tumor) 29 14
```

---
**Error**: package or namespace load failed for ‘dplyr’: .onLoad failed in loadNamespace() for 'pillar', details: call: loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) error: namespace ‘rlang’ 0.4.1 is already loaded, but >= 0.4.2 is required

**Solution**
The error occurred because the rlang package that was automatically loaded when starting R is older than the rlang version required by another package, dplyr. To get rid of this error, try
* Solution 1: unload the loaded package rlang and then reload it, or
* Solution 2: remove the rlang package and reinstall it

[Error: package or namespace load failed for ‘dplyr’](https://github.com/tidyverse/dplyr/issues/5214)

```r!
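# Added sketch (hypothetical helper, not part of the original note): report whether an
# already-loaded namespace meets a minimum version before loading a package that needs it.
loaded.version.ok <- function(pkg, minimum) {
  isNamespaceLoaded(pkg) &&
    package_version(unname(getNamespaceVersion(pkg))) >= package_version(minimum)
}
# Usage with a package that is always loaded; substitute "rlang" and the required version
loaded.version.ok("stats", "3.0.0")
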
# Error while loading dplyr package
R_3.6.2_packages_dir <- "/software/R/R-3.6.2/lib64/R/library"
library(dplyr, lib.loc= R_3.6.2_packages_dir)
#Error: package or namespace load failed for ‘dplyr’:
# .onLoad failed in loadNamespace() for 'pillar', details:
# call: loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]])
# error: namespace ‘rlang’ 0.4.1 is already loaded, but >= 0.4.2 is required

sessionInfo()
# R version 3.6.2 (2019-12-12)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: CentOS Linux 7 (Core)
#
# Matrix products: default
# BLAS/LAPACK: /software/OpenBLAS/OpenBLAS-0.3.3/lib/libopenblasp-r0.3.3.so
#
# locale:
# [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
# [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
# [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
# [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
# [9] LC_ADDRESS=C LC_TELEPHONE=C
# [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] RColorBrewer_1.1-2 magick_2.0
#
# loaded via a namespace (and not attached):
# [1] compiler_3.6.2 assertthat_0.2.1 R6_2.4.0 magrittr_1.5
# [5] pillar_1.4.3 glue_1.3.1 tibble_2.1.3 crayon_1.3.4
# [9] Rcpp_1.0.2 pkgconfig_2.0.2 rlang_0.4.1

# This is the version of rlang that has been loaded
packageVersion("rlang")
#[1] ‘0.4.1’

# This is the version of rlang that you will reload
packageVersion("rlang",lib.loc=R_3.6.2_packages_dir)
#[1] ‘0.4.5’

# Unload the loaded package rlang and load it again using latest version. This won't work
detach("package:rlang",unload=TRUE,character.only=TRUE)
# Error in detach("package:rlang", unload = TRUE, character.only = TRUE) :
# invalid 'name' argument

# Unload the loaded package rlang and load it again using latest version.
# On its own this fails as well, because rlang is still imported by other loaded packages
unloadNamespace("rlang")
# Error in unloadNamespace("rlang") :
# namespace ‘rlang’ is imported by ‘tibble’, ‘pillar’ so cannot be unloaded

#-----------------------------------
# Solution 1: unload the loaded rlang package and reload it
### Note that this may not work if packages that are dependent on rlang have been loaded
#-----------------------------------
unloadNamespace("tibble")
unloadNamespace("pillar")
unloadNamespace("rlang")
# Reload rlang package
library(rlang, lib.loc= R_3.6.2_packages_dir)
# Load dplyr package
library(dplyr, lib.loc= R_3.6.2_packages_dir)

#----------------------------------------------
# Solution 2: remove rlang and reinstall rlang in a newer version
#----------------------------------------------
# Deal with errors associated with rlang
## Error: package or namespace load failed for ‘Seurat’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]): namespace ‘rlang’ 0.4.7 is already loaded, but >= 0.4.9 is required
# Clear the work environment. Restart R
remove.packages("rlang", lib = dir.R.packages)
install.packages("rlang", lib = dir.R.packages)
# Reload rlang package
library(rlang, lib.loc= dir.R.packages)
```

---
#### `Error in get_oauth_sig() : OAuth has not been registered for this session`

---
#### `must be a double vector, not a primitive function Call `rlang::last_error()` to see a backtrace.`
The error arises because column c is not found in the data.frame, so `c` resolves to the base function `c()` instead.

```r!
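# Added sketch (hypothetical `demo.df`, not the author's data): when a data frame has
# no column named `c`, the symbol `c` falls through to the base function c(),
# which is a primitive function.
demo.df <- data.frame(a = c(3, 7, NA, 9))
"c" %in% names(demo.df)          # FALSE: there is no column c
is.primitive(with(demo.df, c))   # TRUE: `c` resolved to base::c
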
a = c(3,7,NA, 9)
b = c(2,NA,9,3)
f = c(5,2,5,6)
d = c(NA,3,4,NA)
mydf = data.frame(a=a,b=b,f=f,d=d)

# This code generates the error above because column c is not found in the dataframe
library(dplyr)
mydf <- mydf %>% dplyr::mutate(e= dplyr::case_when( is.na(a) ~ b
                                                   ,is.na(b) ~ d
                                                   , TRUE ~ c))

# This code is working
mydf[,5] <- ifelse(is.na(mydf[,1]) & !is.na(mydf[,2]), mydf[,2],
                   ifelse(is.na(mydf[,2]) & !is.na(mydf[,4]), mydf[,4], mydf[,3]))

# This code is working
mydf <- mydf %>% dplyr::mutate(e= dplyr::case_when( is.na(a) & !is.na(b) ~ b
                                                   ,is.na(b) & !is.na(d) ~ d
                                                   , TRUE ~ f))

# This code is working
a = c(3,7,NA, 9)
b = c(2,NA,9,3)
c = c(5,2,5,6)
d = c(NA,3,4,NA)
e <- c
e[which (is.na(mydf$a))] <- b [which (is.na(mydf$a))]
e[which (is.na(mydf$b))] <- d [which (is.na(mydf$b))]
mydf = data.frame(a=a,b=b,c=c,d=d, e=e)
```

#### `Error: package dplyr was installed by an R version with different internals; it needs to be reinstalled for use with this R version In addition: Warning message: package ‘dplyr’ was built under R version 3.5.1`
It depends on how you use your custom installed libraries. If you use them "stand-alone", i.e. without /software/R/version in your libPaths, then deleting that folder will break anything installed to your ~/R directory that depends on Rcpp (which could be a lot of things), so I wouldn't necessarily recommend that. The real root of the issue here is that there is no way to guarantee that software you've installed yourself in one location is compatible with software compiled and installed to a different location, but you can try one of the following:

1. As you suggested, just set .libPaths("/software/R/R-3.4.1/lib64/R") at the start of your session so you're only using the installed system libraries. This is the safest option.
2. Reverse the order of paths in libPaths, e.g. .libPaths(c(system path, custom path)), so that the system libraries are found first. Then at least the system libraries will work.
   If (and there's no guarantee) your custom libraries are compatible with the system libraries, then those will also work.
3. Xikun: if you `module load R/3.5.1`, then in the R script simply call library(package). Don't add a lib path, as this will result in a conflict.

#### `> detach("package:plyr", unload=TRUE) Error in detach("package:plyr", unload = TRUE) : invalid 'name' argument`
* [How to unload a package without restarting R](https://stackoverflow.com/questions/6979917/how-to-unload-a-package-without-restarting-r/24153255)

```r!
# detach() did not work
detach("package:plyr", unload=TRUE)
# Error in detach("package:plyr", unload = TRUE) : invalid 'name' argument

# unloadNamespace() worked
unloadNamespace("plyr")
```

---
#### `Error in ifelse(m$cohorts == "UKB" & m$traits.abb == "CI", "OA", ifelse(m$cohorts == : unused arguments ("OA, GA", "GA")`
This error occurs when an ifelse() call doesn't contain exactly 3 arguments. A simple check is to see that there are 2 commas in each ifelse() and that the number of ifelse() calls equals the number of closing brackets.
* [Unused argument error in nested ifelse statements](https://stackoverflow.com/questions/13303810/unused-argument-error-in-nested-ifelse-statements)

```r!
# Six nested ifelse() should have six ending brackets
rem <- 21%%7
day <- ifelse(rem==0, "Thursday",
         ifelse (rem==1, "Friday",
           ifelse (rem==2, "Saturday",
             ifelse (rem==3, "Sunday",
               ifelse (rem==4, "Monday",
                 ifelse(rem==5, "Tuesday", "Wednesday")
               )
             )
           )
         )
       )
```

---
#### `Error in filter_impl(.data, quo) : Evaluation error: `as_dictionary()` is defunct as of rlang 0.3.0. Please use `as_data_pronoun()` instead.`
The error occurred when running `dplyr::filter()`. It was gone after HPC Admin updated the package dplyr.

---
#### `Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : namespace ‘rlang’ 0.2.1 is already loaded, but >= 0.3.1 is required` occurred while running `library(dplyr)`
There are two problems: (1) `rlang` in the default path is an old version, and (2) the default path is a user-created folder that is not centrally controlled.
* [Remove a library from .libPaths() permanently without Rprofile.site](https://stackoverflow.com/questions/15217758/remove-a-library-from-libpaths-permanently-without-rprofile-site)
* [How to unload a package without restarting R](https://stackoverflow.com/questions/6979917/how-to-unload-a-package-without-restarting-r)
* [Delete Files and Directories](https://astrostatistics.psu.edu/su07/R/html/base/html/unlink.html)

```r!
# Loading the package dplyr resulted in an error
library(dplyr)
# Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :
#   namespace ‘rlang’ 0.2.1 is already loaded, but >= 0.3.1 is required

# In your RStudio session, unload the package that the error is based on (rlang here). Every library you've ever loaded is likely still in your session and conflicts will arise
detach("package:rlang", unload=TRUE)

# Configure R to *not* use your locally installed packages in /mnt/backedup/home/lunC/R/x86_64-pc-linux-gnu-library/3.4. If you need packages installed, let us know. We're happy to put them in centrally so you don't have conflicts between your personally installed and our centrally installed packages
## Use .libPaths() inside R to check current library paths
.libPaths()
# [1] "/mnt/backedup/home/lunC/R/x86_64-pc-linux-gnu-library/3.4" # This is a user-created folder
# [2] "/software/R/R-3.4.1/lib64/R/library" # This is a folder under HPC Admin's central control

## Identify which paths to keep.
## In my case, it kept R's original library but removed the link to my documents. Find the R home path using R.home() or Sys.getenv("R_HOME").
R.home()
# [1] "/software/R/R-3.4.1/lib64/R" # This is read every time R kernel starts. Therefore, any modification will be persistent to every run of R

## Keep the second lib path, by removing all lib paths but the second path
.libPaths(.libPaths()[2])
.libPaths()
# [1] "/software/R/R-3.4.1/lib64/R/library"

# Delete the rlang package folder under the user-created path
unlink("/mnt/backedup/home/lunC/R/x86_64-pc-linux-gnu-library/3.4/rlang", recursive = TRUE) # recursive=TRUE deletes a directory

# Try loading dplyr package again
library(dplyr)
# Attaching package: ‘dplyr’
# The following objects are masked from ‘package:stats’:
#     filter, lag
# The following objects are masked from ‘package:base’:
#     intersect, setdiff, setequal, union
```

---
#### Handle errors with `tryCatch()`
```r!
# Step 1: copy this
tryCatch({},error=function(e){cat("ERROR :",conditionMessage(e), "\n")})
# Step 2: add code that is likely to result in error to the {}
```

#### `Error in [[.default(group, 1) : subscript out of bounds`
It means you are trying to access an element that does not exist. For example, the column you are splitting has a varying number of `_` delimiters, but your search pattern was built without considering this.
* [Subscript out of bounds - general definition and solution?](https://stackoverflow.com/questions/15031338/subscript-out-of-bounds-general-definition-and-solution)

#### `mutate_each()` is deprecated. Use `mutate_all()`, `mutate_at()` or `mutate_if()` instead. To map `funs` over a selection of variables, use `mutate_at()`
* [Use mutate_at to change multiple column types](https://stackoverflow.com/questions/39279724/use-mutate-at-to-change-multiple-column-types)
* [Create new variables with mutate_at while keeping the original ones](https://stackoverflow.com/questions/45947787/create-new-variables-with-mutate-at-while-keeping-the-original-ones)

```r!
# A dataset to work on
PRS_everDrug1to10_CUD <- read.table(paste0(locPheno,"pheno2drugUseCUD-IDremapped_standardised-IDremapped-PRS-GSCAN.txt")
                                    ,header = T
                                    , sep=" "
                                    , stringsAsFactors = F
                                    , na.strings=c('.','NA'))
# The data structure:
str(PRS_everDrug1to10_CUD)
# 'data.frame': 2463 obs. of 76 variables:
#  $ FAMID       : chr "80020" "80020" "80021" "80030" ...
#  $ ID          : int 8002001 8002002 8002102 8003001 8003002 8003101 8003102 8003201 8003202 8003301 ...
#  $ FATHERID    : chr "8002003" "8002003" "8002103" "8003003" ...
#  $ MOTHERID    : chr "8002004" "8002004" "8002104" "8003004" ...
#  $ GENDER      : int 1 2 2 1 1 1 1 2 2 1 ...
#  $ famID       : int 80020 80020 80021 80030 80030 80031 80031 80032 80032 80033 ...
#  $ wave        : chr "NU2" "NU3" "NU3" "NU2" ...
#  $ ZYGOSITY    : int 6 6 1 4 4 2 2 1 1 6 ...
#  $ everDrug1   : int 0 0 NA 0 0 1 0 0 0 0 ...
#  $ everDrug2   : int 0 0 NA 0 0 1 0 0 0 0 ...
#  $ everDrug3   : int 0 0 NA 1 0 0 0 0 0 0 ...
#  $ everDrug4   : int 0 0 NA 0 1 0 0 0 0 0 ...
#  $ everDrug5   : int 0 0 NA 0 0 1 0 1 0 0 ...
#  $ everDrug6   : int 0 0 NA 0 0 0 0 0 0 0 ...
#  $ everDrug7   : int 0 0 NA 0 0 1 0 1 0 0 ...
#  $ everDrug8   : int 0 0 NA 0 0 1 0 0 0 0 ...
#  $ everDrug9   : int 0 0 NA 0 0 0 0 0 0 0 ...
#  $ everDrug10  : int 0 0 NA 0 0 0 0 0 0 0 ...
#  $ CUD         : int NA NA NA 3 NA 10 9 NA NA NA ...
#  $ age         : num 30.3 33.8 33.1 30.6 30.7 ...
#  $ nSEX        : int 1 2 2 1 1 1 1 2 2 1 ...
#  $ ageSq       : num 920 1141 1095 935 941 ...
#  $ sexAge      : num 30.3 67.5 66.2 30.6 30.7 ...
#  $ sexAgeSq    : num 920 2281 2190 935 941 ...
#  $ DOB         : chr "15JUL1980" "15JUL1980" "20OCT1979" "08MAR1980" ...
#  $ GSCAN.ai.S1 : num -0.515 -1.141 -1.313 -1.185 -1.134 ...
#  $ GSCAN.ai.S2 : num 0.325 -2.008 -0.627 0.366 -0.731 ...
#  $ GSCAN.ai.S3 : num 0.99 -0.252 -2.104 -1.147 -1.202 ...
#  $ GSCAN.ai.S4 : num 1.505 3.152 -0.597 -1.691 -0.385 ...
#  $ GSCAN.ai.S5 : num 0.668 2.307 -2.531 -1.427 -0.707 ...
#  $ GSCAN.ai.S6 : num 0.922 1.447 -1.837 -1.418 -1.357 ...
#  $ GSCAN.ai.S7 : num -0.249 0.978 -3.346 -2.074 -2.022 ...
#  $ GSCAN.ai.S8 : num -0.302 1.045 -3.165 -2.128 -1.888 ...
#  $ GSCAN.cpd.S1: num -0.9651 0.799 -0.1756 0.1013 -0.0447 ...
#  $ GSCAN.cpd.S2: num -0.6752 0.8222 0.5144 -0.0802 -0.379 ...
#  $ GSCAN.cpd.S3: num -0.676 0.218 -0.765 -0.141 -0.543 ...
#  $ GSCAN.cpd.S4: num -0.9386 3.2686 -0.577 0.0432 -0.2242 ...
#  $ GSCAN.cpd.S5: num -0.095 4.892 -0.495 0.124 -0.11 ...
#  $ GSCAN.cpd.S6: num 0.276 5.415 -0.456 0.039 -0.173 ...
#  $ GSCAN.cpd.S7: num 0.2396 4.6077 -0.336 0.3825 -0.0603 ...
#  $ GSCAN.cpd.S8: num 0.138 4.398 -0.294 0.308 -0.247 ...
#  $ GSCAN.dpw.S1: num 0.118 -0.474 -0.551 -2.615 0.184 ...
#  $ GSCAN.dpw.S2: num 0.201 0.875 0.913 -1.487 0.314 ...
#  $ GSCAN.dpw.S3: num -1.583 -0.446 1.746 -0.259 0.44 ...
#  $ GSCAN.dpw.S4: num -0.162 0.621 1.23 -1.076 0.286 ...
#  $ GSCAN.dpw.S5: num 1.334 1.184 -0.477 -1.833 0.188 ...
#  $ GSCAN.dpw.S6: num 1.535 0.335 -1.644 -1.583 0.319 ...
#  $ GSCAN.dpw.S7: num 1.69 0.174 -1.82 -1.467 0.221 ...
#  $ GSCAN.dpw.S8: num 1.7488 -0.0555 -2 -1.3446 0.1773 ...
#  $ GSCAN.sc.S1 : num -0.7893 -1.171 -2.5624 -0.077 -0.0439 ...
#  $ GSCAN.sc.S2 : num -0.1717 -1.4473 -2.1254 0.304 -0.0508 ...
#  $ GSCAN.sc.S3 : num -0.858 -1.781 3.678 0.811 0.543 ...
#  $ GSCAN.sc.S4 : num -0.953 0.537 1.975 1.286 0.857 ...
#  $ GSCAN.sc.S5 : num -0.642 1.786 0.866 1.275 1.037 ...
#  $ GSCAN.sc.S6 : num -0.243 2.205 1.777 0.956 0.71 ...
#  $ GSCAN.sc.S7 : num 0.166 2.894 1.907 0.728 0.263 ...
#  $ GSCAN.sc.S8 : num 0.126 2.791 1.814 0.843 0.334 ...
#  $ GSCAN.si.S1 : num -1.273 -1.833 0.667 0.166 -0.717 ...
#  $ GSCAN.si.S2 : num -1.541 -2.335 0.444 0.276 -0.193 ...
#  $ GSCAN.si.S3 : num -2.082 -2.688 -1.063 -0.764 -0.981 ...
#  $ GSCAN.si.S4 : num -2.4386 -2.2505 -1.3685 -1.0822 -0.0356 ...
#  $ GSCAN.si.S5 : num -1.7702 -2.1189 -3.2785 -1.4399 0.0188 ...
#  $ GSCAN.si.S6 : num -1.622 -3.822 -2.089 -1.53 -0.833 ...
#  $ GSCAN.si.S7 : num -1.09 -3.89 -1.89 -2.13 -1.27 ...
#  $ GSCAN.si.S8 : num -1.11 -4.07 -1.71 -2.12 -1.24 ...
#  $ PC1         : num -0.0125 -0.0124 -0.0128 -0.0129 -0.0132 -0.0122 -0.0122 -0.0132 -0.0132 -0.0126 ...
#  $ PC2         : num 0.0088 0.0092 0.0091 0.0087 0.0088 0.0082 0.0082 0.0087 0.0087 0.0079 ...
#  $ PC3         : num 0.0013 0.0027 0.0005 0.0006 0.0036 -0.0004 -0.0004 0.0052 0.0052 0.0021 ...
#  $ PC4         : num -0.0003 -0.0031 0.0005 -0.0021 -0.0015 -0.0079 -0.0079 -0.0012 -0.0012 -0.0005 ...
#  $ PC5         : num -0.0031 -0.0055 -0.004 -0.0051 -0.0011 -0.0011 -0.0011 -0.003 -0.003 -0.0016 ...
#  $ PC6         : num -0.0071 -0.0061 -0.0025 -0.0066 -0.0073 -0.012 -0.012 -0.0088 -0.0088 -0.0063 ...
#  $ PC7         : num 0.0009 0.0009 -0.0004 0.0024 -0.0004 0.0029 0.0029 -0.0021 -0.0021 -0.0009 ...
#  $ PC8         : num -0.0035 -0.0018 -0.0067 -0.007 -0.0032 -0.0004 -0.0004 0.0018 0.0018 -0.0004 ...
#  $ PC9         : num -0.0005 -0.004 0.0012 -0.0055 -0.004 0.0009 0.0009 -0.0054 -0.0054 0.0009 ...
#  $ PC10        : num -0.0018 -0.0015 0.0019 0.0004 -0.0011 0.0024 0.0024 0.0018 0.0018 0.0038 ...
#  $ impCov      : int 0 0 0 0 0 0 0 0 0 0 ...

# Multiply every column that starts with GSCAN with nSEX, creating 40 new columns for the multiplication
library(dplyr)
# Multiply every column that starts with GSCAN (PRS columns) with nSEX and suffix newly created columns with nSEX
# To change the suffix "_nSEX" to prefix "sex_", use rename_at()
# funs() is deprecated as of dplyr 0.8.0, so a named list of lambdas is used here instead
PRS_everDrug1to10_CUD2 <- PRS_everDrug1to10_CUD %>%
  mutate_at(vars(starts_with("GSCAN")),.funs= list(nSEX= ~ .*nSEX)) %>%
  rename_at(vars(contains("_nSEX")), ~ paste("sex", gsub("_nSEX", "", .), sep = "_"))

# Compare an old column with a new column
head(PRS_everDrug1to10_CUD2$GSCAN.ai.S1)
# [1] -0.5150501 -1.1410706 -1.3126117 -1.1851313 -1.1344985 -0.6225980
head(PRS_everDrug1to10_CUD2$sex_GSCAN.ai.S1)
# [1] -0.5150501 -2.2821412 -2.6252234 -1.1851313 -1.1344985 -0.6225980
```

#### `Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 194 did not have 14 elements`
`read.table` wants to return a data.frame, which must have an element in each column. Therefore R expects each row to have the same number of elements, and it doesn't fill in empty spaces by default.
Try `read.table("/PathTo/file.csv" , fill = TRUE )` to fill in the blanks.
* [Confusing error in R: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 did not have 42 elements) [duplicate]](https://stackoverflow.com/questions/19455070/confusing-error-in-r-error-in-scanfile-what-nmax-sep-dec-quote-skip-nli)

#### `Error in full_join_impl(x, y, by$x, by$y, suffix$x, suffix$y, check_na_matches(na_matches)) : std::bad_alloc` means you are running out of RAM
* [Using swap in R](https://stackoverflow.com/questions/44607641/using-swap-in-r)

#### `Error: This function should not be called directly` or `Error in n() : This function should not be called directly`
Both dplyr and plyr have the functions summarise/summarize. The error is caused by this name conflict: for example, when plyr is loaded after dplyr, a bare `summarise()` resolves to plyr's version, in which `n()` cannot be used.
* [dplyr: “Error in n(): function should not be called directly”](https://stackoverflow.com/questions/22801153/dplyr-error-in-n-function-should-not-be-called-directly)

```r!
# Bad
data %>%
  filter(pvalue2sided < signi_threshod) %>%
  group_by_(.dots=c("name_fixEffect_trait","phenotype")) %>%
  summarise(count=n())

# Good
data %>%
  filter(pvalue2sided < signi_threshod) %>%
  group_by_(.dots=c("name_fixEffect_trait","phenotype")) %>%
  dplyr::summarise(count=n())
```

#### `Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column`
* [Merge r brings error “'by' must specify uniquely valid columns”](https://stackoverflow.com/questions/29671543/merge-r-brings-error-by-must-specify-uniquely-valid-columns)

```r!
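# Added sketch with hypothetical toy tables (not the author's data): by.x and by.y
# take column names as character strings; unquoted names are looked up as R objects.
demo.left  <- data.frame(variable = c("v1", "v2", "v3"), beta = c(0.1, 0.2, 0.3))
demo.right <- data.frame(var.name = c("v1", "v2"), label = c("first", "second"))
demo.merged <- merge(demo.left, demo.right, by.x = "variable", by.y = "var.name", all.x = TRUE)
demo.merged$label   # the unmatched row v3 gets NA
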
# Wrong: unquoted column names are looked up as R objects
merge(baseGscan_joined, binary.var.names.labels, by.x = variable, by.y = var.name, all.x = TRUE)

# Correct: column names are passed as character strings
merge(baseGscan_joined, binary.var.names.labels, by.x = "variable", by.y = "var.name", all.x = TRUE)
```

#### `Warning messages: In cor(variable1, variable2, method = correlation_method, use = "complete.obs") : the standard deviation is zero`
`cor()` gives this warning and returns NA when one of the two variables is constant across the observations it uses.
* [Remove rows with all or some NAs (missing values) in data.frame](https://stackoverflow.com/questions/4862178/remove-rows-with-all-or-some-nas-missing-values-in-data-frame)
```r!
# Subset non-missing values from both variable1 and variable2 before calculating the correlation between them
# Calculate phenotypic correlation or correlation between any 2 variables
for (i in seq_along(input.data)){
  for (j in seq_along(input.data)){
    # Select observations where both variable 1 and variable 2 are non-missing
    input.data.complete.cases <- input.data[complete.cases(input.data[,c(i,j)]),]
    # Get variable 1
    variable1 <- input.data.complete.cases[,i]
    # Get variable 2
    variable2 <- input.data.complete.cases[,j]
    # Compute correlation
    correlation <- cor(variable1, variable2, method=correlation.method, use = "complete.obs")
    # Print the result on the R console
    cat(correlation.method, "correlation between ", colnames(input.data)[i], " and ", colnames(input.data)[j], " is", correlation, "\n")
    # phenotypic.correlation.matrix must be created before the loop
    phenotypic.correlation.matrix[i,j] <- correlation
  }
}
```

#### `ERROR converting summary statistics` in LD score regression is possibly caused by an empty column
* [Error converting summary statistics](https://github.com/bulik/ldsc/issues/66)

#### `Error in model.frame.default(formula = cbind(y0 = 1 - y, y1 = y) ~ cutyhat) : variable lengths differ (found for 'cutyhat')`
```r!
# hoslem.test() needs observed and fitted values of equal length.
# Possible cause (an assumption, not confirmed in these notes): glm() drops every row with an NA
# in any model variable, so na.omit() on the outcome alone can leave more values than fitted() returns.
# mod_cann_pyos$y holds the outcome values the model actually used.
hoslem.test(na.omit(joined_rm2_ID2rm[,"X20453_0_0_recoded"]), fitted(mod_cann_pyos), g=10)
```

#### ``Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels``
This error occurs when a variable in the model has no variation, that is, only one value remains among the rows used for fitting (after rows with missing values are dropped). Drop that variable from the formula, whether it is a numeric, character, or factor variable.
* [Error in contrasts when defining a linear model in R](https://stackoverflow.com/questions/18171246/error-in-contrasts-when-defining-a-linear-model-in-r)
```r!
# Logistic regression: Effect of exposure trait on outcome trait
## Set up covariates
formula_covariates="age + TDI + overall_health_rating + factor(inferred.sex) + factor(qualification_edu_6138)+factor(X20116_recodeFinal)"

## Set up dependent and independent variables
y=grep("X20453_0_0_recoded",names(joined_rm2_ID2rm),value = T)
x1=grep("all_coffee_cpd",names(joined_rm2_ID2rm),value = T)
x2=grep("complete_alcohol_unitsweekly",names(joined_rm2_ID2rm),value = T)
x3=grep("X3436_recodeMean",names(joined_rm2_ID2rm),value = T)

## Set up formula
formual_cann_coff <- paste0(y," ~ ", x1," +", formula_covariates)
formual_cann_sdpw <- paste0(y," ~ ", x2," +", formula_covariates)
formual_cann_aass <- paste0(y," ~ ", x3," +", formula_covariates)

# Run logistic regression (models that are working)
mod_cann_coff <- glm(formula= formual_cann_coff,data=joined_rm2_ID2rm,family=binomial(link = "logit"))
mod_cann_spdw <- glm(formula= formual_cann_sdpw,data=joined_rm2_ID2rm,family=binomial(link = "logit"))

summary(mod_cann_coff)
# Coefficients:
#                                  Estimate Std. Error z value Pr(>|z|)
# (Intercept)                      6.330231   0.103336  61.259  < 2e-16 ***
# all_coffee_cpd                  -0.029542   0.005432  -5.438 5.39e-08 ***
# age                             -0.106128   0.001625 -65.329  < 2e-16 ***
# TDI                              0.117505   0.003944  29.796  < 2e-16 ***
# overall_health_rating           -0.022334   0.016544  -1.350    0.177
# factor(inferred.sex)1           -0.357984   0.023277 -15.379  < 2e-16 ***
# factor(qualification_edu_6138)2 -0.555283   0.032778 -16.941  < 2e-16 ***
# factor(qualification_edu_6138)3 -1.114870   0.030114 -37.022  < 2e-16 ***
# factor(qualification_edu_6138)4 -1.616490   0.062381 -25.913  < 2e-16 ***
# factor(qualification_edu_6138)5 -1.162332   0.051091 -22.750  < 2e-16 ***
# factor(qualification_edu_6138)6 -0.943528   0.056040 -16.837  < 2e-16 ***
# factor(X20116_recodeFinal)2      0.328791   0.030763  10.688  < 2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

summary(mod_cann_spdw)
# Coefficients:
#                                   Estimate Std. Error z value Pr(>|z|)
# (Intercept)                      6.0543672  0.1002092  60.417   <2e-16 ***
# complete_alcohol_unitsweekly     0.0089900  0.0006606  13.609   <2e-16 ***
# age                             -0.1065484  0.0015752 -67.640   <2e-16 ***
# TDI                              0.1183401  0.0038145  31.023   <2e-16 ***
# overall_health_rating           -0.0124434  0.0160451  -0.776    0.438
# factor(inferred.sex)1           -0.2521848  0.0235726 -10.698   <2e-16 ***
# factor(qualification_edu_6138)2 -0.5448068  0.0316619 -17.207   <2e-16 ***
# factor(qualification_edu_6138)3 -1.1059661  0.0291064 -37.997   <2e-16 ***
# factor(qualification_edu_6138)4 -1.6050344  0.0607355 -26.427   <2e-16 ***
# factor(qualification_edu_6138)5 -1.1467857  0.0496522 -23.096   <2e-16 ***
# factor(qualification_edu_6138)6 -0.9088833  0.0540306 -16.822   <2e-16 ***
# factor(X20116_recodeFinal)2      0.2888456  0.0296775   9.733   <2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Run logistic regression (models that are NOT working)
mod_cann_aass <- glm(formula= formual_cann_aass,data=joined_rm2_ID2rm,family=binomial(link = "logit"))
# Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
#   contrasts can be applied only to factors with 2 or more levels

summary(joined_rm2_ID2rm)
# X20453_0_0_recoded X3436_recodeMean
# Min.   :0.00       Min.   : 5.0
# 1st Qu.:0.00       1st Qu.:15.0
# Median :0.00       Median :16.0
# Mean   :0.22       Mean   :17.7
# 3rd Qu.:0.00       3rd Qu.:19.0
# Max.   :1.00       Max.   :69.0
# NA's   :243314     NA's   :333374

# Tabulate the outcome among rows where x3 is non-missing
t <- subset(joined_rm2_ID2rm,(!is.na(joined_rm2_ID2rm[,x3])),select=c(y,x3))
table(t$X20453_0_0_recoded)
#    0    1
# 3127 2566
```

#### `dplyr: Error in n(): function should not be called directly`
You possibly have dplyr and plyr loaded in the same session. Both packages have the functions summarise/summarize, so the two can conflict. You can unload the plyr package and then load the dplyr package.
* [dplyr: “Error in n(): function should not be called directly”](https://stackoverflow.com/questions/22801153/dplyr-error-in-n-function-should-not-be-called-directly)
```R!
# detach("package:reshape2", unload=TRUE)
detach("package:plyr", unload=TRUE)
library(dplyr)
```

#### `cat: write error: Broken pipe` occurred while using CompileProfileFiles.sh
* [cat: write error: Broken pipe](https://askubuntu.com/questions/421663/cat-write-error-broken-pipe)
* [Getting error “cat: write error: Broken pipe” only when running bash script non-interactively ](https://stackoverflow.com/questions/39296809/getting-error-cat-write-error-broken-pipe-only-when-running-bash-script-non)
```shell!
# Here one of the multiple jobs had a walltime shorter than its runtime. But this is not the cause of the error
[lunC@hpcpbs01 Scripts]$ cat /mnt/lustre/working/lab_nickm/lunC/PRS_UKB_201711/allelicScoresCompiled/output/uniqSNPs_from_metaDataQCed-Release8-HRCr1.1_AND_SNP-rsNum_from_all-QCed_GWAS-GSCAN/innerJoinedSNPsByCHRBP_metaDataQCed-Release8-HRCr1.1_AND_si-noQIMR-noBLTS.ambiguSNPRemoved.subset/dosageFam_Release8_HRCr1.1/pbs_output/sumPRS_Release8_HRCr1.1_innerJoinedSNPsByCHRBP_metaDataQCed-Release8-HRCr1.1_AND_si-noQIMR-noBLTS.ambiguSNPRemoved.subset_S5.pbs.err
=>> PBS: job killed: walltime 3627 exceeded limit 3600
-bash: line 1: 9919 Terminated /var/spool/PBS/mom_priv/jobs/5674535.hpcpbs02.SC
```

#### `Error in plot.new() : figure margins too large` first appears. Then `Error in ... plot.new has not been called yet` appears from every plotting function
```R!
# The error occurs when the exported plot is too small (here 10 x 10 pixels)
plotDimensionRow=1
plotDimensionCol=1
png(file=outputFigFilePath
    ,width = plotDimensionRow*10
    ,height =plotDimensionCol*10 )

# The error is gone when the exported plot is enlarged (here 450 x 450 pixels)
plotDimensionRow=1
plotDimensionCol=1
png(file=outputFigFilePath
    ,width = plotDimensionRow*450
    ,height =plotDimensionCol*450 )
```

#### `foreach error “could not find function ”%do%“”`
Each of the parallel workers operates in a clean R session, so you have to load the foreach package in each worker. Try adding `.packages="foreach"` to the outer `foreach()` call.
* [foreach error “could not find function ”%do%“](https://stackoverflow.com/questions/25784642/foreach-error-could-not-find-function-do)
```R!
# Code working
x <- foreach(i=1:length(filePath_clumpedSNP_GSCAN_test), .combine='rbind', .packages="foreach") %dopar% {
  foreach(j=1:length(pThresholds), .combine='rbind') %do% {
  }
}

# Code not working
x <- foreach(i=1:length(filePath_clumpedSNP_GSCAN_test), .combine='rbind' ) %dopar% {
  foreach(j=1:length(pThresholds), .combine='rbind') %do% {
  }
}
```

#### `Error in plot.window(xlim, ylim, log = log, ...) : need finite 'ylim' values`
* [R: need finite 'ylim' values in function](https://stackoverflow.com/questions/25871292/r-need-finite-ylim-values-in-function)
```R!
# The error occurred because range() returned NA when the data contained NA. Add na.rm=TRUE as follows
R2_min= range(subplotDataAll$R2,na.rm = TRUE)[1]*100
R2_max= range(subplotDataAll$R2,na.rm = TRUE)[2]*100
```

#### An example of incorrectly specifying for loops
```R
# Incorrect iteration: `:` binds tighter than `*`, so this loops over (1:length(phenoNames)) * length(PRSpheno)
for (i in 1:length(phenoNames)*length(PRSpheno)){
  # actions
}

# Correct iteration: put the multiplication in ()
for (i in 1:(length(phenoNames)*length(PRSpheno))){
  # actions
}
```

#### ``Error in `[.data.frame`(dataPhenoPRS, , fixEffect) : undefined columns selected``
This usually means that a column name requested in the code does not exactly match, including letter case, any column name in the data frame being subset.
```R
# "ImpCov" here does not match the case of the column in the data
if (i >= 401 & i <=480){
  categoricalCovars=c("wave","nSEX","ImpCov")
} else {
  categoricalCovars=c("wave","wave_dup","nSEX","ImpCov")
}

# The data has "impCov", so dataPhenoPRS[,fixEffect] fails when fixEffect is "ImpCov"
for (fixEffect in allFixedEffectVar){
  dfPart3$R2[which(dfPart3$rowNames== fixEffect)] <- (dfPart3$fix_eff[which(dfPart3$rowNames== fixEffect)]/sd(x=dataPhenoPRS[,depVarName],na.rm= T)*sd(dataPhenoPRS[,fixEffect],na.rm= T))**2
}
```
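
A quick way to find the offending name before subsetting is to compare the requested names with `names()` of the data frame. The sketch below is only an illustration: the toy `dataPhenoPRS` and the helper variables `missingCols` and `suggested` are made up for this example and are not part of the original analysis.

```r
# Hypothetical toy data; the column is spelled "impCov" (lower-case i)
dataPhenoPRS <- data.frame(wave = c(1, 2), nSEX = c(1, 2), impCov = c(0, 1))

# Names the code asks for; "ImpCov" has the wrong case
categoricalCovars <- c("wave", "nSEX", "ImpCov")

# Names requested by the code but absent from the data
missingCols <- setdiff(categoricalCovars, names(dataPhenoPRS))
print(missingCols)

# A case-insensitive lookup suggests the intended column for each missing name
suggested <- names(dataPhenoPRS)[match(tolower(missingCols), tolower(names(dataPhenoPRS)))]
print(suggested)
```

If `missingCols` is empty, the subset `dataPhenoPRS[, categoricalCovars]` should not raise this error; otherwise `suggested` shows the likely intended spelling, or `NA` when no column matches even after ignoring case.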
