R
grep
list.files
basename
dirname
regex
Sys.glob
file.copy
unlink
gsub
as.numeric
strsplit
glob2rx
Sys.getenv
dplyr::bind_rows
dplyr::filter
dplyr::mutate
grepl
names
stringr::str_replace_all
stringr::str_extract
sub
expression
eval
Regular Expressions in R
Quantifiers specify how many repetitions of the pattern.
*: matches at least 0 times.
+: matches at least 1 time.
?: matches at most 1 time.
{n}: matches exactly n times.
{n,}: matches at least n times.
{n,m}: matches between n and m times.
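The quantifiers listed above can be tried out with grepl(). This is a minimal sketch on a made-up vector; the strings and variable names are illustrative only, and each pattern is anchored with ^ and $ so that the quantifier has to account for every "a" in the string.

```r
# Toy strings with 0, 1, 2 and 3 repetitions of "a" between "c" and "t"
x <- c("ct", "cat", "caat", "caaat")

star      <- grepl("^ca*t$", x)      # * : at least 0 times  -> TRUE  TRUE  TRUE  TRUE
plus      <- grepl("^ca+t$", x)      # + : at least 1 time   -> FALSE TRUE  TRUE  TRUE
optional  <- grepl("^ca?t$", x)      # ? : at most 1 time    -> TRUE  TRUE  FALSE FALSE
exactly_2 <- grepl("^ca{2}t$", x)    # {n}   : exactly 2     -> FALSE FALSE TRUE  FALSE
two_plus  <- grepl("^ca{2,}t$", x)   # {n,}  : at least 2    -> FALSE FALSE TRUE  TRUE
one_two   <- grepl("^ca{1,2}t$", x)  # {n,m} : between 1 and 2 -> FALSE TRUE TRUE FALSE
```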
.: matches any single character, as shown in the first example.
[…]: a character list, matches any one of the characters inside the square brackets. We can also use - inside the brackets to specify a range of characters.
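A small sketch of . and a bracketed character list, again on made-up strings (the vector and variable names are assumptions for illustration):

```r
x <- c("cat", "cot", "cut", "c9t", "ct")

# . stands for exactly one character of any kind, so "ct" (nothing in between) fails
dot_hits   <- grepl("^c.t$", x)      # TRUE TRUE TRUE TRUE FALSE

# [ao] accepts only one of the listed characters
list_hits  <- grepl("^c[ao]t$", x)   # TRUE TRUE FALSE FALSE FALSE

# - inside the brackets gives a range, here any single digit
range_hits <- grepl("^c[0-9]t$", x)  # FALSE FALSE FALSE TRUE FALSE
```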
Character classes allow you to (surprise!) specify entire classes of characters, such as numbers, letters, etc. There are two flavors of character classes: one uses [: and :] around a predefined name inside square brackets, and the other uses \ and a special character. They are sometimes interchangeable.
\d: digits, equivalent to [0-9].
\D: non-digits, equivalent to [^0-9].
\w: word characters, equivalent to [[:alnum:]_] or [A-Za-z0-9_].
\W: not word, equivalent to [^A-Za-z0-9_].
\s: space, equivalent to [[:space:]].
\S: not space.
[:digit:] or \d: digits, 0 1 2 3 4 5 6 7 8 9, equivalent to [0-9].
[:lower:]: lower-case letters, equivalent to [a-z].
[:upper:]: upper-case letters, equivalent to [A-Z].
[:alpha:]: alphabetic characters, equivalent to [[:lower:][:upper:]] or [A-Za-z].
[:alnum:]: alphanumeric characters, equivalent to [[:alpha:][:digit:]] or [A-Za-z0-9].
[:xdigit:]: hexadecimal digits (base 16), 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f, equivalent to [0-9A-Fa-f].
[:blank:]: blank characters, i.e. space and tab.
[:space:]: space characters: tab, newline, vertical tab, form feed, carriage return, space.
[:punct:]: punctuation characters, e.g., ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~.
[:graph:]: graphical (human readable) characters: equivalent to [[:alnum:][:punct:]].
[:print:]: printable characters, equivalent to [[:graph:] ], i.e. the graphical characters plus the space character.
[:cntrl:]: control characters, like \n or \r, [\x00-\x1F\x7F].
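A minimal sketch of the bracketed classes in use with gsub(). The input string and variable names are made up for illustration; note the double square brackets, and that [^...] negates the class.

```r
x <- "Sample ID: ab12, CD34!"

# Keep only digits by deleting everything that is not a digit
digits   <- gsub("[^[:digit:]]", "", x)   # "1234"

# Delete punctuation, keep letters, digits and spaces
no_punct <- gsub("[[:punct:]]", "", x)    # "Sample ID ab12 CD34"

# Keep only upper-case letters
upper    <- gsub("[^[:upper:]]", "", x)   # "SIDCD"
```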
Note:
[:...:] has to be used inside square brackets, e.g. [[:digit:]].
\ itself is a special character that needs escape, e.g. \\d. Do not confuse these regular expressions with R escape sequences such as \t.
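The escaping rule in the note can be checked directly. This is a sketch on a made-up file name; the variable names are illustrative. It also shows the same rule applied to a literal dot, which otherwise keeps its "any single character" meaning.

```r
x <- "file.2018.txt"

# "\\D" in R source code is the regex \D (non-digit); removing non-digits leaves the digits
digits_backslash <- gsub("\\D", "", x)             # "2018"
digits_posix     <- gsub("[^[:digit:]]", "", x)    # same result with the bracketed class

# A literal dot must be escaped as "\\." or wrapped in a character class "[.]"
parts_escaped <- strsplit(x, "\\.")[[1]]           # "file" "2018" "txt"
parts_class   <- strsplit(x, "[.]")[[1]]           # "file" "2018" "txt"

# Unescaped, "." matches every character, so only empty strings are left
parts_unescaped <- strsplit(x, ".")[[1]]

# "\t" is an R escape sequence (one tab character); "\\t" is a backslash followed by t
n_tab     <- nchar("\t")    # 1
n_escaped <- nchar("\\t")   # 2
```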
gsub, sub
Extract only numbers from a string with punctuation and spaces in R? [duplicate]
R Only extract 3 digit numbers from a string
Remove all delimiters at beginning and end of string
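The three tasks named in the titles above can be sketched roughly as follows. The example strings and patterns are assumptions made for illustration, not the answers from the linked questions.

```r
# 1. Extract only numbers from a string with punctuation and spaces
x <- "Tel: (07) 3845-3000, ext. 12"
only_digits <- gsub("[^0-9]", "", x)                 # "073845300012"

# 2. Only extract 3-digit numbers; \\b is a word boundary, so "12" and "6789" are skipped
y <- "ids 12, 345, 6789 and 901"
three_digit <- regmatches(y, gregexpr("\\b[0-9]{3}\\b", y, perl = TRUE))[[1]]  # "345" "901"

# 3. Remove all delimiters at the beginning and end of a string
z <- "--,a,b;c;;"
trimmed_both  <- gsub("^[[:punct:]]+|[[:punct:]]+$", "", z)  # "a,b;c"
# sub() replaces only the first match, so the trailing delimiters stay
trimmed_first <- sub("^[[:punct:]]+|[[:punct:]]+$", "", z)   # "a,b;c;;"
```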
dplyr::filter(!grepl("pattern",variable))
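A sketch of what this negated filter does, written with base R subsetting so that it runs without dplyr installed. The data frame, its column names and the pattern "indel" are made up; the dplyr form from the line above is given in a comment.

```r
df <- data.frame(
  variable = c("chr1_snp", "chr2_indel", "chr3_snp"),
  value    = 1:3,
  stringsAsFactors = FALSE
)

# grepl() flags rows whose variable matches the pattern; ! keeps the rows that do not match
keep <- !grepl("indel", df$variable)
kept <- df[keep, ]

# With dplyr available, the same rows would come from:
# dplyr::filter(df, !grepl("indel", variable))
```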
glob2rx(): changes wildcard (aka globbing) patterns into the corresponding regular expressions (regexp).
?: means exactly 1 character in a wildcard pattern.
+: means 1 or more repetitions in a regular expression.
|: means "or" in a regular expression.
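To see what glob2rx() produces, here is a small sketch. The first call is the example from the function's help page; the file names in the other calls are made up for illustration.

```r
# * becomes .*, the literal dot is escaped, and the result is anchored with ^ and $
rx_doc <- utils::glob2rx("*.doc")        # "^.*\\.doc$"

# A wildcard for script names, tested against candidate file names
rx_script   <- utils::glob2rx("MR_step*.R")
script_hits <- grepl(rx_script, c("MR_step06.R", "MR_step06.Rmd", "x_MR_step06.R"))
# TRUE FALSE FALSE

# ? in a wildcard stands for exactly one character
rx_chr   <- utils::glob2rx("chr?.txt")
chr_hits <- grepl(rx_chr, c("chr1.txt", "chr10.txt"))
# TRUE FALSE
```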
stringr::str_replace_all
gsub(pattern, replacement, data$column)
list.files(path=c(folder1,folder2,..),pattern=glob2rx("pattern1*pattern2"),full.names=TRUE)
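A runnable sketch of the call above, using two throwaway folders under a temporary directory. The folder names, file names and the wildcard "MR_*_input.txt" are all hypothetical; glob2rx() adds the ^ and $ anchors itself, so they are not written in the wildcard.

```r
# Build two temporary folders with a few empty files
base    <- tempfile("glob_demo_")
folder1 <- file.path(base, "folder1")
folder2 <- file.path(base, "folder2")
dir.create(folder1, recursive = TRUE)
dir.create(folder2, recursive = TRUE)
file.create(file.path(folder1, c("MR_a_input.txt", "MR_a_input.csv", "other_input.txt")))
file.create(file.path(folder2, "MR_b_input.txt"))

# Search both folders at once; full.names = TRUE returns paths, not just file names
hits <- list.files(
  path       = c(folder1, folder2),
  pattern    = utils::glob2rx("MR_*_input.txt"),
  full.names = TRUE
)
matched <- sort(basename(hits))   # "MR_a_input.txt" "MR_b_input.txt"

# Clean up the temporary folders
unlink(base, recursive = TRUE)
```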
Sys.glob() doesn't seem to take a complex pattern. glob2rx() converts a pattern including a wildcard into the equivalent regular expression. You then need to pass this regular expression on to one of R's pattern matching tools. The 5 patterns are specified with list.files(path=,pattern=glob2rx("p1|p2|p3|p4|p5"),full.names=TRUE), which will return full paths of matched files.
strsplit(): to split on a literal ., you've got to escape the . with \\., or use a character class [.]. Otherwise . is used with its special meaning, "any single character".
mv: to do what mv does in UNIX, first copy the directory recursively to the destination folder, check if the dates of last modification are preserved, and then delete the folders that have been copied.
Copy folders from one directory to another in R
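The copy, check, delete sequence for moving a folder can be sketched as below. This is only an illustration under assumptions: the folder layout is created in a temporary directory, the names are made up, and the 1-second tolerance on modification times is an arbitrary choice.

```r
# Set up a source folder with one file, and an empty destination folder
base        <- tempfile("move_demo_")
src         <- file.path(base, "source", "results")
dest_parent <- file.path(base, "destination")
dir.create(src, recursive = TRUE)
dir.create(dest_parent, recursive = TRUE)
writeLines("beta\tse", file.path(src, "input.tsv"))

# Step 1: copy the directory recursively; copy.date = TRUE keeps file dates where possible
ok   <- file.copy(from = src, to = dest_parent, recursive = TRUE, copy.date = TRUE)
dest <- file.path(dest_parent, basename(src))

# Step 2: check that the same files arrived and that modification times were preserved
old_files  <- list.files(src, recursive = TRUE)
new_files  <- list.files(dest, recursive = TRUE)
same_files <- identical(old_files, new_files)
mtime_old  <- as.numeric(file.mtime(file.path(src, old_files)))
mtime_new  <- as.numeric(file.mtime(file.path(dest, new_files)))
same_mtime <- same_files && all(abs(mtime_old - mtime_new) < 1)

# Step 3: delete the source only after every check has passed
if (all(ok) && same_files && same_mtime) {
  unlink(src, recursive = TRUE)
}
```

Deleting only after the checks means a failed or partial copy leaves the original folder untouched.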
R script file path: /mnt/backedup/home/lunC/scripts/MR_step06-05-01_prepare-input-files-for-MR-PRESSO.R ↩︎
R script file path: /mnt/backedup/home/lunC/scripts/MR_ICC_GSCAN_201806/MR_step06-03-07_run_heterogeneity-test.R ↩︎
R script file path: /mnt/backedup/home/lunC/scripts/MR_ICC_GSCAN_201806/MR_step06-05-01_prepare-input-files-for-MR-PRESSO.R ↩︎
R script file path: /mnt/backedup/home/lunC/scripts/MR_ICC_GSCAN_201806/MR_step08-02-03_parse-tabulate_LDSC-SNP-heritability_LDSC-genetic-correlations.R ↩︎