Chang's collection of working examples in Bash, Shell
Syntax: `awk -F "$file_delimiter" -v awk_var1="${shell_var1}" -v awk_var2="${shell_var2}" 'BEGIN {print "colname1","colname2"} {print $awk_var1,$awk_var2}' "$filePath"`
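A minimal sketch of the syntax above, assuming a small comma-delimited file. The file name, its contents and the chosen column numbers are made up for illustration; only the variable names come from the syntax line.

```bash
# Hypothetical sample data; file name and column numbers are illustrative only.
tmp_dir=$(mktemp -d)
filePath="$tmp_dir/sample.csv"
printf 'id,name,score\n1,alice,90\n2,bob,85\n' > "$filePath"

file_delimiter=","
shell_var1=1   # first column to print
shell_var2=3   # second column to print

# -v copies each shell value into an awk variable.
# Inside awk, $awk_var1 means "the field whose number is stored in awk_var1".
selected=$(awk -F "$file_delimiter" -v awk_var1="$shell_var1" -v awk_var2="$shell_var2" \
  'BEGIN {print "colname1","colname2"} {print $awk_var1,$awk_var2}' "$filePath")
printf '%s\n' "$selected"
rm -rf "$tmp_dir"
```

Quoting the shell variables keeps the call intact when a value or path contains spaces.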
for
and awk
Get the line count from the 2nd line of the file
awk
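One possible way to count lines starting from the 2nd line (for example, to skip a header), sketched with a throwaway file. The file contents are hypothetical; the `tail` variant is an alternative and not necessarily the method the original note used.

```bash
tmp_file=$(mktemp)
printf 'header\nrow1\nrow2\nrow3\n' > "$tmp_file"   # hypothetical content

# NR>1 selects every record after the first; END prints how many were seen.
data_lines=$(awk 'NR>1 {n++} END {print n+0}' "$tmp_file")

# Alternative: tail -n +2 starts output at line 2, wc -l counts what is left.
data_lines_tail=$(tail -n +2 "$tmp_file" | wc -l | tr -d ' ')

echo "$data_lines $data_lines_tail"
rm -f "$tmp_file"
```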
AWK: How to Compare Two Variables with Regular Expression
awk -F" +"
It's dangerous to list files from multiple folders using a single `ls -l`, as the files will be sorted rather than kept in the order you put after `ls -l`. If you need just file paths, use `realpath`.
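The ordering difference can be checked with two throwaway folders. This is a sketch that assumes `realpath` from GNU coreutils; the folder and file names are invented.

```bash
tmp_dir=$(mktemp -d)
mkdir "$tmp_dir/b_folder" "$tmp_dir/a_folder"
touch "$tmp_dir/b_folder/first.txt" "$tmp_dir/a_folder/second.txt"

# ls sorts its file operands, so the a_folder file is listed first
# even though it was given last.
ls_first=$(ls "$tmp_dir/b_folder/first.txt" "$tmp_dir/a_folder/second.txt" | head -n 1)

# realpath prints one absolute path per operand, in the order given.
realpath_first=$(realpath "$tmp_dir/b_folder/first.txt" "$tmp_dir/a_folder/second.txt" | head -n 1)

echo "ls first:       ${ls_first##*/}"
echo "realpath first: ${realpath_first##*/}"
rm -rf "$tmp_dir"
```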
Repeat each line multiple times
`awk -v`, `awk BEGIN{}` and `awk {action}`
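Repeating each line a given number of times can be sketched with the three pieces named here: `-v` to pass the count, a `BEGIN{}` block to validate it, and an `{action}` that prints in a loop. The variable name `repeat_count` and the sample input are assumptions.

```bash
repeat_count=3   # hypothetical number of copies per line

# BEGIN runs once before any input; the main action runs once per input line.
repeated=$(printf 'a\nb\n' | awk -v n="$repeat_count" \
  'BEGIN {if (n < 1) n = 1} {for (i = 0; i < n; i++) print}')
printf '%s\n' "$repeated"
```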
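Inside an awk program `$` is the field operator, so awk variables (including those set with `-v`) are referenced by bare name. You can't embed them inside a `/regex constant/`; it's just text in there. Use a dynamic match such as `$1 ~ var` instead.
`awk first_file second_file`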
In awk, `FNR` refers to the record number (typically the line number) in the current file and `NR` refers to the total record number. The operator `==` is a comparison operator, which returns true when the two surrounding operands are equal. This means that the condition `NR==FNR` is only true for the first_file, as `FNR` resets back to 1 for the first line of each file but `NR` keeps on increasing. This pattern is typically used to perform actions on only the first file. The `next` inside the block means any further commands are skipped, so they are only run on files other than the first. The condition `FNR==NR` compares the same two operands as `NR==FNR`, so it behaves in the same way.
`awk -v`
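The `NR==FNR` idiom described above can be sketched as a simple two-file lookup, with the key column passed in through `-v`. The file contents, the array name `keep` and the variable `key_col` are hypothetical.

```bash
tmp_dir=$(mktemp -d)
printf 'rs1\nrs3\n' > "$tmp_dir/first_file"                     # IDs to keep
printf 'rs1 0.1\nrs2 0.2\nrs3 0.3\n' > "$tmp_dir/second_file"   # rows to filter

key_col=1   # hypothetical: column of second_file that holds the ID

# First file (NR==FNR): remember each ID, then skip to the next record.
# Second file: print rows whose key column was remembered.
matched=$(awk -v col="$key_col" 'NR==FNR {keep[$1]; next} ($col in keep)' \
  "$tmp_dir/first_file" "$tmp_dir/second_file")
printf '%s\n' "$matched"
rm -rf "$tmp_dir"
```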
The `-v` option passes shell variables into your awk script.
Remove the extension (`.*`) of a filename using parameter expansion. How can I remove the extension of a filename in a shell script?
`echo "expression"`: escape the double quotes within the expression, and escape dollar signs for variables that should not expand (e.g. awk variables). Don't escape dollar signs for variables that should expand (e.g. output folder paths). How do I echo an expression with both single and double quotes?
If `echo` doesn't print variable values correctly, it's likely due to Windows special characters that Linux cannot read. Run `dos2unix` on the CSV file before reading it.
`echo $var` changes separators to white space. Use `echo "$var"` to keep the original separators unchanged. Echo changes my tabs to spaces
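The parameter-expansion and quoting notes above can be tried together in a few lines. The file name and the tab-separated value are invented for illustration.

```bash
file_name="results.tar.gz"        # hypothetical file name
no_last_ext="${file_name%.*}"     # % removes the shortest trailing match of .*
no_any_ext="${file_name%%.*}"     # %% removes the longest trailing match of .*
echo "$no_last_ext $no_any_ext"

var=$(printf 'a\tb')              # a value containing a tab
unquoted=$(echo $var)             # word splitting: the tab becomes a single space
quoted=$(echo "$var")             # quoting keeps the tab
printf '%s\n%s\n' "$unquoted" "$quoted"
```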
`echo "$locLDSC/munge_sumstats.py..."` is exactly the same as the command in the next line
`touch` command
`cut -d` is used to specify the delimiter (default=tab)
`cut -f` specifies the field that should be cut, allowing use of a field range
How to define 'tab' delimiter with 'cut' in BASH?
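A short sketch of `cut -d` and `cut -f` on made-up records. The tab is built with `printf` so the snippet does not depend on Bash's `$'\t'` quoting, which is the usual way to write it in Bash.

```bash
tab=$(printf '\t')                    # a literal tab character
line=$(printf 'chr1\t12345\trs1')     # hypothetical tab-separated record

# Tab is the default delimiter, so -d can be omitted.
fields_default=$(printf '%s\n' "$line" | cut -f 1,3)

# The same delimiter given explicitly, with a field range.
fields_explicit=$(printf '%s\n' "$line" | cut -d "$tab" -f 2-3)

# A different delimiter.
csv_field=$(printf 'a,b,c\n' | cut -d ',' -f 2)

printf '%s\n' "$fields_default" "$fields_explicit" "$csv_field"
```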
`paste`
`paste -d` is used to specify the delimiter in the output file (default=tab)
`cut` doesn't support reordering of fields. Reorder columns of a space-delimited file using cut and paste, or awk.
old column order: $1,$2......$19
new column order: $2,$1,$3...$19
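Swapping the first two columns, as in the old and new orders listed above, could look like this in awk. The four-column input line is a stand-in for the real 19-column file.

```bash
# Assigning to a field makes awk rebuild the record with single-space separators.
swapped=$(printf 'c1 c2 c3 c4\n' | awk '{tmp = $1; $1 = $2; $2 = tmp; print}')
echo "$swapped"
```

The same statement works unchanged for any number of columns, since only `$1` and `$2` are touched.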
Escape `$var` within single quotes that are enclosed by double quotes with `\` if you don't want it to be interpreted. The `'` single quote character in your echo example gets its literal value (and loses its meaning) as it is enclosed in double quotes (`"`). The enclosing characters are the double quotes. Single quote within double quotes and the Bash reference manual
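A small check of the quoting rule above; the variable name and its value are arbitrary.

```bash
var="value"
expanded="'$var'"     # single quotes are literal inside double quotes, so $var still expands
literal="'\$var'"     # the backslash stops the expansion
echo "$expanded"
echo "$literal"
```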
Removing all special characters from a string in Bash
sed tip: Remove / Delete All Leading Blank Spaces / Tabs ( whitespace ) From Each Line
syntax: `sed -i '1i text' filename`
* The `-i` option stands for "in-place" editing. It modifies the file directly, without saving the output of sed to a temporary file and then replacing the original file.
* `1i` or `1 i`: `1` selects the first line; `i` inserts the text, followed by a newline, before that line.
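A sketch of adding a header line in place, assuming GNU sed (the one-line `1i text` form and `-i` without a backup suffix are GNU extensions). The file contents are hypothetical.

```bash
tmp_file=$(mktemp)
printf 'row1\nrow2\n' > "$tmp_file"

# GNU sed assumed: insert the text "header" before line 1, editing the file in place.
sed -i '1i header' "$tmp_file"

first_line=$(head -n 1 "$tmp_file")
line_total=$(wc -l < "$tmp_file" | tr -d ' ')
echo "$first_line ($line_total lines)"
rm -f "$tmp_file"
```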
`sed -n '2p' < file.txt` prints the 2nd line
`sed -n '2011p' < file.txt` prints the 2011th line
`sed -n '10,33p' < file.txt` prints line 10 up to line 33
`sed -n '1p;3p' < file.txt` prints the 1st and 3rd lines
sed 'NUMq;d' file
Where NUM is the number of the line you want to print; so, for example, sed '10q;d' file to print the 10th line of file.
Explanation:
NUMq will quit immediately when the line number is NUM.
d will delete the line instead of printing it; this is inhibited on the last line because the q causes the rest of the script to be skipped when quitting.
If you have NUM in a variable, you will want to use double quotes instead of single: sed "${NUM}q;d" file
Bash tool to get nth line from a file
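The line-selection forms above, run against a five-line throwaway file (the contents are made up):

```bash
tmp_file=$(mktemp)
printf 'line%s\n' 1 2 3 4 5 > "$tmp_file"   # writes line1 .. line5

NUM=3
second=$(sed -n '2p' < "$tmp_file")          # only line 2
range=$(sed -n '2,4p' < "$tmp_file")         # lines 2 through 4
nth=$(sed "${NUM}q;d" "$tmp_file")           # double quotes so ${NUM} expands

printf '%s\n' "$second" "$range" "$nth"
rm -f "$tmp_file"
```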
Difference Between tr and sed Command
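A quick way to see how the two commands differ is to feed both the word "good" with "best" as the replacement. This sketch assumes GNU `tr`, where the last mapping wins when a character is repeated in the first set; the input sentence for `sed` is invented.

```bash
# tr maps characters one by one: g->b, o->e, o->s, d->t (the later o rule wins).
tr_result=$(echo "good" | tr 'good' 'best')

# sed replaces the whole string; the g flag covers every occurrence on the line.
sed_result=$(echo "good food is good" | sed 's/good/best/g')

echo "$tr_result"
echo "$sed_result"
```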
Replacing `+` with white-space `' '` can be done with both `tr` and `sed`. Use `sed` instead of `tr` when a whole string, not individual characters, must be replaced.
`tr` does character-based transformation: translating good to best maps g=b, o=e, o=s, d=t, and because o appears twice it ignores the first rule and uses o=s.
`sed` does string-based transformation: if the string 'good' occurs more than once, every occurrence is replaced with 'best' (with the `g` flag).
A `for` loop is a block of code that iterates through a list of commands as long as the loop control condition is true. During each pass through the loop, arg takes on the value of each successive variable in the list. How to iterate Bash for Loop variable range under Unix or Linux
Using command line argument range in bash for loop prints brackets containing the arguments
IFS=$'\n'
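A sketch of a `for` loop over a range whose upper bound lives in a variable. In Bash, brace expansion happens before variable expansion, so `{1..$n}` is left as literal text with braces; `seq` is one common workaround. The variable names are assumptions.

```bash
n=3            # hypothetical upper bound held in a variable
collected=""

# seq builds the list 1..n at run time; the loop variable i takes each value in turn.
for i in $(seq 1 "$n"); do
  collected="$collected$i "
done
echo "$collected"
```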
`which` locates a program file in the user's path. For each of its arguments, `which` prints to stdout the full path of the executable(s). It does this by searching the directories listed in the environment variable PATH. Why not use “which”? What to use then?
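`cp -r /...A /...B` copies the A folder and everything in it into the B folder (when B already exists)
`cp -r /...A/. /...B` copies everything in the A folder (not including A itself) into the B folder. With GNU `cp`, a bare trailing slash (`/...A/`) does not have this effect; it behaves like `/...A`
`cp -n A B` or `cp --no-clobber A B`: A does not overwrite an existing file B. Linux how to copy but not overwrite?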
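The `cp` behaviours can be checked in a scratch directory. This is a sketch with invented folder names; `A/.` is used as the portable way to copy the contents of a folder rather than the folder itself.

```bash
tmp_dir=$(mktemp -d)
mkdir "$tmp_dir/A" "$tmp_dir/B" "$tmp_dir/C"
echo "from A" > "$tmp_dir/A/file.txt"
echo "already in C" > "$tmp_dir/C/file.txt"

cp -r "$tmp_dir/A" "$tmp_dir/B"      # B exists, so this creates B/A/file.txt
cp -r "$tmp_dir/A/." "$tmp_dir/B"    # contents of A only: B/file.txt

# -n must leave C/file.txt alone. "|| true" and the stderr redirect are there
# because the exit status and warnings for a skipped copy vary between cp versions.
cp -n "$tmp_dir/A/file.txt" "$tmp_dir/C/file.txt" 2>/dev/null || true

if [ -f "$tmp_dir/B/A/file.txt" ]; then nested=yes; else nested=no; fi
if [ -f "$tmp_dir/B/file.txt" ]; then flat=yes; else flat=no; fi
kept=$(cat "$tmp_dir/C/file.txt")
echo "$nested $flat $kept"
rm -rf "$tmp_dir"
```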
cd /user_B/destination-folder; scp -r user_A_account@user_A_hostname:/user_A/source-folder .
scp username@remote:/file/to/send /where/to/put
scp /file/to/send username@remote:/where/to/put
scp -r username@hostname:/sourceDirectory /destinationDirectory
`find . -maxdepth 1 -type f`
`.` is the directory to search
`-maxdepth 1` limits the search to 1 level (i.e., the current directory, not its subdirectories)
`-type f` finds only files
`-type d` finds only directories
List only regular files (but not directories) in current directory
Linux find folders without files but only subfolders [closed]
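The `find` options above, tried on a small invented directory tree:

```bash
tmp_dir=$(mktemp -d)
mkdir -p "$tmp_dir/sub/deeper"
touch "$tmp_dir/top.txt" "$tmp_dir/sub/inner.txt"

# Regular files directly inside the directory; sub/inner.txt is one level too deep.
files_here=$(cd "$tmp_dir" && find . -maxdepth 1 -type f)

# Directories at the same depth, including "." itself.
dirs_here=$(cd "$tmp_dir" && find . -maxdepth 1 -type d | sort)

printf '%s\n' "$files_here" "$dirs_here"
rm -rf "$tmp_dir"
```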
`find` command syntax form: `find sourceFolder -name '*.*' -exec mv {} destinationFolder \;`
`find sourceFolder -name '*.*'` finds the pattern specified by `-name` in sourceFolder
`-exec` runs any command
`{}` inserts the filename found
`\;` marks the end of the exec command
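The syntax form above, applied to a scratch sourceFolder with three invented files. Only names containing a dot match `'*.*'`, so the file without an extension stays behind.

```bash
tmp_dir=$(mktemp -d)
mkdir "$tmp_dir/sourceFolder" "$tmp_dir/destinationFolder"
touch "$tmp_dir/sourceFolder/a.txt" "$tmp_dir/sourceFolder/b.log" "$tmp_dir/sourceFolder/noext"

# mv runs once per match; {} is replaced by the matched path, \; ends the command.
find "$tmp_dir/sourceFolder" -name '*.*' -exec mv {} "$tmp_dir/destinationFolder" \;

moved=$(ls "$tmp_dir/destinationFolder" | tr '\n' ' ')
left=$(ls "$tmp_dir/sourceFolder" | tr '\n' ' ')
echo "moved: $moved"
echo "left:  $left"
rm -rf "$tmp_dir"
```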
`ls` syntax
`ls -l` gives a long listing of all files.
`ls -r` lists the files in the reverse of the order that they would otherwise have been listed in.
`ls -t` lists the files in order of the time when they were last modified (newest first) rather than in alphabetical order.
`ls -lrt` gives a long listing, oldest first, which is handy for seeing which files in a large directory have recently been changed.
`ls -F` adds a trailing `/` to the names of directories (folders are shown ending with `/`; files are not)
`mv /path/sourcefolder/* /path/destinationfolder/`
`rm -r /path/sourcefolder/`
`-r` means recursively, meaning everything in that directory
`xargs` and `cp`
`xargs` collects the input from the pipe and then executes its arguments with the input appended. `xargs` takes the stdout of commandA as the parameter for commandB: `commandA | xargs commandB`
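A sketch of the `commandA | xargs commandB` pattern with `find` as commandA and `cp` as commandB. The folder and file names are hypothetical, and the paths are assumed to contain no spaces or newlines.

```bash
tmp_dir=$(mktemp -d)
mkdir "$tmp_dir/src" "$tmp_dir/dest"
touch "$tmp_dir/src/one.txt" "$tmp_dir/src/two.txt"

# find prints one path per line; xargs runs cp once per line,
# substituting {} with that path.
find "$tmp_dir/src" -type f | xargs -I {} cp {} "$tmp_dir/dest/"

copied=$(ls "$tmp_dir/dest" | tr '\n' ' ')
echo "$copied"
rm -rf "$tmp_dir"
```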
/mnt/backedup/home/lunC/scripts/PRS_UKB_201711_step01_obtainGWASSummaryStatisticsFromDiscoverySamples.sh
Problem: For some reason, when I open files from a Unix server on my Windows machine, they occasionally have Macintosh EOL conversion, and when I edit/save them again they don't work properly on the Unix server. I only use Notepad++ to edit files from this Unix server, so is there a way to create a macro that automatically converts EOL to Unix format whenever I open a file?
Solution?: Your issue may be with whatever FTP program you are using. For example, I use WinSCP to remote into a Unix server, Notepad++ is set as my default editor, but I had to go into WinSCP's settings and set the transfer mode to Binary in order to keep line endings preserved. So, you may be able to reconfigure your FTP/SCP/etc program to transfer the files in a different manner
Solution: That functionality is already built into Notepad++. From the "Edit" menu, select "EOL Conversion" -> "UNIX/OSX Format".
`dos2unix` converts DOS/MAC to UNIX text file format. Do the format conversion when an error occurs but you just cannot find the bug. Script files created in Windows by Notepad++ or Sublime Text can contain Windows characters. These can cause errors when running the script file. Here is a sign of a hidden Windows character.
`awk '{if ($1 ~ /pattern1/ && $1 ~ /pattern2/) print $0}' file`
`grep -E 'pattern1' file | grep -E 'pattern2'`
`grep` finds a matched pattern. `grep -v` excludes a matched pattern (i.e. invert match). Negative matching using grep (match lines that do not contain foo)
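Matching two patterns at once and inverting a match, on a few invented lines:

```bash
sample=$(printf 'foo bar\nfoo only\nbar only\nneither\n')

# Chained greps keep only lines that match both patterns.
both=$(printf '%s\n' "$sample" | grep -E 'foo' | grep -E 'bar')

# -v keeps the lines that do NOT contain foo.
without_foo=$(printf '%s\n' "$sample" | grep -v 'foo')

printf '%s\n' "$both" "$without_foo"
```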
od dumps files in octal and other formats
od [OPTION]... [FILE]...
Option | Description |
---|---|
-t TYPE, --format=TYPE | select output format or formats |

Format | Description |
---|---|
-c (same as -t c) | select ASCII characters or backslash escapes as format |
od -t
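One use of `od -c` is spotting hidden Windows line endings: a carriage return shows up as `\r` in the dump. The sample input below is made up.

```bash
# A line with a Windows (CRLF) ending and one with a Unix (LF) ending.
dump_crlf=$(printf 'a,b\r\n' | od -c)
dump_lf=$(printf 'a,b\n' | od -c)
printf '%s\n' "$dump_crlf"

# grep -F searches for the two literal characters backslash and r.
if printf '%s\n' "$dump_crlf" | grep -qF '\r'; then crlf_has_cr=yes; else crlf_has_cr=no; fi
if printf '%s\n' "$dump_lf" | grep -qF '\r'; then lf_has_cr=yes; else lf_has_cr=no; fi
echo "$crlf_has_cr $lf_has_cr"
```

If `\r` appears, running `dos2unix` on the file is the usual next step.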
`rmdir directoryname` removes the directory, but only if it's empty
`rm -r directoryname` removes the directory whether or not it's empty
`mkdir -p`: the command creates all the directories necessary to fulfill your request, and does not return an error if the directory already exists
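The three commands above, exercised in a scratch directory (the nested path `a/b/c` is arbitrary):

```bash
tmp_dir=$(mktemp -d)

mkdir -p "$tmp_dir/a/b/c"     # creates a, a/b and a/b/c in one call
mkdir -p "$tmp_dir/a/b/c"     # running it again is not an error
second_status=$?

touch "$tmp_dir/a/b/c/file.txt"
# rmdir refuses because the directory is not empty.
if rmdir "$tmp_dir/a/b/c" 2>/dev/null; then rmdir_result=removed; else rmdir_result=refused; fi

# rm -r removes the directory together with its contents.
rm -r "$tmp_dir/a"
if [ -e "$tmp_dir/a" ]; then rm_result=present; else rm_result=gone; fi

echo "$second_status $rmdir_result $rm_result"
rm -rf "$tmp_dir"
```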
crontab
Linux Crontab Syntax. A Linux crontab entry has six fields. Fields 1-5 define the date and time of execution. The 6th field is the command or script to be executed. The Linux crontab syntax is as follows. Note (1) a field can hold a single value, a comma-separated list, a range, or, per crontab(5) for Vixie cron and cronie, a list of ranges such as 0-4,8-12, (2) cron doesn't support fractions in the time. Crontab in Linux with 20 Useful Examples to Schedule Jobs
[Minute] [hour] [Day_of_the_Month] [Month_of_the_Year] [Day_of_the_Week] [command]
crontab
`tar` options:
-f, --file=ARCHIVE: use archive file or device ARCHIVE
-v, --verbose: verbosely list files processed
-x, --extract, --get: extract files from an archive
-z, --gzip: filter the archive through gzip
Use `gunzip` to decompress a file.
Parallel wget in Bash [duplicate]
Looking at the source for the page, the href values in the actual links are things like:
which are relative paths to the host website, not absolute URLs. I don't know if `wget --spider` is sophisticated enough to actually convert those to links that you can download.
Conrad
At this point I'd be looking at something like Python's excellent beautifulsoup library to read the page and construct the links you want manually.
The `ulimit` command sets or reports user process resource limits. The default limits are defined and applied when a new user is added to the system. Limits are categorized as either soft or hard. With the `ulimit` command, you can change your soft limits for the current shell environment, up to the maximum set by the hard limits. You must have root user authority to raise resource hard limits. HOWTO: Use ulimit command to set soft limits.
Each core that Cell Ranger uses will spawn 64 user processes, so you may run into problems if your system has a limit on the max user processes, 4069. You can use `ulimit -u` to find out the limit.
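A sketch of reading the limits discussed above, assuming the Bash `ulimit` builtin (option letters differ in some other shells). The values printed depend entirely on the system.

```bash
# Current limit on the number of user processes (a number or "unlimited").
max_user_processes=$(ulimit -u)

# Soft and hard limits can be queried separately with -S and -H.
soft_processes=$(ulimit -S -u)
hard_processes=$(ulimit -H -u)

echo "max user processes: $max_user_processes (soft $soft_processes, hard $hard_processes)"
```

Dividing the reported limit by 64 gives a rough upper bound on the number of cores Cell Ranger could use before hitting it.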