# Linux Shell Scripting / January 2024 (live)

:::danger
## Infos and important links
- This is the archive version of the document https://hackmd.io/@AaltoSciComp/shellscripting2024archive
- Program and materials: https://aaltoscicomp.github.io/linux-shell/
- Lecture's demospace copy: https://users.aalto.fi/~degtyai1/shell/
- Prerequisites:
  - Make sure you have a working Linux terminal on your machine or access to a remote machine running Linux.
- Suggestion: if you have just one screen (e.g. a laptop), we recommend arranging your windows like this:

  ```bash
  ╔═══════════╗ ╔═════════════╗
  ║           ║ ║    SHELL    ║
  ║   ZOOM    ║ ║   WINDOW    ║
  ║  WINDOW   ║ ╚═════════════╝
  ║   WITH    ║ ╔═════════════╗
  ║  COURSE   ║ ║   BROWSER   ║
  ║  STREAM   ║ ║    W/Q&A    ║
  ╚═══════════╝ ╚═════════════╝
  ```
- Linux Command Line Cheat Sheet https://cheatography.com/davechild/cheat-sheets/linux-command-line/

* **Do not put names or identifying information on this page**
:::

*Please do not edit above this*

# Icebreaker

**Test how to use this HedgeDoc live document, this is how we ask questions and get help.**

- This is a question, or is it?
  - This is a reply, yes it is!
    - This is a nested comment to the reply
      - Actually... insert smart comment here
  - This is a different reply, but still valid
- This is another question, or maybe not?
- A comment with a `bit of code pasted`
- Here a longer code block:

  ```bash
  #!/bin/bash
  whoami
  hostname
  str="hello"
  echo "$str"
  ```
- ...

# Day 1

## Icebreaker polls

**1. Which OS do you use most of the time?**
*Vote by adding an 'o' to the chosen option*
- Windows: ooooo
- Mac: ooooo
- Linux: ooooooooo
- ...?

**2. My experience in Linux Shell scripting**
- Completely new to Linux shell: oo
- Basics (cd, ls, scp...): oooooooo
- Daily user of command line; can read shell scripts: ooooo
- Experienced shell user, I wrote my own scripts: ooo

**3. Do you have a terminal window open and are you ready to start?**
- Yes: ooooooooooooooooo
- Not yet: o
- I will only watch: o

**4.
What kind of work might you use shell scripting for?**
- e.g. using an HPC cluster, automating your workflow, other scientific stuff, commercial stuff, ...
- When dealing with lots of files (neuroimaging), shell scripting is great! (but remember that too many tiny files are bad for the filesystem)
- Manipulating text files
- Prepare some parallel jobs for doing calculation
- running jobs on HPC cluster/automating work (protein structure alignments, RNA seq analysis for example > so biology research) o

## Exercises section

### [10 minutes: Starter Hello Bash]
* Roughly from this section: https://aaltoscicomp.github.io/linux-shell/variables-functions-environments/#your-bin-and-path
* Create a directory `~/bin` (if it does not exist) and the first `hello_bash.sh` script that outputs the `Hello Bash!` string. Make it executable as `~/bin/hello_bash.sh`
* Make `~/bin` part of the `PATH` variable

### [10 minutes: Variables]
* Modify `hello_bash.sh`, assign the `Hello Bash!` string to a variable and print the variable value to the output.
* Create a script in `~/bin/printvars.sh` that outputs the environment variables $HOME, $SHELL, $PATH one per line.

```bash
# ... hello_bash.sh
#!/bin/bash
# prints 'Hello Bash' message
var='Hello Bash!'
echo "$var"

# ... printvars.sh
#!/bin/bash
echo "HOME: $HOME"
echo "SHELL: $SHELL"
echo "PATH: $PATH"
```

### [15 minutes: Variables continue]
* Modify `hello_bash.sh` so that it changes the variable on the fly to print the string in capitals like HELLO BASH!
* Define a variable `fpath=/home/user/archive.tar.gz` and make your script return the filename only, without the full path and extension, i.e. `archive`. Hint: get rid of the path first, assign the result to a new variable, and then print the name without extension.

```bash
#!/bin/bash
fpath=/home/user/archive.tar.gz
# here we get archive.tar.gz
fpath=${fpath##*/}
# get archive out of archive.tar.gz
echo "${fpath%%.*}"
```

### [15 minutes: Functions]
* Add `spaceusage()` and `me()` to `~/bin/functions` (if not done yet).
  Source the file and check that the functions work.
* Using the find utility, implement a fast find function `ff()`. `ff search_word` must return all the files and directories in the current folder whose name contains 'search_word'. Let it be case insensitive. Hint: `find . -iname ...`

```bash
ff() {
  # abort with a message if no search word is given
  local word=${1:?search word is missing}
  find . -iname "*$word*"
}
```

### [10 minutes: Redirections and piping]
* Redirect the output of `ls -lA` to a file with a name like `file.YYYY-MM-DD.out` where YYYY-MM-DD is the result of the command substitution `date +%Y-%m-%d`
* Make a pipe that counts the number of files/directories (including dot files) in your directory. Hint: check the 'wc' command for counting
* Using pipes and the commands `echo`, `tr` and `uniq`, find doubled words in `My Do Do list: Find a a Doubled Word`. Hint: split the sentence into words, one per line, and use `uniq -d` to get duplicates only

```bash
ls -lA > file.$(date +%Y-%m-%d).out
# ls -A prints one name per line when piped; ls -lA would also count its leading 'total' line
ls -A | wc -l
echo 'My Do Do list: Find a a Doubled Word' | tr -s ' ' '\n' | uniq -d
```

### [15 minutes: Conditionals]

### [15 minutes: Conditionals: matching operator]

## Intro

- How does this relate to the previous course?
  - This is part 2, same learning materials: https://aaltoscicomp.github.io/linux-shell/
- The link with the history of what Ivan is typing is here https://users.aalto.fi/~degtyai1/shell/ (and you also have it above at the beginning of this doc)

## Variables, functions, environments
*Materials https://aaltoscicomp.github.io/linux-shell/variables-functions-environments/*

- I use VS Code, I guess that is also fine for writing the scripts?
  - Yes, it is. Just save the .sh file in the same folder where your terminal is running, so you can execute the script from the terminal shell.
  - Does anyone know recommended extensions for scripts, or are built-in ones good enough?
    - I am not a heavy VS Code user myself, but this blog post seems relevant https://medium.com/devops-and-sre-learning/recommended-bash-scripting-extensions-for-vs-code-67c62a132978

#### Exercise (above) until xx:33

## Continuing with Variables, functions, environments
*Materials https://aaltoscicomp.github.io/linux-shell/variables-functions-environments/*

:::success
### Exercise #2
### [10 minutes: Variables]: till xx:00 + break
### Back on stream at xx:09
* Modify `hello_bash.sh`, assign the `Hello Bash!` string to a variable and print the variable value to the output.
* Create a script in `~/bin/printvars.sh` that outputs the environment variables $HOME, $SHELL, $PATH one per line.
:::

#### Progress
- Done: oooooooooo
- Still need time:
- Do not wait for me, go ahead: o

---

- where should we write questions about the exercises
  - Here is good +1
- Remember to have a break :)
- I have the same first line in printvars as in hello_bash, '!#/bin/bash', but it still complains '!#/bin/bash: No such file or directory'. However the script is executed normally. Why is it complaining?
  - must be `#!`, the order matters, the number sign first, then the exclamation mark
  - thanks! my bad.. but why was it executed with the wrong shebang?
    - The exclamation mark, as a separate word, inverts the exit status of the command following it, but that is not what happens here. Without a valid `#!` line the shell runs the script itself, reads `!#/bin/bash` as a command name containing a path, and complains that no such file exists. The rest of the script runs anyway.

## Continuing with Variables, functions, environments
*Materials https://aaltoscicomp.github.io/linux-shell/variables-functions-environments/*

:::success
## Exercise
### [15 minutes: Variables continue] until ~xx:35
* Modify `hello_bash.sh` so that it changes the variable on the fly to print the string in capitals like HELLO BASH!
* Define a variable `fpath=/home/user/archive.tar.gz` and make your script return the filename only, without the full path and extension, i.e. `archive`. Hint: get rid of the path first, assign the result to a new variable, and then print the name without extension.
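
A possible solution sketch for the two tasks above, assuming bash 4 or newer (the `${var^^}` expansion is not available in older versions such as the bash 3.2 shipped with macOS). The helper names `upper` and `stem` are made up for illustration and do not come from the course materials.

```bash
#!/bin/bash
# Sketch only: upper() and stem() are hypothetical helper names.

# Print the argument in capitals using the ${var^^} expansion (bash 4+).
upper() {
  local s=$1
  echo "${s^^}"
}

# Print a file name without its directory and without any extension.
# ${var...} expansions cannot be nested, so this takes two steps.
stem() {
  local name=${1##*/}   # drop the longest prefix ending in '/': archive.tar.gz
  echo "${name%%.*}"    # drop the longest suffix starting at the first '.': archive
}

upper 'Hello Bash!'              # prints HELLO BASH!
stem /home/user/archive.tar.gz   # prints archive
```

Using `%.*` instead of `%%.*` in `stem` would remove only the last extension and leave `archive.tar`.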
:::

#### Progress
- Done: oooo
- Still need time:
- Do not wait for me, go ahead: ooo

FYI: Answers to the previous exercise are up on this doc, at https://notes.coderefinery.org/shellscripting2024?both#10-minutes-Variables

- Are quotation marks after echo needed?
  - In general, it depends on what comes after. If it's something that the shell would interpret, then yes
- Why do things like this `${${part%.*}%.*}` not work?
  - we could get into "why was the shell made the way it was (decades ago)", but basically, it seems that the `part` part has to be the name of a variable, and can't itself be substituted. This is unfortunately "just how it works" but maybe also keeps it simpler? +1
  - I think nested parameter expansion is not allowed in bash
  - Yes, `${var...}` does not allow nesting :/. It is still a scripting language, not a real programming language :)

---

## Functions
*Materials at https://aaltoscicomp.github.io/linux-shell/variables-functions-environments/#functions*

:::success
### Exercise until xx:22 + 10min break (until xx:32)
### [15 minutes: Functions]
* Add `spaceusage()` and `me()` to `~/bin/functions` (if not done yet). Source the file and check that the functions work.
* Using the find utility, implement a fast find function `ff()`. `ff search_word` must return all the files and directories in the current folder whose name contains 'search_word'. Let it be case insensitive. Hint: `find . -iname ...`
:::

#### Progress
- Done: oooo
- Still need time:
- Do not wait for me, go ahead: o

---

- I did the 'me' function, added it to the sourced file, it works, but in the output there is always 'command not found'
  - You are missing `echo`; the example in the material has "an error" to fix on the fly, as explained in the demo
  - can you paste the line which tries to print the "me" stuff?
    - the code for the function:

      ```bash
      #whoami as a function
      me() {
        $(id -un):$(id -gn)@$(hostname -s)
      }
      ```
    - try an "echo" there.
      Without it, it's like trying to run your username as a command
      - thanks, it works now

:::success
### Exercise until xx:03
### [10 minutes: Redirections and piping]
* Redirect the output of `ls -lA` to a file with a name like `file.YYYY-MM-DD.out` where YYYY-MM-DD is the result of the command substitution `date +%Y-%m-%d`
* Make a pipe that counts the number of files/directories (including dot files) in your directory. Hint: check the 'wc' command for counting
* Using pipes and the commands `echo`, `tr` and `uniq`, find doubled words in `My Do Do list: Find a a Doubled Word`. Hint: split the sentence into words, one per line, and use `uniq -d` to get duplicates only
:::

#### Progress
- Done: ooooo
- Still need time:
- Do not wait for me, go ahead: oo

---

When done with the exercise: we continue tomorrow at 12:00 sharp

## Feedback for today

Please write some comments about today, mention something that worked well and something that can be improved. We will try to publish the recordings by tomorrow morning

- ...
- Thank you for day 1! The time allocated for the exercises + breaks was enough for me to do the 'easy' tasks. But that doesn't mean the exercises are too difficult, it might mean I am not really intermediate. Anyway it was useful for me as is.
- Thank you! Timing worked ok.
- Thanks! The exercises were quite good for demonstrating concepts, and the time allotted was good enough, too. I loved that almost all exercises also contained some tip or other that is helpful, and the descriptions of common mistakes/errors.
- Thank you for a good presentation and exercises! The time was enough for most but not all exercises, but I could follow along well anyway!
-
-

# Day 2

## Icebreaker

**1. You might be familiar with a few shell commands, which one is the most difficult to use or most confusing with its multiple options...?
Anything else that you find challenging when looking at shell scripts?**
- Back when I started, the command `find` was a bit mysterious, but then I spent some time studying it. Also `sed` and `awk` are commands I still don't fully master
- Regular expressions are still a mystery to me! :D +2
- Yes, awk (can only manage with much googling currently :D)
- ...
- ...
- ...

## Conditionals
*Materials https://aaltoscicomp.github.io/linux-shell/conditionals/*

- ...
- ...
- ...

:::success
### Exercise until ~xx:45
### [15 minutes: Conditionals]
* add a check for the command-line argument: if given, print the name of the directory to be archived; if not, print the current directory name
* check that the archive name (i.e. `*.tar.gz`) we want to create does not exist. If it exists, then do nothing, print an error and exit.
:::

**Progress:** *Mark your answer with an `o` here below*
Done: ooooo
Still working on it: o
Not doing it: oooo

- -n does not check if the directory exists, right?
  - in where? test-conditionals? In test / `[ ]` / `[[ ]]` it checks "is the string non-empty", `-d` is "file exists and is a directory", `-e` is "file exists"
  - I mean in the previous exercise in the conditional. I could put any string as a directory, even if it does not exist, and the code would say that it exists
    - yeah, true. In that case `-d` would be better.
      - Alright thanks
  - `[[ -n string ]]` checks that the string is not empty; it does not check whether the directory exists, but it can be helpful for checking a variable or a string

:::success
### Break (10m) and then Exercise (15m) until ~xx:32
### [15 minutes: Conditionals: matching operator]
`tarit.sh`
* validate the given directory path; the path may contain only alphanumeric symbols, dots, underscores and slashes as a directory delimiter.
  Hint: `[[ $d =~ regexpr ]]`
:::

**Progress:** *Mark your answer with an `o` here below*
Done: ooo
Still working on it: oo
Not doing it: ooo

:::success
## Exercise until ~xx:01 (+ break until xx:10)
### [10 minutes: Arithmetics]
* make a script that uses the `read` command to get two integers, then compares them and prints the smaller one. If the numbers are equal, then the script prints that they are equal
* (*) add a check that the given input is an integer
:::

**Progress:** *Mark your answer with an `o` here below*
Done: oooo
Still working on it: o
Not doing it: o

- Can you please explain again what +$ at the end of the regex means?
  - `$` - end of string. It means that when matching, the stuff right before `$` has to be right before the end of the string.
  - `+` means "the token before must occur one or more times". It's unrelated to the `$`
  - for example `a+$` would mean "a, one or more times, right before the end of the string".
    - Okay thank you

:::success
### Exercise
### [10 minutes: For loop] ~xx:38
* Create a number of dummy empty files with `touch {1..5}.TXT`. Make a script that renames all `*.TXT` files to `*.txt`. Hint: for renaming, use the `mv` and `${...}` combo.
:::

**Progress:** *Mark your answer with an `o` here below*
Done: ooo
Still working on it: o
Not doing it: o

:::success
### Exercise till xx:07
### [15 minutes: While loop]
* Use `while.sh` as an example, use the same students.csv file and count the total number of students. Hint: one must omit comment lines, empty lines and the header; here you can either implement `if [[ "$f1" =~ ^[0-9\"]+$ ]] ...` or a process substitution `< <(grep -v -E '^$|^=' "$file")` and then sum over the field with the total number of students per Univ.
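
One way the counting loop could look, as a sketch under stated assumptions: the real `students.csv` is not shown in this document, so the sample file below (comment lines starting with `=`, one header row, three comma-separated fields with the student count last) is invented for illustration, and `count_students` is a hypothetical function name.

```bash
#!/bin/bash
# Sketch only: the sample data and its column layout are invented;
# the real students.csv from the course materials may look different.

count_students() {
  local file=$1 total=0 f1 univ n
  local re='^[0-9]+$'
  # The process substitution drops empty lines and lines starting with '='
  # before the loop sees them, as in the hint above.
  while IFS=, read -r f1 univ n; do
    # Keep only rows whose first field is a number (this skips the header).
    [[ $f1 =~ $re ]] || continue
    (( total += n ))
  done < <(grep -v -E '^$|^=' "$file")
  echo "$total"
}

# Hypothetical sample input, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
= invented comment line
id,university,students

1,Univ A,120
2,Univ B,80
EOF

count_students "$sample"   # prints 200 for the invented sample
rm -f "$sample"
```

Reading from `< <(...)` rather than piping `grep` into `while` keeps the loop in the current shell, so `total` is still set after the loop ends.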
:::

That is the last part of Day 2

## Feedback for today

*Please let us know what went well, what can be improved, and any other comment you might have on the course so far or the course materials*

- The pace was very nice for me
- Could have used a slightly slower pace with the lectures. But I guess that would mean covering fewer topics, which would not be good either.
- ...
- ...
-

----------------------------------
Please always write new questions at the end of the document, above this line here ^^.