# 2023-05-09-NCL Unix Shell
### This document:
# https://hackmd.io/@rseteam-ncl/2023-05-09-NCL
### JupyterHub:
# https://jupyter.ncldata.dev
**We asked for your uni login when you registered. Please use that to log in; it is <span style="color:red;">case sensitive</span>. You don't have a password yet. The password you enter the first time you log in will become your password.**
### Introduction ###
- Instructors
- Helpers
- What is The Carpentries
- What is the RSE Team
- Coffee breaks and lunch
  - Morning break: 11:00
  - Lunch: 13:00
  - Afternoon break: 15:30
### Links:
- [Code of Conduct](https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html)
- [Workshop website](https://nclrse-training.github.io/2023-05-09-NCL/)
- [Link to lesson website](https://swcarpentry.github.io/shell-novice/)
- [Link to data](https://swcarpentry.github.io/shell-novice/data/shell-lesson-data.zip)
- [Pre-workshop survey](https://carpentries.typeform.com/to/wi32rS?slug=2023-05-09-NCL)
- [Post-workshop survey](https://carpentries.typeform.com/to/UgVdRQ?slug=2023-05-09-NCL)
### Attendance:
## Please sign in using your <span style="color:red">university email</span> and your <span style="color:red">name</span>:
1. jannetta.steyn@newcastle.ac.uk, Jannetta Steyn
2. richard.noble@newcastle.ac.uk, Richard Noble
3. s.lane4@newcastle.ac.uk, Stephen Lane
4. S.Saleem3@newcastle.ac.uk
5. k.p.lendoye-l'eyebe2@newcastle.ac.uk, Knectt Lendoye
6. l.a.bruce@newcastle.ac.uk, Lawrence Bruce
7. a.bell16@newcastle.ac.uk, Alexander Bell
8. M.Deivarajan-Suresh2@newcastle.ac.uk, Mukilan Suresh
9. bethany.little@ncl.ac.uk, Beth Little
10. b9036125@newcastle.ac.uk, Matthaios Charidis
11. c0034294@newcastle.ac.uk, Xudong Mao
### Notes:
## Episode 4 Exercise 5
Pipe Reading Comprehension
A file called `animals.csv` (in the `shell-lesson-data/exercise-data/animal-counts` folder) contains the following data:
```
2012-11-05,deer,5
2012-11-05,rabbit,22
2012-11-05,raccoon,7
2012-11-06,rabbit,19
2012-11-06,deer,2
2012-11-06,fox,4
2012-11-07,rabbit,16
2012-11-07,bear,1
```
What text passes through each of the pipes and the final redirect in the pipeline below?
Note: The `sort -r` command sorts in reverse order.
`$ cat animals.csv | head -n 5 | tail -n 3 | sort -r > final.txt`
Hint: Build the pipeline up one command at a time to test your understanding.
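One way to follow the hint is sketched below. It assumes a scratch directory made with `mktemp -d` (so nothing in `shell-lesson-data` is touched) and recreates the sample data shown above; the `workdir` variable name is only illustrative.

```bash
# Work in a scratch directory (an assumption of this sketch, not part of the lesson).
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the sample data from the exercise.
cat > animals.csv << 'EOF'
2012-11-05,deer,5
2012-11-05,rabbit,22
2012-11-05,raccoon,7
2012-11-06,rabbit,19
2012-11-06,deer,2
2012-11-06,fox,4
2012-11-07,rabbit,16
2012-11-07,bear,1
EOF

# Add one stage at a time and look at what comes out of each pipe.
echo "--- after head -n 5 ---"
cat animals.csv | head -n 5

echo "--- after tail -n 3 ---"
cat animals.csv | head -n 5 | tail -n 3

echo "--- contents of final.txt ---"
cat animals.csv | head -n 5 | tail -n 3 | sort -r > final.txt
cat final.txt
```

Comparing each printed stage with your own prediction shows exactly where your understanding and the shell's behaviour differ.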
## Episode 4 Exercise 6
Pipe Construction
For the file `animals.csv` from the previous exercise, consider the following command:
`$ cut -d , -f 2 animals.csv`
The `cut` command is used to remove or ‘cut out’ certain sections of each line in the file. By default, `cut` expects the lines to be separated into columns by a Tab character. A character used in this way is called a delimiter. In the example above we use the `-d` option to specify the comma as our delimiter character. We have also used the `-f` option to specify that we want to extract the second field (column). This gives the following output:
**OUTPUT**
```
deer
rabbit
raccoon
rabbit
deer
fox
rabbit
bear
```
The `uniq` command filters out adjacent matching lines in a file. How could you extend this pipeline (using `uniq` and another command) to find out what animals the file contains (without any duplicates in their names)?
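The word "adjacent" matters here. The sketch below uses a tiny made-up list (not the lesson data, and `sample.txt` is a hypothetical file name) to show what `uniq` does when the same line appears in two separate places.

```bash
# Scratch directory so no lesson files are changed (an assumption of this sketch).
workdir=$(mktemp -d)
cd "$workdir"

# Four lines: "rabbit" repeats on neighbouring lines, "deer" repeats far apart.
printf 'deer\nrabbit\nrabbit\ndeer\n' > sample.txt

# uniq only collapses repeats that sit directly next to each other.
collapsed=$(uniq sample.txt)
echo "$collapsed"
```

The neighbouring `rabbit` lines collapse into one, while `deer` still appears twice because its two copies are not next to each other.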
## Episode 4 Exercise 7
Which Pipe?
The file `animals.csv` contains 8 lines of data formatted as follows:
```
2012-11-05,deer,5
2012-11-05,rabbit,22
2012-11-05,raccoon,7
2012-11-06,rabbit,19
```
The `uniq` command has a `-c` option which gives a count of the number of times a line occurs in its input. Assuming your current directory is `shell-lesson-data/exercise-data/animal-counts`, what command would you use to produce a table that shows the total count of each type of animal in the file?
```
sort animals.csv | uniq -c
sort -t, -k2,2 animals.csv | uniq -c
cut -d, -f 2 animals.csv | uniq -c
cut -d, -f 2 animals.csv | sort | uniq -c
cut -d, -f 2 animals.csv | sort | uniq -c | wc -l
```
## Episode 5 Exercise 3
Limiting Sets of Files
What would be the output of running the following loop in the `shell-lesson-data/molecules` directory?
```
> for filename in c*
> do
> ls $filename
> done
```
1. No files are listed.
1. All files are listed.
1. Only `cubane.pdb`, `octane.pdb` and `pentane.pdb` are listed.
1. Only `cubane.pdb` is listed.
For reference, the directory contains these files:
```
cubane.pdb lengths.txt octane.pdb propane.pdb
ethane.pdb methane.pdb pentane.pdb sorted-lengths.txt
```
How would the output differ from using this command instead?
```
> for filename in *c*
> do
> ls $filename
> done
```
1. The same files would be listed.
1. All the files are listed this time.
1. No files are listed this time.
1. The files `cubane.pdb` and `octane.pdb` will be listed.
1. Only the file `octane.pdb` will be listed.
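A quick way to check a wildcard before putting it in a loop is to `echo` it, since the shell expands the pattern before the command runs. The sketch below assumes a scratch directory populated with empty files named like the ones in the listing above; the variable names are illustrative only.

```bash
# Scratch directory with empty stand-ins for the lesson files (an assumption of this sketch).
workdir=$(mktemp -d)
cd "$workdir"
touch cubane.pdb ethane.pdb lengths.txt methane.pdb \
      octane.pdb pentane.pdb propane.pdb sorted-lengths.txt

# The shell replaces each pattern with the matching file names before echo sees it.
starts_with_c=$(echo c*)
contains_c=$(echo *c*)

echo "c*  expands to: $starts_with_c"
echo "*c* expands to: $contains_c"
```

Whatever `echo` prints for a pattern is exactly the list of values the loop variable would take.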
## Nested loop
```
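# Outer loop: every entry in the current directory
for fn1 in `ls`
do
    echo ----
    echo $fn1
    echo ----
    # Inner loop: every entry inside $fn1 (or $fn1 itself if it is a file)
    for fn2 in `ls $fn1`
    do
        echo $fn2
    done
done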
```
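Looping over the output of `ls` splits names at spaces, so a directory called `my data` would be treated as two separate words. A possible variant using wildcards and quoted variables is sketched below. It is not the version written during the session, the directory and file names are made up for illustration, and unlike the loop above it prints nothing under a plain file.

```bash
# Scratch area with made-up names (hypothetical, not part of the lesson data).
workdir=$(mktemp -d)
mkdir -p "$workdir/demo/my data" "$workdir/demo/notes"
touch "$workdir/demo/my data/a.txt" "$workdir/demo/my data/b.txt" "$workdir/demo/notes/c.txt"
cd "$workdir/demo"

for fn1 in *
do
    echo ----
    echo "$fn1"
    echo ----
    for fn2 in "$fn1"/*
    do
        # An unmatched pattern is left as literal text, so skip anything that does not exist.
        [ -e "$fn2" ] || continue
        # ${fn2##*/} drops everything up to the last slash, leaving the bare name.
        echo "${fn2##*/}"
    done
done > "$workdir/listing.txt"

cat "$workdir/listing.txt"
```

The output is written to a file outside `demo` so that the listing itself does not show up as one of the entries being looped over.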
## 7.2 Tracking a Species
Leah has several hundred data files saved in one directory, each of which is formatted like this:
```
2013-11-05,deer,5
2013-11-05,rabbit,22
2013-11-05,raccoon,7
2013-11-06,rabbit,19
2013-11-06,deer,2
```
She wants to write a shell script that takes a species as the first command-line argument and a directory as the second argument. The script should return one file called `<species>.txt` containing a list of dates and the number of that species seen on each date. For example, using the data shown above, `rabbit.txt` would contain:
```
2013-11-05,22
2013-11-06,19
```
Put these commands and pipes in the right order to achieve this:
```
cut -d : -f 2
>
|
grep -w $1 -r $2
|
$1.txt
cut -d , -f 1,3
```
Hint: Use `man grep` to look for how to grep text recursively in a directory and `man cut` to select more than one field in a line.
An example of such a file is provided in `shell-lesson-data/exercise-data/animal-counts/animals.csv`.
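Before ordering the pieces, it can help to see what two of them produce on their own. The sketch below assumes a scratch directory holding one made-up data file (`counts/site-a.csv` is a hypothetical name) and does not assemble the full pipeline.

```bash
# Scratch directory with one sample file (names are assumptions of this sketch).
workdir=$(mktemp -d)
cd "$workdir"
mkdir counts
cat > counts/site-a.csv << 'EOF'
2013-11-05,deer,5
2013-11-05,rabbit,22
2013-11-05,raccoon,7
2013-11-06,rabbit,19
2013-11-06,deer,2
EOF

# -r searches every file under the directory; -w matches whole words only.
# Each matching line is printed with the file name and a colon in front of it.
matches=$(grep -w rabbit -r counts)
echo "$matches"

# -f accepts a comma-separated list, so several fields can be kept at once.
fields=$(echo '2013-11-05,rabbit,22' | cut -d , -f 1,3)
echo "$fields"
```

Noticing the file name prefix that the recursive search adds is a useful clue to why one of the `cut` commands in the puzzle uses `:` as its delimiter.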