2022 UC Carpentries Fall Workshop (The Unix Shell)

Workshop Details
Dates: September 6th - 13th, 2022
Time: 9am - 12pm

Workshop Agenda:
https://ucsdlib.github.io/2022-09-06-carpentries-uc/

Day 4: The Unix Shell

Software Installation:
Windows: Git for Windows (https://gitforwindows.org)

Mac: users will use the built-in Terminal application
(required for the Unix Shell and Git lessons)

Recommended text editor:
Nano

Lesson Data (download)

NOTES:

What is the shell?
Intro slides: https://docs.google.com/presentation/d/1JvAk7a-SgYUN4y8xXtQKB2Z2iwup0hEfJZmsZ-xlJ5Q/edit?usp=sharing

Navigating the filesystem
Webpage for help using the shell on Windows machines: http://man.he.net/

Working with files and directories

Automating the tedious with loops
More about shell scripts: https://ryanstutorials.net/bash-scripting-tutorial/

Counting and mining with the shell
Regular Expressions resources:
LC regex cheatsheet: https://librarycarpentry.org/lc-data-intro/reference.html
regex101: https://regex101.com/
regexper: https://regexper.com/
Regex Testing: https://www.regextester.com/

Workshop Day 4

First name and Last Name/Organization/Dept./Email

Name (first & last) Organization Dept. Email
(example) Jane Doe UCSD IT jdoe1@ucsd.edu
Tom Le UCM tle267@ucmerced.edu
John Thompson UC Merced Cell & Molecular Biology jthompson44@ucmerced.edu
Sam Erickson UC Merced Physics serickson3@ucmerced.edu
Sarina Qin UC Merced Nat Sci sqin@ucmerced.edu
Douglas Zhang UCSD Chemistry and Biochemistry doz023@ucsd.edu
Kenan Chan UCSD SIO kmc001@ucsd.edu
Bruce Hamilton UCSD CMM bah@health.ucsd.edu
Jonathan Le UCR Mathematics jle173@ucr.edu
Amber Heidbrink UCSD Cell and Developmental Biology aheidbrink@ucsd.edu
Dayana Elizalde UCR deliz002@ucr.edu
Shang Su U Toledo Cell and Cancer Biology Shang.su@utoledo.edu
Christopher Gray UCR Computer Science cgray024@ucr.edu
Oishee Misra UCSD Economics omisra@ucsd.edu
Ha Vu UCSD Economics vha@ucsd.edu
Michelle Wu UCLA Neuroscience mwwu@ucla.edu
Jay Chi UCSB ETS jaychi@ucsb.edu
Osika Tripathi UCSD Public Health otripathi@health.ucsd.edu
Kazuma Nagatsuka UCSD Robotics(Mechanical Engineering) knagatsuka@ucsd.edu
Melodi Frey UCSD CMM mtastemel@health.ucsd.edu
Matthew Falcone UCB Civil and Environmental Engineering matthew_falcone@berkeley.edu
Ivan Felix Rios UCSD Mathematics & Economics ifelixrios@ucsd.edu
Brett Taylor UCSD Biomedical Sciences b5taylor@ucsd.edu
Agnieszka Pluta UCLA Psychology agpluta@ucla.edu
Jun Tan UCSD Economics j4tan@ucsd.edu
Roberto Silva UCSD Scripps rosilva@ucsd.edu
Vishakha Malhotra UCSF Biostatistics and Epidemiology vishakha.malhotra@ucsf.edu
Lillie Pennington UC Merced Life and Environmental Sciences lpennington@ucmerced.edu
Alexander Frey UCSD Rady School of Management alexander.frey@rady.ucsd.edu
Cherie Thompson UCSD Library
Dexin Zhou UCSD Mathematics dzhou@ucsd.edu
Nicole Rosenberg UCSD Scripps nrosenberg@ucsd.edu
Mugen Blue UCM EECS mblue3@ucmerced.edu
Haley Potts UCSD Math & Economics hpotts@ucsd.edu
Aleks Leszczynska UCSD Pediatrics aleszczynska@ucsd.edu
Junxiao Gao UCSF Biostatistics and Epidemiology Junxiao.Gao@ucsf.edu
Steven Krehel UCSD Economics skrehel@ucsd.edu
Zhaoning (Johnny) Wang UCSD CMM zhw063@health.ucsd.edu
Daryl Han UC Irvine Student Center and Event Services ddhan@uci.edu
Josiah Piceno UCM MBSE jpiceno3@ucmerced.edu
Apisit Kaewsanit UCSF Epidemiology and Biostatistics apisit.kaewsanit@ucsf.edu
Jacob Ross UCSD Anesthesiology jaross@ucsd.edu
Jay Colond UCM Sociology jcolond@ucmerced.edu
Anshika Kandhway UCM Environmental System akandhway@ucmerced.edu
Dilawer Ali UCM Mechanical Engineering dali4@
Bineh Ndefru UCLA Materials Science bndefru@ucla.edu
Mario Cuaya UCR Computer Science mcuay001@ucr.edu
Andrew Chan UCSD IGPP andrewchan@ucsd.edu
Andrew Gorin UC Berkeley Earth and Planetary Science andrew_gorin@berkeley.edu
Donald Zarate UCR Political Science and Psychology dzara016@ucr.edu
Jessica Wu-Woods UCR Microbiology jwuw001@ucr.edu

Day 4 Exercises

  1. Copying a file
    Instead of moving a file, you might want to copy a file (make a duplicate), for instance to make a backup before modifying a file. Just like the mv command, the cp command takes two arguments: the old name and the new name. How would you make a copy of the file gulliver.txt called gulliver-backup.txt? Try it!

  2. Renaming a directory
    Renaming a directory works in the same way as renaming a file. Try using the mv command to rename the firstdir directory to backup.

  3. Moving a file into a directory
    If the last argument you give to the mv command is a directory, not a file, the file given in the first argument will be moved to that directory. Try using the mv command to move the file gulliver-backup.txt into the backup folder.
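
One possible way to work through exercises 1 to 3 is sketched below. It runs in a throwaway directory, and the `gulliver.txt` and `firstdir` it creates there are dummy stand-ins for the lesson data (an assumption made so the example is self-contained).

```shell
# Work in a scratch directory so no real files are touched.
# Assumption: gulliver.txt and firstdir are dummy stand-ins for the lesson data.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Travels into several remote nations\n' > gulliver.txt
mkdir firstdir

# Exercise 1: cp takes the old name, then the new name.
cp gulliver.txt gulliver-backup.txt

# Exercise 2: mv renames a directory the same way it renames a file.
mv firstdir backup

# Exercise 3: when the last argument is a directory, mv moves the file into it.
mv gulliver-backup.txt backup/

# The original file is still in place; the copy now lives in backup/.
ls
ls backup
```

In the real lesson directory only the `cp` and `mv` lines are needed.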

  4. For loop exercise
    Complete the blanks in the for loop below to print the name, first line, and last line of each text file in the current directory.

___ file in *.txt
___
    echo "_file"
    head -n 1 _____
    ____ __ _ _____
___
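
One way the blanks above could be filled in is sketched here. The two `.txt` files are hypothetical samples created only so the loop has something to read.

```shell
# Scratch directory with two hypothetical sample files.
workdir=$(mktemp -d)
cd "$workdir"
printf 'first line of a\nmiddle line\nlast line of a\n' > a.txt
printf 'first line of b\nlast line of b\n' > b.txt

# For each .txt file: print its name, its first line, and its last line.
for file in *.txt
do
    echo "$file"
    head -n 1 "$file"
    tail -n 1 "$file"
done
```

Quoting `"$file"` is not part of the exercise template, but it keeps the loop working when a file name contains spaces.
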
  1. Count, sort and print (faded example)
    To count the lines in every .tsv file, sort the results, and then print the first line of the sorted output, we use the following:

    wc -l *.tsv | sort -n | head -n 1

    Now let’s change the scenario. We want to know the 10 files that contain the most words. Fill in the blanks below to count the words for each file, put them into order, and then make an output of the 10 files with the most words (Hint: The sort command sorts in ascending order by default).

    __ -w *.tsv | sort __ | ______
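
A minimal sketch of one possible answer, scaled down to three hypothetical `.tsv` files so the result is easy to check by eye. When `wc` is given several files it appends a summary line for all of them, so asking `tail` for one extra line is a common way to see the N largest files plus that summary; for the ten largest files this would be `tail -n 11`.

```shell
# Scratch directory with three hypothetical files of 1, 3 and 6 words.
workdir=$(mktemp -d)
cd "$workdir"
printf 'one\n' > small.tsv
printf 'one two three\n' > medium.tsv
printf 'one two three four five six\n' > large.tsv

# Count words per file, sort ascending by count, and keep the end of the list.
# The last three lines are the two largest files plus wc's summary line.
top=$(wc -w *.tsv | sort -n | tail -n 3)
echo "$top"
```

Because `sort -n` is ascending, the largest counts end up at the bottom, which is why `tail` is used here rather than `head`.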

  2. Counting number of files
    Let’s make a different pipeline. You want to find out how many files and directories there are in the current directory. Try to see if you can pipe the output from ls into wc to find the answer.
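
One possible pipeline is sketched below, using a scratch directory with two hypothetical files and one directory. It relies on `ls` writing one entry per line when its output goes to a pipe rather than a terminal.

```shell
# Scratch directory containing two files and one directory (hypothetical names).
workdir=$(mktemp -d)
cd "$workdir"
touch notes.txt data.tsv
mkdir results

# ls lists one entry per line when piped, so counting lines counts
# files and directories together (hidden entries are not included).
count=$(ls | wc -l)
echo "$count"
```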

  3. Counting the number of words
    Check the manual for the wc command (either using man wc or wc --help) to see if you can find out what flag to use to print out the number of words (but not the number of lines and bytes). Try it with the .tsv files.
    If you have time, you can also try to sort the results by piping them to sort, and/or explore the other flags of wc.

  4. Case-sensitive search in select files
    Search for all case-sensitive instances of a word you choose in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to the shell.

  5. Count words (case-sensitive)
    Count all case-sensitive instances of a word you choose in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to the shell.

  6. Case-insensitive search in select files (whole word)
    Search for all case-insensitive instances of that whole word in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to a file results/hero-i.tsv.
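
The three search exercises above could be approached along these lines. This is a sketch under assumptions: `africa.tsv` and `america.tsv` are hypothetical stand-ins for the lesson's ‘Africa’ and ‘America’ files, and `hero` is the chosen word, taken from the output file name in the last exercise.

```shell
# Scratch directory with two hypothetical data files and a results folder.
workdir=$(mktemp -d)
cd "$workdir"
mkdir results
printf 'A hero rises\nHeroes and heroism\n' > africa.tsv
printf 'The Hero returns\nno match on this line\n' > america.tsv

# Case-sensitive search: grep is case sensitive by default.
grep hero africa.tsv america.tsv

# Counting: -c reports the number of matching lines per file
# (lines, not individual occurrences of the word).
grep -c hero africa.tsv america.tsv

# Case-insensitive, whole-word search: -i ignores case, -w matches whole
# words only; > sends the matches to a file instead of the shell.
grep -iw hero africa.tsv america.tsv > results/hero-i.tsv
```

Note that `grep -c` counts lines, so a line containing the word twice still adds only one to the count.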

Day 4 Reflection

Please enter how you will use the Unix Shell in your work or research here:

  1. As a sysadmin, I use the shell every day with SSH. Move and migrate data all the time with tools like rsync. Getting to the point where I prefer to manage files/folders with shell rather than macOS GUI. Many of the commands from the workshop were familiar, but definitely found some very helpful additional commands to help manage my systems more efficiently (i.e. CTRL-r). Thanks!

Day 4 Survey

Link to survey for today: https://forms.gle/5qgx8X6H3GRMacwD6

Day 4 Questions

Please enter any questions not answered during live session here:
1.

End Day 4