Workshop Details
Dates: September 6th - 13th, 2022
Time: 9am - 12pm
Workshop Agenda:
https://ucsdlib.github.io/2022-09-06-carpentries-uc/
Software Installation:
Git for Windows (required for the Unix Shell and Git lessons)
Windows: https://gitforwindows.org
Mac users will use the built-in Terminal application
Recommended text editors:
Nano
Lesson Data (download)
What is the shell?
Intro slides: https://docs.google.com/presentation/d/1JvAk7a-SgYUN4y8xXtQKB2Z2iwup0hEfJZmsZ-xlJ5Q/edit?usp=sharing
Navigating the filesystem
Webpage for help using the shell on Windows machines: http://man.he.net/
Working with files and directories
Automating the tedious with loops
More about shell scripts: https://ryanstutorials.net/bash-scripting-tutorial/
Counting and mining with the shell
Regular Expressions resources:
LC regex cheatsheet: https://librarycarpentry.org/lc-data-intro/reference.html
regex101: https://regex101.com/
regexper: https://regexper.com/
Regex Testing: https://www.regextester.com/
Name (first & last) | Organization | Dept. | Email |
---|---|---|---|
(example) Jane Doe | UCSD | IT | jdoe1@ucsd.edu |
Tom Le | UCM | | tle267@ucmerced.edu |
John Thompson | UC Merced | Cell & Molecular Biology | jthompson44@ucmerced.edu |
Sam Erickson | UC Merced | Physics | serickson3@ucmerced.edu |
Sarina Qin | UC Merced | Nat Sci | sqin@ucmerced.edu |
Douglas Zhang | UCSD | Chemistry and Biochemistry | doz023@ucsd.edu |
Kenan Chan | UCSD | SIO | kmc001@ucsd.edu |
Bruce Hamilton | UCSD | CMM | bah@health.ucsd.edu |
Jonathan Le | UCR | Mathematics | jle173@ucr.edu |
Amber Heidbrink | UCSD | Cell and Developmental Biology | aheidbrink@ucsd.edu |
Dayana Elizalde | UCR | | deliz002@ucr.edu |
Shang Su | U Toledo | Cell and Cancer Biology | Shang.su@utoledo.edu |
Christopher Gray | UCR | Computer Science | cgray024@ucr.edu |
Oishee Misra | UCSD | Economics | omisra@ucsd.edu |
Ha Vu | UCSD | Economics | vha@ucsd.edu |
Michelle Wu | UCLA | Neuroscience | mwwu@ucla.edu |
Jay Chi | UCSB | ETS | jaychi@ucsb.edu |
Osika Tripathi | UCSD | Public Health | otripathi@health.ucsd.edu |
Kazuma Nagatsuka | UCSD | Robotics (Mechanical Engineering) | knagatsuka@ucsd.edu |
Melodi Frey | UCSD | CMM | mtastemel@health.ucsd.edu |
Matthew Falcone | UCB | Civil and Environmental Engineering | matthew_falcone@berkeley.edu |
Ivan Felix Rios | UCSD | Mathematics & Economics | ifelixrios@ucsd.edu |
Brett Taylor | UCSD | Biomedical Sciences | b5taylor@ucsd.edu |
Agnieszka Pluta | UCLA | Psychology | agpluta@ucla.edu |
Jun Tan | UCSD | Economics | j4tan@ucsd.edu |
Roberto Silva | UCSD | Scripps | rosilva@ucsd.edu |
Vishakha Malhotra | UCSF | Biostatistics and Epidemiology | vishakha.malhotra@ucsf.edu |
Lillie Pennington | UC Merced | Life and Environmental Sciences | lpennington@ucmerced.edu |
Alexander Frey | UCSD | Rady School of Management | alexander.frey@rady.ucsd.edu |
Cherie Thompson | UCSD Library | | |
Dexin Zhou | UCSD | Mathematics | dzhou@ucsd.edu |
Nicole Rosenberg | UCSD | Scripps | nrosenberg@ucsd.edu |
Mugen Blue | UCM | EECS | mblue3@ucmerced.edu |
Haley Potts | UCSD | Math & Economics | hpotts@ucsd.edu |
Aleks Leszczynska | UCSD | Pediatrics | aleszczynska@ucsd.edu |
Junxiao Gao | UCSF | Biostatistics and Epidemiology | Junxiao.Gao@ucsf.edu |
Steven Krehel | UCSD | Economics | skrehel@ucsd.edu |
Zhaoning (Johnny) Wang | UCSD | CMM | zhw063@health.ucsd.edu |
Daryl Han | UC Irvine | Student Center and Event Services | ddhan@uci.edu |
Josiah Piceno | UCM | MBSE | jpiceno3@ucmerced.edu |
Apisit Kaewsanit | UCSF | Epidemiology and Biostatistics | apisit.kaewsanit@ucsf.edu |
Jacob Ross | UCSD | Anesthesiology | jaross@ucsd.edu |
Jay Colond | UCM | Sociology | jcolond@ucmerced.edu |
Anshika Kandhway | UCM | Environmental System | akandhway@ucmerced.edu |
Dilawer Ali | UCM | Mechanical Engineering | dali4@ |
Bineh Ndefru | UCLA | Materials Science | bndefru@ucla.edu |
Mario Cuaya | UCR | Computer Science | mcuay001@ucr.edu |
Andrew Chan | UCSD | IGPP | andrewchan@ucsd.edu |
Andrew Gorin | UC Berkeley | Earth and Planetary Science | andrew_gorin@berkeley.edu |
Donald Zarate | UCR | Political Science and Psychology | dzara016@ucr.edu |
Jessica Wu-Woods | UCR | Microbiology | jwuw001@ucr.edu |
Copying a file
Instead of moving a file, you might want to copy a file (make a duplicate), for instance to make a backup before modifying a file. Just like the mv command, the cp command takes two arguments: the old name and the new name. How would you make a copy of the file gulliver.txt called gulliver-backup.txt? Try it!
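The cp syntax described above can be sketched as follows. The file names here (notes.txt, notes-backup.txt) are hypothetical and are not part of the lesson data, and the commands run in a throwaway directory so nothing real is touched.

```shell
# Work in a temporary directory so no real files are affected.
workdir=$(mktemp -d)
cd "$workdir"

# Create a small hypothetical file to copy.
echo "first draft" > notes.txt

# cp takes the existing name first, then the name of the copy.
cp notes.txt notes-backup.txt

# Both files now exist, and the copy has the same contents.
ls
cat notes-backup.txt
```

Unlike mv, cp leaves the original file in place, which is what makes it suitable for backups.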
Renaming a directory
Renaming a directory works in the same way as renaming a file. Try using the mv command to rename the firstdir directory to backup.
Moving a file into a directory
If the last argument you give to the mv command is a directory, not a file, the file given in the first argument will be moved to that directory. Try using the mv command to move the file gulliver-backup.txt into the backup folder.
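Both uses of mv (renaming a directory, and moving a file into a directory) can be sketched with hypothetical names. The directory names drafts and archive and the file notes-backup.txt are assumptions for illustration, not the exercise files.

```shell
# Work in a temporary directory so no real files are affected.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical starting point: one directory and one file.
mkdir drafts
echo "first draft" > notes-backup.txt

# Renaming a directory: old name first, new name second.
mv drafts archive

# Moving a file into a directory: the last argument is an existing directory.
mv notes-backup.txt archive/

ls archive
```

The trailing slash on archive/ is optional, but it makes the intent clear: if the directory did not exist, mv would report an error instead of renaming the file to "archive".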
For loop exercise
Complete the blanks in the for loop below to print the name, first line, and last line of each text file in the current directory.
___ file in *.txt
___
echo "_file"
head -n 1 _____
____ __ _ _____
___
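For comparison, here is a minimal sketch of a complete for loop that does a different job from the exercise: it prints the name and line count of each .csv file. The files one.csv and two.csv and the variable name are hypothetical.

```shell
# Work in a temporary directory with two small hypothetical files.
workdir=$(mktemp -d)
cd "$workdir"
printf 'a\nb\n' > one.csv
printf 'c\n' > two.csv

# The loop runs once per matching file; $name holds the current filename.
for name in *.csv
do
    echo "$name"
    # Redirecting with < makes wc print only the number, not the filename.
    wc -l < "$name"
done
```

The loop variable is set without a dollar sign (name) and read with one ("$name"); quoting it keeps filenames with spaces intact.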
Count, sort and print (faded example)
To count the lines in every .tsv file, sort the results, and then print the first line of the sorted output, we use the following:
wc -l *.tsv | sort -n | head -n 1
Now let’s change the scenario. We want to know the 10 files that contain the most words. Fill in the blanks below to count the words for each file, put them into order, and then make an output of the 10 files with the most words (Hint: The sort command sorts in ascending order by default).
__ -w *.tsv | sort __ | ______
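The effect of sort's ordering flags can be seen on a plain list of numbers before applying them to file counts. This is a sketch with a hypothetical file counts.txt, not the lesson data.

```shell
# Work in a temporary directory with three hypothetical counts.
workdir=$(mktemp -d)
cd "$workdir"
printf '3\n10\n2\n' > counts.txt

# -n sorts numerically in ascending order: 2, 3, 10.
sort -n counts.txt

# Adding -r reverses the order: 10, 3, 2.
sort -n -r counts.txt

# head keeps only the first lines of what is piped in; here, the two largest.
sort -n -r counts.txt | head -n 2
```

Without -n, sort compares text rather than numbers, so 10 would be placed before 2.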
Counting number of files
Let’s make a different pipeline. You want to find out how many files and directories there are in the current directory. Try to see if you can pipe the output from ls into wc to find the answer.
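One way to build such a pipeline, sketched in a throwaway directory with hypothetical contents (two files and one subdirectory):

```shell
# Work in a temporary directory with a known number of entries.
workdir=$(mktemp -d)
cd "$workdir"
touch a.txt b.txt
mkdir subdir

# When its output goes to a pipe, ls prints one entry per line,
# so wc -l counts the entries. Hidden files are not included.
ls | wc -l
```

Here the count is 3: two files plus one directory.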
Counting the number of words
Check the manual for the wc command (either using man wc or wc --help) to see if you can find out what flag to use to print out the number of words (but not the number of lines and bytes). Try it with the .tsv files.
If you have time, you can also try sorting the results by piping them to sort, and explore the other flags of wc.
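A minimal sketch of counting words only, using two hypothetical files (short.tsv and long.tsv) rather than the lesson data:

```shell
# Work in a temporary directory with two small hypothetical files.
workdir=$(mktemp -d)
cd "$workdir"
printf 'one two three\n' > short.tsv
printf 'one two three four five\nsix\n' > long.tsv

# -w prints only the word count for each file, followed by a total line.
wc -w *.tsv

# Piping to sort -n orders the counts from smallest to largest.
wc -w *.tsv | sort -n
```

Because the total is the largest number, it ends up on the last line after sorting.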
Case-sensitive search in select files
Search for all case-sensitive instances of a word you choose in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to the shell.
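A sketch of a case-sensitive search across more than one file. The files sample-a.tsv and sample-b.tsv and the search word are hypothetical stand-ins for the lesson data.

```shell
# Work in a temporary directory with two small hypothetical files.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Star of the north\nA rising star\n' > sample-a.tsv
printf 'No match here\nstar chart\n' > sample-b.tsv

# grep is case sensitive by default: "star" does not match "Star".
# With more than one file, each matching line is prefixed by its filename.
grep star sample-a.tsv sample-b.tsv
```

Two lines match here, one from each file; the line beginning with "Star" is skipped.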
Count words (case-sensitive)
Count all case-sensitive instances of a word you choose in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to the shell.
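When counting, it helps to know that grep -c counts matching lines, not individual occurrences. The sketch below, using a hypothetical file sample.tsv, shows the difference and one way to count every occurrence.

```shell
# Work in a temporary directory with one small hypothetical file.
workdir=$(mktemp -d)
cd "$workdir"
printf 'star star\nStar\nno match\n' > sample.tsv

# -c counts matching lines: only one line contains "star".
grep -c star sample.tsv

# -o prints each match on its own line, so wc -l counts occurrences: two.
grep -o star sample.tsv | wc -l
```

For data with one record per line, such as the lesson's .tsv files, the line count from -c is usually the number of interest.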
Case-insensitive search in select files (whole word)
Search for all case-insensitive instances of that whole word in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to a file results/hero-i.tsv.
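A sketch of a case-insensitive, whole-word search whose output is redirected to a file. The file sample.tsv, the search word, and the output name results/star-i.tsv are hypothetical. Note that the results directory must exist before redirecting into it.

```shell
# Work in a temporary directory with one small hypothetical file.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Star of the north\nstarting over\nA rising STAR\n' > sample.tsv

# The output directory has to exist, or the redirect fails.
mkdir results

# -i ignores case; -w matches whole words only, so "starting" is skipped.
# > sends the output to a file instead of the screen.
grep -i -w star sample.tsv > results/star-i.tsv

cat results/star-i.tsv
```

Using > overwrites the output file each time; >> would append to it instead.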
Please enter how you will use the Unix Shell in your work or research here:
Link to survey for today: https://forms.gle/5qgx8X6H3GRMacwD6
Please enter any questions not answered during live session here:
1.