UCSD Carpentries Bootcamp - Python & Git (June 2022)

---
tags: ucsd-carpentries
---

# UCSD Carpentries Bootcamp - Python & Git (June 2022)

**Workshop Details**

Dates: June 6th-9th, 2022
Days: Monday - Thursday
Time: 1-4pm

**Workshop Agenda:** https://kthoma2484.github.io/2022-06-06-UCSD/

**Software Installation:** https://www.anaconda.com/products/individual

**Lesson Data (download)** http://swcarpentry.github.io/python-novice-gapminder/files/python-novice-gapminder-data.zip

## NOTES:

A copy of the instructor live session notes will be made available to participants upon request at the end of the workshop.

JupyterLab will be used for the lessons.

- [m] Markdown cell = notes
- [#] also works in a code cell for notes
- [b] = add cell below
- [a] = add cell above
- [r] Raw cells cannot have text edits

https://www.markdownguide.org/getting-started/
https://www.markdownguide.org/basic-syntax/

## Workshop Day 1 - (39)

### First name and Last Name/Organization/Dept./Email

| Name (first & last) | Organization | Dept. | Email |
| ------------------------- | ------------ | ---- | ----- |
| Lenissa Alcantara | UCSD | Biological Sciences | lmalay@ucsd.edu |
| Mathi Ganapathi | UCSD | Biomedical | |
| Janice Reimer | UCSD | CMM | jmreimer@health.ucsd.edu |
| Joseph Oh | UCSD | Bio | juo014@ucsd.edu |
| Shannon D'Ambrosio | UCSD | BioSci | sdambros@ucsd.edu |
| Michelle Truong | UCSD | IGDPH | mitruong@health.ucsd.edu |
| Clara Ortez | UCSD Health | Psychiatry | caortez@health.ucsd.edu |
| Xiaohui Lyu | UCSD | BioSci | xil004@ucsd.edu |
| Kiana Miyamoto | UCSD | BioSci | ktmiyamo@ucsd.edu |
| Erin Schiksnis | UCSD | Bio | eschiksn@ucsd.edu |
| Elizabeth Alcantara | UCSD | Psychiatry | elalcant@ucsd.edu |
| Andres Nevarez | UCSD | Bio | ajnevare@ucsd.edu |
| Van Ninh | UCSD | Medicine | vaninh@health.ucsd.edu |
| Jose Chacon | UCSD | Biosci | jgchacon@ucsd.edu |
| Dina Zangwill | UCSD | BioSci | dzangwil@ucsd.edu |
| Aga Kendrick | UCSD | CMM | agkendrick@ucsd.edu |
| Po-Kai Hsu | UCSD | BioSci | pohsu@ucsd.edu |
| Gabriel Manzanarez | UCSD | Biosci | gmullinm@ucsd.edu |
| Brandon Gutierrez | UCSD | International Business | bjgutierrez@ucsd.edu |
| Lydia Keppler | UCSD | Scripps Oceanography | lkeppler@ucsd.edu |
| Kseniya Malukhina | UCSD | CMM | ksmalukhina@ucsd.edu |
| Christina Agu | CSUF | Library | cagu2000@csu.fullerton.edu |
| Arianna Brevi | UCSD | Medicine | abrevi@health.ucsd.edu |
| Amulya Lingaraju | UCSD | Medicine | alingaraju@health.ucsd.edu |
| Rebecca Green | UCSD | Bio | regreen@ucsd.edu |
| Chris Day | UCSD | Bio | cdday@ucsd.edu |
| Livia Songster | UCSD | Biosci | osongste@ucsd.edu |
| Lin Zhang | UCSD | MCC | liz004@health.ucsd.edu |
| Danny Heinz | UCSD | Biosci | dheinz@ucsd.edu |
| Kian Falah | UCSD | Medicine | kfalah@ucsd.edu |
| Mridu Kapur | UCSD | CMM | mkapur@health.ucsd.edu |

## Day 1 Questions:

Please enter any questions not answered during live session here:

1. Is there an advantage of Jupyter over Spyder, since they both come with Anaconda?

   I don’t have first hand experience with Spyder, but one suggested differentiation is as follows: Consider Jupyter if you work on data-driven projects where you need to easily present data to a non-technical audience. Consider Spyder for building data science applications with multiple scripts that reference each other.

2. What would be the advantage to coding in JupyterLab vs. Visual Studio Code vs. command line?
### Markdown cheatsheet:

https://www.markdownguide.org/cheat-sheet/

```python=
first_name = 'Ahmed'
age = 42
print(first_name)
```

```python=
print(first_name, 'is', age, 'years old')
```

```python=
print(last_name) # raises a NameError because the variable last_name is not defined yet
```

```python=
last_name = "Soares"
print(last_name)
```

```python=
age = age + 1
print("Age is now:", age)
```

```python=
print(first_name)
```

# Exercise 1

**What are the values of the variables in this program after each statement is executed?**

```python
x = 1
y = 3
swap = x
x = y
y = swap
```

```python=
favorite_person = 'baby yoda'
print(favorite_person)
```

using an index to get a single character from a string

```python=
print(favorite_person[0])
print(favorite_person[8])
print(favorite_person[9]) # raises an IndexError: the string has 9 characters, so the last valid index is 8
```

```python=
print(favorite_person[0:4])
```

use the `len` function to find the length of a string

```python=
len(favorite_person)
```

```python=
print(favorite_person[8])
```

```python=
print(favorite_person[-1])
print(favorite_person[-5:-1])
print(favorite_person[-5:])
print(favorite_person[:4])
print(favorite_person[:])
print(favorite_person)
```

## EXERCISE

Given the following string

```python=
species_name = "Acacia buxifolia"
```

What would these expressions return?

* species_name[2:8]
* species_name[11:] (without a value after the colon)
* species_name[:4] (without a value before the colon)
* species_name[:] (just a colon)
* species_name[11:-3]
* species_name[-5:-3]
* What happens when you choose a stop value which is out of range? (i.e., try species_name[0:20] or species_name[:103])

```python=
# try to run the code.
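
# Worked answers for the exercise above, added as a sketch (not from the live session).
# The expected values in the comments were worked out by counting indices from 0.
species_name = "Acacia buxifolia"
print(species_name[2:8])    # 'acia b'  (index 2 up to, but not including, index 8)
print(species_name[11:])    # 'folia'   (index 11 to the end)
print(species_name[:4])     # 'Acac'    (start up to, but not including, index 4)
print(species_name[:])      # 'Acacia buxifolia' (the whole string)
print(species_name[11:-3])  # 'fo'      (-3 is the third character from the end)
print(species_name[-5:-3])  # 'fo'      (same range, counted from the end)
print(species_name[0:20])   # 'Acacia buxifolia' (an out-of-range stop is cut off at the end)
print(species_name[:103])   # 'Acacia buxifolia' (no error, unlike a single out-of-range index)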
```

finding data type

```python=
type(first_name)
print(type(age))
print(type(first_name))
```

```python=
type(first_name)
print(type(age))
print(5-3)
print(last_name)
print(first_name + last_name)
print(first_name + " " + last_name)
separator = '=' * 10
```

```python=
variable_one = 1
variable_two = 5 * variable_one
variable_one = 2
print('first is', variable_one, 'and second is', variable_two)
```

```python=
print('string to float:', float('3.4'))
print('float to int:', int(3.4))
print('string to float:', float("hello world")) # raises ValueError: could not convert string to float
```

```python=
# integer division: //
# float division: /
# remainder (modulo): %
```

```python=
num_subjects = 600
num_per_survey = 42
num_surveys = num_subjects // num_per_survey # rounds down, so some subjects are left without a survey
print(num_surveys)
```

```python=
num_surveys = num_subjects // num_per_survey + 1
print(num_surveys)
```

```python=
print(num_subjects / num_per_survey)
```

```python=
num_surveys = (num_subjects -1) // num_per_survey + 1 # rounds up without adding an extra survey when the division is exact
print(num_surveys)
```

### built-in Functions

```python=
# print(argument) example usage for print()
print(len(first_name))
```

```python=
print(first_name)
print('before')
print()
print('after')
```

```python=
result = print('example')
print('result is ', result)
```

Remember: a Python function always returns something; `print()` returns `None`.

### other functions min, max, round

```python=
print(max(1,2,3))
print(min('a','b','0'))
```

```python=
round(3.712)
```

## help function

```python=
help(round)
```

**Quick help: in an active cell, hold `shift + tab` and a help menu will display**

### Errors in python

- SyntaxError
- RuntimeError

---

# End Day 1

## Workshop Day 2

### First name and Last Name/Organization/Dept./Email

| Name (first & last) | Organization | Dept. | Email |
| ------------------------- | ------------ | ----- | ----- |
| Mathi Ganapathi | UCSD | Biomedical informatics | mganapathi@ucsd.edu |
| Reid Otsuji | UCSD | Library | rotsuji@ucsd.edu |
| Joseph Oh | UCSD | Bio. | juo014@ucsd.edu |
| David Palmquist | CSUF | Library | dpalmquist@fullerton.edu |
| Elizabeth Alcantara | UCSD | Psychiatry | elalcant@ucsd.edu |
| Xiaohui Lyu | UCSD | BioSci | xil004@ucsd.edu |
| Janice Reimer | UCSD | CMM | jmreimer@health.ucsd.edu |
| Van Ninh | UCSD | Medicine | vaninh@health.ucsd.edu |
| Rebecca Green | UCSD | Bio | regreen@ucsd.edu |
| Christina Agu | CSUF | Library | cagu2000@csu.fullerton.edu |
| Po-Kai Hsu | UCSD | BioSci | pohsu@ucsd.edu |
| Danny Heinz | UCSD | Biosci | dheinz@ucsd.edu |
| Aga Kendrick | UCSD | CMM | agkendrick@ucsd.edu |
| Andres Nevarez | UCSD | Bio | ajnevare@ucsd.edu |
| Kian Falah | UCSD | Medicine | kfalah@ucsd.edu |
| Brandon Gutierrez | UCSD | International Business | bjgutierrez@ucsd.edu |
| Arianna Brevi | UCSD | Medicine | abrevi@health.ucsd.edu |
| Livia Songster | UCSD | Biosci | osongste@ucsd.edu |
| Clara Ortez | UCSD Health | Psychiatry | caortez@health.ucsd.edu |
| Lenissa Alcantara | UCSD | BioSci | lmalay@ucsd.edu |
| Jose Chacon | UCSD | Biological Sciences | jgchacon@ucsd.edu |
| Amulya Lingaraju | UCSD | Medicine | alingaraju@health.ucsd.edu |
| Lydia Keppler | UCSD | Scripps Oceanography | lkeppler@ucsd.edu |
| Kseniya Malukhina | UCSD | CMM | ksmalukhina@ucsd.edu |
| Lin Zhang | UCSD | MCC | liz004@health.ucsd.edu |
| Erin Schiksnis | UCSD | Bio | eschiksn@ucsd.edu |
| Kiana Miyamoto | UCSD | BioSci | ktmiyamo@ucsd.edu |
| chris day | ucsd | bio | daycd@ucsd.edu |
| Michelle Truong | UCSD | IDGPH | mitruong@health.ucsd.edu |
| Dina Zangwill | UCSD | BioSci | dzangwil@ucsd.edu |
| Mridu Kapur | UCSD | CMM | mkapur@health.ucsd.edu |
| Shannon D'Ambrosio | UCSD | BioSci | sdambros@ucsd.edu |
| Gabriel Mullin-Manzanarez | UCSD | Biosci | gmullinm@ucsd.edu |

# Day 2 - (32)

## Collaborative notes:

Please enter any questions not answered during live session here:

1.
**Lesson data: download both datasets:**

http://swcarpentry.github.io/python-novice-gapminder/files/python-novice-gapminder-data.zip
https://kthoma2484.github.io/2022-06-06-UCSD/data/inflammation-01.csv

link to finding Python libraries: https://docs.python.org/3/library/

**import library math**

```python=
import math
```

**using the math library**

```python=
print('pi is ', math.pi)
print('cos(pi) is', math.cos(math.pi))
```

**using the numpy library**

```python=
import numpy as np
```

**load dataset using numpy**

```python=
np.loadtxt(fname='inflammation-01.csv',delimiter = ',') # when loading datasets make sure you specify the correct file path
```

```python=
data = np.loadtxt(fname='inflammation-01.csv', delimiter = ',')
```

```python=
#check data type for the np array
print(type(data))
```

creating variable names:

- don't create variable names starting with a number
- use CamelCaseIfYouHaveALongVariableName
- use `_` (underscore) to_separate_words_in_variable_names
- do not use special characters
- you can press the `tab` key after typing the first few characters in a variable name to autocomplete long variable names

### mean calculation:

```python=
meanval = np.mean(data)
print(meanval)
```

### max, min, sd calc:

```python=
maxval = np.max(data)
minval = np.min(data)
stdval = np.std(data)
```

```python=
print('max, min, and sd:', maxval,'|', minval, '|',stdval)
```

## Pandas and dataframes

pandas: data manipulation and data analysis

```python=
import pandas as pd #import pandas library
data = pd.read_csv('gapminder_gdp_oceania.csv', index_col='country') # set data to variable
print(data) #view imported data
```

```
# Exercise:
Read the data in 'gapminder_gdp_europe.csv' into a variable called 'europe' and display it with country as the index.
The parameter to fill in for the function is 'index_col = ?????'
```

```python=
#answer:
europe = pd.read_csv('gapminder_gdp_europe.csv', index_col='country')
print(europe)
```

dataframe.info

```python=
data.info() #information about dataframe
print(data.columns) #dataframe column info
print(data.T) #transpose dataframe
print(data.describe()) #quick stats for data
```

### save to data file

```python=
data.to_csv('mynewdatafile.csv')
```

Subsetting:

```python=
data = pd.read_csv('gapminder_gdp_europe.csv', index_col='country') # the rows used below (Italy, Poland, Serbia) are in the Europe data
print(data.iloc[0,0])
```

```python=
print(data.loc['Italy':'Poland','gdpPercap_1962':'gdpPercap_1972'])
```

```python=
subset = data.loc['Italy':'Poland','gdpPercap_1962':'gdpPercap_1972']
print('Subset of data:\n', subset)
```

```python=
print('\nWhere are the values larger than 10000?\n', subset > 10000)
```

```python=
mask = subset > 10000
print(subset[mask])
```

```python=
print(subset > 10000)
```

```python=
def multiplyby5(x):
    return x*5

# store the 1952 GDP per capita multiplied by 5 in a new column
data['gdpPercap_1952_x5'] = data['gdpPercap_1952'].apply(multiplyby5)
print(data['gdpPercap_1952_x5'])
```

```
# Challenge 2
Assume Pandas has been imported into your notebook and the Gapminder GDP data for Europe has been loaded:

import pandas as pd
df = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country')

Write an expression to find the Per Capita GDP of Serbia in 2007.
```

```python=
# Answer (here `data` holds the Europe dataframe, called `df` in the challenge text):
print(data.loc['Serbia','gdpPercap_2007'])
```

## Plotting

```python=
import matplotlib.pyplot as plt # load library
```

```python=
time = [0,1,2,3]
position = [0,100,200,300]
plt.plot(time,position) #plot data
plt.xlabel('time (hr)')
plt.ylabel('position (km)')
```

```python=
import pandas as pd
data = pd.read_csv('Desktop/data/gapminder_gdp_oceania.csv', index_col='country')
years = data.columns.str.strip('gdpPercap_') # strip the 'gdpPercap_' prefix so only the year is left
data.columns = years.astype(int) # use the years as integer column labels
data.loc['Australia'].plot()
```

```python=
data.T.plot()
plt.ylabel('GDP per capita')
```

```python=
plt.style.use('ggplot')
data.T.plot(kind='bar')
plt.ylabel('GDP per capita')
```

```python=
# Select two countries' worth of data.
gdp_australia = data.loc['Australia']
gdp_nz = data.loc['New Zealand']

# Plot with differently-colored markers.
plt.plot(years, gdp_australia, 'b-', label='Australia')
plt.plot(years, gdp_nz, 'g-', label='New Zealand')

# Create legend.
plt.legend(loc='upper left')
plt.xlabel('Year')
plt.ylabel('GDP per capita ($)')
```

```python=
plt.scatter(gdp_australia, gdp_nz)
```

```python=
data.T.plot.scatter(x = 'Australia', y = 'New Zealand')
```

```
Challenge 3
Complete the code to plot the minimum GDP per capita over time for all the countries in Europe.
Modify it again to plot the maximum GDP per capita over time for Europe.

Code:
data_europe = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country')
data_europe.____.plot(label='min')
data_europe.____
plt.legend(loc='best')
plt.xticks(rotation=90)
```

Great job everyone!

# End Day 2

## Workshop Day 3 - (30)

### First name and Last Name/Organization/Dept./Email

| Name (first & last) | Organization | Dept. | Email |
| ------------------------- | ------------ | ----- | --------------- |
| Joseph Oh | UCSD | Medicine | juo014@ucsd.edu |
| Van Ninh | UCSD | Medicine | vaninh@health.ucsd.edu |
| Shannon D'Ambrosio | UCSD | BioSci | sdambros@ucsd.edu |
| Elizabeth Alcantara | UCSD BIO | Psychiatry | elalcant@ucsd.edu |
| Lenissa Alcantara | UCSD | BioSci | lmalay@ucsd.edu |
| Christina Agu | CSUF | Library | cagu2000@csu.fullerton.edu |
| Livia Songster | UCSD | BioSci | osongste@ucsd.edu |
| Erin Schiksnis | UCSD | Bio | eschiksn@ucsd.edu |
| Danny Heinz | UCSD | BioSci | dheinz@ucsd.edu |
| Arianna Brevi | UCSD | Medicine | abrevi@health.ucsd.edu |
| Aga Kendrick | UCSD | CMM | agkendrick@ucsd.edu |
| Po-Kai Hsu | UCSD | BioSci | pohsu@ucsd.edu |
| Lin Zhang | UCSD | MCC | liz004@health.ucsd.edu |
| Gabriel Mullin-Manzanarez | UCSD | Biosci | gmullinm@ucsd.edu |
| Dina Zangwill | UCSD | BioSci | dzangwil@ucsd.edu |
| Clara Ortez | UCSD Health | Psychiatry | caortez@health.ucsd.edu |
| Rebecca Green | UCSD | Bio | regreen@ucsd.edu |
| Jose Chacon | UCSD | BioSci | jgchacon@ucsd.edu |
| Xiaohui Lyu | UCSD | BioSci | xil004@ucsd.edu |
| Kian Falah | UCSD | Medicine | kfalah@ucsd.edu |
| Amulya Lingaraju | UCSD | Medicine | alingaraju@health.ucsd.edu |
| chris day | UCSD | Bio | cdday@ucsd.edu |
| Andres Nevarez | UCSD | Bio | ajnevare@ucsd.edu |
| Kiana Miyamoto | UCSD | BioSci | ktmiyamo@ucsd.edu |
| Kseniya Malukhina | UCSD | CMM | ksmalukhina@ucsd.edu |
| Brandon Gutierrez | UCSD | International Business | bjgutierrez@ucsd.edu |

## Day 3 Questions:

Please enter any questions not answered during live session here:

1.

## lists

```python=
pressures = [0.273, 0.275, 0.277, 0.275, 0.276]
print('pressures:', pressures)
print('length:', len(pressures))
```

### change value in the list

```python=
print('zeroth item of pressures:', pressures[0])
# change value in the list
pressures[0] = 0.0265
print('pressures is now:', pressures)
```

### append to a list

```python=
#append to a list
primes = [2, 3, 5]
print('primes is initially:', primes)
primes.append(7)
print('primes has become:', primes)
```

### extend vs. append

```python=
teen_primes = [11, 13, 17, 19]
middle_aged_primes = [37, 41, 43, 47]
#extend a list: adds each item of teen_primes to primes
primes.extend(teen_primes)
print('primes has now become:', primes)
#append to a list: adds middle_aged_primes as a single item (a list inside the list)
primes.append(middle_aged_primes)
print('primes has finally become:', primes)
```

### remove items from a list using `del`

```python=
primes = [2, 3, 5, 7, 9]
print('primes before removing last item:', primes)
del primes[4]
print('primes after removing last item:', primes)
```

```python=
mylist2 = [] #empty list
mylist2.append('one')
print(mylist2)
```

lists can have different data types

```python=
goals = [1, 'eat', 2,'drink',3,'sleep']
print('my goals today:', goals[0], goals[1],goals[2],goals[3])
```

```python=
goals = [1, 'eat', 2,'drink',3,'sleep']
goals2 = "1 Eat 2 Drink 3 Sleep"
print('In a list index 3 is the item:', goals[3])
print('In a string index 3 is the character', goals2[3])
```

### Challenge

Fill in the blanks so that the program below produces the output shown.

`values = ____`
`values.____(1)`
`values.____(3)`
`values.____(5)`
`print('first time:', values)`
`values = values[____]`
`print('second time:', values)`

```python=
# challenge answer
values = []
values.append(1)
values.append(3)
values.append(5)
print('first time:', values)
del values[0]
print('second time:', values)
```

### writing functions

```python=
def say_hello():
    print('hello!')

say_hello() #running the function
```

```python=
def print_date(year,month,day):
    joined = str(year) + '/' + str(month) + '/' + str(day)
    print(joined)

print_date('2022','1','2')
print_date(month = 1, year = 2019, day = 22) #assign value to the variable in the function
```

### function `return`

```python=
values = [1,5,4,9,8,3]

def average(values):
    if len(values) == 0:
        return None
    return sum(values) / len(values)

avg = average([1,3,4])
print(avg)
emptyAvg = average([])
print(emptyAvg)
```

```python=
result = print_date(1871,3,19)
print('result of print_date:', result)
```

### Challenge:

What do I need to do to get a result printed?

```python=
def print_time(hour, minute, second):
    time_string = str(hour) + ':' + str(minute) + ':' + str(second)
    print(time_string)

result = print_time(11, 37, 59)
print('result of call is:', result)
```

```python=
# challenge answer
def print_time(hour, minute, second):
    time_string = str(hour) + ':' + str(minute) + ':' + str(second)
    print(time_string)
    return time_string #need to add the 'return' to display correct output value

result = print_time(11, 37, 59)
print('result of call is:', result)
```

## Loops

defining a for loop

```python=
for number in [2,3,5]: #defining a for loop basic syntax
    print(number) # indentation is important!
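
# Sketch added to illustrate the indentation note above (not from the live session):
# only the indented line belongs to the loop body and is repeated for every item;
# the unindented line after it runs once, when the loop has finished.
for number in [2, 3, 5]:
    print('inside the loop:', number)
print('loop finished')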
```

```python
primes = [2,3,5]
for p in primes:
    squared = p ** 2 #square each item in list
    cubed = p ** 3 #cube each item in the list
    print(p, squared, cubed) #print the item and the results squared and cubed
```

```python=
for number in range(0,3):
    print(number)
```

```python=
for number in range(1,10,2):
    print(number)
```

```python=
total = 0
for number in range(10):
    total = total + (number + 1)
print(total)
```

### challenge

```python=
# Concatenate all words: ["red", "green", "blue"] => "redgreenblue"
words = ["red", "green", "blue"]
result = ____
for ____ in ____:
    ____
print(result)
```

```python=
words = ["red", "green", "blue"]
result = ""
for x in words:
    result = result + x
print(result)
```

### Conditionals

```python=
mass = 3.54
if mass > 3.0:
    print(mass, 'is large')

masses =[3.54, 2.07, 1.34, 4.0]
for m in masses:
    if m > 3.0:
        print(m, 'is large')
    else:
        print(m, 'is small')
```

```python=
massNo = [3.54, 2.07, 1.34, 4.0, 5.4, 1, .05]
for m in massNo:
    if m > 3:
        print(m, 'is huge')
    elif m < 3 and m > 1:
        print(m, 'is large')
    else:
        print('neither')
```

## Looping over datasets

```python=
import pandas as pd
for filename in ['Desktop/data/gapminder_gdp_africa.csv','Desktop/data/gapminder_gdp_asia.csv' ]: #make sure to use the right file path to your data
    data = pd.read_csv(filename, index_col='country')
    print(filename, data.min())
```

```python=
import glob
for filename in glob.glob('Desktop/data/gapminder_*.csv'): # make sure to have the right file path to your data
    data = pd.read_csv(filename)
    print(filename, data['gdpPercap_1952'].min())
```

## Day 3 Github information:

| Github user ID | First Name | Last Name |
| --------------------------| ------------ | --------- |
| (example) kkt008 | Kimberly | Thomas |
| juo014 | Joseph | Oh |
| LiviaSongster | Livia | Songster |
| lencantara | Lenissa | Alcantara |
| linda1015 | Lin | Zhang |
| sdambrosio | Shannon | D'Ambrosio |
| kfalah | Kian | Falah |
| honeybobacat | Elizabeth | Alcantara |
| jmreimer | Janice | Reimer |
| eschiks | Erin | Schiksnis |
| agakendrick | Aga | Kendrick |
| ChrisDDDD | Chris | Day |
| jgc64094 | Jose | Chacon |
| ktmiyawaki | Kiana | Miyamoto |
| DinaZangwill | Dina | Zangwill |
| Kevin-PKhsu | Po-Kai | Hsu |
| xiaohui121 | Xiaohui | Lyu |
| DannyHeinz57 | Danny | Heinz |
| vninh1 | Van | Ninh |
| abrevi | Arianna | Brevi |
| regreen | Rebecca | Green |
| KseniyaMalukhina | Kseniya | Malukhina |

### End Day 3

## Workshop Day 4 - (30)

### First name and Last Name/Organization/Dept./Email

| Github User ID | First Name | Last Name | Organization | Dept. | Email |
| -------------- | ---------- | --------- | ------------ | ----- | ----- |
| | Reid | Otsuji | UCSD | Library | rotsuji@ucsd.edu |
| juo014 | Joseph | Oh | UCSD | | |
| LiviaSongster | Livia | Songster | UCSD | BioSci | osongste@ucsd.edu |
| jmreimer | Janice | Reimer | UCSD | CMM | jmreimer@health.ucsd.edu |
| DannyHeinz57 | Danny | Heinz | UCSD | BioSci | dheinz@ucsd.edu |
| vninh1 | Van | Ninh | UCSD | Medicine | vaninh@health.ucsd.edu |
| | Aga | Kendrick | UCSD | CMM | agkendrick@ucsd.edu |
| honeybobacat | Elizabeth | Alcantara | UCSD | Psychiatry | elalcant@ucsd.edu |
| Kevin-PKHsu | Po-Kai | Hsu | UCSD | BioSci | pohsu@ucsd.edu |
| Sdambrosio | Shannon | D'Ambrosio | UCSD | BioSci | sdambros@ucsd.edu |
| ktmiyawaki | Kiana | Miyamoto | UCSD | BioSci | ktmiyamo@ucsd.edu |
| kfalah | Kian | Falah | UCSD | Medicine | kfalah@ucsd.edu |
| eschiks | Erin | Schiksnis | UCSD | Bio | eschiksn@ucsd.edu |
| lencantara | Lenissa | Alcantara | UCSD | BioSci | lmalay@ucsd.edu |
| | Rebecca | Green | UCSD | Bio | regreen@ucsd.edu |
| xiaohui121 | Xiaohui | Lyu | UCSD | BioSci | xil004@ucsd.edu |
| KseniyaMalukhina | Kseniya | Malukhina | UCSD | CMM | ksmalukhina@ucsd.edu |
| cagu2000 | Christina | Agu | CSUF | Library | cagu2000@csu.fullerton.edu |
| | Brandon | Gutierrez | UCSD | International Business | bjgutierrez@ucsd.edu |
| abrevi | Arianna | Brevi | UCSD | Medicine | abrevi@health.ucsd.edu |
| amulyal | Amulya | Lingaraju | UCSD | Medicine | alingaraju@health.ucsd.edu |
| | Dina | Zangwill | UCSD | BioSci | dzangwil@ucsd.edu |
| | Lin | Zhang | UCSD | MCC | liz004@health.ucsd.edu |
| | chris | day | ucsd | bio | cdday@ucsd.edu |
| gmanzanarez | Gabriel | Mullin-Manzanarez | UCSD | biosci | gmullinm@ucsd.edu |
| jgc64094 | Jose | Chacon | UCSD | BioSci | jgchacon@ucsd.edu |

## Day 4 Questions:

Please enter any questions not answered during live session here:

1.

### What is Version Control?

**Version control** is a name used for software which can help you record changes you make to the files in a directory on your computer. Version control software and tools (such as Git and Subversion/SVN) are often associated with software development, and increasingly, they are being used to collaborate in research and academic environments.

**Benefits of using version control**

**Collaboration** - Version control allows us to define formalized ways we can work together and share writing and code. For example, merging together sets of changes from different parties enables co-creation of documents and software across distributed teams.

**Versioning** - Having a robust and rigorous log of changes to a file, without renaming files (v1, v2, final_copy).

**Rolling Back** - Version control allows us to quickly undo a set of changes. This can be useful when new writing or new additions to code introduce problems.

**Understanding** - Version control can help you understand how the code or writing came to be, who wrote or contributed particular parts, and who you might ask to help understand it better.

**Backup** - While not meant to be a backup solution, using version control systems means that your code and writing can be stored on multiple other computers.

### What are Git and GitHub?

**Git** is one of the most widely used version control systems in the world.
It is a free, open source tool that can be downloaded to your local machine and used for logging all changes made to a group of designated computer files (referred to as a “git repository” or “repo” for short) over time.

**GitHub** on the other hand is a popular website for hosting and sharing Git repositories remotely. It offers a web interface and provides functionality and a mixture of both free and paid services for working with such repositories. The majority of the content that GitHub hosts is open source software, though increasingly it is being used for other projects.

Starting git:

- Windows OS - run the `Git Bash` application
- MacOS - start the `Terminal` application

Basic git configuration commands:

```bash=
git config --list #display the git application configuration properties
git config --global user.name "Your Name"
git config --global user.email "yourname@domain.name"
```

Setting the most common text editors:

Windows OS (Notepad):

```bash=
git config --global core.editor "notepad"
```

nano:

```bash=
git config --global core.editor "nano -w"
```

### Git commands:

run these steps in order:

1. git init #run only once to initialize the folder you want to track
2. git status
3. git add [file.name]
4. git status
5. git commit -m 'write brief message'
6. git status #your file should now be committed and tracked by git

# Git commands you learned:

```bash=
git init # init a folder to be tracked by git
git status # most important command: check the status of your git commands - use this command often!
git add [file.name] #adding the file to the "staging area"
git commit -m "write a message" #committing the file to the "local repository" on your computer. commit messages are required!
git restore [file.name] #discard uncommitted changes to the file in your working folder
git diff #show the changes that have not been staged yet

### working with remotes in github ###
git push -u origin main #push local commits to github repository
```

Connecting local repo to Github

```bash=
git remote add origin git@github.com:yourname/hello-world.git
git remote -v
```

# SSH setup

One-time setup: follow the instructions under SSH Background and Setup in the lesson: https://librarycarpentry.org/lc-git/03-sharing/index.html

---

# Git cheatsheet:

https://education.github.com/git-cheat-sheet-education.pdf

### Chat text:

* make a change to the author line in your solo file
* git add and commit it to your local repository
* try to push your version to origin
* should fail, read the error
* if appropriate, use pull to get the new work as part of your local repository
* read the messages
* as appropriate, edit the conflicts identified to resolve the conflict
* add and commit your solution to your local repository
* push your resolution to the remote for everyone to pick up as they pull

### End Day 4
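
As a follow-up to the conflict steps in the chat text: when a pull reports a conflict, Git marks the conflicting region inside the file with lines starting with `<<<<<<<`, `=======` and `>>>>>>>`. The sketch below is an addition to the notes, not something run in the session. It is a rough check, in Python, of whether a file's text still contains those markers before you add and commit your resolution. The helper name `has_conflict_markers` and the sample file contents are hypothetical, and the label after `>>>>>>>` varies from case to case.

```python
# Hypothetical helper: report whether text still contains Git conflict markers.
CONFLICT_MARKERS = ("<<<<<<<", "=======", ">>>>>>>")


def has_conflict_markers(text):
    """Return True if any line starts with one of the Git conflict markers."""
    return any(line.startswith(CONFLICT_MARKERS) for line in text.splitlines())


# Made-up example of a file as Git leaves it after a conflicting pull.
conflicted = """# Hello World
<<<<<<< HEAD
Author: my local edit
=======
Author: a collaborator's edit
>>>>>>> origin/main
"""

# The same file after editing it by hand to keep a single author line.
resolved = "# Hello World\nAuthor: my local edit and a collaborator's edit\n"

print(has_conflict_markers(conflicted))
print(has_conflict_markers(resolved))
```

This is only a text check: a line of `=======` can also appear in ordinary files (for example under a heading), so `git status` remains the reference for whether a conflict is still unresolved.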