# Software Carpentry Intro to Python Workshop - Day 2
### UCSD Biomedical Library Building, Classroom 4 - 9:00am - 4:00pm
### November 19-20, 2019
---
# This HackMD: https://bit.ly/2XxSArl
# signin here ----
name, affiliation, dept/lab
Reid Otsuji, librarian, library
Ryan Johnson, librarian, library
Anand Saran, Postdoc, UCSD, Zarrinpar lab
LiYun Hsu, UCSD, School of Pharmacy MS of DDPM
Alexandra Akscyn, UCSD, School of Pharmacy MS of DDPM
Anjanei Dhayalan, UCSD, School of Pharmacy MS of DDPM
Thania Bejarano, UCSD, Urban Studies and Planning
Viona Deconinck, UCSD, Visual Arts
Amulya Lingaraju, UCSD, Postdoc, Zarrinpar Lab
Meng-Ping Hsieh, UCSD, MS in DDPM
## Getting help after the workshop
* Reid Otsuji, Data Curation Specialist (rotsuji@ucsd.edu)
* Stephanie Labou, Data Science Librarian (slabou@ucsd.edu)
* Data & GIS Lab (in Geisel Library): https://library.ucsd.edu/computing-and-technology/data-and-gis-lab/index.html
* Git/GitHub tutorials: https://ucsd.libguides.com/data-science/version-control
* Python tutorials: https://ucsd.libguides.com/data-science/python
# Day 2 Git Notes
Git is the version control software that runs locally (i.e., on your machine). GitHub is the online place for hosting repositories and collaborating.
#### line endings
Macs:
```
git config --global core.autocrlf input
```
Windows:
```
git config --global core.autocrlf true
```
#### Other settings
Text editor (sets text editor as nano):
```
git config --global core.editor "nano -w"
```
Checking global settings:
```
git config --list
```
What else can you configure?
```
git config --help
```
#### Working with git
Initialize a git repository in your current working directory
```
git init
```
This has created a hidden directory named `.git`, which is where git stores the history of the repository.
To find out what's going on with git in your folder:
```
git status
```
__Note__: don't nest version controlled folders - it gets very complicated very fast!
Within your git repository folder, all changes will be tracked. Once you make changes, run `git status` again to see what has changed.
Tell git to track a certain file using:
```
git add filename.ext
```
Record the changes to this file using:
```
git commit -m "[message of commit]"
```
_Note_: you must add a commit message; it is not an optional argument.
__Important!__: Order of operations is `git add` _then_ `git commit`
These 3 commands - `git status`, `git add`, and `git commit` - are the majority of what you'll probably do with git!
See history of commits (most recent commit listed first):
```
git log
```
If your history gets long, see an abbreviated history using:
```
git log --oneline
```
See the changes between current status and last commit of file
```
git diff
```
(This will show diff of any files that have changed, or you can specify which file after git diff)
To see changes from staged (you've done `git add` but not `git commit`):
`git diff --staged`
Other `git log` options: use `git log --patch` to see the actual changes (the diff) made in each commit.
To list just the filenames changed in each commit: `git log --name-only --oneline`
__Note__: There are a lot of 'flag' options for these commands (aka, what comes after `--`)! The full documentation has a _lot_ of information, but if you have an idea of what you're looking for, it can be useful. For instance, for `git log`: https://www.git-scm.com/docs/git-log
How can you see other previous versions, not just the previous? To see diff from one previous:
```
git diff HEAD~1 filename.txt
```
To see diff from two previous:
```
git diff HEAD~2 filename.txt
```
and so on.
You can also use the alphanumeric identifier for the commit. For example
```
git diff 07c1c262 mars.txt
```
will show difference between current and status at commit 07c1c262 for the document mars.txt.
We've seen how to view differences between versions of a file, but how can we actually roll back the file to undo changes?
This is `git checkout`. For example, to roll back a file to a previous status:
```
git checkout 07c1c262 mars.txt
```
You can get back to the most recent version using:
`git checkout master filename.txt`
#### Working with GitHub
GitHub: https://github.com/
You can connect a "remote" repository (i.e., one on GitHub) to the local repository in your folder. From within your local git repository location:
```
git remote add origin [enter your https://githubURLrepo.git URL]
```
See what remote repos you have using
```
git remote -v
```
To send changes from local to remote GitHub, use `git push`:
```
git push origin master
```
"origin" here is the name given to the remote (the GitHub repo) and "master" is the branch being pushed
To pull changes from remote GitHub to local, use `git pull`:
```
git pull origin master
```
You may get a pop-up that requires you to log in to GitHub when pushing or pulling changes. You can have git save your credentials so you are not asked each time (note that `store` keeps them in a plain-text file on your machine):
```
git config credential.helper store
git push [your URL for repo: https://github.com/owner/repo.git]
Username for 'https://github.com': <USERNAME>
Password for 'https://USERNAME@github.com': <PASSWORD>
```
You can also use the following to "cache" your credentials on a Windows machine:
`git config --global credential.helper wincred`
A useful tool is the `.gitignore` file. Use it when there are files or subfolders that you want to keep in your local git repo but don't want git to track or push to GitHub. In `.gitignore` you can list specific files, file extensions, or folders for git to ignore. For example, run `nano .gitignore` to create and open the file, then add the line `*.csv`; git will then ignore all files with the extension .csv.
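As a rough illustration of what a pattern such as `*.csv` matches, here is a minimal Python sketch using the standard library's `fnmatch` module. The filenames and the `is_ignored` helper are hypothetical, and git's own ignore rules cover more cases (folders, negation with `!`), so treat this only as an approximation of the simple wildcard case.

```python
from fnmatch import fnmatch

# Hypothetical files in a repository folder (not from the workshop data).
filenames = ['analysis.py', 'inflammation-01.csv', 'notes.md', 'results.csv']

# Patterns as they might appear in a .gitignore file, one per line.
ignore_patterns = ['*.csv']

def is_ignored(filename, patterns):
    """Return True if filename matches any of the wildcard patterns."""
    return any(fnmatch(filename, pattern) for pattern in patterns)

tracked = [f for f in filenames if not is_ignored(f, ignore_patterns)]
ignored = [f for f in filenames if is_ignored(f, ignore_patterns)]

print('tracked:', tracked)
print('ignored:', ignored)
```

To check what git itself ignores in a real repository, `git status` will no longer list the matching files once the pattern is in `.gitignore`.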
## Python Day 2
```
f = 'inflammation-01.csv'
import matplotlib.pyplot as plt
plt.plot
import numpy as np
```
```
data = np.loadtxt(f, delimiter=',')
data
```
# Exercise
In [44]: name = 'Newton'
In [45]: for i in range(4):
...: print(i,name)
...:
0 Newton
1 Newton
2 Newton
3 Newton
for name in name:
...: print (name)
# Day 2 - afternoon - Python continued
### Exercise: Print out the letters in the string 'Newton' using a `for` loop.
```
name = 'Newton'
list(name)
for i in name:
    print(i)
```
### Exercise: Use a `for` loop to reverse a string.
```
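name = 'Newton'
newstring = ''  # start empty; each letter is added to the front
for i in name:
    newstring = i + newstring
    print(newstring)  # the last line printed is the reversed string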
```
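The same idea can be wrapped in a function so it can be reused on any string. This is a minimal sketch; the function name `reverse_string` is made up for illustration and was not part of the workshop. Slicing with `[::-1]` is shown only as a cross-check that the loop gives the expected result.

```python
def reverse_string(text):
    """Return text reversed by prepending one character at a time."""
    reversed_text = ''
    for character in text:
        # Putting each new character in front reverses the order.
        reversed_text = character + reversed_text
    return reversed_text

name = 'Newton'
print(reverse_string(name))  # loop version
print(name[::-1])            # slicing gives the same result
```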
### Exercise: Write a `for` loop to sum positive and negative numbers separately.
```
positive_sum = 0
negative_sum = 0
test_list = [3, 4, 6, 1, -1, -5, 0, 7, -8]
for i in test_list:
    if i>0:
        positive_sum = positive_sum + i
    else:
        negative_sum = negative_sum + i
```
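In the loop above, zero falls into the `else` branch, which is harmless because adding zero changes nothing. A slightly more explicit variant is sketched below as a function; the name `sum_by_sign` is an assumption made for this example, not something from the lesson.

```python
def sum_by_sign(numbers):
    """Return (positive_sum, negative_sum) for a list of numbers."""
    positive_sum = 0
    negative_sum = 0
    for number in numbers:
        if number > 0:
            positive_sum += number
        elif number < 0:
            negative_sum += number
        # Zeros match neither test, so they change neither sum.
    return positive_sum, negative_sum

test_list = [3, 4, 6, 1, -1, -5, 0, 7, -8]
positive_sum, negative_sum = sum_by_sign(test_list)
print(positive_sum, negative_sum)
```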
### Exercise: Write a for loop to sort file types into buckets.
```
from glob import glob
data_files = []
image_files = []
other_things = []
filenames = glob('*')
for filename in filenames:
    if 'inflammation' in filename and '.csv' in filename: # This is a data file.
        data_files.append(filename)
    elif '.png' in filename: # This is an image file.
        image_files.append(filename)
    else: # Neither a data nor an image file.
        other_things.append(filename)
```
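One thing to keep in mind with the loop above is that `'.csv' in filename` matches the text anywhere in the name, not only at the end. The sketch below checks the actual extension with `os.path.splitext` instead. The function name `sort_filenames` and the example filenames are hypothetical and chosen so the code runs without any files on disk.

```python
import os

def sort_filenames(filenames):
    """Group filenames into data, image, and other buckets by extension."""
    buckets = {'data': [], 'image': [], 'other': []}
    for filename in filenames:
        # splitext returns (root, extension); the extension includes the dot.
        extension = os.path.splitext(filename)[1].lower()
        if extension == '.csv' and 'inflammation' in filename:
            buckets['data'].append(filename)
        elif extension == '.png':
            buckets['image'].append(filename)
        else:
            buckets['other'].append(filename)
    return buckets

# Made-up names; in the workshop these came from glob('*').
example_names = ['inflammation-01.csv', 'plot.png', 'script.py', 'notes.csv.bak']
buckets = sort_filenames(example_names)
print(buckets)
```

Here `notes.csv.bak` lands in the "other" bucket because its extension is `.bak`, whereas a plain substring test for `.csv` would have matched it.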
### Documenting a function.
```
import matplotlib.pyplot
import numpy

def make_plots(filename):
    """
    Function to make plots of patient data.
    Example: fig = make_plots(data_filename)
    """
    fig = matplotlib.pyplot.figure(figsize=(10,3))
    data = numpy.loadtxt(filename, delimiter=',')
    ax1 = fig.add_subplot(1, 3, 1)
    ax2 = fig.add_subplot(1, 3, 2)
    ax3 = fig.add_subplot(1, 3, 3)
    ax1.set_ylabel('Average')
    ax2.set_ylabel('Max')
    ax3.set_ylabel('Min')
    ax1.plot(numpy.mean(data, axis=0))
    ax2.plot(numpy.max(data, axis=0))
    ax3.plot(numpy.min(data, axis=0))
    return fig
```
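A string placed as the first statement of a function becomes its docstring, which is the text that IPython shows for `function_name?`. The small sketch below uses a made-up function, `mean_of`, so that it runs without numpy or matplotlib, and reads the docstring back with standard-library tools.

```python
import inspect

def mean_of(values):
    """
    Return the arithmetic mean of a non-empty list of numbers.
    Example: average = mean_of([1, 2, 3])
    """
    return sum(values) / len(values)

# The raw docstring is stored on the function object itself.
print(mean_of.__doc__)

# inspect.getdoc returns the same text with the indentation cleaned up.
print(inspect.getdoc(mean_of))
```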
## Looking at Errors
In [1]: def favorite_ice_cream():
...: ice_cream = ["chocolate","vanilla","strawberry"]
...: print(ice_cream[3])
...:
In [2]: favorite_ice_cream
Out[2]: <function __main__.favorite_ice_cream>
In [3]: favorite_ice_cream()
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-3-2abb2966448f> in <module>()
----> 1 favorite_ice_cream()
<ipython-input-1-5c427c5d4c54> in favorite_ice_cream()
1 def favorite_ice_cream():
2 ice_cream = ["chocolate","vanilla","strawberry"]
----> 3 print(ice_cream[3])
4
IndexError: list index out of range
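The list has three items, so its valid non-negative indices are 0, 1, and 2, and asking for index 3 raises the `IndexError`. As a sketch of one way to handle this (not something covered in the traceback above), the variant below takes the index as a hypothetical `position` argument and catches the error instead of crashing.

```python
def favorite_ice_cream(position):
    """Return the flavour at position, or None if it is out of range."""
    ice_creams = ['chocolate', 'vanilla', 'strawberry']
    try:
        return ice_creams[position]
    except IndexError:
        # The last valid non-negative index is len(ice_creams) - 1.
        print('No flavour at index', position,
              '- the last valid index is', len(ice_creams) - 1)
        return None

print(favorite_ice_cream(2))
print(favorite_ice_cream(3))
```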
---
def some_function():
return 1
File "<ipython-input-8-0ebe323d3e9c>", line 2
return 1
^
IndentationError: expected an indented block
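Unlike the `IndexError` earlier, an `IndentationError` is raised before any code runs, while Python is still reading the source. The sketch below demonstrates this with the built-in `compile` function on two source strings; the helper name `check_source` is made up for this example.

```python
# Source text with the same mistake as above: the body is not indented.
broken_source = "def some_function():\nreturn 1\n"
# The fix is to indent the body of the function.
fixed_source = "def some_function():\n    return 1\n"

def check_source(source):
    """Return None if source parses, else the name of the error raised."""
    try:
        compile(source, '<example>', 'exec')
    except SyntaxError as error:
        # IndentationError is a subclass of SyntaxError, so it is caught here.
        return type(error).__name__
    return None

print(check_source(broken_source))
print(check_source(fixed_source))
```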
# History for day 2
```
ls
f = 'inflammation-01.csv'
import matplotlib.pyplot as plt
plt.plot
import numpy as np
data = np.loadtxt(f, delimiter=',')
data
np.diff?
pwd
a = [0, 2, 5, 9, 14]
np.array(a)
a_array = np.array(a)
np.diff(a_array)
datadiff = np.diff(data, axis=1)
datadiff
matplotlib.pyplot.imshow(data)
plt.imshow(data)
plt.figure()
plt.imshow(datadiff)
plt.colorbar()
plt.figure()
plt.imshow(datadiff, cmap=plt.cm.bwr)
plt.colorbar()
history
plt.xlabel('Days')
plt.xlabel('Days')
plt.xlabel('Days')
plt.gcf().set_xlabel('Days')
plt.gca().set_xlabel('Days')
5**2
5*2
5**2
result = 1
for i in range(3):
    result = result*num
num = 5
for i in range(3):
    result = result*num
result
5**3
result = 1
for i in range(3):
    result = result*num
    print(result)
for i in range(3):
    result = result*num
    print(i,result)
result = 1
for i in range(3):
    result = result*num
    print(i,result)
animals = ['cat', 'dog', 'fish']
for i in animals:
    print(i)
range?
for animal in animals:
    print(animal)
name = 'Newton'
name = 'Newton'
list(name)
for i in name:
    print(i)
name
name = 'Newton'
list(name)
for i in name:
    print(i)
name
list(name)
name
name = 'Newton'
name = list(name)
for i in name:
    print(i)
name
name = 'Newton'
for i in name:
    print(i)
name
name = 'Newton'
for i in range(len(name)):
    print(name[i])
len(name)
name = 'Newton'
for i in range(len(name)):
    print(i,name[i])
name
newstring = ''
'New' + 'ton'
'New' + 'ton'
newstring
for i in range(len(name)):
    newstring = newstring + name[i]
newstring
newstring = ''
for i in range(len(name)):
    newstring = newstring + name[i]
    print(newstring)
for i in range(len(name)):
    newstring = newstring + name[i]
    print(i,newstring)
newstring = ''
for i in range(len(name)):
    newstring = newstring + name[i]
    print(i,newstring)
for i in name:
    newstring = newstring + i
    print(i,newstring)
newstring = ''
for i in name:
    newstring = newstring + i
    print(i,newstring)
newstring = ''
for i in name:
    newstring = i + newstring
    print(newstring)
newstring = ''
newstring = ''
for i in name:
    newstring = i + newstring
    print(i,newstring)
newstring = ''
for i, letter in enumerate(name):
    newstring = i + newstring
    print(i,newstring)
newstring = ''
for i, letter in enumerate(name):
    print(i,newstring)
newstring = ''
for i, letter in enumerate(name):
    print(i,letter)
enumerate?
x = 5
coefficients = [2, 4, 3]
2*x**0 + 4*x**1 + 3*x**2
y = 2*x**0 + 4*x**1 + 3*x**2
for i, cc in enumerate(coefficients):
    print(i,cc)
y = 0
for i, cc in enumerate(coefficients):
    y = y + cc*x**i
y
for animal in animals:
    print(animal)
n = 0
for animal in animals:
    print(animal)
    n = n + 1
    print(n)
for i, animal in enumerate(animals):
    print(i,animal)
history
names = ['Curie', 'Newton', 'Turing']
names = ['Curie', 'Newtong', 'Turing']
names[1] = 'Newton'
names
name = 'Darwin'
name[0] = 'd'
names
name[1] = 'Darwin'
names = 'Darwin'
names = ['Curie', 'Newtong', 'Turing']
names[1] = 'Darwin'
names
name
name = 'darwin'
x = [1, 'Darwin', 3.14]
x
x = [['eggs', 'flour', 'sugar'],]
x = [['eggs', 'flour', 'sugar']]
x
x = [['eggs', 'flour', 'sugar'], ['onions', 'cucumbers', 'pepper']]
x
x[0]
x[0][0]
x[-1]
my_list = []
for baking_supply in x[0]:
    my_list.append(baking_supply)
my_list
num = 53
if num > 0:
    print('number is positive')
elif num==0:
    print('number is zero')
else:
    print('number is negative')
if num>50 and num<100:
    print('number is between 50 and 100')
elif num==0:
    print('number is zero')
else:
    print('number is negative')
if num<50 or num>100:
    print('number is smaller than 50 or greater than 100.')
else:
    print('number is between 50 and 100.')
history
list(range(5))
x = 1
counter = 1
for i in range(10):
    counter = counter + 1
    print(counter)
for i in range(10):
    counter += 1
    print(counter)
counter = 1
for i in range(10):
    counter = counter*i
    print(counter)
for i in range(1,10):
    counter = counter*i
    print(counter)
counter = 1
for i in range(1,10):
    counter = counter*i
    print(counter)
counter = 1
for i in range(1,10):
    counter *= counter
    print(counter)
for i in range(1,10):
    counter *= i
    print(counter)
for i in range(1,10):
    counter = counter*i
    print(counter)
counter = 1
for i in range(1,10):
    counter -= i
    print(counter)
positive_sum = 0
negative_sum = 0
test_list = [3, 4, 6, 1, -1, -5, 0, 7, -8]
for i in test_list:
    if i>0:
        positive_sum = positive_sum + i
    else:
        negative_sum = negative_sum + i
positive_sum
negative_sum
for i in test_list:
    if i>0:
        positive_sum = positive_sum + i
    elif i==0:
        pass # Do nothing.
    else:
        negative_sum = negative_sum + i
ls
large_files = []
small_files = []
other_things = []
from glob import glob
filenames = glob("*")
filenames
string = "I'm hungry"
string = 'I'm hungry'
filenames
large_files
small_files
other_things
data_files = []
image_files = []
other_things = []
string
if 'hun' in string:
    print("'hun' is a substring of "+string)
'inflammation-' in 'inflammation-01.csv'
filenames
other_things
other_things.append('script.py')
other_things
from glob import glob
data_files = []
image_files = []
other_things = []
filenames = glob('*')
for filename in filenames:
    if '.csv' in filename: # This is a data file.
        data_files.append(filename)
    elif '.png' in filename: # This is an image file.
        image_files.append(filename)
    else: # Neither a data nor an image file.
        other_things.append(filename)
data_files
image_files
other_things
data_files
from glob import glob
data_files = []
image_files = []
other_things = []
filenames = glob('*')
for filename in filenames:
    if 'inflammation' in filename and '.csv' in filename: # This is a data file.
        data_files.append(filename)
    elif '.png' in filename: # This is an image file.
        image_files.append(filename)
    else: # Neither a data nor an image file.
        other_things.append(filename)
data_files
from glob import glob
data_files = []
image_files = []
other_things = []
filenames = glob('*')
for filename in filenames:
    if 'inflammation' in filename: # This is a data file.
        data_files.append(filename)
    elif '.png' in filename: # This is an image file.
        image_files.append(filename)
    else: # Neither a data nor an image file.
        other_things.append(filename)
data_files
from glob import glob
data_files = []
image_files = []
other_things = []
filenames = glob('*')
for filename in filenames:
    if 'inflammation' in filename and 'csv' in filename: # This is a data file.
        data_files.append(filename)
    elif '.png' in filename: # This is an image file.
        image_files.append(filename)
    else: # Neither a data nor an image file.
        other_things.append(filename)
data_files
ls
ls *png
glob('*.png')
def make_plots(filename):
    fig = plt.figure(figsize=(10,3))
    data = np.loadtxt(filename, delimiter=',')
    ax1 = fig.add_subplot(1, 3, 1)
    ax2 = fig.add_subplot(1, 3, 2)
    ax3 = fig.add_subplot(1, 3, 3)
    ax1.set_ylabel('Average')
    ax2.set_ylabel('Max')
    ax3.set_ylabel('Min')
    ax1.plot(np.mean(data, axis=0))
    ax2.plot(np.max(data, axis=0))
    ax3.plot(np.min(data, axis=0))
    return fig
make_plots?
def make_plots(filename):
    "Function to make plots of patient data."
    fig = plt.figure(figsize=(10,3))
    data = np.loadtxt(filename, delimiter=',')
    ax1 = fig.add_subplot(1, 3, 1)
    ax2 = fig.add_subplot(1, 3, 2)
    ax3 = fig.add_subplot(1, 3, 3)
    ax1.set_ylabel('Average')
    ax2.set_ylabel('Max')
    ax3.set_ylabel('Min')
    ax1.plot(np.mean(data, axis=0))
    ax2.plot(np.max(data, axis=0))
    ax3.plot(np.min(data, axis=0))
    return fig
make_plots?
def make_plots(filename):
    """
    Function to make plots of patient data.
    Example: fig = make_plots(data_filename)
    """
    fig = plt.figure(figsize=(10,3))
    data = np.loadtxt(filename, delimiter=',')
    ax1 = fig.add_subplot(1, 3, 1)
    ax2 = fig.add_subplot(1, 3, 2)
    ax3 = fig.add_subplot(1, 3, 3)
    ax1.set_ylabel('Average')
    ax2.set_ylabel('Max')
    ax3.set_ylabel('Min')
    ax1.plot(np.mean(data, axis=0))
    ax2.plot(np.max(data, axis=0))
    ax3.plot(np.min(data, axis=0))
    return fig
make_plots?
np.mean?
def favorite_ice_cream():
    ice_creams = ['chocolate', 'vanilla', 'strawberry']
    print(ice_creams[3])
favorite_ice_cream()
def some_function()
def some_function():
return 1
print(a)
print(b)
count = 1
Count = 1
count is Count
count == Count
file_handle = open('myfile.txt', 'r')
file_handle = open('inflammtion-01.csv', 'r')
file_handle = open('inflammation-01.csv', 'r')
clear
numbers = [1.5, 2.3, 0.7, -0.001, 4.4]
total = 0
for n in numbers:
    total += n
total
total = 0
for n in numbers:
    assert n>0
    total += n
for n in numbers:
    assert n>0, "Data should only contain positive values."
    total += n
```
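The last entries in the history use `assert` to stop as soon as the data break an expectation. The sketch below wraps that loop in a function and catches the resulting `AssertionError` so the message can be seen without ending the session. The function name `total_of_positive` is an assumption made for this example.

```python
def total_of_positive(numbers):
    """Sum numbers, insisting that every value is positive."""
    total = 0
    for n in numbers:
        # If the condition is False, AssertionError is raised with this message.
        assert n > 0, "Data should only contain positive values."
        total += n
    return total

# All values are positive, so the sum is returned.
print(total_of_positive([1.5, 2.3, 0.7, 4.4]))

# The list from the session contains -0.001, so the assertion fails.
try:
    total_of_positive([1.5, 2.3, 0.7, -0.001, 4.4])
except AssertionError as error:
    print('Assertion failed:', error)
```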