Introduction to Python Programming

Renato Alves & Toby Hodges
EMBL Bio-IT Project

23 & 24 March 2020

EMBL Bio-IT
Sponsored by de.NBI

Course Material

https://hackmd.io/nFqIR8nqQ86FeFNY-Em8nA?both
You can access the course material at https://github.com/tobyhodges/ITPP

  • download ZIP of material
  • unzip
  • move unzipped itpp-master folder to Desktop

Type 'x' below when you've downloaded and unzipped the folder to your Desktop

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxXxxxxxxxXxXXx

Things to do next

It's important to keep learning and playing with your new skills and to start applying them to some small projects! Otherwise, what you have learned in this course will soon be lost.

  • More python:

    • Check out the material from an intermediate python course run by EMBL Bio-IT in 2017: https://github.com/mgalardini/2017_python_course
    • EMBL members can join the EMBL Python User Group. The group meets every two weeks for a 60-90 minute session discussing and/or teaching a particular topic. Sessions will be on Zoom for the coming weeks/months (at least) - contact us if you'd like to be added to the mailing list.
      • next session: 2020-03-31 16:00
    • connect with other EMBL members and ask questions on the Python channel of the EMBL chat: https://chat.embl.org/embl/channels/python
  • Scientific computing / data science in python:

  • Coding Exercises:

    • Rosalind - write your own Python programs to solve bioinformatics problems
    • Advent of Code - (Christmas-themed) programming challenges. There are 50 every December since 2015 (250 total!)

Schedule

Sessions marked with a * will take place with the whole group.
Self-directed sessions will take place in breakout rooms with
support from the instructors/helpers.

We will do our best to ensure breaks happen as listed below.
Length of discussion & demo sessions may vary from those listed,
depending on number of questions received, etc.

Day 1
09:30 Introduction & Installation Troubleshooting *
10:00 Self-Paced Work with Support
11:00 Morning Break *
11:15 Working with Lists (Discussion, Demo & Exercise Walkthroughs) *
11:30 Self-Paced Work with Support
12:30 Lunch Break *
13:30 Debugging (Discussion, Demo & Exercise Walkthroughs) *
14:00 Self-Paced Work with Support
15:00 Afternoon Break *
16:00 Discussion, Demo, Feedback & Wrap-up *
17:00 End
Day 2
09:30 Recap & Discussion
10:00 Self-Paced Work with Support
11:00 Morning Break *
11:15 Nested Data Structures (Discussion, Demo & Exercise Walkthroughs) *
11:30 Self-Paced Work with Support
12:30 Lunch Break *
13:30 Reading Data from a File (Discussion, Demo & Exercise Walkthroughs) *
14:00 Self-Paced Work with Support
15:00 Afternoon Break *
16:00 Plotting Exercise Walkthrough & Wrap-up *
17:00 End

Questions

write your questions here - we will review them in the discussion sessions

  • Q: What does [*] before a cell mean ?

    • A: The [*] means the cell is executing. It should turn into a number once done. If it doesn't, the Python process might be stuck. In the menu above the notebook you should see "Kernel"; in that menu select "Interrupt Kernel" or, if this doesn't work, "Restart Kernel". If you restart the kernel you will lose any defined variables.
  • Q: Is there a shortcut to delete a cell?

    • A: select the cell with the mouse or arrows and press X or dd. See also the Jupyter essentials section further down in this document.
  • Q: what does this operation ^ mean?

    • A: In Python, if you want to do the power operation you use ** (e.g. 3**3 == 27).
      ^ is the bitwise XOR (exclusive or) operator. See this stackoverflow question for more information.
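
A small sketch of the difference described in the answer above. The variable names are only illustrative; the operators themselves are standard Python.

```python
# ** raises a number to a power; ^ is bitwise XOR on integers.
power = 3 ** 3       # 3 * 3 * 3
xor_same = 3 ^ 3     # identical bits cancel out
xor_mixed = 5 ^ 3    # 0b101 XOR 0b011 = 0b110

print(power)       # 27
print(xor_same)    # 0
print(xor_mixed)   # 6
```
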
  • Q: Is there a difference between typing "10 + 3" and "10+3" without spaces in the command? I've seen that the given answer is the same, but maybe this distinction is important in further uses.

    • A: No. In Python, spaces in this context are ignored. Whitespace is important only at the beginning of a line, i.e. for indentation.
  • Q: what is the difference between an object and a variable?

    • A: A variable is a name given to an object (e.g. number=10, the variable is number and the object the number 10). Every entity in Python is an object.
  • Q: I couldn't get the command keys to work

    • A: Press Esc to exit insert mode. The blue box around the cell should disappear. At this point the command keys should work.
  • Q: I don't understand where the error is in the final exercise "Debugging exercise". Could you please explain what kind of error it is?

    • A: If you try to execute the cell what kind of error do you get? If you fix the error is there a second one?
  • Q: When using .pop(), it doesn't let me choose which mayonnaise I want to remove telling me it can only take one string or command at a time: no numbers then to define which mayonnaise I want gone How do I tell the program which mayonnaise in my list I want gone?

    • A (from GR): I think the object and not the variable has to be entered in the parentheses of .pop(); for example, use .pop(0) instead of .pop(mayonnaise) if you want to delete your mayonnaise in position 0 of the list
    • A (from Q): If you do .pop(mayonnaise) , the last mayonnaise is being removed. If you do .pop(number) the word (not necessarily mayonnaise) is being removed. That's what I experienced. :D
    • A (from Renato): help(shopping.pop) should help here. GR is correct, .pop() takes a number representing a position in the list. .pop() doesn't accept strings so .pop("mayonnaise") doesn't work. Note as well that .pop(mayonnaise) is referring to a variable name mayonnaise. If you want to remove one of the "mayonnaise" entries you can use .remove().
    • Q: but then remove will not remove selectively, correct? It will only remove the first of the named items.
      • A: .remove() will remove the first occurrence in the list. You will need to call .remove() multiple times to remove all occurrences.
      • Q: If I have multiple mayonnaise and only want to remove mayonnaise 3 in position 5, I cannot selectively remove mayonnaise 3 in position 5.
        • A: You would need to know the position of the mayonnaise you want to remove and then use .pop(position). You can use .index() to find the first occurrence of mayonnaise. help(list.index) will tell you how to start looking for words after a given position, which will allow you to find subsequent occurrences.
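
The approach from this thread could look roughly like the sketch below. The contents of `shopping` are made up for illustration; `.index()`, `.pop()` and the optional start position of `.index()` are standard list methods.

```python
# Hypothetical shopping list with three "mayonnaise" entries.
shopping = ["bread", "mayonnaise", "eggs", "mayonnaise", "milk", "mayonnaise"]

# .index(value, start) begins searching at position `start`,
# so each call continues just after the previous hit.
first = shopping.index("mayonnaise")
second = shopping.index("mayonnaise", first + 1)
third = shopping.index("mayonnaise", second + 1)

# .pop(position) removes exactly the entry at that position.
removed = shopping.pop(third)

print(first, second, third)  # 1 3 5
print(removed)               # mayonnaise
print(shopping)
```

By contrast, `shopping.remove("mayonnaise")` would always remove the entry at the first position found.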
  • Q: can .sort(), sort by something else than alphabetical order?

    • A: Yes, try help(variable.sort) and you will find the key= argument. key= is supposed to be a function that defines how to sort elements: it is called on each element and the results are used for the comparison. See here for more examples.
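
As a minimal sketch of the key= argument mentioned above (the word list is invented for this example):

```python
words = ["kiwi", "banana", "Cherry", "fig"]

# Default order compares characters, so uppercase letters sort first.
default_order = sorted(words)

# key= receives each element and returns the value to compare.
ignoring_case = sorted(words, key=str.lower)
by_length = sorted(words, key=len)

# .sort() accepts the same arguments but changes the list in place.
words.sort(key=len, reverse=True)

print(default_order)   # ['Cherry', 'banana', 'fig', 'kiwi']
print(ignoring_case)   # ['banana', 'Cherry', 'fig', 'kiwi']
print(by_length)       # ['fig', 'kiwi', 'banana', 'Cherry']
print(words)           # ['banana', 'Cherry', 'kiwi', 'fig']
```
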
  • Q: Probably not important, but I've noticed that once you append mayonnaise into the list, the list is displayed visually with every word in a different line; however, when you remove all the "mayonnaise"s added and execute, then the list appears again in a single line like in the very beginning. Is there any specific reason for that?

    • A: This happens because Jupyter is trying to be friendly and make it readable. It can also happen if some of the words are quite long. If you use print() the result should be consistent.
  • Q: What is a "syntax error EOF"?

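    • A: EOF stands for End of File. This usually happens if Python is looking for a closing character but reached the end of the file before finding it. One case where this would happen is if you open a bracket but never close it (e.g. print("This call is missing the closing parenthesis"). Note that a string that is missing its closing quote on a single line typically gives an "EOL" (End of Line) error instead, at least in Python versions up to 3.9.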
  • Q: Before exercise 2.4 it is mentioned that even in cases in which indirect loops are needed, there are ways to do it that are more efficient than using "range" objects. Could you provide some examples on this? For example another option that came to mind was using something like "for i,j in zip(list_A, list_B):" Would this be a more efficient option or a less efficient one?

    • A: This is another case of a legacy option. In python 3 you should always use for element in my_list. Using zip() is efficient if you need to loop over pairs of elements or list elements that are linked. In Python 3 zip() is a very efficient function and doesn't create a duplicate of the lists being zipped.
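
A short sketch of the zip() option raised in the question. The list contents are assumptions; the names `shopping` and `amounts` are borrowed from Exercise 2.4 as quoted elsewhere in these notes.

```python
shopping = ["bread", "eggs", "milk"]
amounts = [1, 12, 2]

# zip() pairs up elements by position without needing an index.
pairs = []
for item, amount in zip(shopping, amounts):
    pairs.append((item, amount))
    print(item, amount)

# If the position itself is needed, enumerate() provides it.
positions = []
for position, item in enumerate(shopping):
    positions.append(position)
    print(position, item)
```

Note that zip() stops at the end of the shortest list, so lists of different lengths do not raise an error.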
  • Q: How do you get to use the "Markdown essentials" or how are they relevant? I'm assuming it's not possible to use them in the JupyterLab since I haven't found out a way to do so (not even using print()). So in which situation/where would you use them to see the expected visual return? Is it mentioned only so that we can use it here when asking questions and such?

    • A: Jupyter can render markdown but you need to change the cell type to Markdown instead of Code. Select it in the drop-down menu on the top of the notebook page.
  • Q: in Jupyter, there are different cells, where we run different functions now in Spyder, there is this main code that we run on this left-side window, and if we want to run different things, we open different files, is that correct?

    • A: A file in Spyder would be equivalent to a notebook in Jupyter. In Spyder you can run only a subset of lines by selecting the lines and pressing F9 or the corresponding button on the toolbar.
  • Q: I tried running only some selected lines. Did not work :( spyder runs whole file.

    • A: Use the F9 key to run selected lines in Spyder. There is also a button for this: find the green arrow in the menu at the top and then move 3 buttons to the right; there you go. The symbol looks like a square followed by a vertical line followed by an arrow.
  • Q: When formatting text: How can I include formatting such as padding and aligning (Example: '{:>10}'.format('test')) when more than one element is displayed (for example in exercise 2.5)?

    • A (from Renato): I recommend the great https://pyformat.info/ resource. If you want to format more than one element you need more than one set of {} and you can add padding instructions to each individually. (e.g "{:>10}{:>5}".format("one", "two"))
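
To illustrate the answer above, here is a small sketch with one set of {} per element. The header and row values are invented for this example.

```python
# Each {} carries its own alignment and width:
# > aligns right, < aligns left, the number is the field width.
row = "{:>10}{:>5}".format("one", "two")
header = "{:<10}{:>5}".format("item", "qty")
line = "{:<10}{:>5}".format("eggs", 12)

print(repr(row))   # '       one  two'
print(header)
print(line)
```
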
  • Q: How can I solve Exercise 1.3?

    • A: words already contains a list of words. You can start by obtaining the fourth word (fourth = words[3]) and then the third letter (fourth[2]) or without using an intermediate variable (words[3][2]).
  • Q: How do I run the spyder interface? Does it come with the .zip folder?

    • A: Spyder should have been installed as part of the Anaconda installation you did before the course. That means it can be launched via the Anaconda Navigator. You may also be able to find and run it by searching in the Start Menu (Windows) or pressing command+spacebar (Mac) and searching for "Spyder". If this isn't working for you, please let us know in the main room and we will try to figure it out alongside you.
  • Q: What exactly is elif:?

    • A: elif is a shorthand for "else if" - you should use it to provide alternative tests to run after the initial if statement. For example:

    if colour == "red":
        print("you lose!")
    elif colour == "black":
        print("you win!")
    elif colour == "green":
        print("Everybody wins! :)")

    will print "you lose!" if the variable colour has the value "red" but, if it doesn't have that value, the next test is performed. This next test checks whether colour has the value "black": if it does, then "you win!" is printed; if not, then the next test (whether colour has the value "green") is run. Please let us know below if this is still unclear!

    • It is now clear, thanks
  • Q: (Exercise 3.2) The cheat sheet says that myDict.keys() returns the list of keys of the dictionary. But list Operations like myDict.keys().sort() do not work. Is there an easy way to make the statement's return item a list?

    • A: In python 2, myDict.keys() used to return a list. In python 3, this is no longer the case; it returns a view instead, which is faster in some applications. This is a mistake in the cheat sheet. You can use list(myDict.keys()) to get the list and then sort it, i.e. list(myDict.keys()).sort().
    • Q: The list now contains str items and myList.sort()returns None in my environment.
      • A: Actually, myList.sort() always returns None. It sorts the list "in place", which indeed doesn't work here because the list of keys wasn't assigned to any variable. To have the sorted list returned, use sorted(myList). See the following examples:
        myList = [2, 1, 3]
        print(myList)
        # >> [2, 1, 3]
        myList.sort()
        print(myList)
        # >> [1, 2, 3]
        print([2, 1, 3].sort())
        # >> None
        print(sorted([2, 1, 3]))
        # >> [1, 2, 3]
        
  • Q: However, the behavior seems different if myList contains only str items?

    • A: It shouldn't be different as far as I know. The example below shows exactly the same behavior with strings, no?
    • Q: As soon as I do not try to assign and print sortedKeyList, it does. Will think about it. :-) Thank you.
      Added comment in code below.
      • A: Correct! myList.sort() modifies an existing list. sorted(myList) leaves the input list unchanged and returns a new, sorted list.

        theDict = {
            'A':{},
            'C':{},
            'B':{},
        }

        keyList = list(theDict.keys())
        print(keyList)
        # >> ['A', 'C', 'B']
        sortedKeyList = keyList.sort() # keyList.sort() does not return a list: it sorts keyList in place and returns None
        print(sortedKeyList)
        # >> None
        print(sorted(keyList))
        # >> ['A', 'B', 'C']
        
  • Q: What is a cell (in Spyder)? and a Kernel?

    • A: A cell in Spyder is a block of code; more info here. A kernel is the working engine of a Jupyter notebook. Jupyter can be used with languages other than Python. You can have kernels for other languages or even different versions of the same language.
  • Q: For exercise 2.7 the following code works:

    def hypot(a, b):
        return (a**2 + b**2)**0.5

    hypot(3, 4)

    I don't quite understand the missing sqrt part; I keep getting "name 'math' is not defined"

    • A: The sqrt function is not directly available in python. It lives in the math library and as such it needs to be imported. If you use import math you can then use math.sqrt().
    • Great thanks! That worked.
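
A minimal sketch of the import described in the answer above, reusing the hypot function from the question. math.sqrt and math.hypot are part of the standard library.

```python
import math  # without this line, math.sqrt raises a NameError


def hypot(a, b):
    # Same calculation as (a**2 + b**2)**0.5, using math.sqrt instead.
    return math.sqrt(a**2 + b**2)


result = hypot(3, 4)
print(result)            # 5.0
print(math.hypot(3, 4))  # the math module also offers this directly
```
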
  • Q: For exercise 3.2 - what is the best way to sort the names in the group of groups? I did a sort function for each subgroup. Not very elegant:

    GroupA.sort()
    GroupB.sort()
    GroupC.sort()

    for group in AllGroups.keys():
        print(group)
        for student in AllGroups[group]:
            print(student)
    
    • A: Whenever you repeatedly call the same function (such as here .sort()), it's a good idea to think about how to move this into a loop to remove the repetition. Also, have a look at sorted(list) as an alternative to list.sort(); it could be helpful here.
  • Q: (Chapter 4 Getting Data From Files) I don't get any output (in Spyder) if I try to print every line from dataFile directly. Only creating the list of lines first and then printing every line from that list yields the desired output, which defeats the whole point of trying to save memory. :-/

    dataFile = open('speciesDistribution.txt', 'r')
    allDataLines = dataFile.readlines()

    for line in allDataLines:
        line = line.strip()
        print(line)
    
    • A: Okay, so this works as intended, right? What was the version that doesn't work for you?
      dataFile = open('speciesDistribution.txt', 'r')

      for line in dataFile:
          line = line.strip()
          print(line)
      
      • A: This works for me. If you're still having trouble with this, please head back to the main room and we'll try working through it together.
  • Q: It works for me now, too. I hadn't removed/commented out the allDataLines = dataFile.readlines() line. As soon as I comment out this line everything works as intended.

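    • A: Ah yes. It's important to remember that every time a line is read from a file object (whether by dataFile.readlines() or by for line in dataFile), that line is consumed: the file object remembers its position and will not return the same line again. After readlines() has read to the end of the file, a following for loop finds nothing left to read.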
    • Thank you for the explanation!
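
The behaviour discussed in this thread can be reproduced with the sketch below. It writes a small temporary file (the file name and contents are invented) so that it does not depend on the course data file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.txt")
    with open(path, "w") as handle:
        handle.write("first\nsecond\nthird\n")

    with open(path, "r") as dataFile:
        # readlines() reads everything up to the end of the file...
        allDataLines = dataFile.readlines()
        # ...so looping over the same file object finds nothing left.
        leftover = [line.strip() for line in dataFile]
        # seek(0) moves the position back to the start of the file.
        dataFile.seek(0)
        again = [line.strip() for line in dataFile]

print(len(allDataLines))  # 3
print(leftover)           # []
print(again)              # ['first', 'second', 'third']
```
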
  • Q: Can you explain the difference between (), [], {} in the command? Thnx!

    • A: These symbols are different syntax to define tuples, lists and dictionaries. [] is roughly equivalent to list(), {} is roughly equivalent to dict(), and () has several uses: it creates a tuple when used with a comma (e.g. (1,2,4) or (1,)), it calls a function as in list(), and it groups operations for mathematical precedence as in 2 * (3 + 1). There are additional contexts where these symbols have different meanings. Some keywords to look up for that are: list comprehension, dictionary comprehension, generators, sets.
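
A compact sketch of the cases listed in the answer above; all variable names are illustrative only.

```python
a_list = [1, 2, 4]        # square brackets: list
a_tuple = (1, 2, 4)       # parentheses with commas: tuple
single = (1,)             # the comma makes the tuple, even with one element
not_a_tuple = (1)         # without a comma this is just the number 1
a_dict = {"a": 1}         # curly braces with key: value pairs: dictionary
a_set = {1, 2, 4}         # curly braces with plain values: set
empty = {}                # empty curly braces give a dictionary, not a set

grouped = 2 * (3 + 1)     # parentheses for mathematical precedence
called = list((1, 2, 4))  # parentheses to call a function

print(type(a_list), type(a_tuple), type(a_dict), type(a_set))
print(grouped, called)
```
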
  • Q: I still can't figure out how to do this (previous question referring to exercise 3.2). If I move group.sort() into the loop it doesn't work

    • A: Check what is group inside the loop. Can you make it be the list you want to sort?
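
One possible way to follow this hint is sketched below. The dictionary contents are invented, and the real exercise data may be structured differently; the point is only that `group` is a key (a string), while `AllGroups[group]` is the list that can be sorted.

```python
# Hypothetical data: group names mapped to lists of students.
AllGroups = {
    "GroupA": ["Mona", "Adam"],
    "GroupB": ["Zoe", "Li"],
}

printed = []
for group in AllGroups:
    # sorted() returns a new sorted list and leaves the original untouched.
    for student in sorted(AllGroups[group]):
        printed.append((group, student))
        print(group, student)
```

Calling `AllGroups[group].sort()` inside the outer loop would work as well, but it changes the stored lists in place.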
  • Q: When talking about reading files, r+ mode is mentioned as being reading + write. Does this "write" function work like "w" or more like "append" mode. I mean, will "r+" clear the text in the file?

    • A: "r+" opens the file for both reading and writing but, unlike "w", it does not clear the file when it is opened. For append you have "a". The difference between "r+" and "a" has to do with where in the file the reading/writing cursor is positioned: with "r+" it's positioned at the beginning of the file, so writing overwrites existing content from there, while with "a" it's positioned at the end.
    • Very clear! Thank you very much :)
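
The three modes can be compared with a small experiment like the one below. It uses a temporary file with made-up contents, so nothing from the course material is touched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "modes.txt")
    with open(path, "w") as handle:
        handle.write("abcdef")

    # "r+" keeps the content and starts writing at the beginning.
    with open(path, "r+") as handle:
        handle.write("XY")
    with open(path) as handle:
        after_r_plus = handle.read()

    # "a" keeps the content and writes at the end.
    with open(path, "a") as handle:
        handle.write("!")
    with open(path) as handle:
        after_append = handle.read()

    # "w" clears the file as soon as it is opened.
    with open(path, "w") as handle:
        handle.write("Z")
    with open(path) as handle:
        after_w = handle.read()

print(after_r_plus)   # XYcdef
print(after_append)   # XYcdef!
print(after_w)        # Z
```
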
  • Q: I've realised that in Exercise 2.4, both making use of:

for i in range(len(shopping)):
    print(shopping[i], amounts[i])

and

for i in range(len(amounts)):
    print(shopping[i], amounts[i])

works fine and returns what you expect. I don't quite understand why, though. I would have expected (and lost quite some time trying that) that I had to use the for statement and the range(len()) function in both "shopping" and "amounts", such that it knows that I want it to go through all indices inside both lists. How making the loop only indicating that for the given indices in a given range in ONE of the lists takes into consideration to go through both lists and return all indices?

  • A: Perhaps what is confusing here is that once we use the function len() we will effectively be working with one number. So the code is doing something like: len(...) -> 7 and then range(7). As long as both shopping and amounts have 7 elements, we will get the same result. If you want to see a different behavior between the two cases you pasted, try adding an additional item to the shopping list with shopping.append(item) and repeat the first case.
  • Q: Maybe a better way of asking it is: I thought that until you use the range(len()) function, your list components behave as items and not indices (items meaning parts in a list and indices meaning positions in a given range), so what's confusing to me is that you don't have to specify that you will treat something as an index in a given range rather than an item on a list. I hope this is not too confusing, I don't know how to ask it in a clearer way
    • A: The original list isn't actually being modified. What defines how our for loop behaves is the object we loop over, i.e. what we place after in. Perhaps compare and run the following code:
    print(shopping)
    for item in shopping:
        print(item)

    indices = list(range(len(shopping)))
    print(indices)
    for index in indices:
        print(index)

    # Notice that indices was derived from shopping but effectively is just a list of numbers
    
  • Q: In the end of part 4, we are asked import gridplot from bokeh.io. I got an error "cannot import name 'gridplot' from 'bokeh.io'". Am I the only one to have this problem and how can I fix it?
    • A: Apparently, bokeh has changed its internal structure since this tutorial was created. The function gridplot is now no longer in io but instead in layouts. So try from bokeh.layouts import gridplot.
  • Q: I just copied and pasted the solution for exercise 3.3 into my notebook and it produced an error message - offending line: group_scores = list(AllGroupResults[group].values()) The error says TypeError: 'list' object is not callable. If I delete 'list' and the surrounding brackets it works.
    • A: can you print(list)? It should show <class 'list'>.
      • No
    • A: You must have overwritten list somewhere in the notebook. If you have something like list=[1,2,3] then the built-in function list() is "gone".
      • Yes - I have found it further up my notebook - Thanks!
  • Q: How do I clear this list error?
    • A: You mean the one above? You will have to restart your kernel and rerun the notebook (without the line where the list function is being overwritten). Alternatively you can also use del list to delete the variable list returning to the original list() behavior.
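
The situation from the two threads above can be reproduced in a fresh notebook cell or script, as sketched below. It assumes the code runs at the top level (not inside a function), which is the normal case in a notebook.

```python
# Assigning to the name "list" hides the built-in function.
list = [1, 2, 3]

try:
    list("abc")
    error_message = None
except TypeError as error:
    error_message = str(error)

print(error_message)  # 'list' object is not callable

# Deleting our variable makes the built-in visible again.
del list
restored = list("abc")
print(restored)       # ['a', 'b', 'c']
```
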
  • Q: Right after Exercise 4.8 the gridplot command takes a list of lists of figures as first (and only provided) argument. Plotting does work as intended when layout is changed to:
layout = gridplot([[fig1, fig2],
                  [fig3, None]])
​​​​- A: 

Exercise progress

This section has been moved to the end of the current document


Advice

  • Make a new cell for experimenting in
  • Check for typos
  • Use tab-completion
  • Python is case-sensitive i.e. A ≠ a
  • Dealing with errors is part of the process
  • If you're stuck, ask for help

Jupyter Essentials

  • shift + enter: execute cell and move to the next cell
  • ctrl + enter: execute cell and keep it selected
  • alt + enter: execute cell and open a new code cell immediately below
  • esc: exit edit mode -> command mode

When in command mode, the following keyboard shortcuts can be used:

| key | effect |
|-----|--------|
| C | copy selected cell |
| V | paste copied cell below selected cell |
| X | cut (copy+remove) selected cell |
| A | insert new, blank cell above selected cell |
| B | insert new, blank cell below selected cell |
| H | display help sheet |

Markdown essentials

and multiple
lines
of code
with triple backticks
on their own lines
import numpy

def print_explanation():
    explanation = '''
    you can even turn on syntax highlighting for
    most languages by naming the language immediately
    after the opening three backticks
    '''
    print(explanation)

Zoom chat transcript

  • Q: I downloaded ITPP materials but got a GitHubDesktopSetup file?
    • A: You have downloaded the GitHub Desktop application instead of a copy of the ITPP materials. Click the "Clone or download" button followed by "Download ZIP"
  • Q: I can't hear anything, audio doesn't work.
    • A: Check that you have joined the audio call by clicking the "Join the audio" button on the bottom left corner of your Zoom screen. If you see an unmute button you are already joined. You can also click in the up arrow next to the unmute button to check your audio devices and settings.
  • Q: MacOS instructions say to launch ipython. I'm confused.
    • A: Sorry for the confusing instructions. ipython is another interface to python. For the present exercises you should be able to use the Jupyter Notebook or Jupyter Lab interface. Check above for the Jupyter essentials section which includes keyboard shortcuts to execute and edit a notebook cell.
  • Q: Where can I find other functions?
    • A: We will cover how to import Python modules on day two. You have also installed Anaconda, which includes the conda package manager with which you can install additional Python packages.
  • Q: Can you define a set of numbers as a string?
    • A: numbers = "0123456789"
  • Q: What is the meaning of / in the help page of a function?

Sign In

Name / Affiliation / Operating system e.g. Toby Hodges / Zeller Team, EMBL Heidelberg, MacOS

  • Renato Alves, Bio-IT, Gibson Team, EMBL Heidelberg, Linux
  • Toby Hodges, Bio-IT, Zeller Team, EMBL Heidelberg, MacOS
  • Jonas Hartmann, De Renzis group, EMBL Heidelberg
  • Yacine Kherdjemil, Furlong Group, EMBL Heidelberg
  • Christina Ritter, Wilmanns Group, EMBL Hamburg
  • Catherine Stober, Korbel group, EMBL Heidelberg
  • Effie Mutasa-Gottgens, EMBL- EBI
  • Saravanan Panneerselvam, Schneider's Group, EMBL-Hamburg
  • Sarela Garcia-Santamarina, Typas group, EMBL-Heidelberg
  • Andrea D'Amato, IT Group, EMBL Hamburg
  • Eduard Avetisyan, IT Group, EMBL Hamburg
  • Clemente Borges, IT Group, EMBL Hamburg
  • Vincent Gureghian, DLSM, University of Luxembourg
  • Nadine Fernandez, Noh Group, EMBL Heidelberg
  • Zehra Sayers, Svergun Group, EMBL Hamburg
  • Georgia Rapti, EMBL Heidelberg
  • Julian Bauer, Karlsruhe Institute of Technology
  • Gera Smyshlyaev, Barabas Lab, EMBL Heidelberg
  • Mireia Osuna,EMBL Heidelberg
  • Matt Hall, EMBL-EBI
  • Luisa Abreu, Alexandrov Team, EMBL Heidelberg
  • Patryk Poliński, EMBL/CRG Barcelona
  • Tafsut Tala-Ighil, Hackett Group, EMBL Rome
  • Mariya Timotey Miteva, EMBL Monterotondo
  • Chiara Mungo, Rapti group, EMBL Heidelberg
  • Vibha Patil, Noh Lab, EMBL Heidelberg
  • Almudena Garcia, Noh Group, EMBL Heidelberg
  • Inessa De, C.W. Müller Group, EMBL Heidelberg
  • Kate Beckham - Wilmanns Group, EMBL Hamburg
  • Swaminathan K - IT Group, EMBL Hamburg
  • Kesha Josts - University of Hamburg
  • Martina Varisco / Furlong Group, EMBL Heidelberg
  • Montse Coll- EMBL Barcelona
  • Silvija Svambaryte / Aulehla Team / EMBL Heidelberg / MacOS
  • Clement Blanchet - SAXS Group, EMBL Hamburg
  • Laura Rodriguez de la Ballina, EMBL Heidelberg (ALMF visiting scientist from Simonsen lab, University of Oslo)
  • Jonas Hund, Karlsruhe Institute of Technology (KIT), MacOS
  • Ines Martinez Martin - Wilmanns Group, EMBL Hamburg
  • Victor Armijo, EMBL Grenoble
  • Sukrita Deb- Gross group, EMBL Rome
  • Eva-Lotta Käsper, Noh Group, EMBL Heidelberg
  • Jona Rada EMBL Heidelberg
  • Luca Santangeli - Arendt Group, EMBL Heidelberg
  • Lucas Teixeira Boldrini - Gross Group, EMBL Rome
  • Christoph Diehl, Erb Group, MPI Marburg
  • Elena Vizcaya Molina, Furlong Group, EMBL-heidelberg
  • Heura Cardona, Sharpe Group, EMBL-Barcelona
  • Marina Gil López - Uni Bielefeld
  • Aleksandra Sparavier, Marti-Renom group, CRG
  • Martin Gutierrez, Zeller Group, EMBL Heidelberg
  • Anthony Fullam, EMBL Heidelberg
  • Alberto Molares, Health Research Institute Santiago de Compostela
  • Umut Yildiz, Noh Group, EMBL-Heidelberg
  • Simonne Griffith-Jones, Bhogaraju Group, EMBL-Grenoble
  • Patrick Hasenfeld, Korbel group, EMBL-Heidelberg
  • Michael Agthe, Schneider Group, EMBL Hamburg
  • Filipa Torrao, Gross Group, EMBL Rome
  • Steffen Neumann, IPB Halle
  • Serdar Balci, Turkey, MacOS
  • Ulrike Wittig, HITS, Heidelberg
  • Aylin Haas, Greb group Centre for Organismal Studies, Uni Heidelberg
  • Alina Ritz, Rapti group, EMBL Heidelberg, MacOS
  • Goksin Liu, Svergun Group, EMBL Hamburg, MacOS
  • Dorian Battivelli, Gross Group, EMBL Rome
  • Cecile Petit, Wilmanns, EMBL-Hamburg
  • Silvia Natale, Gross Group, EMBL Roma
  • Chrysi Kapsali, Lancrin Group, EMBL Rome
  • Monika Mielnicka, Boulard Group, EMBL Rome
  • Charles, Windows
  • Biruhalem Taye, EMBL Hamburg

Introduction and installation troubleshooting

  • Python is a general purpose programming language and can be adapted to a large variety of tasks: web development, machine learning, image analysis, molecular modelling
  • relatively easy to read and write
  • but still a programming language
  • course materials as jupyter notebooks
  • Launch Jupyter Lab via Anaconda Navigator
  • you can access the course material at https://github.com/tobyhodges/ITPP
  • click clone or download
  • unzip ITPP-master.zip to Desktop or somewhere where you are able to find it
  • Use the interface on the left to find the course materials folder you downloaded
  • Open 1_GettingStarted.ipynb
  • Jupyter & Markdown cheatsheet in the shared notes
  • most sessions will be in breakout rooms
  • taught session will take place in the main room
  • if you have questions come back to the main room and ask for help

1_Getting started

Advice

  • make a new cell for experimenting in
  • check for typos
  • use tab-completion: start typing something and press tab, e.g. if you type pr and press tab it will be completed to print. You may get a dropdown menu if there are several options
  • python is case-sensitive
  • dealing with errors is part of the process
  • if you are stuck, ask for help

Feedback from Day 1

Something that was good about today

  • Nobody infected each other with SARS-CoV-2 ;)
  • interesting exercises
  • Thank you! Very helpful and easy to understand :)
  • I really like this approach in which everyone works more or less on their own, so we can all learn even when having really different backgrounds or programming levels.
  • well explained, working in breakout groups helps focusing on self-learning
  • course was exceptionally well run considering it is all online and the zoom worked well.
  • Thanks for answering the questions in real time. Nice answers
  • very good (easy to follow) examples and good documentation
  • With a minimum of background (structure of Spyder for example) the course is ideal. Really enjoyed how you explain the material. Easy to follow! Thanks a lot
  • It was great! Easy to follow and easy to ask things when needed
  • Good explanations and very progressive course
  • Very clear explanation. Examples easy to understand. Thank you!!
  • The course is very useful for those who already have some background, but for those who have never done any programming it is, in my opinion, difficult to manage to solve the exercises alone in the rooms.
  • having some experience with different programming languages, I found the day great and liked the amount of "home work"vs group work. Thanks
  • Having no experience with any programming language, I also really enjoyed the structure of the course. I like that I have to work and think on my own before discussing the material. Makes learning more effective.
  • As a person with 0 programming experience, I think that the material in the course is fitting and well explained and I personally enjoy that we get to work on our own for the most part (probably a matter of taste). Very well organised considering how big the group was. Best part: being able to ask questions in real time and in the shared document!

Something that we could improve for tomorrow

  • Clear instructions for breakout sessions (which part exactly should be worked on)
  • A lot of material was covered, maybe a bit much for one day. But I really liked it. Thanks. #stimulating
  • Some of the questions, if the answers are provided by the end of the day, might be useful to run through the course quicker or not waste too much time on being stuck at a specific question
  • it would be helpful to get an idea of how we should pace ourselves - i.e. which stage are we expected to have reached at the end of each session?
  • When someone asks you to explain, for example, exercise 2.5 while I'm still at 1.4, I get distracted trying to follow the answer: since someone has already asked, I cannot ask the same question later, but in the meantime I cannot understand the answer because I am behind. Can you go slower and show the solutions to all the exercises, starting from the first, and explain the last ones at the end? Those who are fast can follow, but those who are slow like me are left behind and it becomes even more difficult.
  • All the different programming environments mentioned in the materials (ipython, jupyter-notebook, jupyter-lab, spyder) may be confusing for beginners; maybe show and explain them once.
  • When working in smaller groups we don't know whether people are asking questions in the main room, and we could be interested in these questions/answers.
  • Go through the theory parts together as a group (the parts where the new concepts are being introduced) and then let us do the exercises alone, similar to how the regular expressions course was structured. This way we will all be at the same pace.
  • I guess it would be great to solve all exercises together, after the self-paced work.
  • Maybe it's just my problem, but I'd like to see how the exercises are solved. I have 0 experience in the field and I had many problems: in almost 99% of cases I got an 'error'. For tomorrow, can you explain how all the exercises are solved?
  • It would be better if we were assigned to the same small working groups every time, so that we could help each other more. Some rooms were very quiet :)
  • Day 2 was way rushed, I was a bit lost. But I really liked it! Thanks

Progress check

Please type 'x' next to the chapter that you were working on at the end of day 1:

  1. x
  2. xxxxxxxxxxxxxxxxxxxxxxxxxxXxxxXxxx
  3. xxxxxxxxxxxxxxxxxxxXXxxxxXx
  4. XxxxxxxxxxxxxxxxxxXxxxxxxxxx
  5. xxx
  6. xxxx

Can someone clarify what chapters 5 to 7 in the list above are? There are only 4 notebooks.

Exercise progress

Put your exercise solution requests below (e.g. 1.2; 2.3). Put an 'X' next to the question number to "upvote" other people's requests.

  • Exercise 1.3 - XXX
  • Exercise 2.1 -
  • Exercise 2.3 - XXXX
  • Exercise 2.4 -
  • Exercise 2.6 - XXX
  • Beginning programming: lst debugging exercise (+ optional exercise) - XXXXXX
  • Exercise 3.2 - XXxxxxxxx
  • Exercise 3.3 - Xxxxxxxxxx
  • Exercise 3.4 - xxxxxx
  • Exercise 4.4 - x

Other requests

Feedback Survey

Please fill out this survey at the end of day 2:

https://de.surveymonkey.com/r/denbi-course?sc=hdhub&id=000260

Exercise 3.3 Solution

for group in AllGroupResults:
    print()  # blank line between groups
    print('Results for {}'.format(group))
    # each value of AllGroupResults is a dictionary of student -> score
    for student in AllGroupResults[group]:
        print('{}:\t{}'.format(student, AllGroupResults[group][student]))
    # collect this group's scores and average them
    group_scores = list(AllGroupResults[group].values())
    mean_score = sum(group_scores) / len(group_scores)
    print('Mean score for {}: {}'.format(group, mean_score))
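
The solution above relies on the AllGroupResults dictionary defined in the course notebook. To try the same idea on its own, here is a minimal sketch that uses made-up sample data (the group names, student names and scores are assumptions, not the course data) and wraps the mean calculation in a hypothetical helper function, mean_score_per_group:

```python
# Hypothetical sample data: the real AllGroupResults comes from the course notebook.
AllGroupResults = {
    'Group A': {'Alice': 72, 'Bob': 58},
    'Group B': {'Carol': 91, 'Dave': 64, 'Erin': 85},
}


def mean_score_per_group(results):
    """Return a dictionary mapping each group name to its mean score."""
    means = {}
    # .items() gives the key (group name) and the value (inner dictionary) together
    for group, students in results.items():
        scores = list(students.values())
        means[group] = sum(scores) / len(scores)
    return means


group_means = mean_score_per_group(AllGroupResults)
for group, mean_score in group_means.items():
    # {:.1f} rounds the displayed value to one decimal place
    print('Mean score for {}: {:.1f}'.format(group, mean_score))
```

Putting the calculation in a function is only one possible design; it makes the means reusable (for example, to compare groups) instead of only printing them.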

Exercise 3.4 Solution

my_name = 'Toby'

def get_feedback_message(score):
    if score < 60:
        return 'You must try harder next time. Are you taking this course seriously?'
    elif 60 <= score <= 79:
        return "Well done, that's a good score."
    else:
        return "Congratulations! That's an excellent score!"

template = '''Dear {}, 
I have finished marking the assessment for your seminar group, {}. 
You scored {}. 
{}
Kind regards,
{}'''
for group in AllGroupResults:
    for student in AllGroupResults[group]:
        score = AllGroupResults[group][student]
        print(template.format(student, 
                              group, 
                              score, 
                              get_feedback_message(score), 
                              my_name))
        print()

Clarifying what keys and values are, and how the inner dictionaries are the values of the outer dictionary

for group in AllGroupResults:
    # `group` is a key of the outer dictionary;
    # the value stored under it is itself a dictionary
    group_dictionary = AllGroupResults[group]
    print(group_dictionary)
    # equivalent to: for student in AllGroupResults[group]:
    for student in group_dictionary:
        print(student)
        # FOR X IN DICTIONARY:
        # X -> key of that DICTIONARY
        # Y = DICTIONARY[X] ; Y -> value corresponding to key X
        score = group_dictionary[student]  # same as AllGroupResults[group][student]
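
The same relationship can also be explored with the dictionary methods .keys(), .values() and .items(). The sketch below uses made-up sample data (an assumption, not the course data) to show that the values of the outer dictionary are themselves dictionaries, and that .items() hands over each key together with its value:

```python
# Hypothetical sample data, for illustration only.
AllGroupResults = {
    'Group A': {'Alice': 72, 'Bob': 58},
    'Group B': {'Carol': 91},
}

outer_keys = list(AllGroupResults.keys())      # the group names
outer_values = list(AllGroupResults.values())  # the inner dictionaries

print(outer_keys)
print(outer_values)

# .items() yields (key, value) pairs, so no second lookup is needed
pairs = []
for group, group_dictionary in AllGroupResults.items():
    for student, score in group_dictionary.items():
        pairs.append((group, student, score))
        print(group, student, score)
```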
