Jupyter Tutorial
===

The [Jupyter](https://jupyter.org/) Notebook is an interactive web application that allows you to type and edit lines of code and see the output. The software requires a Python installation, but it currently supports interaction with over 40 languages.

## So, what **is** a Jupyter notebook?

In this case, "_notebook_" or "_notebook documents_" denote documents that contain both code and rich text elements, such as figures, links, equations, etc. Because of this mix of code and text elements, these documents are the ideal place to bring together an analysis description and its results, and they can be executed to perform the data analysis in real time. These documents are produced by the [Jupyter Notebook App](http://jupyter.org/).

As a fun note, "Jupyter" is a loose acronym meaning [Julia](https://julialang.org/), [Python](https://www.python.org/), and [R](https://www.r-project.org/). These programming languages were the first target languages of the Jupyter application, but nowadays the notebook technology also supports [many other languages](http://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages).

The main components of the Jupyter environment are the notebooks themselves and the application that serves them. Alongside these, there is a notebook kernel (the language interpreter that executes the code in the background) and a notebook dashboard.

Notebook files have the extension `.ipynb` to distinguish them from plain-text Python programs. Notebooks can be exported as Python scripts that can be run from the command line.

And there you have it: the Jupyter Notebook. There are also several [examples of Jupyter notebooks](https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks) that you can browse.

## Cool! How do I install Jupyter?

Generally, you'll need to install Python (which is a prerequisite).
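
If you are not sure whether Python is already available on your machine, a quick check from any Python prompt can tell you which version you have. The snippet below is only a minimal sketch: the helper name `describe_python` is made up for this tutorial and is not part of Jupyter or conda.

```python
import sys


def describe_python(version_info=sys.version_info):
    """Return a short version label and whether it is Python 3 or newer.

    `describe_python` is a hypothetical helper written for this tutorial.
    """
    major, minor = version_info[0], version_info[1]
    return f"Python {major}.{minor}", major >= 3


label, is_python3 = describe_python()
if is_python3:
    print(label, "- fine for this tutorial")
else:
    print(label, "- consider installing a 3.x version")
```
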
The general recommendation is that you use Conda to install both Python and the notebook application. If you haven't yet installed Conda, you can get a Miniconda installation [here](https://docs.conda.io/en/latest/miniconda.html). We highly recommend installing the Python 3.x version (don't worry, you can still use Python 2.x if needed, but we recommend doing it within a conda environment!).

Once you have a working conda installation:

```
conda install jupyter
```

If you'd like to follow along with some Python plotting, also install the `matplotlib` and `seaborn` libraries.

```
conda install seaborn
conda install -c conda-forge matplotlib
```

## Ok, I'm set! What's next?

After installation, the only thing necessary is to actually start the notebook. This can be done at the command line using the following command:

```
jupyter notebook
```

After running the command, you will see a bunch of information in the command line window, and at the same time a new page will open in your browser. There are three main tabs: `Files`, `Running` and `Clusters`. You'll mostly be using the first two (when not *in* the actual notebook):

- `Files`: a listing of your current working directory. When you first launch the notebook, this is the directory from which you launched the app.
- `Running`: a list of all *active* notebooks, i.e. notebooks that have been running commands through one of the available kernels.
- `Clusters`: a listing of all clusters that are available for back-end execution (this will be empty unless you have connected the Jupyter Notebook app to a cluster).

Note: if Jupyter Notebook asks you for a password, type Ctrl-C twice to exit the notebook. On the command line, type `jupyter notebook password` and set a password for yourself. Then relaunch the notebook with `jupyter notebook`.

### Creating a Notebook

Creating a notebook is as straightforward as clicking on the `New` button on the top right and selecting the kernel (i.e.
the engine that will be interpreting our commands).

> _Note_: Jupyter really shines for Python and Julia notebooks. R users usually go the RMarkdown route, which is much more optimized for R than Jupyter is. Eventually, however, it all comes down to personal preference (or lab inheritance...).

You'll notice the following points:

- There is an `In []` section in the middle. This is called a `cell` and is essentially an interface where you can put your code, text, markdown, etc. Simply put, every cell that is an _input_ is marked as `In` followed by an increasing number that corresponds to the relative order in which that particular cell was executed.
- There is an indication of the current kernel being employed to execute each cell on the top right (in this instance the kernel is Python3).
- The notebook is still Untitled, but it has already been created and saved as a file (you can have a look at the working directory to verify this). Bear in mind that Jupyter automatically saves (i.e. autosaves) your notebooks, and also creates **checkpoints**, essentially snapshots of your notebook that you can go back to (this is **not** versioning, as it doesn't capture all changes, just snapshots in time).

After writing some code or text in the currently active cell, the main keyboard commands to remember are the ones that execute it:

- `Shift-Enter`: Executes the code and moves to the cell underneath, creating a new one if none exists.
- `Ctrl-Enter`: Executes the code _without_ moving to or creating a new cell.

Type the following code into a cell and then hit `Shift-Enter`.

```
print("Hello World!")
```

You may have briefly seen an asterisk after `In`, i.e. `In [ * ]`. The asterisk means that the kernel is currently running the code, so you should wait for the output. After successful execution, the `*` will change to the next execution number (`1` in our instance), and the output of the command will be visible below the cell (`Hello World!` in our case).
Finally, as we executed the code with `Shift-Enter`, a brand new cell has been created for us.

At this point, we can rename our notebook by clicking on the `Untitled` entry; let's rename it to `Jupyter-is-fun`.

Well done! You've just created your first Jupyter Notebook!

### Mingling code and text

One of the most powerful things in Jupyter is the fact that you can write both text and code in the same notebook, much like a paper lab notebook where you have your text notes and your equations, figures, etc.

Let's try and put some text in our notebook. To do that, we need to tell Jupyter that the cell should be interpreted as text (Markdown-formatted) and not as code. Click on the empty cell (it should have a green outline), and then go to `Cell` -> `Cell Type` -> `Markdown`. You will notice that the `In [ ]` indicator just disappeared, since there is no code to execute (and therefore no output will be produced).

Let's copy the following text into the cell:

```
# Writing Notebooks

We can write lots of formatted text here, using the [Markdown syntax](https://en.wikipedia.org/wiki/Markdown). It is an easy and efficient way to write nicely formatted text.

## Formatting

It does support several common formatting styles:

- It can do **bold** or __bold__
- It can do _italics_ or *italics*
- It can also do sub lists
    * with items one
    * two
    * and three
- or you can number your lists
    1. one
    2. two
    3. and so on

It also allows you to write [LaTeX](https://www.latex-project.org) equations, like this:

$$c = \sqrt{a^2 + b^2}$$

Pretty neat, right?
```

If you press `Shift-Enter` after entering this text, it should render nicely.

### Plotting

Jupyter notebooks enable you to include notes, code, and plots all in the same document!
Let's see that in action:

```
import seaborn as sns
sns.set()

# Load the iris dataset
iris = sns.load_dataset("iris")

# Plot sepal width as a function of sepal length, colored by species
g = sns.lmplot(x="sepal_length", y="sepal_width", hue="species",
               height=5, data=iris)

# Use more informative axis labels than are provided by default
g.set_axis_labels("Sepal length (cm)", "Sepal width (cm)")
```

Check out the seaborn package [here](https://seaborn.pydata.org/index.html).

Ok, you executed that. If you now see something like `<seaborn.axisgrid.FacetGrid at 0x11cb56828>` and no plot, why didn't the plot show up? Jupyter Notebook requires one more incantation to display Python plots. Execute the following in another cell:

```
%matplotlib inline
```

Now run the plot code again.

### Sharing Notebooks

For more instructions, the `Help` menu has a good tour and detailed information.

Notebooks can be downloaded locally by going to the `File` menu, selecting `Download as`, and choosing a file type. Both `pdf` and `html` are among the available choices.

You can also share the entire file that you have just created (there should be a file named `Jupyter-is-fun.ipynb` in your working directory). You can even grab the one we created right now from [here](Jupyter-is-fun.ipynb).

That's it!

### Additional Tips and Tricks

Jupyter Notebook has Command and Edit modes. If you press Esc and Return alternately, the outer border of your code cell will change between blue and green. These are the Command (blue) and Edit (green) modes of your notebook. Command mode allows you to edit notebook-level features, and Edit mode changes the content of cells.

When in Command mode (Esc/blue):

- The `b` key will make a new cell below the currently selected cell.
- The `a` key will make one above.
- The `x` key will delete the current cell.
- The `z` key will undo your last cell operation (which could be a deletion, creation, etc).

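
The cells that these shortcuts add and remove are stored in the notebook file itself: an `.ipynb` file is a JSON document that holds a list of cells. As a rough sketch, the code below builds a tiny notebook-like document by hand (an assumption for illustration; a real file such as `Jupyter-is-fun.ipynb` carries more metadata) and counts its cells by type. The function name `count_cell_types` is hypothetical and not part of Jupyter.

```python
import json
from collections import Counter


def count_cell_types(notebook_json):
    """Count cells by type in the text of a notebook (.ipynb) file.

    Hypothetical helper for illustration; assumes the version 4 layout,
    where the top-level "cells" key holds a list of cell dictionaries.
    """
    notebook = json.loads(notebook_json)
    return Counter(cell["cell_type"] for cell in notebook.get("cells", []))


# A hand-written, minimal stand-in for the contents of an .ipynb file.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "execution_count": 1, "metadata": {},
         "outputs": [], "source": ['print("Hello World!")']},
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Writing Notebooks"]},
    ],
})

counts = count_cell_types(example)
print(dict(counts))
```

To try it on a real notebook, you could read the file's text with `open("Jupyter-is-fun.ipynb").read()` and pass that string to the function.
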
All actions can be done using the menus, but there are lots of keyboard shortcuts to speed things up. See this [Software Carpentry lesson](https://swcarpentry.github.io/python-novice-gapminder/aio/index.html) for more.

## Reading material

- References for learning Python
    - http://rosalind.info/problems/locations/
    - http://learnpythonthehardway.org/book/
    - http://www.learnpython.org/
    - http://www.pythontutor.com/visualize.html#mode=edit
- [Gallery of interesting Jupyter notebooks](https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks)

*Tutorial modified from the [angus 2017 tutorial](https://angus.readthedocs.io/en/2017/Jupyter-Notebook-Notes.html).*