This document was created for a workshop at Harvard Medical School on Nov 14, 2022.
(It may have been edited since then, see top of the page for last change)
Python is an extremely popular, powerful, and flexible programming language and ecosystem. But it can be confusing for newcomers (and even those who have used Python for years) to understand exactly what it means to have Python and Python packages "installed" on a system.
The goal of this workshop & document is to demystify and answer the following questions:
- When you `pip install` or `conda install` a package, where does it download from? Where does it install to?
- When you `import` a package, where does Python look for it?

A few key terms used throughout this document:

Interpreter
The actual python executable that parses and runs human-readable source code. Type `which python` (mac/linux) or `where python` (windows) to show the path to the active python interpreter.

Module
An organizational unit of python code. Usually, a single file ending in `.py` that contains Python definitions and expressions.

Package
A collection of modules. Usually, this is a folder of python modules that also contains an `__init__.py` file. "Package" also frequently refers to an installable python library/application (e.g. `numpy`, `matplotlib`, `pandas`…)

Package manager
A program that automates the installation, updating, and removal of packages (e.g. `pip`, `conda`).

Virtual environment
An isolated collection of packages, settings, and an associated python interpreter. Virtual environments allow many different collections of Python and packages to exist on the same system.

Environment manager
A program that automates the creation and deletion of virtual environments (e.g. `conda`, `virtualenv`, `venv`).

A few key facts about where packages come from:

- `pip` installs packages from pypi.org (PyPI is where `pip` looks for packages by default).
- `conda` installs packages from anaconda.org. Note that conda can install both python and non-python packages. (See also mamba, a fast implementation of conda written in C++.)
- The Anaconda distribution installs the `python` executable, the `conda` program, and a few hundred pre-installed python packages in the base environment. When you `conda install ...` something, it searches & installs packages from anaconda.org.
- Miniconda is a minimal installer for `conda` that does not contain all of the additional packages in the anaconda distribution. (See also miniforge and mambaforge, which install `conda` and `mamba` respectively, and set conda-forge as the default channel.)

There are three classes of tools that you'll want to be familiar with when using Python:

- tools that install Python itself
- tools that create and manage virtual environments
- tools that install packages
There are many tools that perform each of these tasks, and some tools perform multiple tasks. The following venn diagram shows where a few commonly used programs fit into these classes:
For our purposes, we will be using conda as an environment manager and a tool to install Python itself; and we will use both conda
and pip
to install Python packages.
❓ conda
vs. mamba
Throughout this page, whenever I refer to conda
as a command you can run on the command line, you can substitute the command/program mamba
.
mamba
is a reimplementation of the conda package manager in C++. It is much faster than conda
in many cases, and – unlike conda – doesn't require Python itself, which removes a "bootstrapping" problem in some cases.
You can install mamba (using conda
!) into your base environment:
conda install mamba -n base -c conda-forge
A virtual environment is an isolated collection of packages, settings, and an associated Python interpreter that allows multiple different collections to exist on the same system.
Conflicting package dependencies: environments allow you to use a different set of packages for different projects and applications, even when those packages' requirements conflict with each other.
When completed, you should:
✅ have a new mambaforge
folder in your home directory
✅ be able to run conda
(and/or mamba
) from the terminal
✅ have a base environment with python installed.
There are many ways to get python installed. Here, we will jump to my recommended approach of installing python via conda.
Conda is an open-source package management system and environment management system that runs on Windows, macOS, and Linux. With it, you can create virtual environments and install python itself (along with many other python and even non-python programs!)
The most well-known way to install conda is via anaconda.com. However, we will install conda using "miniforge". Miniconda & Miniforge are much smaller distributions than the anaconda distribution. They provide the bare minimum required to get started with conda-based python virtual environments. Specifically, the installer will:

- install `conda`, `python`, and a couple other packages useful for bootstrapping environments (like `pip`) into a new folder in your home folder
- set `conda-forge` as the default (and only) channel
- install `mamba` (if you used `mambaforge`)

Note: miniforge is very similar to miniconda, except that it also sets up conda-forge as your default conda channel. We'll learn about conda channels below.
In a terminal, run:
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh"
bash Mambaforge-$(uname)-$(uname -m).sh
During install, you will see a question like "Do you wish the installer to initialize conda/Miniforge/Mambaforge?". It's best to enter "yes" to this.
Key parts of conda
's functionality require that it interact directly with the shell within which conda
is being invoked. The conda activate
and conda deactivate
commands specifically are shell-level commands. That is, they affect the state (e.g. environment variables) of the shell context being interacted with. Other core commands, like conda create
and conda install
, also necessarily interact with the shell environment. They're therefore implemented in ways specific to each shell. Each shell must be configured to make use of them.
This command makes changes to your system that are specific and customized for each shell. To see the specific files and locations on your system that would be affected, before making any changes, use the --dry-run
flag. To see the exact changes that are being or will be made to each location, use the --verbose
flag.
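For example, to preview what `conda init` would change for a given shell (here assuming `zsh`; substitute your own shell) without actually modifying anything:

# show which files conda init would touch, without changing anything
conda init zsh --dry-run --verbose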
Where did it go?
Take a moment to make sure you know what just happened to your system above 👆.
By default anaconda/miniconda/miniforge/mambaforge will create a new folder in your home directory (e.g. ~/mambaforge
, or ~/miniconda3
)
Just for the sake of completeness, here are some alternative methods that you will see recommended in various places, along with why I didn't use them here.
Download the version you want from https://docs.conda.io/en/latest/miniconda.html
Miniforge and miniconda are very similar in that they both provide the bare minimum to get the conda
environment and package manager installed. Miniforge also configures conda-forge
as the default (and only) channel. Not having conda-forge in your configuration is a common source of problems for many newcomers.
Download the version you want from https://www.python.org/downloads/
While downloading from python.org is of course the "canonical" way to install Python, it will install it at the system level; and, by default, all installed packages will go into your "global" collection of packages.
Installing conda
gets us python
, and the machinery to create virtual environments all in one, and can install it into your home directory without any special permissions. You could accomplish a similar thing with python.org, pyenv
, and/or venv/virtualenv
… but conda
very quickly gets us everything we need.
Click the download button on https://www.anaconda.com/, double click the installer and follow the prompts.
While installing from anaconda.com
does get us everything and more from miniconda
, it is "bloated" in that it additionally comes with many hundreds of packages pre-installed in the base environment.
In most cases, you will want to create multiple environments with a collection of packages for your specific tasks. Anaconda provides a very quick way to get up and running with scientific python, but also comes at a very large package size, and obscures a few very basic details and best practices about (re)creating environments.
After installing Homebrew, run:
brew install python
While homebrew is fantastic for programs that you only want 1 version of, it can be challenging for something like Python (where you often want python 3.7, 3.8, and 3.9 installed all at the same time). Also, similar to installing from python.org, a homebrew install can get you into trouble with global package installs if you're not careful.
pyenv is a tool that lets you easily install and switch between multiple Python installations.
See installation guide here, and for an introduction to pyenv, see this blog post
There's nothing wrong with using pyenv
, it's very convenient and lighter weight than conda. Since I generally know that I will also want to be installing packages with conda
, I tend to use conda for python as well.
But if you know you only want to install with pip
, then pyenv
can get you set up quickly, and can also create virtual environments with pyenv virtualenv
.
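As a rough sketch of that workflow (assuming pyenv and the pyenv-virtualenv plugin are installed, and using Python 3.10.13 purely as an example):

# install a specific python version with pyenv
pyenv install 3.10.13
# create a virtual environment based on that version (requires the pyenv-virtualenv plugin)
pyenv virtualenv 3.10.13 my-project-env
# activate the environment
pyenv activate my-project-env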
✅ create a new virtual environment with conda
with a specific version of Python
✅ activate and deactivate environments
✅ know which environment (and python interpreter) you're currently using
✅ delete an environment
If you retain one bit of advice today, let it be this: always work in a virtual environment, and don't be afraid to throw it away and start over.
Environments allow you to experiment with various packages and versions without fear of breaking your entire system (and needing to reinstall everything). As you install packages over time, you will inevitably install something that doesn't "play well" with something else you've already installed. In some cases this can be hard to recover from. With virtual environments, you can just create a fresh environment and start again – without needing to do major surgery on your system.
Reminder: mamba
is a fast version of conda
. I use it here in these examples, but if you don't have it installed, you can replace the "mamba
" command with "conda
".
# create a new empty environment named 'ENV_NAME'
mamba create --name ENV_NAME
# create a new empty environment named 'ENV_NAME' (`-n` is short for `--name`)
mamba create -n ENV_NAME
You can also install things (using conda
/mamba
) in the same command that creates the new environment by adding a list of packages to install to the end of the command. For example, you'll usually want to create an environment with a specific version of python installed:
# create a new environment with the latest version of python
mamba create -n ENV_NAME python
# create a new environment with python 3.10
mamba create -n ENV_NAME python=3.10
Where did it go?
Take a moment to make sure you know what just happened 👆.
Calling conda create
will result in a new folder in the envs
folder in your conda installation (e.g. ~/<conda_folder>/envs/ENV_NAME
)
We've now created an environment named ENV_NAME
, but we aren't currently "using" it. To activate a virtual environment, use conda activate
# activate environment named 'ENV_NAME'
conda activate ENV_NAME
You should see your prompt change to include `(ENV_NAME)` somewhere, indicating the active environment.
Now, when we run the command python
(or any other command that in turn calls python
), the specific interpreter that we installed into our environment will be used. To prove this to yourself – or if you ever want to double check which python
is being used – type:
# on mac/linux
which python
# on windows cmd
where python
Note: you must activate environments each time you open a new terminal.
The main effect of calling conda activate ENV_NAME
is to add the environment folder (a.k.a. the "prefix", which usually lives in ~/<conda_folder>/envs/ENV_NAME
) to the front of your shell's PATH
.
It will also update the CONDA_PREFIX
and CONDA_DEFAULT_ENV
environment variables to reflect your active environment's prefix and name.
# activate an environment
conda activate ENV_NAME
echo $PATH # windows: echo %PATH%
env | grep CONDA # windows: set | findstr "CONDA"
# deactivate and look again
conda deactivate
echo $PATH # windows: echo %PATH%
env | grep CONDA # windows: set | findstr "CONDA"
To deactivate the environment, use conda deactivate
:
# deactivate the currently active environment
conda deactivate
# or, explicitly activate the base environment
conda activate base
To delete an environment permanently, first make sure to deactivate it, then enter:
conda env remove -n ENV_NAME
… the folder in ~/<conda_folder>/envs
should be gone now.
Conda is not the only game in town for managing virtual environments! You'll want to use conda if you're going to be using conda install to add packages, but if you know you don't need to install using conda, there are alternative environment managers like venv, virtualenv, or pyenv-virtualenv (see the sketch below).
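For example, a minimal workflow with the standard-library `venv` module (no conda involved; the environment folder name `.venv` is just a common convention):

# create an environment in a folder named ".venv" using the currently active python
python -m venv .venv
# activate it (mac/linux)
source .venv/bin/activate
# activate it (windows cmd)
.venv\Scripts\activate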
When completed, you should:
✅ know how to install with pip
✅ know how to install with conda
✅ know where to go to read more about specifying versions and sources
The extensive ecosystem of third-party scientific packages is a primary driver of the success of Python. (By "third-party" here, I mean packages and modules that don't ship with the Python standard library; packages like numpy
, pandas
, and matplotlib
.) Most of the time, the first thing you'll do after creating an environment is to install some packages.
pip
To install with pip, use the install
command:
pip install requests
pip
can install packages from many different locations:
# install the current working directory
pip install .
# install a file that someone sent you
pip install some_local_file.whl
# install the bleeding edge dev version from some github repo
pip install git+https://github.com/psf/requests
conda
or mamba
To install with conda
or mamba
, use the install
command:
mamba install requests
# or
conda install requests
Often with conda
, you will want to install from a specific channel (discussed below). You can either add channels permanently to your config, or specify channels at install time:
conda install -c conda-forge requests
Both pip
and conda
have a lot of ways to specify version constraints and package sources. See their respective documentation pages for details:
(… or use `pip install --help` or `conda install --help` on the command line)
The most important thing to know is how to install a specific version:
pip install package==1.2.3
conda install package=1.2.3
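Both tools also accept more flexible constraints than an exact pin. For example, to ask for any 1.x release of numpy at or above 1.23 (the quotes keep your shell from interpreting the comparison operators):

# pip: any numpy version >= 1.23 and < 2.0
pip install "numpy>=1.23,<2.0"
# conda: the equivalent constraint
conda install "numpy>=1.23,<2.0"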
This section discusses where pip
and conda
look for packages when you run install
, and where those packages end up on your computer.
This part is generally a bit mysterious to new Python users, and it can be very illuminating to understand where packages are downloaded from, and where they go on your computer when you install them.
One of the main differences between installing a package using conda
vs pip
is the package repository that gets used (i.e. where the package is downloaded). It's useful to have a sense for where these programs look for packages when you use the install
command.
The two main package repositories are PyPI.org and anaconda.org
pip searches at PyPI.org

By default, pip searches for packages in the Python Package Index (PyPI; pronounced "pie-pee-eye", not "pie pie").
If you'd like to use a web browser to see what packages, versions and files are available, you could also search directly at https://pypi.org/.
As an example, if you search for numpy, it will lead you to the page dedicated to the numpy
package. Clicking on release history will show you all versions available and their dates of release:
And clicking download files will show you the exact files that pip
would be selecting from and installing if you were to type pip install numpy
(more on "source distributions" and "binary distributions" later):
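If you'd rather stay in the terminal, recent versions of pip also include a (still experimental) `pip index` command that lists the versions available on PyPI:

# list the versions of numpy that pip could install (experimental command)
pip index versions numpy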
conda searches at anaconda.org

By default, conda/mamba searches for packages at anaconda.org. Here, however, things are a little more complicated than PyPI: conda
has the concept of channels. Channels are the locations where packages are stored; if we search for numpy
as we did above, this time we see a lot of entries:
Each entry above is numpy, built and distributed in a different channel. By default, packages are downloaded from the defaults channel; however, you'll almost always want to use the conda-forge channel.
Conda-forge is an awesome community-driven collection of (~20K) packages, which are found in the conda-forge
channel at anaconda.org. (The name "conda-forge" also refers to the organization of open-source contributors that maintains the channel.)
As mentioned above, to install a package from a specific channel, use the -c
flag when installing:
conda install -c conda-forge PACKAGE_NAME
If you regularly install from a specific channel, like conda-forge
, you can modify your channels list. For example, to add the conda-forge
channel:
conda config --add channels conda-forge
(now you no longer need to use -c conda-forge
every time you use conda
)
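To double-check which channels are currently configured (and in what priority order), you can ask conda to print that part of its configuration:

# show the configured channel list
conda config --show channels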
Miniforge & Mambaforge
Adding conda-forge to your channels is so common, and so useful, that Miniforge – the installer we used above to install conda
– was created. It is a minimal conda installer (just like Miniconda), with the added feature that conda-forge
is set as the default channel. Hopefully you now understand why we used it!
In most cases (though there are many exceptions), when you run pip install
or conda install
:
Packages will be added to your site-packages
folder
Platform | Standard installation location |
---|---|
Mac/Linux | prefix/lib/pythonX.Y/site-packages |
Windows | prefix\Lib\site-packages |
(… where prefix
will depend on the active virtual environment.)
In a "global" python installation without a virtual environment (
prefix
will be something like /usr/local/lib/pythonX.Y/site-packages
on Unix systems and C:\PythonXY\Lib\site-packages
on Windows.If you have a conda virtual environment active, prefix
will refer to your environment folder (e.g. ~/<conda_folder>/envs/ENV_NAME/
)
To print your current prefix
using Python:
python -c "import sys; print(sys.prefix)"
To print your site-packages folder location:
python -c "import site; print(site.getsitepackages())"
pip
and conda
don't always install to site-packages…
User installs
The --user
flag makes pip
install packages in your home directory instead, which doesn't require any special privileges.
pip install --user PACKAGE_NAME
I personally avoid --user installs, and delete them if I discover them on my system. See also the discussion of sys.path below for tips on finding where a package is being imported from.
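A quick way to check where a given package is actually being imported from (using `requests` here as an example, assuming it's installed in the active environment):

# print the file that provides the imported package
python -c "import requests; print(requests.__file__)"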
Editable installs
A common way to install packages you are actively developing is to pip install
in "editable" mode, with -e
/--editable
:
# install the current working directory in "editable mode"
pip install -e .
A good way to show where all the packages in your environment are installed is pip list
with the "verbose" flag -v
added:
$ pip list -v
Package Version Editable location Location Installer
------------ ---------- ------------------ -------------------------------------------------- ---------
certifi 2022.9.24 ~/mambaforge/envs/ttt/lib/python3.11/site-packages pip
nictool 0.1.0 ~/dev/self/nic ~/mambaforge/envs/ttt/lib/python3.11/site-packages pip
numpy 1.23.4 ~/mambaforge/envs/ttt/lib/python3.11/site-packages conda
pip 22.3.1 ~/mambaforge/envs/ttt/lib/python3.11/site-packages
requests 2.28.1 ~/.local/lib/python3.11/site-packages pip
setuptools 65.5.1 ~/mambaforge/envs/ttt/lib/python3.11/site-packages
wheel 0.38.3 ~/mambaforge/envs/ttt/lib/python3.11/site-packages
In the output above:

- `nictool` was installed locally in "editable" mode
- `numpy` was installed using conda (everything else with pip)
- `requests` was installed using `--user`
pip
and conda
The difference between (and when to use) pip
and conda
is one of the most common questions/confusions I see.
These are both extensive, complicated tools, with a broad range of functionality, so it's hard to summarize quickly without glossing over important details… but here goes
pip
installs Python packages (mostly from PyPI), and does not manage virtual environments. conda
installs any package (including Python itself, not just Python packages) – mostly from anaconda.org – and can also manage virtual environments.
Summarized in a table:
| | conda | pip |
---|---|---|
Manages | binaries | wheel or source |
Can require local compiler | No | Yes |
Package types | Any | Python only |
Creates environments | Yes, built-in | No (use `venv`/`virtualenv`) |
Strict dependency checks | Yes | No* |
Default package source | anaconda.org | PyPI |
binaries? wheels?
To really understand the motivation for conda – and what "can require local compiler" means in the table above – one must understand a little about "compiled" binaries. This is a bit beyond the scope here, but here's a very brief intro for those interested:
Python is an "interpreted language". Among other things, this generally means that the job of converting the human-readable source code into executable machine code is done on the machine executing the code. The developer ships a .py
file.
By contrast, compiled languages like C/C++ are generally converted into machine code elsewhere (by the developer), for each platform being supported, and then shipped to the end user as (e.g.) an .exe
file.
Lower-level compiled languages like C often perform better than "pure" Python code. However, it's very common for Python developers to write or generate small parts of their code (e.g. just the very frequently used functions) in C. These "C extensions" must then be compiled for each platform.
Many packages in the scientific python ecosystem have at least some compiled code.
If you've ever run pip install ...
and seen a ton of text fly by with some big red "failed to compile" errors at the end, then you've seen what can happen when you try to install a package that includes C extensions that are not pre-compiled for each platform. Not all computers have the programs necessary to compile these extensions, and so when pip tries to install and compile these packages, they may fail.
"Wheels" are a binary distribution format that you'll see on PyPI that allow a developer to pre-compile their extensions for every platform they'd like, so that the end user doesn't need to compile it. A wheel can be simply unzipped and dropped into site-packages
.
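If you're curious whether a package ships a pre-built wheel for your platform, one way to check (shown here with numpy on mac/linux, downloading into a hypothetical temporary folder) is to ask pip to download it without installing it:

# download (but don't install) just numpy itself;
# a file ending in .whl is a pre-built wheel, while a .tar.gz is a source distribution that would need compiling
pip download numpy --no-deps -d /tmp/numpy-download
ls /tmp/numpy-download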
Conda doesn't use wheels, but a conda package achieves the same goal of distributing pre-compiled files so that the end-user needn't compile them.
pip
and conda
Most of the time, it is fine to use both pip install
and conda install
in the same environment. Sometimes, you don't have a choice: it is up to the package developer to make their packages available via conda
and/or pip
and you will find packages that are only available on pip
, or only on conda
.
However, you should be aware that there are cases where installing the same package from both package managers can cause problems (regardless of whether you install the package directly yourself, or it gets installed as an indirect dependency).
Here's an example of something you might very reasonably do that would result in a broken environment.
# create a new environment with python
conda create -n doomed_env python=3.10
# activate it
conda activate doomed_env
# install spyder (which depends on pyqt) using pip
pip install spyder
# go ahead and launch spyder ... so far so good!
spyder
# now install pyqt from conda (or... one of many conda packages that depend on it!)
conda install pyqt
# try to launch spyder again...
spyder
here's the error I see:
WARNING: You might be loading two sets of Qt binaries into the
same process. Check that all plugins are compiled against the
right Qt binaries. Export DYLD_PRINT_LIBRARIES=1 and check that
only one set of binaries are being loaded.
What happened here?
Without going into too much detail: both package managers (pip
and conda
) tried to install some stuff into the same folder (site-packages/PyQt5
). However, they installed slightly different "parts" (different compiled binaries) resulting in a package that just can't run.
note: This won't always happen: this particular case was caused by the fact that the package is unfortunately called
pyqt
in conda, butpyqt5
in pip… making it even harder for the two programs to "work together". Moreover, the order in which you install things (i.e.pip
-then-conda
, vsconda
-then-pip
) can also affect whether you run into this.
The main lessons here:

- Be careful when mixing `pip` and `conda` together in the same environment.
- For any given package, prefer to install it with either `pip` or `conda`, not both.
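One way to audit an environment for packages that came in via pip (on mac/linux): conda marks them with `pypi_0` in the build column of `conda list`, so you can filter for that:

# list packages in the current environment that were installed by pip
conda list | grep pypi_0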
sys.path
The goal of this section is to understand where Python looks for modules when you type import PACKAGE
We've learned that packages generally (but not always) end up in the site-packages
folder in your environment. Let's now discuss how the Python interpreter finds them when you `import` them. It's pretty simple:
Python searches for modules on sys.path
sys
is an important module in the Python standard library (it will always be available to you).
sys.path
Let's look at sys.path
:
Start a python interpreter:
python
Now, import sys
and print out sys.path
:
import sys
print(sys.path)
You'll see something like this:
[
'/Users/talley/mambaforge/envs/ENV_NAME/bin',
'/Users/talley/mambaforge/envs/ENV_NAME/lib/python310.zip',
'/Users/talley/mambaforge/envs/ENV_NAME/lib/python3.10',
'/Users/talley/mambaforge/envs/ENV_NAME/lib/python3.10/lib-dynload',
'',
'/Users/talley/mambaforge/envs/ENV_NAME/lib/python3.10/site-packages'
]
There are three particularly important entries in there:

1. `.../ENV_NAME/lib/python3.10`: This is where all of the standard library modules will be found.
2. `''`: This empty string refers to "the current working directory", which starts as the directory you were in when you launched python. If you want to see the current directory in python:
import os
print(os.getcwd())
3. `.../ENV_NAME/lib/python3.10/site-packages`: This is the `site-packages` folder we discussed above. Most of your installed packages should be there.

Let's take advantage of the empty string entry `''` in `sys.path`. Exit out of python (type `exit()`) and create a new file named `mymodule.py` with the following function:
# mymodule.py
def hello():
print("hi!")
Now, start python again
python
Then import your new module and use the function
import mymodule
mymodule.hello()
Take home message: Any custom modules you've created in the current working directory may be imported directly as long as ''
exists in sys.path
.
sys.path
sys.path
is not static: you can modify it like any Python list.
One reason you might want to do this is to add a folder of modules with some useful code that you've stored somewhere on your computer:
import sys
sys.path.append('/Users/talley/my_handy_python_stuff')
… Now I can import any python modules in `/Users/talley/my_handy_python_stuff`!
Don't get too carried away relying on modifying sys.path
. If you have a set of custom code that you routinely use across many different environments, consider creating a proper python package that you can pip install
as usual (remember, packages don't necessarily need to be public on PyPI: you can pip install
from github, or locally as well)
If you'd like to know all the nitty-gritty details of how
sys.path
gets initialized at runtime, see the Python documentation.
When completed, you should:
✅ Know how to create and use requirement files in pip
✅ Know how to create and use environment.yml
files in conda
✅ Understand the limitations of environment recipes
✅ Be aware of lock files (that comprehensively list the exact version of every package in an environment.)
Don't get attached to environments; create requirements files!
Environments are made to be broken and (re)created. Don't view an environment as something you worked hard to create "just right", and dread having to recreate. View them as a little sandbox that is isolated exactly so that it can be broken without messing up the rest of your system.
You may even occasionally need or want to uninstall the entire anaconda/miniconda/mambaforge folder, along with all of the environments you've made.
If you have an environment that you would be "sad" to lose, you should instead work to create a recipe for that environment that you can use to recreate the environment whenever necessary. Both pip
and conda
support this.
pip requirements files

Requirements files serve as a list of items to be installed by pip when using `pip install`. Files that use this format are often called “pip requirements.txt files”, since `requirements.txt`
is usually what these files are named (although, that is just a convention, not a requirement).
Each line of a requirements file supports the same requirements specifier syntax that you would use for pip install ...
# requirements.txt
numpy
nd2[legacy]
urllib3 @ https://github.com/urllib3/urllib3/archive/refs/tags/1.26.8.zip
To install everything listed in a requirements file, use the -r
flag with pip install
pip install -r requirements.txt
Note that this installs the contents of `requirements.txt` into the currently active environment.
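If you want to capture everything currently installed in the active environment as a fully pinned requirements file, `pip freeze` will print it for you:

# write the exact versions of all installed packages to requirements.txt
pip freeze > requirements.txt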
conda
allows you to create a new environment from an environment file (conventionally, these are called environment.yml
, but that is not a requirement):
Environment files have a specific structure (see full documentation).
#environment.yml
name: stats2
channels:
- conda-forge
dependencies:
- python=3.9
- bokeh=2.4.2
- pandas=1.4.4
- flask=2.2.2
To create an environment from an environment file:
conda env create -f environment.yml
By default, the new environment will be named according to the `name` field in the file itself, but you can override it by appending `-n MY_NAME`.
Sometimes you'd like to have some assurance that you will be able to recreate an environment "indefinitely" in the future (for example: to reproduce an analysis for a paper you wrote 10 years ago).
Now, it may appear that the `environment.yml` listed above would provide a fully "reproducible" setup that you could run years later to achieve the same thing.
But there's a problem:
Each of the dependencies we declared has its own dependencies.
(Did you notice how many other things got installed in the stats2
environment above?) … and it's completely possible that one of those subdependencies might release a version in the future that changes or breaks one of our direct dependencies, and with it the results of our code.
To create a comprehensive list of the pinned versions of every package in our environment, we can use conda env export
:
conda env export > environment.lock.yml
… and if we later run conda env create -f environment.lock.yml
, we should get an exact duplicate of our current environment (at least, as long as we're on the same operating system).
conda-lock
Exported/locked environment files are great, but still have some practical difficulties & annoyances. They don't work "effortlessly" cross-platform, it can be hard to update one of the dependencies, and they can still be somewhat slow to solve and install (even though in theory you know exactly what packages are needed).
If you do this a lot, consider looking into conda-lock, which solves all of the above with a unified lockfile format.
# install conda-lock
conda install -c conda-forge conda-lock
# generate a multi-platform lockfile
conda-lock -f environment.yml -p osx-64 -p linux-64
Locking a pip-based environment
We won't go into them here, but there are also lockfile solutions for the pip ecosystem (that do not require using conda), such as pip-tools, Pipenv, and Poetry.
✅ always work in a virtual environment
✅ don't be afraid to wipe it and start over
✅ avoid installing into the conda
base environment (and never with pip)
✅ create a "kitchen sink" environment rather than using base
When should you use a virtual environment? Basically, all the time!
The base environment

You should (almost) never install things into your base conda environment. Do your work in another environment and leave the base environment only for dependencies that actually manage environments (like conda
itself, or mamba
, or other dev-related dependencies like conda-build
, etc…)
In particular, try to never pip
install anything into your base environment.
If you ever make a mess of your base environment with pip
, and would like to restore
base to something like its original state, you can run this command (on unix systems)
# activate the base environment
conda activate base
# uninstall anything that was installed into base with pip
conda list | grep pypi_0 | awk '{print $1;}' | xargs -I {} sh -c "pip uninstall -y {}";
# roll conda's own packages in base back to their original (revision 0) state
conda install -y --revision 0
# reinstall mamba and bring conda back up to date
conda install mamba
mamba update -n base conda -y
Because frequently switching environments can be annoying, I like to create a "kitchen sink" environment (I call it "all
") that I use for all of my generic tasks, and I install things into it with reckless abandon.
conda create -y -n all python
conda activate all
If you keep installing things into one environment, it will eventually break. At that point, just delete it and recreate it. To make it easier to re-create a complicated environment, use environment files (discussed below).
To help avoid installing into base
, you might consider "auto-activating" this environment:
# in your ~/.zshrc or ~/.bash_profile
conda activate all
An "Integrated Development Environment" (IDE) is essentially a text-editor designed for code. They come with a ton of conveniences like autocompletion, syntax highlighting, debuggers, and lots more. They can be a little intimidating at first, but if you plan to do a lot of programming, the investment is well worth it: they will become an indispensible tool.
People get a little religious debating the merits of their favorite code editing programs.
For whatever IDE you choose, you should definitely learn how to activate a specific python environment (see the links for each IDE below).
https://code.visualstudio.com/docs/python/environments
https://www.jetbrains.com/help/pycharm/conda-support-creating-conda-virtual-environment.html
https://docs.spyder-ide.org/current/faq.html#using-existing-environment
JupyterLab (not really an IDE, but much closer than Jupyter Notebook)
Generally, JupyterLab will be installed in the environment you want to use… but you can also use nb_conda_kernels
to access other environments.
Jupyter Notebook?
While Jupyter Notebook is certainly very useful for sharing code with others, and for exploratory analysis, I would discourage you from thinking about Jupyter notebooks as "the place" where one goes to run some Python code. Notebooks are much harder to version control (i.e. in a git repository – they are complicated JSON files, not simple python files), and they discourage code reuse and organization (Notebooks are something of a dead end: people very rarely import
from a notebook)
Definitely get comfortable writing python scripts, and using an interactive read-eval-print-loop (REPL) like IPython (or even the plain python
prompt). It will pay off to be comfortable using python without needing to start up Jupyter Notebook.
The following is a summary of some of the commands we've discussed here, and what they do:
command | description |
---|---|
`conda` (or `mamba`) | |
`conda create -n ENV_NAME python=3.10` | create an environment named ENV_NAME (with Python 3.10 installed) |
`conda env remove -n ENV_NAME` | remove environment named ENV_NAME |
`conda info -e` | list all conda environments |
`conda activate ENV_NAME` | activate env named ENV_NAME |
`conda deactivate` | deactivate current environment |
`conda install -c conda-forge numpy` | install numpy using conda (from the conda-forge channel) into the current environment |
`conda install numpy==1.23.4` | install specific version of numpy using conda (using whatever channels are in your configuration) |
`conda remove numpy` | uninstall numpy from the current environment |
`conda list` | list all packages installed in the current environment |
`conda config --add channels conda-forge` | add the conda-forge channel to your config |
`pip` | |
`pip install numpy` | install numpy (from PyPI) into the current environment |
`pip install numpy -U` | install/update numpy to latest version (from PyPI) |
`pip install numpy==1.23.4` | install specific version of numpy (from PyPI) |
`pip uninstall numpy` | uninstall numpy from the current environment |
`pip list` | list installed packages (add -v to show package locations and installer) |
Good attributes of the sys
module to be aware of:
attribute | description |
---|---|
`sys.path` | A list of strings that specifies the search path for modules. |
`sys.executable` | A string giving the absolute path of the executable binary for the Python interpreter. |
`sys.prefix` | A string giving the site-specific directory prefix where the platform-independent Python files are installed. |