Python package management
venv and pip
pip is a terminal command used to install and upgrade Python packages.
PyPI is the main Python package repository. It’s ‘official’, but that doesn’t mean a lot - like most of these open source package repositories, a poor quality or even malicious package can easily be uploaded there, so do your diligence when picking them.
A Python virtual environment (or venv
, for short) is a directory you can install a particular Python executable and Python packages into, away from your machine’s default ones. Typically each project/repo you work on should have a different venv, and then you never have to deal with conflicting requirements between projects. When you ‘activate’ a particular venv, then run python
or pip
, those commands will work with that venv’s Python executable and Python packages.
Basic usage
NOTE: You may need to delete the .bash_aliases
file (rm .bash_aliases
) from your home directory for pip to work properly within a virtual environment.
Create a venv for your project, called ‘venv’ (make sure you run this in the Terminal):
cd myproject
python3 -m venv venv
(Add ‘venv’ to your .gitignore file, because this shouldn’t be added to your git repo.)
When you work with your project’s packages in a terminal, you’ll want to ‘activate’ your venv:
. venv/bin/activate
You’ll notice the prompt changes to show that the venv is activated: (venv) jovyan@jupyter-lab-davidread-ju-6966d9b9b4-7zvsk:~/myproject$
With the venv activated you can install some packages using pip:
(venv) $ pip install --user pandas
The packages will get installed to your venv, in venv/lib/python3.7/site-packages/
.
You can see what packages are installed using ‘pip freeze’:
(venv) $ pip freeze
numpy==1.18.4
pandas==1.0.4
python-dateutil==2.8.1
pytz==2020.1
six==1.15.0
With the venv activated, if you run a Python script from the terminal, the package will be available to it. For example:
(venv) $ python3 -c 'import pandas; print(pandas); print("It worked")'
<module 'pandas' from '/home/jovyan/myproject/venv/lib/python3.7/site-packages/pandas/init.py'>
It worked
In JupyterLab, to be able to use the venv’s packages (instead of the system packages), see Using a venv in Jupyter
When you commit your code, to ensure reproducibility, you should also commit an up-to-date record of what packages you’ve installed. The simplest way is to do:
(venv) $ pip freeze >requirements.txt
(venv) $ git add requirements.txt
You should also add to your README file the instructions for using requirements.txt - see the following section.
Using a project that has a requirements.txt
If a project has a ‘requirements.txt’ then you should install that into a venv.
A project’s README file is the traditional place to communicate usage of a requirements.txt. Because of that, this section is provided in markdown format so it can be copied into your project’s README, and tailored as necessary:
## Setup
Before you can run this project, you need to install some Python packages using the terminal:
# create a virtual environment
cd myproject
python3 -m venv venv
# install the python packages required
. venv/bin/activate
pip install -r requirements.txt
Library conflicts & warnings
If you come across any conflicts or warnings when installing your libraries using pip we advise you use poetry to resolve them.