20 Jul 2024
There is no universally agreed-upon way of managing dependencies in a Python project. There are multiple ways, each with its own advantages and limitations. Until recently I did it by generating a requirements.txt from the installed packages using pip freeze. It served me well for most of my projects, but it has limitations as a project grows.
Then I started using pip-tools and found it to be a better way of managing dependencies.
To demonstrate the two approaches, let us take an example where our project needs just two packages, django and pandas.
This is how we would do it without pip-tools.
Create a virtual environment.
python3 -m venv venv
Activate the virtual environment. Use the first command on Linux or macOS and the second on Windows.
source venv/bin/activate
venv\Scripts\activate
Install packages and generate requirements.txt
pip install django
pip install pandas
pip freeze > requirements.txt
requirements.txt will look like this.
asgiref==3.8.1
Django==5.0.7
numpy==2.0.0
pandas==2.2.2
python-dateutil==2.9.0.post0
pytz==2024.1
six==1.16.0
sqlparse==0.5.1
tzdata==2024.1
Note that we installed only two packages, but the file lists several others as well. They were installed automatically as dependencies of django and pandas.
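To see where those extra packages come from, you can ask the installed package metadata which requirements a package declares. The following is a minimal sketch using the standard library's importlib.metadata; the helper name declared_requirements is my own, and the output depends on what is installed in your environment.

```python
from importlib.metadata import PackageNotFoundError, requires


def declared_requirements(package):
    """Return the requirement strings a package declares, or [] if unknown.

    `declared_requirements` is a hypothetical helper name used for illustration.
    """
    try:
        reqs = requires(package)
    except PackageNotFoundError:
        # The package is not installed in the current environment.
        return []
    # requires() returns None when a package declares no requirements.
    return reqs or []


if __name__ == "__main__":
    for name in ("django", "pandas"):
        print(name, "->", declared_requirements(name))
```

The returned strings can include optional extras and environment markers, so the list is usually longer than what actually gets installed.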
Now whenever we want to install this project’s dependencies again, instead of manually installing each package one by one, we can simply run the following command to install everything in one go.
pip install -r requirements.txt
But one major limitation of this approach is that we cannot tell which dependencies were installed because of django and which because of pandas. This becomes an issue when we want to uninstall a package that is no longer required. Assume we do not need pandas in this project anymore. We uninstall it and regenerate requirements.txt.
pip uninstall pandas
pip freeze > requirements.txt
This is what the regenerated requirements.txt looks like.
asgiref==3.8.1
Django==5.0.7
numpy==2.0.0
python-dateutil==2.9.0.post0
pytz==2024.1
six==1.16.0
sqlparse==0.5.1
tzdata==2024.1
pandas is gone, but all the packages that were installed because of pandas are still there, because pip uninstall removes only the package you name and not its dependencies.
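The leftover packages can be described as "installed, but not reachable from anything we still want". Here is a small sketch of that idea. The dependency graph is written out by hand for this example rather than read from the environment, and the function names are made up for illustration.

```python
# Hand-written dependency edges for this example (an assumption for the
# sketch; a real tool would read them from package metadata).
GRAPH = {
    "django": {"asgiref", "sqlparse"},
    "pandas": {"numpy", "python-dateutil", "pytz", "tzdata"},
    "python-dateutil": {"six"},
}

# What pip freeze reports after `pip uninstall pandas`.
INSTALLED = {
    "asgiref", "django", "numpy", "python-dateutil",
    "pytz", "six", "sqlparse", "tzdata",
}


def reachable(graph, roots):
    """Return every package reachable from the given top-level packages."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(graph.get(name, ()))
    return seen


def find_orphans(installed, graph, wanted):
    """Installed packages that nothing in `wanted` depends on."""
    return set(installed) - reachable(graph, wanted)


if __name__ == "__main__":
    print(sorted(find_orphans(INSTALLED, GRAPH, {"django"})))
```

With only django wanted, the orphans are exactly the packages that pandas pulled in, which is the information a plain pip freeze output does not carry.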
This is where pip-tools comes into the picture.
Let's do this exercise again with pip-tools.
Create a new project. Then create and activate a virtual environment just like we did earlier. Then install pip-tools.
pip install pip-tools
Create a requirements.in file and list the packages we want to install.
django
pandas
You can also pin the version of a package, e.g. django==4.2.1
Now run the following command.
pip-compile
It will generate a new file, requirements.txt, which will look like this. Note that this command does not install the packages.
#
# This file is autogenerated by pip-compile with Python 3.12
# by the following command:
#
# pip-compile
#
asgiref==3.8.1
    # via django
django==5.0.7
    # via -r requirements.in
numpy==2.0.0
    # via pandas
pandas==2.2.2
    # via -r requirements.in
python-dateutil==2.9.0.post0
    # via pandas
pytz==2024.1
    # via pandas
six==1.16.0
    # via python-dateutil
sqlparse==0.5.1
    # via django
tzdata==2024.1
    # via pandas
Each "# via" comment records why a package is in the file: either it is listed in requirements.in, or it is a dependency of the package named in the comment.
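Because the annotations follow a regular pattern, they are easy to read programmatically. The sketch below parses the "# via" comments into a mapping from each package to whatever requires it. The function name parse_via is my own, and the parser makes a simplifying assumption: every comment that follows a pinned line is treated as a "via" annotation.

```python
SAMPLE = """\
asgiref==3.8.1
    # via django
django==5.0.7
    # via -r requirements.in
numpy==2.0.0
    # via pandas
pandas==2.2.2
    # via -r requirements.in
python-dateutil==2.9.0.post0
    # via pandas
six==1.16.0
    # via python-dateutil
"""


def parse_via(text):
    """Map each pinned package to the list of things that require it."""
    result = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith("#"):
            # A pinned requirement such as "django==5.0.7".
            current = line.split("==")[0].strip().lower()
            result[current] = []
        elif current is not None:
            note = line.lstrip("#").strip()
            if note == "via":
                # Header of the multi-line form; parents follow on later lines.
                continue
            if note.startswith("via "):
                note = note[4:].strip()
            if note:
                result[current].append(note)
    return result


parsed = parse_via(SAMPLE)
# Packages we asked for directly are marked with "-r requirements.in".
direct = [name for name, via in parsed.items() if "-r requirements.in" in via]
print(direct)
```

Header comments at the top of the file are skipped because no package has been seen yet at that point.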
Now run the following command to install all the packages.
pip-sync
Now let's see how to uninstall pandas. First remove pandas from requirements.in; the file should now look like this.
django
Compile the dependencies.
pip-compile
The regenerated requirements.txt will look like this.
#
# This file is autogenerated by pip-compile with Python 3.12
# by the following command:
#
# pip-compile
#
asgiref==3.8.1
    # via django
django==5.0.7
    # via -r requirements.in
sqlparse==0.5.1
    # via django
Now it contains neither pandas nor the packages that were installed because of pandas.
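Finally, run the following to complete the uninstallation of pandas and its dependencies. pip-sync makes the virtual environment match requirements.txt, so it uninstalls anything that is no longer listed there.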
pip-sync
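To check what that last step changes, you can compare the pins in the two versions of requirements.txt. This is only an illustrative sketch: the helper names are hypothetical, the file contents are pasted in as strings, and it compares package names only, not versions.

```python
BEFORE = """\
asgiref==3.8.1
django==5.0.7
numpy==2.0.0
pandas==2.2.2
python-dateutil==2.9.0.post0
pytz==2024.1
six==1.16.0
sqlparse==0.5.1
tzdata==2024.1
"""

AFTER = """\
asgiref==3.8.1
django==5.0.7
sqlparse==0.5.1
"""


def pinned_names(requirements_text):
    """Return the set of package names pinned in requirements.txt-style text."""
    names = set()
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments such as "# via django"
        names.add(line.split("==")[0].strip().lower())
    return names


def dropped(before, after):
    """Packages pinned in `before` that are no longer pinned in `after`."""
    return sorted(pinned_names(before) - pinned_names(after))


print(dropped(BEFORE, AFTER))
```

For this example the difference is pandas plus the five packages it brought in, which matches what we expect pip-sync to uninstall.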