Installing Python and packages#
A bit of background#
Here’s an overview of a few relevant terms:
Interpreter: pragmatically defined, an interpreter is a computer program that executes instructions interactively.
The python.exe you find in your Python installation is an interpreter. It will run commands one by one.
Interpreted programming languages are often contrasted with compiled programming languages.
Python is an interpreted language, as is R, for example.
Compilation: the act of translating a program from source code (human-readable text form) to machine code.
Machine code is the set of low level instructions that a processor understands.
A compiled program doesn’t run line by line; instead, the entire program is built first and can then be called to execute. We’re greatly simplifying, but the general idea is: building the entire program at once allows for optimization. This means that compiled languages are generally (much) faster than interpreted languages.
C, C++, and Fortran are examples of compiled languages.
The most popular Python interpreter, CPython, is a compiled program written in C!
Depending on the operating system (Linux, macOS, Windows) and processor architecture (e.g. x86, ARM; 32 bit versus 64 bit), different machine code has to be generated.
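To see which platform your own interpreter was built for, you can ask the standard library. The sketch below is only an illustration; the helper name describe_platform is made up for this example.

```python
import platform
import sys


def describe_platform():
    """Collect the details that determine which machine code is needed."""
    return {
        # Which interpreter is running, e.g. "CPython".
        "implementation": platform.python_implementation(),
        # Operating system, e.g. "Linux", "Darwin" (macOS) or "Windows".
        "system": platform.system(),
        # Processor architecture as reported by the operating system.
        "machine": platform.machine(),
        # Whether this interpreter is a 32 bit or 64 bit build.
        "bits": 64 if sys.maxsize > 2**32 else 32,
    }


if __name__ == "__main__":
    for key, value in describe_platform().items():
        print(f"{key}: {value}")
```

Compiled packages have to be built separately for every combination of these values that they support.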
And a few notes specific to Python:
A Python installation includes the Python interpreter, and the Python standard library: modules to deal with dates, calendars, mathematics, file handling, etc.
Package: a collection of files that provide a certain set of functionality. For example: the numpy package for working with large, multi-dimensional arrays and matrices.
A Python package generally consists of several modules. These are generally the individual .py files.
In terms of files: a package can consist of a single module (a .py file), or a set of directories “marked” with an __init__.py file.
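The difference between a single-file module and a directory-based package can be inspected from Python itself. A minimal sketch using two standard library examples, assuming a regular CPython installation; the helper is_package is a name chosen for this illustration.

```python
import glob
import json


def is_package(module):
    """A directory-based package has a __path__ attribute; a plain module does not."""
    return hasattr(module, "__path__")


if __name__ == "__main__":
    # json is a directory that contains an __init__.py file.
    print("json is a package:", is_package(json), "->", json.__file__)
    # glob is a single glob.py file.
    print("glob is a package:", is_package(glob), "->", glob.__file__)
```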
Generally, we would like to write the entire program in one language. Due to
the slowness of interpreted languages, this has not been feasible within Python,
especially for technical computing, since we typically want to crunch large
amounts of numbers. The solution people have come up with is to write the
performance-critical parts in a compiled language like C or Fortran, and then
call these compiled routines from Python. This solves the performance
problem, but at a price: whereas a package previously consisted of simply a
set of Python modules (.py text files), it now includes compiled (binary)
files that are specific to your operating system and processor architecture.
Not surprisingly, these are more complicated to distribute.
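The effect of handing work to compiled code can be seen without installing anything. In CPython, the built-in sum runs its loop in compiled C code, while an explicit for loop is executed step by step by the interpreter. The function names and the problem size below are arbitrary choices for this sketch, and the timings will differ per machine.

```python
import timeit


def loop_sum(n):
    """Add up 0..n-1 with an explicit loop run by the interpreter."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Add up 0..n-1 with the built-in sum, whose loop runs in compiled code."""
    return sum(range(n))


if __name__ == "__main__":
    n = 1_000_000
    # Both approaches give the same answer; only the time taken differs.
    assert loop_sum(n) == builtin_sum(n)
    for func in (loop_sum, builtin_sum):
        seconds = timeit.timeit(lambda: func(n), number=5)
        print(f"{func.__name__}: {seconds:.3f} s for 5 runs")
```

Packages like numpy apply the same idea on a larger scale: Python code describes what to compute, and compiled routines do the number crunching.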
Dependency conflicts#
Multiple packages often share dependencies, but they might depend on
different, incompatible versions. For example, packages B and C might
both rely on package A, but B relies on version A.1 and C relies on version A.2.
Typically, you might succeed in installing versions of B and C that
use the same version of A, but you’ll find that suddenly package D
(which also depends on a version of A) will no longer work. This state of
affairs is colloquially called “dependency hell”.
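The situation described above can be sketched in a few lines. This is a toy illustration, not how conda or mamba work internally: the packages A to D and their pinned versions are hypothetical, and real solvers deal with version ranges rather than exact pins.

```python
from collections import defaultdict

# Hypothetical requirements: package -> {dependency: required version}.
REQUIREMENTS = {
    "B": {"A": "1"},
    "C": {"A": "2"},
    "D": {"A": "1"},
}


def find_conflicts(requirements):
    """Return the dependencies that are required in more than one version."""
    wanted = defaultdict(dict)
    for package, dependencies in requirements.items():
        for dependency, version in dependencies.items():
            # Record which packages ask for which version of the dependency.
            wanted[dependency].setdefault(version, []).append(package)
    return {dep: versions for dep, versions in wanted.items() if len(versions) > 1}


if __name__ == "__main__":
    for dependency, versions in find_conflicts(REQUIREMENTS).items():
        for version, packages in versions.items():
            print(f"{dependency} version {version} is required by {', '.join(packages)}")
```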
Installing Python packages with mamba – without the agonizing pain#
We highly recommend installing packages using mamba or conda. Conda is
a package and environment manager that installs packages from a remote
repository (a remote storage location for software packages); Mamba is
a (much) faster version of conda. Pip (a recursive acronym for “Pip Installs
Packages”) can also be used for installing Python packages, but it was designed
mainly to install pure Python packages, without binary dependencies; trying to
pip install packages with complex dependencies is therefore a recipe for
frustration and disaster.
Conda does several things. First and foremost, it solves the dependency problem when installing a package. Secondly, it also installs binary dependencies. Thirdly, it provides isolated Python installations (termed environments). You might create a new environment if you have unsatisfiable version requirements, like two versions of Python (e.g. 2.7 and 3.9).
Mamba is a reimplementation of the conda package manager in C++. As the imod
package has a large number of dependencies, using mamba instead of conda
can strongly reduce installation times (from e.g. 30 minutes to 3
minutes!). As mamba has become sufficiently stable, we now strongly
recommend it over conda for installing packages and creating environments.
Some packages cannot be installed by conda/mamba because they are not available
on the conda channels. In that case, you can fall back on pip to install
the package (pip install {package name}).
See the articles “Understanding conda and pip” and “Conda: Myths and Misconceptions” for additional information. See also the Mamba homepage.
Anaconda, Miniconda, Miniforge, Mambaforge#
Anaconda and Miniconda both provide a Python installation and conda as the package manager. The difference between them is that Anaconda comes with a large number of packages pre-installed in the base environment (which is why the installation is over a gigabyte). Miniconda, on the other hand, comes bare bones. Since we recommend installing packages into separate environments (see below), we do not consider the full Anaconda installer attractive.
Note
In April 2020, Anaconda changed its Terms of Service, limiting use of the anaconda repository by commercial users. If you worry that you fall in the category of commercial users, we recommend installing Miniforge or Mambaforge instead. The change has negligible consequences for imod users, as we’ve long relied fully on the community-led conda-forge channel.
You can find the Miniforge homepage and the Miniforge and Mambaforge installers here:
You can find installers for Miniconda or Anaconda here:
During installation, tick the box “Add Anaconda/Miniforge to PATH”, even though the installer colors this option a warning red.
Setting up an environment#
At some point you will run into a dependency issue. Sometimes the dependency requirements of two packages are outright unsatisfiable. In other cases, you’d like to use the latest version of a package, but this would break other packages. The solution conda offers is easy switching between different Python installations. A conda environment is simply a complete Python installation with all necessary dependencies. Creating a new environment will result in a new Python installation that shares no dependencies with other environments. (This is hardly the most efficient use of your hard disk space from a theoretical perspective, but it greatly simplifies matters in the practical sense.)
Below is the specification for an environment that should provide you with all the dependencies and requirements you need to build groundwater models with iMOD Python, and then some (for testing/development).
name: imod
channels:
- conda-forge
- defaults
dependencies:
- affine
- black
- bottleneck
- cftime>=1
- contextily>=1
- cytoolz
- dask
- geopandas
- graphviz
- python-graphviz
- flopy
- hypothesis
- jinja2
- matplotlib
- netcdf4
- numba>=0.50
- numpy
- pandas<2.0 # TODO: unpin when newer xarray on conda-forge
- pooch
- pip
- pip:
    - pandamesh
- pydata-sphinx-theme
- pyproj
- pytest
- pytest-benchmark
- pytest-cov
- python>=3.7,<3.11
- pyvista
- rasterio>=1.0
- rioxarray
- scipy
- setuptools_scm
- scooby
- shapely>=2.0
- sphinx=5
- sphinx-gallery
- toolz
- tomli>=1.1.0
- tomli-w
- tqdm
- twine
- vtk>=9.0
- xarray>=0.15
- xugrid>=0.3.0
- zarr
Save this text into a file called imod-environment.yml; the location doesn’t
really matter. In your command prompt, cd to this location and run:
mamba env create -f imod-environment.yml
This will create a conda environment named imod, as specified in the
file.
Environments can be “activated” by running:
conda activate {name of environment}
Activate the newly created environment by running:
conda activate imod
This essentially temporarily updates your PATH variable, which is the set
of directories where executable programs are located. After deactivating the
conda environment, either via conda deactivate or by closing the command
prompt, these directories are removed from PATH again so that the Python
installation is properly isolated.
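If you are unsure which environment a Python session is actually using, you can check from within Python. A minimal sketch: the function name environment_report and the default package list are choices made for this illustration, and CONDA_DEFAULT_ENV is the environment variable that conda sets on activation.

```python
import importlib.util
import os
import sys


def environment_report(packages=("numpy", "xarray", "imod")):
    """Summarize which Python installation is running and what it can import."""
    return {
        # Name of the active conda environment; None if none is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Full path of the interpreter that is running this code.
        "executable": sys.executable,
        # Root directory of the Python installation in use.
        "prefix": sys.prefix,
        # Activation puts the environment's directories at the front of PATH.
        "first_path_entry": os.environ.get("PATH", "").split(os.pathsep)[0],
        # Whether each package can be found by this interpreter.
        "importable": {
            name: importlib.util.find_spec(name) is not None for name in packages
        },
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

With the imod environment active, the executable and prefix should point into that environment's directory.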
Read more at the full conda docs.
Installing a newer or older version#
Since we’re currently in the process of adding a lot of features, the version on PyPI or conda-forge doesn’t always carry the latest updates. To get the latest version, activate the environment, clone the repository to a directory of your choice, and do a “development install”:
conda activate imod
git clone https://gitlab.com/deltares/imod/imod-python.git
cd imod-python
pip install -e .
To get the latest developments at a later point in time, execute within the imod-python directory:
git pull
Installing specific (older) versions is possible by specifying a version number:
pip install imod==0.11.0
Past versions can also be found on the iMOD Python releases page.
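To confirm which version ended up in the active environment, you can query the installed package metadata. A minimal sketch, assuming Python 3.8 or newer (where importlib.metadata is part of the standard library); the helper name installed_version is chosen for this example.

```python
from importlib import metadata


def installed_version(distribution="imod"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("imod is not installed in this environment")
    else:
        print(f"imod version: {version}")
```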