By default, Tesseracts use an official Python Docker image as the base image. This covers many use cases, but some system dependencies require a custom base image and extra build steps.
The build process can be adjusted via `tesseract_config.yaml` to accommodate different needs. As a concrete example, here is what we had to do internally to build an arm64 Tesseract with PyVista installed as a dependency:
build_config:
  base_image: "debian:trixie"
  target_platform: "linux/arm64"
  extra_packages:
    - python3-vtk9
    - python3-venv
  custom_build_steps:
    - |
      USER root
      # Symlink the VTK Python bindings into the Python site-packages directory
      RUN python_site=$(python -c "import site; print(site.getsitepackages()[0])") && \
          ln -s /usr/lib/python3/dist-packages/vtk* $python_site && \
          ls -l $python_site/vtk* && \
          python -c "import vtk"
      # Must install pyvista with --no-deps to avoid installing vtk (which we copied from the system)
      RUN pip install matplotlib numpy pillow pooch scooby && pip install --no-deps pyvista==0.44.1
      USER tesseractor
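The `python -c "import vtk"` call in the configuration above acts as a build-time sanity check: if the symlinked bindings cannot be imported, the build fails early. A slightly more informative variant could look like the sketch below. The helper name `check_imports` is hypothetical and not part of Tesseract; the script uses only the standard library and would need to be run with the same Python interpreter the image uses.

```python
import importlib
import sys


def check_imports(module_names):
    """Try to import each module.

    Returns a dict mapping each name to its version string, "unknown" if the
    module has no __version__ attribute, or None if the import failed.
    """
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            results[name] = None
            continue
        results[name] = str(getattr(module, "__version__", "unknown"))
    return results


if __name__ == "__main__":
    # Module names taken from the example config; adjust to your dependencies.
    report = check_imports(["vtk", "pyvista"])
    for name, version in report.items():
        status = "MISSING" if version is None else f"ok ({version})"
        print(f"{name}: {status}")
    # A non-zero exit code makes a Dockerfile RUN step fail if anything is missing.
    sys.exit(1 if None in report.values() else 0)
```

Reporting every module before exiting makes it easier to see whether the problem lies with the system VTK bindings or with the `--no-deps` PyVista install.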
Here is what the various configurations set above do to the build process:
- We started from a base image that had VTK 9.2 installed (`debian:trixie`), which was specified via `base_image`.
- `target_platform` is set to `linux/arm64`, which builds the resulting image for the ARM64 architecture.
- We installed `python3-vtk9` and `python3-venv` to get the VTK Python bindings and `venv` by listing them in `extra_packages`. You can think of `extra_packages` as a list of packages that are installed with `apt-get install` on the base image, right after some other dependencies (like `git`, `ssh`, and `build-essential`) are installed.
- We can then run arbitrary commands on the image being built via `custom_build_steps`. These commands need to be specified as if they were in a Dockerfile. In particular, we start by temporarily setting the user to `root`, since the default user in the Tesseract build process is `tesseractor`, which does not have root privileges, and we switch back to the `tesseractor` user at the very end. In between, we run shell commands via `RUN` instructions. All steps specified in `custom_build_steps` are executed at the very end of the build process, followed only by a final run of `tesseract-runtime check`, which verifies that the runtime can be launched and that the user-defined `tesseract_api` module can be imported.
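To make the ordering described above easier to picture, here is a minimal Python sketch that renders a simplified Dockerfile outline from a `build_config`-like object. It is an illustration only: the `BuildConfig` class, the `render_dockerfile_outline` function, and the exact `apt-get` lines are assumptions made for this example, not the template Tesseract actually uses, and the intermediate build steps are elided.

```python
from dataclasses import dataclass, field


@dataclass
class BuildConfig:
    """Hypothetical stand-in for the build_config section of tesseract_config.yaml."""

    base_image: str
    target_platform: str = "native"
    extra_packages: list = field(default_factory=list)
    custom_build_steps: list = field(default_factory=list)


def render_dockerfile_outline(cfg: BuildConfig) -> str:
    """Return a simplified Dockerfile outline that mirrors the described ordering."""
    lines = [
        # Assumption: the platform is passed to the image build, not written here.
        f"# target platform: {cfg.target_platform}",
        f"FROM {cfg.base_image}",
        # Default system dependencies named in the text come first.
        "RUN apt-get update && apt-get install -y git ssh build-essential",
    ]
    if cfg.extra_packages:
        # extra_packages are installed right after the default dependencies.
        lines.append("RUN apt-get install -y " + " ".join(cfg.extra_packages))
    lines.append("# ... remaining Tesseract build steps (elided) ...")
    # The default user at this point is the unprivileged tesseractor user.
    lines.append("USER tesseractor")
    # Custom build steps are pasted verbatim at the very end of the build.
    for step in cfg.custom_build_steps:
        lines.append(step.rstrip())
    # The only thing that runs after the custom steps is the runtime check.
    lines.append("RUN tesseract-runtime check")
    return "\n".join(lines)


example = BuildConfig(
    base_image="debian:trixie",
    target_platform="linux/arm64",
    extra_packages=["python3-vtk9", "python3-venv"],
    custom_build_steps=['USER root\nRUN python -c "import vtk"\nUSER tesseractor\n'],
)
print(render_dockerfile_outline(example))
```

Reading the printed outline from top to bottom shows why the custom steps must switch to `root` themselves and why they should switch back to `tesseractor` before the final check runs.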