Custom build steps: PyVista on ARM64 example
Context
By default, Tesseracts use an official Python Docker image as the base image. Although this covers many useful cases, some system dependencies require a custom base image and extra build steps.
Example yaml
Via tesseract_config.yaml, it is possible to alter the build process fairly flexibly to accommodate different needs. As a concrete example, here is what we had to do internally in order to build an arm64 Tesseract with PyVista installed as a dependency:
build_config:
  base_image: "debian:trixie"
  target_platform: "linux/arm64"
  extra_packages:
    - python3-vtk9
    - python3-venv
  custom_build_steps:
    - |
      USER root
      # Symlink the VTK Python bindings into the Python site-packages directory
      RUN python_site=$(python -c "import site; print(site.getsitepackages()[0])") && \
          ln -s /usr/lib/python3/dist-packages/vtk* $python_site && \
          ls -l $python_site/vtk* && \
          python -c "import vtk"
      # Must install pyvista with --no-deps to avoid installing vtk (which we linked from the system)
      RUN pip install matplotlib numpy pillow pooch scooby && pip install --no-deps pyvista==0.44.1
      USER tesseractor
Here is what each of the settings above does to the build process:
- We started from a base image that had VTK 9.2 installed (debian:trixie), which was specified via base_image.
- The target_platform is set to linux/arm64, which will build the resulting image for an ARM64 architecture.
- We installed python3-vtk9 and python3-venv to get the VTK Python bindings and venv by specifying them in extra_packages. You can think of extra_packages as a list of packages that are installed with apt-get install directly on the base image, right after some other dependencies (like git, ssh, and build-essential) are installed.
- We can then run arbitrary commands on the image being built via custom_build_steps. This list of commands needs to be written as if the commands were in a Dockerfile. In particular, we start by temporarily setting the user to root, as the default user in the Tesseract build process is tesseractor, which does not have root privileges, and we switch back to the tesseractor user at the very end. In between, we run commands directly in the shell via RUN commands. All steps specified in custom_build_steps are executed at the very end of the build process, followed only by a final execution of tesseract-runtime check, which checks that the runtime can be launched and that the user-defined tesseract_api module can be imported.
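If the build succeeds but PyVista misbehaves at runtime, it can help to confirm from inside the container that the symlink step did what was intended. The following is a minimal sketch, not part of Tesseract itself: the helper names symlinked_entries and module_available are hypothetical, and it assumes it is run with the same Python interpreter that the build steps above used.

```python
import importlib.util
import site
from pathlib import Path


def symlinked_entries(site_dir: Path, pattern: str = "vtk*") -> dict[str, Path]:
    """Map each symlink in site_dir matching pattern to the path it resolves to.

    Hypothetical helper: mirrors the `ls -l $python_site/vtk*` check in the
    build step, but ignores entries that are regular files or directories.
    """
    return {
        entry.name: entry.resolve()
        for entry in sorted(site_dir.glob(pattern))
        if entry.is_symlink()
    }


def module_available(name: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Same lookup as in the custom build step: first site-packages directory.
    site_dir = Path(site.getsitepackages()[0])
    print("site-packages:", site_dir)
    for name, target in symlinked_entries(site_dir).items():
        print(f"  {name} -> {target}")
    for module in ("vtk", "pyvista"):
        print(f"{module} importable:", module_available(module))
```

If the symlinks are in place, the listed targets should point into /usr/lib/python3/dist-packages, which indicates that the system VTK bindings are the ones in use rather than a separately installed copy.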