Neural network training with Julia

Introduction

This example demonstrates how to integrate a Julia-based Multi-Layer Perceptron (MLP) training pipeline with the Tesseract framework.

The complete Tesseract can be downloaded here: julia_mlp.zip (335.5 KB)

Required Dependencies

The following additional dependencies are required to build and run this Tesseract:

  • NumPy (numpy==2.0.0)
  • JuliaCall (juliacall==0.9.20)

List them in a tesseract_requirements.txt file, one requirement per line.

tesseract_config.yaml

The file should be structured as follows:

name: "julia_flux_mlp"
version: "0.1.0"

build_config:
    target_platform: linux/amd64

    package_data:
    - [package, package]

    extra_packages:
    - curl

    custom_build_steps:
    # download julia
    - RUN curl -fsSL https://julialang-s3.julialang.org/bin/linux/x64/1.10/julia-1.10.0-linux-x86_64.tar.gz -o julia.tar.gz

    # unpack into folder called julia; delete the tar file
    - RUN mkdir julia && tar -xzf julia.tar.gz -C julia --strip-components=1 && rm julia.tar.gz

    # add julia to PATH
    - ENV PATH=${PATH}:"/tesseract/julia/bin"

    # Make sure the image can be used on any x86_64 machine by setting JULIA_CPU_TARGET
    # to the same value used by the generic julia binaries, see
    # https://github.com/JuliaCI/julia-buildkite/blob/4b6932992f7985af71fc3f73af77abf4d25bd146/utilities/build_envs.sh#L23-L31
    - ENV JULIA_CPU_TARGET="generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)"

    # check julia version
    - RUN julia --version

    # setup julia environment we copied over
    - RUN julia --project="package" -e "import Pkg; Pkg.instantiate(); Pkg.precompile()"

    # set juliacall environment location
    - ENV PYTHON_JULIAPKG_PROJECT="/tesseract"

    # import juliacall to setup environment
    - RUN python -c "import juliacall"

Code Overview

Setting Up the Julia Environment

The Python API module activates the Julia project in the package folder, installs its dependencies, and loads the training script:

import base64
import juliacall
import numpy as np
from pydantic import BaseModel, Field

from tesseract_core.runtime import Array, Float64

# Initialize Julia environment
jl = juliacall.newmodule("julia_mlp")
jl.seval("using Pkg")
jl.seval('Pkg.activate("package")')
jl.seval("Pkg.instantiate()")
jl.include("package/train.jl")

Schema Definitions

We define schemas for structured input validation and output formatting:

class Metrics(BaseModel):
    train_mse: float = Field(description="Train mean squared error.")
    test_mse: float = Field(description="Test mean squared error.")

class InputSchema(BaseModel):
    layers: list[int] = Field(description="Layer structure", default=[100, 100])
    activation: str = Field(description="Activation function", default="sigmoid")
    n_epochs: int = Field(description="Number of epochs to train", default=1000)
    state: bytes | None = Field(description="The current architecture state", default=None)

class OutputSchema(BaseModel):
    parameters: list[Array[..., Float64]] = Field(description="Parameters of trained MLP.")
    metrics: Metrics = Field(description="Schema of accuracy metrics.")
    state: bytes = Field(description="The updated architecture state.")

Training Function

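The apply function loads the dataset, optionally restores a previously saved state, trains the model, and returns the trained parameters, the error metrics, and the updated state: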

def apply(inputs: InputSchema) -> OutputSchema:
    layers = inputs.layers
    activation = inputs.activation
    n_epochs = inputs.n_epochs

    # Load dataset
    x_data, y_data = jl.load_data()
    data_train, data_test = jl.split_train_test(x_data, y_data)
    x_train, y_train = data_train
    x_test, y_test = data_test

    # Initialize and train
    activation_func = jl.eval(jl.Symbol(activation))
    model = jl.initialize_mlp(jl.Vector(layers), activation_func)
    
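    # Optionally resume from a previous run: the state arrives base64-encoded,
    # so decode it to a BSON file that the Julia side can load.
    if inputs.state is not None:
        with open("state_to_load.bson", "wb") as file:
            decoded = base64.b64decode(inputs.state)
            file.write(decoded)
        jl.load_state(model, "state_to_load.bson")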
    
    jl.train_mlp_b(model, x_train, y_train, n_epochs=n_epochs)

    # Extract parameters
    params = jl.collect(jl.Flux.params(model))
    params_out = [np.array(params[i]) for i in range(len(params))]

    # Compute metrics
    metrics = {
        "train_mse": jl.mse(model, x_train, y_train),
        "test_mse": jl.mse(model, x_test, y_test),
    }

    # Save state
    jl.save_state(model, "flux_mlp.bson")
    with open("flux_mlp.bson", "rb") as file:
        file_content = file.read()
    
    encoded = base64.b64encode(file_content)
    return OutputSchema(parameters=params_out, metrics=metrics, state=encoded)
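
The state handling above amounts to a base64 round trip around a file on disk. The sketch below isolates that pattern, using placeholder bytes in place of a real BSON checkpoint. The helper names encode_state_file and decode_state_to_file are hypothetical and are not part of the Tesseract or its Julia package.

```python
import base64
import tempfile
from pathlib import Path


def encode_state_file(path: Path) -> bytes:
    """Read a checkpoint file and return its base64-encoded contents."""
    return base64.b64encode(path.read_bytes())


def decode_state_to_file(state: bytes, path: Path) -> None:
    """Decode a base64-encoded state and write the raw bytes to a file."""
    path.write_bytes(base64.b64decode(state))


# Placeholder bytes standing in for the contents of flux_mlp.bson (assumption).
raw_checkpoint = b"\x00\x01 placeholder checkpoint \xff"

with tempfile.TemporaryDirectory() as tmp:
    saved = Path(tmp) / "flux_mlp.bson"
    saved.write_bytes(raw_checkpoint)

    # Mirrors what apply() returns in the `state` field.
    state = encode_state_file(saved)

    # Mirrors what a later apply() call does with `inputs.state`.
    restored = Path(tmp) / "state_to_load.bson"
    decode_state_to_file(state, restored)
    round_trip_ok = restored.read_bytes() == raw_checkpoint

print(round_trip_ok)
```

Base64 turns the binary checkpoint into ASCII-safe bytes, which is presumably why the state is encoded before it is returned in the output schema.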

Abstract Evaluation

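The abstract_eval function computes the expected shapes and dtypes of the output parameters from the layer sizes alone, without running any training: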

import itertools
from tesseract_core.runtime import ShapeDType

def abstract_eval(abstract_inputs):
    neuron_sizes = [1, *abstract_inputs.layers, 1]

    param_shapes = []
    for i, j in itertools.pairwise(neuron_sizes):
        param_shapes += [
            ShapeDType(shape=[j, i], dtype="float64"),
            ShapeDType(shape=[j], dtype="float64"),
        ]
    return {"parameters": param_shapes}
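
As a worked example, the default layers=[100, 100] yields three weight/bias pairs. The sketch below reproduces the shape logic with plain tuples instead of ShapeDType, so it runs without tesseract_core. Like abstract_eval, it assumes a single input feature and a single output. It uses zip in place of itertools.pairwise, which gives the same pairs, and the function name mlp_param_shapes is hypothetical.

```python
from math import prod


def mlp_param_shapes(layers: list[int]) -> list[tuple[int, ...]]:
    """Return alternating weight and bias shapes for an MLP with 1 input and 1 output."""
    neuron_sizes = [1, *layers, 1]
    shapes: list[tuple[int, ...]] = []
    # Pair each layer width with the next one, e.g. (1, 100), (100, 100), (100, 1).
    for n_in, n_out in zip(neuron_sizes, neuron_sizes[1:]):
        shapes.append((n_out, n_in))  # weight matrix, stored as (out, in)
        shapes.append((n_out,))       # bias vector
    return shapes


shapes = mlp_param_shapes([100, 100])
n_params = sum(prod(shape) for shape in shapes)

print(shapes)    # [(100, 1), (100,), (100, 100), (100,), (1, 100), (1,)]
print(n_params)  # 10401
```

The total of 10401 is 200 parameters for the first layer, 10100 for the second, and 101 for the output layer.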

Building the Tesseract

From the folder that contains the Tesseract, run:

tesseract build .

Testing the API

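The following script first checks the built image through the Python client, then calls apply directly, including a second run that resumes from the state returned by the first: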

from tesseract_core import Tesseract
from tesseract_api import InputSchema, apply

# Initialize Tesseract runtime
with Tesseract.from_image(image="julia_flux_mlp") as julia_flux_mlp:
    data = julia_flux_mlp.apply(inputs={"n_epochs": 10})
    print(f"state = {data['state']}, metrics = {data['metrics']}")

# Run API test cases
input_schema = InputSchema(n_epochs=100)
outputs = apply(input_schema)

new_inputs = InputSchema(n_epochs=100, state=outputs.state)
new_outputs = apply(new_inputs)
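
The two apply calls above show the resume pattern: the state returned by one run is fed into the next. A minimal sketch of that loop follows. To keep it runnable without Julia or a built image, fake_apply is a hypothetical stand-in that only mimics the state in/out contract (it counts epochs rather than training anything). train_in_rounds is likewise an illustrative helper and not part of tesseract_core.

```python
import base64
from typing import Callable, Optional


def fake_apply(n_epochs: int, state: Optional[bytes] = None) -> dict:
    """Stand-in for apply(): the 'state' is a base64-encoded epoch counter."""
    epochs_so_far = int(base64.b64decode(state).decode()) if state is not None else 0
    total = epochs_so_far + n_epochs
    return {"state": base64.b64encode(str(total).encode())}


def train_in_rounds(
    apply_fn: Callable[..., dict], n_rounds: int, n_epochs: int
) -> Optional[bytes]:
    """Call apply_fn repeatedly, passing each returned state into the next call."""
    state = None
    for _ in range(n_rounds):
        outputs = apply_fn(n_epochs=n_epochs, state=state)
        state = outputs["state"]
    return state


final_state = train_in_rounds(fake_apply, n_rounds=3, n_epochs=100)
total_epochs = int(base64.b64decode(final_state).decode())
print(total_epochs)
```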