How to deploy and debug a model with DJL Link to heading

Deep Java Library (DJL) is an open-source deep learning library for Java, developed by AWS. It supports multiple engines, such as PyTorch, TensorFlow, and ONNX Runtime. In fact, SageMaker, AWS's popular managed machine-learning service, uses DJL in the backend.

Today, let’s have a look at how to deploy and debug a PyTorch model with DJL.

Deploy ResNet18 Link to heading

First, let’s deploy a ResNet18 model with DJL so that a client can send an HTTP request with an image, and the model will respond with labels and probabilities. For reproducibility, we will deploy in a Docker container. The Dockerfile for deployment is shown below:

# this is the docker image for serving (deployment)
FROM deepjavalibrary/djl-serving:0.27.0

# setup necessary packages
RUN apt update && apt install vim git python3 -y
RUN apt install python3-pip -y

WORKDIR /app

# install python engine so that we can run a pytorch model
RUN git clone https://github.com/deepjavalibrary/djl-serving -b v0.27.0
RUN cd djl-serving/engines/python/setup && pip install -U .

# download ResNet18 example
RUN git clone https://github.com/deepjavalibrary/djl-demo.git

Next, we will build the image, run the container, and deploy the ResNet18 model:

# build the image
docker build -t djl .

# run the container with port 8080 and deploy the model
docker run --rm -p 8080:8080 djl djl-serving -m resnet::Python=file:/app/djl-demo/djl-serving/python-mode/resnet18

Here, the actual Python code that runs the inference lives in model.py. When we execute the djl-serving command, it starts a Java backend that calls the handle function within model.py.
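To illustrate the contract between the Java backend and model.py, here is a minimal, hedged sketch. In the real example, the Input and Output classes come from the djl_python package and wrap the HTTP request and response; they are stubbed below so the sketch runs standalone, and the fixed predictions dict stands in for the actual ResNet18 forward pass.

```python
# Minimal sketch of the handle() contract used by the DJL Python engine.
# Input/Output are stubs for the djl_python classes of the same names,
# so this file runs without the serving environment installed.

class Input:
    """Stub for djl_python.Input: wraps the request payload."""
    def __init__(self, data: bytes):
        self._data = data

    def get_as_bytes(self) -> bytes:
        return self._data


class Output:
    """Stub for djl_python.Output: wraps the JSON response."""
    def __init__(self):
        self.content = None

    def add_as_json(self, obj):
        self.content = obj
        return self


def handle(inputs: Input) -> Output:
    """Entry point the Java backend calls for every request."""
    image_bytes = inputs.get_as_bytes()
    # The real model.py decodes the image, runs ResNet18, and computes
    # softmax probabilities; a fixed result stands in here.
    predictions = {"tabby": 0.46, "tiger_cat": 0.35}
    return Output().add_as_json(predictions)


# Simulate one request end to end.
out = handle(Input(b"<image bytes>"))
print(out.content)
```

The key point is simply that djl-serving hands each request to a top-level handle function and serializes whatever Output it returns.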

Now that the model is deployed, we can test our model endpoint. Run the following commands from the host machine:

# download a test image
curl -O https://resources.djl.ai/images/kitten.jpg

# test the endpoint deployed inside the docker container
curl -X POST "http://127.0.0.1:8080/predictions/resnet" -T "kitten.jpg"
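The same request can also be sent from Python using only the standard library; a small sketch, assuming the model is deployed at the default endpoint shown above:

```python
# Hedged sketch: build the same POST request as the curl command above,
# using only the standard library.
import urllib.request


def build_predict_request(image_bytes: bytes,
                          url: str = "http://127.0.0.1:8080/predictions/resnet"):
    """Build a POST request whose body is the raw image bytes."""
    return urllib.request.Request(url, data=image_bytes, method="POST")


# Usage (requires the container to be running):
# with open("kitten.jpg", "rb") as f:
#     req = build_predict_request(f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```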

If everything works successfully, you should see the following response from the model:

[
  {
    "tabby":0.455234557390213,
    "tiger_cat":0.3483537435531616,
    "Egyptian_cat":0.15608163177967072,
    "lynx":0.02676212601363659,
    "Persian_cat":0.0022320037242025137
  }
]
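Since the response is plain JSON (a list containing one dict of label-to-probability pairs), picking the top prediction on the client side is straightforward:

```python
# Parse the sample response above and select the top-1 label.
import json

response_text = """[
  {
    "tabby":0.455234557390213,
    "tiger_cat":0.3483537435531616,
    "Egyptian_cat":0.15608163177967072,
    "lynx":0.02676212601363659,
    "Persian_cat":0.0022320037242025137
  }
]"""

predictions = json.loads(response_text)[0]
top_label, top_prob = max(predictions.items(), key=lambda kv: kv[1])
print(f"{top_label}: {top_prob:.3f}")  # tabby: 0.455
```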

Debug the model Link to heading

Say you want to debug the Python code of the model, e.g., the inference function within model.py. Breaking and stepping in/out directly on djl-serving is not trivial because we are running a Java program that calls into Python. The easiest way to break and step in/out is to use DJL’s built-in djl_python.test_model module to simulate a request instead. That is, run the following inside the Docker container where the model is deployed:

# download the test image
curl -O https://resources.djl.ai/images/kitten.jpg

# simulate a request and run the model inference directly in python
python -m djl_python.test_model --model-dir=/app/djl-demo/djl-serving/python-mode/resnet18 --input=kitten.jpg

Now that we are running directly in Python, we can easily debug the code. VSCode users can install the Dev Containers extension to attach to the running container, then create the launch.json file below to debug the inference code:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "debug model.py",
            "type": "debugpy",
            "request": "launch",
            "module": "djl_python.test_model",
            "args": [
                "--model-dir=/app/djl-demo/djl-serving/python-mode/resnet18",
                "--input=kitten.jpg"
            ]
        }
    ]
}

References Link to heading

djl-demo/djl-serving/python-mode/README.md at master · deepjavalibrary/djl-demo

DJL - Python engine implementation - Deep Java Library