Serve PyTorch Models¶

This section will guide you through serving a PyTorch model, using the Kale serve API.

Overview

What You’ll Need
Procedure
Summary
What’s Next

What You’ll Need ¶

An Arrikto EKF or MiniKF deployment with the default Kale Docker image.
An understanding of how the Kale serve API works.

This guide comprises three sections: In the first section, you explore and process the dataset. Then, in the second section, you build, train, and package a PyTorch model using the Torch model archiver for TorchServe. Next, you serve the model using the Kale serve API and, finally, in the third section, you will invoke the model service to get predictions on a holdout test subset.

Explore Dataset ¶

In this guide, you will work with the CIFAR10 dataset. The CIFAR10 dataset consists of 60000 32x32 RBG images. The dataset creators have categorized the images in 10 different classes and the end goal is to correctly predict the object that each image depicts.

Create a new notebook server using the Kale GPU Docker image. The image will have the following naming scheme:

gcr.io/arrikto/jupyter-kale-gpu-py38:<IMAGE_TAG>

Note

The <IMAGE_TAG> varies based on the MiniKF or Arrikto EKF release.

Note

If you want to have access to a GPU device you must specifically request one or more from the Jupyter Web App UI. For this user guide, access to a GPU device is not required, but we recommend to add one so that you can get better results.
Create a new Jupyter notebook (that is, an IPYNB file):
Install the necessary dependencies in the first code cell. Copy and paste the following code, and run the cell:

- hide: code

!pip3 install torch>=1.3.1 torchvision==0.8.2 torch-model-archiver==0.6.0

This is how your notebook cell will look like:
Restart the notebook’s kernel using the corresponding button in the UI:
Copy and paste the import statements in the next code cell, and run it:

- hide: code

import json import numpy as np import matplotlib.pyplot as plt import torch import torchvision import torch.nn as nn import torch.optim as optim import torchvision.transforms as transforms from kale.serve import Endpoint

This is how your notebook cell will look like:
Load the dataset into train and test subsets. Copy and paste the following code into a new cell, and run it:

- hide: code

batch_size = 4 transform = transforms.Compose( [transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) # Download the images and transform them to Tensors trainset = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform) testset = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform) # Load the data trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2) testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=2) classes = ("plane", "car", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck")

This is how your notebook cell will look like:
Visualize images from the dataset and print the category they belong to:

- hide: code

# get some random training images dataiter = iter(trainloader) images, labels = dataiter.next() _, ax = plt.subplots(1, 4, figsize=(10, 5)) for i in range(4): img = images[i] img = img / 2 + 0.5 npimg = img.numpy() label = labels[i] ax[i % 4].imshow(np.transpose(npimg, (1, 2, 0))) ax[i % 4].set_title(classes[label]) ax[i % 4].axis("off") plt.show()

This is how your notebook cell will look like:

Serve PyTorch Model ¶

In the same notebook server, open a terminal, and create two new Python files:

$ touch classifier.py $ touch cifar10_handler.py
- Inside the first module you will define the PyTorch model.
- Inside the second module you will define a handler component. TorchServe needs this module to process the data before passing it to the model.
Note

To learn more about TorchServe handlers, see the TorchServe handlers documentation. TorchServe provides several default handlers for common use cases, but you can also define your own.

Define a simple PyTorch Convolutional Neural Network (CNN). Copy and paste the following code inside classifier.py:

classifier.py

1# Copyright © 2022 Arrikto Inc.  All Rights Reserved.
2
3"""PyTorch Model Definition.
4
5This script defines a simple PyTorch CNN.
6"""
7
8import torch
9import torch.nn as nn
10import torch.nn.functional as f
11
12
13class Net(nn.Module):
14    """Define CNN model."""
15
16    def __init__(self):
17        super(Net, self).__init__()
18        self.conv1 = nn.Conv2d(3, 6, 5)
19        self.conv2 = nn.Conv2d(6, 16, 5)
20        self.fc1 = nn.Linear(16 * 5 * 5, 120)
21        self.fc2 = nn.Linear(120, 84)
22        self.fc3 = nn.Linear(84, 10)
23
24    def forward(self, x):
25        x = self.conv1(x)
26        x = f.relu(x)
27        x = f.max_pool2d(x, kernel_size=2)
28        x = self.conv2(x)
29        x = f.relu(x)
30        x = f.max_pool2d(x, kernel_size=2)
31
32        x = torch.flatten(x, 1)
33        x = self.fc1(x)
34        x = f.relu(x)
35        x = self.fc2(x)
36        x = f.relu(x)
37        x = self.fc3(x)
38
39        return x

Copy and paste the following code inside cifar10_handler.py:

cifar10_handler.py

1# Copyright © 2022 Arrikto Inc.  All Rights Reserved.
2
3"""CIFAR10 handler Definition.
4
5This script defines a simple TorchServe handler.
6"""
7
8from torchvision import transforms
9from torch.profiler import ProfilerActivity
10from ts.torch_handler.image_classifier import ImageClassifier
11
12
13class Cifar10Classifier(ImageClassifier):
14    """Cifar10Classifier Handler."""
15
16    class_names = ["plane", "car", "bird", "cat", "deer",
17                   "dog", "frog", "horse", "ship", "truck"]
18
19    image_processing = transforms.Compose([
20        transforms.ToTensor(),
21        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
22
23    def __init__(self):
24        super(Cifar10Classifier, self).__init__()
25        self.profiler_args = {
26            "activities": [ProfilerActivity.CPU],
27            "record_shapes": True,
28        }
29
30    def postprocess(self, data):
31        """Convert the predicted output response to a label.
32
33        Args:
34            data (list): The predicted output response.
35
36        Returns:
37            list : A list of dictionaries with processed predictions.
38        """
39        pred = data.argmax(1).tolist()
40        labels = [self.class_names[p] for p in pred]
41        return labels

This script defines a custom image classifier handler, which processes the inputs before passing them to the model, and postprocesses the predictions before returning them to the client.

Return to your Notebook file and run the following code to train your model for one epoch:

- hide: code

from classifier import Net net = Net() epochs = 1 # Change this number if you have a GPU device criterion = nn.CrossEntropyLoss() optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9) for epoch in range(epochs): # loop over the dataset multiple times running_loss = 0.0 for i, data in enumerate(trainloader, 0): # get the inputs; data is a list of [inputs, labels] inputs, labels = data # zero the parameter gradients optimizer.zero_grad() # forward + backward + optimize outputs = net(inputs) loss = criterion(outputs, labels) loss.backward() optimizer.step() # print statistics running_loss += loss.item() # print every 2000 mini-batches if i % 2000 == 1999: print("Epoch:", epoch + 1, "---- Mini batch:", i + 1, "---- Loss:", "%.3f" % (running_loss / 2000)) running_loss = 0.0

This is how your notebook cell will look like:

Note

If you have a GPU device, you can train the model for more epochs. To do so, change the value of the epochs variable to a higher number. This will improve the accuracy of the model, but producing an accurate model is not the focus of this user guide.
Serialize the model locally. Copy and paste the following code into a new cell, and run it:

- hide: code

from kale.marshal import save save(net, "model")

This is how your notebook cell will look like:

This command will create a model.pt directory, containing the weights of the model, and its definition. You will need these two files to package the model in a .mar file that can be deployed with KServe.
In the terminal, run the torch-model-archiver CLI to package the model and the handler:

jovyan@serve-pytorch-0:~$ torch-model-archiver \ > --model-name cifar10 \ > --version 1.0 \ > --model-file /home/jovyan/model.pt/network.py \ > --serialized-file /home/jovyan/model.pt/model.pt \ > --handler /home/jovyan/cifar10_handler

This command will create a cifar10.mar file. You will need this file in the next step.

Note

To learn more about .mar archives, how you can create one, what are the different options, and how to use them, see the TorchServe documentation.
Create and name a folder name cifar10. This is the folder you will point TorchServe to. Inside there should be two other directories:
- a config directory containing a config.properties file
- a model-store directory to hold the model archive
The following file tree depicts the final structure of the folder:

jovyan@serve-pytorch-0:~$ tree cifar10 cifar10 ├── config │ └── config.properties └── model-store └── cifar10.mar

Place the following contents in the config.properties file:

config.properties

1inference_address=http://0.0.0.0:8085
2management_address=http://0.0.0.0:8085
3metrics_address=http://0.0.0.0:8082
4grpc_inference_port=7070
5grpc_management_port=7071
6enable_metrics_api=true
7metrics_format=prometheus
8number_of_netty_threads=4
9job_queue_size=10
10enable_envvars_config=true
11install_py_dep_per_model=true
12model_store=/mnt/models/model-store
13model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"cifar10":{"1.0":{"defaultVersion":true,"marName":"cifar10.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":10,"responseTimeout":120}}}}

Note

To learn more about the configuration options, head to the configuration guide for TorchServe.

Copy the cifar10.mar file in the model-store folder:

$ cp cifar10.mar cifar10/model-store
Upload the cifar10 folder to S3. You can complete this step manually or by using the aws CLI:

$ aws s3 cp cifar10 s3://<bucket-name>/cifar10 --recursive

Note

You can use almost any object storage provider, such as AWS S3, Azure Blob Storage, or Google Cloud Storage. For a list of the KServe supported services and their configuration, see the KServe documentation.
Retrieve the S3 URI pointing to your cifar10 folder from the S3 UI. For example s3://<bucket-name>/.

Important

You should provide a URI pointing but not including the cifar10 folder. In this case, if your URI is s3://<bucket-name>/cifar10, you should provide s3://<bucket-name>/.
In your terminal, create a new file named s3-creds.yaml:

$ touch s3-creds.yaml
Copy and paste the following code into the s3-creds.yaml file:

apiVersion: v1 kind: Secret metadata: name: s3-creds annotations: serving.kserve.io/s3-endpoint: s3.amazonaws.com serving.kserve.io/s3-region: <REGION> serving.kserve.io/s3-useanoncredential: "false" serving.kserve.io/s3-usehttps: "1" type: Opaque data: AWS_ACCESS_KEY_ID: <AWS-ACCESS-KEY-ID> AWS_SECRET_ACCESS_KEY: <AWS-SECRET-ACCESS-KEY>

Replace the <REGION>, <AWS-ACCESS-KEY-ID, and <AWS-SECRET-ACCESS-KEY> placeholders with your credentials. KServe reads the secret annotations to inject the S3 environment variables on the storage initializer or model agent to download the models from S3 storage.
Create a new file for your ServiceAccount resource:

$ touch kserve-sa.yaml
Copy and paste the following code into the kserve-sa.yaml file:

apiVersion: v1 kind: ServiceAccount metadata: name: kserve-sa secrets: - name: s3-creds
Apply the Secret and the ServiceAccount resources:

$ kubectl apply -f s3-creds.yaml && kubectl apply -f kserve-sa.yaml
Note

If you are using a different object storage provider read the KServe documentation to configure your environment:
- Azure Blob Storage
- HTTP/HTTPS URIs
In the Notebook server you have running, instruct Kale to serve the model using the S3 URI you retrieved in a previous step:

from kale.serve import serve config = {"predictor": {"service_account_name": "kserve-sa", "storage_uri": "s3://kserve-examples/", "model_format": {"name": "pytorch"}}} isvc = serve(name="cifar-10", serve_config=config)

This is how your notebook cell will look like:

Get Predictions ¶

In this section, you will query the model endpoint to get predictions for the images in the test subset.

Navigate to the Models UI to retrieve the name of the InferenceService. In this example, it is cifar10.
In the existing notebook, in a different code cell, initialize a Kale Endpoint object using the name of the InferenceService you retrieved in the previous step. Then, run the cell:

- hide: code

endpoint = Endpoint(name="cifar10")

Note

When initializing an Endpoint, you can also pass the namespace of the InferenceService. For example, if your namespace is my-namespace:

- hide: code

endpoint = Endpoint(name="cifar10", namespace="my-namespace")

If you do not provide one, Kale assumes the namespace of the notebook server. In our case it is kubeflow-user.

This is how your notebook cell will look like:
Visualize a test sample:

- hide: code

dataiter = iter(testloader) images, labels = dataiter.next() img = images[1] label = labels[1] img = img / 2 + 0.5 npimg = img.numpy() plt.title(classes[label]) plt.imshow(np.transpose(npimg, (1, 2, 0))) plt.show()

This is how your notebook cell will look like:
Transform the example image the same way you did during training:

- hide: code

transform = transforms.Compose( [transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) transformed_data = transform(img)

This is how your notebook cell will look like:
Convert the test example into JSON format. Copy and paste the following code into a new code cell, and run it:

- hide: code

data = {"instances": [{"data": transformed_data.tolist()}]}

This is how your notebook cell will look like:
Invoke the server to get predictions. Copy and paste the following snippet in a different code cell, and run it:

- hide: code

res = endpoint.predict(json.dumps(data)) print(res)

This is how your notebook cell will look like:

Summary ¶

You have successfully served a PyTorch model stored on S3, using the Kale serve API.

What’s Next ¶

Check out how you to serve an XGBoost model.

Serve XGBoost Models

1		inference_address=http://0.0.0.0:8085
2		management_address=http://0.0.0.0:8085
3		metrics_address=http://0.0.0.0:8082
4		grpc_inference_port=7070
5		grpc_management_port=7071
6		enable_metrics_api=true
7		metrics_format=prometheus
8		number_of_netty_threads=4
9		job_queue_size=10
10		enable_envvars_config=true
11		install_py_dep_per_model=true
12		model_store=/mnt/models/model-store
13		model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"cifar10":{"1.0":{"defaultVersion":true,"marName":"cifar10.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":10,"responseTimeout":120}}}}

Serve PyTorch Models¶

What You’ll Need¶

Procedure¶

Explore Dataset¶

Serve PyTorch Model¶

Get Predictions¶

Summary¶

What’s Next¶