Model Deployment Overview

Developing a deep learning application involves two stages: training and inference. In the training stage, a neural network is designed and trained for a specific task (such as image classification) using a large amount of training data. In the inference stage, the trained network is deployed to process new input data that it did not see during training.

The Vitis™ AI toolchain provides a workflow for deploying deep learning inference applications on the DPU in the following four steps, which this chapter walks through using ResNet-50 as the example model.

  1. Quantize the neural network model.
  2. Compile the neural network model.
  3. Program with the Vitis AI programming interface.
  4. Run and evaluate the deployed DPU application.
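
To give a feel for what the quantization step in the list above does, the following is a minimal sketch of symmetric 8-bit quantization applied to a handful of floating-point weights. It is an illustration only and is not the Vitis AI quantizer: the function names `quantize_int8` and `dequantize_int8`, the per-tensor scale, and the sample weights are all assumptions made for this example, and the actual tool may choose its scaling differently.

```python
# Illustrative sketch only: symmetric 8-bit quantization of float weights.
# This is NOT the Vitis AI quantizer; all names here are hypothetical.


def quantize_int8(values):
    """Map floats to signed 8-bit integer codes using one per-tensor scale."""
    max_abs = max(abs(v) for v in values)
    # Fall back to a scale of 1.0 so an all-zero tensor does not divide by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the int8 range.
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical sample weights, chosen only for this example.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# The rounding error of each restored value is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The integer codes are what a fixed-point accelerator would operate on, while the scale is kept alongside them so that results can be mapped back to real values. The small gap between `weights` and `restored` is the accuracy cost that is checked when the deployed application is evaluated.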