Xilinx is creating an environment where employees, customers, and partners feel welcome and included. To that end, we’re removing non-inclusive language from our products and related collateral. We’ve launched an internal initiative to remove language that could exclude people or reinforce historical biases, including terms embedded in our software and IPs. You may still find examples of non-inclusive language in our older products as we work to make these changes and align with evolving industry standards. Follow this link for more information.
The following table shows the revision history for this document.

<table>
<thead>
<tr>
<th>Section</th>
<th>Revision Summary</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>10/27/2021 Version 1.3</strong></td>
<td></td>
</tr>
<tr>
<td>General</td>
<td>Updated XRT commands.</td>
</tr>
<tr>
<td>Minimum System Requirements</td>
<td>Updated operating system requirements.</td>
</tr>
<tr>
<td>Appendix E: Miscellaneous Information and Settings</td>
<td>Added additional details on how to read the shell version.</td>
</tr>
<tr>
<td><strong>08/13/2021 Version 1.2.1</strong></td>
<td></td>
</tr>
<tr>
<td>General updates.</td>
<td>Editorial updates only. No technical content updates.</td>
</tr>
<tr>
<td><strong>06/17/2021 Version 1.2</strong></td>
<td></td>
</tr>
<tr>
<td>General</td>
<td>Updated output logs.</td>
</tr>
<tr>
<td>Minimum System Requirements</td>
<td>Updated operating system requirements.</td>
</tr>
<tr>
<td>Qualified Servers</td>
<td>Removed section.</td>
</tr>
<tr>
<td>Installing the SmartSSD CSD</td>
<td>Added clarification.</td>
</tr>
<tr>
<td>XRT and Deployment/Development Platform Installation Procedures on RedHat and CentOS</td>
<td>Updated installation steps.</td>
</tr>
<tr>
<td>XRT and Deployment/Development Platform Installation Procedures on Ubuntu</td>
<td>Updated installation steps.</td>
</tr>
<tr>
<td>Known Issues/Limitations</td>
<td>Added a known issue related to shell image.</td>
</tr>
<tr>
<td>U.2 Platform</td>
<td>Updated platform related information.</td>
</tr>
<tr>
<td>Platform Naming Convention</td>
<td>Updated information.</td>
</tr>
<tr>
<td>Clocking</td>
<td>Added a note about FPGA clock throttling algorithm.</td>
</tr>
<tr>
<td>Available Resources After Platform Installation</td>
<td>Updated the total available value of resources.</td>
</tr>
<tr>
<td>Downgrading Packages</td>
<td>Added information about card and shell versions.</td>
</tr>
<tr>
<td>Appendix H: SmartSSD CSD Flat Shell</td>
<td></td>
</tr>
<tr>
<td>Appendix E: Miscellaneous Information and Settings</td>
<td>Added a note about the <code>sensors xilinx_u2_gen5x4_xdma_base_1_mgmt-pci-dd00</code> command.</td>
</tr>
<tr>
<td><strong>03/17/2021 Version 1.1</strong></td>
<td></td>
</tr>
<tr>
<td>General</td>
<td>• Updated output logs.<br>• Replaced the deprecated <code>xbmgmt list</code> command with <code>xbutil scan</code>.</td>
</tr>
<tr>
<td>Minimum System Requirements</td>
<td>Updated memory requirements.</td>
</tr>
<tr>
<td>Platform Naming Convention Prior to the 2020.2 Release</td>
<td>Removed section.</td>
</tr>
<tr>
<td>U.2 Platform</td>
<td>Added DFX-1RP and flat shell platform related information.</td>
</tr>
<tr>
<td>Appendix E: Miscellaneous Information and Settings</td>
<td>Updated the FPGA temperature and power threshold values.</td>
</tr>
<tr>
<td>Appendix H: SmartSSD CSD Flat Shell</td>
<td>Added appendix.</td>
</tr>
<tr>
<td><strong>11/10/2020 Version 1.0</strong></td>
<td></td>
</tr>
<tr>
<td>Initial release.</td>
<td>N/A</td>
</tr>
</tbody>
</table>
# Table of Contents

**Revision History**

**Chapter 1: Introduction**

**Chapter 2: Installation**
  Overview
  Card Installation Procedures
  Software Installation
  Card Bring-Up and Validation
  Next Steps
  Troubleshooting

**Chapter 3: Platform Notes**
  Overview
  Development Target Shell
  Platform Naming and Life Cycle
  U.2 Platform

**Appendix A: Changing XRT and Target Platform Versions**
  RedHat and CentOS
  Ubuntu

**Appendix B: Creating a Vault Repository for CentOS**

**Appendix C: Accessing the SmartSSD CSD through a Virtual Machine**

**Appendix D: Hot Plug Support on the SmartSSD CSD**

**Appendix E: Miscellaneous Information and Settings**

**Appendix F: FPGA-SSD DNA Pairing**

**Appendix G: Accessing FPGA DNA Primitive from Dynamic Region**

**Appendix H: SmartSSD CSD Flat Shell**
  Kernel Execution
  Flat Shell Kernel Migration

**Appendix I: Additional Resources and Legal Notices**
  Xilinx Resources
  Documentation Navigator and Design Hubs
  References
  Please Read: Important Legal Notices

# Chapter 1: Introduction

The Samsung SmartSSD® Computational Storage Drive (CSD), powered by Xilinx® FPGAs, is a PCI Express® compliant storage accelerator module that integrates a Xilinx FPGA and Samsung NVMe SSD (controller with storage media) together. It is designed to accelerate storage-intensive applications such as data compression, decompression, encryption, decryption, and filtering with standard NVMe SSD functions by enabling direct PCIe® peer-to-peer (P2P) transfers between NVMe SSD and FPGA-DDR (global memory).

This guide provides installation and functional details for the SmartSSD CSD storage accelerator module.

Note: Module and card are used interchangeably throughout this document.
# Chapter 2: Installation

## Overview

This chapter provides hardware and software installation procedures for the SmartSSD® CSD storage accelerator module and applies to the Vitis™ unified software platform release 2021.1 and later.

Different system configurations are available for running, developing, and debugging applications on your SmartSSD CSD accelerator module:

- **Running Applications**: To run accelerated applications, install a SmartSSD CSD module into a system as described in Card Installation Procedures along with the required deployment software to support running applications as described in Software Installation.

- **Developing Applications**: To develop FPGA accelerated applications, you must install the development software. See the Software Installation section, which describes the installation procedure for both a development target platform and the Vitis environment. This configuration does not need a SmartSSD CSD module installed and can be used for development and for debugging in emulation modes.

- **Running, Developing, and Debugging Applications**: By installing the SmartSSD CSD along with both the deployment and development software on a single machine, you can configure a system for developing and running accelerated applications. With the module installed, developers can debug applications in both emulation modes and on the hardware.

## Minimum System Requirements

The following table lists the minimum system requirements for running a SmartSSD CSD storage accelerator module.
### Table 1: Minimum System Requirements

<table>
<thead>
<tr>
<th>Component</th>
<th>Requirements</th>
</tr>
</thead>
<tbody>
<tr>
<td>PCI Express Subsystem</td>
<td>PCI Express 3.0 (or greater) compliant with NVMe U.2 bay available.<br>• System must support memory mapped I/O above 8 GB. This can be done by enabling Above 4G decoding in most standard BIOS. Above 4G decoding allows the user to enable or disable memory mapped I/O for a 64-bit PCIe device to 4G or greater address space.</td>
</tr>
<tr>
<td>BIOS Options</td>
<td>Above 4G decoding<sup>1</sup></td>
</tr>
<tr>
<td>Bay Power Supply</td>
<td>25W</td>
</tr>
<tr>
<td>Operating System</td>
<td>Linux, 64-bit:<br>• Ubuntu 18.04, 20.04<br>• CentOS 7.8, 7.9, 8.1, 8.2, 8.3<br>• RHEL 7.8, 7.9, 8.1, 8.2, 8.3, 8.4</td>
</tr>
<tr>
<td>OS Options</td>
<td>Reserve four PCIe bus numbers per U.2 slot for the SmartSSD CSD Surprise Add functionality. Add the following arguments in the /etc/default/grub file to boot the kernel to support Surprise Add<sup>1</sup>.<br><code>GRUB_CMDLINE_LINUX=&quot;pci=assign-busses,hpbussize=4&quot;</code><br><code>GRUB_CMDLINE_LINUX=&quot;realloc=on, hpmemsize=16G&quot;</code></td>
</tr>
<tr>
<td>Driver Installation</td>
<td>Linux inbox NVMe driver.</td>
</tr>
<tr>
<td>System Memory</td>
<td>For installations, a minimum of 8 GB plus application memory is required.</td>
</tr>
<tr>
<td>Internet Connection</td>
<td>Required for downloading drivers and utilities.</td>
</tr>
<tr>
<td>Hard Disk Space</td>
<td>Satisfy the minimum system requirements for your operating system.</td>
</tr>
<tr>
<td>Licensing</td>
<td>None required for application deployment. For the application development environment, see Vitis Unified Software Platform Documentation.</td>
</tr>
</tbody>
</table>

**Notes:**

1. Reserving PCIe bus numbers to support the Hot Plug Surprise Add functionality is possible through either the BIOS or OS options.
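If the OS route is used, one way to confirm that the running kernel actually received an argument is to inspect `/proc/cmdline` after rebooting. The following is a minimal sketch and not part of the Xilinx tooling: the function name `cmdline_has_option` is hypothetical, and it assumes the option appears either as a standalone kernel argument or as one comma-separated element of a `pci=` argument.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of XRT): succeeds when OPTION appears in the
# kernel command line CMDLINE, either as its own argument or as one
# comma-separated element of a "pci=" argument.
cmdline_has_option() {
    local cmdline=$1 option=$2
    printf '%s\n' "$cmdline" |
        tr ' ,' '\n\n' |        # one argument or list element per line
        sed 's/^pci=//' |       # drop the "pci=" prefix from the first element
        grep -qxF -- "$option"  # exact, literal, whole-line match
}

# Example: check the running kernel for the bus reservation option.
# if cmdline_has_option "$(cat /proc/cmdline)" "hpbussize=4"; then
#     echo "hpbussize=4 is active"
# fi
```

Edits to `/etc/default/grub` generally take effect only after the GRUB configuration is regenerated (for example with `update-grub` on Ubuntu or `grub2-mkconfig` on RedHat/CentOS) and the machine is rebooted, so a check like this is most useful after that reboot.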

## Card Interfaces and Details

The SmartSSD CSD module is passively cooled and is designed for installation into a server where controlled air flow provides direct cooling to the module. The module includes the following interfaces.

- U.2 compliant PCIe Gen3x4 connector

## Card Installation Procedures

To reduce the risk of fire, electric shock, or injury, always follow basic safety precautions.
CAUTION! You must always use an ESD strap or other antistatic device when handling hardware.

### Safety Instructions

#### Safety Information

To ensure your personal safety and the safety of your equipment:

- Keep your work area and the computer/server clean and clear of debris.
- Before opening the computer/system cover, unplug the power cord.

#### Electrostatic Discharge Caution

Electrostatic discharge (ESD) can damage electronic components when they are improperly handled, and can result in total or intermittent failures. Always follow ESD-prevention procedures when removing and replacing components.

To prevent ESD damage:

- Use an ESD wrist or ankle strap and ensure that it makes skin contact. Connect the equipment end of the strap to an unpainted metal surface on the chassis.
- Avoid touching the module against your clothing. The wrist strap protects components from ESD on the body only.
- Handle the module by its bracket or edges only. Avoid touching the printed circuit board or the connectors.
- Put the module down only on an antistatic surface such as the bag supplied in your kit.
- If you are returning the module to Xilinx Product Support, place it back in its antistatic bag immediately.

### Before You Begin

Note: SmartSSD CSD modules are delicate and sensitive electronic devices; equipment is to be installed by a qualified technician only. This equipment is intended for installation in a Restricted Access Location.

Check for module compatibility with the system. Also check for proper system requirements such as power, bus type, and physical dimensions to support the module.

### Installing the SmartSSD CSD

The following procedure is a guide for the SmartSSD CSD module installation. Consult your computer documentation for additional information.
**Note**: The output results/logs provided throughout this document are for example purposes only. When executing any given command, your output may or may not exactly match the text.

If you encounter any issues during installation, see **Troubleshooting** and **Known Issues/Limitations**.

The steps to install the SmartSSD CSD module are as follows.

1. Shut down the host computer and unplug the power cord.
2. Plug the SmartSSD CSD module into the U.2 PCIe Gen3x4 capable slot provided for NVMe drives.
3. Connect the power cord and turn ON the computer.

**WARNING!** Do not power ON a passively cooled module without adequate forced airflow across the module, otherwise the module can be damaged. This module can heat up after use in the server. Use caution when handling.

4. To verify that the device has been installed correctly, enter the following Linux command in the terminal:

   ```bash
   $ lspci | grep -i xilinx
   ```

   If the module is successfully installed and found by the operating system, `lspci` displays a message similar to the following, listing five PCIe devices.

   ```plaintext
   5e:00.0 PCI bridge: Xilinx Corporation Device 9134
   5f:00.0 PCI bridge: Xilinx Corporation Device 9234
   5f:01.0 PCI bridge: Xilinx Corporation Device 9434
   61:00.0 Memory controller: Xilinx Corporation Device 6987
   61:00.1 Memory controller: Xilinx Corporation Device 6988
   ```

5. To verify that the controller is properly enumerated, enter the following Linux command in the terminal.

   ```bash
   $ lspci | grep -i samsung
   ```

   If the PCIe device for the SSD controller is properly enumerated, a message similar to the following will be displayed.

   ```plaintext
   69:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a824
   ```

6. After steps 4 and 5 are successful, enter the following command in the terminal. The output of this command must show the enumerated SSD namespace. In this example, it is `nvme0n1`. Successful execution of this step indicates that enumeration/detection of the SmartSSD CSD module is complete.

   ```bash
   $ lsblk
   ```
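
The checks in steps 4 to 6 can also be scripted. The following is a minimal sketch under stated assumptions: the function names `count_pci_devices` and `list_nvme_namespaces` are hypothetical, the expected count of five Xilinx devices applies to a single installed module as described above, and NVMe namespaces are assumed to follow the usual `nvme<N>n<M>` naming seen in the `nvme0n1` example.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; each reads command output on stdin so the logic can be
# tried on saved text as well as on a live system.

# Count the lines of `lspci` output that mention VENDOR (case-insensitive).
count_pci_devices() {
    local vendor=$1
    grep -ci -- "$vendor"
}

# Print NVMe namespace names (for example nvme0n1) found in the output of
# `lsblk -dn -o NAME`, which lists one whole-disk device name per line.
list_nvme_namespaces() {
    grep -E '^nvme[0-9]+n[0-9]+$'
}

# Live usage, mirroring steps 4 to 6:
# [ "$(lspci | count_pci_devices xilinx)" -eq 5 ] || echo "expected five Xilinx devices"
# [ "$(lspci | count_pci_devices samsung)" -ge 1 ] || echo "SSD controller not enumerated"
# lsblk -dn -o NAME | list_nvme_namespaces
```

A server with other Samsung NVMe drives will also list their namespaces, so the namespace check alone does not identify which one belongs to the SmartSSD CSD.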
## Software Installation

This section details the procedures for installing the deployment/development software on RedHat/CentOS and Ubuntu operating systems. All software installations use standard Linux RPM and Linux DEB packages and require root access.

The deployment/development software consists of the following packages.

- **Xilinx® runtime (XRT):** XRT provides the libraries and drivers for an application to run on the SmartSSD CSD module.

- **Deployment/Development platform:** The deployment platform provides the base firmware needed to run pre-compiled applications. It cannot be used to compile or create new applications. To create new applications, install the development software detailed in Next Steps. While you can also install the development software on a machine with an installed module, doing so is not necessary to run applications.

Both the XRT and deployment/development platform installation packages can be downloaded from the SmartSSD Computational Storage Device Download page.

If you encounter any issues during installation, see Troubleshooting and Known Issues/Limitations.

*Note:* Root access is required for all software and firmware installations.

### XRT and Deployment/Development Platform Installation Procedures on RedHat and CentOS

Use the following steps to download and install the XRT and deployment/development platform using a .rpm installation package.

For details on upgrading or downgrading the XRT and deployment/development platform, see Appendix A: Changing XRT and Target Platform Versions.
Note: The installation packages referenced here are updated regularly and the file names frequently change. If you copy and paste any commands from this user guide, be sure to update the placeholders in those commands to match the downloaded packages.

1. XRT installation requires extra packages for enterprise Linux (EPEL) and a related repository. The initial setup depends on whether you are using RedHat or CentOS.
   For RedHat:
   a. Open a terminal window and enter the following command:
      
      ```shell
      $ sudo yum-config-manager --enable rhel-7-server-optional-rpms
      ```
      
      This enables an additional repository on your system.
   b. Enter the following command to install EPEL:
      
      ```shell
      ```

   For CentOS:
   a. Enter the following command in a terminal window:
      
      ```shell
      $ sudo yum install epel-release
      ```
      
      This installs and enables the repository for Extra Packages for Enterprise Linux (EPEL).

2. Run the following two commands to install kernel headers and kernel development packages. Ensure that `uname -r` is surrounded by backticks (`` ` ``) and not single quotes (`'`):

   ```shell
   $ sudo yum install kernel-headers-`uname -r`
   $ sudo yum install kernel-devel-`uname -r`
   ```

   Note: If these yum commands fail because they cannot find packages matching your kernel version, set up a Vault repository. For more information, see Appendix B: Creating a Vault Repository for CentOS.

3. After the previous commands complete, reboot your machine.

4. Download the XRT installation package corresponding to your OS and version from SmartSSD Computational Storage Device Download.

5. Install the XRT installation package by running the following command from within the directory where the XRT installation packages reside.

   ```shell
   $ sudo yum install ./xrt*.rpm
   ```

   This will install the XRT and its necessary dependencies. Follow the instructions when prompted throughout the installation.

6. Download and extract the deployment/development installation file based on your OS and version from SmartSSD Computational Storage Device Download.

   Extract the file into a single directory. The location of the directory is not important; however, the directory must not contain any other files.
7. Install the deployment/development packages. From within the directory where the installation packages were extracted, run the following commands. This will install all the deployment/development packages.

Part of development package:

```bash
sudo yum install xilinx-u2-gen3x4-xdma-gc-2-202110-1-dev*.rpm
```

Part of both deployment & development package:

```bash
sudo yum install xilinx-u2-gen3x4-xdma-gc-validate-2*.rpm
sudo yum install xilinx-u2-gen3x4-xdma-gc-base-2*.rpm
```

The deployment/development partition and firmware are installed in the `/opt/xilinx/firmware` directory, which contains the named partition and firmware sub-directories. After installing the deployment/development packages, flash the firmware to the module using the following command.

```bash
sudo /opt/xilinx/xrt/bin/xbmgmt program --base --device <xclmgmt BDF> --image <shell_name>
```

8. If multiple modules are installed on the server, the `xbmgmt program` command should be run separately for each module.

**Note:** When upgrading from `xilinx_samsung_u2x4_202010_1` platform to `xilinx_u2_gen3x4-xdma-gc_2_202110_1` platform, use the following command.

```bash
sudo /opt/xilinx/xrt/bin/xbmgmt program --base --image /opt/xilinx/firmware/u2/gen3x4-xdma-gc/base/partition.xsabin /opt/xilinx/firmware/u2/gen3x4-xdma-gc/base/partition.xsabin --flash-type spi --device <xclmgmt BDF>
```

Here the `--device` option takes the `xclmgmt` BDF (Bus device function), which can be determined by executing the following command.

```bash
lspci | grep -i xilinx
```

If you have multiple modules installed on the server, you must run the `xbmgmt program` command separately for each module because the command needs the card BDF to be explicitly specified.

If the module/card has already been upgraded, you will see a message similar to the following and no additional installation steps are necessary.

**Status:** Device(s) up-to-date and do not need to be flashed.

9. You will be asked to confirm the update. Type **y** and press the **Enter** key.

**Status:** shell needs updating
Current shell: <current_platform_name>
Shell to be flashed: <platform_to_be_flashed>
Are you sure you wish to proceed? [y/n]:

**Note:** Do not press Ctrl + c in the terminal while the firmware is flashing because this can cause the module to become inoperable.
Successful flashing of a new module results in the following message. If the command returns Card Not Found, perform a cold reboot, and retry. Otherwise, see Troubleshooting.

```
[0000:68:00.0] : Updating base (e.g., shell) flash image...
Bitstream guard installed on both flashes @0x801000
Extracting bitstream from MCS data:
.........
Extracted 9245382 bytes from bitstream @0x801000
Writing bitstream to flash 0:
.........
Successfully programmed flash address into icap controller ip
Extracting bitstream from MCS data:
.........
Extracted 9245382 bytes from bitstream @0x801000
Writing bitstream to flash 1:
.........
Bitstream guard removed from both flashes
INFO     : Base flash image has been programmed successfully.
----------------------------------------------------
Report
[0000:68:00.0] : Satellite Controller (SC) is either up-to-date, fixed, or not installed. No actions taken.
[0000:68:00.0] : Successfully flashed the base (e.g., shell) image
Device flashed successfully.
```

10. Cold boot the machine to load the new firmware image on the FPGA.

**Note:** Be sure to perform a cold boot to fully power OFF the machine and then power it ON again. The image will not boot from flash if the machine is only rebooted (that is, warm reboot).

The installation of the deployment/development package is now complete. You can go directly to Card Bring-Up and Validation to validate the installation.

**Note:** If your system has an older version of XRT installed, you can remove it using `$ yum remove xrt`.
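
Because `xbmgmt program` must be run once per module, it can help to generate the per-module commands from the `lspci` output and review them before running anything. The following is a minimal sketch: the function names `list_mgmt_bdfs` and `print_flash_commands` are hypothetical, and the sketch assumes that the management (`xclmgmt`) function is the Xilinx "Memory controller" entry with function number 0, as suggested by the `0000:68:00.0` device in the sample flash log. Confirm this on your own system before flashing.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for flashing several modules one at a time.

# Read `lspci` output on stdin and print the BDF of every Xilinx
# "Memory controller" entry whose function number is 0.
# ASSUMPTION: function 0 is the management (xclmgmt) function.
list_mgmt_bdfs() {
    awk 'tolower($0) ~ /memory controller: xilinx/ && $1 ~ /\.0$/ { print $1 }'
}

# Read BDFs on stdin and print (not run) one xbmgmt command per module.
print_flash_commands() {
    local shell_name=$1 bdf
    while IFS= read -r bdf; do
        printf 'sudo /opt/xilinx/xrt/bin/xbmgmt program --base --device %s --image %s\n' \
            "$bdf" "$shell_name"
    done
}

# Live usage (prints the commands for review instead of executing them):
# lspci | list_mgmt_bdfs | print_flash_commands "<shell_name>"
```

Printing the commands rather than executing them keeps the confirmation prompt and the cold boot that follows each flash under manual control.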

### XRT and Deployment/Development Platform Installation Procedures on Ubuntu

**Note:** When installing XRT on Ubuntu, if the 2015 version of *pyopencl* is installed on your system, you must uninstall it. The XRT installation will install the 2019 version of *pyopencl* and will return an error if the 2015 version is installed. For more information, see Xilinx Answer Record 73055.

Use the following steps to download and install the XRT and deployment/development platform using a .deb installation package.

For details on upgrading or downgrading the XRT and deployment/development platform, see Appendix A: Changing XRT and Target Platform Versions.

**Note:** The installation packages referenced here are updated regularly and the file names frequently change. If you copy and paste any commands from this user guide, be sure to update the placeholders in those commands to match the downloaded packages.
1. Download the XRT installation package corresponding to your OS and version from SmartSSD Computational Storage Device Download.

2. Install the XRT installation package by running the following command from within the directory where the XRT installation packages reside.

   ```
   $ sudo apt install ./xrt*.deb
   ```

   This will install the XRT along with any necessary dependencies. Follow the instructions when prompted throughout the installation.

3. Download and extract the deployment/development installation file based on your OS and version from SmartSSD Computational Storage Device Download.

   Extract the file into a single directory. The location of the directory is not important; however, the directory must not contain any other files.

4. Install the deployment/development packages. From within the directory where the installation packages were extracted, run the following commands. This will install all deployment/development packages.

   ```
   # Part of development package:
   sudo apt install ./xilinx-u2-gen3x4-xdma-gc-2-202110-1-dev*.deb
   # Part of both deployment & development package:
   sudo apt install ./xilinx-u2-gen3x4-xdma-gc-validate_2*.deb
   sudo apt install ./xilinx-u2-gen3x4-xdma-gc-base_2*.deb
   ```

   The deployment/development partition and firmware are installed in the `/opt/xilinx/firmware` directory, which contains the named partition and firmware subdirectories. After installing the deployment/development packages, flash the firmware to the SmartSSD CSD using the following command:

   ```
   Partition package installed successfully.
   Please flash card manually by running below command:
   sudo /opt/xilinx/xrt/bin/xbmgmt program --base --device <xclmgmt BDF> --image <shell_name>
   ```

   **Note:** The `xbutil` is being deprecated and will not be supported in future releases. It is recommended to use the `xbmgmt` command for the SmartSSD CSD module.

5. If multiple modules are installed on the server, the `xbmgmt program` command should be run separately for each module.

   **Note:** When upgrading from `xilinx_samsung_u2x4_202010_1` platform to `xilinx_u2_gen3x4-xdma-gc_2_202110_1` platform, use the following command.

   ```
   sudo /opt/xilinx/xrt/bin/xbmgmt program --base --image /opt/xilinx/firmware/u2/gen3x4-xdma-gc/base/partition.xsabin /opt/xilinx/firmware/u2/gen3x4-xdma-gc/base/partition.xsabin --flash-type spi --device <xclmgmt BDF>
   ```
Here the `--device` option takes the `xclmgmt` BDF (Bus device function) which can be determined by executing the following command.

```bash
lspci | grep -i xilinx
```

If you have multiple modules installed on the server, you must run the `xbmgmt program` command separately for each module because the command needs the card BDF to be explicitly specified.

If the module/card has already been upgraded, you will see a message similar to the following and no additional installation steps are necessary.

```
Status: Device(s) up-to-date and do not need to be flashed.
```

6. You will be asked to confirm the update. Type `y` and press the Enter key.

```
Status: shell needs updating
Current shell: <current_platform_name>
Shell to be flashed: <platform_to_be_flashed>
Are you sure you wish to proceed? [y/n]:
```

Flashing will take up to 10 minutes.

**Note:** Do not press `Ctrl + c` in the terminal while the firmware is flashing because this can cause the module to become inoperable.

The following message is the result of successfully flashing a new module. If the command returns `Card Not Found`, perform a cold reboot, and retry. Otherwise, see Troubleshooting.

```
[0000:68:00.0] : Updating base (e.g., shell) flash image...
Bitstream guard installed on both flashes @0x801000
Extracting bitstream from MCS data:
.............
Extracted 9245382 bytes from bitstream @0x801000
Writing bitstream to flash 0:
.............
Successfully programmed flash address into icap controller ip
Extracting bitstream from MCS data:
.............
Extracted 9245382 bytes from bitstream @0x801000
Writing bitstream to flash 1:
.............
Bitstream guard removed from both flashes
INFO : Base flash image has been programmed successfully.
----------------------------------------------------
Report
[0000:68:00.0] : Satellite Controller (SC) is either up-to-date, fixed, or not installed. No actions taken.
[0000:68:00.0] : Successfully flashed the base (e.g., shell) image

Device flashed successfully.
```

7. Cold boot the machine to load the new firmware image on the FPGA.

**Note:** Be sure to perform a cold boot to fully power off the machine and then power it on again. The image will not boot from flash if the machine is only rebooted.

The installation of the deployment/development package is now complete. You can go directly to Card Bring-Up and Validation to validate the installation.

**Note:** If your system has an older version of XRT installed, you can remove it using `$ sudo apt remove xrt`.
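
The flashing steps end in one of three outcomes described above: the module is already up to date, the flash succeeds and a cold boot is needed, or the command reports `Card Not Found`. The following is a minimal sketch that maps saved `xbmgmt program` output to the next action. The function name `classify_flash_log` and the file name `flash.log` are hypothetical, and the matched strings are taken from the example logs in this guide, which may not match your output exactly.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read saved `xbmgmt program` output on stdin and print
# the follow-up action suggested by this guide.
classify_flash_log() {
    local log
    log=$(cat)
    case $log in
        *"Device flashed successfully."*)
            echo "flashed: cold boot the machine" ;;
        *"up-to-date and do not need to be flashed"*)
            echo "up-to-date: no further installation steps" ;;
        *"Card Not Found"*)
            echo "card-not-found: cold reboot and retry" ;;
        *)
            echo "unknown: see Troubleshooting" ;;
    esac
}

# Usage on a saved log:
# classify_flash_log < flash.log
```

The matching is deliberately literal, so any wording change in a later XRT release falls through to the `unknown` branch instead of being misreported as success.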

---

## Card Bring-Up and Validation

After installing the XRT and deployment (or development) platform, the module installation can be verified using the following commands, which are explained in more detail in the following sections.

- `lspci`
- `lsblk`
- `FIO`
- `xbutil examine`
- `xbmgmt examine --report platform`
- `xbutil validate --device <xocl B.D.F>`
- Byte copy test kernel execution

Both the `lspci` and `lsblk` Linux commands are used to validate the module as seen by the OS, as was done when installing the module.

The `FIO` command can be run after installing Flexible I/O software on your server machine and is used to check the access to NVMe SSD.

The `xbmgmt` and `xbutil` utilities are included during the XRT package installation. These utilities include multiple commands to validate and identify the installed module(s) and report additional module details including DDR memory, PCIe®, platform name, and system information. See *Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)* for a detailed list of commands.

Set the environment to use the utilities by running the following command. Note that the command is dependent on the command shell you are using.

**Use the following command in csh shell.**

```
$ source /opt/xilinx/xrt/setup.csh
```

**Use the following command in bash shell.**

```
$ source /opt/xilinx/xrt/setup.sh
```
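
After sourcing the setup script, a quick way to confirm that the utilities are reachable is to look them up on `PATH`. The following is a minimal bash sketch; the function name `missing_tools` is hypothetical, and it assumes that the setup script makes `xbutil` and `xbmgmt` available on `PATH`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each named tool that cannot be found on PATH.
# Returns 0 only when every tool is found.
missing_tools() {
    local tool status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Usage in a bash shell:
# source /opt/xilinx/xrt/setup.sh
# missing_tools xbutil xbmgmt || echo "XRT utilities are not on PATH"
```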
### Running lspci and lsblk

1. Enter the following command to see the Xilinx PCIe devices.

```
$ sudo lspci -vd 10ee:
```

2. A message similar to the following is displayed if the module is successfully installed and found by the operating system.

```
5e:00.0 PCI bridge: Xilinx Corporation Device 9134 (prog-if 00 [Normal decode])
  Flags: bus master, fast devsel, latency 0
  Bus: primary=5e, secondary=5f, subordinate=61, sec-latency=0
  I/O behind bridge: b8a00000-b8cfffff
  Memory behind bridge: 0000383e00000000-0000383f040fffff
  Memory behind bridge: 0000383e00000000-0000383f040fffff
  Prefetchable memory behind bridge: 0000383e00000000-0000383f040fffff
  Prefetchable memory behind bridge: 0000383e00000000-0000383f040fffff
  Capabilities: [40] Power Management version 3
  Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
  Capabilities: [70] Express Upstream Port, MSI 00
  Capabilities: [100] Advanced Error Reporting
  Capabilities: [1c0] #19
  Kernel driver in use: pcieport
  Kernel modules: shpchp

5f:00.0 PCI bridge: Xilinx Corporation Device 9234 (prog-if 00 [Normal decode])
  Flags: bus master, fast devsel, latency 0
  Bus: primary=5f, secondary=60, subordinate=60, sec-latency=0
  I/O behind bridge: b8a00000-b8cfffff
  Memory behind bridge: 0000383e00000000-0000383f040fffff
  Capabilities: [40] Power Management version 3
  Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
  Capabilities: [70] Express Downstream Port (Slot+), MSI 00
  Capabilities: [100] Access Control Services
  Capabilities: [1c0] #19
  Kernel driver in use: pcieport
  Kernel modules: shpchp

5f:01.0 PCI bridge: Xilinx Corporation Device 9434 (prog-if 00 [Normal decode])
  Flags: bus master, fast devsel, latency 0
  Bus: primary=5f, secondary=61, subordinate=61, sec-latency=0
  I/O behind bridge: 00000000-00000000
  Prefetchable memory behind bridge: 0000383e00000000-0000383f040fffff
  Capabilities: [40] Power Management version 3
  Capabilities: [70] Express Downstream Port (Slot-), MSI 00
  Capabilities: [100] Access Control Services
  Capabilities: [1c0] #19
  Kernel driver in use: pcieport
  Kernel modules: shpchp

61:00.0 Memory controller: Xilinx Corporation Device 6987
  Subsystem: Xilinx Corporation Device 1351
  Flags: bus master, fast devsel, latency 0
  Memory at c?2000000000 [64-bit, prefetchable] [size=2MB]
  Memory at c?2004000000 [64-bit, prefetchable] [size=2MB]
  Capabilities: [40] Power Management version 3
  Capabilities: [60] MSI-X: Enable- Count=33 Masked-
  Capabilities: [70] Express Endpoint, MSI 00
  Capabilities: [100] Advanced Error Reporting
```
3. Enter the following command to see the Samsung NVMe SSD PCIe device.

```
$ sudo lspci -vs <nvme_pcie_dev_id>
```

For example, `sudo lspci -vs 60:00.0` displays output similar to the following:

```
60:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a824 (prog-if 02 [NVM Express])
  Subsystem: Samsung Electronics Co Ltd Device a801
  Flags: bus master, fast devsel, latency 0, IRQ 33
  Memory at b8a00000 (64-bit, non-prefetchable) [size=32K]
  Expansion ROM at <ignored> [disabled]
  Capabilities: [40] Power Management version 3
  Capabilities: [70] Express Endpoint, MSI 00
  Capabilities: [b0] MSI-X: Enable= Count=64 Masked-
  Capabilities: [100] Advanced Error Reporting
  Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
  Capabilities: [168] Alternative Routing-ID Interpretation (ARI)
  Capabilities: [178] #19
  Capabilities: [198] #26
  Capabilities: [1c0] #27
  Capabilities: [1e8] Single Root I/O Virtualization (SR-IOV)
  Capabilities: [3ad] #25
  Kernel driver in use: nvme
  Kernel modules: nvme
```

4. Enter the following command to see the properties of the Samsung NVMe SSD controller.

```
$ lsblk
```

```
NAME   MAJ:MIN  RM  SIZE RO TYPE MOUNTPOINT
sdb    8:16    0  238.5G  0 disk
sdb2   8:18    0  1K    0 part
sdb5   8:21    0  976M  0 part [SWAP]
sdb1   8:17    0  237.5G  0 part /
sda    8:0     0  931.5G  0 disk
sda2   8:2     0  1K    0 part
sda5   8:5     0  63.6G  0 part
sda1   8:1     0  868G  0 part
nvme0n1 259:0  0   3.5T  0 disk
```
If the `lspci` and/or `lsblk` outputs do not match, see Troubleshooting.
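
The two checks above can also be scripted. The following is a minimal Python sketch, not part of the Xilinx tooling, that scans captured `lspci` and `lsblk` text for Xilinx devices and NVMe disks. The function names (`xilinx_bdfs`, `nvme_disks`) and the abridged sample text are illustrative assumptions; on a real system the text would come from the commands shown in the steps above.

```python
import re

# Abridged sample text copied from the example outputs in this section.
SAMPLE_LSPCI = """\
5e:00.0 PCI bridge: Xilinx Corporation Device 9134 (prog-if 00 [Normal decode])
  Kernel driver in use: pcieport
61:00.0 Memory controller: Xilinx Corporation Device 6987
  Subsystem: Xilinx Corporation Device 1351
60:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a824
"""

SAMPLE_LSBLK = """\
NAME   MAJ:MIN  RM  SIZE RO TYPE MOUNTPOINT
sdb    8:16    0  238.5G  0 disk
sdb1   8:17    0  237.5G  0 part /
nvme0n1 259:0  0   3.5T  0 disk
"""


def xilinx_bdfs(lspci_text):
    """Return bus:device.function for each device header line naming Xilinx."""
    bdfs = []
    for line in lspci_text.splitlines():
        # Device header lines start in column 0; detail lines are indented.
        m = re.match(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$", line)
        if m and "Xilinx" in m.group(2):
            bdfs.append(m.group(1))
    return bdfs


def nvme_disks(lsblk_text):
    """Return the names of lsblk rows of TYPE 'disk' whose name starts with nvme."""
    names = []
    for line in lsblk_text.splitlines()[1:]:  # skip the column header
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("nvme") and fields[5] == "disk":
            names.append(fields[0])
    return names


print(xilinx_bdfs(SAMPLE_LSPCI))
print(nvme_disks(SAMPLE_LSBLK))
```

An empty list from either function would correspond to the "outputs do not match" case described above.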

**Running FIO**

Flexible I/O (FIO) is a disk I/O tool used to baseline SSD performance. After installing the software on the server machine (for example, with `sudo yum install fio` or `sudo apt-get install fio`), run FIO commands as shown in the following examples. Each example command runs for one minute (`--runtime=60`). Use `fio --help` to understand or modify any of the FIO parameters as necessary.

- **Random-Write command:**

```bash
fio --name=rand-write --ioengine=libaio --iodepth=256 --rw=randwrite --bs=4k --direct=1 --size=100% --numjobs=12 --runtime=60 --filename=/dev/nvme0n1 --group_reporting=1
```

```
rand-write: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
fio-2.2.10
Starting 12 processes
Jobs: 12 (f=12): [w(12)] [100.0% done] [0KB/2511MB/0KB /s] [0/643K/0 iops] [eta 00m:00s]
rand-write: (groupid=0, jobs=12): err=0: pid=3894: Fri Mar 20 17:31:14 2020
write: io=149439MB, bw=2490.5MB/s, iops=637551, runt=60005msec
  slat (usec): min=0, max=2326, avg=3.86, stdev=2.45
  clat (usec): min=615, max=12844, avg=4809.97, stdev=425.89
  lat (usec): min=671, max=12850, avg=4813.91, stdev=425.88
  percentiles (usec):
    | 1.00th=[ 4384], 5.00th=[ 4704], 10.00th=[ 4768], 20.00th=[ 4768],
    | 30.00th=[ 4768], 40.00th=[ 4768], 50.00th=[ 4768], 60.00th=[ 4768],
    | 70.00th=[ 4768], 80.00th=[ 4768], 90.00th=[ 4832], 95.00th=[ 4832],
    | 99.00th=[ 8032], 99.50th=[ 8640], 99.90th=[ 9024], 99.95th=[ 9280],
    | 99.99th=[ 9536]
  bw (KB /s): min=112856, max=315080, per=8.33%, avg=212543.43,
  stdev=8157.99
  lat (usec) : 750=0.01%, 1000=0.01%
  lat (msec) : 2=0.03%, 4=0.06%, 10=99.89%, 20=0.01%
  cpu          : usr=11.96%, sys=25.69%, ctx=5205881, majf=0, minf=118287
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
               : >=64=100.0%
  submit       : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
  complete     : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
  issued       : total=r=0/w=38256283/d=0, short=r=0/w=0/d=0,
  drop=r=0/w=0/d=0
  latency : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
  WRITE: io=149439MB, aggrb=2490.5MB/s, minb=2490.5MB/s, maxb=2490.5MB/s,
  mint=60005msec, maxt=60005msec

Disk stats (read/write):
  nvme0n1: ios=44/38193768, merge=0/0, ticks=0/182768396,
  in_queue=201898004, util=100.00%
```

- **Random-Read command:**

```
fio --name=rand-read --ioengine=libaio --iodepth=256 --rw=randread --bs=4k --direct=1 --size=100% --numjobs=12 --runtime=60 --filename=/dev/nvme0n1 --group_reporting=1
```

```
rand-read: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=256
...
fio-2.2.10
Starting 12 processes
Jobs: 12 (f=12): [r(12)] [100.0% done] [1798MB/0KB/0KB /s] [460K/0/0 iops] [eta 00m:00s]
rand-read: (groupid=0, jobs=12): err= 0: pid=3969: Fri Mar 20 17:33:55 2020
read : io=174809MB, bw=2913.2MB/s, iops=745764, runt= 60007msec
    slat (usec): min=0, max=2070, avg= 3.05, stdev= 1.56
    clat (usec): min=37, max=88007, avg=4112.07, stdev=2197.35
    lat (usec): min=41, max=88010, avg=4115.20, stdev=2197.41
    clat percentiles (usec):
    |  1.00th=[  628],  5.00th=[ 1176], 10.00th=[ 1608], 20.00th=[ 2288],
    | 30.00th=[ 2864], 40.00th=[ 3440], 50.00th=[ 3984], 60.00th=[ 4512],
    | 70.00th=[ 5088], 80.00th=[ 5600], 90.00th=[ 6496], 95.00th=[ 7008],
    | 99.00th=[12992], 99.50th=[15424], 99.90th=[17024], 99.95th=[18048],
    | 99.99th=[29312]
    bw (KB /s): min=106488, max=466448, per=8.33%, avg=248592.66, stdev=76171.88
    lat (usec) : 50=0.01%, 100=0.01%, 250=0.05%, 500=0.40%, 750=1.25%
    lat (usec) : 1000=0.14%, 2000=0.28%, 4000=0.56%, 8000=1.12%, 16000=2.25%
    lat (usec) : 32000=4.51%, 64000=9.03%, 128000=18.06%, 256000=36.12%
    lat (usec) : 512000=72.24%, 1024000=100.00%
    lat (msec) : 2=12.14%, 4=34.70%, 10=48.17%, 20=1.51%, 50=0.03%
    lat (msec) : 100=0.01%
    cpu          : usr=13.11%, sys=27.77%, ctx=21583274, majf=0, minf=15827
    IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
    >=64=100.0%
    submit      : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%
    complete    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%
    >=64=0.1%
    issued      : total=r=44751076/w=0/d=0, short=r=0/w=0/d=0,
    drop=r=0/w=0/d=0
    latency : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
    READ: io=174809MB, aggrb=2913.2MB/s, minb=2913.2MB/s, maxb=2913.2MB/s,
    mint=60007msec, maxt=60007msec

Disk stats (read/write):
    nvme0n1: ios=44699555/0, merge=0/0, ticks=18346424/0,
    in_queue=206110532, util=100.00%
```

- **Seq-Write command:**

```
fio --name=seq-write --ioengine=libaio --iodepth=64 --rw=write --bs=1024k --direct=1 --size=100% --numjobs=12 --runtime=60 --filename=/dev/nvme0n1 --group_reporting=1
```

```
seq-write: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=64
...
fio-2.2.10
Starting 12 processes
Jobs: 12 (f=12): [W(12)] [100.0% done] [0KB/3193MB/0KB /s] [0/3192/0
seq-write: (groupid=0, jobs=12): err= 0: pid=4050: Fri Mar 20 17:35:20 2020
  write: io=191090MB, bw=3171.2MB/s, iops=3171, runt= 60243msec
    slat (usec): min=59, max=1312, avg=258.71, stdev=67.88
    clat (msec): min=8, max=498, avg=241.77, stdev=15.25
    lat (msec): min=8, max=499, avg=242.03, stdev=15.24
    clat percentiles (msec):
    | 1.00th=[ 229], 5.00th=[ 233], 10.00th=[ 235], 20.00th=[ 237],
    | 30.00th=[ 239], 40.00th=[ 241], 50.00th=[ 241], 60.00th=[ 243],
    | 70.00th=[ 245], 80.00th=[ 247], 90.00th=[ 251], 95.00th=[ 255],
    | 99.00th=[ 262], 99.50th=[ 277], 99.90th=[ 420], 99.95th=[ 449],
    | 99.99th=[ 482]
    bw (KB /s): min=248365, max=410826, per=8.34%, avg=270936.65,
    lat (msec) : 10=0.01%, 20=0.03%, 50=0.11%, 100=0.11%, 250=86.87%
    cpu          : usr=3.53%, sys=3.87%, ctx=185996, majf=0, minf=218
    IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.2%,
    >=64=99.6%
    submit       : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
    complete     : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
    issued       : total=r=0/w=191090/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
    latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: io=191090MB, aggrb=3171.2MB/s, minb=3171.2MB/s, maxb=3171.2MB/s,
  mint=60243msec, maxt=60243msec

Disk stats (read/write):
  nvme0n1: ios=44/1526654, merge=0/0, ticks=4/36093220,
  in_queue=362539680, util=100.00%
```

- **Seq-Read command:**

```
fio --name=seq-read --ioengine=libaio --iodepth=64 --rw=read --bs=1024k --direct=1 --size=100% --numjobs=12 --runtime=60 --filename=/dev/nvme0n1 --group_reporting=1
```

```
seq-read: (g=0): rw=read, bs=1M-1M/1M-1M, ioengine=libaio, iodepth=64
fio-2.2.10
Starting 12 processes
Jobs: 12 (f=12): [R(12)] [100.0% done] [1771MB/0KB/0KB /s] [1771/0/0 iops] [eta 00m:00s]
seq-read: (groupid=0, jobs=12): err= 0: pid=4123: Fri Mar 20 17:36:55 2020
  read : io=196587MB, bw=3253.1MB/s, iops=3253, runt= 60416msec
    slat (usec): min=42, max=873, avg=146.05, stdev=64.68
    clat (msec): min=29, max=835, avg=235.76, stdev=37.41
    lat (msec): min=29, max=836, avg=235.91, stdev=37.40
    clat percentiles (msec):
    | 1.00th=[ 221], 5.00th=[ 225], 10.00th=[ 227], 20.00th=[ 229],
    | 30.00th=[ 231], 40.00th=[ 233], 50.00th=[ 231], 60.00th=[ 233],
    | 70.00th=[ 235], 80.00th=[ 237], 90.00th=[ 239], 95.00th=[ 243],
    | 99.00th=[ 247], 99.50th=[ 249], 99.90th=[ 725], 99.95th=[ 775],
    | 99.99th=[ 824]
    bw (KB /s): min=44122, max=420207, per=8.32%, avg=277209.59,
    stdev=30837.04
    lat (msec) : 50=0.15%, 100=0.10%, 250=96.99%, 500=2.45%, 750=0.23%
    lat (msec) : 1000=0.08%
    cpu          : usr=0.21%, sys=4.23%, ctx=188129, majf=0, minf=196789
    IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.2%,
    >=64=99.6%
    submit       : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
    >=64=0.0%
    complete     : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
    >=64=0.0%
    issued       : total=r=196587/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/
    latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  READ: io=196587MB, aggrb=3253.1MB/s, minb=3253.1MB/s, maxb=3253.1MB/s,
  mint=60416msec, maxt=60416msec

Disk stats (read/write):
  nvme0n1: ios=1569192/0, merge=0/0, ticks=359879568/0,
  in_queue=362147904, util=100.00%
```
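
When comparing several runs, it can help to pull the headline numbers out of the logs rather than read them by eye. The following Python sketch is an illustration only: it assumes the fio 2.x text layout shown in the logs above (`read :` / `write:` summary lines with `bw=` in MB/s), and the function names are hypothetical. It also cross-checks bandwidth against IOPS for the 4 KB random tests, treating MB as 2^20 bytes, which is what the logged numbers are consistent with.

```python
import re

# Summary lines copied from the random-write and random-read logs above.
SAMPLE = """\
rand-write: (groupid=0, jobs=12): err=0: pid=3894: Fri Mar 20 17:31:14 2020
write: io=149439MB, bw=2490.5MB/s, iops=637551, runt=60005msec
rand-read: (groupid=0, jobs=12): err= 0: pid=3969: Fri Mar 20 17:33:55 2020
read : io=174809MB, bw=2913.2MB/s, iops=745764, runt= 60007msec
"""

SUMMARY_RE = re.compile(
    r"^\s*(read|write)\s*:\s*io=\S+,\s*bw=([\d.]+)MB/s,\s*iops=(\d+)",
    re.MULTILINE,
)


def parse_fio_summary(text):
    """Return a list of (direction, bandwidth in MB/s, IOPS) tuples."""
    return [
        (m.group(1), float(m.group(2)), int(m.group(3)))
        for m in SUMMARY_RE.finditer(text)
    ]


def bw_from_iops(iops, block_size_kb):
    """Bandwidth in MB/s implied by an IOPS figure and a block size in KB."""
    return iops * block_size_kb / 1024


for direction, bw, iops in parse_fio_summary(SAMPLE):
    implied = bw_from_iops(iops, 4)  # the random tests use --bs=4k
    print(f"{direction}: reported {bw} MB/s, implied by IOPS {implied:.1f} MB/s")
```

A large gap between the reported and implied bandwidth would suggest that the block size assumed in the check does not match the `--bs` value used for the run.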

**Note:** You can use the standard NVMe commands to check the NVMe SSD device temperature when running FIO or any acceleration application, for example, `$ nvme smart-log /dev/nvme0n1`. This command must be run as root, and the NVMe CLI may need to be installed.

```text
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning                    : 0
temperature                         : 35 C
available_spare                     : 100%
available_spare_threshold           : 10%
percentage_used                    : 0%
data_units_read                     : 7,35,30,201
data_units_written                  : 44,37,21,492
host_read_commands                  : 1,68,55,846
host_write_commands                 : 21,16,45,54,518
controller_busy_time                : 1,403
power_cycles                        : 2,402
power_on_hours                      : 3,556
unsafe_shutdowns                    : 2,329
media_errors                        : 0
num_err_log_entries                 : 0
Warning Temperature Time            : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1                : 35 C
Temperature Sensor 2                : 34 C
Temperature Sensor 3                : 32 C
Temperature Sensor 4                : 0 C
Temperature Sensor 5                : 0 C
Temperature Sensor 6                : 0 C
Temperature Sensor 7                : 0 C
Temperature Sensor 8                : 0 C
```
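
For unattended runs, the temperature fields can be extracted from the `nvme smart-log` text. The Python sketch below is a hedged illustration based only on the log layout shown above. The function names and the `LIMIT_C` threshold are hypothetical and are not taken from the module specification; use the limits from your own thermal requirements.

```python
import re

# Lines copied from the smart-log example above (abridged).
SAMPLE_SMART_LOG = """\
critical_warning                    : 0
temperature                         : 35 C
Warning Temperature Time            : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1                : 35 C
Temperature Sensor 2                : 34 C
"""

LIMIT_C = 70  # hypothetical alert threshold, NOT a documented device limit

TEMP_RE = re.compile(r"^\s*(temperature|Temperature Sensor \d+)\s*:\s*(\d+)\s*C\s*$")


def temperatures_c(smart_log_text):
    """Map each temperature field name to its value in degrees Celsius."""
    temps = {}
    for line in smart_log_text.splitlines():
        m = TEMP_RE.match(line)
        if m:
            temps[m.group(1)] = int(m.group(2))
    return temps


def over_limit(temps, limit_c=LIMIT_C):
    """Return only the fields whose reading exceeds the given limit."""
    return {name: value for name, value in temps.items() if value > limit_c}


temps = temperatures_c(SAMPLE_SMART_LOG)
print(temps)
print(over_limit(temps))
```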

If `fio` does not run or fails with an error, see **Troubleshooting**.

**Running xbutil examine**

The `xbutil examine` command confirms that the SmartSSD CSD module is available for use by XRT for accelerating any application kernel. The following example shows the devices ready for use under the `Devices present` field.

```bash
[root@localhost bin]# xbutil examine
System Configuration:
  OS Name              : Linux
  Release              : 3.10.0-1127.el7.x86_64
  Version              : #1 SMP Tue Mar 31 23:36:51 UTC 2020
  Machine              : x86_64
  CPU Cores            : 24
  Memory               : 63575 MB
  Distribution         : CentOS Linux 7 (Core)
  GLIBC                : 2.17
  Model                : Super Server

XRT:
  Version              : 2.12.393
  Branch               : 2021.2
  Hash                 : 17897d64afb1868416c51f669e581e311cc0f14b
  Hash Date            : 2021-09-22 02:24:23
  XOCL                 : 2.12.393, 17897d64afb1868416c51f669e581e311cc0f14b
  XCLMGMT              : 2.12.393, 17897d64afb1868416c51f669e581e311cc0f14b

Devices present:
  [0000:68:00.1] : xilinx_u2_gen3x4_xdma_gc_base_2 user(inst=128)
```

If the `xbutil examine` output does not match, see Troubleshooting.
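
The BDF listed under `Devices present` is the value later passed to `xbutil validate --device`. As a rough sketch, assuming the text layout shown in the example above, the following Python snippet extracts the BDF and platform name from captured `xbutil examine` output. The function name `devices_present` is illustrative and not part of XRT.

```python
import re

# Tail of the xbutil examine example output above.
SAMPLE_EXAMINE = """\
XRT:
  Version              : 2.12.393
  Branch               : 2021.2

Devices present:
  [0000:68:00.1] : xilinx_u2_gen3x4_xdma_gc_base_2 user(inst=128)
"""

DEVICE_RE = re.compile(r"^\s*\[([0-9a-fA-F:.]+)\]\s*:\s*(\S+)")


def devices_present(examine_text):
    """Return (bdf, platform_name) for each entry after 'Devices present'."""
    devices = []
    in_section = False
    for line in examine_text.splitlines():
        if line.strip().startswith("Devices present"):
            in_section = True
            continue
        if in_section:
            m = DEVICE_RE.match(line)
            if m:
                devices.append((m.group(1), m.group(2)))
    return devices


print(devices_present(SAMPLE_EXAMINE))
```

An empty result corresponds to the "0 usable device" condition covered in Troubleshooting.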

**Running xbmgmt examine --report platform**

Use the `xbmgmt examine --report platform` command to view and validate the module's current firmware version and to display the details of the installed module, including the module BDF, platform name, and platform UUID (checksum).

1. Enter the following command:

   ```bash
   $ sudo /opt/xilinx/xrt/bin/xbmgmt examine --report platform
   ```

   For each module in the server, an output similar to the following example is displayed:

   ```bash
   [root@localhost bin]# xbmgmt examine --report platform
   1/1 [0000:68:00.0] : xilinx_u2_gen3x4_xdma_gc_base_2
   --------------------
   Flash properties
   Type          : spi
   Serial Number : N/A

   Device properties
   Type          : N/A
   Name          : N/A
   Config Mode   : 0
   Max Power     : N/A

   Flashable partitions running on FPGA
   Platform             : xilinx_u2_gen3x4_xdma_gc_base_2
   SC Version           : 0.0.0
   Platform UUID        : 8E0FAC72-0D9F-B708-058B-D2291BDEDF3A
   Interface UUID       : EA4824D1-F858-8723-EACA-0BA4DDF76339

   Flashable partitions installed in system
   Platform             : xilinx_u2_gen3x4_xdma_gc_base_2
   SC Version           : N/A
   Platform UUID        : 8E0FAC72-0D9F-B708-058B-D2291BDEDF3A
   ```

In this example, the BDF is 0000:68:00.0.

**Note:** `SC Version : N/A` can be ignored.

The name of the platform and associated ID running on the FPGA are found under `Flashable partitions running on FPGA`, while the ones installed in the system are found under `Flashable partitions installed in system`.

In the previous output example, the platform on the FPGA and system are identical; the deployment (or development) platform is named `xilinx_u2_gen3x4_xdma_gc_base_2` and the ID/timestamp is 0x8E0FAC720D9FB708.

2. Verify that the deployment (or development) platform version installed on the FPGA is identical to that installed on the system. You can do this by making sure the `Platform UUID` values under `Flashable partitions running on FPGA` and `Flashable partitions installed in system` are identical.

If these versions do not match, see Troubleshooting.
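
The UUID comparison in step 2 can be sketched in Python as follows. This is an illustration under the assumption that the report keeps the layout shown above; the function names are hypothetical. The `short_id` helper reflects an observation from this example only: the ID/timestamp quoted above equals the first 16 hexadecimal digits of the Platform UUID.

```python
# Relevant part of the xbmgmt examine --report platform example above.
SAMPLE_REPORT = """\
Flashable partitions running on FPGA
  Platform             : xilinx_u2_gen3x4_xdma_gc_base_2
  SC Version           : 0.0.0
  Platform UUID        : 8E0FAC72-0D9F-B708-058B-D2291BDEDF3A
  Interface UUID       : EA4824D1-F858-8723-EACA-0BA4DDF76339

Flashable partitions installed in system
  Platform             : xilinx_u2_gen3x4_xdma_gc_base_2
  SC Version           : N/A
  Platform UUID        : 8E0FAC72-0D9F-B708-058B-D2291BDEDF3A
"""


def platform_uuids(report_text):
    """Map each 'Flashable partitions ...' section title to its Platform UUID."""
    uuids = {}
    section = None
    for line in report_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Flashable partitions"):
            section = stripped
        elif section and stripped.startswith("Platform UUID"):
            uuids[section] = stripped.split(":", 1)[1].strip()
    return uuids


def platforms_match(report_text):
    """True when both sections are present and report the same Platform UUID."""
    uuids = platform_uuids(report_text)
    return len(uuids) == 2 and len(set(uuids.values())) == 1


def short_id(uuid):
    """First 16 hex digits of the UUID, formatted like the ID quoted above."""
    return "0x" + uuid.replace("-", "")[:16]


print(platforms_match(SAMPLE_REPORT))
print(short_id("8E0FAC72-0D9F-B708-058B-D2291BDEDF3A"))
```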

**Running xbutil validate**

The xbutil validate command validates the correct installation by performing the following set of tests:

1. Validates the device found.
2. Checks PCIe link status.
3. Runs a verify kernel on the module.
4. Performs the following data bandwidth tests:
   a. DMA test - Data transfer between host and FPGA DDR memory through PCIe.
   b. DDR test - Data transfer between kernels and FPGA DDR memory (device memory bandwidth test in the following xbutil command output log).

The validate command has the following format. Pass the BDF of the card to validate (`card_bdf`, the xocl physical function B.D.F) as the argument to the `--device` option.

`xbutil validate --device <card_bdf>`

Run the following validate command.

```
$ /opt/xilinx/xrt/bin/xbutil validate --device <card_bdf>
```

If the module was installed correctly, you will see a high-level summary of the tests performed similar to the following output. If the output displayed is not similar to the following, see Troubleshooting.

```
Starting validation for 1 devices

Validate Device : [0000:8d:00.1]
  Platform : xilinx_u2_gen3x4_xdma_gc_base_2
  SC Version : 0.0.0
  Platform ID : 625B99FA-75B5-6D83-53FF-2A7A999C8BBB

---
Test 1 [0000:8d:00.1] : PCIE link
  Test Status : [PASSED]
---
Test 2 [0000:8d:00.1] : SC version
  Test Status : [PASSED]
---
Test 3 [0000:8d:00.1] : Verify kernel
  Test Status : [PASSED]
---
Test 4 [0000:8d:00.1] : DMA
  Details : Host -> PCIe -> FPGA write bandwidth = 3423.9 MB/s
            Host <- PCIe <- FPGA read bandwidth = 3314.4 MB/s
  Test Status : [PASSED]
---
Test 5 [0000:8d:00.1] : iops
  Details : IOPs: 150790 (hello)
  Test Status : [PASSED]
---
Test 6 [0000:8d:00.1] : Bandwidth kernel
  Details : Maximum throughput: 15419 MB/s
  Test Status : [PASSED]
---
[PASSED] : < 6s >
Test 7 [0000:8d:00.1] : Peer to peer bar
  Details : bank0 validated
  Test Status : [PASSED]
---
Test 8 [0000:8d:00.1] : vcu
  Validation completed. Please run the command '--verbose' option for more details.
```

**Running Byte Copy Test Kernel**

A byte copy test kernel is provided in the reference design files to validate peer-to-peer (P2P) transfers using the SmartSSD CSD module. This kernel exercises all possible data paths through the PCIe switch, including P2P transfers by the NVMe SSD.

This test exercises two sub-sequences:

1. P2P Read
2. P2P Write

During the P2P read sequence, the data flows from the SSD -> FPGA DDR -> Byte Copy Read (from FPGA DDR) -> Byte Copy Write (into FPGA DDR) -> Host DDR.

**Note:** The flow of data from SSD to FPGA DDR is called P2P read. This is iteration 0 in the following log.

During the P2P write sequence, the data flows from Host DDR -> FPGA DDR -> Byte Copy Read (from FPGA DDR) -> Byte Copy Write (into FPGA DDR) -> SSD.

**Note:** The flow of data from FPGA DDR to SSD is called P2P write. This is iteration 1 in the following log.

Run the test using the following script.

```
./run_async_bytecopy.sh
iteration 0
INFO: Successfully opened NVME SSD /dev/nvme1n1
Detected 1 devices, using the 0th one
INFO: Importing ./bytecopy.xclbin
INFO: Loaded file
INFO: Created Binary
INFO: Built Program
INFO: Preparing 131072KB test data in 32 pipelines
INFO: Kick off test
SSD -> FPGA(p2p BO) -> FPGA(host BO) -> HOST
overall        72708us   100.00%    1760.47MB/s
   p2p         39827us   54.78%    3213.90MB/s
INFO: Evaluating test result
INFO: Test passed
iteration 1
INFO: Successfully opened NVME SSD /dev/nvme1n1
Detected 1 devices, using the 0th one
INFO: Importing ./bytecopy.xclbin
INFO: Loaded file
INFO: Created Binary
INFO: Built Program
INFO: Preparing 131072KB test data in 32 pipelines
INFO: Kick off test
HOST -> FPGA(host BO) -> FPGA(p2p BO) -> SSD
overall        97844us   100.00%    1308.20MB/s
   p2p         53183us   54.35%    2406.78MB/s
INFO: Evaluating test result
INFO: Test passed
iteration 2
INFO: Successfully opened NVME SSD /dev/nvme1n1
Detected 1 devices, using the 0th one
INFO: Importing ./bytecopy.xclbin
INFO: Loaded file
INFO: Created Binary
INFO: Built Program
INFO: Preparing 131072KB test data in 32 pipelines
INFO: Kick off test
SSD -> FPGA(p2p BO) -> FPGA(host BO) -> HOST
   overall     75113us  100.00%  1704.10MB/s
   p2p         40916us  54.47%  3128.36MB/s
INFO: Evaluating test result
INFO: Test passed
```

**Note:** The byte copy application test is primarily a functional test and has not been optimized for performance. You may see variations in the performance numbers from iteration to iteration. Byte copy test sources are available at: Xilinx GitHub.
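
The throughput figures in the log follow directly from the test size and the elapsed time. The short Python sketch below reproduces them as a worked example. It assumes that the 131072 KB reported by the test is 128 MB with 1 MB = 1024 KB, which is the interpretation that matches the logged values; the function name is illustrative.

```python
def throughput_mb_s(size_kb, elapsed_us):
    """Throughput in MB/s for size_kb kilobytes moved in elapsed_us microseconds."""
    size_mb = size_kb / 1024          # 131072 KB -> 128 MB
    elapsed_s = elapsed_us / 1_000_000
    return size_mb / elapsed_s


# Iteration 0 (P2P read) figures from the log above.
print(round(throughput_mb_s(131072, 39827), 2))  # p2p stage
print(round(throughput_mb_s(131072, 72708), 2))  # overall
```

The p2p stage moves the same amount of data in roughly half of the overall time, which is why its reported throughput is about twice the overall figure.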

Next Steps

If you are an application developer who wants to develop and deliver accelerated applications, install the Vitis™ software platform. It allows you to develop, debug, and optimize accelerated applications for the SmartSSD CSD module.

For more information about getting started with the Vitis software platform, installation instructions, and complete details on the development flow, see Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393).

Troubleshooting

The following table lists potential issues, causes, and fixes related to module installation.

Table 2: Module Troubleshooting

<table>
<thead>
<tr>
<th>Issue</th>
<th>Potential Cause</th>
<th>Fix</th>
</tr>
</thead>
<tbody>
<tr>
<td>Device not present/0 card(s) found in <code>xbutil examine</code>.</td>
<td>Module not correctly installed.</td>
<td>Reinstall the module following the installation instructions. Follow the module bring-up steps to check if the module shows up.</td>
</tr>
<tr>
<td></td>
<td>Module not compatible with server.</td>
<td>Check if the server is a qualified server.</td>
</tr>
<tr>
<td></td>
<td>Kernel version is incompatible.</td>
<td>Run <code>uname -r</code> to check the kernel version. Ensure that the kernel version matches the version listed for your OS in Software Installation.</td>
</tr>
<tr>
<td>Device present but <code>xbutil examine</code> shows 0 usable device.</td>
<td>XRT driver has not loaded successfully, or the module is not flashed successfully.</td>
<td>Reload XRT drivers using the following commands:<br>
<code>sudo rmmod xclmgmt</code><br>
<code>sudo rmmod xocl</code><br>
<code>sudo modprobe xclmgmt</code><br>
<code>sudo modprobe xocl</code><br>
If <code>xbutil examine</code> continues to have the same issue, then perform a cold-reboot.</td>
</tr>
<tr>
<td>Sudden unexpected reboot or loss of PCIe devices (<code>lspci</code>) or NVMe devices (<code>lspci</code> or <code>lsblk</code>).</td>
<td>Module may be overheating.<br>
<strong>Note:</strong> The <code>xbutil query --legacy</code> or <code>xbutil examine -r all --device &lt;xocl PF BDF&gt;</code> command can be used to track the FPGA operating temperature, and the <code>nvme smart-log</code> command can be used to track the NVMe SSD operating temperature.</td>
<td>Ensure that operating ambient conditions do not exceed specifications and there is adequate cooling for the module to function properly.</td>
</tr>
<tr>
<td>Read issued at some DDR memory location results in sudden reboot or some kind of PCIe or NVMe SSD I/O error.<br>
Running application suddenly crashes with some kind of PCIe error/machine reboot or some kind of NVMe error.</td>
<td>Uninitialized DDR location read can cause this behavior.</td>
<td>The DDR controller’s ECC function requires a write to the memory location prior to performing a read. Therefore, ensure that the user application does not read a DDR location that was never written; that is, P2P or non-P2P buffers in DDR must be written or initialized prior to read.</td>
</tr>
<tr>
<td><code>xbmgmt --program</code> returns the error:<br>
Specified DSA/XSA is not applicable</td>
<td>Correct type of deployment platform package not installed.</td>
<td>Install the correct type of deployment platform package.</td>
</tr>
<tr>
<td>Flashing the module does not complete after 20 minutes.</td>
<td>The flash operation has failed.</td>
<td>Perform cold-reboot and then re-flash the module.</td>
</tr>
<tr>
<td>XRT installation incomplete or unsuccessful.</td>
<td>Missing dependent packages.</td>
<td>Contact your Linux administrator.</td>
</tr>
<tr>
<td>Deployment platform installation incomplete or unsuccessful.</td>
<td>Missing dependent packages.</td>
<td>Contact your Linux administrator.</td>
</tr>
<tr>
<td>Unable to install packages on RedHat and CentOS.</td>
<td>Incorrect permissions for download directory, for example, <code>/home/</code> directory.</td>
<td>Download the packages to a directory where root has read access (for example, <code>/tmp</code>). Use the full path to the RPM package when installing. <code>yum</code> will fail with a relative path to the RPM package.</td>
</tr>
<tr>
<td>Run time fails with the following message:<br>
<strong>Error:</strong> Failed to find Xilinx platform</td>
<td>Failed to source the <code>setup.sh</code> script.</td>
<td>Source <code>/opt/xilinx/xrt/setup.sh</code></td>
</tr>
<tr>
<td>XRT package fails to install on CentOS7.8, CentOS7.9, CentOS8.0, CentOS8.1, CentOS8.2</td>
<td>Kernel development headers are missing. The XRT package is missing a dependency on <code>kernel-devel</code> and <code>kernel-headers</code>.</td>
<td>Manually install <code>kernel-devel</code> and <code>kernel-headers</code> with <code>yum install</code>:<br>
<code>$ sudo yum install kernel-devel</code><br>
<code>$ sudo yum install kernel-headers</code><br>
<code>$ `uname -r`</code><br>
<strong>Note:</strong> Do not run <code>sudo yum upgrade</code>. This will update the kernel-headers to an incompatible version.</td>
</tr>
<tr>
<td>When installing the XRT, you see the following message:<br>
N: Can't drop privileges for downloading as file '/root/xrt.201802.2.1.179_16.04.deb' couldn't be accessed by user '_apt'. - pkgAcquire::Run (13: Permission denied)</td>
<td>This is caused by running <code>sudo apt install</code> as root.</td>
<td>The XRT will install correctly, despite the error. You can find more information about this error on Ask Ubuntu.</td>
</tr>
</tbody>
</table>

For help with additional issues, contact Xilinx customer support.

### Known Issues/Limitations

The following table lists known issues. See Xilinx Answer Record 75177 for additional known issues.

### Table 3: Known Issues

<table>
<thead>
<tr>
<th>Area</th>
<th>Description</th>
<th>Comments/Recommendations</th>
</tr>
</thead>
<tbody>
<tr>
<td>General</td>
<td>The module is not present when running <code>xbutil</code> or <code>lspci</code>. The module may not have been ready when the server enumerated PCI Express.</td>
<td>Potential fix: Warm reboot the server and disable Fast Boot.</td>
</tr>
<tr>
<td>General</td>
<td>The SmartSSD CSD module has not trained to the full expected PCI Express link width or link speed.</td>
<td>Ensure that the SmartSSD CSD module is plugged into a Gen 3x4 or higher capable slot. Then cold reboot and see if the module enumerates to the correct settings.</td>
</tr>
<tr>
<td>Reset to Factory (Golden) Image</td>
<td>The Rev 1.2 PoC version of the SmartSSD CSD module does not come with a write-protected factory (golden) image. This means that <code>xbmgmt flash --factory_reset --legacy</code> and <code>xbmgmt program --revert-to-golden</code> are currently not supported.</td>
<td>Support for this feature has been added in the latest revision of the card (Rev FS and future revs). Refer to Xilinx Answer Record 75177 for details on the latest revision of the card and the corresponding shell that needs to be flashed on this card.</td>
</tr>
<tr>
<td>General</td>
<td>The SmartSSD CSD U.2 platform does not support PLRAM memory for accelerator (Vitis accelerated kernel) uses.</td>
<td>The PLRAM is not supported for performing a DMA operation to/from the host. So, generating a kernel with PLRAM (by using v++ options such as sptag set to &quot;plram&quot; or sptag set to &quot;all&quot;) and using it for regular acceleration is not supported.</td>
</tr>
<tr>
<td></td>
<td>Changing the predefined target flash address offset of the shell image (mcs) can leave the card in an unknown state.</td>
<td>For Rev FS and future revs, the shell image offset location in flash must not be modified in the generated shell or kernel images (.mcs files) before programming them to the flash.</td>
</tr>
</tbody>
</table>

Chapter 3

Platform Notes

Overview

The Vitis™ core development kit provides verified platforms defining all the required hardware and software interfaces (shown in gray in the following figure), allowing you to design custom acceleration applications (shown in white) that are easily integrated into the Vitis programming model.

Figure 1: Platform Overview

The SmartSSD® CSD target platform (shell) consists of a static region and dynamic (user programmable) region. The static region of the platform provides the basic infrastructure for the module to communicate with the host and hardware support for the kernel. It includes the following main features.

• **Three-Port PCIe Switch (PCIe):** The PCIe® switch upstream port connects to the host PCIe link, and the downstream port connects to the PCIe link of the Samsung NVMe SSD controller (embedded on-board). This gives the host transparent access to the NVMe SSD with minimal added latency and without affecting SSD I/O performance. The embedded endpoint port of the PCIe switch connects internally to the XDMA IP and enables the acceleration function through the FPGA.

• **Base Clocking and Reset (BCR):** Basic clocking and reset for module bring-up and operation.

• **Isolation Logic Structure (ILS):** Reset and partial reconfiguration isolation structures, which are required for isolating static region logic from dynamic region logic during the partial bitstream download.

• **Card Management Controller (CMC):** Responsible for module power and FPGA temperature monitoring as well as power and temperature driven kernel clock throttling.

• **Embedded Run Time Scheduler (ERT):** Schedules and monitors compute units during kernel execution.

*Figure 2: Target Platform Shell Dynamic and Static Regions*

Acceleration kernels go into the dynamic region. The features and resources available for accelerated kernels are described in **U.2 Platform**.

The partitioning between the static and dynamic regions can lock significant resources. To ensure maximum availability of resources for the kernels, a new flat shell platform is available. Refer to **Appendix H: SmartSSD CSD Flat Shell** for additional details.

Development Target Shell

Xilinx provides a high-performance development target shell which can be used to create custom acceleration applications for the SmartSSD CSD. The U.2 platform (shell) provides:

- Host access to Samsung NVMe SSD in a transparent manner with minimal add-on latency and without affecting SSD I/O throughput and performance
- Direct PCIe peer-to-peer (P2P) transfers from NVMe SSD to FPGA-DDR used in storage application acceleration
  - P2P reduces the data transfer latencies (by avoiding hop to the Host-DDR) and improves the application performance
- Memory-mapped DMA transfers
- Kernel support for memory-mapped AXI4

The following table provides the PCIe and DDR memory link widths and the expected (theoretical) maximum throughputs.

Table 4: Platform (Shell) Link Widths and Throughputs

<table>
<thead>
<tr>
<th>Feature</th>
<th>Properties</th>
<th>Note</th>
</tr>
</thead>
<tbody>
<tr>
<td>Host Interface</td>
<td>PCIe Gen3 x4 with 128-bit data path, and 250 MHz clock</td>
<td>On the EEP side, this is brought out at the XDMA M_AXI interface (128-bit 250 MHz clock). Over this interface, the DMA transactions occur between the host and FPGA DDR memory.</td>
</tr>
<tr>
<td>SSD Interface</td>
<td>PCIe Gen3 x4 with 128-bit data path and 250 MHz clock</td>
<td>On the EEP side, this is brought out at the XDMA M_AXI_BYPASS interface (128-bit 250 MHz clock) which is used to make P2P transactions</td>
</tr>
<tr>
<td>PCIe Switch Internal Data Path</td>
<td>128-bit 250 MHz clocks</td>
<td>USP and DSP PCIe clocks are treated as asynchronous by the PCIe switch design</td>
</tr>
<tr>
<td>DDR Memory</td>
<td>Single channel 64-bit 1200 MHz DDR memory, Size = 4 GB</td>
<td>19.2 GB/s theoretical max throughput</td>
</tr>
<tr>
<td>DDR Memory Controller Interface</td>
<td>512-bit 300 MHz clock</td>
<td>19.2 GB/s theoretical max throughput</td>
</tr>
</tbody>
</table>
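
The theoretical figures in the table follow from data-path width and clock rate. The following is a minimal sketch of that arithmetic, not part of the platform software. The function names are hypothetical, and the 128-bit result is the raw data-path rate only, which the table does not list and which does not account for PCIe protocol overhead.

```python
def bus_throughput_gb_s(width_bits: int, clock_mhz: float) -> float:
    """Peak throughput of a parallel data path in GB/s (1 GB = 1e9 bytes)."""
    return width_bits / 8 * clock_mhz * 1e6 / 1e9


def ddr_throughput_gb_s(width_bits: int, clock_mhz: float) -> float:
    """DDR memory transfers data on both clock edges, so the rate doubles."""
    return 2 * bus_throughput_gb_s(width_bits, clock_mhz)


axi_128 = bus_throughput_gb_s(128, 250)   # XDMA and PCIe switch data path
ddr_if = bus_throughput_gb_s(512, 300)    # DDR memory controller interface
ddr_mem = ddr_throughput_gb_s(64, 1200)   # single-channel 64-bit DDR memory

print(f"128-bit at 250 MHz data path: {axi_128:.1f} GB/s")
print(f"512-bit at 300 MHz interface: {ddr_if:.1f} GB/s")
print(f"64-bit at 1200 MHz DDR:       {ddr_mem:.1f} GB/s")
```

Both DDR rows evaluate to 19.2 GB/s, which matches the table: the 512-bit controller interface at 300 MHz is sized to carry the full rate of the 64-bit memory channel.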

Platform Naming and Life Cycle

Platform Naming Convention

Starting with the 2020.2 release, the SmartSSD CSD platform is delivered through two types of Linux installation packages outlined in the following table.

Table 5: Platform Installation Package Types

<table>
<thead>
<tr>
<th>Package</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Partition</td>
<td>Contains a device bitstream that implements part of the deployment platform in the SmartSSD CSD module.</td>
</tr>
<tr>
<td>Validate</td>
<td>Contains code to validate a platform installation and SmartSSD CSD module setup.</td>
</tr>
</tbody>
</table>

Notes:
1. CMC firmware is embedded in the U.2 platform bitfile and is loaded at power ON.
2. For flat shell, only the shell package is applicable. The base and validate packages are not applicable.

Partition and Validate Package Naming

This section describes the package naming convention for partition and validate types. The partition and validate installation package names are generated by concatenating the following elements. For information about the flat shell, see Appendix H: SmartSSD CSD Flat Shell.

`<name>_<version>_<release>_<architecture>[-<OS version>].<extension>`

Each element consists of one or more sub-elements, which are described in the following table.

Table 6: Partition and Validate Package Naming Elements

<table>
<thead>
<tr>
<th>Element</th>
<th>Sub-Element</th>
<th>Description</th>
<th>Examples</th>
</tr>
</thead>
<tbody>
<tr>
<td>Name</td>
<td>Company</td>
<td>Vendor name</td>
<td>Xilinx</td>
</tr>
<tr>
<td></td>
<td>Card</td>
<td>Card name</td>
<td>u2</td>
</tr>
<tr>
<td></td>
<td>Chassis</td>
<td>Connectivity to the server</td>
<td>gen3x4-xdma-gc (PCIe Switch Up Stream Port) gen3x4-xdma-flat-gc</td>
</tr>
<tr>
<td></td>
<td>Partition</td>
<td>Partition name distinguishes the partition type and can be one of base, shell, or validate</td>
<td>base shell validate</td>
</tr>
<tr>
<td></td>
<td>Version</td>
<td>Iteration (s)</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td>Release</td>
<td>Release</td>
<td>Integer release number</td>
</tr>
<tr>
<td></td>
<td>Architecture</td>
<td>Architecture</td>
<td>noarch</td>
</tr>
<tr>
<td></td>
<td>Extension</td>
<td>Package file extension</td>
<td>RPM</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>DEB</td>
</tr>
</tbody>
</table>

An example of a deployment installation package is as follows.

```text
xilinx-u2-gen3x4-xdma-gc-base-2-*_.noarch.rpm
<xilinx-u2-gen3x4-xdma-gc-base_2*_all.deb>
```

After a deployment partition package is installed, you can use XRT commands to display the partition installed on the card (see xbmgmt utility in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416)). Because the version number indicates compatibility with other partitions, the release number is not displayed. The displayed partition name for the example package is as follows.

```text
xilinx-u2-gen3x4-xdma-gc-base-2
```
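
As an illustration of how the displayed partition name relates to the naming elements, the following sketch joins the name sub-elements and the version with hyphens and leaves out the release. The `PartitionPackage` class and its field names are hypothetical and are not part of XRT.

```python
from dataclasses import dataclass


@dataclass
class PartitionPackage:
    """Naming sub-elements of a partition package (illustrative only)."""
    company: str
    card: str
    chassis: str
    partition: str
    version: int

    def displayed_name(self) -> str:
        # The release number is intentionally omitted from the displayed name.
        parts = [self.company, self.card, self.chassis,
                 self.partition, str(self.version)]
        return "-".join(parts)


pkg = PartitionPackage("xilinx", "u2", "gen3x4-xdma-gc", "base", 2)
print(pkg.displayed_name())  # xilinx-u2-gen3x4-xdma-gc-base-2
```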

**Life-cycle of the U.2 Platform**

Platforms have at least one year of backward compatibility with XRT, but not more than two. If IP used in the dynamic part of the platform is auto-upgraded for the same time frame, then generally:

- A platform generated from a release that has major revision of tools/run time such as 2019.1 is backward compatible until the last release of 2020 (2020.2).
- A platform generated from a release that has minor revision of tools/run time such as 2019.2 is also backward compatible until the last release of 2020 (2020.2).
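
The two cases above follow the same general rule: compatibility extends to the last release of the following calendar year. A minimal sketch of that rule, assuming release names of the form `<year>.<n>` and that the last release of a year is `<year>.2`, as in the examples above (the function name is hypothetical):

```python
def last_compatible_release(platform_release: str) -> str:
    """Return the last tools/run time release a platform is generally compatible with."""
    year, _minor = platform_release.split(".")
    # Assumption: the final release of each calendar year is numbered ".2".
    return f"{int(year) + 1}.2"


for release in ("2019.1", "2019.2"):
    print(release, "->", last_compatible_release(release))
```

Both inputs map to 2020.2, matching the two bullets. This reflects the general guideline only and does not model backward-incompatible changes.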

**Note:** Xilinx reserves the right to make a backward incompatible change once a year with a major revision of XRT, a platform, or the Vitis core development kit. Major revision changes are usually done in the first release of a calendar year.

U.2 Platform

U.2 has two platforms: Dynamic Function eXchange (DFX)-1RP and flat shell. The platform information for both platforms is as follows.

DFX technology allows the card to change functionality on the fly without power-cycling the server, which enables some platforms to reconfigure DMA links. In the DFX-1RP platform, the PCIe core and DMA engine are combined and reside in the static region of the platform. These are also known as one stage platforms. For more information, see Dynamic Function Exchange in the XRT Documentation.

DFX-1RP Platform Information

- **Platform Name:** xilinx_u2_gen3x4_xdma_gc_2_202110_1
  
  *Note:* The xilinx_u2_gen3x4_xdma_2_202110_1 platform is compatible with the latest revision of cards (RevFS or future versions) and supports the Golden base image. For details about card revisions as well as general dos and don'ts, see Xilinx Answer Record 75177.

  *Note:* Rev 1.2 PoC version of the card MUST NOT be flashed with xilinx_u2_gen3x4_xdma_gc_2_202110_1 to avoid bricking the card.

- **Supported By:**
  - Vitis tools 2021.2.
    
    *Note:* Vitis tools 2021.2 is the recommended version for generating accelerator/kernel xclbins.

- **Platform (Shell) Version:** v2.44
  
  *Note:* Platform (shell) version can be identified using the version register as mentioned in Appendix E: Miscellaneous Information and Settings.

- **Timestamp:** 0x625b99fa75b56d83

- **Release Date:** Oct 2021

- **Created With:** 2021.1 tools

- **Supported XRT Versions:** 2021.2

- **Host Link Speed:** PCIe Gen3 x4

- **NVMe SSD Link Speed:** PCIe Gen3 x4

- **Target Module:** SmartSSD CSD module

- **Release Notes:** Release notes for the SmartSSD CSD module are available in Xilinx Answer Record 75176.
- **Known Issues:** For known issues related to the SmartSSD CSD module, see Xilinx Answer Record 75177.

The platform implements the device floorplan shown in the following figure and uses resources from the static region Pblock of the device. The dynamic region Pblock instantiates the DDR memory controller that is connected to the host through the XDMA integrated PCIe switch in the static region.

**Figure 3: Floorplan**

Flat Shell Platform Information

• **Platform Name:** xilinx_u2_gen3x4_xdma_flat_gc_2_202110_1

  *Note:* The xilinx_u2_gen3x4_xdma_flat_2_202110_1 platform is compatible with the latest revision of cards (RevFS or future versions) and supports the Golden base image. For details about card revisions as well as general dos and don'ts, see Xilinx Answer Record 75177.

  *Note:* Rev 1.2 PoC version of the card MUST NOT be flashed with xilinx_u2_gen3x4_xdma_flat_gc_2_202110_1 to avoid bricking the card.

• **Supported By:**
  - Vitis tools 2021.2.
    *Note:* Vitis tools 2021.2 is the recommended version for generating accelerator/kernel xclbins.

• **Platform (Shell) Version:** v2.44

  *Note:* Platform (shell) version can be identified using the version register as mentioned in Appendix E: Miscellaneous Information and Settings.

• **Timestamp:** N/A

• **Release Date:** Oct 2021

• **Created With:** 2021.1 tools

• **Supported XRT Versions:** 2021.2
• **Host Link Speed**: PCIe Gen3 x4
• **NVMe SSD Link Speed**: PCIe Gen3 x4
• **Target Module**: SmartSSD CSD module
• **Release Notes**: Release notes for the SmartSSD CSD module are available in Xilinx Answer Record 75176.
• **Known Issues**: For known issues related to the SmartSSD CSD module, see Xilinx Answer Record 75177.

As shown in the following figure, the flat shell platform does not have a physical partition between the static and dynamic regions. However, it has two logical blocks in the IP integrator block design. The dynamic block design instantiates the DDR memory controller that is connected to the host through the XDMA integrated PCIe switch in the static region.

**Figure 4: Flat Shell Floorplan**

---

**Card Thermal and Electrical Protections**

The SmartSSD CSD module provides FPGA clock throttling and SSD performance throttling to ensure production modules operate within electrical and thermal limits while running acceleration kernels.

FPGA clock throttling protection reduces the kernel clock frequencies when module power consumption or FPGA temperature reaches or exceeds its respective clock throttling threshold, as listed in the following table. It is a dynamic process that lowers the clock frequencies while power exceeds the associated threshold. By lowering the clock frequencies, clock throttling reduces the required power and, subsequently, the generated heat. The application clocks are restored to full performance only when both module power and FPGA temperature fall below their respective clock throttling thresholds.

SSD performance throttling increases the SSD I/O delay to reduce SSD performance when module power consumption or SSD temperature reaches or exceeds its respective throttling threshold, as listed in the following table. It is a dynamic process that lowers SSD performance while power exceeds the associated threshold. Lowering the SSD performance reduces overall power consumption and, subsequently, the SSD temperature. SSD performance is restored to full performance only when both module power and SSD temperature fall below their respective thresholds.

**Table 7: Thermal and Electrical Protection Thresholds**

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Threshold</th>
</tr>
</thead>
<tbody>
<tr>
<td>FPGA Temperature</td>
<td>93°C</td>
</tr>
<tr>
<td>Card Power Consumption</td>
<td>25W</td>
</tr>
<tr>
<td>SSD Temperature</td>
<td>74°C</td>
</tr>
</tbody>
</table>

**Notes:**
1. FPGA temperature and power thresholds used by the clock throttling protection feature are hard limits and cannot be increased.
2. The power throttling algorithm operates at two levels to ensure that the card meets the PCIe compliance limit of 25W maximum and to ensure power protection of the drive. At the SSD level, the algorithm operates as a master with a default maximum power limit of 23.3W. At the FPGA level, the algorithm operates with a default maximum power limit of 25W. The SSD algorithm starts throttling at 23.3W to ensure that the RMS power always stays within the 25W maximum limit. These power throttling algorithms take effect when the board power exceeds the respective power throttling limits, so that the overall SmartSSD CSD power requirement is met. The power limits can be reconfigured (to values lower than the default limits) through the FPGA configuration settings. For more information about SSD specifications and composite temperature limits, see SmartSSD Computational Storage Device Data Sheet (DS997). To get access to the data sheet, contact your local Xilinx sales representative.
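
The FPGA clock throttling condition described above can be summarized in a few lines. This is a sketch of the decision rule only, using the FPGA temperature and card power thresholds from the table. It is not the firmware algorithm, and the function and constant names are hypothetical.

```python
# Thresholds from the table above; the decision logic is an illustrative assumption.
FPGA_TEMP_THRESHOLD_C = 93.0
CARD_POWER_THRESHOLD_W = 25.0


def clock_throttling_active(fpga_temp_c: float, card_power_w: float) -> bool:
    """Return True while either reading is at or above its threshold."""
    return (fpga_temp_c >= FPGA_TEMP_THRESHOLD_C
            or card_power_w >= CARD_POWER_THRESHOLD_W)


# Hypothetical sensor readings: (FPGA temperature in C, card power in W).
for temp_c, power_w in [(80.0, 20.0), (93.0, 20.0), (85.0, 25.5), (92.9, 24.9)]:
    state = "throttled" if clock_throttling_active(temp_c, power_w) else "full speed"
    print(f"{temp_c:5.1f} C, {power_w:4.1f} W -> {state}")
```

Because throttling starts when either value reaches its threshold, full speed returns only when both values are below their thresholds, so a single stateless check covers both directions.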

**IMPORTANT! If needed, you can lower temperature or power thresholds for the FPGA clock throttling function (after every cold or warm reboot) as per the example provided in Changing (Lowering) FPGA Temperature and Power Thresholds for Clock Throttling.**

**Card and Clock Shutdown Protection**

The SmartSSD CSD platform implements card shutdown and accelerator (Vitis kernel) clock shutdown features (along with the performance throttling feature described previously) to ensure that production cards operate within the specified electrical and temperature limits.

**Card Shutdown Protection**

The card shutdown feature is implemented using the SYSMON primitive's thermal management option. This feature protects the card from permanent damage in certain failure scenarios, such as blocked or disrupted airflow over the card due to a fan failure, which can cause excessive heating of the card. When the card shutdown temperature threshold is reached, the FPGA is disabled, which results in loss of the host PCIe connection. The card shutdown feature is controlled by the SYSMON hard primitive, whose default threshold value is 125°C. The platform (shell) design sets the FPGA shutdown temperature threshold to 100°C.

**Note:** A cold reboot of the server is required to recover after the card shuts down.

**Clock Shutdown Protection**

Clock shutdown protection shuts down the accelerator (Vitis kernel) clocks when the FPGA temperature exceeds the clock shutdown threshold (97°C). This causes an AXI firewall trip that can crash the application on the host. Because the card ends up in an unknown state, the XRT driver issues a command to reset the card. It typically takes a couple of minutes until the card is usable again.

*Note:* Review the Linux dmesg command output to determine if a protection was activated. A message containing the following text may appear in the log, indicating a clock shutdown event:

```
...Critical temperature or power event. Kernel clocks have been stopped...
```
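
Searching the kernel log for this message can be scripted. The following sketch filters log text for the message fragment shown above. The sample log lines, including their timestamps and prefixes, are hypothetical, and the function name is an assumption.

```python
from typing import List

# Fragment of the message quoted above.
CLOCK_SHUTDOWN_MARKER = "Kernel clocks have been stopped"


def find_clock_shutdown_events(log_text: str) -> List[str]:
    """Return the log lines that report a clock shutdown event."""
    return [line for line in log_text.splitlines()
            if CLOCK_SHUTDOWN_MARKER in line]


# Hypothetical log excerpt for illustration only.
sample_log = """\
[  100.000000] example: unrelated message
[  200.000000] example: Critical temperature or power event. Kernel clocks have been stopped
"""

events = find_clock_shutdown_events(sample_log)
print(f"{len(events)} clock shutdown event(s) found")
```

On a live system, the log text could be obtained with `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`, which may require elevated privileges depending on the system configuration.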

**Clocking**

The platform is designed to provide a 300 MHz default clock to run the accelerator. When the Vitis accelerator generation flow is run, the platform (shell) dynamic region is re-implemented. The actual frequency depends on the final placement and routing and is known only after implementation.

*Note:* The FPGA clock throttling algorithm controls only the 300 MHz default kernel clock provided in the design to run the accelerator.

**Available Resources After Platform Installation**

The following table lists the available resources in the dynamic region partition block. These resources are available for the Vitis accelerated kernel and associated interconnect needed to connect to the FPGA DDR memory controller (that is already present in the dynamic region).

*Table 8: SmartSSD CSD Platform (DFX 1RP) Resource Availability*

<table>
<thead>
<tr>
<th>Resource</th>
<th>Total Available Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>CLB LUT</td>
<td>331K (21K)</td>
</tr>
<tr>
<td>CLB Register</td>
<td>680K (25K)</td>
</tr>
<tr>
<td>Block RAM Tile</td>
<td>646 (26)</td>
</tr>
<tr>
<td>UltraRAM</td>
<td>104 (0)</td>
</tr>
<tr>
<td>DSP</td>
<td>1341 (3)</td>
</tr>
</tbody>
</table>

*Notes:*  
1. Values provided in parentheses are FPGA DDR memory controller resources consumed in the dynamic region.

**Deployment Platform Installation**

To run applications with this platform, download the deployment installation packages corresponding to your OS. Then, use the installation procedures described in *Card Installation Procedures.*

Accelerated applications have software dependencies. Work with your accelerated application provider to determine which XRT version to install.

Development Platform Installation

To develop applications for use with the SmartSSD CSD module, you must install and use the Vitis software platform. To set up an accelerator module for use in the development environment, follow the installation steps in:

- Installation Requirements in the Vitis Unified Software Platform Documentation
- Installing Xilinx Runtime in the Vitis Unified Software Platform Documentation

To generate your own kernel binaries and run applications with this platform, download the development installation packages corresponding to your OS. Then, use the installation procedures described in Card Installation Procedures.

Changing XRT and Target Platform Versions

The SmartSSD® CSD target platform revisions can change significantly between releases. To ensure a successful upgrade (or downgrade) of the SmartSSD CSD module XRT and platform, carefully follow the instructions in the following sections. Failure to adhere to these procedures can result in an unstable installation or other issues.

RedHat and CentOS

During upgrading, downgrading, or uninstalling, it can be useful to list the currently installed SmartSSD CSD packages. To list the currently installed deployment platform package, run the following command in a Linux terminal:

```bash
$ yum list installed | grep -i xilinx
```

To list the currently installed XRT package, run the following command:

```bash
$ yum list installed | grep -i xrt
```

Upgrading Packages

You can upgrade the XRT and deployment platform on your SmartSSD CSD module by following these steps. Currently, both packages must be upgraded concurrently.

Download the desired XRT and deployment platform packages and follow installation steps 5 through 10 in XRT and Deployment/Development Platform Installation Procedures on RedHat and CentOS.

Downgrading Packages

While beta packages are available for the SmartSSD CSD, it is not recommended to downgrade to a beta version.

xilinx_samsung_u2x4_201920_3, xilinx_samsung_u2x4_202010_1, and xilinx_u2.gen3x4.xdma_1.202020_1 are the available production packages for Rev 1.2 cards. For Rev 1.2 cards, it is recommended to use the latest production version, xilinx_u2.gen3x4.xdma_1.202020_1. For Rev FS and future versions of the SmartSSD CSD module, two production shell versions are supported with this release: xilinx_u2.gen3x4.xdma_gc_1.202020_1 and xilinx_u2.gen3x4.xdma_gc_2.202110_1. Of these, it is recommended to use the latest production shell, xilinx_u2.gen3x4.xdma_gc_2.202110_1.

**Uninstalling Packages**

To completely uninstall the SmartSSD CSD XRT and deployment platform packages, run the following command in a Linux terminal. Uninstalling XRT also uninstalls the deployment platform.

```
$ sudo yum remove xrt
```

*Note:* Make sure that all of the platform packages are displayed in the output terminal after running the command. If not, manually list the packages using the `list` command as explained in the RedHat and CentOS section, then delete the remaining packages using the `remove` command.

**Ubuntu**

During upgrading, downgrading, or uninstalling, it can be useful to list the currently installed SmartSSD CSD packages. To list the currently installed deployment platform package, run the following command in a Linux terminal:

```
$ apt list --installed | grep -i xilinx
```

To list the currently installed XRT package, run the following command:

```
$ apt list --installed | grep -i xrt
```

**Upgrading Packages**

You can upgrade the XRT and deployment platform on your SmartSSD CSD module by following these steps. Currently, both packages must be upgraded concurrently.

Download the desired XRT and deployment platform packages. Follow installation steps 5 through 10 in XRT and Deployment/Development Platform Installation Procedures on Ubuntu.

Downgrading Packages

While beta packages are available for the SmartSSD CSD, it is not recommended to downgrade to a beta version.

xilinx_samsung_u2x4_201920_3, xilinx_samsung_u2x4_202010_1, and xilinx_u2_gen3x4_xdma_1_202020_1 are the available production packages. For Rev 1.2 cards, it is recommended to use the latest production version, xilinx_u2_gen3x4_xdma_1_202020_1. For Rev FS and future versions of the SmartSSD CSD module, two production shell versions are supported with this release: xilinx_u2_gen3x4_xdma_gc_1_202020_1 and xilinx_u2_gen3x4_xdma_gc_2_202110_1. Of these, it is recommended to use the latest production shell, xilinx_u2_gen3x4_xdma_gc_2_202110_1.

Uninstalling Packages

To completely uninstall the SmartSSD CSD XRT and deployment platform packages, run the following command in a Linux terminal. Uninstalling XRT also uninstalls the deployment platform.

```
$ sudo apt remove xrt
```

**Note:** Make sure that all of the target platform packages are displayed in the output terminal after running the command. If not, manually list the packages using the `list` command as explained in the Ubuntu section, then delete the remaining packages using the `remove` command.

<table>
<thead>
<tr>
<th>IMPORTANT! When upgrading from xilinx_samsung_u2x4_202010_1 platform to xilinx_u2_gen3x4_xdma_2_202110_1 platform, use the following command:</th>
</tr>
</thead>
<tbody>
<tr>
<td>sudo /opt/xilinx/xrt/bin/xbmgmt program --base --image /opt/xilinx/firmware/u2/g3x4-xdma-gc/base/partition.xsabin /opt/xilinx/firmware/u2/g3x4-xdma-gc/base/partition.xsabin --flash-type spi --device &lt;xclmgmt BDF&gt;</td>
</tr>
</tbody>
</table>

*Here the --device option takes the xclmgmt BDF (bus, device, function), which you can find by running `lspci | grep -i xilinx`.*

*If you have multiple modules installed on the server, you must run the xbmgmt flash command separately for each module.*

Creating a Vault Repository for CentOS

On CentOS, `yum install kernel-headers` always installs the latest version of the headers, which might not match your kernel version. When the versions do not match, the XRT installation skips compilation of the driver modules and fails silently. To correctly install XRT, you must create a vault repository file that points to versions matching the kernel.

The following is an example repository for CentOS 7.4 created in the following file:

```
/etc/yum.repos.d/centos74.repo
```

```ini
# CentOS-Base-7.4.repo
#
# This repo is locked to 7.4.1708 version
#
# C7.4.1708
[C7.4.1708-base]
name=CentOS-7.4.1708 - Base
baseurl=http://vault.centos.org/7.4.1708/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
enabled=1
[C7.4.1708-updates]
name=CentOS-7.4.1708 - Updates
baseurl=http://vault.centos.org/7.4.1708/updates/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
enabled=1
[C7.4.1708-extras]
name=CentOS-7.4.1708 - Extras
baseurl=http://vault.centos.org/7.4.1708/extras/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
enabled=1
[C7.4.1708-centosplus]
name=CentOS-7.4.1708 - CentOSPlus
baseurl=http://vault.centos.org/7.4.1708/centosplus/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
enabled=1
[C7.4.1708-fasttrack]
name=CentOS-7.4.1708 - Fasttrack
baseurl=http://vault.centos.org/7.4.1708/fasttrack/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
enabled=1
```

UG1382 (v1.3) October 27, 2021
SmartSSD CSD
Note: For CentOS 7.5, create the repo file `/etc/yum.repos.d/centos75.repo` and add the content from the previous code example, replacing `7.4.1708` with `7.5.1804`. Similarly, for CentOS 7.6, create the repo file `/etc/yum.repos.d/centos76.repo` and add the content from the previous code example, replacing `7.4.1708` with `7.6.1810`. 
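
Rather than editing a copy by hand, the substitution described in the note can be scripted. The following sketch generates the same repository layout for a given CentOS 7 point release. The function and variable names are hypothetical, and the `name=` values are descriptive labels only.

```python
# (section id, vault path, descriptive label); "base" lives under the "os" path.
SECTIONS = [
    ("base", "os", "Base"),
    ("updates", "updates", "Updates"),
    ("extras", "extras", "Extras"),
    ("centosplus", "centosplus", "CentOSPlus"),
    ("fasttrack", "fasttrack", "Fasttrack"),
]


def vault_repo_text(release: str) -> str:
    """Build repo file content for a CentOS 7 point release such as 7.5.1804."""
    blocks = []
    for section, path, label in SECTIONS:
        blocks.append(
            f"[C{release}-{section}]\n"
            f"name=CentOS-{release} - {label}\n"
            f"baseurl=http://vault.centos.org/{release}/{path}/$basearch/\n"
            "gpgcheck=1\n"
            "gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7\n"
            "enabled=1\n"
        )
    return "".join(blocks)


repo_text = vault_repo_text("7.5.1804")
print(repo_text)
```

The printed text can then be saved as `/etc/yum.repos.d/centos75.repo` with root privileges. Passing `7.6.1810` produces the content for `centos76.repo`.
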
Accessing the SmartSSD CSD through a Virtual Machine

The SmartSSD® CSD module has the following physical functions (PFs) that can be accessed in a virtualized setup.

- Xilinx® Mgmt PF (PCIe ID 0x6987)
- Xilinx User PF (PCIe ID 0x6988)
- Samsung NVMe Controller PF (PCIe ID 0xa824)

These three physical functions can be accessed by the virtual machine (VM) using the KVM-Linux packages that are part of the Linux kernel. The following VM configurations are supported on the SmartSSD CSD module.

- **Passthrough Xilinx User PF and Samsung NVMe Controller PF:** In this setup, Xilinx Mgmt PF will remain in the host and cannot be accessed by the VM.

- **Passthrough of All Three PFs:** In this setup, all three PFs can be accessed by the VM.

**Note:** Before attempting VM (passthrough) access, ensure that `xbutil scan` displays one usable device. If not, check the Troubleshooting section.

**Achieving P2P Baremetal Performance in the VM**

In a virtualized environment, for any transaction between the NVMe and Xilinx User PF, the request has to be routed all the way to the Input/Output Memory Management Unit (IOMMU) at the host for address translation and then re-routed to the Xilinx PF. This leads to degradation of the peer-to-peer (P2P) throughput in the VM.

To avoid this degradation, the SmartSSD CSD platform implements a Passthrough Virtualization Using Range Translation feature. This feature is available to you at no extra cost and is enabled by default. With this feature, P2P performance in the virtualized setup is expected to be the same as in a baremetal setup.

Hot Plug Support on the SmartSSD CSD

For more information on hot plug support on the SmartSSD CSD, see Master Answer Record: 75177.

Appendix E

Miscellaneous Information and Settings

Platform (Shell) Version Identification

The SmartSSD® CSD platform (shell) has a version identification register at 0x330000 offset. This register can be read from the Xilinx® PCIe® mgmt. PF (Mgmt. BAR).

The version number is encoded in the following bit fields:

- [11:08] - Major Version
- [07:00] - Minor Version

For example:

- 0x00000090 indicates Version 0.90
- 0x00000244 indicates Version 2.44

To find the platform (shell) version, use the ./rwmem utility, which is available by downloading the reference design files for this user guide from the Xilinx website.

<table>
<thead>
<tr>
<th>Bit field</th>
<th>Access</th>
<th>Default value</th>
</tr>
</thead>
<tbody>
<tr>
<td>[31:12]</td>
<td>Read-only</td>
<td>N/A</td>
</tr>
<tr>
<td>[11:0]</td>
<td>Read-only</td>
<td>Version number</td>
</tr>
</tbody>
</table>
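
Decoding the register value amounts to extracting the two bit fields and printing them as hexadecimal digits. The following is a minimal sketch, assuming the raw 32-bit value has already been read from offset 0x330000. The function name is hypothetical.

```python
def decode_shell_version(reg_value: int) -> str:
    """Decode the platform (shell) version from the version register value."""
    major = (reg_value >> 8) & 0xF   # bits [11:08]
    minor = reg_value & 0xFF         # bits [07:00]
    # The digits are read as hexadecimal nibbles, so 0x244 becomes "2.44".
    return f"{major:x}.{minor:02x}"


print(decode_shell_version(0x00000090))  # 0.90
print(decode_shell_version(0x00000244))  # 2.44
```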

The following example shows how to identify the Xilinx PCIe mgmt PF and the AXI4-Lite PCIe BAR to which the version control register is mapped. The same approach can be followed to read any other register, as long as it is mapped to the AXI4-Lite BAR of the Xilinx PCIe mgmt PF. A PCIe writer application is used to read these registers in this example. Enter the following command after connecting the SmartSSD CSD to the host.

```
lspci -d 10ee:
```

The output is shown in the following figure. The PCIe PF (physical function) with the ID 6987 is the Xilinx PCIe mgmt PF (xclmgmt PF) and the PCIe PF with the ID 6988 is the Xilinx PCIe user PF (xocl PF).

To check the PCIe BARs assigned to the xclmgmt PF, enter the following command. The BAR with 32 MB of space is the AXI4-Lite BAR over which the version control register is mapped.

```
lspci -v -s dc:00.0
```

**Figure 6: management_pf_lspci_output**
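
When several modules are installed, mapping each PCIe function to its role by device ID can be automated. The following sketch parses numeric `lspci -n -d 10ee:` style output and labels the mgmt and user PFs by the IDs given above. The sample lines, including the bus addresses and the class field, are hypothetical placeholders, and the function name is an assumption.

```python
import re

# Device IDs from this section; 10ee is the vendor ID used with lspci above.
PF_ROLES = {
    (0x10EE, 0x6987): "Xilinx mgmt PF (xclmgmt)",
    (0x10EE, 0x6988): "Xilinx user PF (xocl)",
}

# Matches "<bdf> <class>: <vendor>:<device>" as printed by lspci -n.
LINE_RE = re.compile(r"^(\S+)\s+[0-9a-f]{4}:\s+([0-9a-f]{4}):([0-9a-f]{4})")


def classify_pfs(lspci_n_output: str) -> dict:
    """Map each bus address to its PF role, skipping unrecognized devices."""
    roles = {}
    for line in lspci_n_output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        key = (int(match.group(2), 16), int(match.group(3), 16))
        if key in PF_ROLES:
            roles[match.group(1)] = PF_ROLES[key]
    return roles


# Hypothetical sample; the class field (1200) is a placeholder value.
sample = """\
dc:00.0 1200: 10ee:6987
dc:00.1 1200: 10ee:6988
"""

pf_roles = classify_pfs(sample)
for bdf, role in pf_roles.items():
    print(bdf, "->", role)
```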

### Changing (Lowering) FPGA Temperature and Power Thresholds for Clock Throttling

Use the following command for Xilinx mgmt. PF identification.

```
sensors | grep -i xilinx
xilinx_u2_gen3x4_xdma_base_1_user-pci-dd01
xilinx_u2_gen3x4_xdma_base_1_mgmt-pci-dd00
```

Use the following command to read the default FPGA temperature and power thresholds used for clock throttling (FPGA temperature = 93.0°C; power threshold = 25W). Note that the default values are hard upper limits and cannot be increased.

```
sensors xilinx_u2_gen3x4_xdma_base_1_mgmt-pci-dd00 | grep -i CS_TARGET
CS_TARGET_TEMP: +93.0°C (high = +0.0°C)
CS_TARGET_POWER: 25.00 W (max = 0.00 W)
```

FPGA temperature and power thresholds can be lowered if needed. In this example, the FPGA temperature threshold is set to 97.0°C and the power threshold is lowered to 22W.

```
chip "xilinx_u2_gen3x4_xdma_base_1_mgmt-pci-dd00"
label temp15 CS_TARGET_TEMP
set temp15_input 97/1000

chip "xilinx_u2_gen3x4_xdma_base_1_mgmt-pci-dd00"
label power2 CS_TARGET_POWER
set power2_input 22
```

Apply the settings using `sensors -s`.

Confirm the change by running the following command again.

```
sensors xilinx_u2_gen3x4_xdma_base_1_mgmt-pci-dd00 | grep -i CS_TARGET
CS_TARGET_TEMP:    +97.0°C  (high =  +0.0°C)
CS_TARGET_POWER:   22.00 W  (max =   0.00 W)
```

**Note:** The command `sensors xilinx_u2_gen3x4_xdma_base_1_mgmt-pci-dd00` can be used to identify FPGA temperature and power consumption of the SmartSSD CSD module.

FPGA-SSD DNA Pairing

The SmartSSD® CSD module is implemented with the FPGA and NVMe SSD physically co-located on the same module/PCB. In a multiple card setup, the application running on the host machine must be aware of this FPGA-SSD co-location to offload acceleration jobs to the right FPGA. This is important for taking full advantage of Peer-to-Peer (P2P) transfers. Without co-location awareness, the application can end up choosing a random FPGA for acceleration and see inefficient data movements (higher latencies and higher power consumption).

Each Xilinx manufactured FPGA contains a unique physical identifier. This is known as the FPGA DNA which is a single unique (96-bit) non-volatile device identifier. For additional information about FPGA DNA, refer to the UltraScale Architecture Configuration User Guide (UG570).

Similarly, Samsung NVMe SSD provides a unique non-volatile physical identifier for each manufactured SSD. This is called SSD Serial Number (SN).

An FPGA-SSD pair can be identified using the FPGA DNA and SSD Serial Number as follows:

1. NVMe SSD's Identify controller command reports both the SSD Serial Number and 96-bit FPGA DNA value (represented in bold in the following log).

```bash
sudo nvme id-ctrl /dev/nvme0n1 -v
```

```
NVME Identify Controller:
vid : 0x144d
ssvid : 0x144d
sn : 160401P00RC01
mn : MZWLJ3T2HBL5-000D7-101
fr : EPK9SB5E
rab : 8
ieee : 002538
cmic : 0x3
mdts : 5
cntlid : 41
ver : 10300
rtd3r : e4e1c0
rtd3e : 989680
oaes : 0x300
oacs : 0xd0
acl : 127
aerl : 15
frmw : 0x17
lpa : 0xe
elpe : 255
npss : 0
avsc0 : 0x1
```

2. Confirm the 96-bit FPGA DNA value from the SmartSSD CSD User PF BAR through a register read.

PCIe User PF BAR + 0x330020 = 0x34b0c205
PCIe User PF BAR + 0x330024 = 0x013a1b29
PCIe User PF BAR + 0x330028 = 0x40020000
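
For comparison against the value reported by the Identify Controller command, the three 32-bit reads can be combined into a single 96-bit number. The following sketch assumes that the register at the lowest offset (0x330020) holds the least significant word. This word order is an assumption and is not stated in this guide. The function name is hypothetical.

```python
from typing import Sequence


def assemble_dna(words: Sequence[int]) -> int:
    """Combine 32-bit register words into one value, lowest offset first.

    Assumption: the word at the lowest offset is the least significant.
    """
    value = 0
    for index, word in enumerate(words):
        value |= (word & 0xFFFFFFFF) << (32 * index)
    return value


# Register values from the example reads at offsets 0x330020, 0x330024, 0x330028.
dna = assemble_dna([0x34B0C205, 0x013A1B29, 0x40020000])
print(f"0x{dna:024x}")  # 0x40020000013a1b2934b0c205
```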

Together, these two steps identify the NVMe SSD and FPGA that are located on the same SmartSSD CSD.

Accessing FPGA DNA Primitive from Dynamic Region

The SmartSSD® CSD platform instantiates the DNA primitive in the static region and uses it to obtain the 96-bit FPGA DNA value at power up. After latching the FPGA DNA value, the SmartSSD CSD platform design releases the DNA primitive and makes it accessible to any Vitis™ kernel/user logic in the dynamic region. The following figure highlights the connections that need to be made during the v++ kernel generation flow using IP integrator connect commands.

**Figure 7: FPGA DNA Access for User Kernels**

![Diagram showing connections between Static Region and Dynamic Region]

**Table 9: Interface Port Names**

<table>
<thead>
<tr>
<th>Interface Port Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>clkwiz_sysclks_clk_out2</td>
<td>DNA primitive clock port. This is 50 MHz clock.</td>
</tr>
<tr>
<td>ulp_m_data_dout_dna_00</td>
<td>Connects to the DOUT port of DNA primitive.</td>
</tr>
<tr>
<td>ulp_s_data_dna_from_ulp_00[0]</td>
<td>Connects to the DIN port of DNA primitive.</td>
</tr>
<tr>
<td>ulp_s_data_dna_from_ulp_00[1]</td>
<td>Connects to the READ port of DNA primitive.</td>
</tr>
<tr>
<td>ulp_s_data_dna_from_ulp_00[2]</td>
<td>Connects to the SHIFT port of DNA primitive.</td>
</tr>
</tbody>
</table>

User logic can connect to the DNA ports in the dynamic region using the following IPI commands. As shown in the previous figure, user logic must use the respective clock (clkwiz_sysclks_clk_out2, 50 MHz) to drive the DNA primitive ports and latch DOUT.

```
connect_bd_net [get_bd_pins ii_level0_wire/ulp_m_data_dout_dna_00] [get_bd_pins user_logic/DOUT]
connect_bd_net [get_bd_pins user_logic/DIN] [get_bd_pins ii_level0_wire/ulp_s_data_dna_from_ulp_00[0]]
connect_bd_net [get_bd_pins user_logic/READ] [get_bd_pins ii_level0_wire/ulp_s_data_dna_from_ulp_00[1]]
connect_bd_net [get_bd_pins user_logic/SHIFT] [get_bd_pins ii_level0_wire/ulp_s_data_dna_from_ulp_00[2]]
```

**Note:** There is one cycle of latency in inputs (DIN, READ, SHIFT) and one cycle of latency in output (DOUT). User logic must consider the isolation interface flop latency when latching the DOUT value.

**Example Module for Testing DNA Connectivity**

The SmartSSD CSD platform design has an example module (fpga_dna_module_0) in the dynamic region that has been used to test the dynamic region DNA port connectivity. This example module runs at clkwiz_sysclks_clk_out2 (50 MHz clock) and is accessible through AXI4-Lite Mgmt. PF + 0x0110_0000 offset.

**Figure 8: Example FPGA DNA Module**

This module is connected to the DNA ports using the following commands:

```
connect_bd_net [get_bd_pins ii_level0_wire/ulp_m_data_dout_dna_00] [get_bd_pins fpga_dna_module_0/dna_dyn_data_dout]
connect_bd_net [get_bd_pins fpga_dna_module_0/dna_dyn_data_ports] [get_bd_pins ii_level0_wire/ulp_s_data_dna_from_ulp_00]
```

After loading the implementation bitfile, trigger this test module by writing to the following register (through a PCIe writer application):

```
./rwmem AXI Lite Mgmt. PF + 0x01100010 0x1
```

The captured FPGA DNA value is reported in the following registers:

```
./rwmem AXI Lite Mgmt. PF + 0x01100000
0x2460e205
./rwmem AXI Lite Mgmt. PF + 0x01100004
0x013a1b40
./rwmem AXI Lite Mgmt. PF + 0x01100008
0x40020000
```
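
As a small illustration of how the three 32-bit readouts relate to the 96-bit DNA value, the following Python 3 sketch combines them into a single integer. The word order is an assumption: it treats the register at offset 0x0 as the least-significant word, which this guide does not state. The helper name `assemble_dna` is hypothetical.

```python
def assemble_dna(words):
    """Combine three 32-bit register words into one 96-bit integer.

    ASSUMPTION: words[0] (offset 0x0) is the least-significant word.
    Reverse the list if the platform uses the opposite order.
    """
    if len(words) != 3:
        raise ValueError("expected exactly three 32-bit words")
    value = 0
    for index, word in enumerate(words):
        if not 0 <= word <= 0xFFFFFFFF:
            raise ValueError("word %d is not a 32-bit value" % index)
        value |= word << (32 * index)
    return value


# Register values from the example readout above (offsets 0x0, 0x4, 0x8).
example_words = [0x2460E205, 0x013A1B40, 0x40020000]
dna = assemble_dna(example_words)
print("FPGA DNA: 0x%024x" % dna)
```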

The example module also drives the DNA value captured in the above registers on the fpga_dna_data[95:0] output port.

Note that the example fpga_dna_module accounts for the isolation interface flop latency and latches DOUT from the DNA primitive two cycles after READ assertion. This module also drives the latched DOUT value back to the DIN port of the DNA primitive.

For more details about the DNA primitive, refer to the DNA_PORTE2 section in the UltraScale Architecture Configuration User Guide (UG570).

This section describes the SmartSSD® CSD flat shell. In the SmartSSD CSD development platform, the static and dynamic regions are two partitions defined with separate Pblocks. These Pblocks are defined based on the locations of resources in the device, such as PCIe® and DDR memory. The advantage of this partitioning is that kernels in the dynamic region can be reloaded without affecting the static region. However, the unused resources in the static region are not available for kernel use.

The flat shell overcomes this limitation. It has no reconfigurable partitions; that is, the partitioning between the static and dynamic regions is removed. The complete device is available to the shell logic and kernels together, without localizing the routing to the static/dynamic region Pblocks. The resources that were locked due to the partitioning are released to the kernels.

**Post Implementation View of the Kernel**

The following block design represents the overview of the V++ generated project, where the hello world kernel is hooked to the flat shell.

The following image displays the implemented design view of the hello_world kernel hooked to a flat shell. Note that there are no partitions in the implemented design.

**Note:** For Rev FS and future versions of the SmartSSD CSD module, two production shell versions are supported with this release: xilinx_u2_gen3x4_xdma_flat_gc_1_202020_1 and xilinx_u2_gen3x4_xdma_gc_2_202110_1. However, it is recommended to use the latest production shell, xilinx_u2_gen3x4_xdma_flat_gc_2_202110_1.

**Table 10: Resource Availability for Kernels in Flat Shell**

<table>
<thead>
<tr>
<th>Resources</th>
<th>Total Number of Resources in the Device</th>
<th>Resources Available for Kernels</th>
</tr>
</thead>
<tbody>
<tr>
<td>CLB LUT</td>
<td>522K</td>
<td>396K</td>
</tr>
<tr>
<td>CLB Register</td>
<td>1045K</td>
<td>850K</td>
</tr>
<tr>
<td>Block RAM Tile</td>
<td>984</td>
<td>618</td>
</tr>
<tr>
<td>UltraRAM</td>
<td>128</td>
<td>120</td>
</tr>
<tr>
<td>DSP</td>
<td>1968</td>
<td>1959</td>
</tr>
</tbody>
</table>

After removing the physical partitioning between static and dynamic regions, approximately 70K LUTs, 200K registers, 58 Block RAMs, 16 UltraRAMs, and 602 DSPs are released for kernel use in the flat shell.
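
To put the Table 10 figures in proportion, the following Python 3 sketch computes the share of each resource type that remains available to kernels. The numbers are copied from the table; the dictionary layout and the `kernel_share` helper are illustrative only.

```python
# Values copied from Table 10 as (total in device, available for kernels).
# LUT and register counts are in thousands (K).
resources = {
    "CLB LUT (K)": (522, 396),
    "CLB Register (K)": (1045, 850),
    "Block RAM Tile": (984, 618),
    "UltraRAM": (128, 120),
    "DSP": (1968, 1959),
}


def kernel_share(total, available):
    """Return the percentage of a resource that is available to kernels."""
    return 100.0 * available / total


shares = {name: kernel_share(t, a) for name, (t, a) in resources.items()}
for name, share in shares.items():
    print("%-18s %5.1f%%" % (name, share))
```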

**V++ Flow to Generate the Kernel**

The kernel generation flow for the flat shell is similar to the flow for the SmartSSD CSD development platform. V++ detects whether the targeted shell is flat and, if so, implements the shell and kernel together to generate the complete bit file. The kernel generated during this phase cannot be run directly on the target SmartSSD CSD. Refer to Kernel Execution to generate a kernel that can be deployed on the SmartSSD CSD.

---

**Kernel Execution**

This section describes the steps to generate the deployment package and the details of how to execute the kernel. Ensure that sudo permissions are enabled before running the following commands on the host.

1. Log in to the target host where the SmartSSD CSD card is connected.
2. Install the 2021.2 XRT package (.rpm) using the following command, run as root.

   ```
   yum install <xrt path>
   ```

   For example:

   ```
   yum install <2021.2 installed location>/xbb/xrt/packages/xrt_<version>.rpm
   ```

   To install the deb package, use the following command.

   ```
   apt install <xrt path>
   ```

   Then source the XRT setup script.

   ```
   source /opt/xilinx/xrt/setup.sh
   ```
   
3. Install the flat shell package (.rpm/.deb) using the following command.
   ```
   yum install <rpm_path>
   ```
   
   To install the deb package, use the following command.
   ```
   apt install <deb path>
   ```
   
   After the installation of this package is complete, the deploy_flat script is available at the following path on the host. This script is compatible with Python 2.7.
   ```
   /opt/xilinx/platforms/xilinx_u2_gen3x4_xdma_flat_gc_2_202110_1/scripts/deploy_flat
   ```
   
4. Execute the following command to generate the deployment package.
   ```
   /opt/xilinx/platforms/xilinx_u2_gen3x4_xdma_flat_gc_2_202110_1/scripts/deploy_flat --xclbin <path_to_xclbin>/<application_name>.xclbin --vpp_exec_path <path_to_where_kernel_generated>
   ```

5. The command creates a directory named deployment. It contains the deployment kernel (.xclbin), which is ready to be used on the target SmartSSD CSD, and the deployment shell package, as well as the shell+kernel rpm/deb, which must be installed before executing the kernel. Note that the deploy_flat script looks for the uuid.fragment.csv and .xclbin files under the kernel run directory.

6. Install the generated deployment rpm/deb (the rpm file is under the deployment directory) using the following command. This step generates the partition.xsabin file.

   ```
   yum install ./xilinx-<application>-<version>-1.noarch.rpm
   ```

7. After installing deployment rpm, the next step is to flash the shell. When flashing the flat shell for the first time, use the following command.

   ```
   sudo /opt/xilinx/xrt/bin/xbmgmt program --base --image <xsabin_path_1> <xsabin_path_1> --flash-type spi --device <xclmgmt PF BDF>
   ```

   For example, to flash the verify kernel, use the following command.

   ```
   sudo /opt/xilinx/xrt/bin/xbmgmt program --base --image /opt/xilinx/firmware/u2/gen3x4-xdma-gc/base/partition.xsabin /opt/xilinx/firmware/u2/gen3x4-xdma-gc/base/partition.xsabin --flash-type spi --device <xclmgmt BDF>
   ```

   If the SmartSSD CSD is already flashed with a flat shell, the output of the installation step points to the flash command. An example command is as follows.

   ```
   sudo /opt/xilinx/xrt/bin/xbmgmt program --base --device <xclmgmt PF BDF> --image <shell_name>
   ```

8. Cold reboot the machine when a kernel generated on the flat shell is loaded for the first time. After a flat shell based kernel is loaded, you can migrate to another flat shell kernel either by cold rebooting the host after programming the new kernel, or by using the WBSTAR flow described in Flat Shell Kernel Migration.

9. Source XRT.

10. Check for the available cards recognized by XRT using the following command.

    ```
    sudo /opt/xilinx/xrt/bin/xbutil examine
    ```

    The command output should show the platform under the “Devices Present” field.

11. Execute the kernels.

    **Note:** The `xbutil validate` command is not applicable for the flat shell.
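
The deployment package step above can be scripted. The following Python 3 sketch only assembles the `deploy_flat` argument list from the options named in this section (`--xclbin`, `--vpp_exec_path`, and optionally `--application_name`); it does not run the script. The helper name `build_deploy_command` and the example paths are hypothetical, and it is an assumption that `--application_name` takes the name as its value.

```python
import shlex

# Script location as given in this section.
DEPLOY_FLAT = ("/opt/xilinx/platforms/"
               "xilinx_u2_gen3x4_xdma_flat_gc_2_202110_1/scripts/deploy_flat")


def build_deploy_command(xclbin, vpp_exec_path, application_name=None):
    """Return the deploy_flat argument list for the given kernel."""
    command = [DEPLOY_FLAT,
               "--xclbin", xclbin,
               "--vpp_exec_path", vpp_exec_path]
    if application_name is not None:
        # ASSUMPTION: the option takes the application name as its value.
        command += ["--application_name", application_name]
    return command


# Hypothetical paths, for illustration only.
cmd = build_deploy_command("/work/run/vadd.xclbin", "/work/run")
print(" ".join(shlex.quote(part) for part in cmd))
```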

**RECOMMENDED:** Avoid using underscores and uppercase letters in the kernel name (`xclbin name`). Violating this can result in the following error message during deployment package generation. In that case, follow the recommendations in the error message: either change the name of the kernel, or pass the `--application_name` option to the `deploy_flat` script during deployment package generation.

The following example shows the error message for a kernel name that includes an underscore.

```
##################################################################
ERROR: The choosen application name (krnl_vadd_2clk_rtl) is invalid, must contain lower case letters, numbers, and hyphens only. No upper case or underscores allowed. Either rename the xclbin or set the --application_name option.
##################################################################
```
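
The naming rule in the error message (lowercase letters, numbers, and hyphens only) can be checked before generating the deployment package. The following Python 3 sketch is based only on the wording of that message; the actual check inside `deploy_flat` may differ. Both helper names are hypothetical.

```python
import re

# Rule taken from the error message text: lowercase letters, digits, hyphens.
_VALID_NAME = re.compile(r"[a-z0-9-]+")


def is_valid_application_name(name):
    """Return True if the name uses only lowercase letters, digits, hyphens."""
    return _VALID_NAME.fullmatch(name) is not None


def suggest_application_name(name):
    """Hypothetical fix-up: lowercase the name and replace underscores.

    The result can still be invalid if the name has other characters,
    so check it again with is_valid_application_name().
    """
    return name.lower().replace("_", "-")


original = "krnl_vadd_2clk_rtl"
suggestion = suggest_application_name(original)
print(original, is_valid_application_name(original))
print(suggestion, is_valid_application_name(suggestion))
```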

---

**Flat Shell Kernel Migration**

This section explains two flows that can be followed to migrate from one shell to another without cold rebooting the host. A Python script is provided for this kernel migration; it is compatible with Python 2.7. In the first case, the SmartSSD CSD module is connected directly to the host without a PLX switch; in the second case, it is connected to the host through a PLX switch.

**Note:** The Python script provided for the kernel migration removes the XRT drivers before executing the steps that load the new bit stream (kernel) on the targeted SmartSSD CSD card. When multiple SmartSSD CSD cards are connected to a server, the traffic on all the SmartSSD CSD cards must be stopped before executing this script.

**When SmartSSD CSD is Connected to the Host without a PLX Switch**

1. The upstream and link/root ports are arguments for the script. The command to execute the script is as follows. `linkport` refers to the port that is connected to the upstream port of the SmartSSD CSD module.

   ```
   python WbstarFlow.py --linkport <b:d.f> --upstreamport <b:d.f>
   ```

2. The following example can help in identifying the appropriate link and upstream port values that must be passed as arguments to the script. In the following example, the upstream and link port values are 5f:01.0 and 5d:00.0, respectively. The example command to execute the script is as follows.

   ```
   python WbstarFlow.py --linkport 5d:00.0 --upstreamport 5f:01.0
   ```
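
Because both script arguments use the `b:d.f` form, a quick format check can catch typos before the script is run. The following Python 3 sketch validates the two values and assembles the argument list shown above; it does not run `WbstarFlow.py`. The helper names `parse_bdf` and `wbstar_command` are hypothetical, and the range limits (device 0 to 31, function 0 to 7) come from the PCI addressing scheme rather than from this guide.

```python
import re

# Two hex digits for bus and device, one digit (0-7) for function.
_BDF = re.compile(r"([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])")


def parse_bdf(text):
    """Split a 'bb:dd.f' string into (bus, device, function) integers."""
    match = _BDF.fullmatch(text)
    if match is None:
        raise ValueError("not a b:d.f value: %r" % text)
    bus, device, function = (int(group, 16) for group in match.groups())
    if device > 0x1F:
        raise ValueError("device number out of range: %r" % text)
    return bus, device, function


def wbstar_command(linkport, upstreamport):
    """Return the WbstarFlow.py argument list after validating both ports."""
    parse_bdf(linkport)
    parse_bdf(upstreamport)
    return ["python", "WbstarFlow.py",
            "--linkport", linkport,
            "--upstreamport", upstreamport]


# Port values from the example above.
print(" ".join(wbstar_command("5d:00.0", "5f:01.0")))
```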

**When SmartSSD CSD is Connected to the Host through a PLX Switch**

1. Before running the script, update the host grub file with the following setting. The host must support this grub setting; otherwise, kernel migration fails.

```
GRUB_CMDLINE_LINUX="pci=assign-busses,hpbussize=4,hpmemsize=16G,realloc=on"
```

2. Remove and rescan at the PLX switch root port (where the PLX switch is connected to the host), as shown in the following example. The image shown in the example is the output of the `lspci -tv` command. 5d:02.0 is the BDF of the PLX switch root port, at which the remove and rescan operation must be performed. Skipping this step can cause BAR assignment issues for the PCIe devices.

```
echo 1 > /sys/bus/pci/devices/0000:5d:02.0/remove
echo 1 > /sys/bus/pci/rescan
```

**Note:** PCIe device removal and rescan must be performed after every host cold reboot before proceeding with the script.

3. The upstream and link/root ports are arguments for the script as described in the previous section. The command to execute the script is as follows.

```
python WbstarFlow.py --linkport <b:d.f> --upstreamport <b:d.f>
```

4. The following example can help in identifying the appropriate link and upstream port values to be passed as arguments to the script. In the following example, the upstream and link port values are 60:00.0 and 5d:02.0, respectively. The example command to execute the script is as follows.

```
python WbstarFlow.py --linkport 5d:02.0 --upstreamport 60:00.0
```

5. In the previous code snippet, the upstream port of the SmartSSD CSD module is 60:00.0 and the link port is 5d:02.0.

6. The FPGA now loads the new bit file.

In both cases, after the script executes successfully, the following message appears.

```
WBStar flow completed successfully please run kernel
```
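
The remove and rescan step in the PLX switch flow writes `1` to two sysfs files. The following Python 3 sketch builds those paths from a root port BDF and, by default, only returns the planned writes without touching the system. The helper names are hypothetical, the PCI domain `0000` is taken from the example command, and performing the actual writes requires root privileges and the same precautions described for the manual step.

```python
RESCAN_PATH = "/sys/bus/pci/rescan"


def sysfs_remove_path(bdf, domain=0):
    """Return the sysfs 'remove' file for a device such as '5d:02.0'."""
    return "/sys/bus/pci/devices/%04x:%s/remove" % (domain, bdf)


def remove_and_rescan(bdf, dry_run=True):
    """Plan (and optionally perform) the remove and rescan writes.

    With dry_run=True nothing is written; the list of (path, value)
    pairs is returned so it can be reviewed first.
    """
    steps = [(sysfs_remove_path(bdf), "1"), (RESCAN_PATH, "1")]
    if not dry_run:
        for path, value in steps:
            with open(path, "w") as handle:
                handle.write(value)
    return steps


# PLX switch root port from the example; dry run only.
for path, value in remove_and_rescan("5d:02.0"):
    print("echo %s > %s" % (value, path))
```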

**Note:**

The kernel migration steps provided in this section can result in an error in the following cases.

1. The Python script `WbstarFlow.py` is executed before a new bit stream is programmed into the flash using XRT.

2. The kernel migration steps are executed between identical kernels. For example, if kernel1 is loaded on the SmartSSD CSD module and the migration steps are performed to load kernel1 into the flash again, the flow does not work reliably.

3. A host boot drive is connected in the same hierarchy as the PLX switch root port at which the PCIe remove and rescan are performed, as described in the previous section. Ensure that no host boot drive is connected in that hierarchy; otherwise, the system can crash.

Appendix I

Additional Resources and Legal Notices

Xilinx Resources

For support resources such as Answers, Documentation, Downloads, and Forums, see Xilinx Support.

Documentation Navigator and Design Hubs

Xilinx® Documentation Navigator (DocNav) provides access to Xilinx documents, videos, and support resources, which you can filter and search to find information. To open DocNav:

- From the Vivado® IDE, select Help → Documentation and Tutorials.
- On Windows, select Start → All Programs → Xilinx Design Tools → DocNav.
- At the Linux command prompt, enter docnav.

Xilinx Design Hubs provide links to documentation organized by design tasks and other topics, which you can use to learn key concepts and address frequently asked questions. To access the Design Hubs:

- In DocNav, click the Design Hubs View tab.
- On the Xilinx website, see the Design Hubs page.

Note: For more information on DocNav, see the Documentation Navigator page on the Xilinx website.

References

These documents provide supplemental material useful with this guide:
Vitis Documents
3. Vitis Application Acceleration Development Flow Tutorials (GitHub)

Additional Xilinx Resources
1. Xilinx licensing website: https://www.xilinx.com/getproduct
3. Xilinx Community Forums: https://forums.xilinx.com
4. Xilinx Third-Party End User License Agreement
5. End-User License Agreement

Please Read: Important Legal Notices

The information disclosed to you hereunder (the "Materials") is provided solely for the selection and use of Xilinx products. To the maximum extent permitted by applicable law: (1) Materials are made available "AS IS" and with all faults, Xilinx hereby DISCLAIMS ALL WARRANTIES AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether in contract or tort, including negligence, or under any other theory of liability) for any loss or damage of any kind or nature related to, arising under, or in connection with, the Materials (including your use of the Materials), including for any direct, indirect, special, incidental, or consequential loss or damage (including loss of data, profits, goodwill, or any type of loss or damage suffered as a result of any action brought by a third party) even if such damage or loss was reasonably foreseeable or Xilinx had been advised of the possibility of the same. Xilinx assumes no obligation to correct any errors contained in the Materials or to notify you of updates to the Materials or to product specifications. You may not reproduce, modify, distribute, or publicly display the Materials without prior written consent. Certain products are subject to the terms and conditions of Xilinx's limited warranty, please refer to Xilinx's Terms of Sale which can be viewed at https://www.xilinx.com/legal.html#tos; IP cores may be subject to warranty and support terms contained in a license issued to you by Xilinx. Xilinx products are not designed or intended to be fail-safe or for use in any application requiring fail-safe performance; you assume sole risk and liability for use of Xilinx products in such critical applications, please refer to Xilinx's Terms of Sale which can be viewed at https://www.xilinx.com/legal.html#tos.
AUTOMOTIVE APPLICATIONS DISCLAIMER

AUTOMOTIVE PRODUCTS (IDENTIFIED AS "XA" IN THE PART NUMBER) ARE NOT WARRANTED FOR USE IN THE DEPLOYMENT OF AIRBAGS OR FOR USE IN APPLICATIONS THAT AFFECT CONTROL OF A VEHICLE ("SAFETY APPLICATION") UNLESS THERE IS A SAFETY CONCEPT OR REDUNDANCY FEATURE CONSISTENT WITH THE ISO 26262 AUTOMOTIVE SAFETY STANDARD ("SAFETY DESIGN"). CUSTOMER SHALL, PRIOR TO USING OR DISTRIBUTING ANY SYSTEMS THAT INCORPORATE PRODUCTS, THOROUGHLY TEST SUCH SYSTEMS FOR SAFETY PURPOSES. USE OF PRODUCTS IN A SAFETY APPLICATION WITHOUT A SAFETY DESIGN IS FULLY AT THE RISK OF CUSTOMER, SUBJECT ONLY TO APPLICABLE LAWS AND REGULATIONS GOVERNING LIMITATIONS ON PRODUCT LIABILITY.

Copyright

© Copyright 2020-2021 Xilinx, Inc. Xilinx, the Xilinx logo, Alveo, Artix, Kintex, Spartan, Versal, Virtex, Vivado, Zynq, and other designated brands included herein are trademarks of Xilinx in the United States and other countries. PCI, PCIe, and PCI Express are trademarks of PCI-SIG and used under license. All other trademarks are the property of their respective owners.