OpenCL Kernels

This discussion of OpenCL kernels builds on the information provided in the C/C++ Kernels topic. The same programming techniques for accelerating the performance of a kernel apply to both C/C++ and OpenCL kernels. However, an OpenCL kernel uses the __attribute__ syntax in place of pragmas. For details of the available attributes, refer to OpenCL Attributes.

The following code examples show some of the elements of an OpenCL kernel for the Vitis application acceleration development flow. They are not intended as a primer on OpenCL or kernel development, but merely highlight some of the key differences between OpenCL and C/C++ kernels.

Kernel Signature

In C/C++ kernels, the kernel is identified on the Vitis compiler command line using the v++ --kernel option. However, in OpenCL code, the __kernel keyword identifies a kernel in the code. You can have multiple kernels defined in a single .cl file, and the Vitis compiler will compile all of the kernels, unless you specify the --kernel option to identify which kernel to compile.

__kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void apply_watermark(__global const TYPE * __restrict input, 
   __global TYPE * __restrict output, int width, int height) {
TIP: The complete code for the kernel function above, apply_watermark, can be found in the Global Memory Two Banks (CL) example in the Vitis Accel Examples GitHub repository.

In the example above, the apply_watermark kernel has two pointer arguments, input and output, and two scalar int arguments, width and height.

In C/C++ kernels, these arguments would need to be identified with the HLS INTERFACE pragmas. However, in the OpenCL kernel, the Vitis compiler and Vitis HLS recognize the kernel arguments and compile them as needed: pointer arguments into m_axi interfaces and scalar arguments into s_axilite interfaces.
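
As an illustration of the points above, the following sketch places two kernels in one .cl file and annotates each argument with the interface the text says it is compiled into. The kernel names copy_buffer and scale_buffer, their arguments, and the host-only shim at the top are hypothetical and are not taken from the Vitis examples. The shim assumes that the OpenCL compiler defines __OPENCL_VERSION__, as the OpenCL C specification requires, and exists only so the same file can be checked with a plain C compiler.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption, for CPU testing): a plain C compiler
   does not know the OpenCL qualifiers or kernel attributes, so drop them. */
#define __kernel
#define __global
#define __attribute__(x)
#endif

/* First kernel in the file (hypothetical): copies count elements. */
__kernel __attribute__((reqd_work_group_size(1, 1, 1)))
void copy_buffer(__global const int *__restrict src, /* pointer: m_axi     */
                 __global int *__restrict dst,       /* pointer: m_axi     */
                 int count)                          /* scalar:  s_axilite */
{
    for (int i = 0; i < count; ++i) {
        dst[i] = src[i];
    }
}

/* Second kernel in the same file (hypothetical): scales each element. */
__kernel __attribute__((reqd_work_group_size(1, 1, 1)))
void scale_buffer(__global const int *__restrict src, /* pointer: m_axi     */
                  __global int *__restrict dst,       /* pointer: m_axi     */
                  int count,                          /* scalar:  s_axilite */
                  int factor)                         /* scalar:  s_axilite */
{
    for (int i = 0; i < count; ++i) {
        dst[i] = src[i] * factor;
    }
}
```

With a file like this, the Vitis compiler would compile both kernels unless the --kernel option names one of them, for example scale_buffer.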

Kernel Optimizations

Because the kernel is running in programmable logic on the target platform, optimizing your task to the environment is an important element of application design. Most of the optimization techniques discussed in C/C++ Kernels can be applied to OpenCL kernels. Instead of applying the HLS pragmas used for C/C++ kernels, you will use the __attribute__ keyword described in OpenCL Attributes. Following is an example:

// Process the whole image
__attribute__((xcl_pipeline_loop(1)))
image_traverse: for (uint idx = 0, x = 0 , y = 0  ; idx < size ; ++idx, x+= DATA_SIZE)

The example above specifies that the for loop, image_traverse, should be pipelined to improve the performance of the kernel. The target II in this case is 1. For more information, refer to xcl_pipeline_loop.
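
For a self-contained view of loop pipelining, here is a minimal sketch of a complete kernel whose loop carries the xcl_pipeline_loop attribute with a target II of 1. The kernel name add_offset and its arguments are hypothetical. The shim at the top is an assumption used only so the file also compiles with a plain C compiler for functional checking; on the host the attribute is discarded and has no effect.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption, for CPU testing): remove OpenCL
   qualifiers and attributes that a plain C compiler does not accept. */
#define __kernel
#define __global
#define __attribute__(x)
#endif

/* Hypothetical kernel: adds a constant offset to every element. */
__kernel __attribute__((reqd_work_group_size(1, 1, 1)))
void add_offset(__global const int *__restrict input,
                __global int *__restrict output,
                int size, int offset)
{
    /* Request a pipelined loop with a target initiation interval of 1. */
    __attribute__((xcl_pipeline_loop(1)))
    for (int idx = 0; idx < size; ++idx) {
        output[idx] = input[idx] + offset;
    }
}
```

Each iteration here is independent of the others, which is the kind of loop body where a target II of 1 is most likely to be achievable.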

In the following code example, the watermark loop uses the opencl_unroll_hint attribute to let the Vitis compiler unroll the loop to reduce latency and improve performance. However, in this case the __attribute__ is only a suggestion that the compiler can ignore if needed. For details, refer to opencl_unroll_hint.

// Unrolling below loop to process all 16 pixels concurrently
__attribute__((opencl_unroll_hint))
watermark: for ( int i = 0 ; i < DATA_SIZE ; i++)
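
The unrolling hint can also be seen in a complete, if simplified, kernel. The sketch below is not the apply_watermark code: the kernel name brighten, its arguments, and the MAX_VALUE clamp are hypothetical, and DATA_SIZE is assumed to be 16 to match the comment above. The shim at the top is likewise an assumption that lets a plain C compiler check the arithmetic; on the host the attribute is discarded.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption, for CPU testing): remove OpenCL
   qualifiers and attributes that a plain C compiler does not accept. */
#define __kernel
#define __global
#define __attribute__(x)
#endif

#define DATA_SIZE 16   /* elements handled per outer iteration (assumed) */
#define MAX_VALUE 255  /* hypothetical saturation limit */

/* Hypothetical kernel: adds amount to each element, clamped to MAX_VALUE. */
__kernel __attribute__((reqd_work_group_size(1, 1, 1)))
void brighten(__global const int *__restrict input,
              __global int *__restrict output,
              int blocks, int amount)
{
    for (int b = 0; b < blocks; ++b) {
        /* Hint that the inner loop should be unrolled; the compiler
           is free to ignore the hint. */
        __attribute__((opencl_unroll_hint))
        for (int i = 0; i < DATA_SIZE; i++) {
            int sum = input[b * DATA_SIZE + i] + amount;
            output[b * DATA_SIZE + i] = (sum > MAX_VALUE) ? MAX_VALUE : sum;
        }
    }
}
```

The inner loop has a fixed trip count known at compile time, which is what makes unrolling it practical.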

For more information, review the OpenCL Attributes topics to see what specific optimizations are supported for OpenCL kernels, and review the C/C++ Kernels content to see how these optimizations can be applied in your kernel design.