Troubleshooting Performance Issues

The SDSoC environment provides basic performance monitoring through the sds_clock_counter() function described earlier. Use it to measure how long different code sections, such as the accelerated and non-accelerated code, take to execute.
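As an illustration, the sketch below wraps a code section in a pair of counter reads and accumulates the elapsed ticks. The section_timer type, its helper functions, and the work() placeholder are hypothetical names introduced here. The sketch assumes that sds_lib.h declares sds_clock_counter() and that the sdscc/sds++ compilers define the __SDSCC__ macro; when that macro is absent, it falls back to the standard clock() function so that it also builds on a host machine.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __SDSCC__                 /* assumption: defined by sdscc/sds++ */
#include "sds_lib.h"             /* assumption: declares sds_clock_counter() */
static uint64_t now_ticks(void) { return sds_clock_counter(); }
#else
#include <time.h>                /* host fallback so the sketch builds anywhere */
static uint64_t now_ticks(void) { return (uint64_t)clock(); }
#endif

/* Hypothetical helper: accumulates the ticks spent in one code section. */
typedef struct {
    uint64_t start;   /* tick value at the most recent section_begin() */
    uint64_t total;   /* ticks accumulated over all begin/end pairs */
    uint64_t calls;   /* number of completed begin/end pairs */
} section_timer;

static void section_begin(section_timer *t) { t->start = now_ticks(); }

static void section_end(section_timer *t)
{
    t->total += now_ticks() - t->start;
    t->calls++;
}

/* Average ticks per call, or 0 if the section never ran. */
static uint64_t section_average(const section_timer *t)
{
    return t->calls ? t->total / t->calls : 0;
}

/* Placeholder workload standing in for the function being measured. */
static long work(int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += i;
    return sum;
}
```

To compare the two versions, keep one section_timer for the hardware function call and one for the software version, call section_begin() and section_end() around each invocation (for example around work(1000) here), and compare the section_average() results.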

Estimate the actual hardware acceleration time by looking at the latency numbers in the Vivado HLS report files (_sds/vhls/…/*.rpt). In the SDSoC IDE, the Project Platform Details tab shows the CPU clock frequency, and the Project Overview shows the clock frequency for a hardware function. A latency of X accelerator clock cycles is equal to X * (processor_clock_freq/accelerator_clock_freq) processor clock cycles. Compare this with the time measured for the actual function call; the difference approximates the data transfer overhead.
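The conversion above can be sketched in C as follows. The function names are hypothetical, and the numbers in the usage note are made-up values for illustration only, not taken from any report.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a latency in accelerator clock cycles to processor clock cycles:
 * X * (processor_clock_freq / accelerator_clock_freq).
 * The multiplication is done first to keep integer precision; very large
 * cycle counts could overflow 64 bits when frequencies are given in Hz. */
static uint64_t accel_to_cpu_cycles(uint64_t accel_cycles,
                                    uint64_t cpu_hz,
                                    uint64_t accel_hz)
{
    assert(accel_hz > 0);
    return (accel_cycles * cpu_hz) / accel_hz;
}

/* Estimate the data transfer overhead as the measured call time minus the
 * computed hardware latency, both in processor clock cycles. Returns 0 if
 * the measurement is smaller than the estimate. */
static uint64_t transfer_overhead_cycles(uint64_t measured_call_cycles,
                                         uint64_t compute_cpu_cycles)
{
    return measured_call_cycles > compute_cpu_cycles
               ? measured_call_cycles - compute_cpu_cycles
               : 0;
}
```

For example, with an assumed latency of 10000 accelerator cycles, an 800 MHz processor clock, and a 100 MHz accelerator clock, accel_to_cpu_cycles(10000, 800000000, 100000000) gives 80000 processor cycles. If the function call was measured at 200000 processor cycles, the estimated transfer overhead is 120000 cycles.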

For the best performance improvement, the time required to execute the accelerated function must be much smaller than the time required to execute the original software function. If it is not, try running the accelerator at a higher frequency by selecting a different clkid on the sdscc/sds++ command line. The default clkid selects a 100 MHz clock on all platforms; to list the clkid values available for a given platform, run sdscc -sds-pf-info <platform name>. If a higher clock frequency does not help, determine whether the data transfer overhead is a significant part of the accelerated function execution time, and if so, reduce it.

If the data transfer overhead is large, the following changes might help:
  • Move more code into the accelerated function so that the computation time increases, and the ratio of computation to data transfer time is improved.
  • Reduce the amount of data to be transferred by modifying the code or using pragmas to transfer only the required data.
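As a rough sketch of the second suggestion, the example below uses the SDS data copy pragma to restrict the transfer to the first n elements of each buffer instead of the full allocation. The function scale_block(), the wrapper process_frame(), and the FRAME_LEN constant are hypothetical, and the assumption that only n elements are needed is specific to this example. A standard C compiler ignores the pragma, so the code also builds and runs as plain software.

```c
#include <assert.h>

#define FRAME_LEN 1024   /* hypothetical size of the full buffers */

/* Assumption: the hardware function needs only the first n elements,
 * so only that range of each array is copied to and from the accelerator. */
#pragma SDS data copy(in[0:n], out[0:n])
void scale_block(const int *in, int *out, int n);

/* Candidate hardware function: doubles each of the first n elements. */
void scale_block(const int *in, int *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = in[i] * 2;
}

/* Software-side caller: validates the range before invoking the function. */
void process_frame(const int *in, int *out, int n)
{
    assert(n >= 0 && n <= FRAME_LEN);
    scale_block(in, out, n);
}
```

With this arrangement, a call such as process_frame(in, out, 16) would transfer 16 elements per buffer rather than all FRAME_LEN elements, under the assumptions stated above.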