AR# 66421

Example Design - Accelerated adaptive filter design and verification using HLS and System Generator for DSP


Attached to this Answer Record is an Example Design that uses Vivado HLS to develop adaptive FIR filters and Hardware Co-Simulation in System Generator for DSP to verify them.


The purpose of this design is to show how you can use Xilinx high-level design tools (specifically, HLS and System Generator for DSP) to rapidly develop, test, and verify advanced signal processing blocks.

Traditionally, the design of a given DSP function on an FPGA would involve roughly the following design steps:

The flow proposed here using Vivado HLS and System Generator for DSP greatly reduces the development time at several of these steps. The first obvious benefit when using HLS is that step 4 and its associated iterations are completely eliminated, because HLS can convert the fixed-point model (implemented in C/C++) directly to RTL.

As a result, significant design time is saved not only in writing the actual RTL for the implementation, but also in writing test benches to verify its behavior. A side benefit is that an experienced FPGA design engineer is no longer required to write the RTL; the algorithm engineer can generate it with the push of a button.
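To make the idea of a fixed-point C/C++ model concrete, here is a minimal sketch of one LMS adaptive FIR iteration written in plain C++. It is not taken from the attached design: the names (`LmsState`, `lms_step`, `lms_demo_max_weight_error`), the 4-tap length, the Q15 number format, and the step size are all assumptions chosen for illustration. In a Vivado HLS project the arbitrary-precision types from `ap_fixed.h` would typically replace the plain integers used here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// All names are hypothetical. Q15 format: real value = integer / 32768.
typedef std::int32_t q15_t;
const int NUM_TAPS = 4;
const std::int64_t Q15_ONE = 32768;

struct LmsState {
    q15_t w[NUM_TAPS];  // adaptive coefficients, Q15
    q15_t x[NUM_TAPS];  // input delay line, x[0] is the newest sample
};

// One LMS iteration. Returns the filter output y and writes the error
// e = d - y to err. mu is the step size in Q15.
q15_t lms_step(LmsState &s, q15_t x_in, q15_t d, q15_t mu, q15_t &err) {
    assert(x_in >= -Q15_ONE && x_in < Q15_ONE);  // input must fit in Q15

    // Shift the delay line and insert the new sample.
    for (int i = NUM_TAPS - 1; i > 0; --i) s.x[i] = s.x[i - 1];
    s.x[0] = x_in;

    // FIR output: Q15 * Q15 products accumulated in 64 bits, scaled to Q15.
    std::int64_t acc = 0;
    for (int i = 0; i < NUM_TAPS; ++i) acc += (std::int64_t)s.w[i] * s.x[i];
    q15_t y = (q15_t)(acc / Q15_ONE);

    // Coefficient update: w[i] += mu * e * x[i], all in Q15.
    err = d - y;
    for (int i = 0; i < NUM_TAPS; ++i) {
        std::int64_t grad = (std::int64_t)mu * err * s.x[i];  // Q45 product
        s.w[i] += (q15_t)(grad / (Q15_ONE * Q15_ONE));        // back to Q15
    }
    return y;
}

// Demo: adapt toward a known 4-tap system driven by pseudo-random input.
// Returns the largest |w[i] - h[i]| in Q15 LSBs after adaptation.
int lms_demo_max_weight_error() {
    const q15_t h[NUM_TAPS] = {16384, -8192, 4096, 2048};  // assumed system
    const q15_t mu = 8192;                                 // 0.25 in Q15
    LmsState dut = {};                                     // all zeros
    q15_t ref_line[NUM_TAPS] = {0, 0, 0, 0};
    std::uint32_t rng = 1;

    for (int n = 0; n < 4000; ++n) {
        rng = rng * 1664525u + 1013904223u;               // simple LCG
        q15_t x = ((q15_t)(rng >> 16) - 32768) / 2;       // about [-0.5, 0.5)

        // Desired signal: the known system applied to the same input.
        for (int i = NUM_TAPS - 1; i > 0; --i) ref_line[i] = ref_line[i - 1];
        ref_line[0] = x;
        std::int64_t acc = 0;
        for (int i = 0; i < NUM_TAPS; ++i) acc += (std::int64_t)h[i] * ref_line[i];
        q15_t d = (q15_t)(acc / Q15_ONE);

        q15_t e = 0;
        lms_step(dut, x, d, mu, e);
    }

    int worst = 0;
    for (int i = 0; i < NUM_TAPS; ++i) {
        int diff = std::abs(dut.w[i] - h[i]);
        if (diff > worst) worst = diff;
    }
    return worst;
}
```

Calling `lms_demo_max_weight_error()` runs the filter against a known system and reports how far the adapted coefficients ended up from it. A function shaped like `lms_step` is the kind of code that would be handed to HLS as the top-level function, while the demo plays the role of the C test bench.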

The next clear benefit that HLS provides is the ability to optimize a given implementation for area or performance with relatively little effort. By simply tweaking pragmas on critical parts of the code, HLS might generate a totally different architecture. This flexibility can be exploited to minimize the impact of step 5 and its iterations. Designing FPGA circuits the old-fashioned way would require rewriting significant portions of the module to trade area against performance.
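As a rough illustration of what tweaking pragmas looks like, the sketch below shows a small fixed-coefficient FIR with Vivado HLS directives embedded in the source. The function name `fir_q15`, the tap count, and the Q15 scaling are assumptions and are not taken from the attached design. An ordinary C++ compiler ignores the `#pragma HLS` lines (possibly with a warning), so the same file can still be compiled and simulated on the host.

```cpp
#include <cassert>
#include <cstdint>

const int TAPS = 8;  // assumed filter length

// Hypothetical top-level function: one Q15 sample in, one Q15 sample out.
std::int32_t fir_q15(std::int32_t x_in, const std::int32_t coeff[TAPS]) {
    assert(x_in >= -32768 && x_in <= 32767);  // input must fit in Q15

    // Delay line kept between calls, the usual HLS idiom for filter state.
    static std::int32_t delay[TAPS] = {0};
    // Split the array into individual registers so that every tap can be
    // read in the same clock cycle.
#pragma HLS ARRAY_PARTITION variable=delay complete dim=1

    std::int64_t acc = 0;
SHIFT_MAC:
    for (int i = TAPS - 1; i >= 0; --i) {
        // Performance-oriented choice: replicate the loop body TAPS times,
        // which costs more multipliers. Replacing this line with
        // "#pragma HLS PIPELINE II=1" asks for a pipelined loop instead,
        // which tends to use fewer multipliers and take more cycles.
#pragma HLS UNROLL
        delay[i] = (i == 0) ? x_in : delay[i - 1];
        acc += (std::int64_t)coeff[i] * delay[i];
    }
    return (std::int32_t)(acc / 32768);  // scale the Q30 sum back to Q15
}
```

The C++ behavior is identical whichever directive is chosen; only the generated hardware differs. That is what makes it cheap to try several area/performance trade-offs without touching the algorithm.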

In addition to these obvious benefits over traditional RTL DSP design, several other steps in this flow benefit in less obvious ways. Verification of step 4 traditionally requires the RTL designer to also write test benches that apply the defined test signals to the Unit Under Test. The HLS flow removes the need to write functionally correct RTL test benches, because it offers C-simulation of the source code and automated C-RTL co-simulation.

RTL simulation is also significantly slower than C-based simulation. So verification iterations can be done using C-simulation very quickly and, once the final implementation is decided, C-RTL co-simulation can be used to do the final verification of the C model against the auto-generated RTL code.
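A C test bench for this kind of flow can be very small. The sketch below compares a fixed-point function against a double-precision reference and reports pass or fail through its return value. The names `dut_fir2`, `ref_fir2`, and `run_testbench`, the coefficients, and the stimulus are all hypothetical. In a Vivado HLS project the body of `run_testbench` would normally live in `main()`, whose return value of 0 tells the tool that C-simulation or C-RTL co-simulation passed.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>

// Hypothetical design under test: 2-tap FIR with Q15 data and coefficients.
std::int32_t dut_fir2(std::int32_t x0, std::int32_t x1) {
    const std::int32_t C0 = 24576;   //  0.75 in Q15
    const std::int32_t C1 = -8192;   // -0.25 in Q15
    std::int64_t acc = (std::int64_t)C0 * x0 + (std::int64_t)C1 * x1;
    return (std::int32_t)(acc / 32768);
}

// Floating-point reference model of the same filter.
double ref_fir2(double x0, double x1) {
    return 0.75 * x0 - 0.25 * x1;
}

// Returns 0 when every output is within one Q15 LSB of the reference,
// and 1 otherwise.
int run_testbench() {
    const double LSB = 1.0 / 32768.0;
    int mismatches = 0;
    std::int32_t prev = 0;

    for (int n = 0; n < 1000; ++n) {
        // Deterministic stimulus that sweeps the whole Q15 input range.
        std::int32_t x = (n * 7919) % 65536 - 32768;

        double got = dut_fir2(x, prev) * LSB;
        double want = ref_fir2(x * LSB, prev * LSB);
        if (std::fabs(got - want) > LSB) {
            std::printf("mismatch at n=%d: got %f, want %f\n", n, got, want);
            ++mismatches;
        }
        prev = x;
    }

    std::printf("%s (%d mismatches)\n", mismatches == 0 ? "PASS" : "FAIL", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

Because the same test bench drives both the C model and the generated RTL during co-simulation, the tolerance check only has to be written once.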

Perhaps the most significant implicit benefit that the HLS flow provides is in cases where a given algorithm is decided on (step 2), but it turns out that it is impossible to meet performance/area requirements at the desired target device cost once the implemented circuit has been optimized as much as possible (step 5). When this happens, you must go back from step 5 to at least step 2 and develop a new algorithm to solve the problem. In a traditional RTL flow, all of the work for steps 4 and 5 is scrapped and the time spent developing the algorithm can be entirely wasted. Because steps 4 and 5 are automated in the HLS flow, a tweak or even a total rewrite of an algorithm has minimal impact on the development timeline. This allows you to explore different options more liberally and converge on the optimal solution.

Finally, this design showcases the utility of Burst-mode Hardware Co-Simulation in System Generator for DSP. When a given block needs to be integrated into the rest of the DSP pipeline along with other blocks, System Generator for DSP is the tool of choice. Simulation of such systems with large data vectors can take a very long time. That time can be dramatically reduced by using the Hardware Co-Simulation flow, as can be seen in the attached design. This flow allows you to push the number crunching onto an attached Xilinx development board (KC705 in this case), so different test vectors can be run through the system in a fraction of the time. In this example design, the pure-Simulink simulation took more than an hour to run. Once the design was compiled, Hardware Co-Simulation completed in just a couple of minutes.

This design was tested in the following environment:

  • Vivado 2015.3
  • MATLAB R2015a
  • RHEL 6.5, 64-bit
  • KC705


Associated Attachments

Name                          File Size  File Type
hls_adaptive_filter_v1.0.zip  6 MB       ZIP
Date 01/28/2016
Status Active
Type General Article
  • System Generator for DSP
  • Vivado Design Suite