New XtremeDSP Slices Deliver As Much As 10X More GMACs Per Dollar



by Narinder Lall, Senior Manager – DSP Product and Solutions Marketing, Xilinx, Inc.
narinder.lall@xilinx.com (10/15/04)


Virtex-4 FPGAs offer breakthrough DSP performance at the lowest cost.


Xilinx® Virtex™ Series FPGAs have been the preferred choice for high-performance signal processing and DSP co-processing in many digital communications and video/imaging applications. With algorithmic complexity on the rise, the pervasive need for flexibility, and increasing downward pressures on price/power per channel, designers often face tough tradeoffs and difficult choices to employ either FPGA- or ASIC-centric systems for challenging signal processing applications.

New XtremeDSP™ slices on the Virtex-4™ device family extend FPGA signal processing capability beyond 256 GMACs, which represents DSP performance two times greater than previous generation Virtex-II Pro™ FPGAs.

Even more revolutionary, Virtex-4 system designers need not employ the largest family member to achieve this performance, as has previously been the case. Virtex-4 FPGAs now deliver this signal processing capability in a medium-density device, providing you with as much as a staggering 10X increase in the available GMACs per dollar (see Figure 1). This dramatically extends production volumes where it makes economic sense to use an FPGA for performance-centric signal processing applications. The new DSP slices also dramatically reduce power consumption, allowing you to drive down both cost and power per channel.

DSP to Logic Resources
At the heart of the Virtex-4 FPGA’s signal processing resources are new highly integrated XtremeDSP slices, sometimes referred to as DSP48s (Figure 2). Depending on the family member, you can utilize as many as 512 XtremeDSP slices, each capable of operating at 500 MHz.
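The 256 GMACs figure quoted earlier follows directly from these two numbers. A minimal sketch of the arithmetic, assuming every slice completes one multiply-accumulate per clock cycle (the function name `peak_gmacs` is hypothetical and used only for illustration):

```python
def peak_gmacs(num_slices: int, clock_mhz: float) -> float:
    """Peak rate in GMAC/s, assuming one MAC per slice per clock cycle."""
    # slices * MHz gives MMAC/s; divide by 1000 to express it in GMAC/s
    return num_slices * clock_mhz / 1000.0


# Largest slice count and clock rate quoted in the article
print(peak_gmacs(512, 500))  # 256.0
```

This is a peak figure; sustained throughput in a real design depends on how well the datapath keeps every slice busy.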

Each slice contains a dedicated 2’s complement signed, 18 x 18 bit multiplier, and a three-input adder/subtracter with feedback for accumulation modes. The addition of a seven-bit op mode multiplexer allows you to dynamically configure the XtremeDSP slice for one of more than 40 operating modes, such as addition, multiplication, accumulation, MACC functions, MACC cascading, wide (48-bit) addition, and wide multiplexing. Configuration wizards in Xilinx ISE or System Generator for DSP allow you to simply apply the desired function.
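To make the accumulation mode concrete, here is a small behavioral sketch in Python. It is not the DSP48 primitive or any Xilinx API: the names `to_signed`, `MaccModel`, and `fir_sample` are hypothetical, the 48-bit accumulator width is assumed from the wide-addition mode mentioned above, and pipeline registers and op mode selection are not modeled.

```python
def to_signed(value: int, bits: int) -> int:
    """Wrap an integer into the two's-complement range of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


class MaccModel:
    """Behavioral model: signed 18 x 18 multiply feeding a 48-bit accumulator."""

    def __init__(self) -> None:
        self.acc = 0

    def step(self, a: int, b: int) -> int:
        # Operands are truncated to 18-bit signed values before multiplying
        product = to_signed(a, 18) * to_signed(b, 18)
        # Adder with feedback: the sum wraps at 48 bits (assumed width)
        self.acc = to_signed(self.acc + product, 48)
        return self.acc


def fir_sample(coeffs, window) -> int:
    """One FIR output computed as a sequence of multiply-accumulate steps."""
    macc = MaccModel()
    for c, x in zip(coeffs, window):
        macc.step(c, x)
    return macc.acc


print(fir_sample([1, -2, 3], [4, 5, -6]))  # -24
```

Each call to `step` corresponds to one clock of a single slice in MACC mode, so an N-tap filter built this way needs N clocks per output sample on one slice.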

The addition of new XtremeDSP slices allows you to implement many such functions within the slice and without the need for external logic slices, which can thus be allocated to other tasks. XtremeDSP slices can also be cascaded directly without accessing logic fabric or any loss in speed.

The new Xilinx ASMBL architecture enables us to alter the mix of XtremeDSP slices and logic slices. The SX platform within the Virtex-4 family offers the highest ratio of XtremeDSP slices to logic slices at one XtremeDSP slice for every 108 logic slices. The SX platform is ideally suited for multiplier or MAC-intensive tasks such as software radios. The LX platform offers the highest ratio of logic to other features, and is suited for many traditional FPGA applications that may also require some DSP capability.

Reduced Power Consumption
In today’s infrastructure applications, driving down cost per channel is not the only goal diligently pursued. Wireless infrastructure manufacturers are under increasing pressure to stay within power limits imposed by governing telecom standards bodies. Power consumption is also becoming a key concern for some military applications, such as Joint Tactical Radio Systems radios.

The integrated XtremeDSP slices on Virtex-4 FPGAs eliminate the need to use logic slices for many signal processing and arithmetic tasks, reducing the need for power-consuming routing resources. Initial power estimates show that XtremeDSP slices consume only 57 µW/MMAC, one-seventh the power of an equivalent function implemented using Virtex-II Pro FPGAs. Although not the ultimate goal, this reduction goes some way toward addressing the power concerns of infrastructure equipment providers.

Another way to reduce system power consumption in such applications is to use the embedded processor capabilities available on the FX platform. You have the option to trade gates for processor cycles for sequential control tasks using FX platform devices. Examples of such implementations include software communication architectures or real-time operating systems.

High Compute Density Using SRL16s
Shift Register Logic (SRL16) is a unique feature in Xilinx FPGAs. A popular feature for increasing compute density in multi-channel implementations, SRL16s are included in all Virtex-4 platforms.

To demonstrate SRL16 usage, let’s take a look at a simple Reed-Solomon encoder example. Implementing a single-channel Reed-Solomon encoder in a Virtex-4 device consumes 56 logic slices. For a 16-channel implementation, one approach would be to replicate this 16 times, resulting in a consumption of 16 x 56 = 896 slices. Figure 3 shows another implementation of the 16-channel solution using SRL16s. This consumes only 86 logic slices, roughly 10% of the 16X replicated version. SRL16s can pack substantially more signal processing into a smaller area, allowing you to potentially target a much smaller device than is possible with other FPGA architectures.
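The idea behind the multi-channel version can be sketched in software. The example below is an illustration under stated assumptions, not the design in Figure 3: a bit-serial LFSR (`lfsr_step`, a hypothetical CRC-8-style stand-in) takes the place of the Reed-Solomon datapath, and a Python `deque` plays the role of the SRL16-based state storage. It shows that one shared datapath, with its state register replaced by a 16-deep shift register and the channels interleaved one per clock, produces the same results as 16 replicated copies.

```python
from collections import deque


def lfsr_step(state: int, bit: int, poly: int = 0x07, width: int = 8) -> int:
    """One clock of a bit-serial LFSR standing in for the encoder datapath."""
    feedback = ((state >> (width - 1)) & 1) ^ bit
    state = (state << 1) & ((1 << width) - 1)
    return state ^ poly if feedback else state


def replicated(streams):
    """One independent copy of the datapath per channel."""
    results = []
    for bits in streams:
        state = 0
        for bit in bits:
            state = lfsr_step(state, bit)
        results.append(state)
    return results


def time_multiplexed(streams):
    """One shared datapath; the state register becomes an N-deep shift register."""
    n = len(streams)
    delay_line = deque([0] * n)            # models the shift-register state storage
    for bits_at_t in zip(*streams):        # one symbol period across all channels
        for bit in bits_at_t:              # channels interleaved, one per clock
            state = delay_line.popleft()   # oldest entry belongs to this channel
            delay_line.append(lfsr_step(state, bit))
    return list(delay_line)                # entries end up in channel order


# 16 channels of deterministic test bits, 12 bits each
streams = [[((ch >> (k % 4)) & 1) ^ (k & 1) for k in range(12)] for ch in range(16)]
print(time_multiplexed(streams) == replicated(streams))  # True
```

The tradeoff is that the shared datapath has to run at 16 times the per-channel symbol rate, which is where a high fabric clock rate pays off. The combinational logic is instantiated once and only the state storage grows with channel count, which is consistent with the 86-slice versus 896-slice comparison above.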

Serial/Parallel Connectivity
In addition to embedded processors, the FX platform also includes 3.125 Gbps multi-gigabit transceivers that are particularly suited for interfacing to other DSP processors. One such example is high-speed serial connectivity using the serial RapidIO™ interface, which is gaining momentum with DSP vendors. With 1 Gbps LVDS interfaces for interfacing to high-speed A/D converters and a host of DRAM and SRAM memory interfaces for hooking up to frame buffers, the Virtex-4 family is an ideal platform for interfacing to other DSP devices that will form part of the system data flow.

Virtex-4 DSP Design Solutions
The Virtex-4 family includes a beefed-up

  • System Generator for DSP allows you to model your design in The MathWorks Simulink® and, through powerful capabilities like hardware-in-the-loop, verify and debug that design from the same environment. System Generator also includes a new block that allows you to instantiate an XtremeDSP slice and configure it for one of its many operating modes.
  • Hardware-in-the-loop is supported for any Virtex-4 development environment with a JTAG header. Other new capabilities introduced in System Generator 6.3 include the ability to generate VHDL or Verilog™ netlists.
  • The Xilinx DSP library now supports Virtex-4 FPGAs, allowing you to develop designs faster.
  • A range of services is now available as you implement your DSP design onto Virtex-4 FPGAs. These include DSP design services, education classes, and platinum/technical support.

Conclusion
FPGA-based DSP has always been associated with high performance, in applications where rates of hundreds of GMACs/s are needed. Virtex-4 FPGAs usher in a revolutionary new era in the XtremeDSP initiative, giving you economic incentives to use FPGAs and get your design to market faster than ever before.

To understand how to use the new XtremeDSP slices in your next design, attend the Virtex-4 session in the DSP track at Programmable World 2004, or watch the demo-on-demand that will follow the event at www.xilinx.com/dsp/.
