Xcell Journal Online

    

Enabling Low-Cost DSP Co-Processing with Spartan-3 FPGAs



by Suhel Dhanani, Senior Solutions Marketing Manager, Xilinx, Inc.
suhel.dhanani@xilinx.com

and

Steve Zack, Signal Processing Engineer, Xilinx, Inc.
steve.zack@xilinx.com (4/15/04)

Embedding high-performance DSP functions within FPGA fabric is now a genuine low-cost option.



FPGAs have been used in DSP applications for years as logic aggregators, bus bridges, and peripherals. More recently, FPGAs have gained considerable traction in high-performance DSP applications and have also emerged as ideal co-processors for standard DSP devices.

In these latter roles, FPGAs provide tremendous computational throughput by using highly parallel architectures. Because the hardware is re-configurable, you can develop customized architectures for ideal implementation of your algorithms.

The new generation of Spartan-3™ low-cost FPGAs, developed using 90 nm process technology, not only creates an effective way to implement high-performance DSP functions but provides an even more economical solution. Their low cost means that you can use them to implement high-performance DSP co-processing functions in conjunction with a conventional DSP device – typically integrating pre- and post-processing functions in a cost-effective manner.

Key Advantages
FPGA architectures are well suited for highly parallel implementations of DSP functions, allowing for very high performance. And user programmability allows you to trade off device area versus performance by selecting the appropriate level of parallelism to implement your functions.

FPGAs are essentially arrays of uncommitted logic and signal processing resources. These signal processing resources allow you to implement DSP functions using highly scalable, parallel processing techniques.

For example, whereas a traditional DSP solution would execute multiple multiply-accumulate (MAC) functions serially, an FPGA allows you to implement them in parallel using the dedicated multipliers and registers that are now available in the Spartan-3 family.

As another example, consider a 256-tap finite impulse response (FIR) filter. By using resources available in the FPGA fabric, you can design a highly parallel implementation and achieve higher performance (Figure 1). Because FPGAs are completely hardware-configurable, you have the flexibility to use only the resources that the algorithm requires.
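To make the link between FIR filters and MAC operations concrete, here is a minimal behavioral sketch in Python. It is not taken from the article and does not model any Xilinx core. The function name `fir_filter` and the list-based delay line are illustrative assumptions. The point is that each output sample is a sum of one multiply-accumulate per tap, which is the work an FPGA can spread across parallel multipliers.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR filter (behavioral model, illustrative only).

    Each output sample requires len(coeffs) multiply-accumulate (MAC)
    operations: one per tap.
    """
    taps = len(coeffs)
    delay_line = [0] * taps              # models the chain of sample registers
    outputs = []
    for x in samples:
        # Shift the newest sample in and drop the oldest one.
        delay_line = [x] + delay_line[:-1]
        acc = 0
        for c, d in zip(coeffs, delay_line):
            acc += c * d                 # one MAC per tap
        outputs.append(acc)
    return outputs
```

Feeding a unit impulse through the filter returns the coefficients in order, which is a quick sanity check on the model. In hardware, the inner loop is the part that can be unrolled across several multipliers.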

Figure 2 shows the different ways of implementing four MAC functions. By using four embedded multipliers within the FPGA fabric, you can complete these implementations at maximum speed. Alternatively, you can opt to conserve area and implement the same function at a lower performance by using only one multiplier, one accumulator, and a register, or use the semi-parallel approach.
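The area-versus-performance trade-off can be estimated with a few lines of arithmetic. The sketch below is an idealized model, not a description of any specific Spartan-3 implementation. It assumes every multiplier completes one MAC per clock cycle and ignores pipeline latency and control overhead. The function name `mac_schedule` is hypothetical, and the 150 MHz clock is the figure the article uses for the slowest speed grade.

```python
import math

def mac_schedule(num_macs, num_multipliers, clock_mhz):
    """Return (cycles per result, results per microsecond) for an idealized
    schedule in which each multiplier performs one MAC per clock cycle."""
    cycles = math.ceil(num_macs / num_multipliers)
    return cycles, clock_mhz / cycles

# Four MAC operations mapped onto 4, 2, or 1 multipliers at 150 MHz.
fully_parallel = mac_schedule(4, 4, 150.0)   # maximum speed, maximum area
semi_parallel = mac_schedule(4, 2, 150.0)    # middle ground
fully_serial = mac_schedule(4, 1, 150.0)     # minimum area, lowest speed
```

Under these assumptions, halving the number of multipliers doubles the cycles needed per result, which is the trade-off Figure 2 illustrates.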

Although FPGAs bring significant benefits to DSP, it is important to analyze the effective cost of implementing DSP functions within the FPGA fabric. For the purpose of this analysis, the new Spartan-3 FPGA family is considered because of its low cost and system features for DSP.

Spartan-3 Devices: Optimized for DSP
Spartan-3 FPGAs use 90 nm manufacturing technology to achieve low silicon die costs. These devices are also the only low-cost FPGAs that have all of the features required for efficiently implementing DSP functions – features that were once the exclusive domain of high-end FPGAs (Table 1).

Table 1 – These Spartan-3 features enable DSP functions in an area-efficient manner.

| Spartan-3 Silicon Features | Customer Benefits |
|---|---|
| Embedded 18 x 18 Multipliers | Area-efficient implementation of multiply function |
| Distributed RAM | Local storage for DSP coefficients, small FIFOs |
| Shift Register Logic | 16-bit shift register ideal for capturing high-speed or burst mode data and for storing data in DSP applications |
| Up to 104 18 Kb Block RAM | Video line buffers, cache tag memory, scratch-pad memory, packet buffers, large FIFOs |

With the Spartan-3 family, you can implement high-performance, complex DSP functions in a small portion of the total device, leaving the rest of the device free to implement system logic or interfacing functions – providing both lower costs and higher system integration.

Table 2 shows how advanced features and low device prices combine to deliver DSP capability at low cost. It lists a sampling of available Spartan-3 parts, their throughput in millions of multiply-accumulate operations per second (MMAC/s), and the cost per MMAC/s for each device.

Table 2 – Calculating the cost per MMAC/s

| Device | Embedded Multipliers (18 x 18) | MMAC/s (Number of Multipliers x 150 MHz) | Cost per MMAC/s |
|---|---|---|---|
| XC3S50 | 4 | 600 | $0.0055 |
| XC3S200 | 12 | 1,800 | $0.0024 |
| XC3S400 | 16 | 2,400 | $0.0030 |
| XC3S1000 | 24 | 3,600 | $0.0037 |
| XC3S1500 | 32 | 4,800 | $0.0044 |

We calculated the MMAC/s column by multiplying the number of multipliers by their operating frequency, which for Spartan-3 FPGAs is 150 MHz in the slowest speed grade.

Then, using the published 50,000-unit price for the slowest speed grade of each device, we calculated the cost per MMAC/s. This is one of the commonly quoted industry benchmarks, and here the cost per MMAC/s falls to roughly a quarter of a cent ($0.0024 for the XC3S200).
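The Table 2 arithmetic can be reproduced with a short Python sketch. The multiplier counts, the 150 MHz clock, and the cost-per-MMAC/s figures come from the table. The unit prices are not stated in the article, so the `implied_unit_price` helper below only back-calculates them from the table. Treat those values as rough estimates affected by rounding, not as published prices.

```python
CLOCK_MHZ = 150  # slowest speed grade, per the article

# device: (embedded 18 x 18 multipliers, cost per MMAC/s in USD) from Table 2
TABLE_2 = {
    "XC3S50": (4, 0.0055),
    "XC3S200": (12, 0.0024),
    "XC3S400": (16, 0.0030),
    "XC3S1000": (24, 0.0037),
    "XC3S1500": (32, 0.0044),
}

def mmacs_per_second(num_multipliers, clock_mhz=CLOCK_MHZ):
    """Peak throughput in MMAC/s: one MAC per multiplier per clock cycle."""
    return num_multipliers * clock_mhz

def implied_unit_price(num_multipliers, cost_per_mmac):
    """Back-calculated unit price (an estimate, not a published figure)."""
    return mmacs_per_second(num_multipliers) * cost_per_mmac

# Device with the lowest cost per MMAC/s in the table.
cheapest = min(TABLE_2, key=lambda name: TABLE_2[name][1])
```

With the table values above, `cheapest` evaluates to the XC3S200, the device behind the quarter-of-a-cent figure.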

How to Achieve the Lowest DSP Function Cost
No standard currently exists for estimating the actual cost of implementing DSP functions in FPGAs. For the purposes of this analysis, however, let’s assume that the effective cost is the percentage of silicon area utilized multiplied by the unit device cost. This is a fair calculation, since the remainder of the FPGA is available for other system functions.

To calculate the effective cost of a DSP function when implemented in an FPGA, we considered the Spartan-3 XC3S1000 device, which is a mid-range member of the Spartan-3 family. In many cases, a given DSP function uses not only the FPGA logic but also embedded multipliers and block RAMs. In that case, we included the estimated amount of die space taken by these embedded functions and added that to the die area used by the logic.

Table 3 shows some of these functions and the cost of implementing these within the Spartan-3 silicon. (We have not included the cost for programming the PROM, because in many cases you can use the existing EPROM on-board to program the FPGA.)

Table 3 – Effective costs of various DSP functions in a Spartan-3 device

| Function | % of the XC3S1000 Device Utilized | Effective Cost (50K Units) | Key Specification | Other Specifications |
|---|---|---|---|---|
| 1024-point complex FFT | 24.1% | $3.23 | 20 µs transform | 20 µs transform, burst I/O, 16-bit input and phase factor |
| Single channel 64-tap FIR filter | 3.0% | $0.41 | 8.1 MSPS | 16-bit data and coefficient, MAC implementation, 8.1 MSPS |
| Digital down converter per channel | 18.6% | $2.49 | Sample rate 100 MSPS | |
| Digital up converter per channel | 18.6% | $2.49 | Sample rate 100 MSPS | |
| Viterbi decoder | 37.8% | $5.06 | 1.9 MSPS per channel | Parallel mode, trace-back 42, constraint length = 7, 32-channel, 1.9 MSPS per channel |
| Reed Solomon G.709 encoder | 1.3% | $0.17 | 120 MHz | |
| Reed Solomon G.709 decoder | 6.9% | $0.92 | 60 MHz | |
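The effective-cost rule (percentage of die area used multiplied by the unit device cost) is easy to check against Table 3. The article does not state the XC3S1000 unit price. The value of about $13.39 used below is an assumption back-calculated from the table rows, so the sketch is only a consistency check, not a pricing reference. The names `ASSUMED_UNIT_PRICE_USD` and `effective_cost` are illustrative.

```python
# Assumption: back-calculated from Table 3 (for example $3.23 / 24.1%),
# not a price published in the article.
ASSUMED_UNIT_PRICE_USD = 13.39

def effective_cost(utilization_percent, unit_price=ASSUMED_UNIT_PRICE_USD):
    """Effective cost = fraction of the device used x unit device cost."""
    return round(utilization_percent / 100 * unit_price, 2)

fft_cost = effective_cost(24.1)       # 1024-point complex FFT
ddc_cost = effective_cost(18.6)       # digital down converter, per channel
viterbi_cost = effective_cost(37.8)   # 32-channel Viterbi decoder
viterbi_per_channel = round(viterbi_cost / 32, 2)
```

Most rows reproduce to the cent under this assumed price. The FIR filter row comes out one cent lower than the table, presumably because the utilization figure is rounded to 3.0%.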

Some of the most common functions used in DSP applications are fast Fourier transforms (FFTs) and FIR filters. A single-channel, 64-tap MAC FIR filter running at 8.1 megasamples per second (MSPS) can be implemented for an effective cost of $0.41. Note that this filter uses 200 logic slices and four embedded multipliers – approximately 3% of the die area.

You can also implement simple forward error correction DSP cores such as Viterbi and Reed Solomon functions at a low cost within the Spartan-3 device. A 32-channel, parallel mode Viterbi decoder running at 1.9 MSPS per channel has an effective cost of $5.06, or $0.16 per channel. A Reed Solomon G.709 decoder function running at 60 MHz takes only 6.9% of the same device (with an effective cost of $0.92).

Complex functions such as a digital down converter (DDC) or a digital up converter (DUC) – commonly used in wireless base stations – take less than 20% of the Spartan-3 XC3S1000 device (with an effective cost of $2.49).

Development Tool Flow
With Xilinx, you can use industry standard development tools for your DSP designs. Using MATLAB™ and Simulink™ from The MathWorks, coupled with Xilinx System Generator for DSP, you can now model, simulate, and verify your signal processing algorithms on your target hardware platform without leaving the Simulink environment.

The design flow typically involves the following steps:

  1. A DSP designer develops and verifies the hardware model using industry-standard tools from The MathWorks in conjunction with Xilinx System Generator for DSP.
  2. With a push of a button, Xilinx System Generator generates an HDL circuit representation that is bit- and cycle-true, meaning that the behavior is guaranteed to match the functionality seen in the Simulink/System Generator model.
  3. The ISE design tools synthesize the design and produce a bitstream that can be used to program the FPGA.

The error-prone and time-consuming step of having an FPGA designer translate the system engineer’s design into HDL is thus eliminated. Figure 3 shows a typical design flow using the Xilinx System Generator. With recent advances in this product, DSP designers can now generate an FPGA bitstream directly using Simulink/System Generator.
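To illustrate what a bit-true model means in practice, the following Python sketch performs a MAC in pure integer arithmetic, so a software model and a hardware implementation using the same word lengths would produce identical bits. This is not System Generator code and does not reflect its API. The signed 16-bit Q15 format and the helper names `to_q15` and `mac_q15` are assumptions chosen for illustration. The 16-bit width echoes the data widths listed in Table 3.

```python
Q15_SCALE = 1 << 15  # 15 fractional bits in a signed 16-bit word (assumed format)

def to_q15(value):
    """Quantize a float in [-1, 1) to a signed 16-bit Q15 integer, saturating."""
    scaled = int(round(value * Q15_SCALE))
    return max(-Q15_SCALE, min(Q15_SCALE - 1, scaled))

def mac_q15(coeffs_q15, samples_q15):
    """Integer multiply-accumulate. The accumulator holds 30 fractional bits
    and is never rounded, so the result is reproducible bit for bit."""
    acc = 0
    for c, x in zip(coeffs_q15, samples_q15):
        acc += c * x
    return acc

def acc_to_float(acc):
    """Convert an accumulator with 30 fractional bits back to a float."""
    return acc / (Q15_SCALE * Q15_SCALE)
```

For example, `mac_q15([to_q15(0.5)], [to_q15(0.5)])` gives an accumulator that `acc_to_float` converts to exactly 0.25. Because every intermediate value is an integer, there is no floating-point rounding to cause a mismatch between simulation and hardware.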

Conclusion
With their combination of low unit cost and an architecture optimized for DSP functions, Spartan-3 FPGAs have the industry’s lowest price points for high-performance DSP functions. Xilinx further enables embedded DSP functions by providing design tools that fit within your tool flow and enhance your productivity by automating the FPGA implementation process.

With the availability of Spartan-3 devices, associated design tools, and an increasing number of off-the-shelf DSP functions optimized for this fabric, embedding DSP functions within Spartan-3 FPGAs is a viable option that you should evaluate.

For more information, visit www.xilinx.com/spartan3/, www.xilinx.com/dsp/, and www.xilinx.com/ipcenter/.
