Xcell Journal Online
Using Spartan-3/3E Features to Area-Optimize Your Design



by Suhel Dhanani, Senior Marketing Manager, Spartan Products, Xilinx, Inc.
suhel.dhanani@xilinx.com
and
Ken Chapman, Staff Engineer, General Products Division, Xilinx, Inc.
ken.chapman@xilinx.com (4/18/05)


Spartan-3/3E devices include features that were typically the domain of only high-end FPGAs – distributed memory, embedded multipliers, and others that can yield significant optimization.


The Xilinx® Spartan™-3 series is the first low-cost FPGA that does not compromise on features – it offers fast embedded multipliers, look-up tables that can be optimally configured to form shift registers, distributed memory, and large amounts of embedded block RAMs.

All of these features let you design for the lowest possible area, lowering the size and thereby the cost of the FPGA in your design.

With more than 100 million units shipped, Spartan devices have quickly become the world’s most widely accepted low-cost FPGA architecture, familiar to thousands of engineers. With every generation of the Spartan architecture, Xilinx has rolled out even more logic and I/O at a lower price. With the release of the Spartan-3 series we did the same, but also added embedded features that let you integrate even more functionality in your favorite Spartan device. Table 1 shows the advantages of these features.

Features That Optimize Area
The Spartan-3/3E architecture is rich in memory. Not only does this architecture have embedded 18 Kb block RAMs, but half of the look-up tables (LUTs) can be configured as memory (Figure 1).

Table 1 – Features included in the Spartan-3/3E family of devices
Spartan-3/3E Features To Reduce Area Utilization
Feature | How it reduces area
Embedded multipliers | Area-efficient implementation of various DSP functions
Up to eight digital clock managers (DCMs) | Clock management functions such as clock multiplication/division and phase alignment can be integrated within the FPGA
Distributed RAM | Small FIFOs, LIFO buffers, scratch-pad memory, register banks
18 Kb block RAM | Large FIFOs, buffers, cache memory, program storage for processors
LUT configured as 16-bit shift register (SRL16E) | FIFOs, delay lines, state machines, logic-register tradeoff in register-intensive designs
Differential signaling support (LVDS, RSDS) | Lower number of pins, reduced power consumption, reduced EMI, high noise immunity

The advantage of having distributed memory is that you can very efficiently implement any design that requires a large number of smaller size memory structures without running out of registers or block RAMs, offering the potential for unrivaled data bandwidth. A good example of this is the 8-bit PicoBlaze™ microcontroller, which takes only 192 logic cells (LC) within the Spartan-3/3E architecture. As shown in Figure 2, this is possible only because the register stack, scratchpad memory, program counter stack, and program ROM are constructed using the FPGA memory resources.

A 4-input LUT in a slice-M can also be configured in the SRL16E mode. This provides a 16-bit shift register with clock enable in addition to the dedicated flip-flop, as shown in Figure 3.

When writing to this shift register, the new data is always placed in location “0” and all other data moves along by one location. However, data can be read from any location using the address inputs A[3:0].
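The write and read behavior described above can be sketched as a small behavioral model. This is only an illustration in Python, not HDL and not a Xilinx primitive: the class name `SRL16E` is borrowed from the article, while the method names `clock` and `read` are assumptions made for this sketch.

```python
class SRL16E:
    """Behavioral sketch of a 16-bit addressable shift register with clock enable."""

    DEPTH = 16

    def __init__(self):
        # Location 0 always holds the most recently written bit.
        self.bits = [0] * self.DEPTH

    def clock(self, d, ce=1):
        """Model one rising clock edge; the register shifts only when ce is high."""
        if ce:
            # New data enters at location 0; all other data moves along by one.
            self.bits = [d & 1] + self.bits[:-1]

    def read(self, addr):
        """Read any location using the 4-bit address A[3:0]."""
        return self.bits[addr & 0xF]


srl = SRL16E()
for bit in (1, 0, 1, 1):
    srl.clock(bit)

# Location 0 is the newest bit, location 3 the oldest of the four written.
snapshot = [srl.read(a) for a in range(4)]
print(snapshot)
```

Reading never disturbs the contents, which is what lets the same LUT serve as a delay line or as the storage element of a FIFO.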

The SRL16E provides a very cost-efficient way to implement a delay line. For example, Figure 4 shows how one LUT can be used to generate a 5-cycle delay for the data. Simply apply the data to the SRL16E data input, hard-code the address pins at “0100,” and read the data out from that address. Because location 0 already represents a delay of one cycle, address 4 (“0100”) selects the fifth stage. Synthesis tools can implement this technique automatically for anything up to a 16-cycle delay in each LUT.
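The delay-line arithmetic can be checked with a short cycle-by-cycle sketch. The function name `srl16_delay` is hypothetical, and integers stand in for the data only so that each sample is easy to follow; a real SRL16E stores one bit per location, so one LUT is needed per bit of the data path.

```python
def srl16_delay(samples, addr=0b0100):
    """Return the value seen at the output in each clock cycle.

    In every cycle the output is taken from the hard-coded address first,
    then the rising edge at the end of the cycle shifts the new sample
    into location 0.
    """
    stages = [0] * 16
    outputs = []
    for d in samples:
        outputs.append(stages[addr])   # value at the selected location this cycle
        stages = [d] + stages[:-1]     # clock edge: new data enters location 0
    return outputs


data = list(range(1, 11))              # samples 1..10, one per clock cycle
delayed = srl16_delay(data)            # address "0100" selects the fifth stage
print(delayed)
```

With the address fixed at 4, each sample appears at the output five cycles after it was applied; in this model an address of `n` always yields a delay of `n + 1` cycles.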

The FIFO is a common component of many system designs. The normal way to construct a FIFO is by using a memory (Figure 5). Because of the need to write at the same time as reading, dual-port memory is required. Two address counters are also needed to point at the write address and the read address. Additional comparator logic is then required to determine the state of the FIFO: full, empty, or half-full.
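As a rough illustration of this conventional structure, the sketch below models a memory-based FIFO with separate write and read address counters and comparison logic for the status flags. The class name `MemoryFifo` and the choice to give each counter one extra wrap bit (so that full and empty can be told apart) are assumptions made for this example, not details from the article.

```python
class MemoryFifo:
    """Sketch of a FIFO built from a dual-port memory and two address counters."""

    def __init__(self, depth=16):
        self.depth = depth
        self.mem = [None] * depth
        # Each counter carries one extra wrap bit (assumption of this sketch).
        self.wr_ptr = 0
        self.rd_ptr = 0

    def _level(self):
        # Comparator logic: the distance between the two counters.
        return (self.wr_ptr - self.rd_ptr) % (2 * self.depth)

    @property
    def empty(self):
        return self._level() == 0

    @property
    def full(self):
        return self._level() == self.depth

    @property
    def half_full(self):
        return self._level() >= self.depth // 2

    def write(self, value):
        if self.full:
            raise OverflowError("FIFO full")
        self.mem[self.wr_ptr % self.depth] = value       # write port
        self.wr_ptr = (self.wr_ptr + 1) % (2 * self.depth)

    def read(self):
        if self.empty:
            raise IndexError("FIFO empty")
        value = self.mem[self.rd_ptr % self.depth]        # read port
        self.rd_ptr = (self.rd_ptr + 1) % (2 * self.depth)
        return value


fifo = MemoryFifo(depth=4)
for word in (1, 2, 3, 4):
    fifo.write(word)
print(fifo.full, fifo.read(), fifo.half_full)
```

The model makes the cost visible: two counters, a memory with independent write and read ports, and extra comparison logic for every status flag.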

As Figure 6 shows, a FIFO can also be implemented using one SRL16E for each bit of the data path width (Din). The SRL16E gives dual-port operation in half the space of a true dual-port memory. We also need only one up/down counter instead of two up counters, which again takes half the space. As a final bonus, the counter value tells you precisely how many words are stored in the FIFO, with the most significant bit naturally providing a most useful “half-full flag.”
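The SRL16E-based FIFO can be sketched in the same behavioral style. The class name `SrlFifo` is hypothetical, whole words stand in for a bank of one SRL16E per data bit, and the capacity is limited to 15 words so that a 4-bit counter is enough; that limit is an assumption of this sketch, not something the article states.

```python
class SrlFifo:
    """Sketch of a FIFO built from shift registers and one up/down counter."""

    DEPTH = 16
    CAPACITY = 15  # assumption: keeps the word count within 4 bits

    def __init__(self):
        self.stages = [0] * self.DEPTH  # location 0 holds the newest word
        self.count = 0                  # up/down counter: words currently stored

    @property
    def half_full(self):
        # Most significant bit of the 4-bit counter.
        return bool((self.count >> 3) & 1)

    def clock(self, din=None, write=False, read=False):
        """Model one clock edge; returns the word read out, or None."""
        dout = None
        if read:
            if self.count == 0:
                raise IndexError("FIFO empty")
            dout = self.stages[self.count - 1]   # oldest word sits at count - 1
        if write:
            if self.count - int(read) >= self.CAPACITY:
                raise OverflowError("FIFO full")
            self.stages = [din] + self.stages[:-1]
        # Write only: count up. Read only: count down. Both: unchanged.
        self.count += int(write) - int(read)
        return dout


fifo = SrlFifo()
for word in (10, 20, 30):
    fifo.clock(din=word, write=True)
first = fifo.clock(read=True)
second = fifo.clock(din=40, write=True, read=True)
print(first, second, fifo.count)
```

Note that a read never moves any data; it only decrements the counter, which doubles as the read address and as the fill level.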

The embedded multipliers available in the Spartan-3 fabric are useful in a variety of DSP applications. Each multiplier can replace as many as 450 logic cells that would normally be required to implement a multiply function.

Many implementations of finite impulse response (FIR) filters – in base stations, digital video systems, wireless LANs, xDSL, and cable modems – use multiply-accumulate (MAC) functions. You can implement high-performance FIR filters that use multiple MACs with minimal area penalty within the Spartan-3 fabric. Such MAC-intensive functions would ordinarily take significant logic resources in competing FPGAs that lack embedded multipliers.
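To show why FIR filters are MAC-intensive, here is a minimal software sketch of a direct-form FIR filter in which every output sample costs one multiply-accumulate per tap. The function name `fir_mac` and the integer coefficients are illustrative assumptions; the sketch says nothing about how the taps would be mapped onto the embedded multipliers.

```python
def fir_mac(samples, coeffs):
    """Direct-form FIR filter: one multiply-accumulate per tap per output."""
    taps = [0] * len(coeffs)          # delay line holding the latest inputs
    outputs = []
    for x in samples:
        taps = [x] + taps[:-1]        # shift the new sample into the delay line
        acc = 0
        for c, t in zip(coeffs, taps):
            acc += c * t              # the multiply-accumulate (MAC) step
        outputs.append(acc)
    return outputs


# An impulse at the input reproduces the coefficients at the output.
impulse_response = fir_mac([1, 0, 0, 0], [1, 2, 3])
print(impulse_response)
```

A filter with N taps needs N multiplications per output sample, which is why a dedicated multiplier per tap (or per group of taps) saves so much general-purpose logic.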

Conclusion
Spartan-3/3E features were designed to save costs by allowing the design to be implemented using the smallest (and thereby the lowest cost) silicon. Spartan devices utilize the world’s most advanced manufacturing process and incorporate features that allow you to integrate even more functionality in the smallest FPGA.


 
© 1994-2008 Xilinx, Inc. All Rights Reserved.