Being the first to deliver a product in a
given market greatly increases its chance of
success. Effective use of chip resources
allows for less expensive parts and a less
expensive overall solution.
Synplify Pro software has many timing
closure features that increase your chances of
using slower speed-grade parts without compromising
on quality. The Synplify Pro tool
offers a perfect fit for the high-volume, low-cost
Spartan™ series of Xilinx® FPGAs.
Identifying Critical Areas
Usually, only a few sections or modules in
a design fail timing. It is often a good idea
to recognize these structures while coding
your design. During coding, the designer
usually knows what device the design is targeted
for but not the speed grade. Bumping
up to the next speed grade can add unbudgeted
costs to the project. In this article,
we’ll outline how to design with performance
and area in mind, along with a few
tricks-of-the-trade for reducing area.
You should know roughly how many
levels of logic the requirements will tolerate
before failing timing. Keep this in
mind while coding counters, state
machines, decoding, and data path logic.
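As a rough illustration of that budget, the sketch below divides the usable part of the clock period by an assumed delay per logic level. The function name and every delay figure here are hypothetical placeholders, not Spartan-3 data sheet values; substitute numbers from your own timing reports.

```python
import math


def max_logic_levels(clock_mhz, clk_to_q_ns, setup_ns, per_level_ns):
    """Estimate how many logic levels fit between two registers.

    per_level_ns is an assumed average delay for one LUT plus its routing.
    """
    period_ns = 1000.0 / clock_mhz
    # Time left for combinatorial logic after register overhead.
    budget_ns = period_ns - clk_to_q_ns - setup_ns
    if budget_ns <= 0:
        return 0
    return math.floor(budget_ns / per_level_ns)


# Hypothetical numbers: 100 MHz clock, 0.7 ns clock-to-out,
# 0.5 ns setup, 1.6 ns per level of logic and routing.
levels = max_logic_levels(100, 0.7, 0.5, 1.6)
print(levels)  # 5
```

With these assumed figures, a counter or decoder that needs more than five levels between registers is a candidate for restructuring or pipelining before it ever reaches place and route.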
When the code is nearing a first draft,
synthesize and place and route; this will
give you a glimpse of the problems ahead.
If you meet timing, then read further to
the area-saving section.
If you don’t meet timing, here is a
checklist you can go through:
- First and foremost, use the Synplify
or Synplify Pro products. Designers
are often unaware of the impact of
a quality synthesis tool. If you don’t
have Synplify or Synplify Pro and are
not meeting timing, or you want to
reduce area, you can download a full-featured
evaluation copy from the
Synplicity website for free.
- Is Synplify software reporting positive
slack in the log file (.srr) even though
place and route fails timing? If the Synplify
product’s estimates are incorrect (usually
because of excessive routing delays
caused by congestion), the logic optimizations
will be affected. The Synplify
solution has to work on the same critical
areas as reported by place and route. If
there are excessive routing delays, apply
the -route constraint to either the clock
or the path (see the online documentation
for more information).
- Do not use the global frequency box
on the front panel of the Synplify tool
unless you need to constrain combinatorial
paths. Specify all the clocks in
the constraints editor (SCOPE).
Ensure that unrelated clocks are put
into different clock groups.
- If you have clocks that are related and
in the same clock group, make sure the
periods have a common multiple. If
the clocks are slightly different, this
can cause very small clock-to-clock
setup delays, making timing impossible.
This can again have an effect on
logic optimizations. For example, if
you have two clocks, Clk1 at 20 MHz and
Clk2 at 23 MHz, change to Clk1 at 20 MHz
and Clk2 at 25 MHz.
- Are the I/O constraints correct?
Excessive input or output delay can
wreak havoc with synthesis. If the critical
path includes an I/O and one level of
logic, there is not much that synthesis
tools can do about it. Turn on I/O register
packing with the syn_useioff switch.
- Ensure that pipelining is enabled. Try a
run with retiming on. We have found
that this gives a roughly 5% performance
improvement for Spartan-3 devices. It is
also a very good idea to surgically apply
the retiming attribute to registers on the
critical path with the global switch off.
This can increase performance while
keeping the area increase to a minimum.
- Add all timing exceptions: false paths
and multi-cycle paths. Just adding these
constraints can make a huge difference
in performance.
- Does the critical path start or end inside
a black box or Coregen .edn/.ngc/.ngo?
Is it possible to re-code in generic code?
If not, add the .edn file to the Synplify
project. The Synplify tool will optimize
the logic around these cores. If you
have .ngc files, use ngc2edf.exe to convert
the core to .edn and add to the
Synplify project.
- Provide I/O types. Remember to
specify the speed and type for I/Os.
Synplify software needs this timing
information to optimize the
driving/driven logic.
- Are there gated clocks in the design?
Turn on the gated-clock converter in the
Synplify Pro tool. This engine will move
the gating logic off the clock line and
onto the register’s enable input while
maintaining the same functionality, so the
register is clocked directly. This will dramatically
increase the performance of this path.
- Turn off resource sharing.
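For the related-clocks item in the checklist above, a quick calculation shows why 20 MHz and 23 MHz are a poor pairing. The sketch below is only an illustration: it assumes integer frequencies in MHz, rising edges aligned at time zero, and data launched on the first clock and captured on the second. The function name is hypothetical, and this is not how the Synplify tool itself computes clock-to-clock requirements.

```python
from fractions import Fraction
from math import gcd


def min_edge_gap_ns(f1_mhz, f2_mhz):
    """Smallest launch-to-capture gap between two related clocks.

    Assumes integer frequencies in MHz and rising edges aligned at t = 0.
    """
    # The edge pattern repeats every 1/gcd(f1, f2) microseconds.
    common_mhz = gcd(f1_mhz, f2_mhz)
    n1 = f1_mhz // common_mhz  # clock 1 edges per repeat interval
    n2 = f2_mhz // common_mhz  # clock 2 edges per repeat interval
    launch = [Fraction(1000 * k, f1_mhz) for k in range(n1)]
    capture = [Fraction(1000 * m, f2_mhz) for m in range(1, n2 + 1)]
    # For each launch edge, take the next strictly later capture edge.
    return min(min(c - l for c in capture if c > l) for l in launch)


print(float(min_edge_gap_ns(20, 25)))  # 10.0 ns
print(float(min_edge_gap_ns(20, 23)))  # about 2.17 ns
```

Under these assumptions, the 20/25 MHz pair leaves 10 ns for the tightest transfer, while the 20/23 MHz pair leaves only about 2.17 ns, which is the kind of very small clock-to-clock setup requirement the checklist warns about. Swap the arguments to check transfers in the other direction.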
Are you now meeting timing? If not, you
may want to try Amplify FPGA physical
synthesis, which generates detailed placement
and performs physical optimizations
for additional performance and timing predictability.
How to Reduce Area
Here are a few ways to reduce area for
Spartan-3 FPGAs. These techniques can
also have an impact on timing.
- Use as many of the dedicated resources
as possible. Spartan-3 devices have
plenty of resources that you can tap
into to reduce LUT usage. Here are
some suggestions:
- Try to keep black boxes/Coregen to a
minimum. Wherever possible replace
these components with generic code.
Synplify software is capable of optimizing
around black boxes, but not the boxes
themselves. Significant effort has gone
into dedicated resource management for
each Xilinx architecture. With generic
code, the Synplify product can remove
redundant logic, merge identical logic,
and pack into the dedicated resources.
- Synplify software may pack logic into
registers for performance reasons. If this
is undesirable, either loosen the timing
constraint or force the logic into
RAMs/ROMs/multipliers. Please see
the Synplify online documentation for
more information.
- There are times when the Synplify tool
will not pack logic into the dedicated
resources. Common occurrences are:
- RAMs; either the address bus or the
data output must be registered
- Reset or preset type (synchronous versus
asynchronous) does not match what the
dedicated resource supports, causing failure to map
- Enable type (synchronous versus asynchronous)
does not match, causing failure to map (our
online help has examples)
- Timing constraints are too tight;
logic is mapped to registers for timing
reasons
- ROMs are less than half-populated;
use the syn_romstyle attribute
to force
- Clock tree management. Clock management
is extremely design-dependent,
so it is difficult to offer generic
advice. The idea is to pack each clock
quadrant of the chip with as much
logic as possible.
- The Synplify tool will build feedback
logic for registers without resets. This
extra MUX can add to the LUT
count. Reset every register in the
design if possible.
- If certain paths/modules are clocked by
the critical clock but are not critical
themselves, either supply a multicycle
path or give them another clock line off
the DCM. The ideal solution is to
have as little high-frequency/critical
logic on the chip as possible.
- Experiment with different state
machine implementations. Try a run
with the Synplify Pro product’s FSM
Explorer. This is a timing-driven state
machine engine that will increase runtime
but can often find a better FSM
solution for your particular design.
- Turn on resource sharing. Resource
sharing tells the tool to share logic such
as adders and multipliers whenever possible.
This may have an impact on timing,
but usually results in fewer
resources used.
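The trade-off behind resource sharing can be shown with a small behavioral model. The sketch below is an illustration in Python rather than HDL, and the function names are hypothetical: one version builds two adders and selects between their results, the other selects the operands first and uses a single adder.

```python
import itertools


def unshared(sel, a, b, c, d):
    """Two adders feeding one output multiplexer."""
    sum_ab = a + b
    sum_cd = c + d
    return sum_ab if sel else sum_cd


def shared(sel, a, b, c, d):
    """Two input multiplexers feeding a single shared adder."""
    left = a if sel else c
    right = b if sel else d
    return left + right


# Exhaustively compare both structures over small operand values.
equivalent = all(
    unshared(sel, a, b, c, d) == shared(sel, a, b, c, d)
    for sel, a, b, c, d in itertools.product((0, 1), *[range(8)] * 4)
)
print(equivalent)  # True
```

Both forms compute the same result, but the shared form needs only one adder. The cost is that the operands now pass through a multiplexer before reaching the adder, which lengthens the path and is why sharing is turned off when timing is the priority and turned on when area is.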
Conclusion
Spartan-3 devices are increasingly being
used in higher-volume applications. But
along with higher volumes comes the
requirement that part cost be as low as possible.
We hope these guidelines and hints
will help you quickly achieve the performance
required for your FPGAs while minimizing
the logic resources – and therefore
cost – of your next design project.
For more information, please contact your
local Synplicity sales office, which you can
find on our website, www.synplicity.com.
Printable PDF version of this article with graphics. (4/18/05) 150 KB