As system complexities advance, programmable
logic follows suit. High-density
FPGAs now contain millions of gates and
operate at speeds in excess of 200 MHz. At
this level, schedules, budgets, and FPGA
design tools all begin to feel the burden.
You can incrementally increase the performance
or reduce the area of designs in Xilinx devices
using the Synplicity® Synplify Pro® tool in
several ways. In this article, we’ll describe
four preferred ways to set up your design and
four ways to fine-tune synthesis, all of which
can be used together or independently.
Design Setup to Improve Timing or Area
Setting up your design correctly can
result in huge performance increases or
reductions in chip area. The following
checklist describes the best practices you
can use when setting up your design.
Include CORE Generator EDIFs or Timing Models for Black Boxes
If Xilinx CORE Generator™ EDIF files
(*.edn) or black box timing models are
provided during synthesis, the Synplify
Pro tool knows the path timing and can
alter the logic surrounding the boxes based
on the timing constraints. If the design’s
critical path starts or ends in a black box,
adding the EDN file usually results in better
performance.
To demonstrate this point, we took an
open-source design that included a black-box
FIFO, with the critical path ending
inside the FIFO. Without the CORE
Generator EDN file, the post-PAR (place
and route) results yielded an Fmax of
153 MHz. When we added the EDN file to
the synthesis process, Fmax rose to 171
MHz, an improvement of about 12%, because
of the additional path optimization
performed by Synplify Pro synthesis.
Provide Accurate Clock Constraints
Under- or over-constraining your design
results in reduced performance. Do not
over-constrain the clocks by more than
15%. For maximum performance, make
sure that there is 10% negative slack on the
critical clock. This ensures that critical
paths are squeezed (see the Route
Constraint section for more information).
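The percentages above can be checked with simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that over-constraint is measured on frequency relative to your real target and that slack is expressed as a fraction of the requested period. The article does not define either measure precisely.

```python
def over_constraint_pct(target_mhz: float, requested_mhz: float) -> float:
    """How far the requested clock exceeds the real target, in percent."""
    return (requested_mhz / target_mhz - 1.0) * 100.0


def slack_pct(requested_mhz: float, estimated_mhz: float) -> float:
    """Slack as a percentage of the requested period (negative = failing)."""
    requested_period_ns = 1000.0 / requested_mhz
    estimated_period_ns = 1000.0 / estimated_mhz
    return (requested_period_ns - estimated_period_ns) / requested_period_ns * 100.0


if __name__ == "__main__":
    # Assumed numbers: real target 200 MHz, constraint entered as 220 MHz,
    # synthesis estimates 200 MHz for the critical clock.
    print(f"over-constraint: {over_constraint_pct(200.0, 220.0):.1f}%")
    print(f"slack: {slack_pct(220.0, 200.0):.1f}% of the requested period")
```

With these assumed numbers, the constraint sits inside the 15% limit and the critical clock shows roughly 10% negative slack.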
The Fmax field on the front panel of the
Synplify Pro software is fine for a quick
run, but do not use it if you need maximum
performance. Instead, constrain the clocks in
the Synplify Pro synthesis design constraints
file (*.sdc) and put unrelated clocks in
separate clock groups. If your clocks are in
the same group, the Synplify Pro tool works
out the worst-case setup time for the
clock-to-clock paths.
Figure 1 shows a timing diagram for two
clocks that are in the same clock group.
Synplify Pro rolls the clocks forward until
they match up again. The tool then calculates
the minimum setup time between the
clocks – in this case 10 ns.
If the clock periods have no simple ratio,
several hundred clock periods may be required
before the clocks match up again. This will
probably result in the worst-case setup time
being very small, such as 100 ps. You can
check the setup time in the clock relationships
table in the log file. If the setup time
is too short, it is best to re-constrain the
clocks so that their periods are more closely related.
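The roll-forward idea can be illustrated with a small Python sketch. This is a simplified model, not Synplify Pro's actual algorithm: it assumes two rising-edge clocks that both start at time zero, works in integer picoseconds, and uses a hypothetical function name.

```python
from math import gcd


def min_setup_ps(launch_period_ps: int, capture_period_ps: int) -> int:
    """Smallest launch-to-capture edge gap over one full repeat of both clocks."""
    # Point at which both clocks line up again (least common multiple).
    repeat_ps = launch_period_ps * capture_period_ps // gcd(
        launch_period_ps, capture_period_ps
    )
    best = repeat_ps
    for launch_edge in range(0, repeat_ps, launch_period_ps):
        # First capture edge strictly after this launch edge.
        capture_edge = (launch_edge // capture_period_ps + 1) * capture_period_ps
        best = min(best, capture_edge - launch_edge)
    return best


if __name__ == "__main__":
    # Assumed example periods: 20 ns and 30 ns, then 10 ns and 10.1 ns.
    print(min_setup_ps(20_000, 30_000), "ps")
    print(min_setup_ps(10_000, 10_100), "ps")
```

In the second call, the two assumed periods differ by only 1%, so the clocks need 101 periods of the faster clock to line up again and the tightest edge-to-edge gap shrinks to 100 ps.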
Specify Timing Exceptions
You should provide all timing exceptions,
such as false and multicycle paths, to the
Synplify Pro tool. With this information,
the tool can ignore these paths and concentrate
on the actual critical paths.
As an example, in the Synplify Pro 7.3.3
tool we have enabled timing-driven
3-state-to-MUX conversion. If a 3-state path is
critical, the Synplify Pro tool automatically
converts the logic to multiplexers, thus
speeding up the path. Data on buses is usually
not critical and can survive a few clock
cycles because the bus master can wait for
the data to become valid. In these situations,
applying a multicycle constraint to
the 3-state path causes the Synplify Pro
tool to keep the TBUFs, thus saving area.
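To see why a multicycle constraint takes a bus path off the critical list, consider the following Python sketch. It is a deliberately simplified model (it ignores register setup time and clock skew), and the function name and the numbers are assumptions chosen for illustration.

```python
def path_slack_ns(clock_period_ns: float, path_delay_ns: float, cycles: int = 1) -> float:
    """Setup slack for a path that is allowed `cycles` clock periods to settle."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return cycles * clock_period_ns - path_delay_ns


if __name__ == "__main__":
    # Assumed example: 10 ns clock, 14 ns delay through the 3-state bus.
    print("single cycle:", path_slack_ns(10.0, 14.0), "ns")
    print("two cycles:  ", path_slack_ns(10.0, 14.0, cycles=2), "ns")
```

Under the single-cycle requirement the path fails and would be treated as critical; with two cycles allowed it has positive slack, so there is no timing reason to convert the TBUFs to multiplexers.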
Constrain I/Os
If your design has I/O timing constraints,
it is likely that the critical path is through
the I/O buffer. The Synplify Pro tool recognizes
these paths as the most critical
and tries to optimize them. However, I/O
paths cannot usually be physically optimized
further. Therefore, the Synplify Pro
tool prematurely stops optimizing the rest
of the design.
The Synplify Pro 7.3 release added a new
switch called “Use clock period for
unconstrained I/O.” When this switch is
disabled, the tool does not include any
unconstrained I/O paths in timing optimizations,
allowing the optimization process to
continue on the rest of the design.
Fine-Tuning Designs to Improve Timing or Area
After setting up a design using the methods
previously described, you can use additional
options after synthesis to improve
design performance or area utilization.
Following these guidelines will usually save
a device size or a speed grade, and in many
cases, both.
The following optimization techniques
are design-dependent. Not all designs benefit
from enabling these features. The best
method is to analyze the implementation
of your design and see if the following optimizations
improve performance.
Standard Optimization Techniques
Retiming and Pipelining
Enabling the retiming and pipelining
options can improve your design performance
by as much as 50%. Retiming attributes
such as syn_allow_retiming let you
refine your constraints by applying retiming
to a single register.
Resource Sharing
With this option enabled, the software
shares hardware resources, thus decreasing
the area. If you disable this option,
hardware resources are not shared, which
will probably increase the area but yield
higher performance.
FSM Compiler
This option extracts and optimizes finite
state machines (FSMs) based on the number
of states. As a rule, we find the following
guidelines improve performance:
| Number of States | Suggested Encoding Scheme |
| 2-4 | Sequential |
| 5-40 | Onehot |
| Over 40 | Gray |
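The guideline table can be expressed as a small lookup, together with the state codes each scheme produces. This Python sketch is illustrative only: the function names are hypothetical, and the code generation shows the textbook form of each encoding rather than the exact codes the FSM Compiler emits.

```python
from typing import List


def suggested_encoding(num_states: int) -> str:
    """Encoding scheme suggested by the guideline table above."""
    if num_states < 2:
        raise ValueError("a state machine needs at least 2 states")
    if num_states <= 4:
        return "sequential"
    if num_states <= 40:
        return "onehot"
    return "gray"


def state_codes(num_states: int, encoding: str) -> List[str]:
    """Binary code strings for each state under the given encoding."""
    if encoding == "onehot":
        # One flip-flop per state, exactly one bit set.
        return [format(1 << i, f"0{num_states}b") for i in range(num_states)]
    width = max(1, (num_states - 1).bit_length())
    if encoding == "sequential":
        values = list(range(num_states))
    elif encoding == "gray":
        # Reflected binary Gray code: consecutive codes differ in one bit.
        values = [i ^ (i >> 1) for i in range(num_states)]
    else:
        raise ValueError(f"unknown encoding: {encoding}")
    return [format(v, f"0{width}b") for v in values]


if __name__ == "__main__":
    for n in (3, 12, 64):
        print(n, "states ->", suggested_encoding(n))
    print(state_codes(4, "gray"))
```

One-hot uses one register per state but keeps the next-state logic shallow, which is why it tends to suit mid-sized machines; sequential and Gray use far fewer registers.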
FSM Explorer
If the previous methods of encoding do not
produce the desired result, you can use
timing-driven state encoding. The Synplify Pro
tool automatically selects the best encoding
for the specified timing constraints.
Resource Allocation
The use of dedicated macro blocks in
Xilinx devices usually provides the best synthesis
solution, but this is not always the
case. A well-pipelined multiplier in logic
can often provide a faster (but larger) solution.
You can configure macro blocks within
the Synplify Pro tool based on the design
requirements. You can also force the tool to
use a specific resource implementation by
adding any of the following attributes:
| Macro Block | Attribute (Options) |
| Multiplier | syn_multstyle (logic, block_mult) |
| RAM | syn_ramstyle (registers, select_ram, block_ram, no_rw_check) |
| ROM | syn_romstyle (logic, select_rom, block_rom) |
| Shift Register | syn_srlstyle (registers, select_srl, noextractff_srl) |
These attributes are extremely design-dependent.
Optimization Controls
The Synplify Pro tool provides user constraints
to let you shape and control logic
according to your design requirements.
The following attributes and directives are
the most commonly used.
- syn_keep (in the source code).
Preserves an RTL net throughout
synthesis and prevents LUT packing
and replication. It is also useful for
timing exceptions because you can
apply a -thru constraint to it.
- syn_preserve. Disables sequential optimizations
on registers, preventing
removal, merging, inverter push-through,
and FSM extraction.
- syn_replicate (in the constraint file).
Prevents replication of registers.
- syn_maxfan (in the constraint file).
Controls the maximum fanout limit,
triggering register replication and
buffering. This control is a hard limit
on modules and instances but a soft
limit when set globally.
- syn_direct_enable (in the constraint
file). Forces a connection to the enable
pin of the register; additional logic is
moved to the D input path.
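As a rough feel for what a fanout limit implies, the following Python sketch estimates how many copies of a driver are needed to keep every copy at or below a given limit. It is a simplified model with a hypothetical function name; it assumes loads are split evenly and ignores buffering, so the tool's real replication decisions may differ.

```python
def copies_needed(fanout: int, max_fanout: int) -> int:
    """Minimum number of driver copies so that no copy exceeds max_fanout loads."""
    if fanout < 0 or max_fanout < 1:
        raise ValueError("fanout must be >= 0 and max_fanout must be >= 1")
    # Ceiling division; the original driver always counts as one copy.
    return max(1, -(-fanout // max_fanout))


if __name__ == "__main__":
    # Assumed example: a register driving 100 loads with a limit of 32.
    print(copies_needed(100, 32), "copies")
```

A tighter limit shortens each net but costs extra registers, which is the area-versus-speed trade-off that syn_maxfan lets you steer.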
Route Constraint
The -route constraint is probably the most
important but least known timing constraint.
It can provide a 10% performance
improvement with minimal effort, as well
as drastically reduce area.
The -route constraint adds a specified
delay to Synplify Pro’s routing estimates. A
positive value adds to the routing delay
estimate and increases criticality. A negative
value reduces the routing delay estimate
and decreases criticality.
If the Synplify Pro timing estimate is
different from the PAR value, the difference
will prevent the Synplify Pro tool
from optimizing the actual critical paths.
The -route switch allows you to align synthesis
estimates with the PAR delays.
Aligning the routing delays almost always
creates significantly better results.
Use the -route constraint to perform two
functions:
- To make synthesis see the same critical
path as PAR
- To make synthesis estimate the same
slack as PAR.
If many paths fail PAR timing, apply
-route to the clock. If there are only a few
paths failing PAR timing, apply -route to
just those paths.
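A starting value for -route can be estimated from the two slack reports. The Python sketch below is an assumption-laden illustration: the function name is hypothetical, and it simply takes the difference between the synthesis slack and the PAR slack for the same clock or path, following the sign convention described above (positive adds routing delay).

```python
def route_adjustment_ns(synthesis_slack_ns: float, par_slack_ns: float) -> float:
    """Delay to add to the synthesis routing estimate so its slack matches PAR."""
    return synthesis_slack_ns - par_slack_ns


if __name__ == "__main__":
    # Assumed example: synthesis reports +0.5 ns slack, PAR reports -0.75 ns.
    print(route_adjustment_ns(0.5, -0.75), "ns")
```

A positive result means synthesis was optimistic and the path should be made more critical; a negative result means synthesis was pessimistic. Treat the number as a first guess and re-check both reports after re-running synthesis and PAR.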
Conclusion
By setting up your design correctly and
using the features and constraints described in
this article, you can meet and often surpass
your performance goals. We found that on 50% of the
designs, using the following settings
increased Fmax by more than 25%:
- Add the CORE Generator EDIF
files to synthesis
- Apply the -route constraint to
paths or clocks
- Turn resource sharing off
- Turn pipelining/retiming on
- Turn “Use clock period for
unconstrained I/O” off.
For additional information about the
Synplify Pro synthesis tool, contact
Synplicity at (408) 215-6000, or visit www.synplicity.com.
Printable PDF version of this article with graphics. (4/15/04) 335 KB