Xcell Journal Online
Tips for Improving Synplify Pro Performance for FPGA Designs



by Steve Pereira, Technical Marketing Manager, Synplicity, Inc.
stevep@synplicity.com (4/15/04)

Using simple setup and optimization techniques, the Synplify Pro synthesis tool helps you increase design speed and reduce chip area.



As system complexities advance, programmable logic follows suit. High-density FPGAs now contain millions of gates and operate at speeds in excess of 200 MHz. At this level, schedules, budgets, and FPGA design tools all begin to feel the burden.

You can incrementally increase performance or reduce the area of Xilinx devices using the Synplicity® Synplify Pro® tool in several ways. In this article, we’ll describe four preferred ways to set up your design and four ways to fine-tune synthesis, all of which can be used together or independently.

Design Setup to Improve Timing or Area
Setting up your design correctly can result in huge performance increases or reductions in chip area. The following checklist describes the best practices you can use when setting up your design.

Include CORE Generator EDIFs or Timing Models for Black Boxes
If Xilinx CORE Generator™ EDIF files (*.edn) or black box timing models are provided during synthesis, the Synplify Pro tool knows the path timing and can alter the logic surrounding the boxes based on the timing constraints. If the design’s critical path starts or ends in a black box, adding the EDN file usually results in better performance.

To demonstrate this point, we took an open-source design that included a black-box FIFO, with the critical path ending inside the FIFO. Without the CORE Generator EDN file added to the Synplify Pro tool, the post-PAR (place and route) results yielded an Fmax of 153 MHz. When we added the CORE Generator EDN file to the synthesis process, the clock frequency jumped to 171 MHz because of additional path optimization performed by Synplify Pro synthesis.

Provide Accurate Clock Constraints

Under- or over-constraining your design results in reduced performance. Do not over-constrain the clocks by more than 15%. For maximum performance, make sure that there is 10% negative slack on the critical clock. This ensures that critical paths are squeezed (see the Route Constraint section for more information).
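
The two percentages above can be checked with a little arithmetic. The following is a minimal sketch, not part of the Synplify Pro tool: the function names are hypothetical, and it assumes that "negative slack" is measured as a fraction of the requested clock period.

```python
def over_constraint_pct(target_mhz: float, requested_mhz: float) -> float:
    """Percentage by which the requested clock exceeds the real target.

    Assumption: over-constraint is measured on frequency, relative to the target.
    """
    return (requested_mhz - target_mhz) / target_mhz * 100.0


def slack_pct(requested_mhz: float, estimated_mhz: float) -> float:
    """Slack as a percentage of the requested period (negative means failing).

    Assumption: the estimated frequency is the Fmax reported after synthesis.
    """
    requested_period_ns = 1000.0 / requested_mhz
    estimated_period_ns = 1000.0 / estimated_mhz
    return (requested_period_ns - estimated_period_ns) / requested_period_ns * 100.0


if __name__ == "__main__":
    # Hypothetical numbers: the real target is 100 MHz and the clock is constrained to 110 MHz.
    print(over_constraint_pct(100.0, 110.0))
    # If synthesis estimates 100 MHz against the 110 MHz request, slack is about -10%.
    print(slack_pct(110.0, 100.0))
```

Under these assumptions, a request of 110 MHz against an estimated 100 MHz sits at roughly 10% negative slack while staying inside the 15% over-constraint limit.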

The Fmax field on the front panel of the Synplify Pro software is fine for a quick run, but do not use it if you need maximum performance. Instead, put unrelated clocks in separate clock groups in the Synplify Pro synthesis design constraints file (*.sdc). If your clocks are in the same group, the Synplify Pro tool works out the worst-case setup time for the clock-to-clock paths.

Figure 1 shows a timing diagram for two clocks that are in the same clock group. Synplify Pro rolls the clocks forward until they match up again. The tool then calculates the minimum setup time between the clocks – in this case 10 ns.

If the clocks are very unrelated, several hundred clock periods may be required before the clocks match up again. This will probably result in the worst-case setup time being very small, such as 100 ps. You can check the setup time in the clock relationships table in the log file. If the setup time is too short, it is best to re-constrain the clocks so that they are more related.
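
The roll-forward calculation can be illustrated with a short sketch. This is not the tool's implementation: the function name is hypothetical, the periods are example values, and it assumes both clocks have a rising edge at time zero with no phase offset or uncertainty.

```python
from math import gcd


def worst_case_setup_ps(launch_period_ps: int, capture_period_ps: int) -> int:
    """Smallest launch-to-capture edge spacing over one full repeat of both clocks.

    Assumption: both clocks rise together at t = 0 and have no phase offset.
    """
    # The edge pattern repeats after the least common multiple of the periods.
    hyper_period_ps = (
        launch_period_ps * capture_period_ps // gcd(launch_period_ps, capture_period_ps)
    )
    worst_ps = capture_period_ps
    for launch_edge_ps in range(0, hyper_period_ps, launch_period_ps):
        # First capture edge strictly after this launch edge.
        capture_edge_ps = (launch_edge_ps // capture_period_ps + 1) * capture_period_ps
        worst_ps = min(worst_ps, capture_edge_ps - launch_edge_ps)
    return worst_ps


if __name__ == "__main__":
    # Related clocks (hypothetical 20 ns and 30 ns periods) line up again after 60 ns.
    print(worst_case_setup_ps(20_000, 30_000))
    # Nearly unrelated clocks (hypothetical 10 ns and 10.1 ns periods) need about
    # a hundred periods to line up, and the worst-case spacing shrinks to 100 ps.
    print(worst_case_setup_ps(10_000, 10_100))
```

With the example periods, the first pair gives a 10 ns worst-case setup time, while the second pair collapses to 100 ps, which is the situation the clock relationships table in the log file lets you spot.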

Specify Timing Exceptions
You should provide all timing exceptions, such as false and multicycle paths, to the Synplify Pro tool. With this information, the tool can ignore these paths and concentrate on the actual critical paths.

As an example, in the Synplify Pro 7.3.3 tool we have enabled timing-driven, 3-state to MUX conversion. If a 3-state path is critical, the Synplify Pro tool automatically converts the logic to multiplexers, thus speeding up the path. Data on buses is usually not critical and can survive a few clock cycles because the bus master can wait for the data to become valid. In these situations, applying a multicycle constraint to the 3-state path causes the Synplify Pro tool to keep the TBUFs, thus saving area.

Constrain I/Os
If your design has I/O timing constraints, it is likely that the critical path is through the I/O buffer. The Synplify Pro tool recognizes these paths as the most critical and tries to optimize them. However, I/O paths cannot usually be physically optimized further. Therefore, the Synplify Pro tool prematurely stops optimizing the rest of the design.

A new switch has been added to the Synplify Pro 7.3 release called “Use clock period for unconstrained I/O.” When enabled, the tool does not include any unconstrained I/O paths in timing optimizations, therefore allowing the optimization process to continue.

Fine-Tuning Designs to Improve Timing or Area
After setting up a design using the methods previously described, you can use additional options after synthesis to improve design performance or area utilization. Following these guidelines will usually save a device size or a speed grade, and in many cases, both.

The following optimization techniques are design-dependent. Not all designs benefit from enabling these features. The best method is to analyze the implementation of your design and see if the following optimizations improve performance.

Standard Optimization Techniques
Retiming and Pipelining
Enabling the retiming and pipelining options can improve your design performance by as much as 50%. Retiming attributes such as syn_allow_retiming let you refine your constraints by applying retiming to a single register.

Resource Sharing
With this option enabled, the software shares hardware resources, thus decreasing the area. If you disable this option, hardware resources are not shared, which will probably increase the area but yield higher performance.

FSM Compiler
This option extracts and optimizes finite state machines (FSMs) based on the number of states. As a rule, we find the following guidelines improve performance:

Number of States    Suggested Encoding Scheme
2-4                 Sequential
5-40                Onehot
Over 40             Gray
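
The guideline table translates directly into a lookup. The following is a minimal sketch with a hypothetical helper name; it only encodes the rule of thumb above and is not how the FSM Compiler itself chooses an encoding.

```python
def suggest_fsm_encoding(num_states: int) -> str:
    """Return the encoding suggested by the guideline table for a state count."""
    if num_states < 2:
        # The table starts at two states; smaller counts are outside its scope.
        raise ValueError("guideline covers state machines with at least 2 states")
    if num_states <= 4:
        return "sequential"
    if num_states <= 40:
        return "onehot"
    return "gray"


if __name__ == "__main__":
    for count in (3, 12, 64):
        print(count, suggest_fsm_encoding(count))
```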

FSM Explorer
If the previous methods of encoding do not produce the desired result, you can use timing-driven state encoding. The Synplify Pro tool automatically selects the best encoding for the specified timing constraints.

Resource Allocation
The use of dedicated macro blocks in Xilinx devices usually provides the best synthesis solution, but this is not always the case. A well-pipelined multiplier in logic can often provide a faster (but larger) solution. You can configure macro blocks within the Synplify Pro tool based on the design requirements. You can also force the tool to use a specific resource implementation by adding any of the following attributes:

Macro Block       Attribute (Options)
Multiplier        syn_multstyle {logic | block_mult}
RAM               syn_ramstyle {registers | select_ram | block_ram | no_rw_check}
ROM               syn_romstyle {logic | select_rom | block_rom}
Shift Register    syn_srlstyle {registers | select_srl | noextractff_srl}

These attributes are extremely design-dependent.

Optimization Controls
The Synplify Pro tool provides user constraints to let you shape and control logic according to your design requirements. The following attributes and directives are the most commonly used.

  • syn_keep (in the source code). Preserves an RTL net throughout synthesis and prevents LUT packing and replication. It is also useful for timing exceptions because you can apply a -thru constraint to it.
  • syn_preserve. Disables sequential optimizations on registers, preventing removal, merging, inverter push-through, and FSM extraction.
  • syn_replicate (in the constraint file). Controls replication of registers; setting it to 0 prevents replication.
  • syn_maxfan (in the constraint file). Controls the maximum fanout limit, triggering register replication and buffering. This control is a hard limit on modules and instances but a soft limit when set globally.
  • syn_direct_enable (in the constraint file). Forces a connection to the enable pin of the register; additional logic is moved to the D input path.

Route Constraint
The -route constraint is probably the most important but least known timing constraint. It can provide a 10% performance improvement with minimal effort, as well as drastically reduce area.

The -route constraint adds a specified delay to Synplify Pro’s routing estimates. A positive value adds to the routing delay estimate and increases criticality. A negative value reduces the routing delay estimate and decreases criticality.

If the Synplify Pro timing estimate is different from the PAR value, the difference will prevent the Synplify Pro tool from optimizing the actual critical paths. The -route switch allows you to align synthesis estimates with the PAR delays. Aligning the routing delays almost always creates significantly better results.

Use the -route constraint to perform two functions:

  • To make synthesis see the same critical path as PAR
  • To make synthesis estimate the same slack as PAR. If many clocks fail PAR timing, apply -route to the clock. If there are only a few paths failing PAR timing, apply -route to just those paths.
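
The slack-alignment step is simple arithmetic. Here is a minimal sketch, assuming you read the slack for the same clock or path from both the synthesis log and the PAR timing report; the function name and the numbers are hypothetical.

```python
def route_adjustment_ns(synth_slack_ns: float, par_slack_ns: float) -> float:
    """Delay to pass to -route so that the synthesis slack matches the PAR slack.

    A positive result means synthesis was optimistic: adding routing delay
    makes the path more critical. A negative result means synthesis was
    pessimistic, so the routing estimate should be reduced.
    """
    return round(synth_slack_ns - par_slack_ns, 3)


if __name__ == "__main__":
    # Hypothetical case: synthesis reports +0.3 ns slack, PAR reports -0.5 ns.
    print(route_adjustment_ns(0.3, -0.5))
    # Hypothetical case: synthesis reports -1.0 ns slack, PAR reports -0.2 ns.
    print(route_adjustment_ns(-1.0, -0.2))
```

In the first case the sketch suggests adding 0.8 ns of routing delay; in the second it suggests removing 0.8 ns, matching the positive and negative behavior of the -route value described above.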

Conclusion
By setting up your design correctly and using the features and constraints described in this article, you can meet and often surpass your performance goals. We found that on 50% of the designs, using the following settings increased Fmax by more than 25%:
  • Add the CORE Generator EDIF files to synthesis
  • Apply the -route constraint to paths or clocks
  • Turn resource sharing off
  • Turn pipelining/retiming on
  • Turn “Use clock period for unconstrained I/O” off

For additional information about the Synplify Pro synthesis tool, contact Synplicity at (408) 215-6000, or visit www.synplicity.com.

© 1994-2008 Xilinx, Inc. All Rights Reserved.