AR# 22921

LogiCORE PCI - Virtex-4 66 MHz designs sometimes fail to meet timing. What can be done to correct this problem?

Description

Virtex-4 66 MHz designs sometimes fail to meet timing.

These failures are primarily noticed on the OFFSET IN constraints and manifest as either setup failures or hold time failures.

What can be done to correct this problem?

Solution

Customers may notice that Virtex-4 66 MHz designs might not always meet timing on the first attempt at implementation.

Usually, the failures will occur on OFFSET IN constraints and most notably on the IRDY# and TRDY# signals.

The failures will either be a violation of the 3 ns input setup or the 0 ns hold time requirement. 

 

For earlier devices such as Virtex-II Pro and Spartan-IIE, guide files were provided that guaranteed timing on the critical input paths.

Xilinx elected not to do this for Virtex-4 designs because it limits the flexibility customers have in using different pin-outs and the UCF Generator.

This solution provides the information you need to reach timing closure on your designs. 

 

If you are using the PCI UCF Generator from CORE Generator to produce UCF files for the Virtex-4, please read (Xilinx Answer 22671).

It is important to note that not all UCFs provided will meet timing as outlined in this solution. 

Also, some UCF files will be close to meeting timing and for these you can use some of the techniques described below to try to reach timing closure. 

 

Please see (Xilinx Answer 21399) regarding Virtex-4 stepping information and the PCI Core. 

 

The path that most often fails starts at an input pin that cannot be registered in the IOB because of PCI requirements. 

This input must go into the FPGA fabric and through combinational logic before it is registered as shown in the figure below.

Most notably, these problems will show up on IRDY# and TRDY# pins, but could be seen on other control pins like FRAME#, DEVSEL#, etc., as well.  

 

Figure: Critical PCI Timing Path

 

 

From the figure above, you can see that the register cannot be packed into the input side of the IOB.

Without a guide file to control the routing of this signal, it is up to PAR to route it in such a way that timing is met.

Depending on the PAR options used or the density of the design, PAR might have trouble meeting this requirement on the first try. 

 

Below is a list of steps to use when attempting to meet timing closure: 

 

Step 1 

For Virtex-4 66 MHz designs, you must use the regional clock input, not the global clock input, for the PCI clock.

To do this, you must use the right combination of wrapper files and UCF files. 

Please see the Virtex-4 examples listed in Table 3-1 of the PCI Getting Started User Guide.

This table lists the correct wrapper file and UCF file to be used for select Virtex-4 devices. 

 

If you have used the UCF generator to create a UCF file, ensure that you created one that uses a regional clock input and then use the wrapper file that contains the regional clock buffers as shown in Table 3-1.

This wrapper file will be found in the <hdl>/src/wrap directory and will be named pcim_lc_66_r.v or pcim_lc_66_r.vhd for 66 MHz Virtex-4 designs. 

 

Step 2 

Ensure that the input delay buffer settings are correct, as shown in Table 3-3 of the PCI Getting Started User Guide.

 

Step 3 

Try adjusting the IDELAY value. The Virtex-4 PCI designs make use of the IDELAY component in the Virtex-4.

This component adds delay to the input signal and is adjustable.

The goal is to add enough delay to meet the 0 ns hold time requirement from PCI, but not so much that it fails the 3 ns input setup requirement. 

You can adjust these values in the UCF file. Look for lines like these: 

 

INST "PCI_CORE/XPCI_TRDYD" IOBDELAY_VALUE = 0 ; 

INST "PCI_CORE/XPCI_IRDYD" IOBDELAY_VALUE = 1 ; 

INST "PCI_CORE/XPCI_STOPD" IOBDELAY_VALUE = 4 ; 

INST "PCI_CORE/XPCI_DEVSELD" IOBDELAY_VALUE = 4 ;

 

Increasing this value makes it easier to meet the 0 ns hold requirement; decreasing it makes it easier to meet the 3 ns setup requirement. 

The goal is to find a value that satisfies both requirements. The valid range is 0 to 63.

Please see the Virtex-4 User Guide for more information about the IDELAY component. 
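
The setup/hold trade-off in this step can be sketched with a simple model. The following Python snippet is illustrative only: the 75 ps tap resolution is an assumed nominal figure (confirm it against the Virtex-4 data sheet), the slack numbers are hypothetical, and the function name passing_taps is not part of any Xilinx tool. It assumes each extra tap removes one tap delay from the setup slack and adds one tap delay to the hold slack. Real values must always be confirmed with a timing report.

```python
# Illustrative model only: all numeric inputs below are hypothetical.
TAP_PS = 75  # ASSUMED nominal IDELAY tap resolution in picoseconds; check the data sheet


def passing_taps(setup_slack_ps, hold_slack_ps, tap_ps=TAP_PS, max_tap=63):
    """Return the IOBDELAY_VALUE settings that keep both slacks non-negative.

    setup_slack_ps and hold_slack_ps are the slacks reported with the tap at 0.
    Each extra tap is assumed to cost tap_ps of setup slack and to add
    tap_ps of hold slack.
    """
    return [
        tap
        for tap in range(max_tap + 1)
        if setup_slack_ps - tap * tap_ps >= 0 and hold_slack_ps + tap * tap_ps >= 0
    ]


# Hypothetical starting point: 400 ps of setup margin and a 200 ps hold violation at tap 0.
window = passing_taps(setup_slack_ps=400, hold_slack_ps=-200)
print(window)
```

An empty list from this model suggests that no single tap value can satisfy both requirements for that path, in which case the placement and routing steps below are the next thing to try.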

 

Step 4 

MAP Options 

The default implementation script found in the <hdl>/example/xilinx directory might have to be changed.  

One option is to use the -timing switch during MAP.

This switch makes MAP perform timing-driven packing and placement, so that logic is placed with the timing constraints in mind.

However, using this switch increases the time MAP takes to complete.

An example of using the -timing option is shown below: 

 

map -pr b -timing pcim_top.ngd -o pcim_top.ncd pcim_top.pcf 

 

Step 5 

PAR Options 

If using the map -timing switch is not enough, you can try different cost tables for PAR.

There are 100 different cost tables or seeds available for PAR to use in its place and route algorithm.

By default, it uses cost table 1. You can change this by using the -t option.

Following is an example of using cost table 25: 

par -ol high -t 25 -w pcim_top.ncd pcim_top_routed pcim_top.pcf 

Instead of guessing at cost tables, you can use MPPR (Multi Pass Place and Route) to run multiple cost tables. 

Please see the Software Manuals located at: http://www.xilinx.com/support/software_manuals.htm
 

In particular, you will want to read about PAR in the Development System Reference Guide. 

 

Following is an example of running 10 cost tables: 

par -n 10 -t 25 pcim_top.ncd routed.dir pcim_top.pcf 

This runs 10 PAR iterations starting at cost table 25 and stores all results in the directory "routed.dir". 
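
If you prefer to launch each cost table as a separate single-pass run (for example, from your own batch scripts), the command lines can be generated rather than typed by hand. The Python sketch below uses only the par switches already shown in this step (-ol high, -t, and -w). The function name par_commands and the per-table output names (pcim_top_routed_t25 and so on) are hypothetical choices made for this example.

```python
def par_commands(first_table, count, design="pcim_top"):
    """Build one par command line per cost table, using only the switches shown above.

    The output name <design>_routed_t<N> is a hypothetical naming scheme.
    """
    last = min(first_table + count - 1, 100)  # this article states there are 100 cost tables
    return [
        f"par -ol high -t {t} -w {design}.ncd {design}_routed_t{t} {design}.pcf"
        for t in range(first_table, last + 1)
    ]


# Ten single-pass runs starting at cost table 25, mirroring the MPPR example.
for cmd in par_commands(25, 10):
    print(cmd)
```

Each generated line can be run on its own, and the resulting timing reports compared to pick the best cost table.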

 

An enhancement to using MPPR is to use the Turns Engine to farm out each job on multiple computing nodes.

For more information, please refer to the Development System Reference Guide. 

 

Another PAR option to try is the extra effort level switch. 

This makes PAR try harder to meet timing, which can lead to very long run times.

For more information, see the Development System Reference Guide. 

 

Step 6 

In many cases, if you tighten the OFFSET IN constraints you will be able to meet the 3 ns requirement. 

For example, PAR might not meet timing if the constraint is 3 ns. 

However, if you tighten it to something like 2.8 or 2.9 ns, PAR sometimes gets close enough to the tighter value that the real 3 ns requirement is met. 

In other words, with a 2.8 ns constraint, PAR might fail the constraint with a result of 2.9 ns, which still satisfies the 3 ns requirement.

To do this, change the UCF file OFFSET IN constraint to look like the following: 

TIMEGRP "PCI_PADS_C" OFFSET = IN 2.800 VALID 2.800 BEFORE "PCLK" TIMEGRP "ALL_FFS" ; 
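
The reasoning in this step can be sketched in a few lines of Python. The helper names offset_in_line and check are hypothetical and exist only for this illustration. The first formats the constraint using the pattern shown above. The second separates "meets the tightened constraint" from "meets the real 3 ns requirement", which is the distinction that matters when reading a timing report for an over-constrained design.

```python
PCI_SETUP_NS = 3.0  # real 66 MHz PCI input setup requirement, as stated in this article


def offset_in_line(offset_ns, group="PCI_PADS_C", clock="PCLK"):
    """Format an OFFSET IN constraint following the pattern shown above."""
    return (
        f'TIMEGRP "{group}" OFFSET = IN {offset_ns:.3f} VALID {offset_ns:.3f} '
        f'BEFORE "{clock}" TIMEGRP "ALL_FFS" ;'
    )


def check(constraint_ns, achieved_ns, requirement_ns=PCI_SETUP_NS):
    """Return (meets tightened constraint, meets real requirement)."""
    return achieved_ns <= constraint_ns, achieved_ns <= requirement_ns


line = offset_in_line(2.8)
meets_constraint, meets_requirement = check(constraint_ns=2.8, achieved_ns=2.9)
print(line)
print(meets_constraint, meets_requirement)
```

With the 2.8 ns constraint and a 2.9 ns result, the tools report a failure against the constraint even though the design satisfies the PCI requirement.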

 

Step 7 

Hopefully, you can meet timing using one of the steps given above.

Once you have met timing, you have the option of creating directed routing (DIRT) constraints on signals that are giving problems.

In this way, if you do not change your pin-out or logic placement, you can instruct PAR to route the offending signals on the same routes each time.

For more information on using directed routing, see the FPGA Editor Help manual in the Xilinx Software Manuals. 

 

Following is an example of using directed routing.  

 

Suppose a Virtex-4 66 MHz design on a 4vlx25ff668 fails to meet timing on the initial run using the default scripts from the core example directory.

It fails to meet the input setup requirement on TRDY#. 

The timing report shows the following: 

 

Slack: -0.008ns (requirement - (data path - clock path - clock arrival + uncertainty)) 

Source: TRDY_N (PAD) 

Destination: PCI_CORE/PCI_LC/PCI-AD64/IO11/OFD (FF) 

Destination Clock: CLK rising at 0.000ns 

Requirement: 3.000ns 

Data Path Delay: 4.704ns (Levels of Logic = 3) 

Clock Path Delay: 1.696ns (Levels of Logic = 3) 

Clock Uncertainty: 0.000ns 

 

In this example, the TRDY# input failed timing by 0.008 ns.
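
The slack figure can be reproduced from the other numbers in the report using the formula printed on the Slack line. The short Python check below does that arithmetic; the function name offset_in_slack is a hypothetical name chosen for this example.

```python
def offset_in_slack(requirement, data_path, clock_path, clock_arrival=0.0, uncertainty=0.0):
    """Slack = requirement - (data path - clock path - clock arrival + uncertainty)."""
    return requirement - (data_path - clock_path - clock_arrival + uncertainty)


# Values taken from the timing report above (all in ns).
slack = offset_in_slack(requirement=3.000, data_path=4.704, clock_path=1.696)
print(f"{slack:.3f} ns")
```

The data path is 3.008 ns longer than the clock path, which exceeds the 3.000 ns requirement by 0.008 ns.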

However, by following some of the suggestions above, timing was eventually met. In particular, for this example, PAR was run again with an initial cost table of 5.  

 

The goal is to take the design that meets timing and create directed routing constraints that fully constrain this path. 

Using the timing report, identify the nets in the path that must be constrained with directed routing constraints in the working design. 

For this example, two nets required directed routing constraints; they are TRDY_I and SOFT_CE.  

 

Open the working design in FPGA Editor and select Tools -> Directed Routing Constraints.

Use the Filter to find the nets and select them.

You can append these constraints to your current UCF file.

You will have to do this for each net and FPGA Editor will append the constraints each time to your UCF file. 

 

Once this is done, the DIRT constraints cause PAR to route these nets on the same routes each time, which should allow timing to be met when other parts of the design change, provided the pin-out and the placement of the constrained logic stay the same. 

 

For this example, PAR was run again with the default options that originally failed, and the PAR report showed that the DIRT constraints were matched. 

In this example, the PAR report said: 

 

Starting Router 
# of EXACT MODE DIRECTED ROUTING found:2, SUCCESS:2, FAILED:0 

 

This tells the user that PAR recognized the DIRT constraints in the UCF and successfully routed them.

By doing so, timing was met for the design. 
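
If you run PAR from scripts, this check can be automated. The Python sketch below assumes the summary line has exactly the format shown in the example above; the format may differ between software versions, and the function name dirt_summary is hypothetical.

```python
import re

# Pattern for the summary line as it appears in the example above (assumed format).
DIRT_LINE = re.compile(
    r"# of EXACT MODE DIRECTED ROUTING found:\s*(\d+),\s*SUCCESS:\s*(\d+),\s*FAILED:\s*(\d+)"
)


def dirt_summary(report_text):
    """Return the directed routing counts from a PAR report, or None if the line is absent."""
    match = DIRT_LINE.search(report_text)
    if match is None:
        return None
    found, success, failed = map(int, match.groups())
    return {"found": found, "success": success, "failed": failed}


report = """Starting Router
# of EXACT MODE DIRECTED ROUTING found:2, SUCCESS:2, FAILED:0
"""
summary = dirt_summary(report)
print(summary)
```

A non-zero failed count, or a found count lower than the number of nets you constrained, suggests that PAR did not apply every DIRT constraint and that the placement constraints for those nets should be reviewed.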

 

Essentially, you could lock down your entire design using this method if you choose to.

One important note to consider is that for the DIRT constraints to work, the source and destination of the signal must also be placed through the UCF file.

For the most part, these placement LOC constraints are already in the PCI core UCF files for the signals that might give you problems.

However, if you find that you need to place DIRT constraints on other nets, please ensure you also apply placement constraints for the source and destination of these nets.

Date Created 09/04/2007
Last Updated 03/26/2015
Status Active
Type General Article
Devices
  • Virtex-4 FX
  • Virtex-4 LX
  • Virtex-4 SX