Xcell Journal Online
For Synchronous Signals, Timing is Everything



by Bill Hargin, Product Manager, HyperLynx, Mentor Graphics Corp.
Bill_Hargin@mentor.com (3/25/04)

Mentor Graphics highlights a proven methodology for implementing pre-layout Tco correction and flight time simulation with Virtex-II and Virtex-II Pro FPGAs.



We've all heard the phrase "timing is everything," and this is certainly the case for the majority of digital outputs on modern FPGAs. Timing-calculation errors of 10 or 20 percent were fine at 20 MHz, but at 200 MHz and above, they're absolutely unacceptable.

As Xilinx Senior Field Applications Engineer Jerry Chuang points out, "The toughest case usually is a memory or processor bus interface. Most designers know that they have to account for Tco (clock-to-output) as it relates to flight time, but don't really know how."

Another signal integrity engineering manager who preferred to remain anonymous explains, "We've got lots of things that hang on the hairy edge of working. That's one of the reasons why they give you so many knobs to turn on newer memory interfaces."

To complicate matters, manufacturer datasheets and application notes use multiple, often-conflicting definitions of many of the variables and procedures involved, requiring you to investigate the conventions used by manufacturer A versus manufacturer B. Most of the recently published signal integrity books either gloss over the subject or avoid it altogether. We hope that this article will serve to blow away some of the fog and reinforce some standard definitions.

System Timing for Synchronous Signals
An FPGA team will typically place and route an FPGA according to their specific timing requirements, leaving system-level timing issues to be negotiated later with the system-design team. With the sub-nanosecond timing margins associated with many signals, it's common for the system side to be faced with PCB floor-planning changes, part rotation, and sometimes the need to negotiate pin swaps with the FPGA team to accommodate timing goals. Proactive, prelayout timing analysis and some careful accounting can keep both the FPGA and system teams from spending a month or more chasing timing problems.

Two classes of signals pose problems for FPGA designers and their downstream counterparts at the system level: timing-sensitive synchronous signals and asynchronous, multi-gigabit serial I/Os. We'll concentrate on parallel, synchronous designs in this article.

Margins
The system-timing spreadsheet for synchronous designs is based on two "classic" timing equations:

Tco_test(Max) + Jitter + TFlight(Max) + TSetup < TCycle

Tco_test(Min) + TFlight(Min) > THold

Or, once Tco_test is corrected, becoming Tco_sys, as outlined in this article:

Tco_sys(Max) + Jitter + Tpcb_delay(Max) + TSetup < TCycle

Tco_sys(Min) + Tpcb_delay(Min) > THold

Each net's timing is initially set up with a small, positive timing margin. This margin is allocated to the TFlight(Max) and TFlight(Min) values (or Tpcb_delay[Max] and Tpcb_delay[Min], respectively) in the preceding equations; these are timing contributions of the PCB interconnect between each net's driver and receivers.
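As a minimal sketch of how these two inequalities become spreadsheet margins, the Python below computes the slack on each side. The function names and the sample numbers are illustrative assumptions, not values from any datasheet; all times are in nanoseconds.

```python
def setup_margin_ns(t_cycle, tco_max, jitter, t_pcb_max, t_setup):
    """Slack in: Tco(Max) + Jitter + Tpcb_delay(Max) + TSetup < TCycle."""
    return t_cycle - (tco_max + jitter + t_pcb_max + t_setup)


def hold_margin_ns(tco_min, t_pcb_min, t_hold):
    """Slack in: Tco(Min) + Tpcb_delay(Min) > THold."""
    return (tco_min + t_pcb_min) - t_hold


# Hypothetical net on a 100 MHz bus (10 ns cycle); every value is assumed.
setup = setup_margin_ns(t_cycle=10.0, tco_max=4.0, jitter=0.25,
                        t_pcb_max=2.0, t_setup=1.5)
hold = hold_margin_ns(tco_min=1.0, t_pcb_min=0.5, t_hold=0.5)
print(f"setup margin: {setup:.2f} ns, hold margin: {hold:.2f} ns")
```

A positive result means the inequality is satisfied with that much room to spare; a negative result flags a net whose interconnect budget has already been spent.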

If there is insufficient margin left to design the interconnects, either the silicon numbers need to be retargeted and redesigned, or the system speed must be slowed. Figure 1 shows how timing margins shrink relative to frequency.

There are two ways to come up with the interconnect values for the timing spreadsheet. Some signal integrity tools automatically make calculations that produce a single "flight-time" value. However, especially for designers just learning about the timing challenges of high-speed systems, a two-step approach is more instructive. First, you learn how to correct a datasheet's driver Tco value to match the behavior in your real system; second, you add the additional delay between the driver and each of its receivers.

Data Book Values
Initially, timing spreadsheets are populated with values from the silicon vendor's data book. You'll need first-order estimates from silicon designers on the values of Tco and setup and hold times for each system component. You can usually obtain this data from the component datasheet.

Test and Simulation Reference Loads
Datasheet values for your drivers' Tco are specified into standard simulation test loads (or reference loads), which provide an artificial interface between the silicon designer and the system designer.

You'd prefer, of course, to have Tco specified into the actual transmission-line impedance you're driving on your PCB, but the silicon provider has no way of knowing what that will be. Knowing what loading the vendor assumed when publishing Tco is critical so that you can adjust for the difference between that load and your real one.

The Recipe for a Problem
As shown in Figure 2, if the reference load is significantly different from the actual load that the output buffer will see in your design, the sum of the datasheet and PCB-interconnect timing values will not represent actual system timing. Actual or total delay may be represented as:

Total Delay = Tco_sys + Tpcb_delay ≠ Tco_test + Tpcb_delay

where Tpcb_delay is the extra interconnect delay between the time at which the driver switches high or low until a given receiver switches.

Note that this "PCB delay" is not just the time it takes for a signal to travel along the trace (sometimes called "copper delay" or "propagation delay"). Here, Tpcb_delay accounts for effects such as ringing at the receiver, as shown in Figure 3. Its value could (on a poorly terminated net) easily be longer than the simple copper delay.

Calculating accurate timing involves more than finding Tpcb_delay. Because Tco_test is a value created with an assumed test load, it almost never matches Tco_sys, the clock-to-output delay you'll see in your actual system. If the difference between the two is significant (even in the neighborhood of 100 ps), your board may not function properly unless you account for it.

For example, Lee Ritchey, author of "Get it Right the First Time" and founder of the consulting firm Speeding Edge, was hired to resolve a timing problem on a 200 MHz memory system. After digging into the design, he found that unadjusted datasheet values were used, based on Tco values that were measured on a 50 pF load rather than something resembling the design's 50 Ohm transmission-line load. As a result, this improper accounting "threw timing off by just over one nanosecond," he says. "That's 20 percent of the total timing budget, a major error."

In the following sections, we'll see how you can correct Tco_test to become Tco_sys, avoiding this type of error altogether.

The Process
Measuring Tco_test
To measure Tco_test, you need to set up a simulation with just the driver model and the datasheet test load. Though they are optional sub-parameters in the IBIS specification, most IBIS models (including Xilinx IBIS models) contain a record of the test load (Cref, Rref, Vref) and the measurement voltage (Vmeas) to use with these values. Figure 4 shows these values for the LVTTL8F buffer in the Virtex-II Pro IBIS model, as well as a generic reference load diagram taken from the IBIS specification.

Once you've gathered these load values from the IBIS model, you simulate rising and falling edges, and for each, measure the time from the beginning of switching until the driver pin crosses the Vmeas threshold. These are the Tco_test values.
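The measurement itself amounts to finding a threshold crossing in the simulated waveform. The sketch below is one way to do that on exported samples, assuming the waveform is available as parallel lists of times and voltages with t = 0 at the start of switching. The function name, the sample data, and the Vmeas value are hypothetical; take the real Vmeas from the IBIS model.

```python
def first_crossing_ns(times, volts, threshold, rising=True):
    """Return the first time the waveform crosses threshold, or None.

    Linear interpolation is used between the two samples that bracket
    the crossing.
    """
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if rising:
            crossed = v0 < threshold <= v1
        else:
            crossed = v0 > threshold >= v1
        if crossed:
            return t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
    return None


# Made-up rising edge at the driver pin, driving the test load.
times = [0.0, 0.5, 1.0, 1.5, 2.0]   # ns
volts = [0.0, 0.2, 1.0, 2.0, 3.3]   # V
vmeas = 1.5                          # assumed measurement threshold, V
tco_test_sim = first_crossing_ns(times, volts, vmeas)
print(f"driver crosses Vmeas at {tco_test_sim:.2f} ns")
```

For a falling edge, pass rising=False and repeat the measurement on the falling-edge waveform.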

Obtaining "Tcomp," the Timing-Correction Value
Now you need to calculate a compensation value, Tcomp, that will convert the datasheet Tco value into the actual Tco you'll see in your system. Tcomp is the delay between the time the driving signal, probed at the output, crosses Vmeas into the silicon manufacturer's standard reference load, and the time it crosses Vmeas for your actual system load. Tcomp is then used as a modification to the Tco value from the vendor datasheet, as shown in Figure 5.

The revised computation of actual delay from the previous equation is then:

Total Delay = Tco_sys + Tpcb_delay = (Tco_test + Tcomp) + Tpcb_delay

Note that Tcomp may be negative or positive, depending on whether the actual load in your system is smaller or larger than the standard test load. Traditionally, silicon vendors used capacitive test loads (like 35 pF) to measure Tco; almost all real PCB transmission lines do not present as heavy a load, so Tcomp is usually negative in this situation.

Xilinx, for its current generation of FPGAs, uses a 0 pF test load for output driver wave shape accuracy. Real transmission lines will present a different load: some mixture of inductance, capacitance, and resistance. Because the transmission-line load is heavier than a 0 pF "open load," Tcomp will be positive. Simulation is the only way to accurately predict the exact value of Tcomp.
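To make the sign convention concrete, here is a small sketch of the Tcomp arithmetic. The two crossing times would come from simulation; the numbers used here are invented for illustration, and the function names are assumptions rather than part of any tool.

```python
def tcomp_ns(t_cross_test_load, t_cross_system_load):
    """Tcomp: Vmeas crossing into the real load minus crossing into the test load."""
    return t_cross_system_load - t_cross_test_load


def tco_sys_ns(tco_datasheet, tcomp):
    """Tco_sys = Tco_test + Tcomp."""
    return tco_datasheet + tcomp


# Hypothetical crossing times at the driver pin, in ns.
heavy_test_load = tcomp_ns(t_cross_test_load=2.0, t_cross_system_load=0.8)
open_test_load = tcomp_ns(t_cross_test_load=0.5, t_cross_system_load=0.9)

print(f"capacitive test load: Tcomp = {heavy_test_load:+.2f} ns")
print(f"0 pF test load:       Tcomp = {open_test_load:+.2f} ns")
print(f"corrected Tco:        {tco_sys_ns(4.0, heavy_test_load):.2f} ns")
```

When the test load is heavier than the real line, the driver reaches Vmeas sooner in the system and Tcomp comes out negative; with a 0 pF test load the opposite holds.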

Simulating Tpcb_delay
At this point in the process, you've completed the first step in finding accurate delays for your timing spreadsheet, and you've compensated the datasheet Tco to match your real system load. Next, you need to determine Tpcb_delay, the additional delay caused by the interconnect from driver to receiver.

A signal integrity simulator is the only way to accurately do this, because only a simulator can account for subtle effects like reflections, receiver input capacitance, line loss, and so forth.

From here, we'll explore some detailed examples based on Xilinx-provided IBIS models: the process of calculating Tcomp and then using the HyperLynx simulator to determine an interconnect's Tpcb_delay through pre-layout topology analysis. You could enter the values that we come up with directly into your system-timing spreadsheet.

The process using Mentor Graphics' HyperLynx product is straightforward. You look up the manufacturer's test load in the IBIS model (see Figure 4), enter it in the LineSim schematic, set up your actual interconnect topology just below the reference load, and begin a simulation, probing at both drivers so that you can measure Tcomp and Tpcb_delay, as shown in Figure 6.

Running the Numbers on a Real Problem
An important design for an electronic equipment manufacturer had a Xilinx FPGA talking to a bank of SRAMs at 125 MHz, meaning the cycle time (Tcycle) was 8 ns. The Xilinx datasheet specified Tco as 4 ns (i.e., Tco_test). The SRAM's setup time was 2 ns.

Some of the traces connecting the FPGA to an SRAM were six inches long; a signal integrity simulation showed a worst-case maximum PCB delay (to the receiver's "far" threshold) of 2.5 ns. This yielded in the design's timing spreadsheet a total time of 4 + 2.5 + 2 = 8.5 ns (Tco_test + Tpcb_delay + Tsetup), violating the 8 ns cycle time.

However, the Tco value, when corrected for the actual design load, was 4 - 1.2 = 2.8 ns (Tco_sys = Tco_test + Tcomp, with Tcomp = -1.2 ns), meaning that the actual total delay value was 2.8 + 2.5 + 2 = 7.3 ns (Tco_sys + Tpcb_delay + Tsetup), leaving an acceptable timing margin of 700 ps.
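The arithmetic for this SRAM interface can be laid out as a tiny spreadsheet row in Python. The figures are the ones quoted in this example (jitter is left out, as it is in the totals above); the variable names are only illustrative.

```python
T_CYCLE = 8.0       # ns, 125 MHz bus
TCO_TEST = 4.0      # ns, datasheet Tco
TCOMP = -1.2        # ns, correction for the actual load
TPCB_DELAY = 2.5    # ns, worst-case simulated PCB delay
T_SETUP = 2.0       # ns, SRAM setup time

uncorrected_total = TCO_TEST + TPCB_DELAY + T_SETUP
corrected_total = (TCO_TEST + TCOMP) + TPCB_DELAY + T_SETUP

for label, total in (("datasheet Tco", uncorrected_total),
                     ("corrected Tco", corrected_total)):
    margin = T_CYCLE - total
    verdict = "meets" if margin > 0 else "violates"
    print(f"{label}: total {total:.1f} ns, margin {margin:+.1f} ns, "
          f"{verdict} the {T_CYCLE:.0f} ns cycle")
```

The same net fails with the uncorrected datasheet number and passes once Tco is corrected, which is the point of the exercise.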

Note that in this calculation, we measured to the time at which the receiver signal crossed the farthest-away threshold to get the worst-case, longest possible Tpcb_delay. For a rising edge, we measured to the last crossing of Vih; for a falling edge, to the last crossing of Vil.
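A sketch of the "last crossing" measurement for a rising edge follows, assuming the receiver waveform has been exported as sampled times and voltages. The function name, the ringing waveform, the Vih value, and the driver switching time are all made up for illustration.

```python
def last_crossing_ns(times, volts, threshold, rising=True):
    """Return the final time the waveform crosses threshold in the given
    direction (linear interpolation between samples), or None."""
    samples = list(zip(times, volts))
    last = None
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if rising:
            crossed = v0 < threshold <= v1
        else:
            crossed = v0 > threshold >= v1
        if crossed:
            last = t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
    return last


# Hypothetical receiver waveform that rings back below Vih once.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # ns
volts = [0.0, 2.5, 1.5, 2.5, 2.4, 2.4]   # V
vih = 2.0                                 # assumed receiver threshold, V

t_receiver = last_crossing_ns(times, volts, vih, rising=True)
t_driver = 0.4    # ns, assumed time the driver crossed Vmeas
tpcb_delay_max = t_receiver - t_driver
print(f"worst-case Tpcb_delay: {tpcb_delay_max:.2f} ns")
```

Using the first crossing instead would ignore the ring-back and understate the maximum delay, so the last crossing is the conservative choice for the setup check.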

Conclusion
For seamless interaction between the FPGA designer and the system designer, it's prudent to do as much pre-layout, "what-if" analysis as possible. And, though not covered explicitly in this article, you can also verify that your laid-out printed circuit boards meet your timing requirements using a post-layout simulator with batch analysis capabilities.

Some Mentor products that perform this type of analysis are HyperLynx, ICX, and XTK. With these simulations, you can revise simulated representations of interconnect circuits in minutes, compared with the weeks required to spin actual PCB prototypes.

The new HyperLynx Tco simulator is available on Mentor Graphics' website, www.mentor.com/hyperlynx/tco/. Included with the Tco simulator are the Virtex-II Pro, Virtex-II, and Spartan IBIS models; boilerplate schematics that will help you make adjustments to data book Tco values; and a detailed tutorial on Tco and flight-time correction that parallels this article.

What is "Flight Time"?
In this article, we've shown conceptually how Tco values specified into a silicon vendor's test load can be corrected on a per-net basis to give the actual clock-to-output (Tco) timing you'll see on your PCB, and then added to the additional trace delays between drivers and receivers to give accurate timing values. However, signal integrity (SI) tools actually deal with corrected timing values in a different (but equivalent) way.

The most convenient output from an SI tool is a single number, called "flight time," shown in Figure 5 as (Total Delay – Tco_test) or, equivalently, (Tpcb_delay + Tcomp), since Total Delay = (Tco_test + Tcomp) + Tpcb_delay. You can add this value to the standard data book Tco values in your timing spreadsheet to give the same effect as the two-step process described in this article.

When an SI tool calculates timing values, it 1) simulates each driver model into the vendor's test load, measures the time for the output to cross the Vmeas threshold, and stores the value (Tco_test); 2) simulates the actual nets in the design and measures the time at which each receiver switches (Total Delay); and 3) for each receiver, subtracts the driver-switching-into-test-load time from the receiver time (Total Delay – Tco_test). The resulting flight time is a single number that can be added to each net's row in a timing spreadsheet, and that both compensates Tco_test for actual system loading and accounts for the interconnect delay between driver and receiver.
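The equivalence of the one-number and two-step approaches follows from Total Delay = (Tco_test + Tcomp) + Tpcb_delay. The sketch below checks it numerically using the figures from the SRAM example in this article; the function name is an assumption, not an API of any SI tool.

```python
def flight_time_ns(total_delay, tco_test):
    """Flight time as described above: receiver switching time minus the
    driver's switching time into the vendor test load."""
    return total_delay - tco_test


tco_test, tcomp, tpcb_delay, t_setup = 4.0, -1.2, 2.5, 2.0   # ns

total_delay = (tco_test + tcomp) + tpcb_delay
flight = flight_time_ns(total_delay, tco_test)

two_step = (tco_test + tcomp) + tpcb_delay + t_setup   # corrected Tco, then PCB delay
one_step = tco_test + flight + t_setup                 # datasheet Tco plus flight time

print(f"flight time: {flight:.1f} ns")
print(f"two-step total: {two_step:.1f} ns, one-step total: {one_step:.1f} ns")

# With a heavy capacitive test load and a short net, the receiver can switch
# before the test-load Tco has elapsed, making flight time negative.
print(f"negative case: {flight_time_ns(total_delay=3.0, tco_test=4.0):.1f} ns")
```

Either route puts the same total into the spreadsheet row; flight time simply folds Tcomp and Tpcb_delay into one entry.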

The term "flight time" is somewhat unfortunate, although it's become the industry standard. The name suggests the total propagation delay between driver and receiver, but the value calculated is actually the delay derated to compensate for the reference load. For old-style capacitive reference loads (e.g., 50 pF), flight time can even be negative.

© 1994-2008 Xilinx, Inc. All Rights Reserved.