Xilinx® FPGAs provide connectivity in
very high speed source-synchronous bus
interfaces. Transmission rates of 1 Gbps
and higher are not uncommon for these
types of interfaces.
In source-synchronous interfaces, the
transmitter forwards a dedicated clock
along with the data. As data rates skyrocket
to 1 Gbps and beyond, you may find
that your timing budgets are eaten away by
skew and jitter.
Skew is defined as the difference in
arrival time between signals sent at the
same time. It is caused by variations in
board trace lengths, connectors, package
flight-time delays, and secondary parasitic
effects. Figure 1 illustrates how the
improper routing of board traces and the
use of connectors contribute to
skew at the receiver.
Another challenge is jitter, the deviation
from ideal timing caused mostly by slow
transition times, ground bounce, intersymbol
interference, and electromagnetic
interference. Figure 2 illustrates the combined
effects of skew and jitter on a system
designer’s timing budget.
In a real system, many bits of data (16,
for example) are received in parallel and
must be clocked into the receiver by the
common clock sent together with the data.
Ideally, the clock edge arrives in the middle
of the bit time, thus offering a maximum
timing margin.
But in reality, the individual data bits
arrive at slightly different times, and each
suffers from timing jitter on its rising and
falling edges. The clock signal also
suffers from timing jitter. All of these
effects combine to shrink the data-valid window,
and thus might lead to unreliable data
transmission.
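The way skew and jitter eat into the bit time can be sketched with simple arithmetic. The example below assumes a plain linear budget (bit time minus each impairment). The function name and the skew and jitter figures are illustrative assumptions, not measurements from any real board.

```python
def data_valid_window_ps(bit_rate_gbps, skew_ps, data_jitter_ps, clock_jitter_ps):
    """Return the remaining data-valid window in picoseconds.

    Assumes a simple linear budget: every impairment subtracts
    directly from the bit time. Real budgets may combine terms
    differently (for example, statistically).
    """
    bit_time_ps = 1000.0 / bit_rate_gbps  # 1 Gbps -> 1000 ps per bit
    return bit_time_ps - skew_ps - data_jitter_ps - clock_jitter_ps


# Illustrative numbers only: 1 Gbps link with 300 ps of skew,
# 150 ps of data jitter, and 100 ps of clock jitter.
window = data_valid_window_ps(1.0, skew_ps=300, data_jitter_ps=150, clock_jitter_ps=100)
print(window)  # 450.0
```

With these assumed figures, less than half of the 1 ns bit time remains for reliable sampling, which is the motivation for removing the skew term dynamically.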
Virtex-4™ data and clock inputs offer
ChipSync™ technology, facilitating
dynamic phase alignment (DPA). DPA
can greatly reduce the skew between different
data lines, as well as between the
data lines and their associated clock input.
Using a system-generated training pattern,
the receiving FPGA can adjust the
input delay of each data and clock input,
using individual precision delay lines on
every input buffer. Skew errors larger than
one bit time pass through the bit-serial
interface uncorrected, but can be corrected after
serial-to-parallel conversion using the
Bitslip module.
A Generic Networking Interface Example
The generic interface is defined by a 16-channel bus and a forwarded clock. The signaling
standard is low-voltage differential
signaling (LVDS). The interface protocol
specifies a de-skewing method called “training.”
During the initialization phase,
the transmitter sends a repetitive 20-bit training pattern. The receiver uses
it to de-skew the interface by delaying
each data bit such that it is optimally
centered over the received
clock edge. The interface specification
calls for the receiver to correct
channel-to-channel skew of up to
+/- 1 bit time.
This fine-grained delay adjustment
uses a 64-tap delay line with
a counter-controlled tap multiplexer
available on each input. All of the delay lines in a region are continuously
being calibrated by a servo system
using a dedicated delay line, a 200 MHz
user-provided clock, and a phase-comparator-driven PLL circuit that adjusts
the delay line(s) such that the 64-stage
delay equals one period of the clock
(5 ns / 64 ≈ 78 ps per tap).
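The tap arithmetic can be checked with a few lines of Python. This is a minimal sketch: the helper names are hypothetical, and the tap index range of 0 to 63 is an assumption about how the 64 taps are numbered.

```python
REF_CLOCK_HZ = 200e6  # 200 MHz reference clock, as stated in the text
NUM_TAPS = 64         # 64-tap delay line, as stated in the text


def tap_delay_ps(ref_clock_hz=REF_CLOCK_HZ, num_taps=NUM_TAPS):
    """Delay of one tap when the full line spans one reference period."""
    period_ps = 1e12 / ref_clock_hz
    return period_ps / num_taps


def taps_for_delay(target_ps):
    """Nearest tap count for a target delay, clamped to 0..63 (assumed range)."""
    taps = round(target_ps / tap_delay_ps())
    return max(0, min(NUM_TAPS - 1, taps))


print(tap_delay_ps())       # 78.125
print(taps_for_delay(500))  # 6 taps is closest to half of a 1 ns bit time
```

At 1 Gbps, one bit time of 1000 ps corresponds to roughly 13 taps, so the 5 ns range of the delay line covers several bit times.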
All delay lines in one region share a
common adjustment, and thus have the
same tap delay, as accurately as delay tracking
in a small silicon area allows. The reference
frequency is specified, tested, and
supported by software at 200 MHz. Minor
variations can be tolerated, and jitter is filtered
out by the control structure. This
programmable precision delay will find its
way into many innovative applications.
Here it is described only as a method to
achieve dynamic phase alignment.
The ChipSync technology built into
every I/O contains a dedicated serial-to-parallel
converter that converts the high-speed
serial stream to a sequence of parallel
words that can be processed at a much slower
rate within the FPGA. This feature decouples
the high-speed serial data transfer from
the clock rate supported by the FPGA fabric.
The converter supports both single data
rate (SDR) and double data rate (DDR)
modes. In SDR mode, the serial-to-parallel
converter is fully programmable to generate
anywhere from 2- to 8-bit parallel
words. In DDR mode, the converter can
be programmed to de-serialize by a factor
of 4, 6, 8, or 10, as specified by the HDL
attributes of the ChipSync technology.
The maximum width in a single ChipSync
module is six. For larger bit widths, you
can connect two adjacent ChipSync modules
in master-slave mode.
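A small behavioral model can make the width and rate relationships concrete. The sketch below is not the ISERDES itself. It only groups a bit stream into words, and its function names are hypothetical. The six-bit single-module limit is taken from the text above.

```python
MAX_SINGLE_MODULE_WIDTH = 6  # widest word from one module, per the text


def modules_needed(width):
    """One module up to 6 bits; two cascaded (master-slave) beyond that."""
    return 1 if width <= MAX_SINGLE_MODULE_WIDTH else 2


def word_rate_mhz(bit_rate_mbps, width):
    """Rate at which parallel words reach the fabric."""
    return bit_rate_mbps / width


def deserialize(bits, width):
    """Group serial bits into width-bit words, oldest bit first.

    An incomplete trailing word is dropped.
    """
    usable = len(bits) - len(bits) % width
    return [tuple(bits[i:i + width]) for i in range(0, usable, width)]


print(modules_needed(8))                             # 2
print(word_rate_mhz(1000, 8))                        # 125.0
print(deserialize([1, 0, 1, 1, 0, 0, 1, 0, 1], 4))   # [(1, 0, 1, 1), (0, 0, 1, 0)]
```

In this model, a 1 Gbps serial stream de-serialized by eight reaches the fabric as 125 MHz words, which illustrates how the converter decouples the line rate from the fabric clock.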
Word alignment can correct for data
skew greater than one bit period by comparing
the parallel version of the incoming
pattern to the pre-specified training pattern.
The Bitslip module enables you to
match an incoming data stream to a predetermined
data pattern by shifting the
output of the dedicated serial-to-parallel
converter. An example of this feature in
operation is given in Figure 3.
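The idea behind Bitslip can be sketched in software. The model below treats each slip as a one-bit rotation of the parallel word, which is a simplifying assumption: the shift sequence used by the actual hardware may differ. The function names and the 8-bit pattern are hypothetical.

```python
def bitslip(word):
    """Model one slip as a left rotation by one bit (simplified assumption)."""
    return word[1:] + word[:1]


def align_word(received, training, max_slips=None):
    """Return the number of slips needed to match the training pattern.

    Returns None if no match is found within max_slips attempts
    (by default, one full rotation of the word).
    """
    word = tuple(received)
    target = tuple(training)
    if max_slips is None:
        max_slips = len(word)
    for slips in range(max_slips):
        if word == target:
            return slips
        word = bitslip(word)
    return None


training = (1, 1, 1, 0, 0, 1, 0, 0)   # hypothetical 8-bit training word
received = (1, 0, 0, 1, 1, 1, 0, 0)   # same pattern, offset by three bits
print(align_word(received, training))  # 3
```

The rotation model relies on the training pattern repeating with a period equal to the word width, so that the bits shifted out of one word reappear in the next.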
The IDELAY, SERDES, and Bitslip
features are encapsulated in a module called
ISERDES, available as part of the
ChipSync technology in every single I/O.
The Virtex-4 DPA Solution
Let’s use the Virtex-4 ChipSync technology
features previously described to create a DPA
solution that meets interface requirements.
There are three basic steps in the solution:
- Bit alignment – completed during the
initialization procedure, its purpose is
to correct for skew less than one bit
time and position the clock edge at the
center of the data eye
- Word alignment – completed during
the initialization procedure, its purpose
is to align the incoming data stream to
the pre-determined training pattern
- Real-time window monitoring – continuously
monitors the data eye so that
the clock edge stays centered in the
data eye
Figure 4 illustrates the implementation
of DPA in a Virtex-4 device.
The goal of the bit-alignment procedure
is to position the captured clock edge in the
center of the data eye to provide maximum
margin. The bit-alignment procedure takes
advantage of the dedicated 64-tap delay
line feature of the ISERDES.
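One possible way to find the center of the eye is to sweep the delay taps and pick the middle of the widest run of good settings. The sketch below illustrates that idea only and is not the algorithm from the application note. `sample_ok` is a hypothetical callback that reports whether the training pattern is captured correctly at a given tap.

```python
def center_of_eye(sample_ok, num_taps=64):
    """Sweep all taps and return the middle of the longest passing run.

    sample_ok(tap) is a hypothetical probe returning True when data
    is captured reliably at that tap. Returns None if no tap passes.
    """
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for tap in range(num_taps):
        if sample_ok(tap):
            if run_len == 0:
                run_start = tap  # a new passing run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # run broken by a failing tap
    if best_len == 0:
        return None
    return best_start + (best_len - 1) // 2


# Illustrative eye: taps 10 through 29 capture the pattern correctly.
print(center_of_eye(lambda tap: 10 <= tap <= 29))  # 19
```

Choosing the middle of the widest run leaves roughly equal margin to both edges of the eye, which matches the stated goal of maximum margin.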
The word alignment procedure aligns the
output pattern from the ISERDES to a specific
training pattern. This procedure effectively
removes word skew and aligns all
channels to a specific word boundary. The
word alignment unit primarily uses the
Bitslip module of the ISERDES. Each channel
monitors the pattern streaming in. If the
training pattern is not found, Bitslip is
activated repeatedly until it is. Once the pattern
is found, the channel is, by definition, de-skewed.
After the initialization stage using the
training procedure, the channels are
assumed to remain trained throughout
normal operation. However, the data valid
window might shift because of operating
conditions. The window monitoring unit
can continuously monitor the data valid
window during normal operation and can
adjust the sampling point as necessary to
provide maximum margin.
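A window monitor can be pictured as a small tracking loop. The sketch below is a hypothetical illustration, not the design from the application note. It assumes a probe, `sample_ok`, that can test a tap setting without disturbing the live data path, and the margin of four taps is an arbitrary choice.

```python
def monitor_step(tap, sample_ok, margin=4, num_taps=64):
    """Return an adjusted tap after one monitoring pass.

    Probes the eye `margin` taps before and after the current setting.
    If only one side fails, the sampling point moves one tap away
    from the failing side; otherwise it stays where it is.
    """
    early = max(0, tap - margin)
    late = min(num_taps - 1, tap + margin)
    early_ok = sample_ok(early)
    late_ok = sample_ok(late)
    if early_ok and not late_ok and tap > 0:
        return tap - 1  # late edge is closing in: sample earlier
    if late_ok and not early_ok and tap < num_taps - 1:
        return tap + 1  # early edge is closing in: sample later
    return tap


def eye(tap):
    """Illustrative eye that has drifted to taps 20 through 40."""
    return 20 <= tap <= 40


print(monitor_step(38, eye))  # 37
print(monitor_step(30, eye))  # 30
print(monitor_step(22, eye))  # 23
```

Moving one tap per pass keeps each correction small, so in this model the sampling point follows slow drift without making abrupt jumps.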
Conclusion
Dynamic phase alignment is a critical
function in many bus interfaces as data
rates explode into the gigabit range. As
FPGAs are increasingly being used directly
in the data path of these very high speed
interfaces, dynamic phase alignment in the
FPGA is a must.
Virtex-4 ChipSync technology built
into every I/O enables you to quickly and
easily develop a DPA solution that meets
the requirements of your application.
An application note describing the
implementation of DPA is available at
www.xilinx.com/bvdocs/appnotes/xapp700.pdf. The application note, “Dynamic Phase
Alignment for Networking Applications,” is
published as XAPP 700. The reference
design enables you to quickly understand
how to implement a DPA solution that fits
your particular application.
Printable PDF version of this article with graphics. (1/15/05) 257 KB