Digital designs require good clock signals
with a short delay and minimal skew, so that
they arrive almost simultaneously at their
many on-chip destinations. Clocks must
maintain their duty cycle, which is especially
important in double-data-rate designs where
data is clocked on the rising as well as on the
falling clock edge. Clock delays and edge
rates must therefore always be closely
matched, independent of loading.
Although single-clock operation is
desirable, many systems require multiple
clocks. Often, input and output signals are
clocked very fast and require even better
timing precision than the general logic
implemented on the chip.
Xilinx® Virtex-4™ FPGAs provide significant
advances in all of these areas. Global
clocks can reach all flip-flops on the chip,
and high-speed I/O clocks provide exceptional
performance, especially for source-synchronous
interfaces. Additional regional
clocks serve specific areas on the chip.
Clock Regions
For clocking purposes, each Virtex-4 device
is divided into regions. The number of
regions varies with device size, from 8
regions in the smallest device to 24 regions
in the largest one.
Global Clocks
Independent of array size, each Virtex-4
FPGA has 32 low-skew global clock distribution
networks that can each clock all
sequential resources on the whole chip
(CLBs, block RAMs, DCMs, and I/Os) and
also drive logic signals. You can use any 8 of
these 32 global clock lines in any region.
All global clock inputs have dedicated
fast routing to the corresponding global
clock buffer, which can also be used as a
clock-enable circuit or a glitch-free multiplexer.
It can select between two clock
sources and can also switch away from a
failed clock source – a new feature in the
Virtex-4 architecture.
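
The difference between a plain multiplexer and a glitch-free one can be illustrated with a small behavioral model. The sketch below rests on a simplifying assumption about how such a circuit behaves (the old clock is released only while it is low, and the new clock is connected only while it is low); it is not a description of the actual Virtex-4 buffer. The function names and the sampled waveforms are hypothetical.

```python
def naive_mux(clk0, clk1, sel):
    """Plain multiplexer: the output follows whichever clock is selected."""
    return [(clk1 if s else clk0)[t] for t, s in enumerate(sel)]


def glitch_free_mux(clk0, clk1, sel):
    """Simplified glitch-free multiplexer (assumed behavior, see lead-in).

    All arguments are equal-length lists of 0/1 samples.
    """
    clocks = (clk0, clk1)
    active = sel[0]      # clock currently driving the output
    released = False     # True while the output is held low between clocks
    out = []
    for t, want in enumerate(sel):
        # Let go of the old clock only once it is low.
        if not released and want != active and clocks[active][t] == 0:
            released = True
        if released:
            # Connect the requested clock only while it is low.
            if clocks[want][t] == 0:
                active = want
                released = False
            out.append(0)
        else:
            out.append(clocks[active][t])
    return out


def high_pulse_widths(wave):
    """Return the width, in samples, of every high pulse in the waveform."""
    widths, run = [], 0
    for bit in wave:
        if bit:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Hypothetical test waveforms: clk0 is high for 2 samples, clk1 for 3.
clk0 = [1 if t % 4 < 2 else 0 for t in range(24)]
clk1 = [1 if t % 6 < 3 else 0 for t in range(24)]
sel = [0 if t < 5 else 1 for t in range(24)]   # switch while clk0 is high

print(high_pulse_widths(naive_mux(clk0, clk1, sel)))        # -> [2, 1, 3, 3, 3]
print(high_pulse_widths(glitch_free_mux(clk0, clk1, sel)))  # -> [2, 2, 3, 3]
```

In this example the plain multiplexer produces a one-sample runt pulse at the moment of switching, while the modeled glitch-free version only ever outputs complete pulses of one source or the other.
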
A global clock buffer is often driven by a digital clock manager (DCM) to eliminate
the clock distribution delay, or to adjust its
delay relative to another clock. There are
more global clocks than DCMs, and a DCM
often drives more than one global clock.
Virtex-4 clock trees are designed for low
skew and low power. Any unused branch is
automatically disconnected. All global clock
lines and buffers are implemented differentially.
This minimizes duty-cycle distortion
and improves common-mode noise rejection.
The whole global clock network is designed
for 500 MHz operation and beyond.
I/O Clocks and Regional Clocks
Virtex-4 devices have two additional clock
types: I/O clocks and regional clock networks,
two of each per region, used primarily
for clocks forwarded into the Virtex-4
FPGA. I/O and regional clock networks are
independent from the global clock networks,
thus offering a maximum of 12 independent
clock domains in any clock region.
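
The figure of 12 follows from the per-region limits given in the text: 8 global clock lines, 2 I/O clocks, and 2 regional clocks. As a minimal sketch, the budget check below encodes those three limits; the dictionary keys and function names are hypothetical and are not part of any Xilinx tool.

```python
# Per-region limits taken from the article text.
LIMITS = {"global": 8, "io": 2, "regional": 2}


def max_clock_domains():
    """Maximum number of independent clock domains in one clock region."""
    return sum(LIMITS.values())


def fits_in_region(requested):
    """Check a request such as {"global": 5, "io": 1} against the limits."""
    for kind in requested:
        if kind not in LIMITS:
            raise ValueError(f"unknown clock type: {kind}")
    return all(requested.get(kind, 0) <= limit for kind, limit in LIMITS.items())


print(max_clock_domains())                        # -> 12
print(fits_in_region({"global": 5, "io": 2}))     # -> True
print(fits_in_region({"global": 9}))              # -> False
```
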
Each clock region has two pairs of
clock-capable inputs, optimized for
incoming high-frequency clocks. Clock-capable
I/O pairs, like global clock inputs,
are regular I/O pairs where the LVDS output
drivers have been removed to reduce
the input capacitance.
Each of these input pins or input pin
pairs can connect to a BUFIO that drives
a high-speed differential I/O clock
network, which is dedicated to the I/O
circuits and is ideally suited for source-synchronous
data capture using the built-in
serializer/deserializer (SerDes).
Each BUFIO can drive all I/O logic in
its region as well as in the two adjacent
regions (Figure 1). This means that one
receive clock can control up to 47 differential
or 95 single-ended receive data lines,
ideal for many networking and memory
interface applications.
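
The 47 and 95 figures can be reproduced with simple arithmetic. The article does not state how many I/O pins a clock region contains; the value of 32 used below is an assumption, chosen because it yields the quoted numbers once the pin or pin pair carrying the receive clock itself is subtracted.

```python
IO_PINS_PER_REGION = 32   # ASSUMPTION: not stated in the article
REGIONS_PER_BUFIO = 3     # own region plus the two adjacent regions (from the text)


def receive_data_lines(differential):
    """Data lines one receive clock can serve, excluding the clock input itself."""
    total_pins = IO_PINS_PER_REGION * REGIONS_PER_BUFIO
    if differential:
        return total_pins // 2 - 1   # one pin pair is used by the clock
    return total_pins - 1            # one pin is used by the clock


print(receive_data_lines(differential=True))    # -> 47
print(receive_data_lines(differential=False))   # -> 95
```
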
Regional clocks form a third type of
clock network, each able to span as
many as three adjacent clock regions.
Regional clocks drive single-ended nets and
are intended for the parallel clock domain
of the SerDes.
You can program the regional clock
buffer to divide the incoming clock rate
by any integer number from one to eight.
This feature, in conjunction with the programmable
SerDes in the I/O block,
allows source-synchronous systems to
cross clock domains without using additional
logic resources.
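
As a rough illustration of how the divider relates the forwarded clock to the parallel clock domain, the sketch below picks a divide value for a given deserialization width. The relation between width, data rate, and divide value is ordinary arithmetic assumed here and is not taken from the article; only the 1-to-8 range comes from the text. The function names are hypothetical.

```python
MIN_DIVIDE, MAX_DIVIDE = 1, 8   # integer divide range given in the article


def regional_divide(parallel_width, ddr=False):
    """Divide value so that one parallel word is produced per divided clock.

    Assumes 1 bit per forwarded-clock cycle (SDR) or 2 bits per cycle (DDR).
    """
    bits_per_clock = 2 if ddr else 1
    if parallel_width % bits_per_clock:
        raise ValueError("width must be a multiple of the bits per clock cycle")
    divide = parallel_width // bits_per_clock
    if not MIN_DIVIDE <= divide <= MAX_DIVIDE:
        raise ValueError("divide value outside the 1 to 8 range")
    return divide


def parallel_clock_mhz(forwarded_clock_mhz, divide):
    """Frequency of the divided (parallel-domain) clock."""
    return forwarded_clock_mhz / divide


# Example: 8-bit words from a 400 MHz double-data-rate forwarded clock.
d = regional_divide(8, ddr=True)
print(d, parallel_clock_mhz(400, d))   # -> 4 100.0
```
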
Conclusion
Virtex-4 clocking resources have been
optimized for high clock rates and multiple
clock domains. Thirty-two global
clock networks provide high-performance
clocking across the whole chip, with short
delay, low skew, and stable duty cycles.
Many localized clock networks serve the
I/O for high-speed source-synchronous
applications. These clock networks are used
in conjunction with the built-in SerDes and
reduce the burden on global clock resources.
Last but not least, all of these resources
are easy to use. They are automatically handled
by the Xilinx ISE 6.3i software.
For more information, visit www.xilinx.com/products/virtex4/capabilities/xesium.htm.
Printable PDF version of this article with graphics. (1/15/05) 195 KB