The debate over which high-performance
90 nm FPGA has the lowest power is heating
up. The industry has crossed a critical
inflection point at the 90 nm process,
where performance competes with power
and thermal budgets. Customers want as
much performance as possible, but increasingly
the decision about which FPGA to
use is based on which device consumes the
least amount of power.
Excessive power is expensive in many
ways. It creates the need for special design
and operational considerations: everything
from heat sinks to fans to sophisticated
heat exchangers. Even the cost of
larger power supplies must be considered.
Perhaps the most critical issue is the
effect excessive power can have on reliability.
As junction temperatures rise, transistors
consume more power, further
increasing the device temperature. Left
unchecked, this phenomenon leads to thermal
runaway. Continuously operating systems
with junction temperatures from
85°C to over 100°C threaten the reliability
of the device.
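
The feedback loop described above can be sketched numerically. The following is a minimal illustration, not a model of any real device: it assumes junction temperature equals ambient temperature plus a thermal resistance times total power, and that static power doubles for every fixed rise in temperature. The function name, the doubling interval, and all numeric values are hypothetical.

```python
def junction_temperature(t_ambient_c, theta_ja_c_per_w, p_dynamic_w,
                         p_static_ref_w, t_ref_c=25.0, doubling_c=25.0,
                         max_iter=200, t_limit_c=150.0):
    """Iterate T_j = T_a + theta_ja * P(T_j) until it settles or runs away.

    Assumption (hypothetical): static power doubles every `doubling_c`
    degrees above `t_ref_c`. Returns (junction temperature, converged).
    """
    t_j = t_ambient_c
    for _ in range(max_iter):
        # Leakage-driven static power grows with junction temperature.
        p_static = p_static_ref_w * 2.0 ** ((t_j - t_ref_c) / doubling_c)
        # More power raises the junction temperature further.
        t_next = t_ambient_c + theta_ja_c_per_w * (p_dynamic_w + p_static)
        if t_next > t_limit_c:
            return t_next, False   # feedback never settles: thermal runaway
        if abs(t_next - t_j) < 0.01:
            return t_next, True    # feedback settles at a stable temperature
        t_j = t_next
    return t_j, False


# Same hypothetical device and ambient, two cooling solutions.
well_cooled = junction_temperature(50.0, 5.0, 3.0, 0.5)
poorly_cooled = junction_temperature(50.0, 20.0, 3.0, 0.5)
print(well_cooled, poorly_cooled)
```

With the lower thermal resistance the loop settles at a stable junction temperature; with the higher one each pass adds more heat than the last and the estimate exceeds the limit, which is the runaway case.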
Fortunately, Xilinx encountered the
first evidence of this 90 nm inflection point
more than three years ago, in the early
development stages of Spartan™-3 FPGAs
(the first Xilinx FPGA family with the 90
nm process). Xilinx began immediately
developing new ways to cope with the
inherent power issues posed by the 90 nm
process. Consequently, when the higher
performance Virtex™-4 family was introduced
in September 2004, Xilinx was confident
that the new family would
simultaneously deliver the highest performance
and lowest power consumption
in a 90 nm FPGA.
Reducing Power in FPGAs
There are two major components to power
consumption: static power and dynamic
power. Each component poses a unique
challenge. For the 90 nm FPGA, the most
challenging component is static power.
Static Power
Static power is the standby power that is
wasted even if the design is not performing
any function. It occurs as a result of
leakage current in the transistors within
the FPGA. Leakage current increases as
transistors get smaller with each new
process. This principle is one of the major
reasons the 90 nm process crosses a major
inflection point (Figure 1).
For the first time, static power is
threatening to eclipse dynamic power as
the component responsible for the greatest
amount of total power consumption in
an FPGA. As processes get smaller, core
voltage decreases and parasitic capacitance
decreases; consequently, the rate of
increase in dynamic power drops, despite
the increase in frequency that accompanies
a new process. In contrast, below
0.25 µm, static power has grown exponentially
with each new process.
This is where the inflection point really
becomes a critical factor for the FPGAs
and where Xilinx has established a substantial
lead. Smaller transistors are faster,
but they leak more. Thicker gate oxides
reduce leakage, but they also reduce performance.
However, unlike ASICs, ASSPs,
and microprocessors, Xilinx FPGAs do
not need all of their transistors to switch at
maximum speed. A substantial number of
transistors make up the configuration
memory cells used for programmable
logic, while pass transistors are used to
implement the programmable interconnect
routing. Configuration memory cells
do not need to be fast, and programmable
interconnect transistors only need to be
fast from source to drain and not under
gate control. These factors have allowed
Xilinx to selectively increase gate-oxide
thickness to reduce leakage current without
compromising performance.
Virtex-4 FPGAs incorporate a new
process approach called triple-oxide technology
to solve the static power problem.
Although this third gate-oxide layer is still
very thin, these transistors exhibit substantially
lower leakage than the standard
thin-oxide transistors used in Virtex-II
Pro FPGAs and in various other parts of
Virtex-4 FPGAs.
In addition, Xilinx optimized a number
of other transistor parameters (including
VT) to balance performance and leakage
across I/O, configuration memory, interconnect
pass transistors, and logic and
interconnect buffers. Figure 2 shows that
Virtex-4 FPGAs consume 50% less static
power than their predecessor, 130 nm
Virtex-II Pro FPGAs. We believe that this
is the first time in FPGA history that static
power decreased when moving to a
new, smaller process node.
Dynamic Power
The three contributing elements to
dynamic power in an FPGA are core voltage
(V), frequency (f), and parasitic
capacitance (C). In addition, dynamic
power is proportional to the data toggle
rate (k). Fortunately, core voltage and
capacitance decrease with each new
process node, which lowers dynamic
power. Conversely, increasing the operating
frequency of a design increases
dynamic power. The well-known formula
for dynamic power that applies here is:
P = k · C · V² · f
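
As a quick illustration of how the terms in this formula interact, here is a minimal sketch. The function name and every numeric value are placeholders chosen for the example, not measured device data.

```python
def dynamic_power_w(toggle_rate, capacitance_f, core_voltage_v, frequency_hz):
    """Dynamic power P = k * C * V^2 * f, in watts."""
    return toggle_rate * capacitance_f * core_voltage_v ** 2 * frequency_hz


# Hypothetical operating point: k = 0.125, C = 1 nF, f = 500 MHz.
p_high_v = dynamic_power_w(0.125, 1e-9, 1.5, 500e6)
p_low_v = dynamic_power_w(0.125, 1e-9, 1.2, 500e6)

# Voltage enters squared, so a 20% lower core voltage cuts dynamic power
# by 36% when toggle rate, capacitance, and frequency are held constant.
print(p_low_v / p_high_v)
```

Because voltage is squared while the other terms are linear, a core-voltage reduction can offset a sizable frequency increase.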
One major area of opportunity to
reduce dynamic power consumption in
FPGAs involves the way in which a design
uses embedded functions. Embedded
functions consume less static and dynamic
power when implemented as hardwired
functions instead of configurable
logic blocks and programmable interconnects.
Hard fixed logic uses far fewer transistors
than programmable logic.
Additionally, the lack of programmable
interconnect transistors in hard-wired
embedded functions further reduces
dynamic power consumption.
These hard IP cores occupy far less
real estate, deliver much higher performance,
and consume 80-95% less power
than soft IP versions of the same functions.
And by making these hard IP cores
programmable and parameterizable, you
can maintain the flexibility inherent
to FPGAs.
Functions that Xilinx provides as hard
IP cores in Virtex-4 FPGAs include:
- 450 MHz PowerPC™ processors for
all microcontroller and embedded
processing applications with an APU
(auxiliary processing unit) interface
for hardware acceleration
- 500 MHz XtremeDSP™ slices capable
of everything from simple math and
filter functions to complex
high-performance DSP functions
- 500 MHz digital clock managers
(DCM) and phase-matched clock
dividers (PMCD) that support clock
synthesis, clock management, and
phase matching
- A ChipSync™ block in every I/O
with built-in SERDES and a data-alignment
function to simplify source-synchronous
interfaces in memory,
networking, and telecom applications
- RocketIO™ transceivers (622 Mbps to
10.3125 Gbps) with built-in physical
coding sublayer (PCS) and physical
media attachment (PMA)
- Tri-mode Ethernet MACs
(10/100/1000 Mbps) that can interface
directly with RocketIO transceivers
- Smart RAM memory with distributed
RAM and 18 Kb block RAM; each
block RAM has built-in FIFO logic to
convert RAM into FIFO and comes
with built-in error correction code
(ECC) circuits
Besides the obvious advantages associated
with moving these commonly used
blocks into hard IP, you must not overlook
the inherent contribution that the Xilinx
advanced silicon modular block (ASMBL)
architecture makes to the Virtex-4 power
advantage. Each of the three Virtex-4
platforms (LX, FX, and SX) targets a
particular application domain (logic,
embedded processing, and signal processing,
respectively), so each platform's ratio of logic
cells, memory, I/O, DSP, and processors
has been optimized for that domain.
Consequently, the Virtex-4 device is the
first FPGA to offer domain-optimized
power consumption.
End-Market Power Requirements
With substantial power savings achieved
both in static power (as a result of
triple-oxide technology) and dynamic
power (using embedded hard IP), you
might wonder what it all means for your
designs. The simplest example often provides
the best perspective. Using an
equivalent amount of generic logic and
memory in Virtex-4 devices and the nearest
competitor's devices of equivalent
density, with no consideration of other
embedded IP, the Virtex-4 FPGA saved
1-5W in power (see Figure 3). But how
does this translate to measurable benefits
in real-world applications?
Power Budgets
Every product has a power budget driven
by standards, cost goals, and reliability
requirements. As power consumption
and temperature are interrelated, it is
important to meet operating temperature
goals as well. System architects have specific
power budgets at the system level —
for each board as well as for devices used
on the board. Markets where high-performance
FPGAs are used, such as wired
and wireless networks, storage/servers,
automotive, and aerospace/defense, also
have aggressive power budgets. Let's discuss
a few applications where tight power
budgets are critical.
Wired Networks: Metro Aggregation
Metro aggregation refers to the aggregation
of access connections at central offices
(COs) within a metropolitan area network
(MAN). The equipment within each CO
must operate continuously, placing a heavy
burden on operational costs and the effective
capacity of power supplies and air conditioning
systems. Any means by which
equipment vendors can help reduce total
system power consumption equates to real
benefits for service providers.
The power budgets for the cards in a
rack of metro aggregation equipment usually
average 20-30W. The FPGAs used in
these boards consume 4-5W each, and
many designs use multiple FPGAs.
For example, power budgets for
multi-service provisioning platform line
cards and their FPGAs include:
- 12-port DS3 card: 30W; FPGA = 4-5W
- 4-port OC-12 card: 28W; FPGA = 4-5W
- 12-port 10/100 Base-T card: 50W; FPGA = 4-5W
- 32-port T1/E1 card: 9W; FPGA = 2-3W
Using Virtex-4 FPGAs in these applications
would dramatically benefit service
providers' operational costs. Each Virtex-4
FPGA can save 1-5W when compared to
competitive 90 nm FPGAs.
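
To put the line-card figures above in proportion, the following minimal sketch computes what fraction of each card's budget a single FPGA consumes. The card and FPGA wattages come from the list above; the data layout and function name are only illustrative.

```python
# (card, card budget in W, (FPGA low W, FPGA high W)), from the list above.
LINE_CARDS = [
    ("12-port DS3", 30.0, (4.0, 5.0)),
    ("4-port OC-12", 28.0, (4.0, 5.0)),
    ("12-port 10/100 Base-T", 50.0, (4.0, 5.0)),
    ("32-port T1/E1", 9.0, (2.0, 3.0)),
]


def fpga_share(card_budget_w, fpga_range_w):
    """Return the (low, high) fraction of the card budget used by one FPGA."""
    low_w, high_w = fpga_range_w
    return low_w / card_budget_w, high_w / card_budget_w


for name, budget_w, fpga_w in LINE_CARDS:
    low, high = fpga_share(budget_w, fpga_w)
    print(f"{name}: one FPGA uses {low:.0%} to {high:.0%} of {budget_w:.0f}W")
```

On the T1/E1 card a single FPGA accounts for between roughly a fifth and a third of the budget, so even a savings at the low end of the 1-5W range is significant there. Cards with multiple FPGAs multiply these shares accordingly.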
Wired Networks: Metro Access
Unlike the metro aggregation equipment
deployed in COs, metro access equipment
exists at the edge of the network. It is
deployed outdoors, where air flow is limited
and air conditioning is virtually non-existent.
Example systems include passive optical
networks (PONs), digital loop carriers
(DLCs), and cable modem termination systems
(CMTS). These systems operate continuously
at temperatures often well above
85°C, taking junction temperatures as high
as 100°C. Transistor leakage current, and
hence static power, increases with temperature.
As a result, equipment vendors in this
space are constrained by stringent power
budgets (10 to 12W per card, 4 to 8W per
FPGA) to ensure reliability.
As power-sensitive as these applications
are, saving as little as 0.5W can make a
design workable. Virtex-4 devices eliminate
1-5W per FPGA, dramatically
benefiting both equipment vendors
and service providers.
Wireless Base Stations
Because cellular networks are quick to deploy
and inexpensive to establish, growth of the cell
phone market has overtaken the growth of
fixed-telephony networks. Once again,
service providers can measure the value of
reduced power consumption in Virtex-4
FPGAs both in terms of the mitigation of
reliability issues (arising from the outdoor
environment in which the base stations
are deployed) as well as the reduction of
operational expenditures.
Service providers running a typical
wireless base station network of 35,000
units can save roughly $1M per year
just in electricity charges. Consider the
following power budgets:
- 16 line cards/base station;
1 FPGA/line card
- Power budget/line card = 20W
- FPGA power budget = 6W
Based on an extremely conservative
estimation of a 2W power reduction
using Virtex-4 FPGAs, service providers
would see a 32W power savings per base
station, amounting to a savings of 1.12
MW for the entire network. Using
10¢/kWh, this saves about $1M per year
for 35,000 base stations in the network.
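
The arithmetic behind this estimate can be reproduced in a few lines. This is a minimal sketch using the figures given above; the one added assumption is that every base station runs around the clock for a 365-day year (8,760 hours), and the function name is illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: continuous operation, non-leap year


def annual_savings(base_stations, cards_per_station, fpgas_per_card,
                   watts_saved_per_fpga, dollars_per_kwh):
    """Return (W saved per station, W saved network-wide, $ saved per year)."""
    watts_per_station = cards_per_station * fpgas_per_card * watts_saved_per_fpga
    network_watts = watts_per_station * base_stations
    kwh_per_year = network_watts / 1000.0 * HOURS_PER_YEAR
    return watts_per_station, network_watts, kwh_per_year * dollars_per_kwh


per_station_w, network_w, dollars = annual_savings(
    base_stations=35_000, cards_per_station=16, fpgas_per_card=1,
    watts_saved_per_fpga=2.0, dollars_per_kwh=0.10)

print(f"{per_station_w:.0f} W per base station")
print(f"{network_w / 1e6:.2f} MW across the network")
print(f"${dollars:,.0f} per year")
```

Under these assumptions the result is 32W per base station, 1.12 MW across the network, and about $981,000 per year, which rounds to the $1M cited above.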
Cutting 32 watts per base station also
impacts service providers' bottom lines in
the form of capital expenditure reductions
for cooling equipment costs, battery backup
costs, and power supply and power
management costs.
Conclusion
The battle to deliver maximum performance
at the lowest cost has taken center
stage in the evolution of FPGAs. Today,
customers are demanding minimum
power expenditure as well. Power conservation
impacts every budget, whether
technological or financial. Product acceptability,
reliability, and profitability depend
as much or more on power efficiency as
they do on performance. Besides offering a
robust feature set, Virtex-4 FPGAs exhibit
a real power consumption advantage.
Nonetheless, competition in the FPGA
market does not end with 90 nm devices.
Interesting new dynamics arise when
moving into the 65 nm node and below.
Fortunately for Xilinx, one inherent value
of using triple-oxide technology is that it
scales nicely with each new process.
As for the value of embedding hard IP
where appropriate, it is practically an
industry axiom. Xilinx has incorporated
the right amount of programmable
embedded IP with programmable logic to
make the whole solution more flexible,
with higher performance and lower
power. In the long term, customers will
only use platform FPGAs that provide the
best performance and power.
For more information about power
budgets, seminars/tutorials, white papers,
and power analysis/optimization tools,
visit www.xilinx.com/virtex4/lowpower.
Printable PDF version of this article with graphics. (7/11/05) 310 KB