Each generation of FPGAs gets faster,
denser, and larger. What can you do to
ensure that power doesn’t increase along
with it? A number of design decisions
can impact the power consumption of your
system, ranging from the obvious choice of
device selection to the more minute details
of choosing state machine values based on
frequency of use.
To understand why the design techniques
we’ll discuss in this article conserve
power, let’s give a brief primer on
power consumption.
Power consumption has two components: dynamic
and static power. Dynamic power is the
power required to charge and discharge the
capacitive loads within the device. It is
highly dependent on frequency, voltage,
and loading. Each of these three variables is
under your control in one form or another.
Dynamic Power = Capacitance x Voltage² x Frequency
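As a rough illustration of how these three variables interact, the short Python sketch below evaluates the formula above. The function name and the example values (a 10 pF switched load, a 1.2 V supply, a 100 MHz switching frequency) are assumptions chosen only for illustration, not figures from any data sheet.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Dynamic power in watts: capacitance x voltage^2 x frequency."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical example values: 10 pF load, 1.2 V supply, 100 MHz.
baseline = dynamic_power(10e-12, 1.2, 100e6)

# Halving the frequency halves the power.
half_frequency = dynamic_power(10e-12, 1.2, 50e6)

# Lowering the voltage pays off quadratically.
lower_voltage = dynamic_power(10e-12, 1.0, 100e6)

print(f"baseline:       {baseline * 1e3:.2f} mW")
print(f"half frequency: {half_frequency * 1e3:.2f} mW")
print(f"lower voltage:  {lower_voltage * 1e3:.2f} mW")
```

Because voltage enters as a square, a modest drop in supply voltage saves proportionally more than the same relative drop in frequency or loading.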
Static power is the sum of power
caused by leakage (source-to-drain and
gate leakage, often lumped as quiescent
current) for all of the transistors in the
device, as well as any other constant
power requirements. Leakage current is
highly dependent on junction temperature
and transistor size. For more details,
see Xilinx® White Paper 221, “Static
Power and the Importance of Realistic
Junction Temperature Analysis,” at
www.xilinx.com/bvdocs/whitepapers/wp221.pdf.
Constant power requirements include the
DC current drawn through termination,
such as a pull-up resistor. Not much
can be done to affect leakage, but constant
power can be controlled.
Think About Power Early
The decisions you make about power have
the greatest impact in the early stages of
your design. Deciding on a part can have
huge implications on power, whereas
inserting a BUFGMUX on a clock will
have much less impact. It is never too
early to start thinking about power for
your next design.
The Right Part for the Job
Not all parts have the same quiescent power.
As a general rule, the smaller the device
process technology, the higher the leakage
power. But not all process technology is created
equal. For example, the dramatic differences
in quiescent power for 90 nm
technology between Virtex™-4 devices and
other 90 nm FPGA technology can be seen in
Xilinx White Paper 223, “Power vs.
Performance: The 90 nm Inflection Point,” at
www.xilinx.com/bvdocs/whitepapers/wp223.pdf.
However, while quiescent power rises as
process technology shrinks, dynamic power
decreases because smaller processes come
with lower voltage and capacitance.
Consider what will be more relevant to your
design – standby (quiescent) power or
dynamic power.
All Xilinx devices have dedicated logic
in addition to the general-purpose slice
logic cells. These take the form of block
RAM, 18 x 18 multipliers, DSP48 blocks,
and SRL16s, among others. You should
always use dedicated logic rather than its
slice-based equivalent. Not only does dedicated
logic have higher performance, but it
also occupies less of the device and therefore
consumes less power for the same operation.
Consider the types and quantity of
dedicated logic when evaluating your
device options.
Selecting an appropriate I/O standard
can save power as well. These are simple
decisions, such as choosing the lowest
drive strength or lower voltage standards.
When a high-power I/O standard is
required for system speeds, plan on a
default state to lower power. Some I/O
standards (such as GTL and GTL+) require a
pull-up to function properly. If the default
state of such an I/O is high instead of low,
no DC current flows through the termination
resistor, and that power is saved. For GTL+,
with a 50 ohm termination to 1.5V, setting
the proper default state can save about
30 mA of current per I/O.
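The 30 mA figure follows from Ohm's law. The sketch below works through the arithmetic; it assumes an idealized case in which a low output pulls all the way to ground, so the full 1.5V appears across the 50 ohm resistor (a real driver's low level sits above ground, so the actual current would be somewhat lower). The function name is hypothetical.

```python
def termination_savings(termination_voltage_v, resistance_ohm):
    """Return (current_a, power_w) dissipated in a pull-up termination
    while the output is held low, assuming the output reaches 0 V."""
    current_a = termination_voltage_v / resistance_ohm
    power_w = termination_voltage_v * current_a
    return current_a, power_w


# Values from the text: 50 ohm termination to 1.5 V.
current_a, power_w = termination_savings(1.5, 50.0)
print(f"{current_a * 1e3:.0f} mA, {power_w * 1e3:.0f} mW per I/O")
# prints: 30 mA, 45 mW per I/O
```

Multiplied across a wide bus, the saving from idling high instead of low adds up quickly.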
Data Enable
Chip select or clock-enable logic is often
used to enable registers when the data on
the bus is relevant to them. Take this a step
further and “data enable” the logic as early
as possible to prevent unnecessary transitions
between the data bus and combinatorial
logic to the clock-enabled registers, as
shown in Figure 1. The waveforms in red
indicate the original design; the ones in
green indicate the modified design.
Another option is to perform this “data
enable” on the board instead of on the chip.
Xilinx Application Note 347, “Decrease
Power Consumption of a Processor using a
CoolRunner™ CPLD,” at www.xilinx.com/bvdocs/appnotes/xapp347.pdf, discusses this
concept to minimize processor clock cycles.
The concept here is to use a CPLD to
offload simple tasks from the processor,
allowing it to stay in a standby mode longer.
Applying this same idea to FPGAs is
certainly feasible as well. Although
FPGAs do not necessarily have a standby
mode, using a CPLD to intercept bus
data and selectively feed data to the
FPGA can save unnecessary input transitions.
CoolRunner-II CPLDs contain a
feature called “data gate,” which disables
logic transitions on the pin from reaching
the internal logic of the CPLD. The data
gate enable may be controlled either by
logic on-chip or by a pin.
State Machine Design
Enumerate state machines based on the
anticipated next state condition and
choose state values that have few switching
bits between common states. By
doing so, you can minimize the number
of transitions (frequency) for state
machine nets. Identifying common state
transitions and selecting values appropriately is a simple way to reduce power with
little impact on the design. Simpler
encoding styles (one-hot or Gray code)
also use less decode logic.
Consider a state machine where the most
frequent transitions are between states 7
and 8. If you select binary encoding for this
state machine, every transition between
7 (0111) and 8 (1000) changes all four
state bits, as shown in Table 1.
If the state machine were designed
using a Gray code instead of binary,
moving between these two states would
require only one bit to change.
Alternatively, encoding states 7 and 8
as 0010 and 0011, respectively, would
serve the same purpose.
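The bit counts in this example are easy to check. The following Python sketch counts how many state bits differ for the 7-to-8 transition under each encoding; the helper names are hypothetical, and the Gray code used is the standard reflected binary code, which is only one of several possible Gray orderings.

```python
def bits_changed(a, b):
    """Number of bit positions that differ between two state codes."""
    return bin(a ^ b).count("1")


def gray_code(n):
    """Standard reflected binary Gray code for the integer n."""
    return n ^ (n >> 1)


binary_cost = bits_changed(7, 8)                      # 0111 -> 1000
gray_cost = bits_changed(gray_code(7), gray_code(8))  # 0100 -> 1100
custom_cost = bits_changed(0b0010, 0b0011)            # hand-picked codes

print(binary_cost, gray_cost, custom_cost)  # prints: 4 1 1
```

The same helper can be used to score any candidate state assignment: weight each transition's bit count by how often that transition is expected to occur, and prefer the assignment with the lowest total.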
Clock Management
Of all the signals in a design that can
draw power, clocks are the largest offenders.
Although a clock may run at 100
MHz, the signals derived from this clock
often run at a small fraction of the main
clock frequency (commonly 12% to
15%). In addition, the fanout for clocks
is naturally high. Together, these two
factors make clocks a prime target for
power reduction.
If a section of a design can be in an
inactive state, consider using a BUFGMUX
to disable the clock tree from toggling,
instead of using clock enables.
Clock enables will prevent registers from
toggling unnecessarily; however, the clock
tree will still toggle, consuming power.
Still, clock enables are better than nothing.
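To see why stopping the clock tree matters more than clock enables alone, consider the toy model below. It is a sketch under stated assumptions, not a substitute for a real power estimate: the capacitance values, supply voltage, data toggle rate, and 25% active time are all hypothetical, and the model ignores static power and any overhead of the gating logic itself.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Dynamic power in watts: capacitance x voltage^2 x frequency."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


def section_power(strategy, active_fraction, clock_tree_c=50e-12,
                  data_c=200e-12, voltage_v=1.2, clock_hz=100e6,
                  data_rate=0.125):
    """Average dynamic power of one design section (toy model).

    strategy: "none", "clock_enable", or "gated_clock".
    active_fraction: share of time the section does useful work (0..1).
    All default values are hypothetical, chosen only for illustration.
    """
    tree = dynamic_power(clock_tree_c, voltage_v, clock_hz)
    data = dynamic_power(data_c, voltage_v, clock_hz * data_rate)
    if strategy == "none":
        return tree + data                    # everything always toggles
    if strategy == "clock_enable":
        return tree + data * active_fraction  # tree still toggles
    if strategy == "gated_clock":
        return (tree + data) * active_fraction
    raise ValueError(f"unknown strategy: {strategy}")


for name in ("none", "clock_enable", "gated_clock"):
    power_mw = section_power(name, active_fraction=0.25) * 1e3
    print(f"{name:13s} {power_mw:.2f} mW")
```

In this model the clock enable only trims the data-net contribution, while gating the clock also removes the clock tree's share during idle time.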
Isolate clocks to use the fewest quadrants
possible. Unused clock tree
quadrants will not toggle, thereby lowering
the load on the clock net. Careful
floorplanning may achieve this goal without
affecting the actual design.
Power Estimation Tools
Xilinx provides power estimation tools in
two forms: a pre-implementation tool
called Web Power Tools and a post-implementation
tool called XPower. The Web
Power Tools at www.xilinx.com/power provide
power estimation based on ballpark
estimates of logic usage. With this, you can
get a power assessment with only a design
utilization estimate – no actual design files.
XPower is a post-implementation tool
that analyzes the actual device usage and, in
conjunction with actual post-fit simulation
data (in the form of a VCD file), delivers
accurate power data. With XPower, you can
analyze design changes for impact on overall
power without touching a piece of silicon.
Web-Based Power Tools
Web-based power estimation is the quickest
and easiest way to get an idea of device
power consumption early in the design
flow. A new version of these tools is
released every quarter, so information is
current, and no installation or downloading
is required – just an Internet connection
and a web browser. You can specify
design parameters and save and load design
settings, eliminating the need to re-enter
design parameters with iterative use. Just an
estimate of design behavior and a target
device will get you started.
XPower – Integrated,
Design-Specific Power Analysis
XPower, a free part of all Xilinx ISE™
design tool configurations, allows you to
get a much more detailed estimate of your
design-based power requirements.
XPower estimates device power based on
a mapped or placed and routed design.
XPower calculates an estimate of power
with an average design suite error of less
than 10% for mature in-production
FPGAs and CPLDs. It considers device
data along with your design files and
reports estimated device power consumption
at a high level of accuracy, customized
to your specific design.
XPower is integrated directly into ISE
software and gives hierarchical and detailed
net power displays, detailed
summary reports, and a power
wizard that makes it easy for
new users to run. XPower can
accept simulated design activity
data and runs in both GUI
and batch mode (Figure 2).
XPower considers each net
and logic element in the
design. The ISE design files
provide exact resource use;
XPower cross-references
routing information with
characterized capacitance
data. Physical resources are
then characterized for capacitance.
Design characterization is continuous
and ongoing for newer devices to provide
the most accurate results. XPower
uses net toggle rates as well as output
loading. XPower then computes power
and junction temperature, and can display
individual net power data as well.
Conclusion
Increasing demands for cheaper and simpler
thermal management and power supplies,
coupled with the increasing power
requirements of cutting-edge FPGAs, have
raised the importance of designing for
low power. The latest device offering from
Xilinx, Virtex-4 FPGAs, offers the high
performance of 90 nm without the
assumed dramatic increase in static
power. With Xilinx power estimation
tools and the low-power design
considerations described here, meeting
your power goals is easier than ever.
Printable PDF version of this article with graphics. (7/11/05) 260 KB