Customers in today’s demanding communications
and consumer applications need
to attain unprecedented levels of capacity
and performance while reducing power
consumption and overall cost. With the
introduction of high-end devices into the
marketplace, more of these applications are
being addressed by FPGA solutions.
As professional programmable logic
designers, you are always searching for better
ways to create value and differentiate
your products. To do so effectively, you
need to adopt comprehensive, high-productivity
design flows instead of point tools
to crack new design challenges and take
advantage of the benefits of the latest programmable
silicon platforms.
Multiple Platforms, Unprecedented Opportunity
With the release of Xilinx® Virtex-4™
devices, you can enjoy twice the density,
twice the performance, and half the power
consumption of previous Xilinx FPGA
families. If you seek sheer DSP performance,
you might prefer Virtex-4 SX FPGAs,
which offer 256 GigaMAC/s performance
for 18-bit operations. The LX family of
FPGAs offers higher performance logic;
with FX devices, you can explore embedded
processing and high-speed serial connectivity
applications. These three
platforms, comprising a complete selection
of 17 devices, collectively offer a compelling
alternative to ASICs and ASSPs.
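As a rough check on the 256 GigaMAC/s figure, you can multiply the number of DSP slices by the clock rate. The sketch below assumes 512 slices, each completing one 18-bit multiply-accumulate per cycle at 500 MHz; those two numbers are assumptions used for illustration and are not stated in this article.

```python
# Assumed figures (not stated in the article): 512 DSP slices, each
# completing one 18-bit multiply-accumulate per clock cycle at 500 MHz.
DSP_SLICES = 512
CLOCK_HZ = 500_000_000


def peak_gmacs(slices: int, clock_hz: int) -> float:
    """Peak throughput in GigaMAC/s if every slice does one MAC per cycle."""
    return slices * clock_hz / 1e9


print(peak_gmacs(DSP_SLICES, CLOCK_HZ))
```

This is a peak number: it is reached only if every slice is kept busy on every cycle, which a real design may not achieve.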
To fully exploit this immense potential,
design teams must consider moving away
from serial, iterative, point-tool approaches
that involve designing or re-designing
from scratch. To manage non-recurring
engineering time and costs and create efficient,
reliable flows, you must clearly identify
which of the various “building blocks”
you need to focus on when using a platform
approach to successfully implement a
high-end design.
Typical building blocks may include:
- Intellectual property such as internal
company, Xilinx, or third-party IP
- Lower-level blocks used in the context
of a bottom-up design flow
- Algorithms written in C, C++, or
MATLAB™
- RTL blocks
- Embedded processors
- I/O interfaces
By using a comprehensive, methodical
design flow, you can effectively optimize
these blocks in a multimillion-gate device.
As high-end FPGAs approach ASIC-level
performance, designers are adapting
many advanced ASIC techniques for
FPGA design. The complex FPGA design
flow shares some commonality with ASIC
design; for instance, RTL simulation
remains basically unchanged. But certain
subtle differences exist under the hood, and
many steps are fundamentally different.
The pre-built nature of FPGAs implies a
“use or lose” approach to features or capabilities,
so you must match functional
requirements with the device architecture.
Thus, common steps such as synthesis or
place and route all differ subtly in the
FPGA domain.
You can use C++ synthesis techniques
borrowed from ASIC flows to target
FPGAs. C++ specifications are much less
tied to any specific hardware than the corresponding
RTL code.
Another technique, physical synthesis,
illustrates the subtleties involved when the
same general approach is used for both
ASICs and FPGAs. Physical synthesis
requires a detailed understanding of the
FPGA’s hardware structure. At the very
least, physical synthesis tools must be more
specifically targeted to FPGA architectures.
A typical high-end FPGA design flow
should encompass such tasks as:
- Early design rule checking
- Higher level design abstraction
- Functional and system-level simulation
and verification
- Advanced physical synthesis techniques
Let’s describe each of these in more detail.
Integrated Approach to Design Creation
In terms of design entry, the need to create
faster, larger, and more complex designs packed
into the latest FPGA devices within the
shortest possible time presents significant
challenges. The high availability of configurable
logic in platform FPGAs that
include hard ASIC macros – such as
embedded processor blocks and complex
I/O standards – has truly enabled programmable
SoC, where a serialized design
approach would not work. Only a system-level
RTL design concept, used in parallel
with multiple aspects of managing and
optimizing the high-level design creation
process, will ensure success.
Large design projects mandate the collaboration
of several engineers or engineering
teams, often belonging to separate
companies and typically distributed in different
geographic locations worldwide.
This team-based approach raises the
importance of a consistent design coding
style for teams to share code effectively.
Teams invariably comprise experienced
project leaders and designers alongside less
experienced junior engineers working on
the various building blocks of a design. The
resulting skill diversity makes the need for
consistency critical. It is imperative that
companies carefully scrutinize the planning
and creation process to identify poor
design styles, incorrect design rules, and
syntax/semantic errors at the earliest possible
stage before even attempting to tie the
building blocks together or simulate/synthesize
the design.
In bigger designs, it is not unusual for
multidisciplinary design teams to focus on
and optimize only a portion of the device.
As the system is defined in RTL by combining
both vendor and internal IP (and, for applications
that use DSP functionality, algorithmically generated RTL), you will
need an integrated system design approach
to help synchronize the development of each
specific part of a large, high-capacity FPGA.
From the configuration of the embedded
processor to logic development and high-speed
I/O assignment, the ideal synchronization
of these teams and processes is
required to deliver an optimized field-programmable
SoC. The merging and management
of these multiple disciplines to generate
the system-level RTL and associated
design files is a huge task best handled by a
comprehensive and flexible environment.
To reduce development cost and time to
market, 80-90% of projects may now include
both rework of an existing design and
reuse of previously designed components or
IP, whether internal or purchased. Because
this trend is expected to increase, you need to
ensure that your components/subsystems are
designed to be reusable and conform to established
design reuse rules.
Through cooperative efforts in the design
community and internal corporate standardization,
the industry has developed a number
of reuse methodology guidelines that can be
checked using automated tools. Tools such as
Mentor Graphics® HDL Designer Series™
(HDS) can help design teams successfully
integrate both hard and soft IP (such as
PowerPC™ and MicroBlaze™ processors).
Larger designs at higher speeds have prolonged
traditional simulation cycles.
Similarly, synthesis can become a protracted,
iterative process in order to achieve desired
performance goals. You need to maximize the
productivity of potentially long EDA tool
runs by ensuring that as many code errors as
possible are found and fixed before the start
of simulation and synthesis (Figure 1).
Equally important are integrated connections
to advanced tools such as
DesignAnalyst™ and Precision® Synthesis
from Mentor Graphics to ensure against
errors and reduce iterations, as well as integration
with any third-party EDA tools
through a flexible integration mechanism.
Through static design checking or “linting”
products, you can perform many different
forms of checking during the design
creation process.
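To make the idea of static checking concrete, here is a minimal sketch of a text-based checker for Verilog-style source. The two rules it applies (a maximum line length and snake_case names for `wire` and `reg` declarations) are hypothetical house rules chosen for illustration; commercial linting products check far more, including semantic and synthesis-related rules.

```python
import re

# Hypothetical house rules, chosen only for illustration.
MAX_LINE_LENGTH = 80
DECLARATION = re.compile(r"^\s*(?:wire|reg)\b(?:\s*\[[^\]]*\])?\s*(\w+)")
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9_]*$")


def lint(source: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for each rule violation found."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            findings.append((number, "line exceeds maximum length"))
        match = DECLARATION.match(line)
        if match and not SNAKE_CASE.match(match.group(1)):
            name = match.group(1)
            findings.append((number, f"signal '{name}' is not snake_case"))
    return findings


EXAMPLE = """\
module counter(input clk, output [7:0] q);
  reg [7:0] count_value;
  wire CarryOut;
  assign q = count_value;
endmodule
"""

for number, message in lint(EXAMPLE):
    print(f"line {number}: {message}")
```

Because a check like this runs in seconds on source text alone, you can apply it before every simulation or synthesis run, so that style problems never consume a long tool run.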
Interactive HDL visualization and creation
tools provide automatic documentation
features and reporting as well as
intelligent debug and analysis to effectively
manage FPGA designs. Moreover, tight bidirectional
communications with PCB tools
from within the design creation process
shorten design cycles by integrating and synchronizing
HDL design with PCB design,
eliminating time-consuming manual steps.
Higher Abstraction Levels Speed Hardware Design
For the first time, professional design engineers
are struggling to keep pace
with Moore’s Law, which makes it difficult
to fully utilize the capacity of 90 nm ASICs
or efficiently target the complex structures
found in domain-specific FPGAs.
Algorithmic C synthesis (Figure 2) promises
to raise the abstraction of hardware
design by providing a new, more abstract
entry point, benefiting both ASIC and
FPGA hardware designers. But to understand
the need for higher abstraction languages,
you must first analyze the problems
with existing RTL methodologies.
The design complexity of new DSP
applications has outpaced traditional RTL
capabilities. To create hardware implementations
for blocks of computationally
intensive algorithms using RTL, design
teams must iterate through several steps,
including micro-architecture definition,
handwritten RTL, and area/speed optimization
through RTL synthesis. This
manual process is slow and error-prone. In
the final result, both the micro-architecture
and technology characteristics become
hard-coded into the RTL description. This
hard coding renders the whole notion of
RTL reuse or retargeting impractical in real
applications.
An optimized C-to-RTL synthesis flow
not only promotes a higher level of
abstraction, it also gives the design team
the flexibility to transition from one implementation
technology to another. You can
tune the hardware for high-performance
parallel implementations or smaller, more
serial implementations.
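The trade-off between parallel and serial implementations can be illustrated with a small behavioral model. The sketch below is plain Python, not the output of any synthesis tool and not a synthesizable description; the function name and the `multipliers` parameter are hypothetical. It computes one FIR filter output and counts the cycles needed when a given number of multipliers is available.

```python
from math import ceil


def fir_output(samples, coefficients, multipliers):
    """Compute one FIR output and the cycles used with N multipliers.

    The arithmetic is identical for every schedule; only the number of
    multiply-accumulates issued per cycle changes.
    """
    taps = len(coefficients)
    accumulator = 0
    cycles = 0
    for start in range(0, taps, multipliers):
        # One clock cycle: up to `multipliers` products formed in parallel.
        for i in range(start, min(start + multipliers, taps)):
            accumulator += samples[i] * coefficients[i]
        cycles += 1
    assert cycles == ceil(taps / multipliers)
    return accumulator, cycles


samples = [1, 2, 3, 4, 5, 6, 7, 8]
coefficients = [1, 0, -1, 2, 2, -1, 0, 1]

for multipliers in (1, 2, 8):
    result, cycles = fir_output(samples, coefficients, multipliers)
    print(f"{multipliers} multiplier(s): result={result}, cycles={cycles}")
```

The same algorithm yields the same result in every case; only the resource count and the latency differ. That choice is the one a C-to-RTL flow lets you defer, whereas handwritten RTL fixes it in the source.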
Using this approach to describe functional
intent (offered in the Mentor
Graphics Catapult™ C Synthesis tool),
you can move up to a far more productive
abstraction level for designing hardware. As
hardware designers, you can reduce implementation
efforts by as much as 20X while
creating a more repeatable and reliable
design flow.
The ability to select fundamentally
superior micro-architectural alternatives
allows you to create designs of better quality
than traditional RTL methods. Finally,
this approach closes the conceptual gap
between algorithm designers modeling in
C/C++ and hardware designers working at
the RTL abstraction level.
Simulation and Verification Challenges
Using standard RTL verification methods
in high-capacity FPGAs quickly diminishes
the benefits of faster hardware creation.
The current execution speeds of software
validation platforms and RTL verification
environments are insufficient to quickly
test design functionality. Design verification
takes significantly longer than design
development because of the limited speed
of RTL simulators and the time needed to
manually create an RTL test bench.
Additionally, C/C++ simulation
(although upwards of 10,000X faster than
RTL) may be inadequate to validate the
original algorithm given the data-intensive
nature of DSP designs. These challenges
are in fact opportunities for both algorithm
development and system validation
through the use of accelerated simulation.
High-level design verification flows are
now turning to address rapid algorithm
validation and verification, using hardware
acceleration by leveraging the benefits of a
SystemC verification environment. These
flows begin with the algorithm designer
validating designs in C++ and end with
the hardware designer verifying the algorithm
in RTL.
This method of using high-level
C/C++ synthesis in combination with a
SystemC verification environment provides
an automated path from algorithm
development to synthesized RTL running
in an FPGA prototyping environment.
Executing the algorithm directly in hardware
gives algorithm designers the ability
to validate algorithms and hardware
designers the ability to validate the entire
system at or near real-time speeds.
The use of SystemC as a verification
environment permits both algorithm and
hardware designers to use the same test
bench and test vectors, eliminating the
need for manual test bench creation. The
combined approach of hardware acceleration
of C/C++ algorithms in a SystemC
verification environment provides a push-button
solution for accelerated algorithm
development and system validation.
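The value of a shared test bench can be sketched in a few lines. The example below is plain Python rather than SystemC, and both models are hypothetical stand-ins: one represents the algorithm designer's reference code, the other a hardware-style version of the same operation built from shifts and adds. A single set of test vectors drives both.

```python
def reference_scale(sample: int) -> int:
    """Algorithm-level model: scale by 3/4, rounding toward minus infinity."""
    return (sample * 3) // 4


def hardware_scale(sample: int) -> int:
    """Hardware-style model: the same scaling built from shifts and adds."""
    return ((sample << 1) + sample) >> 2


# One set of test vectors shared by both models.
TEST_VECTORS = [0, 1, 2, 3, 100, 255, -1, -128]


def run_test_bench(model) -> list[int]:
    """Apply the shared vectors to a model and collect its responses."""
    return [model(sample) for sample in TEST_VECTORS]


expected = run_test_bench(reference_scale)
observed = run_test_bench(hardware_scale)
mismatches = [
    (vector, want, got)
    for vector, want, got in zip(TEST_VECTORS, expected, observed)
    if want != got
]
print("mismatches:", mismatches)
```

Because both models answer to the same vectors, any disagreement points to the implementation rather than to a difference between two separately written test benches.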
Balancing the Cost/Timing Closure Equation
An essential step in realizing a high-capacity
FPGA design is to optimize that
design for both timing and cost. Timing
closure challenges are well known. Using
stand-alone logic synthesis with place and
route can be non-deterministic by nature,
especially for large devices.
Designers tend to write and rewrite
RTL code and constraints to try to coax
the place and route tool to do their bidding.
Once you go down this path, you
then must iterate through place and route
– the most time-consuming step in FPGA
design – before gaining any visibility as to
whether your changes were a step in the
right direction or if they only served to
further exacerbate the problem.
Similar to optimization for timing, the
process of achieving true “cost closure”
involves a reduction in area to reduce
FPGA part cost, or a reduction in the total
cost of the design by increasing levels of
abstraction and design reuse. The irony is
that once you attain a successful implementation,
any change – no matter how
small – in the design or architecture threatens
to obsolete that success. This unpredictability
negates the reduced cost and
time-to-market benefits of using programmable
logic in the first place.
Increasing die sizes place additional burdens
on the extant methodologies. A large
die poses a significant challenge in obtaining
repeatable, high-quality placements out of
current placement algorithms. The larger die
size is now widening the distribution curve
of net delays grouped by fanout, the basis
behind industry-accepted wire delay models.
This widened distribution has a degrading
effect on the accuracy of fanout-based
wire delay models. In larger devices, interconnect
delay dominates performance for
FPGA platforms. Because fanout-based
delay estimates in FPGAs struggle to model
even a simplified version of physical reality
today, you can see why optimization decisions
based on a wire-load estimate are often
ineffective. Worse, physical proximity cannot
always relate directly to delay, so traditional
floorplanning falls painfully short.
Advanced physical synthesis techniques can
solve these issues in several ways.
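The weakness of a fanout-based estimate is easy to see in a toy model. The sketch below compares a wire-load style estimate, which depends only on the number of sinks, with a placement-aware estimate based on Manhattan distance to the farthest sink. The delay constants, coordinates, and function names are all hypothetical and are not taken from any device data sheet.

```python
# Illustrative constants only; not taken from any device data sheet.
DELAY_PER_FANOUT_NS = 0.3
DELAY_PER_UNIT_DISTANCE_NS = 0.05


def fanout_delay(sink_count: int) -> float:
    """Wire-load style estimate: depends only on the number of sinks."""
    return DELAY_PER_FANOUT_NS * sink_count


def placed_delay(driver, sinks) -> float:
    """Placement-aware estimate: Manhattan distance to the farthest sink."""
    return max(
        DELAY_PER_UNIT_DISTANCE_NS * (abs(x - driver[0]) + abs(y - driver[1]))
        for x, y in sinks
    )


# Two hypothetical nets with the same fanout but different placements.
local_net = ((10, 10), [(11, 10), (10, 12)])
spread_net = ((10, 10), [(90, 10), (10, 75)])

for name, (driver, sinks) in (("local", local_net), ("spread", spread_net)):
    print(name, fanout_delay(len(sinks)), placed_delay(driver, sinks))
```

Both nets receive the same fanout-based estimate, yet the net spread across the die is far slower in the placement-aware model. Even this model is optimistic for an FPGA, because delay also depends on which routing resources a net uses, not on distance alone.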
First, to improve accuracy and reduce
design iterations, you must consider real
interconnect delay and physical effects up
front (Figure 3); combining logic and physical
synthesis is critical for the design of larger,
high-performance FPGAs. Some physical
synthesis alternatives available today are
based solely on technology borrowed from
the ASIC implementation space.
In reality, forcing an ASIC methodology
– and mentality – on the FPGA world cannot
work. Such approaches essentially try
to outsmart the vendor placement and may
show promise in certain situations, but
most cannot match the performance of a
tool that leverages the FPGA vendor’s post-layout
information to provide accurate
physically aware synthesis.
Second, FPGA-oriented physical synthesis
solutions need to take into account
successful implementation experience that
you have previously developed. For
instance, when you complete a modular
design and have optimized performance for
a portion of it using physical synthesis, a
good tool must ensure that you can take
full advantage of these optimizations and
reuse them on subsequent designs.
Physical synthesis in FPGAs is growing
beyond the ASIC model to be a valuable
part of cost minimization and component
reuse strategies. When investing in a synthesis
tool with a highly deterministic
process for improved results, look for technologies
and algorithms that not only optimize
designs for cost and timing, but also
enable you to translate your professional
experience and previous design implementations
at the physical level into faster time
to market in subsequent designs.
Any tool used in professional FPGA
design (including the Precision Synthesis
tool from Mentor Graphics) should consider
FPGA vendor placement results as
soon as possible, and only then begin to
manipulate the design using physical synthesis
– integrated with logic synthesis in a
unified data model – to converge on timing
at a lower cost.
From Point Tools to ESL Design Flows
Every designer stands poised to benefit
from the new standard set by Virtex-4 high-performance
FPGAs. The next-generation
challenge faced by mainstream FPGA EDA
tool vendors is to leverage point-tool
expertise and thus meld apparently contradictory
trends – higher levels of abstraction
on the one hand and greater dependence on
specific physical characteristics on the other
– into a coherent design methodology and
highly productive flow.
In keeping with these advances, EDA
tool companies will continue to extend and
improve their comprehensive, integrated
design flows spanning all levels of abstraction.
Mentor Graphics continues to be a
technology leader in this space. Designers
must take advantage of EDA tools that
now address both physical and electronic
system-level (ESL) challenges of high-end
FPGAs, and thus realize the unprecedented
potential of these devices as ASIC replacements
in new SoC designs.
To access the latest product news, application
notes, and case studies, evaluate new
design flows, or schedule a product demonstration,
visit www.mentor.com/fpga/.
Printable PDF version of this article with graphics. (1/15/05) 195 KB