The Mentor Graphics Seamless co-verification tool is as
applicable to large FPGA designs as it is to large ASICs.
As gate counts increase with each new generation
of FPGAs, it becomes easier to
implement large-scale systems using platform
FPGAs rather than ASICs. FPGA
design starts have grown along with the
technology’s capabilities, exceeding those
for ASICs by as much as 10 to 1 in 2002.
To address the challenges these large
FPGA-centric systems present, we must reexamine
system-on-chip (SoC) methodologies,
such as hardware/software
co-verification, initially developed for
ASIC-scale designs.
In this article, we’ll demonstrate the
Mentor Graphics® Seamless® hardware/
software co-verification technology
on a design targeting the Xilinx Virtex-II
Pro™ FPGA, with its embedded IBM™
PowerPC™ 405 CPU. This design boots up
the Nucleus® Plus real-time operating system
(RTOS), also from Mentor Graphics. As
we’ll see, significant performance improvements
are possible over conventional HDL
simulation when using the Seamless tool.
Co-Verification Offers Speed, High Performance
Co-verification is proven in the development
of embedded system ASICs.
Hardware/software co-verification tools
allow you to run software against a hardware
design to verify hardware/software
interfaces before building a physical prototype.
This gives you concurrent access
to your hardware design while it is in
development and reduces overall project
cycle time.
Increased visibility into your hardware
design, especially when encountering problems,
enables you to debug designs while
they’re exercised by actual verification software
running on the CPU.
From a performance perspective, the
behavioral processor model used in co-verification
executes faster in simulation than a
register transfer level (RTL) model
of the CPU.
Speed gains of as much as 1,000
times are possible by applying Seamless
optimizations that remove overhead, such
as explicit instruction and data fetch cycles
in hardware simulation. This performance
improvement enables you to verify large
FPGA-based systems with as many as 2 million gates – gates that could not otherwise
be verified within a practical time frame
using conventional HDL simulation.
With hardware/software co-verification,
you can also spot bugs that would otherwise
go unnoticed, simply because it would be
impractical to test for them. This becomes
more important as systems become larger,
and the portion of nodes that can be physically
probed from the perimeter of the
FPGA becomes smaller.
The majority of nodes are internal to the
design and observable only through simulation.
Even the nodes that can be probed become
accessible only after test boards have been
designed and produced, which adds to cycle time.
Seamless Co-Verification of an FPGA Design
In the following example, you can observe
software – whether high-level C or low-level
assembler – as well as hardware using
the Seamless co-verification tool.
System Specifications
Our example system is built to the scale of
typical embedded microcontrollers, such as
the IBM PowerPC 405GP, a.k.a. Galaxy. In
a Virtex-II Pro design for the Galaxy SoC,
we integrated standard IP components
around the IBM PowerPC 405 CPU,
either from the Xilinx Embedded Design
Kit (EDK) library or from third-party
providers such as the Mentor Graphics
Intellectual Property Division.
At this scale of design, we have:
- Three standard IBM CoreConnect
buses: processor local bus (PLB), on-chip
peripheral bus (OPB), and device
control registers (DCR) bus
- Multiple peripherals: Two GPIO, IIC,
Ethernet MAC, direct memory access
(DMA) controller, PCI bridge, and
memory controllers for Flash, SRAM,
and SDRAM
- Support functions such as an interrupt
controller and on-chip memory using
Xilinx block RAM.
As with typical reference boards, such as
the Walnut from IBM for the 405GP, a reference
board for the Virtex-II Pro FPGA
provides the external memory to support
the system.
The hardware design was assembled
using Xilinx embedded system tools, as
shown in Figure 1.
Once the FPGA hardware design is created,
the conversion for co-simulation is
quite straightforward. The IBM PowerPC
405 block in the hardware design is
replaced by the bus interface model (BIM)
component of the IBM PowerPC 405
model from the Seamless processor support
package (PSP). Software for the Nucleus
Plus RTOS is compiled and loaded to run
from Flash memory, which is external to
the FPGA as part of the test bench.
Hardware/software co-verification can now
be performed on this FPGA design, just as
it is for ASIC designs.
How Seamless Measures Up
Table 1 outlines the relative performance
results from executing the boot up of the
Nucleus Plus RTOS of our example system.
The baseline figure is a measure of
Seamless hardware/software co-verification
executing without any optimizations.
The hardware design was created in
VHDL so that it could use the Xilinx EDK behavioral model libraries for the EDK components;
Verilog® HDL can be used otherwise.
Other components not available from the
Xilinx libraries were found in the Mentor
Graphics Intellectual Property Division’s
Inventra™ catalog.
The Seamless processor model BIM for
the IBM PowerPC 405d5 is in VHDL.
The BIM is also available in Verilog HDL.
Because modern HDL simulators such as
the Mentor Graphics ModelSim® application
can perform mixed-language simulation,
that capability extends to Seamless
co-verification for hardware simulation.
The Nucleus Plus RTOS is bundled with
the Seamless application, as the software
runs separately in the embedded system. For
simplicity, it is run from Flash memory at a
high address. When using Seamless co-verification
on a design, you can assign “software
memory” to the Flash region. This
feature allows you to start software development
while the hardware is evolving.
In this case, development began prior
to the final setting of the memory controller
for the SRAM, ROM, and Flash
memory selected for the system.
Likewise, for SDRAM at low addresses in
the memory map, you can assign software
memory to that region prior to the final
settings of the SDRAM controller for the
chosen devices.
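The region assignment described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not the Seamless configuration format or API: the `Region` class, the `backing_for` helper, and the base addresses and sizes are all hypothetical, chosen so that Flash sits at the top of the address space and SDRAM at the bottom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One entry of a hypothetical memory map (not a Seamless data structure)."""
    name: str
    base: int
    size: int
    backing: str  # "software" = software memory, "hardware" = HDL memory model

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


# Hypothetical addresses and sizes, for illustration only.
MEMORY_MAP = [
    Region("sdram", 0x0000_0000, 32 * 1024 * 1024, "software"),  # low addresses
    Region("flash", 0xFFF0_0000, 1 * 1024 * 1024, "software"),   # high addresses
]


def backing_for(addr: int, memory_map=MEMORY_MAP) -> str:
    """Return which side would serve an access to addr in this toy model."""
    for region in memory_map:
        if region.contains(addr):
            return region.backing
    # Anything not assigned to software memory goes to the hardware simulation.
    return "hardware"
```

In this sketch, an access that falls in neither region (for example, a peripheral register) is left to the hardware simulation, while Flash and SDRAM accesses are served without a finished memory controller configuration.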
Our simulation time trials are run in
four Seamless modes, all using software
memory for Flash (instructions and data)
and a mix of scenarios for SDRAM (data)
as follows:
A) Verilog SDRAM hardware model
B) SDRAM memory region optimized
with Denali models
C) SDRAM memory region as software
memory
D) SDRAM memory region as software
memory, all memory-optimized for
time.
Optimization in Seamless transfers certain
bus operations from hardware to the
Seamless Coherent Memory Server software.
The software executes all of the
same instructions and data virtually, from
a copy of the memory contents. As a
result, bus cycle activity is not needed in
the hardware simulation.
Optimization can be activated for
instruction fetches only; for set instruction
regions (as in case B); or in time
(where the software execution on the
instruction set simulator [ISS] is asynchronous
to the hardware simulation).
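To make the effect of these options concrete, here is a minimal Python sketch that counts how many accesses in a short trace would still need a bus cycle in the hardware simulation. It is a toy model under stated assumptions, not Seamless code: the trace, the region names, and the `hardware_bus_cycles` function are invented for illustration, and the time-optimized mode is not modeled.

```python
def hardware_bus_cycles(trace, optimized_regions=frozenset(), optimize_fetches=False):
    """Count accesses that must still run as bus cycles in hardware simulation.

    trace is a list of (kind, region) pairs, where kind is "fetch",
    "read", or "write". Accesses to an optimized region, or instruction
    fetches when fetch optimization is on, are assumed to be served from
    the memory server's copy and generate no bus cycle.
    """
    cycles = 0
    for kind, region in trace:
        if region in optimized_regions:
            continue  # served from the memory copy, no bus activity
        if optimize_fetches and kind == "fetch":
            continue  # instruction fetch removed from hardware simulation
        cycles += 1
    return cycles


# Hypothetical access trace: code in Flash, data in SDRAM, one GPIO read.
TRACE = [
    ("fetch", "flash"),
    ("read", "sdram"),
    ("fetch", "flash"),
    ("write", "sdram"),
    ("read", "gpio"),
]

unoptimized = hardware_bus_cycles(TRACE)
fetch_only = hardware_bus_cycles(TRACE, optimize_fetches=True)
regions = hardware_bus_cycles(TRACE, optimized_regions={"flash", "sdram"})
```

In this toy trace only the GPIO read still reaches the hardware simulator once both memory regions are optimized, which is the kind of access co-verification is meant to exercise.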
The preliminary performance findings for
our Xilinx platform FPGA design, executing
a 1.6 µs segment of the RTOS boot,
are shown in Table 1. In further studies,
we will port more of the RTOS software
to this design. These results will be presented
in future articles.
Table 1 – Preliminary time trial results for Seamless co-verification
of a Xilinx Virtex-II Pro reference design.
| Mode | Elapsed Wall Clock Time |
| A) Verilog SDRAM hardware model | 35 minutes |
| B) SDRAM memory region optimized, Denali models | 30 minutes |
| C) SDRAM memory region as software memory | 30 minutes |
| D) SDRAM region as software memory, optimized for time | 15 seconds |
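The relative gains implied by Table 1 follow from simple arithmetic on the elapsed times. The short Python sketch below works them out; the dictionary keys A through D follow the order of the four modes listed earlier, and the `speedup` helper is just an illustrative name.

```python
# Elapsed wall clock times from Table 1, converted to seconds.
TRIALS_SECONDS = {
    "A": 35 * 60,  # Verilog SDRAM hardware model
    "B": 30 * 60,  # SDRAM region optimized, Denali models
    "C": 30 * 60,  # SDRAM region as software memory
    "D": 15,       # SDRAM as software memory, optimized for time
}


def speedup(baseline: str, mode: str, trials=TRIALS_SECONDS) -> float:
    """Ratio of baseline elapsed time to the elapsed time of mode."""
    return trials[baseline] / trials[mode]


a_to_d = speedup("A", "D")  # 2100 s / 15 s
c_to_d = speedup("C", "D")  # 1800 s / 15 s
a_to_b = speedup("A", "B")  # 2100 s / 1800 s
```

On these figures, mode D runs 140 times faster than mode A and 120 times faster than modes B and C, while moving from A to B or C alone gains only about 17 percent.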
For case A, we had to establish the proper
settings of the SDRAM controller for a
particular memory device, as well as
include sufficient time to initialize both.
Were it not for a wait loop (to ensure
enough time for SDRAM initialization)
and a relatively low level of memory access
to the Verilog behavioral model, the
elapsed time would have been greater.
Before establishing the controller settings
and initialization timing, Seamless
allows you to start software development
by either optimizing the address space of,
for example, a Denali model for the hardware
design memory, or by setting that
region as software memory.
The results of cases B and C are the same
because Seamless handles the memory region the same way
in either case. If the software is executing in a
CPU-bound manner, then time optimization
can yield further performance gains through
faster code development, as in D. We can
decouple the ISS from the hardware simulation
clock when no bus activity is scheduled.
Faster software execution and debug can be
carried out between hardware activities.
Optimization can be turned on and off during
the co-simulation session, so breakpoints
can be set in the software before executing
any hardware-dependent code. This allows
system debug of simultaneous hardware and
software execution, the key benefit of using
Seamless in platform FPGA design.
Conclusion
We have demonstrated that hardware/software
co-verification, originally developed
for ASIC design, also addresses the challenges
of today’s largest FPGA designs.
Seamless co-verification from Mentor
Graphics meets these challenges by providing
significant performance improvements
over conventional HDL simulation. The
Seamless solution continues to be enhanced
to provide even greater versatility.
In addition to the performance optimization
features discussed here, technological
enhancements include:
- Platform-based design tools that speed
system design and verification
- A C-based design that raises the level of
verification abstraction
- An embedded system optimization
tool that ensures performance goals
and maximizes design efficiency.
These advances are made even more useful
to the FPGA engineer by the growing
library of Seamless PSPs, developed in
cooperation with leading FPGA vendors
such as Xilinx.
To learn more about the Seamless
co-verification solution, including information
on Mentor Graphics’ extensive library
of PSPs, please visit www.mentor.com/soc/verification/.