Home : Publications : Xcell Journal Online : Article

Software-Compiled System Design Optimizes Xilinx Programmable Systems
   
     
   
   
   
 
by Chris Sullivan, Director of Strategic Alliances, Celoxica, Ltd.
chris.sullivan@celoxica.com

Milan Saini, Technical Marketing Manager, Xilinx, Inc.
milan.saini@xilinx.com (05/02/03)

We cannot solve our problems with the same thinking we used when we created them.
– Albert Einstein

Celoxica and Xilinx have partnered to fine-tune software-compiled system design for the benefit of designers using Virtex-II Pro and MicroBlaze programmable systems.

Okay, so none of us will claim to be Einstein, but programmable system design might be the next development challenge we face.

With the advent of the Xilinx Virtex-II Pro™ Platform FPGA and the MicroBlaze™ soft processor, we now have access to unrivalled levels of product flexibility and performance. This technology – for the first time – enables us to partition and re-partition systems between hardware and software at any time during the development cycle. We can even repartition the hardware and software mix after the product has gone to market. This complete reprogrammability means the system can be optimized over and over again.

But for design teams looking to use this powerful technology, system design can be a challenge. Exploring whether a design is even feasible can cost many months of time, effort, and expense, let alone finding the optimal design. And if system verification does not begin right from the start, projects can miss their specifications or take too long to get to market – reducing your ability to compete, your business credibility, and your all-important bottom line. It’s a design headache you don’t need.

Meeting the Design Challenge

A remedy for this headache is the synergy of software-compiled system design and programmable systems technology. Software-compiled system design is a methodology that provides a seamless bridge between hardware and software (Figure 1). It enables the system partition, verification, and hardware/software co-design to be driven directly from the system specification. Software-compiled system design was specifically developed for programmable systems and provides an efficient, cost-effective, and quality driven co-design solution.

Software-Compiled System Design

At the heart of software-compiled system design is the principle of providing a direct link between the programmable platform and the originally defined system functionality. This provides the necessary system verification flow and offers confidence to the designer who wants to fully explore the design space – pushing the envelope of design innovation and product differentiation.

The methodology also deploys novel technology that permits more informed and flexible hardware/software partitioning, enabling tradeoff analysis, partitioning, and repartitioning at any stage in the design. And by using high-level languages (HLLs) for both hardware and software design, the methodology allows the system specification to be written in a form that both teams can use immediately, without costly and time-consuming rewrites.

So Prove It

To demonstrate the capabilities of software-compiled system design, Celoxica and Xilinx partnered to undertake the co-design of a JPEG 2000 codec (compressor/decompressor) for implementation in a Virtex-II Pro FPGA. In particular, we wanted to address the design challenge of system partitioning, co-verification, and the easy integration of hardware and software.

JPEG 2000 is a standards-based image coding system that uses state-of-the-art compression techniques based on wavelet technology (Figure 2). Its architecture lends itself to a range of uses from consumer electronics, such as digital cameras, to medical imaging, remote sensing, surveillance systems, and scanners.

Project Specifications

  • Maximize overall system performance
  • Innovate and differentiate from the competition
  • Demonstrate an efficient and effective co-design environment
  • Use software specification as a starting point for system design
  • Improve communication between hardware and software design teams
  • Simplify partitioning and migration of code between software and hardware for better overall Quality of Design (QoD)
  • Demonstrate a complete system verification flow
  • Deliver competitive Quality of Results (QoR)
  • Exceed time-to-market expectations
  • Support design re-use strategies
  • Maximize current EDA and IP investments

Project Plan

  • Phase 1: Profile and verify
  • Phase 2: Partition and verify
  • Phase 3: Design and verify
  • Phase 4: Implement and verify

Tools

  • Celoxica Nexus PDK
  • Celoxica DK Design Suite
  • Wind River Xilinx Edition toolset
  • Xilinx ISE 5.1i

Phase 1: Profile and Verify

A multitude of applications can benefit from hardware acceleration and product flexibility. To demonstrate this, we selected JPEG 2000; our starting point for the design was the C specification code.

To drive our system verification flow, we ran the specification code through an appropriate target – in this instance the IBM PowerPC™ 405 GP. From this we simulated and verified the functionality of our system and established a test bench that remained constant and consistent throughout the design.

We then began code profiling to establish where our program spent its time and to determine which functions called other functions during execution. Profiling proved useful because it quickly identified the functions that were processor hungry or compute intensive, which made them possible candidates for offload into the FPGA fabric. However, profiling does not analyze the dataflow between hardware and software, nor the burst length or frequency, so designer intervention is needed to understand the dynamics of the hardware/software interaction.

Using Wind River’s WindView™ visualization and diagnostic tool, we determined that the DWT (discrete wavelet transform) and Tier 1 encoder were the processor-intensive functions, consuming 87% of processing time (Figure 3). We selected them for further scrutiny and tradeoff analysis.
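To put the 87% figure in perspective, Amdahl's law gives a rough bound on what offloading those functions can buy. The sketch below is illustrative only: the acceleration factors are hypothetical, the function names are ours, and the estimate deliberately ignores hardware/software communication cost, which (as noted above) profiling does not capture.

```python
def amdahl_speedup(offload_fraction: float, accel_factor: float) -> float:
    """Overall speedup when a fraction of run time is accelerated by accel_factor."""
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    if accel_factor <= 0.0:
        raise ValueError("accel_factor must be positive")
    # Time left after acceleration, as a fraction of the original run time.
    remaining = (1.0 - offload_fraction) + offload_fraction / accel_factor
    return 1.0 / remaining


def speedup_ceiling(offload_fraction: float) -> float:
    """Limit of the overall speedup as the offloaded part approaches zero time."""
    if not 0.0 <= offload_fraction < 1.0:
        raise ValueError("offload_fraction must be in [0, 1)")
    return 1.0 / (1.0 - offload_fraction)


# Share of processing time reported for the DWT and Tier 1 encoder.
PROFILED_FRACTION = 0.87

if __name__ == "__main__":
    for factor in (2.0, 10.0, 100.0):  # hypothetical acceleration factors
        overall = amdahl_speedup(PROFILED_FRACTION, factor)
        print(f"{factor:>5.0f}x on the offloaded code -> {overall:.2f}x overall")
    print(f"ceiling: {speedup_ceiling(PROFILED_FRACTION):.2f}x")
```

With 87% of the time offloadable, the ceiling is about 7.7x overall however fast the hardware is, because the remaining 13% still runs in software. Any cost of moving data across the boundary lowers the achievable figure further.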

Phase 2: Partition and Verify

Validating the system partition against the requirements of the design specification is cardinal in programmable system design. Typically the system designer maps a system-level architecture into specific hardware and software components, making direct reference to the project specification and factors such as component availability, cost, and technical feasibility.

The consequences of the system partition cascade through the design flow to physical implementation, and final system performance depends heavily on partitioning decisions. It makes little sense to invest time, money, and effort optimizing and refining an incorrect partition – it is inherently sub-optimal.

Uniquely, software-compiled system design provides the designer with a flexible partitioning capability that permits partition and repartition at any stage in the design process. Moreover, it is linked to a verification flow that enables the designer to confidently explore and innovate in the design space, analyzing hardware/software tradeoffs, and identifying the optimal system partition for the best QoD.

Facilitating this is the data streaming manager (DSM), a portable co-design API (Figure 4) supplied with Celoxica’s DK Design Suite. Developed specifically for programmable system partitioning and hardware/software integration, the DSM allows the designer to iteratively explore, test, and verify multiple partition alternatives. The designer can quickly create and easily move ports that are used to send data between the software and hardware by using the API standard (Figures 5 and 6). As each option is explored, the designer can verify the partitioning with the software used as a test bench throughout the project.

In our project, the DSM validated the profiling information determined in Phase 1 of our design flow. It helped analyze the data flow and the burst length and frequency between our hardware and software, and fine-tuned the partition to meet the project’s criteria. Moreover, the DSM’s inherent portability meant the design could be repartitioned at any stage in the design flow, redefining the system architecture and easily accommodating late specification changes.
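The idea of movable ports can be pictured with a small model. This is not the DSM API, whose calls are not shown in this article: `Port`, `run_pipeline`, and the placeholder stage functions are hypothetical names invented for illustration. The sketch only shows the principle that the same stages can be reassigned between domains while the software-only run serves as the reference, and that the traffic across the boundary can be recorded per partition.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Port:
    """Hypothetical one-way channel between domains; records each burst length."""
    name: str
    bursts: List[int] = field(default_factory=list)

    def send(self, data: List[int]) -> List[int]:
        self.bursts.append(len(data))
        return list(data)


Stage = Tuple[str, Callable[[List[int]], List[int]]]


def run_pipeline(stages: List[Stage], partition: Dict[str, str],
                 data: List[int]) -> Tuple[List[int], Dict[str, Port]]:
    """Run stages in order, crossing a port whenever the domain changes."""
    ports: Dict[str, Port] = {}
    domain = "sw"  # data starts and ends in software
    for name, fn in stages:
        target = partition[name]
        if target != domain:
            key = f"{domain}->{target}"
            data = ports.setdefault(key, Port(key)).send(data)
            domain = target
        data = fn(data)
    if domain != "sw":
        key = f"{domain}->sw"
        data = ports.setdefault(key, Port(key)).send(data)
    return data, ports


# Placeholder arithmetic standing in for the real codec stages.
STAGES: List[Stage] = [
    ("preprocess", lambda d: [x - 128 for x in d]),
    ("dwt", lambda d: [x * 2 for x in d]),
    ("tier1", lambda d: d[::2]),
    ("tier2", lambda d: [sum(d)]),
]

ALL_SOFTWARE = {name: "sw" for name, _ in STAGES}
CANDIDATE = {**ALL_SOFTWARE, "dwt": "hw", "tier1": "hw"}

if __name__ == "__main__":
    samples = list(range(8))
    reference, _ = run_pipeline(STAGES, ALL_SOFTWARE, samples)
    result, ports = run_pipeline(STAGES, CANDIDATE, samples)
    print("matches reference:", result == reference)
    for port in ports.values():
        print(port.name, "bursts:", port.bursts)
```

Moving a stage is a one-line change to the partition table, and the recorded burst lengths give the kind of dataflow information that profiling alone does not provide.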

Phase 3: Design and Verify

With the optimal partition determined and verified, we began the design optimization phase of our project. Software-compiled system design makes use of HLLs for both hardware and software design. HLLs allow the system specification to be written in a form that both the hardware and software teams can immediately use – without costly and time-consuming rewrites. Additionally, HLLs simplify the migration of code between hardware and software. Because there is a common language base and common level of abstraction between the hardware and software, there is improved communication and shared understanding between the development teams.

As we did in our partitioning phase, we used the DSM throughout the design optimization phase. The DSM provided a functionally accurate simulation environment that allowed our hardware and software to interact – keeping them connected throughout design optimization (Figure 7).

The software was run as a native executable on the PPC 405 GP, and the hardware was run using the simulation and debugging capability of Celoxica’s DK Design Suite. We used a utility program to monitor the data passing between the applications to assist with debugging. Because all of the API functions were provided, complete system development could begin without the development platform being available. Once the design was working, the application was easily transferred to the target platform for final testing. Co-simulation between hardware and software was made possible by connecting DK with the Tornado™ environment from Wind River (Figure 8).

As our system specification was described in ANSI-C, we progressed our design in ANSI-C and used hardware language extensions defined in Handel-C to describe our hardware. These hardware extensions enable, for example, efficient control over area, timing, clocks, RAMs, ROMs, and interfaces.

We optimized the software by combining multiple DSM calls, and we applied hardware optimization techniques such as increasing parallelism, replacing for() loops with while() loops, pipelining, and syntax duplication.

Specification Change

At this stage in the design, a specification change was introduced. A novel lifting algorithm was developed that performs a two-dimensional DWT and thus provides faster processing time. The algorithm was readily available as an HDL IP block, and we decided, with respect to design time and maximizing IP investment, to integrate the IP into the design as a black box. The integration was simplified by using the interface declaration available in Handel-C for connecting third-party IP into a software-compiled system design flow (Figure 9).
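For readers unfamiliar with lifting, the sketch below shows a one-dimensional pass of the reversible 5/3 wavelet filter used by JPEG 2000, written as a predict step followed by an update step. It is not the novel two-dimensional algorithm mentioned above, which is not described in this article. The function names are ours, and the code assumes an even-length integer signal with symmetric extension at the edges.

```python
from typing import List, Tuple


def forward_53(x: List[int]) -> Tuple[List[int], List[int]]:
    """One level of the reversible 5/3 lifting transform (even-length input)."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("input length must be even and at least 2")
    even, odd = x[0::2], x[1::2]
    last = len(even) - 1
    # Predict: each odd sample minus the floored mean of its even neighbours.
    high = [odd[n] - ((even[n] + even[min(n + 1, last)]) >> 1)
            for n in range(len(odd))]
    # Update: each even sample plus a rounded quarter of the adjacent details.
    low = [even[n] + ((high[max(n - 1, 0)] + high[n] + 2) >> 2)
           for n in range(len(even))]
    return low, high


def inverse_53(low: List[int], high: List[int]) -> List[int]:
    """Undo forward_53 by running the lifting steps in reverse order."""
    if len(low) != len(high) or not low:
        raise ValueError("subbands must be non-empty and of equal length")
    last = len(low) - 1
    even = [low[n] - ((high[max(n - 1, 0)] + high[n] + 2) >> 2)
            for n in range(len(low))]
    odd = [high[n] + ((even[n] + even[min(n + 1, last)]) >> 1)
           for n in range(len(high))]
    out: List[int] = []
    for e, o in zip(even, odd):
        out.extend((e, o))
    return out


if __name__ == "__main__":
    signal = [10, 12, 14, 16, 18, 20, 22, 24]
    low, high = forward_53(signal)
    print("low :", low)
    print("high:", high)
    print("round trip ok:", inverse_53(low, high) == signal)
```

Each step uses only additions and shifts on integers and can be undone exactly, which is part of why lifting formulations are attractive for hardware implementation.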

Phase 4: Implement and Verify

Implementation to the target platform was simplified by using the platform abstraction layer (PAL). The PAL shields designers from low-level hardware interfaces, easing the integration of FPGAs with physical resources. This is done by developing a library of low-level interfaces to specific platform resources, such as I/O or memory. This library, called the platform support library (PSL), is then accessed by the hardware application on the FPGA using a simple and consistent application programming interface – the PAL API (Figure 10).
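The layering can be pictured as an application written against one small interface, with a separate implementation per platform behind it. The sketch below is a conceptual model only and does not reproduce the PAL API or any PSL: `PlatformRam`, `WordAddressedRam`, `BankedRam`, and `store_and_sum` are hypothetical names, and the two memory organizations are invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class PlatformRam(ABC):
    """Hypothetical uniform interface the application codes against."""

    @abstractmethod
    def write(self, address: int, value: int) -> None: ...

    @abstractmethod
    def read(self, address: int) -> int: ...


class WordAddressedRam(PlatformRam):
    """Stand-in platform library for a board with one flat memory."""

    def __init__(self, words: int) -> None:
        self._cells = [0] * words

    def write(self, address: int, value: int) -> None:
        self._cells[address] = value

    def read(self, address: int) -> int:
        return self._cells[address]


class BankedRam(PlatformRam):
    """Stand-in platform library for a board with two interleaved banks."""

    def __init__(self, words_per_bank: int) -> None:
        self._banks = [[0] * words_per_bank, [0] * words_per_bank]

    def write(self, address: int, value: int) -> None:
        # Low address bit selects the bank; remaining bits select the word.
        self._banks[address & 1][address >> 1] = value

    def read(self, address: int) -> int:
        return self._banks[address & 1][address >> 1]


def store_and_sum(ram: PlatformRam, values: List[int]) -> int:
    """Application code: unchanged whichever platform library is supplied."""
    for address, value in enumerate(values):
        ram.write(address, value)
    return sum(ram.read(address) for address in range(len(values)))


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(store_and_sum(WordAddressedRam(4), data))
    print(store_and_sum(BankedRam(2), data))
```

Retargeting the application then means supplying a different platform library rather than editing the application itself.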

The target platform was a Wind River SBC405GP (single board computer reference design) with a Proteus FPGA daughter card, effectively a Virtex-II Pro prototyping platform (Figure 11). This development environment supported timing simulation, emulation, and block optimization, and it was used prior to final implementation in a Virtex-II Pro ML300 Evaluation Platform (Figure 12).

Object code was compiled for the PPC405GP under Wind River’s VxWorks™ RTOS, with the hardware implementation using the direct EDIF output generated by Celoxica’s DK Design Suite. This EDIF netlist was optimized for the Virtex-II Pro Platform FPGA (Figure 13), ensuring maximum efficiency for best QoR. Optionally, the DK Design Suite can also output RTL VHDL or Verilog, pre-optimized for traditional synthesis tool flows (Figure 14).

Results

The results (Tables 1 and 2) from the Celoxica DK Design Suite were compared to the handcrafted VHDL authored by a JPEG 2000 domain expert. Handel-C’s systematic and ANSI-C-like approach to the problem led to substantial savings in design time. An expert Handel-C engineer with no prior knowledge of the JPEG 2000 standard was able to get the algorithm to a working hardware implementation in less than half the time it took to code the VHDL.

Table 1

Table 2

The other key success was that the design easily met the system timing constraints. The results provide clear validation that design abstraction leads to increased designer productivity without necessarily compromising performance or area.

Conclusion

Software-compiled system design is a proven methodology for programmable system co-design. It provides a solution for system partitioning, co-verification, and hardware/software integration across a spectrum of design styles and applications. For use by all members of your design team, from the system architect to the verification engineer, software-compiled system design enables code sharing between hardware, firmware, and software designers from system specification through to implementation.

The Celoxica DK Design Suite is a fully featured development toolset that enables a complete implementation of a software-compiled system design. It is interoperable with popular third-party tools and languages and provides fast co-simulation between C/C++, Handel-C, HDLs, instruction set simulators (ISSs), and modeling languages such as Open SystemC™ and MATLAB™.

Software-compiled system design methodology offers you compelling benefits, and it is an efficient and effective design strategy for Xilinx programmable systems.
