Xcell Journal Online

The scc-32/scc-16 Microsequencers and AHBDBG System Debugger



by Aki Niimura, IP Supplier, Ponderosa Design
info@ponderosa-design.com (7/15/05)


Ponderosa Design’s new microsequencers and supporting tools make developing scalable microsequencer-based designs more accessible.


As larger Xilinx FPGAs become affordable (thanks to advanced process technologies), FPGA designers are now asked to create systems with more complex functionalities. By implementing these functionalities in software, FPGA designers can achieve their goals quickly and make their designs more maintainable and reusable.

However, the available resources in FPGAs are finite. Thus, the demand for easy-to-use, resource-efficient compact processors is always strong. Ponderosa Design microsequencers were developed with such demands in mind. In this article, we’ll present our new microsequencer products and new system debugging tools, which enable microsequencers to be used in a wider range of applications.

The scc-32
When we developed our first-generation microsequencers, the main available resources in FPGAs were 512-byte block memories with a maximum 16-bit wide data interface (and no multipliers). Now, Xilinx® Spartan™-3 and Virtex™-4 devices provide quite a different landscape for FPGA designers. Two kilobyte block memories with a maximum 36-bit-wide data interface and 18 x 18 bit multipliers have become common resources that you can expect even for cost-sensitive FPGA projects.

We developed our newest microsequencer, the scc-32, to fully utilize such resources, as well as to provide 32-bit data handling capability. As the name implies, the scc-32 is a 32-bit controller – but how does it compare to the MicroBlaze™ softcore processor? The scc-32 is not designed as a generic microcontroller like the MicroBlaze or PowerPC™ processors. Instead, our microsequencers are designed to take on different roles and to work alongside generic microcontrollers.

Our microsequencers employ a stack architecture, while today’s generic microcontrollers employ a register-based architecture. Stack architecture is suitable for custom processors for FPGAs because the core is compact and resource-efficient (an effective use of block RAM). The program size is smaller because the instruction is one byte long. Stack architecture doesn’t use deep pipelining, resulting in a predictable interrupt latency.
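As a rough illustration of why a stack architecture keeps programs small, the following Python sketch models a tiny stack machine with one-byte opcodes. The opcode names and values (PUSH8, ADD, MUL, HALT) are invented for this example and are not the scc-32 instruction set; only the general idea of implicit operands on a stack is taken from the text above.

```python
# Hypothetical one-byte opcodes for illustration; NOT the real scc-32 encoding.
PUSH8, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF
MASK32 = 0xFFFFFFFF  # results are wrapped as a 32-bit ALU would wrap them


def run(program: bytes) -> list:
    """Execute a byte-coded program and return the final data stack."""
    stack, pc = [], 0
    while True:
        op = program[pc]
        pc += 1
        if op == PUSH8:
            # The opcode byte is followed by a single immediate byte.
            stack.append(program[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK32)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & MASK32)
        elif op == HALT:
            return stack
        else:
            raise ValueError(f"unknown opcode {op:#04x} at address {pc - 1}")


# (2 + 3) * 4: operands are implicit, so no register fields are encoded.
program = bytes([PUSH8, 2, PUSH8, 3, ADD, PUSH8, 4, MUL, HALT])
result = run(program)
```

In this sketch the whole expression takes nine bytes of program memory, because ADD and MUL need no operand fields.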

Contrary to common perceptions, supporting 32-bit data types is not difficult, nor does it consume a lot of resources in the FPGA. Our microsequencers use stack architecture, where the data size is irrelevant to each stack operation. When it comes to FPGA resources, all 32-bit data is stored in block RAM rather than registers. However, the arithmetic logic unit (ALU) must be a 32-bit ALU.

The scc-32 uses unified memory architecture (UMA). It needs three logically independent memories (data stack, program stack, and register file) in addition to a program memory. With UMA, these three logically independent memories are unified into a single memory. One block RAM (36-bit wide, 512-word deep) can hold a 32-level-deep data stack, a 16-level-deep program stack, 144 global registers, and 24 auto registers per function call.

The scc-32 has a 16-bit program space, 64 KB, while our previous generation has an 11- or 13-bit program space. A larger program space means that a large amount of data (such as coefficient tables or message strings) can be included in a program. As our microsequencers have instructions to read from/write to program memory, program memory space can be used as an extra storage or data space to share with another process.
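The program-space figures above follow directly from the address width. The short Python sketch below works through the arithmetic, assuming a byte-addressed program memory (which matches the 16-bit, 64 KB figure given in the text); the function name is made up for this example.

```python
def program_space_bytes(address_bits: int) -> int:
    """Size of a byte-addressed program space with the given address width."""
    return 1 << address_bits


# 11- and 13-bit spaces (previous generation) versus the 16-bit scc-32 space.
sizes_kb = {bits: program_space_bytes(bits) // 1024 for bits in (11, 13, 16)}
```

Under that assumption, the 11-, 13-, and 16-bit program spaces come to 2 KB, 8 KB, and 64 KB respectively.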

One important architectural change we observed from older Xilinx FPGAs to newer Xilinx FPGA families is the absence of internal tri-state buffer (TBUF) resources. Our older microsequencers utilize TBUFs to construct multiplexers with a large number of inputs. As they are no longer available in newer Xilinx FPGA families, the scc-32 is designed without internal TBUFs and optimized to minimize the complexity of such data multiplexers.

The scc-16
Our first-generation microsequencers were 16-bit controllers. They had a much tighter programming model; for example, the scc-IIs, one of our first microsequencers, had a 2 KB program size limit, five-level function call, eight global registers, and eight auto registers per function call. Not all applications require 32-bit data types, but you may want to use a newer programming model like the scc-32.

The scc-16 microsequencer uses an identical architecture and instruction set to the scc-32 (some instructions are dropped because there is no need to support 32-bit data types). The scc-16 provides a smaller footprint, which is close to the one that the first-generation microsequencer delivered (see Table 1). For example, even with the smallest Spartan-3 FPGA (the XC3S50), the scc-16 “hello” project only consumes 53% of the device, leaving ample space for other logic to be implemented.

Supporting a 32-Bit System Bus
A common practice is to use a system bus to create a larger system using bus-compliant modules. We originally designed our microsequencers for stand-alone use, but in some situations, you may want to connect a microsequencer to a system bus. For example, a microcontroller can download a program to a microsequencer on the fly so that the block can be used as a reconfigurable functional block. A microsequencer can also share a resource with other controllers.

The Advanced Microcontroller Bus Architecture (AMBA) high-performance bus (AHB bus) from ARM Ltd. is a system bus similar to the CoreConnect used with MicroBlaze and PowerPC processors. We use a 32-bit AHB bus, which results in a 32-bit address space (4 GB). To interface to the AHB bus, we developed two wrapper modules, ahb32wrap (AHB master) and scc32ahb/scc16ahb (AHB slave). For the AHB master, macros are provided to access the 32-bit address space, as the scc-32 native instructions cannot access the 32-bit address space directly.

The following code excerpts demonstrate how accessing the AHB bus is coded in the “SC” program. The SC language is a proprietary high-level language specifically for the SCC-II microsequencer family.

ahbwrite(DMAC_CTRL, 0x00000101);
ahb_status = ahbread(DMAC_STAT);

With the wrappers, the microsequencer’s internal signals as well as the entire program memory are exposed to the AHB bus. To facilitate the development of a system with the AHB bus, we developed a new debugging tool in addition to our original stand-alone JTAG debugger.

The AHBDBG
The AHBDBG is a debugging tool for systems with the AHB bus (Figure 1). It provides a wide range of features to debug an AHB bus-based system, while our original JTAG debugger only provides those features necessary to try a program with an FPGA on the board.

The AHBDBG communicates with a small AHB master – the jtag2ahb module – through a JTAG interface to generate AHB bus accesses. In addition to obvious features such as bus read/write/dump, two important features are worth mentioning. The first feature is the AHB-based logic analyzer (Figure 2). You may think this is yet another ChipScope™ analyzer, but the logic analyzer available with the AHBDBG is quite different. First, it must be explicitly instantiated. Because of this, you can provide your own trigger signal if you need a complex trigger condition. Second, it saves signals only when they change (recording events). This is necessary for capturing bus activities, which may last for many cycles. The tool also allows you to compress the trace further by sacrificing timing accuracy.

With the optional compression mode turned on, if the interval between event A and event B exceeds the range of the 16-bit cycle counter, the interval is truncated to the maximum value of the counter. With compression turned off, a null event (an event containing a time stamp but no value change information) is generated instead to preserve timing accuracy.
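The event-recording scheme can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not the AHBDBG implementation: events are represented here as (delta_cycles, value) tuples, a value of None stands for a null event, and the names record, COUNTER_MAX, and compress are invented for this example.

```python
COUNTER_MAX = 0xFFFF  # largest interval a 16-bit cycle counter can express


def record(samples, compress=False):
    """Convert (cycle, value) samples into (delta_cycles, value) events.

    Only value changes are stored. A value of None marks a null event:
    a time stamp that carries no value change.
    """
    events = []
    last_cycle, last_value = 0, None
    for cycle, value in samples:
        if value == last_value:
            continue  # signal unchanged, so nothing is stored
        gap = cycle - last_cycle
        if compress:
            # Compression on: clamp the interval and lose timing accuracy.
            gap = min(gap, COUNTER_MAX)
        else:
            # Compression off: bridge long intervals with null events.
            while gap > COUNTER_MAX:
                events.append((COUNTER_MAX, None))
                gap -= COUNTER_MAX
        events.append((gap, value))
        last_cycle, last_value = cycle, value
    return events


samples = [(0, 0x1), (5, 0x1), (10, 0x3), (200_000, 0x2)]
exact = record(samples)
compressed = record(samples, compress=True)
```

With compression off, the deltas in the trace still add up to the true cycle count; with compression on, the trace is shorter but the long idle interval is recorded as only 65,535 cycles.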

There are four types of AHB-based logic analyzers: 16 bits wide, 32 bits wide, 64 bits wide, and 128 bits wide. You can configure the depth of the logic analyzer trace memory. A 32-bit-wide logic analyzer is sufficient for microsequencer debug, whereas a 64-bit-wide logic analyzer is probably sufficient for AHB bus monitoring (Table 2). The captured events are uploaded to the host side; the AHBDBG produces a VCD dump file so that you can use a simulation waveform viewer to see the waveform. The signals can be bundled to a set of wires and buses with meaningful signal names using a helper tool, vcdwizard.
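To show what a VCD dump of captured events might look like, here is a small Python sketch that writes a single-bus trace in the standard Value Change Dump text format. The event representation ((delta_cycles, value) tuples, with None for a null event), the scope and signal names, and the timescale are all assumptions made for this example; the actual files produced by the AHBDBG and vcdwizard may be organized differently.

```python
def to_vcd(events, width=32, name="trace", timescale="1 ns"):
    """Render (delta_cycles, value) events as VCD text for one vector signal."""
    lines = [
        f"$timescale {timescale} $end",
        "$scope module logic_analyzer $end",
        f"$var wire {width} ! {name} $end",  # "!" is the VCD identifier code
        "$upscope $end",
        "$enddefinitions $end",
    ]
    time = 0
    for delta, value in events:
        time += delta
        if value is None:
            continue  # a null event only advances time
        lines.append(f"#{time}")
        lines.append(f"b{value:b} !")  # vector value change: b<binary> <id>
    return "\n".join(lines) + "\n"


vcd_text = to_vcd([(0, 0x1), (10, 0x3), (65535, None), (5, 0x2)], width=8)
```

The resulting text can be saved to a file with a .vcd extension and opened in a waveform viewer.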

The second unique feature is the remote access feature. The AHBDBG can be a gateway to your hardware system. When the remote access feature is turned on, the AHBDBG listens on a specified TCP port and translates each network message into an AHB bus access. This feature allows you to exercise the FPGA hardware using programs written in languages such as Python, Perl, or C++.
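The following Python sketch illustrates the gateway idea only. The article does not describe the AHBDBG message format, so the line-based protocol here ("W addr data" and "R addr" in hexadecimal), the gateway stand-in, the RemoteBus class, and the DMAC_CTRL address are all hypothetical. A socket pair and a dictionary replace the real TCP connection and the real AHB bus so that the example runs on its own.

```python
import socket
import threading

DMAC_CTRL = 0x40000000  # hypothetical register address


def gateway(conn, bus):
    """Stand-in for the debugger side: turn text requests into bus accesses."""
    reader, writer = conn.makefile("r"), conn.makefile("w")
    for line in reader:
        fields = line.split()
        if len(fields) == 3 and fields[0] == "W":
            bus[int(fields[1], 16)] = int(fields[2], 16)
            reply = "OK"
        elif len(fields) == 2 and fields[0] == "R":
            reply = f"{bus.get(int(fields[1], 16), 0):08x}"
        else:
            reply = "ERR"
        writer.write(reply + "\n")
        writer.flush()


class RemoteBus:
    """Client side: one request line out, one reply line back."""

    def __init__(self, sock):
        self._reader, self._writer = sock.makefile("r"), sock.makefile("w")

    def _request(self, text):
        self._writer.write(text + "\n")
        self._writer.flush()
        return self._reader.readline().strip()

    def write(self, addr, data):
        if self._request(f"W {addr:08x} {data:08x}") != "OK":
            raise IOError("write rejected")

    def read(self, addr):
        return int(self._request(f"R {addr:08x}"), 16)


# A socket pair stands in for the TCP connection to the debugger.
client_sock, server_sock = socket.socketpair()
bus_model = {}  # dictionary standing in for the AHB address space
threading.Thread(target=gateway, args=(server_sock, bus_model), daemon=True).start()

remote = RemoteBus(client_sock)
remote.write(DMAC_CTRL, 0x00000101)
status = remote.read(DMAC_CTRL)
```

Against real hardware, the socket pair would be replaced by a TCP connection (for example, socket.create_connection) to the host and port on which the debugger is listening, and the message format would have to match what the tool actually expects.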

We provided this feature because our users asked us to include specific features for their projects, and the remote access feature provides a generic way to provide project-specific features. We later found that the remote access feature could be used in other situations beyond its original intent. For example, you could do system prototyping using a remote program before developing a real embedded program.

The AHBDBG has a layered software design; the bottom layer talks to a physical device. Currently, it supports Parallel Cable III through a printer port or Ethernet Pod (proprietary to Ponderosa Design). However, it can also support any device that can mimic a printer port interface.

In addition to the devices mentioned previously, a virtual device “sim” is also supported. As the name implies, the “sim” device is a virtual device for Verilog simulation, which the AHBDBG can “control.” Any commands given to the AHBDBG are ultimately converted to AHB accesses in a simulation. We found this scheme very helpful, as it provides an intuitive way to run Verilog simulation. The current scheme uses Unix IPC (Inter-Process Communications) and Verilog PLI, so it is not a universal solution for everyone. However, many simulators provide a special hook so that controlling Verilog simulation from the AHBDBG is possible even where Unix IPC is not available.

The AHBWIZARD
When constructing a larger system with the AHB bus, we realized that creating a top-level file that holds all AHB submodules and arranging bus multiplexers is tedious, time-consuming, and error-prone. Moreover, you need to constantly maintain the file as AHB modules are added to or removed from the system.

We developed the AHBWIZARD to automate the process such that you can rearrange your AHB system without rewriting the top-level file. For example, you may want to have a logic analyzer module to debug the system, but you don’t want to have such modules in the production FPGA design.

The AHBWIZARD separates the AHB library (definition files) and the tool itself (GUI) such that you can add a new AHB module definition file or modify how code is generated (such as signal naming conventions). At start-up, the AHBWIZARD scans the library directory, enumerates the modules available, and displays them in the window. You can just drag-and-drop the modules you want to add (Figure 3).

A custom property sheet pops up so that you can specify property information about the module, such as address decoding or AHB master priority, which is in turn used in the module’s code generation. Glue modules such as the AHB slave-to-master multiplexer are automatically generated accordingly.
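As an illustration of the kind of address-decoding information such a property sheet collects, the Python sketch below checks a set of slave regions for overlaps and selects the slave that owns a given address. It is not the AHBWIZARD code generator: the SlaveDef record, the base addresses and sizes, and the dmac and logic_analyzer module names are assumptions made for this example (scc32ahb is the slave wrapper named earlier).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaveDef:
    name: str
    base: int  # first byte address claimed by the slave
    size: int  # region size in bytes


def validate(slaves):
    """Return the slaves sorted by base address, rejecting overlapping regions."""
    ordered = sorted(slaves, key=lambda s: s.base)
    for lower, upper in zip(ordered, ordered[1:]):
        if lower.base + lower.size > upper.base:
            raise ValueError(f"{lower.name} overlaps {upper.name}")
    return ordered


def decode(slaves, addr):
    """Name of the slave selected for addr, or None if no region matches."""
    for slave in slaves:
        if slave.base <= addr < slave.base + slave.size:
            return slave.name
    return None


address_map = validate([
    SlaveDef("scc32ahb", 0x0001_0000, 0x1_0000),
    SlaveDef("dmac", 0x0000_0000, 0x100),
    SlaveDef("logic_analyzer", 0x0002_0000, 0x1000),
])
selected = decode(address_map, 0x0001_0004)
```

Dropping the logic_analyzer entry from the list models removing the debug module from a production build without touching the other entries.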

If the generated top-level file is not complete, you can use an optional “touch-up” module to add to or modify the generated top-level file. A helper tool (touchupwizard) generates the touch-up module (Figure 4).

We provide several commonly used AHB modules for use with the AHBWIZARD, in addition to our AHB-ready microsequencer modules.

Multiprocessors in FPGAs
As the footprints of our microsequencers are small, you could use more than one microsequencer in an FPGA. Although you can assign a different program to each microsequencer, you can also assign the same program to multiple microsequencers. We think that such a configuration – which we call SIMD (single-instruction-stream, multiple-data-stream) configuration – could be very beneficial for certain types of applications.

When controlling a robot, for example, the left and right sides probably require the same control flow. With two microsequencers sharing one program memory, the program only needs to deal with one side, resulting in a simpler program. Additionally, you can use a portion of the program memory for processor communication, as our microsequencers can write to a program memory.

Of course, the number of block RAMs required for program memory is half that required for two separate microsequencers, such that a larger program can be crammed into an FPGA. You can achieve such a scheme with minor core modifications because the block RAM in an FPGA is dual-port memory, allowing two microsequencers to access the program memory without disturbing one another (Figure 5).
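The shared-program arrangement can be modeled as two sequencer objects that fetch from the same program memory but keep their own program counter, stack, and inputs. This Python sketch is only a behavioral illustration: the opcodes (PUSH8, ADD, IN, OUT, HALT), the Sequencer class, and the sensor values are invented for the example and do not reflect the real cores.

```python
# Hypothetical one-byte opcodes; NOT the real microsequencer encoding.
PUSH8, ADD, IN, OUT, HALT = 0x01, 0x02, 0x10, 0x11, 0xFF


class Sequencer:
    """One controller with private state, fetching from a shared program."""

    def __init__(self, program, sensor):
        self.program = program  # shared reference, not a private copy
        self.sensor = sensor    # per-side input (e.g. left or right)
        self.pc, self.stack = 0, []
        self.output, self.halted = None, False

    def step(self):
        op = self.program[self.pc]
        self.pc += 1
        if op == PUSH8:
            self.stack.append(self.program[self.pc])
            self.pc += 1
        elif op == ADD:
            b, a = self.stack.pop(), self.stack.pop()
            self.stack.append(a + b)
        elif op == IN:
            self.stack.append(self.sensor)
        elif op == OUT:
            self.output = self.stack.pop()
        elif op == HALT:
            self.halted = True
        else:
            raise ValueError(f"unknown opcode {op:#04x}")


# One program, written for one side only: output = sensor + 5.
shared_program = bytes([IN, PUSH8, 5, ADD, OUT, HALT])
left = Sequencer(shared_program, sensor=10)
right = Sequencer(shared_program, sensor=20)

# Each pass models one cycle in which both ports of the memory are read.
while not (left.halted and right.halted):
    for seq in (left, right):
        if not seq.halted:
            seq.step()
```

Both sequencers run the same six bytes of program, yet each produces a result from its own input, which is the property the shared program memory relies on.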

Multiprocessing in FPGAs is an interesting, developing field. As the AHBDBG accesses the microsequencer through the AHB bus, any number of microsequencers can be supported.

Conclusion
The ideas presented here are used in ASIC design processes, so we are glad to bring these advanced design methodologies to FPGAs. However, this cannot be done without the newest Xilinx FPGA families.

For more information, visit www.ponderosa-design.com, or e-mail info@ponderosa-design.com.

© 1994-2008 Xilinx, Inc. All Rights Reserved.