Implementing a Cable Modem Termination System with Virtex-4 FPGAs



by Delfin Rodillas, Strategic Solutions Manager, Xilinx, Inc.
delfin.rodillas@xilinx.com (1/15/05)


Integrated features make the Virtex-4 device an ideal choice.


With the continued proliferation of cable and satellite television and the rapid growth of the Internet, video transmission bandwidth has experienced phenomenal growth. With video streaming now being introduced into mobile handsets, this growth rate is not showing any signs of slowing down.

The technology advances of Xilinx® FPGAs have kept pace with the increasing transmission requirements and have solved many of the critical design issues in these systems. The Virtex-4™ product family incorporates additional enhancements – high-speed DSP, ultra low power, flexible integrated memory, and high-speed serial I/O – that enable these devices to meet the high bandwidth requirements of video applications.

With these features, you can use Virtex-4 devices in a variety of products, such as cable modem termination systems, digital video broadcast systems, flat-panel displays, master control switches, MPEG encoders, non-linear video editors, broadcast routers, image statistical multiplexers, and video servers.

Cable Modem Termination System
One common application where you can use Virtex-4 devices is in a cable modem termination system (CMTS), shown in Figure 1. The CMTS is a switching system, installed in the cable headend, that works in conjunction with Internet service providers to route data between cable modems and the Internet.

In a CMTS, the transmitted data is multiplexed onto a cable channel along with broadcast video transmissions. Bandwidth is shared by all active subscribers (typically 500 to 2,000) in the cable network segment. Downstream transmission rates run at 40 Mbps using quadrature amplitude modulation (QAM), while upstream rates can be as high as 10 Mbps using QAM or quadrature phase shift keying (QPSK). The speed of the upstream link depends on the service level agreement (SLA) that the subscriber has signed with their cable company.
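As a rough illustration of what sharing means in practice, the sketch below divides the 40 Mbps downstream rate evenly among the subscribers in a segment. It assumes every subscriber is active at the same moment and ignores protocol overhead, which is a worst-case simplification rather than a figure from this article; the function name is illustrative.

```python
def per_subscriber_kbps(channel_mbps, subscribers):
    """Even share of one channel, in kbps, if all subscribers are active at once."""
    return channel_mbps * 1000 / subscribers


if __name__ == "__main__":
    # Segment sizes quoted above: 500 to 2,000 subscribers.
    for count in (500, 2000):
        print(count, "subscribers:", per_subscriber_kbps(40, count), "kbps each")
```

Real traffic is bursty, so a subscriber usually sees far more than this even share; the calculation only shows why traffic management matters when many users are active together.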

CMTS Design Challenges
Cable operators can offer a variety of services by using quality of service (QoS) provisioning to support different subscriber packages, helping to maximize their revenue stream. For QoS in the CMTS, the design needs to support packet classification, packet prioritization, flow control, congestion control, queuing, scheduling, and QoS statistical measurements. All of these functions need to be supported without a reduction in user bandwidth. For this reason, QoS processing is generally done in hardware, because software implementations lack the processing power to make real-time routing decisions and can result in delays and excessive queuing.

Maintaining efficient bandwidth utilization while supporting SLAs and multiple traffic types makes traffic management very challenging. Throw in varying protocols, memory management, different-sized payloads, and a variety of system interfaces, and it is easy to see how these designs require high-performance, cost-effective flexibility that ASSPs and ASICs cannot offer. These challenges open up opportunities for Virtex-4 devices that can provide flexible traffic management capability at the required performance levels.

CMTS Queuing and Scheduling Requirements
QoS provisioning is basically a queuing and scheduling problem. Proper queuing and scheduling entails recognizing service classes along with managing buffer memories and port bandwidth. The design goal is to reduce the amount of congestion in order to offer the maximum amount of bandwidth and packet throughput by optimizing end-to-end delay and minimizing packet loss. In addition, the implementation needs to support fair bandwidth distribution for each service class; furnish protection between the different class levels; provide fast, flexible access to bandwidth without impacting forwarding performance; and allow other service classes to use underutilized bandwidth.

To surmount these challenges, efficient queuing and scheduling techniques are required to optimize queue memory management, which controls the number of packets in a queue. This function controls service-class access to the packet memory buffer and determines which packets to drop because of congestion.

Multiple queue memory management techniques are in use today, including random early detection (RED), weighted random early detection (WRED) and leaky bucket. Per-flow queuing is commonly performed using one or a combination of the scheduling algorithms shown in Table 1.
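To make the drop decision concrete, here is a minimal sketch of the RED idea: the drop probability rises linearly as the average queue length moves between two thresholds. It is a simplified software model, not the hardware implementation discussed in this article; it omits the packet-count correction used in the classic RED algorithm, and the threshold values, the averaging weight, and the per-class profile table used to illustrate WRED are all assumed for illustration.

```python
import random


def update_average(avg_len, current_len, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_len + weight * current_len


def red_drop_probability(avg_len, min_th, max_th, max_p):
    """Linear RED profile: 0 below min_th, 1 at or above max_th."""
    if avg_len < min_th:
        return 0.0
    if avg_len >= max_th:
        return 1.0
    return max_p * (avg_len - min_th) / (max_th - min_th)


# WRED applies the same profile with different parameters per service class.
# These class names and numbers are hypothetical.
WRED_PROFILES = {
    "premium": {"min_th": 40, "max_th": 80, "max_p": 0.05},
    "basic": {"min_th": 20, "max_th": 60, "max_p": 0.20},
}


def should_drop(service_class, avg_len, rng=random):
    """Randomly decide whether to drop an arriving packet of this class."""
    profile = WRED_PROFILES[service_class]
    return rng.random() < red_drop_probability(avg_len, **profile)
```

With these assumed profiles, an average queue length of 50 packets starts dropping "basic" traffic with some probability while "premium" traffic is still mostly admitted, which is how WRED gives the higher class preferential access to the buffer.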

Table 1 – Common queuing and scheduling algorithms
Queuing and Scheduling Algorithms
First-In, First-Out
Round Robin
Weighted Round Robin
Fair Queuing
Weighted Fair Queuing
Priority Queuing
Shortest Remaining Time

Table 1 shows that there are many different queuing and scheduling algorithms. Given the dearth of standards activity in this area, many different algorithms will continue to exist for the foreseeable future. In addition, these algorithms need to handle variable sized packets, which are more complicated than fixed cells.
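The difficulty with variable-sized packets can be seen in a short sketch. Plain weighted round robin counts packets, so a queue full of large packets gets more bandwidth than its weight implies. The example below uses deficit round robin, a well-known byte-counting variant of weighted round robin that is not listed in Table 1 and is swapped in here only because it handles variable packet sizes simply. The queue names and quantum values are hypothetical.

```python
from collections import deque


def deficit_round_robin(queues, quanta, rounds):
    """Serve queues of packet sizes (bytes) for a number of rounds.

    queues: dict mapping queue name -> deque of packet sizes in bytes
    quanta: dict mapping queue name -> byte credit added each round
    Returns the list of (queue name, packet size) in transmit order.
    """
    deficits = {name: 0 for name in queues}
    sent = []
    for _ in range(rounds):
        for name, queue in queues.items():
            if not queue:
                deficits[name] = 0  # idle queues do not accumulate credit
                continue
            deficits[name] += quanta[name]
            # Send packets while the head packet fits in the byte credit.
            while queue and queue[0] <= deficits[name]:
                size = queue.popleft()
                deficits[name] -= size
                sent.append((name, size))
            if not queue:
                deficits[name] = 0
    return sent


if __name__ == "__main__":
    demo = {"gold": deque([1500, 1500, 1500]), "bronze": deque([1500, 1500, 1500])}
    print(deficit_round_robin(demo, {"gold": 3000, "bronze": 1500}, rounds=2))
```

Because credit is tracked in bytes, a packet larger than the current credit simply waits for a later round, so each queue's long-run share follows its quantum regardless of packet size.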

Virtex-4 devices offer a high-performance solution for these queuing and scheduling requirements because they provide an extremely fast and flexible fabric for implementing designs without impacting forwarding performance. Scheduling decisions are typically performed every clock cycle and require heavily pipelined designs.

Virtex-4 devices also offer a register-rich architecture with ample routing, enabling efficient implementation of these decisions. The high-speed designs also require very wide internal buses, which are easily implemented in the Virtex-4 architecture by using the integrated DLLs and DCMs to help manage multiple clock domains.

Many of the queuing and scheduling buffer management schemes are math-intensive; these schemes must quickly calculate multi-variable equations such as packet transmit scheduling and customer service normalization schemes. For instance, the bandwidth calculation shown in Figure 2 is a multi-variable equation used to calculate the bandwidth (B1, B2) for each user for a given level of total bandwidth. These types of functions can take advantage of the 500 MHz performance, low power, 18 x 18 multipliers, and 48-bit adder/subtractor of the XtremeDSP™ slice.
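Figure 2 is not reproduced in this text, so the following is not that equation. It is a generic weighted proportional share, included only to show the kind of multiply-and-divide arithmetic such schemes involve; the weights and the function name are assumptions.

```python
def weighted_shares(total_bandwidth, weights):
    """Split total_bandwidth among users in proportion to their weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return {user: total_bandwidth * weight / total_weight
            for user, weight in weights.items()}


if __name__ == "__main__":
    # Two users sharing a 40 Mbps channel with a hypothetical 3:1 weighting.
    print(weighted_shares(40, {"B1": 3, "B2": 1}))
```

In hardware, the division by the total weight would typically be replaced by a multiplication with a precomputed reciprocal, which is the kind of operation a multiplier-based DSP block is suited to.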

CMTS Memory Requirements
Most networking applications are built around a load-store type of architecture, with packets being stored in linked lists in external memories. Because of the increasing queuing and scheduling performance requirements of the CMTS, high-speed DDR or QDR SRAM memories are used to keep memory access from becoming a bottleneck.
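The linked-list organization can be sketched as a pool of memory cells that all queues share. Each cell holds one packet and a pointer to the next cell, and unused cells are chained on a free list. This is a simplified software model under assumed names, not a description of any particular CMTS design; a real design would typically split large packets across several fixed-size cells.

```python
class PacketMemory:
    """Shared pool of cells holding per-queue linked lists of packets."""

    def __init__(self, num_cells):
        if num_cells < 1:
            raise ValueError("num_cells must be at least 1")
        self.payload = [None] * num_cells
        # Initially every cell is on the free list: 0 -> 1 -> ... -> None.
        self.next_cell = list(range(1, num_cells)) + [None]
        self.free_head = 0
        self.head = {}  # queue id -> index of first cell
        self.tail = {}  # queue id -> index of last cell

    def enqueue(self, queue, packet):
        """Store a packet at the tail of a queue; return False if memory is full."""
        cell = self.free_head
        if cell is None:
            return False  # caller must drop the packet
        self.free_head = self.next_cell[cell]
        self.payload[cell] = packet
        self.next_cell[cell] = None
        if queue in self.tail:
            self.next_cell[self.tail[queue]] = cell
        else:
            self.head[queue] = cell
        self.tail[queue] = cell
        return True

    def dequeue(self, queue):
        """Remove and return the packet at the head of a queue, or None if empty."""
        cell = self.head.get(queue)
        if cell is None:
            return None
        packet = self.payload[cell]
        following = self.next_cell[cell]
        if following is None:
            del self.head[queue]
            del self.tail[queue]
        else:
            self.head[queue] = following
        # Return the cell to the free list.
        self.payload[cell] = None
        self.next_cell[cell] = self.free_head
        self.free_head = cell
        return packet
```

Every enqueue and dequeue touches the pointer table as well as the payload, which is why pointer-chasing access patterns put pressure on memory bandwidth and latency.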

To properly interface to these memory devices, all Virtex-4 devices have the ChipSync™ feature in every device I/O. ChipSync lets designers easily align the DQS control signal with memory data in very small increments; this alignment can be easily monitored and altered as temperature and voltage changes alter the very delicate timing.

Converting the high-speed 300 MHz+ memory data to wider, slower, more manageable data is easily accomplished with the built-in ISERDES and OSERDES available in every I/O. Additionally, the Virtex-4 memory-rich architecture, capable of running at 500 MHz, provides much-needed on-chip cache capability.
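The serial-to-parallel conversion can be pictured with a small behavioral model: bits arriving one at a time on a pin are grouped into words that the fabric can process at a fraction of the bit rate. This sketch is not a model of the ISERDES primitive itself; the bit ordering (first bit received becomes the least significant bit) and the handling of leftover bits are assumptions made for illustration.

```python
def deserialize(bits, width):
    """Group a serial bit stream into parallel words of `width` bits.

    The first bit received is placed in the least significant position.
    Trailing bits that do not fill a complete word are discarded.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    words = []
    for start in range(0, len(bits) - width + 1, width):
        word = 0
        for offset, bit in enumerate(bits[start:start + width]):
            word |= (bit & 1) << offset
        words.append(word)
    return words


if __name__ == "__main__":
    stream = [1, 0, 0, 0, 0, 1, 0, 0]
    print(deserialize(stream, 4))
```

With a 1:4 ratio, words are produced at one quarter of the incoming bit rate, which is what makes the data manageable for the rest of the design.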

Virtex-4 devices support high-speed memory interfaces and, along with an embedded hierarchy of memory structures comprising distributed and block RAM, can easily facilitate implementation of high-performance queuing and scheduling algorithms. The Virtex-4 devices’ high memory-to-logic ratio helps reduce memory access latency by caching data on-chip, buffering data between two disparate clock domains, and using scratch-pad memory for storing coefficients.

The integrated distributed RAM is good for implementing small FIFOs, DSP coefficients, shallow/wide memories, and CAMs. The block RAM is good for larger FIFOs, packet buffers, video line buffers, cache tag memory, deep/wide memories, and CAMs. Xilinx also has many proven embedded-memory CAM and FIFO reference designs available to help implement these high-speed memory designs.

CMTS Video Transmission Standards
The ITU-T (International Telecommunication Union – Telecommunication Standardization Sector) has created a standard for the transmission of audio, video, and data services over cable networks. The specification for this standard is ITU-T J.83 Digital Multi-Program Systems for Television, Sound, and Data Services for Cable Distribution.

This standard is supported in Virtex-4 devices using the Xilinx J.83 Cable Modulator LogiCORE™ IP to provide either single- or quad-channel support. (See the related article from the Winter 2004 issue of the Xcell Journal, “Using System Generator for DSP to Create the J.83 Cable Modulator.”)

Conclusion
Given the high bandwidth requirements of a CMTS, along with the queuing and scheduling complexity involved in meeting QoS requirements, Virtex-4 devices offer an optimal solution for these designs. The embedded hierarchy of memory structures, along with integrated high-speed serial interfaces and programmable flexibility, makes Virtex-4 devices a better choice than implementations using ASICs or ASSPs.

To learn more about Xilinx key markets and end applications, visit www.xilinx.com/esp/. For more details on Virtex-4 FPGAs, visit www.xilinx.com/virtex4/.
