With the continued proliferation of cable
and satellite television and the rapid
growth of the Internet, video transmission
bandwidth has experienced phenomenal
growth. With video streaming now being
introduced into mobile handsets, this
growth rate is not showing any signs of
slowing down.
The technology advances of Xilinx®
FPGAs have kept pace with the increasing
transmission requirements and have solved
many of the critical design issues in these
systems. The Virtex-4™ product family
incorporates additional enhancements –
high-speed DSP, ultra low power, flexible
integrated memory, and high-speed serial
I/O – that enable these devices to meet the
high bandwidth requirements of video
applications.
With these features, you can use Virtex-4 devices in a variety of products, such as
cable modem termination systems, digital
video broadcast systems, flat-panel displays,
master control switches, MPEG
encoders, non-linear video editors, broadcast
routers, image statistical multiplexers,
and video servers.
Cable Modem Termination System
One common application where you can
use Virtex-4 devices is in a cable modem
termination system (CMTS), shown in
Figure 1. The CMTS is a switching system,
located in the cable headend, that works
in conjunction with Internet service
providers to route data between cable
modems and the Internet.
In a CMTS, the transmitted data is multiplexed
onto a cable channel along with
broadcast video transmissions. Bandwidth
is shared by all active subscribers (typically
500 to 2,000) in the cable network segment.
Downstream transmission rates run
at 40 Mbps using quadrature amplitude
modulation (QAM), while upstream rates
can be as high as 10 Mbps using QAM or
quadrature phase shift keying (QPSK). The
speed of the upstream link depends on the
service level agreement (SLA) that the subscriber
has signed with their cable company.
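To give a feel for what shared bandwidth means per subscriber, the short sketch below divides the downstream rate evenly among the active subscribers. The even split and the assumption that every subscriber is active at once are simplifications made for illustration only; the function name is hypothetical.

```python
def average_share_kbps(channel_mbps: float, active_subscribers: int) -> float:
    """Average per-subscriber rate in kbps, assuming an even split (illustrative)."""
    if active_subscribers <= 0:
        raise ValueError("active_subscribers must be positive")
    # 1 Mbps = 1000 kbps
    return channel_mbps * 1000.0 / active_subscribers


if __name__ == "__main__":
    # 40 Mbps downstream shared by the typical 500 to 2,000 subscribers
    for subscribers in (500, 2000):
        print(subscribers, "subscribers:", average_share_kbps(40.0, subscribers), "kbps each")
```

Under these assumptions the average share falls between 20 and 80 kbps, which is why the traffic management described in the following sections matters.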
CMTS Design Challenges
Cable operators can offer a variety of different
services by using quality of service
(QoS) provisioning to support different
subscriber packages, helping to maximize
their revenue stream. For QoS in the
CMTS, the design needs to support packet
classification, packet prioritization, flow
control, congestion control, queuing,
scheduling, and QoS statistical measurements. All of these functions need to be
supported without a reduction in user
bandwidth. Given this, QoS processing is
generally done in hardware, for software
implementations lack the
processing power to make
real-time routing decisions
and can result in delays
and excessive queuing.
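The first two QoS functions in that list, packet classification and prioritization, can be sketched in software to show the logic that the hardware has to perform on every packet. This is a minimal illustration only: the packet fields, the service-class names, and the port numbers in the rule table are all hypothetical placeholders, not values taken from any CMTS specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    subscriber_id: int
    protocol: str   # for example "udp" or "tcp"
    dst_port: int
    length: int     # payload size in bytes


@dataclass(frozen=True)
class Rule:
    service_class: str
    priority: int                    # 0 is the highest priority
    protocol: Optional[str] = None   # None matches any protocol
    dst_port: Optional[int] = None   # None matches any port

    def matches(self, pkt: Packet) -> bool:
        return ((self.protocol is None or self.protocol == pkt.protocol) and
                (self.dst_port is None or self.dst_port == pkt.dst_port))


# Hypothetical rule table, evaluated top to bottom; the last rule is a catch-all.
RULES = [
    Rule("voice", 0, protocol="udp", dst_port=5060),
    Rule("video", 1, protocol="udp", dst_port=554),
    Rule("best_effort", 2),
]


def classify(pkt: Packet) -> Rule:
    """Return the first rule that matches the packet."""
    for rule in RULES:
        if rule.matches(pkt):
            return rule
    raise LookupError("no rule matched")  # not reached while a catch-all rule exists


if __name__ == "__main__":
    rule = classify(Packet(subscriber_id=1, protocol="tcp", dst_port=80, length=1500))
    print(rule.service_class, rule.priority)
```

A sequential rule walk like this one is exactly the kind of per-packet work that becomes a bottleneck in software; in hardware the rules can be compared in parallel.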
Maintaining efficient
bandwidth utilization
while supporting SLAs and
multiple traffic types
makes traffic management
very challenging. Throw in
varying protocols, memory
management, different-sized
payloads, and a variety
of different system
interfaces, and it is easy to
see how these designs require high-performance,
cost-effective flexibility that
ASSPs and ASICs cannot offer. These challenges
open up opportunities for Virtex-4
devices that can provide flexible traffic
management capability at the required performance
levels.
CMTS Queuing and Scheduling Requirements
QoS provisioning is basically a queuing and
scheduling problem. Proper queuing and
scheduling entails recognizing service classes
along with managing buffer memories and port
bandwidth. The design goal is to
reduce congestion so that the system delivers
the maximum amount of bandwidth
and packet throughput while minimizing end-to-end delay and packet loss.
In addition, the implementation needs
to support fair bandwidth distribution for
each service class; furnish protection
between the different class levels; provide
fast, flexible access to bandwidth without
impacting forwarding performance; and
allow other service classes to use underutilized
bandwidth.
To surmount these challenges, efficient
queuing and scheduling techniques are
required to optimize queue memory management,
which controls the number of
packets in a queue. This function controls
service-class access to the packet memory
buffer and determines which packets to
drop because of congestion.
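As an illustration of a congestion-driven drop decision, the following is a minimal sketch of random early detection (RED) in its basic textbook form: the drop probability rises linearly between two thresholds on a smoothed queue length. The class name, the parameter names, and the default values are assumptions chosen for the example, and the refinement that spaces drops by counting packets since the last drop is left out.

```python
import random


class RedQueue:
    """Simplified RED queue: thresholds are expressed in packets."""

    def __init__(self, min_th: float, max_th: float,
                 max_p: float = 0.1, weight: float = 0.002):
        if not 0 <= min_th < max_th:
            raise ValueError("need 0 <= min_th < max_th")
        if not 0.0 < max_p <= 1.0 or not 0.0 < weight <= 1.0:
            raise ValueError("max_p and weight must be in (0, 1]")
        self.min_th = min_th
        self.max_th = max_th
        self.max_p = max_p
        self.weight = weight
        self.avg = 0.0     # exponentially weighted average queue length
        self.queue = []

    def drop_probability(self) -> float:
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def enqueue(self, pkt, rng=random.random) -> bool:
        """Return True if the packet was queued, False if it was dropped."""
        # Update the moving average with the instantaneous queue length.
        self.avg = (1.0 - self.weight) * self.avg + self.weight * len(self.queue)
        if rng() < self.drop_probability():
            return False
        self.queue.append(pkt)
        return True
```

A weighted variant can be approximated by giving each service class its own `min_th`, `max_th`, and `max_p`, so that lower-priority traffic is dropped earlier.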
Multiple queue memory management
techniques are in use today, including random
early detection (RED), weighted random
early detection (WRED),
and leaky bucket. Per-flow
queuing is commonly performed
using one or a combination
of the scheduling
algorithms shown in Table 1.
Table 1 – Common queuing and scheduling algorithms
| Queuing and Scheduling Algorithms |
| First-In, First-Out |
| Round Robin |
| Weighted Round Robin |
| Fair Queuing |
| Weighted Fair Queuing |
| Priority Queuing |
| Shortest Remaining Time |
Table 1 shows that there
are many different queuing
and scheduling algorithms.
Given the dearth of standards
activity in this area, many different
algorithms will continue
to exist for the foreseeable
future. In addition, these
algorithms need to handle
variable-sized packets, which
are more complicated than fixed cells.
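To show why variable-sized packets complicate scheduling, here is a minimal sketch of deficit round robin, a well-known byte-based variant of weighted round robin. It is not one of the algorithms named in Table 1 and is used here only as an example; the class name, the queue names, and the quantum values are hypothetical.

```python
from collections import deque


class DeficitRoundRobin:
    def __init__(self, quanta):
        # quanta maps a queue name to the bytes credited per round (its weight)
        self.quanta = dict(quanta)
        self.queues = {name: deque() for name in self.quanta}
        self.deficit = {name: 0 for name in self.quanta}

    def enqueue(self, name, packet_len):
        self.queues[name].append(packet_len)

    def run_round(self):
        """Visit every queue once; return the (name, packet_len) pairs sent."""
        sent = []
        for name, q in self.queues.items():
            if not q:
                self.deficit[name] = 0   # idle queues do not bank credit
                continue
            self.deficit[name] += self.quanta[name]
            # Send packets while the head packet fits in the remaining credit.
            while q and q[0] <= self.deficit[name]:
                length = q.popleft()
                self.deficit[name] -= length
                sent.append((name, length))
            if not q:
                self.deficit[name] = 0
        return sent


if __name__ == "__main__":
    drr = DeficitRoundRobin({"gold": 1000, "bronze": 500})
    for _ in range(3):
        drr.enqueue("gold", 600)
        drr.enqueue("bronze", 400)
    print(drr.run_round())
    print(drr.run_round())
```

With fixed-size cells a plain packet count per round would be enough; the per-queue byte credit is the extra state needed to stay fair when packet lengths differ.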
Virtex-4 devices offer a high-performance
solution for these queuing and scheduling
requirements, for the devices offer an
extremely fast and flexible fabric for implementing
designs without impacting forwarding
performance. Scheduling decisions
are typically performed every clock cycle and
require heavily pipelined designs.
Virtex-4 devices also offer a register-rich
architecture with ample routing, enabling
efficient implementation of these decisions.
The high-speed designs also require very
wide internal buses, which are easily implemented
in the Virtex-4 architecture by
using the integrated DLLs and DCMs to
help manage multiple clock domains.
Many of the queuing and scheduling
buffer management schemes are math-intensive;
these schemes must quickly calculate
multi-variable equations such as
packet transmit scheduling and customer
service normalization schemes. For
instance, the bandwidth calculation shown
in Figure 2 is a multi-variable equation
used to calculate the bandwidth (B1, B2)
for each user for a given level of total bandwidth.
These types of functions can take
advantage of the 500 MHz performance,
low power, 18 x 18 multipliers,
and 48-bit adder/subtractor integrated in
the XtremeDSP™ slice.
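The equation in Figure 2 is not reproduced here, so the sketch below uses a generic proportional split as a stand-in: each user receives a share of the total bandwidth in proportion to a weight. It models the arithmetic in fixed point so that each per-user result needs a single multiply whose operands fit an 18 x 18 signed multiplier. The function names, the 16-bit fraction width, and the weights are assumptions made for illustration.

```python
FRAC_BITS = 16                     # assumed fixed-point fraction width
OPERAND_MIN = -(1 << 17)           # signed 18-bit operand range
OPERAND_MAX = (1 << 17) - 1


def mul18(a: int, b: int) -> int:
    """Multiply two values after checking that both fit in 18 signed bits."""
    for x in (a, b):
        if not OPERAND_MIN <= x <= OPERAND_MAX:
            raise OverflowError("operand does not fit in 18 signed bits")
    return a * b                   # the product fits in 36 bits


def allocate(total_kbps: int, weights):
    """Split total_kbps among users in proportion to their integer weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalize each weight to a fraction scaled by 2**FRAC_BITS.
    # This division is done once per configuration change, not per packet.
    fractions = [(w << FRAC_BITS) // total_weight for w in weights]
    # One multiply and one shift per user; results are rounded down.
    return [mul18(total_kbps, f) >> FRAC_BITS for f in fractions]


if __name__ == "__main__":
    b1, b2 = allocate(40000, [3, 1])   # 40 Mbps expressed in kbps
    print(b1, b2)
```

Precomputing the normalized fractions keeps the division off the per-cycle path, leaving only multiplies and shifts for the hardware to perform at speed.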
CMTS Memory Requirements
Most networking applications are built
around a load-store type of architecture,
with packets being stored in linked lists in
external memories. Because of the increasing
queuing and scheduling performance
requirements of the CMTS, high-speed
DDR or QDR SRAM memories are needed to keep
memory access from becoming a bottleneck.
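The linked-list organization mentioned above can be sketched as a pool of fixed memory cells with a free list and a single packet queue threaded through the same pointer array. This is a simplified software model, assuming one packet per cell and one queue; the class and method names are hypothetical.

```python
class PacketMemory:
    NIL = -1   # marks the end of a list

    def __init__(self, num_cells: int):
        if num_cells < 1:
            raise ValueError("num_cells must be at least 1")
        self.data = [None] * num_cells
        # Initially every cell is chained into the free list: 0 -> 1 -> ... -> NIL
        self.next = list(range(1, num_cells)) + [self.NIL]
        self.free_head = 0
        self.head = self.NIL   # oldest queued packet
        self.tail = self.NIL   # newest queued packet

    def store(self, packet) -> bool:
        """Append a packet to the queue; return False if memory is full."""
        if self.free_head == self.NIL:
            return False
        cell = self.free_head
        self.free_head = self.next[cell]
        self.data[cell] = packet
        self.next[cell] = self.NIL
        if self.tail == self.NIL:
            self.head = cell
        else:
            self.next[self.tail] = cell
        self.tail = cell
        return True

    def load(self):
        """Remove and return the oldest packet, or None if the queue is empty."""
        if self.head == self.NIL:
            return None
        cell = self.head
        packet = self.data[cell]
        self.head = self.next[cell]
        if self.head == self.NIL:
            self.tail = self.NIL
        # Return the cell to the free list.
        self.data[cell] = None
        self.next[cell] = self.free_head
        self.free_head = cell
        return packet
```

Every store and load touches the pointer array as well as the data, which illustrates why memory access rate, and not just capacity, limits queuing performance.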
To properly interface to these memory
devices, all Virtex-4 devices have the
ChipSync™ feature in every device I/O.
ChipSync lets designers easily align the
DQS control signal with memory data in
very small increments; this alignment can
be easily monitored and altered as temperature
and voltage changes alter the very
delicate timing.
Converting the high-speed 300 MHz+
memory data to wider, slower, more manageable
data is easily accomplished with the
built-in ISERDES and OSERDES available
in every I/O. Additionally, the Virtex-4
memory-rich architecture, capable of running
at 500 MHz, provides much-needed
on-chip cache capability.
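The idea of trading speed for width can be shown with a small behavioral model of serial-to-parallel conversion. This is a generic sketch, not a model of the ISERDES primitive itself: the function name is hypothetical, and the bit ordering (first bit received becomes the most significant bit) is an assumption.

```python
def deserialize(bits, width: int):
    """Group a serial bit stream into width-bit words, dropping any partial word."""
    if width <= 0:
        raise ValueError("width must be positive")
    words = []
    usable = len(bits) - len(bits) % width
    for i in range(0, usable, width):
        word = 0
        for bit in bits[i:i + width]:
            word = (word << 1) | (bit & 1)   # first bit ends up as the MSB
        words.append(word)
    return words


if __name__ == "__main__":
    stream = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1]
    print(deserialize(stream, 4))
```

A stream arriving at rate f leaves as words at rate f divided by `width`, so the downstream logic can run at a fraction of the interface clock.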
Virtex-4 devices support high-speed
memory interfaces and, along with an
embedded hierarchy of memory structures
comprising distributed and block RAM,
can easily facilitate implementation of
high-performance queuing and scheduling
algorithms. The Virtex-4 devices’ high
memory-to-logic ratio helps reduce memory
access latency by caching data on-chip,
buffering data between two disparate clock
domains, and using scratch-pad memory
for storing coefficients.
The integrated distributed RAM is
good for implementing small FIFOs, DSP
coefficients, shallow/wide memories, and
CAMs. The block RAM is good for larger
FIFOs, packet buffers, video line buffers,
cache tag memory, deep/wide memories,
and CAMs. Xilinx also has many proven
embedded-memory CAM and FIFO reference
designs available to help implement
these high-speed memory designs.
CMTS Video Transmission Standards
The ITU-T (International Telecommunication
Union – Telecommunication
Standardization Sector) has created a standard
for the transmission of audio, video,
and data services over cable networks. The
specification for this standard is ITU-T
J.83 Digital Multi-Program Systems for
Television, Sound, and Data Services for
Cable Distribution.
This standard is supported in Virtex-4
devices using the Xilinx J.83 Cable
Modulator LogiCORE™ IP to provide
either single- or quad-channel support.
(See the related article from the Winter
2004 issue of the Xcell Journal, “Using
System Generator for DSP to Create the
J.83 Cable Modulator.”)
Conclusion
Given the high bandwidth requirements
of a CMTS along with the associated
queuing and scheduling complexities to
provide the appropriate QoS requirements,
Virtex-4 devices offer an optimal
solution for these designs. The embedded
hierarchy of memory structures, along
with integrated high-speed serial interfaces
and programmable flexibility, makes
Virtex-4 devices a better choice than
implementations using ASICs or ASSPs.
To learn more about Xilinx key markets
and end applications, visit www.xilinx.com/esp/. For more details on Virtex-4 FPGAs, visit www.xilinx.com/virtex4/.