Cloud data centers are changing. Today’s CPUs have not been able to keep up with compute-intensive applications like machine learning, data analytics, and video processing. Faced with this gap and with growing bottlenecks in networking and storage, cloud service providers have turned to accelerators to increase the overall throughput and efficiency of their data centers.
Major cloud service providers like Amazon, Baidu, and Microsoft have announced deployment of FPGA technology in their hyperscale data centers to drive their services business in an extremely competitive market. FPGAs are the perfect complement to highly agile cloud computing environments because they are programmable and can be hardware-optimized for any new application or algorithm.
The inherent ability of an FPGA to be reconfigured and reprogrammed over time is perhaps its greatest advantage in a fast-moving field. Using dynamic reconfiguration, FPGAs can switch quickly, in less than a second, to a different design that is hardware-optimized for the next workload. As a result, Xilinx FPGAs can deliver the flexibility, application breadth, and feature velocity that complex and constantly changing hyperscale applications need, something that CPUs and custom ASICs cannot achieve.
Customers - Three of the top seven hyperscale cloud companies have deployed Xilinx FPGAs, including Baidu, which in October announced it had designed Xilinx UltraScale™ FPGAs into pools to accelerate machine learning inference.
Partnerships - Both Qualcomm and IBM announced strategic collaborations with Xilinx for data center acceleration. The IBM engagement has already resulted in a storage and networking acceleration framework, CAPI SNAP, making it easier for developers to accelerate applications such as NoSQL using Xilinx FPGAs.
Standards Leadership - Xilinx has been leading an industry initiative toward the development of an intelligent, cache-coherent interconnect called CCIX. Formed in May 2016 by Xilinx along with AMD, ARM, Huawei, IBM, Mellanox, and Qualcomm, the initiative tripled its membership within five months.
Software-Defined Tools and Products for the Data Center - The SDAccel™ Development Environment for FPGA acceleration was released in 2014. In November 2016 Xilinx unveiled details for new 16nm Virtex UltraScale+™ FPGAs with High Bandwidth Memory (HBM) and CCIX technology.
The new Xilinx Reconfigurable Acceleration Stack enables the world’s largest cloud service providers to develop and deploy acceleration platforms at cloud scale, and it delivers the flexibility needed for complex cloud computing applications like machine learning, data analytics, and video transcoding. Designed for cloud-native applications, this FPGA-powered acceleration stack includes libraries, framework integration, developer board and resources, and OpenStack support. It provides up to 40x the compute efficiency of a CPU and up to six times the compute efficiency of any other FPGA on the market today.
Hear the latest on FPGA acceleration in hyperscale data centers from the Xilinx R&D team.
Review the Xilinx technical paper on “Deep Learning with INT8 Optimization on Xilinx Devices”.
Learn about FPGA acceleration in the Amazon Cloud.
Get started now with a cloud-based test drive.
DNN -- Deep Neural Network (DNN) library from Xilinx is a highly optimized library for building deep learning inference applications. It is designed for maximum compute efficiency at 16-bit and 8-bit integer data types.
GEMM -- General Matrix Multiply (GEMM) library, based on the level-3 Basic Linear Algebra Subprograms (BLAS), from Xilinx delivers optimized performance at 16-bit and 8-bit integer data types and supports matrices of any size.
HEVC Decoder & Encoder -- HEVC/H.265 is the latest video compression standard coming out of the MPEG and ITU standards bodies. It is the successor to H.264 and offers up to 50% bandwidth reduction. Xilinx provides two encoders: a high-quality, real-time, flexible encoder that addresses the majority of video data center workloads, and an alternate encoder for non-camera-generated content. The decoder supports all the applications for both encoders.
Data Mover (SQL) -- The SQL data mover library makes it easy to accelerate data analytics workloads with a Xilinx FPGA. The data mover library orchestrates standard connections to SQL databases by sending blocks of data from the database tables to the on-chip memory of the FPGA accelerator card over PCIe. The library has been optimized to maximally utilize PCIe bandwidth between the host CPU and the accelerator functions on the FPGA device.
Compute Kernel (SQL) -- A library that accelerates numerous core SQL functions on the FPGA hardware such as decimal type, date type, scan, compare, filter and many others. The compute functions are optimized for exploiting the massive hardware parallelization of FPGAs.
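The division of labor between the SQL data mover and the SQL compute kernel can be illustrated with a small host-side sketch. This is not the Xilinx library API: the function names (`move_blocks`, `filter_kernel`, `accelerated_filter`), the row schema, and the block size are all hypothetical, and plain Python stands in for both the PCIe transfer and the FPGA kernel. It only shows the pattern described above, in which table rows are sent in blocks and a scan/compare/filter function is applied to each block.

```python
from datetime import date
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical row schema: (order_id, order_date, amount_in_cents).
# Amounts are integer cents, a simple fixed-point stand-in for a decimal type.
Row = Tuple[int, date, int]


def move_blocks(rows: Iterable[Row], block_size: int) -> Iterator[List[Row]]:
    """Yield rows in fixed-size blocks (stand-in for block transfers to the card)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    block: List[Row] = []
    for row in rows:
        block.append(row)
        if len(block) == block_size:
            yield block
            block = []
    if block:  # final, possibly partial block
        yield block


def filter_kernel(block: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Scan one block and keep matching rows (stand-in for the FPGA compute kernel)."""
    return [row for row in block if predicate(row)]


def accelerated_filter(
    rows: Iterable[Row], predicate: Callable[[Row], bool], block_size: int = 4
) -> List[Row]:
    """Run the filter block by block and concatenate the per-block results."""
    matches: List[Row] = []
    for block in move_blocks(rows, block_size):
        matches.extend(filter_kernel(block, predicate))
    return matches


# Illustrative data only; not taken from any real workload.
TABLE: List[Row] = [
    (1, date(2016, 10, 3), 1250),
    (2, date(2016, 11, 14), 99900),
    (3, date(2016, 11, 20), 4500),
    (4, date(2016, 12, 1), 150000),
    (5, date(2017, 1, 9), 7800),
]


def recent_large_order(row: Row) -> bool:
    """Date compare plus fixed-point amount compare: on/after 2016-11-01, >= 50.00."""
    _, order_date, amount_cents = row
    return order_date >= date(2016, 11, 1) and amount_cents >= 5000


matching_ids = [row[0] for row in accelerated_filter(TABLE, recent_large_order, block_size=2)]
print(matching_ids)
```

Because each block is filtered independently, the result does not depend on the block size. That independence is what lets a hardware implementation process many rows in parallel, while the block size itself can be tuned for transfer efficiency.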
|Virtex UltraScale+ FPGA VCU1525 Acceleration Development Kit||Developer Evaluation||Xilinx|
|Kintex UltraScale FPGA Acceleration Development Kit||Developer Evaluation||Xilinx|
|Bittware PCIe Boards|
|Alpha Data ADM-PCIE-KU3||Production||Alpha Data|
|Alpha Data ADM-PCIE-7V3||Production||Alpha Data|
|Semptian NSA-120 Accelerator Card||Production||Semptian|
|Storage Acceleration Cards (NVMeoF)||Production||Fidus|
Faced with exponential growth in computing requirements and the inability of CPU technology to keep pace, cloud and data center architectures are moving toward accelerated computing. Accelerators complement CPU-based architectures and deliver both performance and power efficiency.
FPGAs can deliver 10x acceleration across a broad set of applications and are reconfigurable to provide an ideal fit for the changing workloads of the modern data center.
With acceleration capabilities a full generation ahead of any other FPGA, Xilinx UltraScale™ and UltraScale+ FPGAs are empowering hardware and application developers in many of the world’s largest and most innovative cloud computing services.
The SDAccel™ development environment for OpenCL™, C, and C++ enables up to 25X better performance/watt for data center application acceleration leveraging FPGAs. SDAccel, a member of the SDx™ family, combines the industry's first architecturally optimizing compiler supporting any combination of OpenCL, C, and C++ kernels with libraries, development boards, and the first complete CPU/GPU-like development and run-time experience for FPGAs. To learn more, visit the SDAccel Zone.
|FPGA Startup Gathers Funding Force for Merged Hyperscale Inference||This article discusses FPGA-based architecture that targets efficient, scalable machine learning inference from startup DeePhi Tech.|
|ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA||FPGA2017 Best Paper winner for breakthrough results with a highly efficient FPGA-accelerated speech recognition engine achieving 43x the performance and 40x the performance per watt compared to a CPU; 3x the performance and 11x the performance per watt compared to a GPU.|
|Power-Efficient Machine Learning on POWER Systems using FPGA Acceleration||This session provides an overview of how FPGA acceleration can enhance POWER systems for machine learning workloads such as image recognition.|
|Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster||This paper presents a deeply pipelined multi-FPGA architecture that expands the design space for optimal performance and energy efficiency.|
|From Model to FPGA: Software-Hardware Co-Design for Efficient Neural Network Acceleration||This presentation discusses the use of FPGAs and trends in neural network acceleration.|
|Baidu Takes FPGA Approach to Accelerating SQL at Scale||This article discusses Baidu’s approach to big data challenges using FPGAs.|
|SDA: Software-Defined Accelerator for general-purpose big data analysis system||This presentation discusses Baidu’s Software-Defined Accelerator for a general-purpose big data analysis system.|
|SDA: Software-Defined Accelerator for Large-Scale DNN Systems||This article consists of a collection of slides from the author's conference presentation on the special features, system design and architectures, processing capabilities, and targeted markets for Baidu's family of software defined accelerator products (SDA) for large scale deep neural network (DNN) systems.|
||A community for discussing topics related to the SDAccel™ Development Environment for OpenCL™, C, and C++|