Xilinx launches new FPGA cards that can match GPU performance
Xilinx says its new FPGA card, the Alveo U50, can match the performance of a GPU in areas of artificial intelligence (AI) and machine learning.
Courtesy/News Source: networkworld.com
Xilinx has launched a new FPGA card, the Alveo U50, that it claims can match the performance of a GPU in areas of artificial intelligence (AI) and machine learning.
The company claims the card is the industry’s first low-profile adaptable accelerator with PCIe Gen 4 support, which offers double the throughput of PCIe Gen 3. The PCIe Gen 4 specification was finalized in 2017, but cards and motherboards that support it have been slow to come to market.
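The doubling follows from the per-lane signalling rate, which goes from 8 GT/s in Gen 3 to 16 GT/s in Gen 4 with the same 128b/130b encoding. A rough sketch of the arithmetic is below. The function name and the x16 lane count are illustrative assumptions, not details from Xilinx, and real-world throughput will be lower once protocol overhead is counted.

```python
# Per-lane raw signalling rates in gigatransfers per second.
RAW_GT_PER_S = {"Gen 3": 8.0, "Gen 4": 16.0}

# Both generations use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gb_per_s(generation: str, lanes: int) -> float:
    """Approximate one-direction bandwidth in gigabytes per second.

    Hypothetical helper for illustration; it ignores packet and
    protocol overhead, so it is an upper bound.
    """
    gigabits = RAW_GT_PER_S[generation] * ENCODING_EFFICIENCY * lanes
    return gigabits / 8  # bits to bytes


if __name__ == "__main__":
    # x16 is an assumed lane count chosen only for comparison.
    for gen in RAW_GT_PER_S:
        print(f"PCIe {gen} x16: {link_bandwidth_gb_per_s(gen, 16):.2f} GB/s")
```

Whatever lane count is chosen, the Gen 4 figure comes out at exactly twice the Gen 3 figure, since only the signalling rate changes.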
The Alveo U50 provides customers with a programmable low-profile and low-power accelerator platform built for scale-out architectures and domain-specific acceleration of any server deployment, on premises, in the cloud, and at the edge.
Xilinx claims the Alveo U50 delivers a 10x to 20x improvement in throughput and latency compared to a CPU. One thing's for sure: it beats the competition on power draw. Its 75-watt power envelope is comparable to that of a desktop CPU and far lower than that of a Xeon or GPU.
For accelerated networking and storage workloads, the U50 card helps developers identify and eliminate latency and data movement bottlenecks by moving compute closer to the data.
The Alveo U50 card is the first in the Alveo portfolio to be packaged in a half-height, half-length form factor. It is built on the Xilinx UltraScale+ FPGA architecture and features high-bandwidth memory (HBM2), 100 gigabits per second (100 Gbps) networking connectivity, and support for the PCIe Gen 4 and CCIX interconnects. Thanks to the 8GB of HBM2 memory, data transfer speeds can reach 400 Gbps. It also supports NVMe over Fabrics for high-speed SSD transfers.
That’s a lot of performance packed into a small card.
What the Xilinx Alveo U50 can do
Xilinx is making some big boasts about the Alveo U50's capabilities:
Deep learning inference acceleration (speech translation): delivers up to 25x lower latency, 10x higher throughput, and significantly improved power efficiency per node compared to GPU-only speech translation.
Data analytics acceleration (database query): running the TPC-H Query benchmark, Alveo U50 delivers 4x higher throughput per hour and reduces operational costs by 3x compared to in-memory CPU.
Computational storage acceleration (compression): delivers 20x more compression/decompression throughput, faster Hadoop and big data analytics, and over 30% lower cost per node compared to CPU-only nodes.
Network acceleration (electronic trading): delivers 20x lower latency and sub-500 ns trading time compared to CPU-only latency of 10 µs.
Financial modeling (grid computing): running a Monte Carlo simulation, Alveo U50 delivers 7x greater power efficiency compared to GPU-only performance for a faster time to insight, deterministic latency, and reduced operational costs.
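The electronic trading figures above are consistent with one another, as a quick check shows. This is only a sanity check on the quoted numbers, not a benchmark, and the names used are illustrative.

```python
# Figures quoted in the electronic trading claim.
CPU_LATENCY_NS = 10_000  # CPU-only latency of 10 microseconds
CLAIMED_SPEEDUP = 20     # "20x lower latency"


def accelerated_latency_ns(baseline_ns: float, speedup: float) -> float:
    """Latency implied by dividing a baseline by a claimed speedup."""
    return baseline_ns / speedup


if __name__ == "__main__":
    implied = accelerated_latency_ns(CPU_LATENCY_NS, CLAIMED_SPEEDUP)
    print(f"Implied latency: {implied:.0f} ns")  # prints "Implied latency: 500 ns"
```

A 20x reduction from 10 microseconds lands at 500 ns, so a sub-500 ns trading time implies a speedup of at least 20x.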
The Alveo U50 is sampling now with OEM system qualifications in process. General availability is slated for fall 2019.