In today's world, speedy data transfer is critical to accessing information efficiently. Datacenter services, such as search, storage, database, financial, or high-transaction-rate applications, are latency sensitive and bandwidth hungry. In addition, mobile applications, gaming, artificial intelligence (AI) and machine-learning (ML) workloads, and over-the-top (OTT) video all rely on low latency to deliver a good user experience.
Modern datacenters are struggling to meet the demand for high bandwidth with extremely low latency. RoCEv2, the latest evolution of Remote Direct Memory Access (RDMA), is the ideal technology to address the requirements of a high-performance, low-latency, and low-cost data transfer network. It increases the efficiency of existing network infrastructure and improves both CPU utilization and host memory usage for running applications. This results in reduced power, cooling, and rack space requirements, with lower cost of ownership and higher return on investment for organizations.
Brief history of RoCEv2
RDMA is an innovative technology that offers an efficient, fast way to move data between networked computers without involving their operating systems or CPU resources. It improves host performance by reducing CPU load, and it improves network performance with lower latency and higher bandwidth.
RDMA was conceived in 1993 and initially applied to building low-cost supercomputers using distributed computing. InfiniBand was among the first interconnects to develop this concept, and it later evolved into the favored networking technology for high-performance computing (HPC) cluster design. The technology was productized by Mellanox Technologies (now part of NVIDIA) in 2001.
At that point, RDMA was supported only over InfiniBand, which worked extremely well but saw limited adoption. RDMA started getting noticed in 2010, when the InfiniBand Trade Association (IBTA) brought RDMA technology to popular Ethernet networks. RDMA over Converged Ethernet (RoCE) brings all the advantages of RDMA to existing Ethernet networks.
This move made RDMA popular, as it saved the massive capital expenditure of replacing Ethernet with InfiniBand. RoCE, however, was limited to a single Layer 2 domain. It finally gained routing capability in 2014, when the IBTA introduced RDMA over Converged Ethernet version 2 (RoCEv2). RoCEv2 enables routing by changing the packet encapsulation to include IP and UDP headers. With RoCEv2, RDMA technology can now be used across both L2 and L3 networks with multiple subnets, allowing efficient clustering for elastic, scale-out deployments.
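The encapsulation change above is what makes RoCEv2 routable: the InfiniBand Base Transport Header (BTH) rides inside a standard UDP datagram addressed to the IANA-assigned port 4791, so ordinary IP routers can forward it. The sketch below packs a minimal 12-byte BTH to illustrate the layering; the field layout follows the InfiniBand specification, but the opcode/QP values are illustrative only.

```python
import struct

ROCEV2_UDP_PORT = 4791  # IANA-assigned UDP destination port for RoCEv2

def build_bth(opcode, dest_qp, psn, pkey=0xFFFF):
    """Pack a minimal 12-byte InfiniBand Base Transport Header (BTH).

    Layout: opcode (1B), flags (1B), partition key (2B),
    reserved + 24-bit destination queue pair (4B),
    ack-request/reserved + 24-bit packet sequence number (4B).
    """
    return struct.pack("!BBHII", opcode, 0, pkey,
                       dest_qp & 0xFFFFFF, psn & 0xFFFFFF)

# Illustrative values: on the wire this BTH would sit after the
# Ethernet/IP/UDP headers, with dport=4791 marking it as RoCEv2.
bth = build_bth(opcode=0x04, dest_qp=0x12, psn=100)
print(len(bth))  # prints 12
```

Because the outer headers are plain IP/UDP, every hop only needs to understand IP routing; the RDMA semantics are carried opaquely in the BTH and payload.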
Why invest in testing RoCEv2 networks?
To help improve network efficiency, many organizations have started deploying RoCEv2 in their datacenters. In comparison to TCP/IP, RoCEv2 not only improves the performance of host machines by freeing up CPU resources, but also increases bandwidth availability and reduces network latency.
Switch fabric performance is key to achieving high bandwidth and extremely low latency in datacenters. Incorrect or non-optimized network settings in a large-scale datacenter can result in poor application or storage performance. To maximize the benefits of RoCEv2 deployments in a datacenter network, the underlying interconnect must be optimized by running realistic traffic and measuring the relevant network KPIs (such as throughput, end-to-end latency, frame loss, and network stability) under varying switch/network settings (buffer size, QoS settings).
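The KPIs named above can all be derived from per-packet transmit/receive records collected during a test run. The following is a minimal sketch under that assumption; the `(tx_time, rx_time, size)` record format and the `compute_kpis` helper are illustrative, not the API of any real test tool.

```python
def compute_kpis(records, tx_count, duration_s):
    """Derive throughput, frame loss, and tail latency from
    hypothetical per-packet (tx_time, rx_time, size_bytes) records."""
    rx_count = len(records)
    latencies = sorted(rx - tx for tx, rx, _ in records)
    bits_received = sum(size * 8 for _, _, size in records)
    return {
        "throughput_bps": bits_received / duration_s,
        "frame_loss_pct": 100.0 * (tx_count - rx_count) / tx_count,
        # Nearest-rank 99th percentile of end-to-end latency
        "p99_latency_s": latencies[int(0.99 * (len(latencies) - 1))],
    }

# Two 1024-byte frames, both delivered with 2 us latency, over a 1 s run
records = [(0.0, 2e-6, 1024), (1e-3, 1e-3 + 2e-6, 1024)]
kpis = compute_kpis(records, tx_count=2, duration_s=1.0)
```

Sweeping switch settings (buffer sizes, QoS configuration) and re-computing these KPIs for each run is what turns raw traffic generation into the optimization loop the text describes.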
Thus far, the need to test RoCEv2 network performance has been underrated, and testing was limited to functional validation of servers and Host Channel Adapters (HCAs). Now, the rapid influx of data into datacenters is propelling large-scale RoCEv2 deployments, and the need for network performance validation and optimization has finally started receiving due attention.
Testing the RoCEv2 switch fabric
Until recently, RoCEv2 testing was carried out in homegrown test beds, using physical servers or open-source solutions to validate functionality. While this worked well at small scale, it falls short of addressing the scale and efficiency requirements of a modern datacenter. Building a test bed with racks of servers is expensive and lacks the scalability to keep up with the growing performance demands of a network. Hundreds of physical servers in a setup are also difficult to manage effectively, and the test results are limited and not repeatable.
The right solution for testing a RoCEv2 switch fabric should be able to emulate a real network at high scale, replacing the need for racks of servers in the test bed and reducing costs significantly. To achieve maximum performance, the switch or network should be stressed to its limits. To do that, the test solution should be able to generate RoCEv2 traffic at line rate consistently and realistically, with a mix of bursty and continuous traffic flows, to validate per-queue-pair (QP) congestion control and verify lossless packet delivery with Priority Flow Control (PFC). It is also important to provide relevant measurements and repeatable tests. Most importantly, the solution should scale up easily to keep up with the incremental demand for high performance.
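Per-QP congestion control in RoCEv2 networks is commonly handled by DCQCN-style ECN-based rate control: when the receiver reflects congestion back as a Congestion Notification Packet (CNP), the sender cuts that queue pair's rate, then gradually recovers. The sketch below captures that cut-and-recover shape; the constants and class are illustrative, not the parameters a real NIC uses.

```python
class QueuePairRate:
    """Hedged sketch of DCQCN-style per-QP rate control for RoCEv2.
    Constants (g, halving factor) are illustrative assumptions."""

    def __init__(self, line_rate_gbps):
        self.rate = self.target = line_rate_gbps
        self.alpha = 1.0       # estimate of congestion severity
        self.g = 1.0 / 16      # smoothing gain for alpha updates

    def on_cnp(self):
        """Congestion Notification Packet received: cut the sending rate."""
        self.target = self.rate
        self.rate *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_recovery_timer(self):
        """No congestion seen this period: decay alpha, recover rate."""
        self.alpha = (1 - self.g) * self.alpha
        self.rate = (self.rate + self.target) / 2  # fast recovery step

qp = QueuePairRate(line_rate_gbps=100.0)
qp.on_cnp()                    # rate halves to 50 Gbps (alpha = 1)
for _ in range(5):
    qp.on_recovery_timer()     # rate climbs back toward 100 Gbps
```

A test solution that generates bursty line-rate traffic exercises exactly this loop: it must trigger CNPs (or PFC pauses) under congestion and then verify that per-QP rates recover without packet loss.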
A highly scalable RoCEv2 storage fabric test solution
There is growing demand to accurately measure the performance of RoCEv2 switch fabrics in datacenters. To address this, Spirent introduced an innovative solution with its high-density, multi-speed FX3/MX3 test modules. These test modules are already proven in the industry for performance and have a large install base.
RoCEv2 testing capability is offered on the FX3/MX3 high-density test modules via a firmware upgrade. RoCEv2 traffic generation and congestion control are built into hardware to ensure reliable traffic rates and low latency. The solution is highly scalable: a fully loaded chassis can generate up to 3.6 terabits per second of RoCEv2 traffic, and it offers the flexibility to run popular performance benchmarking methodologies (e.g., RFC 2544) in the same test setup.
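At its core, the RFC 2544 throughput benchmark mentioned above is a binary search for the highest offered load the device under test can forward with zero frame loss. The sketch below shows that search procedure; `run_trial` is a hypothetical stand-in for driving a real traffic generator and checking loss counters, not an actual API.

```python
def run_trial(rate_pct):
    """Hypothetical trial: offer traffic at rate_pct of line rate and
    report whether the run was lossless. Here we pretend the device
    under test starts dropping frames above 87% load."""
    return rate_pct <= 87.0

def rfc2544_throughput(resolution_pct=0.1):
    """Binary-search for the highest lossless load (RFC 2544 style)."""
    lo, hi = 0.0, 100.0
    while hi - lo > resolution_pct:
        mid = (lo + hi) / 2
        if run_trial(mid):
            lo = mid   # lossless at mid: push the rate up
        else:
            hi = mid   # loss observed: back off
    return lo          # highest rate proven lossless

best = rfc2544_throughput()  # converges to just under 87% line rate
```

In practice each trial runs for a fixed duration per frame size, and the reported throughput is the final lossless rate for each size; the search logic itself is exactly this halving loop.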
Learn more about the industry’s highest-density test solution for RoCEv2.