RDMA over Converged Ethernet (RoCE)
I didn’t know anything about RoCE until a few weeks ago, when a sales engineer told me about this technology. It’s amazing. These days I’m studying how to configure RoCE, and I will end up installing and deploying it. However, I’ve realised that RoCE relies on the Data Center Bridging (DCB) standard, which includes features such as Priority-based Flow Control (PFC), Enhanced Transmission Selection (ETS), the Data Center Bridging Capabilities Exchange Protocol (DCBX) and Congestion Notification, all of them useful for RoCE.
If we want to understand RoCE, we should first know about InfiniBand. The first time I heard about InfiniBand was two or three years ago, when Ariadnex worked for CenitS on a supercomputing project. They have 14 Mellanox SX6036 InfiniBand switches with 36 FDR ports at 56 Gbps, and 3 Mellanox IS5030 InfiniBand switches with 36 QDR ports at 40 Gbps for the computing network. This is why most InfiniBand networks are found in High-Performance Computing (HPC) systems: HPC systems require very high throughput and very low latency.
CenitS Lusitania II
RoCE stands for RDMA over Converged Ethernet, and RDMA stands for Remote Direct Memory Access. RDMA used to be known mainly in the InfiniBand community but, lately, it has become increasingly popular because we can also enable it over Ethernet networks, which is a great advantage: we get high throughput and low latency on the networks we already have. Thanks to RDMA over Converged Ethernet (RoCE), servers can move data directly from the memory of the source application to the memory of the destination application, bypassing the kernel’s network stack, which considerably increases network performance.
RDMA over Converged Ethernet (RoCE)
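For those who like to see this from the developer’s side, here is a small sketch of the very first steps an RDMA application takes with the libibverbs API (part of rdma-core). It only opens an adapter and registers a buffer; the point is that, once registered, the NIC can move data in and out of that buffer directly, without copying it through the kernel. Take it as an illustrative sketch, assuming an RDMA-capable adapter and the libibverbs headers are installed, not as a complete RoCE program.

/*
 * Minimal sketch: open an RDMA device and register an application buffer so
 * the adapter can read/write it directly (zero-copy).
 * Build (assumption): gcc roce_mr.c -libverbs -o roce_mr
 */
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices = 0;
    struct ibv_device **dev_list = ibv_get_device_list(&num_devices);
    if (!dev_list || num_devices == 0) {
        fprintf(stderr, "No RDMA devices found\n");
        return 1;
    }

    /* Open the first RDMA device (InfiniBand, RoCE or iWARP adapter). */
    struct ibv_context *ctx = ibv_open_device(dev_list[0]);
    if (!ctx) {
        fprintf(stderr, "Cannot open device %s\n",
                ibv_get_device_name(dev_list[0]));
        return 1;
    }

    /* A protection domain groups the resources the adapter may access. */
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* Register a plain application buffer: after this call the adapter can
     * DMA into and out of it directly, with no copy through the kernel. */
    size_t len = 4096;
    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);
    if (!mr) {
        fprintf(stderr, "Memory registration failed\n");
        return 1;
    }

    printf("Registered %zu bytes on %s: lkey=0x%x rkey=0x%x\n",
           len, ibv_get_device_name(dev_list[0]), mr->lkey, mr->rkey);

    /* Clean up. */
    ibv_dereg_mr(mr);
    free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(dev_list);
    return 0;
}

A real application would then create queue pairs and exchange the rkey with the remote side so it can perform RDMA reads and writes against that buffer, but the registration step above is where the “direct memory access” begins.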
Clustering, Hyper-Converged Infrastructure (HCI) and storage solutions can benefit from the performance improvements provided by RoCE. For instance, Hyper-V deployments can use SMB 3.0 with the SMB Direct feature, which can be combined with RoCE adapters for fast and efficient storage access, minimal CPU utilization for I/O processing, and high throughput with low latency. What’s more, the iSCSI Extensions for RDMA (iSER) and NFS over RDMA can increase I/O operations per second (IOPS), lower latency and reduce client and server CPU consumption.
RDMA support in vSphere
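SMB Direct, iSER and NFS over RDMA live inside the operating system, so we don’t program them directly, but the reason they save so much CPU is the RDMA completion model: the adapter moves the data on its own and the CPU only wakes up when an operation finishes. The fragment below is only an illustration of that model with libibverbs, assuming a device context ctx already opened as in the previous sketch.

/*
 * Illustration of the RDMA completion model: sleep on a completion channel
 * until the adapter reports that an I/O operation has finished, instead of
 * copying every byte on the CPU.
 */
#include <stdio.h>
#include <infiniband/verbs.h>

int wait_for_one_completion(struct ibv_context *ctx)
{
    /* A completion channel lets the process sleep until the adapter
     * signals a completion event. */
    struct ibv_comp_channel *channel = ibv_create_comp_channel(ctx);
    struct ibv_cq *cq = ibv_create_cq(ctx, 16, NULL, channel, 0);
    if (!channel || !cq)
        return -1;

    /* Ask to be notified about the next completion, then block. */
    ibv_req_notify_cq(cq, 0);

    struct ibv_cq *ev_cq;
    void *ev_ctx;
    if (ibv_get_cq_event(channel, &ev_cq, &ev_ctx))   /* sleeps here */
        return -1;
    ibv_ack_cq_events(ev_cq, 1);

    /* Drain the completion queue; the data itself was already placed in
     * the registered buffers by the adapter, with no CPU copy. */
    struct ibv_wc wc;
    while (ibv_poll_cq(ev_cq, 1, &wc) > 0)
        printf("work request %lu completed with status %s\n",
               (unsigned long)wc.wr_id, ibv_wc_status_str(wc.status));

    ibv_destroy_cq(cq);
    ibv_destroy_comp_channel(channel);
    return 0;
}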
In addition to RoCE and InfiniBand, the Internet Wide Area RDMA Protocol (iWARP) is another option for high throughput and low latency. However, this protocol is less used than RoCE and InfiniBand. In fact, iWARP is no longer supported in new Intel NICs, and the latest Ethernet speeds of 25, 50 and 100 Gbps are not available for iWARP. iWARP uses TCP/IP to deliver a reliable service, while RoCE (in its RoCEv2 version) runs over UDP/IP and relies on DCB for flow and congestion control. Furthermore, I think it’s important to highlight that these technologies are not compatible with each other: iWARP adapters can only communicate with iWARP adapters, RoCE adapters with RoCE adapters, and InfiniBand adapters with InfiniBand adapters. Thus, if there is an interoperability conflict, applications will fall back to TCP without the benefits of RDMA.
RoCE and iWARP Comparison
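If you want to check which family of adapter a server has, the verbs API exposes the link layer of every port. The sketch below, again assuming libibverbs is installed, simply prints whether each port is InfiniBand or Ethernet; an Ethernet port belongs to a RoCE or iWARP adapter and, as said above, it will never interoperate over RDMA with the other families.

/*
 * Sketch: list local RDMA devices and print the link layer of each port,
 * which tells us whether the adapter is InfiniBand or Ethernet (RoCE/iWARP).
 */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices = 0;
    struct ibv_device **dev_list = ibv_get_device_list(&num_devices);

    for (int i = 0; i < num_devices; i++) {
        struct ibv_context *ctx = ibv_open_device(dev_list[i]);
        if (!ctx)
            continue;

        struct ibv_device_attr dev_attr;
        ibv_query_device(ctx, &dev_attr);

        /* Ports are numbered starting at 1. */
        for (int port = 1; port <= dev_attr.phys_port_cnt; port++) {
            struct ibv_port_attr port_attr;
            if (ibv_query_port(ctx, port, &port_attr))
                continue;

            const char *link =
                port_attr.link_layer == IBV_LINK_LAYER_ETHERNET
                    ? "Ethernet (RoCE/iWARP)"
                    : port_attr.link_layer == IBV_LINK_LAYER_INFINIBAND
                        ? "InfiniBand"
                        : "unspecified";
            printf("%s port %d: link layer %s\n",
                   ibv_get_device_name(dev_list[i]), port, link);
        }
        ibv_close_device(ctx);
    }
    ibv_free_device_list(dev_list);
    return 0;
}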
To
sum up, RDMA was only used for High-Performance Computing (HPC)
systems with
InfiniBand networks but thanks to converged Ethernet networks, and
protocols such as RoCE and iWARP, today, we can also
install clusters,
Hyper-Convergence Infrastructures
(HCI) and storage solutions with high throughput and low latency in
the traditional Ethernet network.
Keep reading and keep studying!!