RoCE (RDMA over Converged Ethernet) raises many questions once practical deployment issues and limitations are encountered, and the answers are almost always a cause for concern to potential users. The standard for RDMA over Ethernet is iWARP (Internet Wide Area RDMA Protocol), which uses the familiar TCP/IP stack as its foundation. High-performance iWARP implementations are available and compete directly with InfiniBand in real application benchmarks. iWARP:
- allows use of existing hardware.
- lives alongside existing applications.
- uses existing management and monitoring tools.
- is supported in the same OpenFabrics Enterprise Distribution as InfiniBand (IB) for Linux, and is similarly available on Windows and BSD systems, as a drop-in Ethernet replacement for IB.
The truth about the capabilities and limitations of RoCE hasn’t exactly been forthcoming, leaving customers interested in iWARP with many unanswered questions. Answers to those frequently asked questions, along with a comparison of RoCE vs. iWARP, are compiled and explained here:
Is RoCE the standard RDMA over Ethernet protocol?
NO – The IETF standard for RDMA is iWARP. It provides the same host interface as InfiniBand and is available in the same OpenFabrics Enterprise Distribution (OFED).
Does RoCE lower CPU utilization?
YES – However, applications get lower CPU utilization because of RDMA, not RoCE. Using iWARP provides the same benefit.
Does RoCE reduce memory copies?
YES – Again, zero copy is a benefit of RDMA, also provided by iWARP.
Does RoCE allow user-space I/O?
YES – Similarly to zero copy, user-space I/O is a benefit of RDMA, also provided by iWARP.
Is RoCE an alternative to InfiniBand?
NO – Although RoCE gives good micro-benchmark results, it lacks critical pieces of the IB stack and is neither scalable nor competitive as an Ethernet solution.
Is RoCE more efficient than iWARP?
NO – Both protocols have similar header sizes, and hardware TCP/IP implementations provide performance similar to InfiniBand, without all the limitations.
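The header-size comparison can be checked with simple arithmetic. The sketch below is illustrative only: the field sizes are assumptions taken from the public InfiniBand and IETF RDMA specifications rather than from this FAQ, it assumes IPv4 and TCP without options and MPA markers disabled, and the names `ROCE`, `IWARP`, `overhead` and `efficiency` are hypothetical.

```python
# Per-packet header overhead, in bytes, for one RDMA Write segment.
# All field sizes below are assumptions, not figures from this FAQ.
ETHERNET = 14 + 4  # MAC header + frame check sequence

ROCE = {
    "GRH": 40,   # InfiniBand global route header
    "BTH": 12,   # base transport header
    "RETH": 16,  # RDMA extended transport header (first packet of a write)
    "ICRC": 4,   # invariant CRC
}

IWARP = {
    "IPv4": 20,       # no IP options
    "TCP": 20,        # no TCP options
    "MPA": 2 + 4,     # length field + CRC, markers disabled
    "DDP/RDMAP": 14,  # tagged-buffer header
}


def overhead(layers):
    """Total header and trailer bytes added around the payload."""
    return ETHERNET + sum(layers.values())


def efficiency(layers, payload):
    """Fraction of the bytes on the wire that carry application payload."""
    return payload / (payload + overhead(layers))


if __name__ == "__main__":
    for name, layers in (("RoCE", ROCE), ("iWARP", IWARP)):
        print(name, overhead(layers), round(efficiency(layers, 1024), 3))
```

Under these assumed sizes the totals are 90 bytes for RoCE and 78 bytes for iWARP, so neither protocol has a meaningful wire-efficiency advantage at typical payload sizes.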
Is RoCE easy to deploy and use?
NO – Unlike iWARP, RoCE requires a complicated layer-2 configuration for lossless operation, and has been found to be very difficult to deploy, even by experienced IT staff.
Does RoCE take advantage of Ethernet economies of scale?
NO – Unlike iWARP, RoCE does not operate with standard switches, and requires more expensive DCB-capable (Data Center Bridging) switches.
Does RoCE inter-operate with switches from different vendors?
NO – RoCE does not work with non-DCB switches, and depends on configuring Priority Flow Control consistently throughout the network, which adds many inter-operability challenges.
Can RoCE share a channel with other traffic?
NO – RoCE is very sensitive to packet drop and requires a dedicated priority channel for its traffic.
Does making QoS configuration changes to a switch affect RoCE operation?
YES – If a switch is configured to treat QoS classes differently than expected, it may easily result in the collapse of RoCE performance.
Does RoCE restrict QoS traffic marking configuration?
YES – Any switch configured to re-mark traffic priority can break down the uniformity of RoCE frame treatment in the network, and result in dismal performance.
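The sensitivity to QoS configuration and re-marking described above can be pictured as an end-to-end consistency check. The following is a minimal sketch under stated assumptions: the `Hop` data model and the `check_path` helper are hypothetical, do not correspond to any switch vendor's API, and simply follow one traffic priority along a path of switches.

```python
from dataclasses import dataclass, field


@dataclass
class Hop:
    """Hypothetical description of one switch on the path."""
    name: str
    pfc_priorities: frozenset  # priorities with Priority Flow Control enabled
    remark: dict = field(default_factory=dict)  # ingress priority -> egress priority


def check_path(hops, roce_priority):
    """Return a list of problems for RoCE traffic sent at roce_priority."""
    problems = []
    priority = roce_priority
    for hop in hops:
        # The priority the frame arrives with must be lossless on this hop.
        if priority not in hop.pfc_priorities:
            problems.append(f"{hop.name}: priority {priority} is not lossless")
        # Any re-marking changes how every later hop treats the frame.
        new_priority = hop.remark.get(priority, priority)
        if new_priority != priority:
            problems.append(
                f"{hop.name}: re-marks priority {priority} to {new_priority}"
            )
        priority = new_priority
    return problems


if __name__ == "__main__":
    path = [
        Hop("leaf1", frozenset({3})),
        Hop("spine1", frozenset({3}), {3: 0}),  # re-marks the RoCE priority
        Hop("leaf2", frozenset({3})),
    ]
    for problem in check_path(path, roce_priority=3):
        print(problem)
```

In this sketch a single re-marking switch is enough to push the traffic onto a priority that the next hop does not treat as lossless, which is the failure mode the two answers above describe.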
Does RoCE scale?
NO – A RoCE network must have PAUSE enabled in all switches and end-stations, which limits the deployment scale of RoCE to a single hop at best.
Does RoCE real application performance match micro-benchmarks?
NO – Although RoCE may perform well in simple single hop scenarios, real application performance can fall short of expectations, particularly when network hotspots are involved.
Does RoCE operate over long distance links?
NO – PFC limits RoCE operation to a few hundred meters.
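One way to see why distance matters is to estimate the buffer a receiver must hold in reserve after it sends a PFC pause, since data keeps arriving until the pause takes effect at the far end. This is a rough sketch under stated assumptions, not a sizing rule: it assumes about 5 ns of propagation delay per meter, ignores interface and pause-response delays, and the function name `pfc_headroom_bytes` is hypothetical.

```python
PROPAGATION_NS_PER_M = 5.0  # assumed signal delay per meter of cable


def pfc_headroom_bytes(link_gbps, cable_m, max_frame_bytes=1522):
    """Approximate bytes that still arrive after a pause frame is sent.

    Counts the data in flight during one round trip, plus one maximum-size
    frame in each direction (the frame delaying the pause on the way out,
    and the frame the sender has already started transmitting).
    """
    round_trip_ns = 2 * cable_m * PROPAGATION_NS_PER_M
    in_flight_bytes = link_gbps * round_trip_ns / 8  # Gb/s * ns = bits
    return in_flight_bytes + 2 * max_frame_bytes


if __name__ == "__main__":
    for meters in (100, 1_000, 10_000):
        print(meters, "m:", pfc_headroom_bytes(10, meters), "bytes per priority")
```

The reserve grows linearly with both cable length and link speed, and it is needed per port and per lossless priority, so long links quickly exceed the buffering available in a typical switch.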
Does RoCE operate over WAN links or cross subnet boundaries?
NO – PFC does not operate beyond a subnet.
Is RoCE routable?
NO – Although RoCE may use IPv6-like addresses, RoCE does not use a standard IP header and cannot be routed by standard IP routers.
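The routing limitation shows up in the first field an IP-aware device inspects, the EtherType. The sketch below assumes the original RoCE encapsulation, which is carried under its own EtherType (0x8915) rather than the IPv4 or IPv6 values; the `classify` helper is hypothetical and only mimics the first decision an IP router or IP-based tool would make.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
ETHERTYPE_VLAN = 0x8100  # 802.1Q tag
ETHERTYPE_ROCE = 0x8915  # RoCE, as assumed in the lead-in


def classify(frame: bytes) -> str:
    """Classify a raw Ethernet frame by its EtherType."""
    # The EtherType follows the two 6-byte MAC addresses.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_VLAN:
        # Skip a single 802.1Q tag: the real EtherType is 4 bytes further on.
        (ethertype,) = struct.unpack_from("!H", frame, 16)
    if ethertype in (ETHERTYPE_IPV4, ETHERTYPE_IPV6):
        return "ip"
    if ethertype == ETHERTYPE_ROCE:
        return "roce"
    return "other"


if __name__ == "__main__":
    roce_frame = bytes(12) + struct.pack("!H", ETHERTYPE_ROCE) + bytes(46)
    ip_frame = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4) + bytes(46)
    print(classify(roce_frame), classify(ip_frame))
```

A device that forwards or inspects only frames classified as "ip" has no IP header to work with in the "roce" case, whatever addressing the payload carries.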
Can I use standard traffic management and monitoring tools for RoCE?
NO – Most traffic management and monitoring tools have been developed for IP applications. RoCE does not use IP and therefore is unrecognized by existing tools.
Can I configure RoCE congestion management to suit my environment?
NO – RoCE has no congestion management layer of its own and depends completely on PAUSE for operation.