A great deal of attention has historically been paid to Ethernet speed transitions in the market. Networking vendors, consumers, and industry analysts closely follow these transitions, because they can trigger new technology buying cycles and periods of rapid change in the Ethernet performance-cost curve.
The rise of cloud computing and scale-out data centers has driven the latest Ethernet speed transitions, evidenced by the explosive growth in server-facing 10-Gbit/s ports this decade, and more recently the breakout in 40-Gbit/s Ethernet deployment — particularly in the leaf-to-spine layer of the data center — to an expected 2.5 million-plus ports in 2014. As big data becomes bigger, virtual machines grow in number, and cloud workloads become more demanding, it is expected that the largest cloud operators will soon shift to 100-Gbit/s Ethernet fabrics for the spine layer of their networks.
But what happens to the server- and storage-facing Ethernet downlinks when leaf-to-spine optical links migrate to 100-Gbit/s Ethernet and CPU/storage endpoints demand greater than 10-Gbit/s network connections? These downlinks represent the largest number of cables deployed in mega-scale data centers (MSDCs), where cabling costs dominate.
The IEEE 802.3 standard defines 40-Gbit/s Ethernet as the next higher link speed after 10-Gbit/s Ethernet, but the current standard uses four physical lanes running at 10 Gbit/s to enable communication between link partners. That’s four times the number of physical channels on a server’s network interface controller (NIC), four times the amount of copper wiring in the cables to the top-of-rack (ToR) switch, and four times the number of serializer/deserializer (serdes) lanes consumed on the switch. The MSDC operator gets four times the performance of 10-Gbit/s Ethernet, but pays for it with four times the number of physical connectivity elements. Moreover, if data center workloads are just exceeding the 10-Gbit/s threshold per endpoint, 40 Gbit/s may be overkill given the interconnect tax it imposes.
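A quick back-of-the-envelope tally, using only the lane and twinax-pair counts cited above, makes the point: 40-Gbit/s Ethernet consumes four times the physical elements of 10-Gbit/s Ethernet without improving the bandwidth delivered per lane or per copper pair. The short Python sketch below is purely illustrative.

```python
# Illustrative tally (not vendor data): physical interconnect elements
# consumed per server-facing link for 10GbE vs. 40GbE.
links = {
    # name: (speed in Gbit/s, serdes lanes per link, twinax pairs per DAC cable)
    "10GbE SFP+":  (10, 1, 2),
    "40GbE QSFP+": (40, 4, 8),
}

for name, (gbps, lanes, pairs) in links.items():
    print(f"{name}: {lanes} lane(s), {pairs} copper pair(s); "
          f"{gbps / lanes:.0f} Gbit/s per lane, {gbps / pairs:.1f} Gbit/s per pair")
```

Both links work out to 10 Gbit/s per lane and 5 Gbit/s per pair, which is exactly the problem: the faster link buys no efficiency per physical element.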
The Case for 25G and 50G Ethernet
With chip technologies evolving to support 100-Gbit/s Ethernet, a more cost-effective solution exists for in-rack connectivity using fractional components of the 100-Gbit/s link itself. Like 40-Gbit/s Ethernet, the 100-Gbit/s Ethernet standards (CAUI-4, 100GBASE-CR4, 100GBASE-SR4) use four physical lanes to enable communication between link partners, but those four lanes run much faster, at 25 Gbit/s each. Applying the same per-lane breakout principles at 100 Gbit/s, one or more 25-Gbit/s Ethernet ports can be used to connect server/storage endpoints to the ToR switch.
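The breakout principle can be summarized with a simple sketch. The lane counts follow the four-lane, 25-Gbit/s-per-lane structure described above; the mode names and Python code are just an illustration of how those lanes can be carved into ports.

```python
# Sketch of the per-lane breakout idea (simplified): a 100-Gbit/s port group
# is four 25-Gbit/s serdes lanes, which can be carved into smaller Ethernet
# ports without changing the underlying electrical lanes.
LANE_GBPS = 25
LANES_PER_GROUP = 4

breakout_modes = {
    "1x100G": 1,   # all four lanes bonded into one 100GbE port
    "2x50G":  2,   # two ports of two lanes each
    "4x25G":  4,   # four single-lane ports, one per server/storage endpoint
}

for mode, ports in breakout_modes.items():
    lanes_per_port = LANES_PER_GROUP // ports
    print(f"{mode}: {ports} port(s) x {lanes_per_port * LANE_GBPS} Gbit/s "
          f"({lanes_per_port} lane(s) each)")
```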
A new 25-Gbit/s Ethernet standard is exactly what cloud networking companies including Broadcom, Arista Networks, Google, Microsoft, and Mellanox Technologies are advancing as part of a new industry organization whose mission is to improve Ethernet performance and cable plant efficiency in large-scale data centers.
The 25 Gigabit Ethernet Consortium proposes a new Ethernet specification at 25 Gbit/s and 50 Gbit/s that leverages the advent of 25-Gbit/s silicon technology, as well as existing IEEE standards at 10, 40, and 100 Gbit/s, to define new high-speed downlink ports between the ToR and NIC. The 25/50G specification enables the most cost-efficient scale-up of server and storage bandwidth beyond 10 Gbit/s, while mating optimally with 100-Gbit/s uplinks to the rest of the data center network. The Consortium, founded earlier this month and already on track to double its membership, has opened its 25-Gbit/s and 50-Gbit/s Ethernet specification to all data center ecosystem vendors and consumers who join, allowing anyone to create and deploy compliant, interoperable implementations royalty-free.
This group of influential cloud technology players is seeking not to redefine Ethernet, but rather to extrapolate from existing Ethernet standards to address a key performance/cost optimization point for rack-level interconnect – and in the process offer the industry a more compelling roadmap for network speeds and feeds in the data center.
In terms of numbers, the 25-Gbit/s and 50-Gbit/s Ethernet links defined by the 25 Gigabit Ethernet Consortium’s specification provide 2.5 times the performance per serdes lane and per twinax copper wire of existing 10-Gbit/s or 40-Gbit/s Ethernet connections. This can equate to more than a 50 percent savings in rack interconnect cost per unit of bandwidth, with a significant impact on an MSDC operator’s bottom line. It also increases network scale (sometimes called radix) and accommodates higher server density within the rack than is currently achievable with 40-Gbit/s Ethernet server-to-ToR links.
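One way to read the 50 percent figure: if a 25-Gbit/s lane or cable carries 2.5 times the bandwidth of its 10-Gbit/s counterpart, savings per unit of bandwidth exceed 50 percent as long as the per-element cost grows by less than 25 percent. The cost ratio in the sketch below is a hypothetical placeholder for illustration, not published pricing.

```python
# Rough sanity check on the "more than 50 percent savings per unit of
# bandwidth" claim. The cost ratio below is a hypothetical placeholder:
# it assumes a 25G lane/cable costs 1.2x its 10G counterpart while
# carrying 2.5x the bandwidth.
speedup = 25 / 10                 # per-lane bandwidth gain: 2.5x
cost_ratio_25g_vs_10g = 1.2       # hypothetical relative cost per lane/cable

cost_per_gbps_relative = cost_ratio_25g_vs_10g / speedup
savings = 1 - cost_per_gbps_relative
print(f"Relative cost per Gbit/s: {cost_per_gbps_relative:.2f} "
      f"-> savings of {savings:.0%} per unit of bandwidth")
# Any per-element cost ratio below 1.25x yields savings above 50 percent.
```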
Faster Performance, Higher Radix, Same Inexpensive Copper Cables
The specification prescribes a 25-Gbit/s Ethernet link using a single switch/NIC serdes lane to provide 2.5 times the bandwidth of a 10-Gbit/s link over the same number (two) of twinax copper pairs used today in SFP+ direct-attach copper (DAC) cables. A 50-Gbit/s Ethernet link uses two switch/NIC serdes lanes running at 25 Gbit/s to deliver 25 percent more bandwidth than a 40-Gbit/s Ethernet link while consuming half the number (four) of twinax copper pairs used in current QSFP+ DAC cabling. The switch and NIC silicon runs faster and consumes slightly more power at 25 Gbit/s per lane, but when normalized for bandwidth, the cost and power advantage is expected to be substantial.
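Putting those pair counts side by side shows where the per-wire gain comes from. The figures are taken directly from the cable descriptions above; the comparison is illustrative rather than exhaustive.

```python
# Copper-pair accounting for the DAC options described above (illustrative).
dac_links = {
    # name: (Gbit/s, twinax pairs in the cable)
    "10GbE SFP+ DAC":     (10, 2),
    "25GbE SFP28 DAC":    (25, 2),  # same two pairs, 2.5x the bandwidth
    "40GbE QSFP+ DAC":    (40, 8),
    "50GbE two-lane DAC": (50, 4),  # half the pairs of QSFP+, 25% more bandwidth
}

for name, (gbps, pairs) in dac_links.items():
    print(f"{name}: {pairs} pairs, {gbps / pairs:.1f} Gbit/s per copper pair")
```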
At the rack endpoint, next-generation NICs based on 25-Gbit/s-capable silicon would present single or dual 25- or 50-Gbit/s Ethernet ports connecting the server or storage node to the ToR switch. On the ToR front panel, compact SFP28-style connectors and 3 m DAC cables can be used for 25-Gbit/s downlinks, along with QSFP28 ports for uplinks. Alternatively, QSFP28 4x25G interfaces can be placed across the entire switch front panel, so that the assignment of server downlinks and 100-Gbit/s network uplinks is fully configurable by the user, and DAC breakout cabling can flexibly deliver 25-Gbit/s (one-lane) or 50-Gbit/s (two-lane) Ethernet ports to the endpoints. The cabling topologies are fundamentally the same ones used today with high-density SFP+ and QSFP+ equipped switches. Since 25-Gbit/s and 50-Gbit/s Ethernet can use inexpensive two- or four-pair twinax copper cables for rack-level interconnect, operators achieve the most advantageous cabling economics.
There is another benefit to QSFP28 4x25G interfaces that can break out to industry-standardized 25-Gbit/s and 50-Gbit/s Ethernet ports: the ToR switch can maximize port density within the space constraints of a 1U front panel. Today’s deployed 1U ToR switches with all-pluggable QSFP+ cages max out at 32 ports of 40-Gbit/s Ethernet. Migrating these interfaces to QSFP28 with per-lane breakout capability will support higher 25-Gbit/s and 50-Gbit/s Ethernet port densities on a 1U ToR faceplate, enable connectivity to the highest rack server densities being deployed today, and maximize the radix of the network in both the uplink and downlink directions.
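As a rough faceplate exercise, consider a 1U switch with 32 QSFP28 cages, the same physical count as today's 32-port QSFP+ ToRs. The eight-uplink split below is an assumed example, not a prescribed design; it simply shows how per-lane breakout multiplies downlink radix.

```python
# Illustrative 1U faceplate arithmetic for a switch with 32 QSFP28 cages.
# The uplink/downlink split is an assumed example configuration.
CAGES = 32
LANES_PER_CAGE, LANE_GBPS = 4, 25

uplink_cages = 8                            # assumed 100GbE spine uplinks
downlink_cages = CAGES - uplink_cages       # broken out toward servers

downlinks_25g = downlink_cages * LANES_PER_CAGE        # 4x25G per cage
downlinks_50g = downlink_cages * LANES_PER_CAGE // 2   # 2x50G per cage

print(f"{uplink_cages} x 100GbE uplinks "
      f"({uplink_cages * LANES_PER_CAGE * LANE_GBPS / 1000:.1f} Tbit/s up)")
print(f"{downlinks_25g} x 25GbE or {downlinks_50g} x 50GbE downlinks "
      f"({downlink_cages * LANES_PER_CAGE * LANE_GBPS / 1000:.1f} Tbit/s down)")
```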
The MSDC operators, who crunch some of the greatest volumes of data traffic on the Internet, build out their networks according to carefully planned technology roadmaps that scale optimally in the dimensions of both performance and capex/opex conservation. The founding members of the 25G Ethernet Consortium believe that a departure from the previously established 10-, 40-, and 100-Gbit/s Ethernet interconnect roadmap is warranted for the access layer of the data center, and are working to enable widespread 25-Gbit/s and 50-Gbit/s Ethernet port deployment to go hand-in-hand with the rollout of 100-Gbit/s Ethernet for cloud fabrics. This could very well trigger the next inflection point in the Ethernet performance-cost curve.