SAN JOSE, California — A recurring topic during the keynotes of the Open Compute Project (OCP) summit this week was the massive increase in east-west traffic within data centers. To cope with this traffic, hyperscale data center operators are plotting their moves from 100 Gb/sec Ethernet to 400G. But that’s going to take a while. To deal with the traffic in the shorter term, Facebook developed a distributed network system called the Fabric Aggregator.
According to Omar Baldonado, director of software engineering at Facebook, who spoke at the OCP summit yesterday, Facebook’s external traffic is dwarfed by its internal east-west traffic.
All of the social media company’s traffic, whether east-west or north-south, is handled by the fabric aggregation layer, according to a Facebook blog post. But traffic growth is putting pressure on that layer in terms of port density and capacity per node.
“There are multiple locations in the network where we process data, but the majority of the workload we handle is in the data center,” said Sree Sankar, a technical product manager with Facebook. “There is lots of interaction between compute and servers in the data center. And there are multiple buildings in a region, and each building has several network fabrics. These need to talk to each other through a fabric aggregation layer, comprised of several aggregation nodes.”
Sankar said the company has seen a tremendous increase in east-west traffic. “We needed at least three times more capacity,” she said. “We were already using the largest switch, so there was nothing we could do to solve this problem. We had to innovate.”
To keep up with its traffic growth, the company designed the Fabric Aggregator as a replacement for a general-purpose network chassis. The Fabric Aggregator is a distributed network system made up of simple building blocks. It stacks together multiple Wedge 100S switches, the same switch Facebook already uses, and runs the Facebook Open Switching System (FBOSS) on top. The company also developed four backplane cabling options to emulate the backplane of a classic chassis. Its specifications for all of the backplane options have been submitted to OCP.
The Fabric Aggregator runs Border Gateway Protocol (BGP) between all subswitches with no central controller. “Each subswitch operates independently, sending and receiving traffic without any interaction or dependency on other subswitches in the node,” according to the blog. “With this approach, we can scale the capacity, interchange the building blocks, and change the cable assembly quickly as our requirements change.”
“The key design criteria was flexibility,” said Sankar. “We needed one solution that could scale up and out to satisfy needs in our data centers. And we needed to solve for it in a very short period of time. We have deployed this in our data center regions over the last nine months. It redefines the way we handle network capacity.”
400G and Beyond
At the OCP Summit, Andy Bechtolsheim, chief development officer at Arista Networks, talked about the ramp-up to 400G for data centers. He said there is a vast amount of network traffic between data center servers. “The easiest way to go faster is to take advantage of Ethernet,” he said.
In its most recent earnings call with investors, Arista CEO Jayshree Ullal said, “400 gig is going to be very important in certain use cases, and you can expect Arista is working very hard at it. The mainstream 400 gig market is going to take multiple years. I believe initial trials will be in 2019. But the mainstream market will be even later. And just because 400 gig comes, by the way, doesn’t mean 100 gig goes away. They’re really going to be in tandem.”
But even as 400G is being developed, some futurists such as Bechtolsheim are already eyeing the next thing. “Today, almost all optics plug into the front of the switch,” he said. “What would happen if you could put the optics on the switch itself?” He’d like to see some standards for this development. “OCP can play a major role promoting optics standards that are good for cloud networks,” he said.
And Facebook’s Baldonado concurred: “Co-packaged optics is a solution we strongly believe in.”