THE HAGUE, Netherlands — Google’s network has five control systems, and in an SDN World Congress keynote Thursday, Vijoy Pandey, Google’s head of engineering for networking, explained why each one is needed.
It’s all meant to bring Google to a state the company calls Cloud 3.0, where applications are dropped into the cloud without regard for things like job placement or load balancing. Cloud 3.0 is what happens “when you blur the boundaries between servers and make it all one big compute resource,” Pandey said.
That’s in contrast to the cloud today, where dropping an application into the cloud means you have to do all the configuration and scheduling as well. “It’s very manual, and it’s very painstaking in some ways,” Pandey said.
Elements of Pandey’s talk, particularly the Cloud 3.0 pitch, have appeared in other talks. But given this venue, he was willing to get into details about the architecture, explaining how software-defined networking (SDN) plays a role at multiple levels of Google’s overall network.
In the data center, Google has two controllers. The first, an OpenFlow-based SDN implementation called Jupiter, was deployed in 2012. It uses a logically centralized (though physically distributed) controller, called Firepath, which speaks to Firepath agents that sit in every switch.
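Google hasn’t published Firepath itself, but the pattern Pandey described, a single logical brain computing routes and pushing them down to thin agents on every switch, can be sketched in a few lines. The Python below is purely illustrative; every class name and route in it is invented:

```python
# Sketch of a Firepath-style control pattern: a logically centralized
# controller gathers link state from lightweight agents on every switch,
# computes routes from the global view, and pushes the results back down.
# All names and routes are hypothetical; this shows the pattern, not
# Google's code.

from dataclasses import dataclass, field


@dataclass
class SwitchAgent:
    """Runs on each switch: reports local state, installs computed routes."""
    switch_id: str
    routes: dict = field(default_factory=dict)  # prefix -> next hop

    def report_links(self) -> dict:
        # A real agent would read port state and counters from hardware.
        return {"switch": self.switch_id, "links_up": True}

    def install(self, routes: dict) -> None:
        self.routes = routes  # program the forwarding table


class CentralController:
    """Logically centralized: one global view, one route computation."""

    def __init__(self, agents: list):
        self.agents = agents

    def run_once(self) -> None:
        # 1. Collect link state from every agent into one topology view.
        topology = [agent.report_links() for agent in self.agents]
        assert all(report["links_up"] for report in topology)
        # 2. Compute routes centrally (a stand-in for real shortest-path
        #    computation over the fabric graph).
        routes = {a.switch_id: {"10.0.0.0/8": "spine-1"} for a in self.agents}
        # 3. Push each switch's slice of the result to its agent.
        for agent in self.agents:
            agent.install(routes[agent.switch_id])


agents = [SwitchAgent(f"tor-{i}") for i in range(4)]
CentralController(agents).run_once()
```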
The network became more agile with the addition of network virtualization, based on a second controller named Andromeda (not the same as the operating system making headlines this month). It’s necessary because “in a physical infrastructure, you are limited in how you can interact with the network,” Pandey said.
“It also allows us to connect customer VMs [virtual machines] to Google services, like big data services,” he added.
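Pandey didn’t walk through Andromeda’s internals, but the core trick of network virtualization is a mapping layer: a tenant’s virtual address is translated to the physical host currently running the VM, which is what lets different customers reuse the same IP ranges and lets connectivity be rewired in software. A toy sketch of that lookup, with all tenants and addresses invented:

```python
# Toy sketch of the idea behind a virtualization controller like Andromeda:
# a mapping table translates (tenant, virtual IP) to the physical host that
# currently runs the VM, so tenant networks can overlap and be rewired in
# software. Purely illustrative; the real data plane is far richer.

VIRT_TO_PHYS = {
    # (tenant, virtual IP) -> physical host address
    ("tenant-a", "10.1.0.5"): "172.16.3.9",
    ("tenant-a", "10.1.0.6"): "172.16.7.2",
    ("tenant-b", "10.1.0.5"): "172.16.4.1",  # same virtual IP, other tenant
}


def encapsulate(tenant: str, dst_vip: str, payload: bytes) -> dict:
    """Wrap a tenant packet for delivery across the physical network."""
    physical_host = VIRT_TO_PHYS[(tenant, dst_vip)]
    return {"outer_dst": physical_host, "tenant": tenant,
            "inner_dst": dst_vip, "payload": payload}


packet = encapsulate("tenant-a", "10.1.0.6", b"hello")
print(packet["outer_dst"])  # 172.16.7.2 -- where that VM physically lives
```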
The other controllers are related to the wide-area network.
Controller No. 3 is associated with B4, the network that connects Google’s data centers. Google has been discussing B4 for a few years now, including the use of Google-designed switches to build the network, and it’s the oldest of the networks mentioned here, having been launched in 2010.
B4 uses OpenFlow to control the fabric that connects all the data centers. Bandwidth on the links between data centers is managed by a fourth controller, which appears to simply be called the TE controller. (This one doesn’t get a hip code name. “TE” stands for traffic engineering.) It uses policies to make decisions about handling traffic.
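He didn’t spell out the TE controller’s policies, but policy-driven traffic engineering of this kind generally means serving traffic classes in priority order against finite link capacity. A deliberately simplified, single-link sketch, with classes and numbers invented:

```python
# Simplified sketch of policy-based traffic engineering on one
# inter-datacenter link: serve higher-priority traffic classes first, then
# let lower classes take whatever capacity remains. Classes and numbers
# are invented for illustration.

LINK_CAPACITY_GBPS = 100.0

# (source, destination, traffic class, demand in Gbps); class 0 = highest
DEMANDS = [
    ("dc-1", "dc-2", 0, 40.0),  # user-facing traffic
    ("dc-1", "dc-2", 1, 50.0),  # batch copies
    ("dc-1", "dc-2", 2, 80.0),  # background replication
]


def allocate(demands, capacity):
    """Greedy priority allocation over a single link."""
    remaining = capacity
    grants = []
    for src, dst, cls, want in sorted(demands, key=lambda d: d[2]):
        got = min(want, remaining)
        remaining -= got
        grants.append((src, dst, cls, got))
    return grants


for src, dst, cls, got in allocate(DEMANDS, LINK_CAPACITY_GBPS):
    print(f"{src}->{dst} class {cls}: granted {got:.0f} Gbps")
# class 0 gets its full 40, class 1 its full 50, class 2 only the last 10
```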
Is that all? Of course not. “The one thing that we have to be careful about is: WAN links are very, very expensive, and we need to utilize them really, really well,” Pandey said.
That leads to the fifth controller, called BwE (pronounced “byoo-ee”), which stands for bandwidth enforcement. This comes into play because B4’s switches intentionally have shallow memory buffers. That is, they expect to be able to spew traffic onto the network quickly and don’t have space to “save” a traffic flow for a few moments.
BwE runs a centralized bandwidth-allocation algorithm, with decisions enforced at the host (meaning the server running an application) “because that’s where you actually have buffer space.” It can spot which users are stuck on bottlenecked links, and it can also permit the sharing of WAN bandwidth between users.
BwE also feeds the usage information it collects back into the TE controller, to further help with decisions related to B4 paths.
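The real BwE is hierarchical and considerably more elaborate than this, but its basic shape, a central arbiter computing per-user rates that hosts then enforce where the buffers are, can be illustrated with a classic max-min fair allocation (users and rates invented):

```python
# Sketch of the BwE pattern: a central arbiter computes per-user rates for
# a bottlenecked WAN link, and each host shapes its own traffic to the
# granted rate (the host is where the buffer space lives). Users and
# numbers are invented; the real system is hierarchical and policy-aware.

def max_min_fair(demands_gbps: dict, capacity_gbps: float) -> dict:
    """Classic max-min fairness: satisfy the smallest demands first."""
    allocation, remaining = {}, capacity_gbps
    pending = dict(demands_gbps)
    while pending:
        fair_share = remaining / len(pending)
        satisfied = {u: d for u, d in pending.items() if d <= fair_share}
        if not satisfied:
            # Everyone left wants more than the fair share: split evenly.
            allocation.update({u: fair_share for u in pending})
            break
        for user, demand in satisfied.items():
            allocation[user] = demand
            remaining -= demand
            del pending[user]
    return allocation


demands = {"user-a": 10.0, "user-b": 40.0, "user-c": 80.0}
print(max_min_fair(demands, capacity_gbps=90.0))
# {'user-a': 10.0, 'user-b': 40.0, 'user-c': 40.0}
# Hosts would then enforce these rates locally (e.g., with a token bucket)
# and report measured usage back upstream for traffic engineering.
```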
All of this adds up to “a server-to-server connection that is completely SDN-based,” he said. It’s the result of thinking of the network as a single product rather than a series of boxes, and it springs from a philosophy of thinking about that network globally.
“When you’re optimizing your systems, don’t optimize the WAN separate from the data center, separate from your virtualization story. Think how all of these can optimize end-to-end,” he said.