After all of the excitement over VMware’s Nicira acquisition (caused primarily by the price tag), we find that nobody, including VMware/Nicira, is the VMware of networking. VMware enabled server virtualization to take over the data center because it made it really easy to increase compute utilization by abstracting large numbers of servers/CPUs/cores into a shared pool of compute resources that could be dynamically allocated among changing workloads. There is still a huge need and corresponding opportunity for somebody to abstract the data center network fabric into a shared pool of fabric resources in a way that supports the virtualized servers running distributed applications. Nicira’s approach abstracts the physical network and specific topology out of existence. They completely abdicate all responsibility for the performance of the network (other than claiming that they do overlay encapsulation reasonably quickly), taking the position that it’s just connectivity beneath their overlay. This is wonderful for VMware because it means they don’t have to do any kind of specific partnering with network equipment manufacturers to tell their virtual networking and virtual data center story. They have historically been really bad at partnering with anybody, and the era of co-opetition with their frenemy Cisco has always been strained.
Nicira (and the other overlay solutions) can allow multiple “virtual networks” to share a common network foundation, with the only requirement being that the various servers have connectivity to send and receive packets. Most of the media and pundits seem to think this means that overlays magically make all networks sufficient to support virtualized and distributed computing applications. Quite the opposite: what this means is that there is no coupling between the overlay and the specific engineering and behavior of the underlying network, so communications performance will be crude and best achieved through massive overprovisioning. The overlay brings us right back to where we’ve always been: the best combination of networking boxes and network engineering will generate the best performance. The only difference is that there is no coupling between network engineering and the virtual networks riding above it. This makes it more important, not less so, that the switches and routers be highly optimized and efficient, and it provides a better environment for network equipment manufacturers to compete on the price and performance of their solutions rather than on partnerships and channel relationships with hypervisor vendors. Soon we will have an environment where you can buy your overlay network from any one of several vendors, and that story won’t provide any differentiation (though superior integration into the orchestration products may). In such an environment the differentiator for the cloud you build will be how good the physical network sitting under your overlay is, and possibly how well applications can tunnel through the overlay control plane to influence the behavior of the physical fabric resources allocated to them. If enterprise A builds a routed uni-path network with traditional routers, it will be sufficient for the communications needs of any overlay solution but vastly inferior in supporting I/O intensive workloads to the network enterprise B builds with the latest multipath switching and state of the art topology cleverness.
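To make the decoupling concrete, here is a minimal sketch of what an overlay has to know, assuming a toy controller that maps each tenant address to the IP of the hypervisor tunnel endpoint hosting it. The class, method and field names are hypothetical and do not describe Nicira’s actual design. Note what is absent: nothing in the mapping refers to paths, link speeds or topology, so the fabric underneath sees only endpoint-to-endpoint traffic.

```python
from dataclasses import dataclass, field


@dataclass
class OverlayController:
    """Toy mapping service: (tenant, inner address) -> tunnel endpoint IP.

    Hypothetical illustration only; a real controller keeps far richer state.
    """
    table: dict = field(default_factory=dict)

    def register(self, tenant: str, vm_addr: str, tunnel_endpoint: str) -> None:
        # Record which hypervisor tunnel endpoint currently hosts this tenant address.
        self.table[(tenant, vm_addr)] = tunnel_endpoint

    def encapsulate(self, tenant: str, src_vm: str, dst_vm: str, payload: bytes) -> dict:
        # The outer header carries only tunnel endpoint addresses.
        # The physical network forwards on these and never sees tenant addressing.
        return {
            "outer_src": self.table[(tenant, src_vm)],
            "outer_dst": self.table[(tenant, dst_vm)],
            "tenant": tenant,
            "inner_src": src_vm,
            "inner_dst": dst_vm,
            "payload": payload,
        }
```

Two tenants can register the same inner address (say 10.0.0.2) on different endpoints and the lookup keeps them apart, which is the sharing the overlay provides. Whether the packet between the two outer addresses crosses one hop or five, over a congested link or an idle one, is entirely the physical network’s business.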
Surprisingly, Cisco has rapidly figured this out. CTO Warrior’s recent comments represent the first noise in ages from the 800 pound gorilla that is (with respect to overlays) a straight interpretation of reality and not some kind of misdirection. She said in her August 2nd blog entry, “First, SDN, network virtualization and overlay networks (choose your favorite descriptor) are not going to commoditize the underlying networking infrastructure. These architectures actually place more demands on the core infrastructure to enable network virtualization securely, with high performance, at scale”. This conflates multiple unrelated concepts (SDN, network virtualization and overlays) but has a thread of truth. Network virtualization via overlay (one way of implementing it, and the one chosen by Nicira) does nothing to improve the performance of the data center network or to make it more tightly integrated with application I/O performance requirements. As a result, it does not commoditize the physical network implementation and actually increases dependency on a great solution. It also does nothing to resolve the debate about whether an SDN architecture is a better way to build a high performance, low cost, application-responsive network. There are strong indications that central control plane intelligence and dumb, fast forwarding devices will support data center networks with the best price/performance. If one calls that SDN, it will indeed commoditize the hardware and paint Cisco into a corner with respect to margins and keeping Wall Street happy. Warrior is correct in stating that overlays don’t diminish the opportunity to sell differentiated network boxes, but is either delusional or disingenuous in suggesting that SDN doesn’t have the potential to kill her cash cow.
So the VMware/Nicira acquisition changes nothing in the battle between traditional, vertically integrated, high margin networking and new world SDN solutions from a horizontally integrated vendor ecosystem. If Cisco can figure out how to build expensive big iron boxes that collectively produce a better data center network than emerging SDN solutions, they will have an opportunity to continue business as usual. If one or more competitors figure out how to build a better fabric with cheap hardware and smart software, Cisco and some other incumbent vendors are in deep trouble. Right now all of the major vendors are pushing Ethernet fabrics based on proprietary solutions, mostly close relatives of TRILL. This is resulting in a fairly homogeneous, undifferentiated Ethernet fabric switch market segment. It is hard to see how choosing one of these solutions (QFabric, FabricPath, VCS, etc.) is a better decision than choosing any other. I’m waiting to see one of these suppliers do something really innovative to differentiate their fabric solution. This might be better wiring, topology or forwarding logic, or it might be better operational automation or application integration. Today all of the solutions are based on scaling a 50-year-old topology (Clos/fat-tree descendants) to modest scale by cloud standards (none scale beyond twenty thousand or so fabric ports). If and when somebody can establish their fabric as a superior foundation to sit under the soon-to-be-ubiquitous overlay networks, they will have a huge competitive advantage and cause the rest of the market to respond. Where is a fabric product with an innovative new topology, greater scale, lower latency, lower cost, higher threshold for packet loss or something truly meaningful?
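For a rough sense of why Clos-style fabrics land at port counts of this order, the sketch below computes the size of the textbook three-tier k-ary fat-tree built entirely from identical k-port switches, which supports k³/4 host ports. This is the academic construction, used here as an assumption for illustration; it is not a description of how QFabric, FabricPath or VCS are actually built, and the function name is hypothetical.

```python
def fat_tree_capacity(k: int) -> dict:
    """Sizes of a three-tier k-ary fat-tree built from identical k-port switches.

    Textbook construction: k pods, each with k/2 edge and k/2 aggregation
    switches, plus (k/2)^2 core switches. Illustrative only.
    """
    if k < 2 or k % 2:
        raise ValueError("k must be an even number >= 2")
    return {
        "hosts": k ** 3 // 4,                # k pods * (k/2 edge switches) * (k/2 hosts each)
        "edge_switches": k * k // 2,         # k pods * k/2
        "aggregation_switches": k * k // 2,  # k pods * k/2
        "core_switches": (k // 2) ** 2,
    }
```

With 48-port switches this gives 27,648 host ports from 2,880 switches, and capacity grows only with the cube of switch radix. Scaling further within this topology means either higher radix boxes or more tiers, which is exactly where a genuinely different topology could differentiate.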
Back to VMware/Nicira: we could ask whether VMware has gained a big competitive advantage in their business. If VMware had any history of moving innovation rapidly into the product, they might have a window of advantage among hypervisor suppliers by providing overlay networking. Based on VMware’s history of incredibly slow delivery of network innovation in the product line, I would not be surprised if moving Nicira into VMware slows them down enough that an aggressive push by the open source community brings comparable overlay technology to the rest of the world in time to take away any such advantage. The wild card is how much value is in the Nicira controller logic (managing the mapping between tunnel ingress/egress and the network addresses within the tenant virtual networks), and how hard that is to reproduce elsewhere versus how hard it is to integrate into VMware’s orchestration products. I’m looking forward to watching this play out and keeping an eye out for some real innovation in the fabric space. And of course I’m still hoping someone will figure out how to step up and be the VMware of networking. It’s not clear to me that VMware can even continue to be the VMware of server virtualization much longer. Clearly they see this, and that is why they are trying to move up the stack into orchestration and talking about the virtual data center.