A few weeks ago, I had the opportunity to participate in SIGCOMM 2013, the annual conference on applications, technologies, architectures, and protocols for computer communication. Having dedicated the better part of the last seven years to software-defined networking (SDN), I was honored to submit a joint abstract with Professor Randy Katz of UC Berkeley, Cisco Systems’ Dino Farinacci, and David Meyer of Brocade Communications for the HotSDN workshop session entitled “Software Defined Flow-Mapping for Scaling Virtualized Network Functions.”
Although a first-time attendee, I found the event truly beneficial and a significant step towards the exploration and understanding of SDN.
While the initial SDN story of separating network control from forwarding forced a fresh perspective on computer networking, it was clear we are now moving beyond the simple controller story to the real challenges of SDN: scalability, consistency, and resiliency. These challenges are especially acute for those dealing with the distribution scale of carrier networks, and with the sheer volume of affinity and identity state involved in using SDN to connect Virtual Network Functions (VNFs). Many questions weighed heavily on the minds of attendees: How can we design switches and programming interfaces that offer greater flexibility without compromising performance? How do we design new applications that capitalize on the programmability of the network? And how can SDN interoperate with existing protocols and devices?
Fault Tolerance and Hop-by-Hop Flow Setup
The HotSDN workshop focused on recent research and developments in SDN and featured an array of thought-provoking presentations. One session I found particularly relevant to the challenges above was “FatTire: Declarative Fault Tolerance for Software Defined Networks,” presented by Mark Reitblatt, Arjun Guha, and Nate Foster, all of Cornell University, together with Marco Canini of TU Berlin/T-Labs, who did an excellent job describing the hop-by-hop flow-setup problem.
Consider the simplest possible network: a source cloud connected to a destination cloud through two possible switches, S1 and S2. All traffic from source to destination must also traverse a middle function, M, which is conveniently connected to both S1 and S2. The number of flow entries needed to set up and protect every flow segment, assuming any single link can fail (source to S1/S2, S1/S2 to M, and S1/S2 to destination), accumulates quickly, producing redundant and substantial configuration for such a simple topology.
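To get a feel for how the rule count grows, here is a toy sketch of that five-node topology that counts one forwarding rule per hop, for the healthy case plus every single-link-failure case. The names, topology encoding, and counting scheme are my own illustration, not code from the FatTire paper.

```python
# Toy topology: source and destination clouds, switches S1/S2, and a
# middle function M that all traffic must traverse.
links = {
    ("src", "S1"), ("src", "S2"),
    ("S1", "M"), ("S2", "M"),
    ("S1", "dst"), ("S2", "dst"),
}

def neighbors(node, dead):
    # Links are undirected; skip any link in the failed set.
    for a, b in links:
        if (a, b) in dead:
            continue
        if a == node:
            yield b
        elif b == node:
            yield a

def segments(src, dst, dead, seen=()):
    # All loop-free paths from src to dst that avoid failed links.
    if src == dst:
        yield seen + (dst,)
        return
    for n in neighbors(src, dead):
        if n not in seen + (src,):
            yield from segments(n, dst, dead, seen + (src,))

def through_m(dead):
    # A valid end-to-end path is a src->M segment glued to an M->dst
    # segment (the switch next to M may legitimately appear twice,
    # since traffic hairpins through it to reach M).
    for p1 in segments("src", "M", dead):
        for p2 in segments("M", "dst", dead):
            yield p1[:-1] + p2

# One forwarding rule per hop, for the no-failure case plus every
# single-link-failure scenario the configuration must protect against.
total_rules = 0
for dead in [frozenset()] + [frozenset({l}) for l in links]:
    best = min(through_m(dead), key=len, default=None)
    if best is not None:
        total_rules += len(best) - 1  # hops == rules on this path

print(total_rules)  # 28 rules just for this five-node toy topology
```

Even here, protecting a single source-to-destination flow against each possible link failure takes 28 hop-level rules, which hints at how the configuration balloons on real topologies.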
By expressing the constraints in a declarative model, the contributors offered a distinctive way of addressing this complexity. In the larger scheme, the solution may not be the most practical, as the hop-by-hop flow problem may be inherently exponential, but it will be fascinating to watch the progression.
On Data Stores and Network CAPs
Another highlight for me (and apparently for many others, since it received the coveted Best Paper Award) was “Towards an Elastic Distributed SDN Controller,” contributed by Advait Dixit and Ramana Kompella of Purdue University, and Fang Hao, Sarit Mukherjee, and T.V. Lakshman of Bell Labs, Alcatel-Lucent. This session offered a great explanation of how a shared, distributed data store can be used to federate a monolithic controller into many fault-tolerant mini controllers.
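The federation idea can be sketched in a few lines: each mini controller claims switches in a shared map, and when one controller fails, a survivor adopts its switches. Here a plain dict stands in for the real distributed data store, and all names are illustrative rather than taken from the paper.

```python
# Shared store mapping each switch to the controller that owns it.
# A real deployment would back this with a distributed data store;
# a dict is used here purely for illustration.
store = {"owners": {}}

def claim(ctrl, switches):
    # A mini controller claims switches it will manage; first claim wins.
    for sw in switches:
        store["owners"].setdefault(sw, ctrl)

def fail_over(dead, survivor):
    # A surviving controller adopts every switch the dead one owned.
    for sw, ctrl in store["owners"].items():
        if ctrl == dead:
            store["owners"][sw] = survivor

claim("c1", ["s1", "s2"])
claim("c2", ["s3"])
fail_over("c1", "c2")
print(store["owners"])  # {'s1': 'c2', 's2': 'c2', 's3': 'c2'}
```

The interesting engineering is, of course, in making the shared map itself consistent and fault tolerant, which is exactly where the next two observations come in.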
This approach makes perfect sense as organizations bring IT software technologies, such as shared-database distribution, into the core of network control. It also raises an interesting chicken-and-egg question: how can a network be used to realize a distributed data store that is, in turn, used to control that same network?
Additionally, having a powerful distributed, concurrent control architecture doesn’t alleviate the need to make sure the controller is not tasked with an NP-complete (and thus likely intractable) problem, as reiterated in the session mentioned above.
Last, but certainly not least, was a session from Aurojit Panda, Colin Scott, Scott Shenker, and Ali Ghodsi of UC Berkeley, and Teemu Koponen of VMware. “CAP for Networks” applied a famous result from the distributed-databases space, the CAP theorem, to SDNs. Put simply, a system can provide only two of the three C-A-P properties: given the possibility of a partition (P), you can have either consistency (C) or availability (A), but not both.
In the database space, SQL systems typically choose strong consistency, while NoSQL “Internet” databases typically choose availability with eventual consistency. The session’s paper builds a model that includes both control-distribution networks and data-path networks, then discusses several CAP implications based on this model.
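The trade-off itself fits in a few lines of code. The sketch below models two replicas of a single register: when the link between them is partitioned, a “CP” replica refuses writes it cannot replicate, while an “AP” replica accepts them and diverges. This is an illustration of the theorem only; the class and names are mine, not from the paper.

```python
class Replica:
    """One replica of a single replicated register."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = None
        self.peer = None
        self.partitioned = False

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                return False      # refuse: keep consistency, lose availability
            self.value = value    # accept: keep availability, risk divergence
            return True
        self.value = value
        self.peer.value = value   # synchronous replication when healthy
        return True

a, b = Replica("CP"), Replica("AP")
a.peer, b.peer = b, a

a.write("x")                      # healthy network: both replicas agree
a.partitioned = b.partitioned = True
print(a.write("y"))               # False: the CP replica sacrifices availability
print(b.write("z"))               # True: the AP replica stays available
print(a.value, b.value)           # x z -> the replicas have diverged
```

Under partition, neither choice is free: the CP replica went unavailable, and the AP replica gave up consistency. That is the whole theorem in miniature.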
What’s interesting to observe is this: if control distribution is based on a shared data store, as the Purdue paper suggests, and the data store and the data-path network share fate in an overlay model, then the CAP-for-networks problem essentially reduces to the CAP-for-databases problem. While the former is only now being explored, the latter has been studied and discussed for quite a while, with proven best practices to boot.
All in all, I found SIGCOMM 2013’s HotSDN a giant step forward in understanding both the possibilities and the challenges of SDN models. Not surprisingly, and keeping ever-evolving research in mind, putting a few structures and constraints around SDN beyond the naive models, such as overlay flows and underlay data stores, will only strengthen its practicality and applicability. I’m looking forward to seeing the matters raised in this year’s presentations evolve as the exploration of SDN’s possibilities and potential continues.