In my last blog post, I talked about six SDN use cases that are applicable to campus networks. In this post, I will focus on the first use case: network virtualization, particularly network slicing, or traffic isolation.
Many organizations leverage network virtualization (a.k.a. VRF-lite) to isolate one class of traffic from another. For example, in the retail segment, some organizations use VRF-lite to isolate PCI data from non-PCI data in order to meet PCI regulations. In education, a typical university campus network acts like a service provider to the academic departments, police, medical, and other entities that reside on campus; these networks need to isolate one department's traffic from another's for administrative reasons or to meet individual department needs. Similarly, when a company acquires another company, there may be a need to keep the acquired company's traffic and network separate from the parent company's, due to overlapping addresses or regulatory requirements.
VRF-lite has been the primary method for achieving network virtualization in campus networks, as opposed to other technologies such as MPLS. The ability to create multiple virtual networks over one physical network lowers the total cost of ownership by reducing hardware and operational overhead, while retaining the flexibility to deliver full network services.
Where VRF-lite Falls Short
VRF-lite and its variants achieve the key goals of network virtualization, but they have proven difficult to configure, deploy, and manage. On every switch or router, network administrators have to configure, hop by hop, the labels (VLAN tags) and routing protocols per VRF. Some routing protocols support multiple address families, in which case a single instance of the protocol can serve multiple VRFs; in other cases, one has to configure multiple instances of the protocol, one per VRF. Leaking routes from one VRF to another requires yet more configuration. Not only is this laborious, it is also error-prone, and debugging and troubleshooting issues across multiple VRFs is no easy task.
Current VRF technologies allow an interface, sub-interface, or VLAN to be associated with a VRF. When a packet arrives, the interface, sub-interface, or VLAN it arrived on identifies the VRF. This is quite restrictive if one needs to base the VRF association on the context of a flow. For example, if PCI traffic needs its own VRF, then the entire physical interface the POS terminal is connected to, or the SSID associated with the POS, has to be mapped to that VRF. What happens if the same device is also used for non-payment traffic? With smartphones and mobile payment technologies, a single device can do more than one thing: a smartphone can not only process credit card payments but also pull product information.
Another limitation is that current VRF-lite solutions cannot provide traffic isolation based on the context of a flow: factors such as application, user, device, location, or time. And current VRF-lite technologies require every hop in the network to be VRF-lite aware. Not only is this quite restrictive, it is also quite difficult to extend VRF-lite over non-VRF-lite switches.
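To make the restriction concrete, here is a minimal, purely illustrative sketch contrasting interface-based VRF selection with flow-context-based selection. All names (`Flow`, `classify_by_interface`, the port and application labels) are hypothetical, not from any real SDN API:

```python
# Hypothetical sketch: interface-based vs. flow-context VRF classification.
from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    in_port: str          # ingress interface the packet arrived on
    application: str      # e.g. "payment" or "catalog-lookup"
    user: str
    device: str

# Classic VRF-lite: the ingress interface alone picks the VRF.
PORT_TO_VRF = {"eth0/1": "pci"}          # POS terminal plugged into eth0/1

def classify_by_interface(flow: Flow) -> str:
    return PORT_TO_VRF.get(flow.in_port, "default")

# Context-based: the application carried by the flow picks the VRF,
# so one device can send traffic into two different virtual networks.
APP_TO_VRF = {"payment": "pci"}

def classify_by_context(flow: Flow) -> str:
    return APP_TO_VRF.get(flow.application, "default")

payment = Flow("eth0/1", "payment", "clerk-7", "smartphone-42")
lookup  = Flow("eth0/1", "catalog-lookup", "clerk-7", "smartphone-42")

# Interface-based classification drags ALL traffic from the port into PCI...
assert classify_by_interface(payment) == classify_by_interface(lookup) == "pci"
# ...while context-based classification isolates only the payment flow.
assert classify_by_context(payment) == "pci"
assert classify_by_context(lookup) == "default"
```

The point of the sketch: once the smartphone does both payments and product lookups, a per-interface mapping has no way to keep only the PCI flow in the PCI VRF.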
Network Virtualization with SDN
An SDN approach to network virtualization can provide many benefits:
- Simplify provisioning of VRFs by eliminating VRF-lite’s hop-by-hop configuration
- Centralize the VRF definition
- Make traffic isolation criteria flexible based on context of a flow
- Easily extend network virtualization over networks that do not support network virtualization
- Provide better visibility into virtual networks
In an SDN environment, the SDN controller has complete visibility into the network topology as well as reachability (L2/L3 forwarding information). A network virtualization application running on top of the SDN controller would provide the VRF-lite-equivalent functionality. It allows network administrators to define VRFs at the network level, rather than per device, based on VRF membership criteria (interface, VLAN, user, device, location, time, application, etc.). Once a VRF is defined, the network virtualization (SDNv) application can program switches and routers with the corresponding flows and forwarding instructions.
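A toy sketch of what "define once at the network level, compile per switch" could look like. The `VRF_DEFINITIONS` table, the criteria keys, and the rule format are all assumptions made up for illustration, not the interface of any real controller:

```python
# Illustrative sketch: a centralized VRF definition that an SDNv-style app
# compiles into per-switch rules, replacing hop-by-hop CLI configuration.
VRF_DEFINITIONS = {
    "pci":  {"application": "payment"},   # flow-context criterion
    "acad": {"vlan": 30},                 # traditional VLAN criterion
}

def match_vrf(flow: dict) -> str:
    """Return the first VRF whose criteria are all satisfied by the flow."""
    for name, criteria in VRF_DEFINITIONS.items():
        if all(flow.get(k) == v for k, v in criteria.items()):
            return name
    return "default"

def compile_path_rules(flow: dict, path: list[str]) -> list[tuple[str, str, str]]:
    """One (switch, vrf, action) entry per switch on the flow's path.
    The controller pushes these; no device is configured by hand."""
    vrf = match_vrf(flow)
    return [(switch, vrf, "forward") for switch in path]

rules = compile_path_rules({"application": "payment"}, ["s1", "s2", "s3"])
assert [vrf for _, vrf, _ in rules] == ["pci", "pci", "pci"]
assert match_vrf({"vlan": 30}) == "acad"
```

The design point is that the VRF definition lives in exactly one place; which switches get programmed, and with what, is derived from it.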
There are two ways one can achieve network virtualization:
- Tunneling option: When a new flow appears in the network, the SDNv app identifies the VRF associated with the flow and tunnels it from the first-hop switch to the last-hop switch (the last switch under the SDN controller's administration). This method is simple, as only two switches need to be programmed; all intermediate switches forward the flow based on their global routing tables. The last-hop switch is programmed with the appropriate forwarding information, e.g., Layer 2 or Layer 3 forwarding, to reach the final destination.
- Hop-by-hop flow programming option: Similar to the VRF-lite mechanism, in this option the SDNv app identifies the VRF associated with the flow and programs every switch from the first hop to the last with the appropriate flow-forwarding entries. The first- and last-hop switches are programmed differently from the intermediate switches. The first-hop switch is programmed with classification information to identify the specific flow, plus forwarding information that tags the flow with a VLAN ID. The last-hop switch's programming depends on whether the destination is directly connected to it or is reached via another network virtualization service provider: if the source and destination of the flow are within the SDN domain, the flow typically originates and terminates as a Layer 2 flow, with the first and last hops programmed accordingly; if the last hop is a handoff to a traditional VRF-lite domain or to another network virtualization technology such as MPLS, then that hop may need to participate in VRF-lite or MPLS. Intermediate switches are primarily programmed with Layer 3 flow information: they switch the flow based on the incoming VLAN ID and the packet's Layer 3 header, forwarding it with the same VLAN ID (or VNET ID), or a different one, to the next hop.
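The most basic difference between the two options is how many switches the controller has to touch per flow. A deliberately tiny sketch (switch names and the `switches_to_program` helper are invented for illustration):

```python
# Sketch: how many switches each option programs for a single flow.
def switches_to_program(path: list[str], mode: str) -> list[str]:
    if mode == "tunnel":
        # Only the first- and last-hop switches get flow entries; the
        # intermediate switches forward on their global routing tables.
        return [path[0], path[-1]]
    if mode == "hop-by-hop":
        # Every switch on the path gets a flow-forwarding entry.
        return list(path)
    raise ValueError(f"unknown mode: {mode}")

path = ["access-1", "dist-1", "core-1", "dist-2", "access-2"]
assert switches_to_program(path, "tunnel") == ["access-1", "access-2"]
assert len(switches_to_program(path, "hop-by-hop")) == 5
```

For one five-hop path the difference is small; multiplied across thousands of short-lived flows, it is the core of the scaling trade-off discussed below.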
While both options offer some benefits, both also have certain limitations.
1. Tunneling Option
Benefits: Simplicity: only the first- and last-hop switches need to be programmed. This scales easily as the number of flows and switches grows in a large campus deployment, and it allows network virtualization to extend over switches that do not understand VRF-lite.
Considerations: With any tunneling mechanism, network admins lose visibility into the underlying application traffic, and if an intermediate switch is under a different administration, troubleshooting a problem may not be possible. For example, if a QoS classification needs to be applied at an intermediate switch based on application needs, the switch has no visibility into the application because of the tunnel header, unless it is capable of processing beyond the tunnel header; otherwise, its only option is to rely on the DSCP value in the tunnel header. In short, policy enforcement (QoS, ACLs, policing, SPAN, sFlow) based on the underlying application can become difficult or even impossible on intermediate switches.
2. Hop-by-Hop Flow Programming Option
Benefits: This method can leverage the full capabilities of intermediate switches to implement policies. The network administrator has complete visibility into flows and can apply flexible policies.
Considerations: The primary challenge is that this solution may not scale easily with a large number of dynamic, short-lived flows, or when the controller has to program many switches in a large campus network. Programming many intermediate switches with flow information may introduce noticeable latency; for applications such as voice and video, this can degrade the user experience.
This solution also requires every intermediate switch to have enough forwarding-table capacity to accommodate large numbers of flows. If an intermediate switch runs out of forwarding capacity, tunneling over that switch is an option, but not an elegant one. One way to address these scalability issues is to create default macro-flows per VRF on every switch and map micro-flows into them, so that only the first- and last-hop switches need per-flow programming; all intermediate switches forward a micro-flow based on the macro-flow forwarding information. Of course, this trades away per-flow granularity on intermediate switches, and every intermediate switch still needs to be programmed with the macro-flow forwarding information.
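The macro-flow idea can be sketched in a few lines. This is a conceptual model only; the `IntermediateSwitch` class, the VLAN-as-VRF-tag convention, and the packet format are assumptions made for illustration:

```python
# Sketch of macro-flow aggregation: an intermediate switch keeps a single
# per-VRF (macro-flow) entry, while per-flow state lives only at the edges.
class IntermediateSwitch:
    def __init__(self):
        self.table = {}                       # vrf_vlan_id -> next_hop

    def install_macro_flow(self, vrf_vlan: int, next_hop: str) -> None:
        self.table[vrf_vlan] = next_hop

    def forward(self, packet: dict) -> str:
        # Micro-flows are not looked up individually: the VLAN (VRF) tag
        # on the packet selects the shared macro-flow entry.
        return self.table[packet["vlan"]]

core = IntermediateSwitch()
core.install_macro_flow(100, "dist-2")        # one entry for the whole PCI VRF

# A thousand distinct micro-flows share that single forwarding entry.
packets = [{"vlan": 100, "flow_id": i} for i in range(1000)]
assert all(core.forward(p) == "dist-2" for p in packets)
assert len(core.table) == 1                   # table size is per-VRF, not per-flow
```

This is why the intermediate switch's forwarding-table pressure drops from "number of flows" to "number of VRFs", at the cost of the per-flow granularity noted above.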