At the recent Intel Developer Forum in San Francisco, there was lots of discussion about the tradeoffs associated with various approaches to virtual switching. In this post, we’ll outline the pros and cons of the most common solutions and show that it’s possible to meet aggressive performance targets without compromising on critical system-level features.
Virtual switching is a key function within data centers based on software-defined networking (SDN) as well as in telecom infrastructure that leverages network functions virtualization (NFV). In the NFV scenario, for example, the virtual switch (vSwitch) is responsible for switching network traffic between the core network and the virtualized applications or virtual network functions (VNFs) that are running in virtual machines (VMs). The vSwitch runs on the same server platform as the VNFs and its switching performance directly affects the number of subscribers that can be supported on a single server blade. This, in turn, impacts the overall operational cost per subscriber and has a major influence on the opex improvements that can be achieved through a move to NFV.
Because switching performance is such an important driver of opex reductions, two approaches have been developed that boost performance while compromising on functionality: PCI pass-through and single-root I/O virtualization (SR-IOV). As we’ll see, though, the functions that are dropped by these approaches turn out to be critical for carrier-grade telecom networks.
PCI pass-through is the simplest approach to switching for NFV infrastructure. As explained in detail here, it allows a physical PCI Network Interface Card (NIC) on the host server to be assigned directly to a guest VM. The guest OS drivers can use the device hardware directly without relying on any driver capabilities from the host OS.
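To make this concrete, here is a sketch of how a host NIC might be assigned to a guest in a libvirt domain definition (the PCI address shown is hypothetical; your NIC's address will differ):

```xml
<!-- Sketch of a libvirt domain fragment: assign the host PCI NIC at
     0000:06:00.0 (a hypothetical address) directly to the guest.
     managed='yes' tells libvirt to detach the device from its host
     driver before the VM starts and reattach it afterwards. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
  </source>
</hostdev>
```

Once the VM boots, the guest's own NIC driver talks to the hardware directly; the host's vSwitch never sees this traffic.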
Using PCI pass-through, you can deliver network traffic to the VNFs at line rate, with latency determined entirely by the physical NIC. However, NICs are mapped to VMs on a strict 1:1 basis: each VM requires a dedicated NIC that cannot be shared, which rules out the dynamic reassignment of resources that is a key concept within NFV. NICs are also significantly more expensive than cores, as well as being less flexible.
SR-IOV, which is implemented in some but not all NICs, provides a mechanism by which a single Ethernet port can appear to be multiple separate physical devices. This enables a single NIC to be shared between multiple VMs.
As in the case of PCI pass-through, SR-IOV delivers network traffic to the VNFs at line rate, typically with a latency of 50µs, which meets the requirements for NFV infrastructure. With SR-IOV, a basic level of NIC sharing is possible, but not the complete flexibility that enables fully dynamic reallocation of resources. NIC sharing also reduces net throughput, so additional (expensive) NICs are typically required to achieve system-level performance targets.
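To illustrate the sharing mechanism: on Linux hosts, virtual functions (VFs) can be attached to guests much like pass-through devices. A hedged sketch using libvirt, assuming an SR-IOV-capable NIC (the PCI and MAC addresses shown are hypothetical):

```xml
<!-- Sketch: attach one SR-IOV virtual function (VF) to a guest via
     libvirt's <interface type='hostdev'>. The VF at 0000:06:10.0 is a
     hypothetical address; VFs are typically created beforehand on the
     host, e.g. by writing a VF count to the NIC's sriov_numvfs file
     in sysfs. -->
<interface type='hostdev' managed='yes'>
  <source>
    <address type='pci' domain='0x0000' bus='0x06' slot='0x10' function='0x0'/>
  </source>
  <!-- Fixing the MAC address lets the host identify this VF's traffic -->
  <mac address='52:54:00:6d:90:02'/>
</interface>
```

Unlike plain PCI pass-through, several VMs can each be given a VF backed by the same physical port, which is what makes the basic NIC sharing described above possible.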
For NFV, though, the biggest limitations of PCI pass-through and SR-IOV become apparent when we consider features that are absolute requirements for carrier-grade telecom networks:
- Network security is limited, since the guest VMs have direct access to the network. Critical security features such as access control lists (ACLs) and quality-of-service (QoS) protection are not supported, so there is no protection against denial-of-service attacks.
- These approaches prevent the implementation of live VM migration, whereby VMs can be migrated from one physical core to another (which may be on a different server) with no loss of traffic or data. Only “cold migration,” which typically impacts services for at least two minutes, is possible.
- Hitless software patching and upgrades are impossible, so network operators are forced to use cold migration for these functions too.
- Link failures can take up to four seconds to detect, which falls well short of the link protection capabilities required in carrier networks.
- Service providers are limited in their ability to set up and manage VNF service chains. Normally, a chain would be set up autonomously from the perspective of the VNF (perhaps by an external orchestrator), but if the VNF owns the interface (as with PCI pass-through or SR-IOV), it must be involved in setting up and managing the chains, which at best adds complexity and at worst is infeasible.
For service providers who are deploying NFV in their live networks, neither PCI pass-through nor SR-IOV enables them to provide the carrier-grade reliability that is required by telecom customers, namely six-nines (99.9999%) service uptime.