Over the past decade, infrastructure complexity has risen sharply as data centers have continued to grow in size. Software defined infrastructure (SDI) has emerged to address this complexity: a data center or network infrastructure configures itself at runtime based on application and business requirements and operator constraints. Automation in SDIs enables infrastructure operators to achieve higher conformance to SLAs, avoid over-provisioning, and, over time, gain a competitive advantage through more productive staff and more satisfied users.
Automation in SDIs comes in one or more of the following types:
- Performance optimization. Redirect traffic in real time around congested routers, perform traffic shaping, or auto-provision resources to meet application SLA requirements.
- Security enforcement. Apply fine-grained traffic control to isolate application traffic among tenants, implement access control policies, and detect and mitigate cyber attacks in real time.
- Richer in-network functionality. Provide value-added services on behalf of tenants, by steering traffic toward software services that support mobility, real-time applications, or application monitoring.
Two key enabling technologies of SDI implementations are software-defined networking (SDN) and network functions virtualization (NFV). SDN provides networks with the flexibility to steer and provision network resources dynamically. In NFV, network functions are virtualized as pre-packaged software services that are easily deployable in a cloud or network infrastructure environment. Instead of hard-coding a service deployment and its network services, operators can provision both dynamically and then steer traffic through the software services, which significantly increases the agility of service provisioning.
Challenges Behind SDN and NFV
In combination, SDN and NFV provide the basic plumbing for configuring the devices and virtual network functions (VNFs) to implement SDI. However, neither provides the underlying intelligence that can generate or recommend the required configuration that can then be automatically implemented. Significant technical challenges need to be addressed before one can achieve the vision of fully automated actuation.
Challenge 1: Closing the infrastructure/application visibility divide. Data analytics solutions largely fall into two distinct camps. On the one hand, there are infrastructure-monitoring tools that allow us to monitor physical and virtual devices such as routers, switches, machines, and VMs. On the other hand, there are application-monitoring tools, typically deployed at endpoints or in specialized hardware appliances, which provide application-specific metrics such as video quality. The visibility divide has undesirable consequences. Too often, data center operators spend time proving to their customers that a problem lies not in the network but in the customers’ applications. Conversely, data center customers are frustrated when no detailed explanation exists for why the performance of their deployed services has degraded.
Any SDI solution needs to bridge the infrastructure/application visibility divide. Doing so makes it possible to correlate application degradation with physical bottlenecks, and hence allows a software defined infrastructure to proactively provision physical resources to meet application SLAs. Bridging the visibility divide is not solely for the purpose of performance optimization (SDI Type 1, in the list above). It also has implications for security (Type 2), given the prevalence of application-layer network attacks, and can enable richer in-network functionality for applications (Type 3).
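To make the idea of correlating application degradation with physical bottlenecks concrete, here is a minimal sketch in Python. It assumes that an application metric and per-link utilization have already been sampled at the same timestamps. The metric, the link names, and the numbers are hypothetical, and plain Pearson correlation stands in for whatever analysis a real SDI platform would use.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def likely_bottleneck(app_metric, link_utilization):
    """Rank links by how strongly their utilization tracks the application metric."""
    scores = {link: pearson(app_metric, series)
              for link, series in link_utilization.items()}
    return max(scores, key=scores.get), scores

# Hypothetical samples taken at the same six timestamps.
video_stall_ms = [10, 12, 11, 40, 85, 90]           # application-level metric
link_utilization = {                                 # infrastructure-level metric
    "core-1": [0.30, 0.31, 0.29, 0.31, 0.30, 0.29],
    "edge-7": [0.40, 0.45, 0.42, 0.80, 0.95, 0.97],
}

link, scores = likely_bottleneck(video_stall_ms, link_utilization)
print(link, scores)
```

Correlation is only a starting point: it suggests where to look, not what caused the degradation, so a production system would combine it with topology and flow-path information.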
Challenge 2: Enabling big data analytics and online learning for the infrastructure. Automation within SDI requires the ability to make real-time decisions based on observed load and performance characteristics within a data center. This goes beyond simple forms of data collection and packet processing. Rather, the challenge is to make sense of large quantities of data in real time, a task commonly described as big data analysis.
As an example, consider the network intrusion detection challenge for securing SDIs (Type 2). Signature-based algorithms are no longer viable now that zero-day attacks are becoming prevalent, because a signature can only match an attack that has been seen before. Instead, online machine-learning-based algorithms pave the way toward detecting anomalous behavior more effectively. Another use of learning-based techniques is smarter prediction, for example predicting future application degradation from current performance trends. One significant technical challenge is scaling these online learning algorithms to operate at line speed. Likewise, the agility to reprogram software-based analyzers allows us to support a wide range of analytics functionality, hence “future-proofing” analytics appliances.
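As a rough illustration of what "online" means here, the following Python sketch flags anomalous samples in a traffic stream while updating its model one sample at a time. It is a deliberately simple stand-in for the learning-based detectors discussed above: it keeps a running mean and variance (Welford's method) and flags values that deviate by more than a chosen number of standard deviations. The class name, the threshold, the warm-up length, and the packet rates are all assumptions made for this example.

```python
from math import sqrt

class OnlineAnomalyDetector:
    """Running z-score detector; constant memory, one update per sample."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold   # deviations (in std units) that count as anomalous
        self.warmup = warmup         # samples to observe before flagging anything
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0                # running sum of squared deviations from the mean

    def observe(self, value):
        """Score the value against the history so far, then learn from it."""
        anomalous = False
        if self.count >= self.warmup:
            std = sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                anomalous = True
        # Welford's incremental update of mean and variance.
        # Note: this sketch also learns from flagged samples.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return anomalous

# Hypothetical packets-per-second samples with one burst.
packets_per_second = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98, 101, 950, 100]
detector = OnlineAnomalyDetector()
flags = [detector.observe(v) for v in packets_per_second]
print(flags)
```

Because each update touches only three numbers, this style of algorithm is the kind that can plausibly be pushed toward line speed; richer models trade that simplicity for accuracy.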
Challenge 3: Tackling the requirements/configurations impedance mismatch. The third and final challenge is the need to bridge the gap between business requirements and infrastructure configurations. In an SDI, data center operators and customers alike should be able to specify their operational objectives (SLAs) and security constraints using high-level policy languages. A runtime system will then automatically (1) provision resources to meet these objectives and constraints, and (2) dynamically allocate resources to monitor services and provide a feedback loop that maintains the performance/security goals at all times.
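The shape of such a runtime system can be sketched in a few lines of Python. The policy states only the objective (a latency bound and a resource cap), and a reconcile step adjusts resources until the measured value meets it. Everything here is hypothetical: the `Policy` fields, the toy latency model, and the scale-up-by-one strategy are illustrative assumptions, not a description of any particular SDI product.

```python
from dataclasses import dataclass

@dataclass
class Policy:
    """High-level objective stated by the operator or customer."""
    service: str
    max_latency_ms: float
    max_replicas: int

def measure_latency_ms(load_rps, replicas):
    """Toy monitoring stand-in: latency grows with per-replica load."""
    return 2.0 * load_rps / replicas

def reconcile(policy, load_rps, replicas):
    """One pass of the feedback loop: add replicas until the SLA holds or the cap is hit."""
    while (measure_latency_ms(load_rps, replicas) > policy.max_latency_ms
           and replicas < policy.max_replicas):
        replicas += 1
    return replicas

policy = Policy(service="video-frontend", max_latency_ms=50.0, max_replicas=10)

replicas = reconcile(policy, load_rps=100, replicas=1)            # SLA can be met
print(replicas, measure_latency_ms(100, replicas))

replicas_peak = reconcile(policy, load_rps=400, replicas=replicas)  # cap reached first
print(replicas_peak, measure_latency_ms(400, replicas_peak))
```

In the second call the loop stops at the replica cap with the latency objective still violated, which is exactly the situation a real system would need to report back to the operator rather than hide.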
Tackling this challenge requires further innovation in domain-specific languages and decision procedure engines. There are promising new declarative domain-specific language (DSL) proposals, based on data-driven and functional programming models, that aim to raise the level of abstraction of infrastructure configurations. A further need is to apply efficient decision procedures (e.g., mathematical solvers) in conjunction with these DSLs so that we can converge quickly on good configurations in real time, verify the correctness of existing configurations, and provide safe transitions from one configuration to another during actuation.
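To show what a decision procedure does with such constraints, here is a small Python sketch that searches for a placement of VMs onto hosts subject to capacity limits and tenant-isolation rules, and that can also check an existing placement. It uses exhaustive search purely for clarity; a real system would hand the same constraints to a mathematical solver. The function names, VM names, and capacities are hypothetical.

```python
from itertools import product

def is_valid(placement, demands, capacities, must_separate):
    """Verify a configuration: no host over capacity, no forbidden co-location."""
    used = {host: 0 for host in capacities}
    for vm, host in placement.items():
        used[host] += demands[vm]
    if any(used[host] > capacities[host] for host in capacities):
        return False
    return all(placement[a] != placement[b] for a, b in must_separate)

def find_placement(demands, capacities, must_separate):
    """Return the first valid placement found, or None if the constraints are unsatisfiable."""
    vms = list(demands)
    hosts = list(capacities)
    # Brute force: try every assignment of VMs to hosts (exponential, for illustration only).
    for assignment in product(hosts, repeat=len(vms)):
        placement = dict(zip(vms, assignment))
        if is_valid(placement, demands, capacities, must_separate):
            return placement
    return None

demands = {"web": 2, "db": 4, "cache": 2}      # CPU cores requested per VM
capacities = {"h1": 4, "h2": 6}                # CPU cores available per host
must_separate = [("web", "db")]                # isolation constraint

placement = find_placement(demands, capacities, must_separate)
print(placement)
```

The same `is_valid` check serves both purposes named above: it guides the search for a new configuration and verifies one that is already deployed.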
Declarative Networking for Software Defined Infrastructure
The Holy Grail for any form of automation is to remove the human from the feedback-actuation loop. In an era of driverless cars, industrial robots, and smart buildings, software infrastructure management remains largely a labor-intensive task. We still have a way to go before full automation is realized. In the meantime, however, there are achievable intermediate steps that can significantly improve the manageability of data centers and networks. Getting there requires an interdisciplinary approach at every layer, combining techniques from networking, databases, machine learning, and programming languages.
Declarative networking, in particular, is a promising technology in this context. Declarative programming allows programmers to say (or declare) what they want, without worrying about the details of how to achieve it. This programming paradigm makes it easy to implement protocols that can be incrementally evaluated and verified. Given its roots in database query languages, it integrates naturally with big-data processing systems. It also provides language and runtime support for dynamic adaptation, and is well suited as an actuation policy language.
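A small example may help show the "what, not how" flavor. Network reachability can be declared with two Datalog-style rules, shown in the comments below, and a generic evaluator derives all the facts that follow from them. The Python sketch that follows is an assumption-laden illustration: the rule syntax is informal rather than that of any specific declarative networking language, and the naive fixpoint loop stands in for the incremental evaluation a real runtime would perform.

```python
# Declarative specification (informal Datalog-style rules):
#   path(S, D) :- link(S, D).
#   path(S, D) :- link(S, Z), path(Z, D).

def reachable(links):
    """Evaluate the two rules to a fixpoint and return every derivable path fact."""
    paths = set(links)                       # rule 1: every link is a path
    while True:
        # Rule 2: join link(S, Z) with path(Z, D) on the shared node Z.
        derived = {(s, d)
                   for (s, z) in links
                   for (z2, d) in paths
                   if z == z2}
        new_facts = derived - paths
        if not new_facts:                    # nothing new derived: fixpoint reached
            return paths
        paths |= new_facts

# Hypothetical three-hop topology.
links = {("a", "b"), ("b", "c"), ("c", "d")}
paths = reachable(links)
print(sorted(paths))

# A topology change is handled by adding a fact and re-evaluating the same rules.
paths_after = reachable(links | {("d", "a")})
print(("d", "c") in paths_after)
```

The programmer never writes a routing loop or a graph traversal; only the two rules change when the protocol changes, which is what makes this style attractive as a policy language.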
The future belongs to software defined infrastructure platforms that can address the challenges we have outlined above in a unified manner.