Dror Goldenberg is the VP of Software Architecture at Mellanox Technologies. In this role he drives the software vision of Mellanox, enabling interfaces for state-of-the-art network features and integrating these networking capabilities into middleware, applications and management platforms to improve their performance and scalability. Mellanox software powers data centers, HPC systems and clouds, and provides best-in-class performance and scalability for applications such as high-performance computing, storage, big data, artificial intelligence, cloud, NFV and more.
SDxCentral: It’s been shown that service providers and large enterprises need high-performance hardware to scale their applications for the cloud. Can you explain how end users go about achieving the same total infrastructure efficiency as large providers do?
Goldenberg: As we all know, infrastructure is really application-driven. So, we first need to understand cloud-native applications to make intuitive sense of why the hyperscale cloud builders are embracing high-performance hardware to scale their applications for the cloud. Cloud-native applications have the following three key characteristics:
- Microservices-oriented. Legacy monolithic applications are being broken down into smaller modules called microservices with well-defined APIs. This is a key movement enabling agile application development and DevOps-style continuous integration/continuous deployment (CI/CD). Microservices are often packaged in containers, resulting in a much smaller instance footprint. As a result, on the same physical server you will have many more instances frequently communicating with each other and with instances on other servers. The outcome is a much higher level of east-west traffic, and the servers need much higher I/O capacity to cope with it.
- Decoupling of state from transaction processing. Cloud-native applications scale out instead of scaling up. When a microservice instance runs out of capacity, more instances are spun up to service the load instead of adding more resources to beef up the original instance. If the instance contains a lot of local state, it is hard to share that state with the new instances and achieve proper load balancing. For cloud-native applications to really scale out, processing and state need to be logically separated, with the state pooled as a logically centralized data repository. For this model to work properly, data locality issues need to be resolved gracefully and transparently, and high-performance networking hardware is essential to make that happen.
- Dynamically orchestrated. Orchestration of compute, storage and networking resources is based on the application requirements to meet service level agreements. The resource orchestration needs to be completely automated with ideally zero manual intervention. It not only happens at provisioning time, but also more importantly on demand through dynamic adjustment. To perform dynamic adjustment, the orchestrator needs to understand the health of the microservice instances and the state of the hardware resource pools through telemetry. For high-performance hardware, telemetry needs to be built in as an inherent part instead of as an afterthought using a bolt-on software module.
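The dynamic-adjustment loop described above can be sketched in a few lines. This is a minimal illustration, not any particular orchestrator's algorithm; the function name and thresholds are invented for the example, though real systems such as Kubernetes' Horizontal Pod Autoscaler implement a similar telemetry-driven control loop.

```python
# Minimal sketch of a telemetry-driven scaling decision (hypothetical
# names and numbers, illustrating how observed load feeds orchestration).

def desired_replicas(current_replicas, observed_load, target_load):
    """Scale out/in proportionally to how far observed load is from target."""
    if observed_load <= 0:
        return current_replicas
    # Ceiling division, so the adjustment never under-provisions.
    return max(1, -(-current_replicas * observed_load // target_load))

# Example: 4 instances running at 90% of capacity, targeting 60%.
replicas = desired_replicas(current_replicas=4, observed_load=90, target_load=60)
```

The key point from the text is the input side: a decision like this is only as good as the telemetry behind it, which is why telemetry needs to be built into the hardware rather than bolted on.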
Once we understand the key cloud-native application characteristics, it is pretty clear why the public cloud builders are not only using faster CPUs and storage to enhance application performance (both of which demand faster networking), but are also focusing on faster and more efficient networking hardware to cope with the scale-out, dynamic microservices model. No wonder the majority of the Super Seven cloud and web services providers have either adopted 40G server connectivity and are looking into 50/100G, or are actively looking into upgrading from 10G to 25/50G. In addition, shared resources, multi-tenancy, and heightened security requirements require a software virtual switch running in the server hypervisor or container engine. As server connectivity advances to higher speeds, the virtual switch functionality increasingly needs to be offloaded to hardware so that it does not incur significant CPU overhead performing packet processing.
In summary, the key actions cloud builders are taking are moving to 25/50/100G and letting more intelligent networking hardware offload network processing from the CPU. The effect on total infrastructure efficiency is easy to see:
- Hardware Efficiency: When cloud infrastructure is built in an optimized way, the full potential of all key resources, including compute, storage and networking, is unleashed. Everything works together like a well-oiled machine to deliver the highest application performance and workload density without bottlenecks, ultimately guaranteeing business service SLAs while capping costs. This is especially true for public cloud service providers, who are motivated to drive up their infrastructure efficiency because their revenue is directly tied to the number of virtual machines (VMs) or containers they can run on their infrastructure.
- Operational Efficiency: The second aspect of total infrastructure efficiency comes from operational efficiency through automation. Intelligent resource provisioning and adjustment, based on telemetry and cloud automation, enhances operational efficiency and minimizes misconfiguration and downtime, ultimately reducing OpEx.
- Simplicity & Application Efficiency: Last but not least, high-performance networking eliminates data locality concerns and enables applications to access remote storage at nearly the same great speed as local storage, simplifying application design and eliminating the need to engineer around the network. This is where total infrastructure efficiency can directly affect agile software development and service creation.
SDxCentral: What effect is the evolution of the cloud infrastructure — including the use of APIs and approaches such as DevOps — having on driving network automation?
Goldenberg: Ultimately business agility and faster innovation are the driving forces behind cloud and network automation. To significantly shorten the process to turn ideas into revenue, we need the technology platform and infrastructure to support agile development and a streamlined process to integrate, test and deploy changes. Exactly how fast are innovators making changes and pushing them into deployment? To give you an idea, Amazon is pushing through 50 million deployments per year, and that is about 100 deployments per minute. This is just not something that can be achieved with manual provisioning.
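The arithmetic behind that deployment figure is straightforward to verify:

```python
# Convert 50 million deployments per year into a per-minute rate.
deployments_per_year = 50_000_000
minutes_per_year = 365 * 24 * 60          # 525,600 minutes in a year
per_minute = deployments_per_year / minutes_per_year
# ~95 deployments per minute, i.e. roughly 100 as stated.
```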
Cloud automation dictates that all resource orchestration be automated, including the network. You cannot take days to configure the network after you spin up a VM or container in minutes or seconds. Your agility is only as good as the weakest link, and we make sure that the weakest link isn't the network. Network automation and the elimination of manual intervention dictate the use of APIs for network resource provisioning, network configuration, and policy definition.
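To make the API-driven provisioning idea concrete, here is a hedged sketch of the kind of request a cloud platform might issue when attaching a new instance to a network. The endpoint shape, payload fields, and identifiers are all hypothetical stand-ins (real platforms such as OpenStack Neutron expose comparable, but differently shaped, REST APIs).

```python
# Hypothetical network-provisioning request body, illustrating why API-driven
# automation can keep pace with VM/container spin-up. All field names are
# invented for illustration and do not match any specific vendor API.
import json

def build_port_request(network_id, vm_id, security_policy):
    """Assemble the JSON body that would be POSTed to a network API."""
    return json.dumps({
        "port": {
            "network_id": network_id,
            "device_id": vm_id,
            "policy": security_policy,
        }
    })

body = build_port_request("net-42", "vm-7", "web-tier-default")
```

The point is that the network attachment becomes one machine-generated call in the same automated workflow that creates the VM, rather than a ticket handed to a human operator.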
Mellanox is continuously improving network automation and network monitoring through better network telemetry and analytics, not only for the network infrastructure, but also for the server host I/O system to make network automation truly intelligent and consistent end-to-end.
SDxCentral: How have you seen the innovations pioneered by the hyperscale web players implemented in enterprise and service provider data-center infrastructure design?
Goldenberg: I want to provide three examples here.
First is the 25/50/100G Ethernet standards. Back in 2014, the 25G Ethernet Consortium was initiated by a group of companies, including Mellanox and the cloud giants Microsoft and Google, to define a specification enabling interoperable 25 Gb/s and 50 Gb/s Ethernet solutions. The hyperscale web and cloud players figured out long ago that high-speed networks are not just about network performance, they are about total infrastructure efficiency. Cloud-native applications, scale-out infrastructure and faster storage all demand faster networking, which enables servers and storage resources to serve applications and users rather than be consumed simply moving data. Forward-thinking enterprises are looking at these hyperscale giants and trying to understand how to achieve webscale IT efficiencies on an enterprise-scale IT budget, and 25G is an attractive option for them to gain significant performance improvements in a very affordable way.
Second is machine learning/artificial intelligence. A lot of people know that Google’s AI program, AlphaGo, beat Lee Sedol, a Go world champion from South Korea. But machine learning and artificial intelligence have much wider applications in hyperscale data centers. They are now powering services such as self-driving cars, computer vision and image recognition, your Facebook Newsfeed, more personalized ad placement for search engine giants such as Google or Baidu, and Azure Machine Learning from Microsoft that is providing ML services to developers to build intelligent applications. Now, more than ever before, for enterprises and startups alike, organizations must embrace and master change or get left behind. Industry leaders are moving beyond the digital transformation and embracing New IT, which is characterized by its “cognitive” attribute. Organizations are increasingly using Big Data and machine learning technologies to deliver near real-time or real-time insights.
Finally, there is the concept of composability. Hyperscale players want to take the Lego building approach to their infrastructure, disaggregating it into components with well-defined interfaces so that they can pick best-of-breed components and easily compose a customized infrastructure from them. Beyond hyperscale and cloud-oriented service providers, enterprises that are looking into private and hybrid cloud, or that are reorganizing their IT departments to speed service delivery and gain business agility, are also increasingly looking into this open approach to building their infrastructure.
SDxCentral: What are some of the key trends in the storage space, especially cloud storage, and how is high-performance networking the must-have for the new generation of storage technologies?
Goldenberg: Storage media is getting exponentially faster, because businesses want to crunch more data and derive near real-time insights. Today's flash far outperforms hard drives in throughput, latency, IOPS, power consumption, and reliability. It has better price/performance than hard disks and already represents 10-15% of shipping enterprise storage capacity, according to analysts. Tomorrow's NVMe devices will support 2-3 GB/s (16-24 Gb/s) each, with latencies below 50 microseconds (that's <0.05 milliseconds, versus 2-5 milliseconds for hard drives). As storage media evolves, storage networking needs to catch up to make it possible to fully leverage the fast access time.
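The gap quoted above is worth putting in concrete terms, using only the numbers already given in the text:

```python
# Access latency: NVMe at <50 microseconds vs. hard drives at 2-5 ms.
nvme_latency_us = 50
hdd_latency_us = 2_000                        # low end of the 2-5 ms range
speedup = hdd_latency_us / nvme_latency_us    # at least 40x faster access

# Throughput: 2 GB/s per NVMe device is 16 Gb/s on the wire, so a single
# device can saturate more than a 10G server link all by itself.
nvme_gbps = 2 * 8
```

This is the arithmetic behind the claim that the network, not the media, becomes the bottleneck unless server connectivity moves past 10G.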
Also, because of the vast amount of data to store and analyze, especially as we move into the Internet of Things (IoT) era, simply scaling up storage boxes no longer works. Traditional storage arrays are falling out of favor, replaced by scale-out software-defined storage. By spreading the functions of a storage array across many independent – or more accurately, interdependent – nodes, scale-out storage systems are inherently network-dependent.
Furthermore, scale-out storage can take the form of either hyper-converged infrastructure, with compute, storage, networking and virtualization all integrated in one chassis to ease management, or a shared storage model such as NVMe over Fabrics that allows compute and storage to scale independently. In either case, a fast network is essential to enable remote storage access at the same great speed as local storage.
Thanks to efficient networking technologies that not only provide 25/40/50/100 Gb/s, and soon 200 Gb/s, of throughput at sub-microsecond latency, but also efficient transports such as RDMA (Remote Direct Memory Access) and RoCE (RDMA over Converged Ethernet), the potential of the new generation of storage technologies can be fully unleashed.
SDxCentral: Can you explain how open networking and disaggregation are likely to influence infrastructure in the coming years?
Goldenberg: Open networking and disaggregation are examples of infrastructure composability, and that is exactly the direction modern networking is heading. At Mellanox, we call it Open Composable Networks, which is based on:
- Network disaggregation (into modular components)
- Open and unified abstraction interface between these components
- Automated composition and orchestration, with efficient middleware that can pool these abstracted resources and orchestrate them on demand.
Network disaggregation was first envisioned and utilized by the hyper-scale web and cloud service providers, but has rapidly expanded its influence to enterprises in many industries. This has allowed enterprises to take control and easily customize their own network infrastructure. It is the first step to enable organizations to get their network infrastructure done right to support ephemeral cloud-native workloads, and DevOps practices to achieve Web-Scale IT efficiency.
With this architecture, applications can easily and quickly compose and recompose network resources for optimal performance. Once networks are disaggregated into functionally independent modules with open, clearly defined abstraction interfaces, you are free to leverage best-of-breed building blocks to compose a network infrastructure tailored to your applications' unique requirements. Only with this new generation of Open Composable Networks can you leverage the combined wisdom of the larger networking ecosystem, completely eliminate vendor lock-in, and unleash your own innovation.
An excellent example of this is what we demoed at the Open Compute Project (OCP) Summit in early 2016, where we proved interoperability of five of our Spectrum switches running five different switch operating systems or routing packages, including Microsoft SONiC, Cumulus Linux, OpenSwitch, MetaSwitch and MLNX-OS. This is made possible through community-defined API frameworks, including the Switch Abstraction Interface (SAI) in the OCP community and switchdev in the Linux community. Mellanox is a major contributor to both sets of APIs.
SDxCentral: What is intelligent offload technology and how does it work?
Goldenberg: Legacy server network I/O cards (or NICs) can normally perform only some very basic stateless offloads, such as checksum calculation, scatter-gather, and TCP segmentation offload, on well-known packet formats and protocols. These offloads can significantly improve I/O performance and reduce CPU overhead, but newly introduced protocols such as VXLAN and GENEVE often render them useless. In addition, the CPU always needs to be involved in stateful packet processing, and this becomes impractical as server connectivity speeds go from 10G to 25/50/100G, and/or in NFV scenarios where a significant portion of the traffic consists of small packets.
Starting with the ConnectX®-4 series of NICs, Mellanox supports accelerated virtual switching in server NIC hardware through the ASAP2 (Accelerated Switching And Packet Processing) engine. With a pipeline-based programmable eSwitch (embedded switch) built into the NIC, ConnectX-4 can handle a large portion of packet-processing operations in hardware. These operations include VXLAN encapsulation/decapsulation, packet classification based on a set of common L2-L4 header fields, QoS, and Access Control Lists (ACLs). Built on top of these enhanced NIC hardware capabilities, ASAP2 provides a programmable, high-performance and highly efficient hardware forwarding plane that works seamlessly with the SDN control plane, overcoming the performance and efficiency degradation associated with software virtual switching implementations.

Now you might be wondering whether you need to write vendor-specific code to take advantage of ASAP2; the answer is no. The beauty of our design is that the offload and acceleration capabilities are exposed through a set of open, standard APIs, so there is no risk of vendor lock-in. This is the type of intelligent offload we are talking about here, and it is just the starting point. Mellanox has also announced the FPGA-based Innova and multi-ARM-core-based BlueField NICs, which are even more intelligent and more easily programmable. They move beyond basic switching/routing, allowing advanced network and security features such as IPsec, TLS and Deep Packet Inspection (DPI) to be offloaded to NIC hardware, further enhancing performance and reducing CPU overhead.
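To make the eSwitch model concrete, here is a minimal software sketch of pipeline-style match-action classification over L2-L4 header fields. The field names, table entries, and actions are simplified stand-ins for illustration; they are not Mellanox's actual hardware interface or rule format.

```python
# Simplified match-action table, illustrating the kind of L2-L4
# classification an embedded switch pipeline performs in hardware.
# Field names and actions are illustrative only.

FLOW_TABLE = [
    # (match fields, action)
    ({"ip_proto": 6, "dst_port": 443}, "forward:vf1"),    # TCP/443 to VF 1
    ({"ip_proto": 17, "dst_port": 4789}, "decap_vxlan"),  # VXLAN-encapsulated
    ({}, "send_to_software"),                             # table miss: punt to CPU
]

def classify(packet_headers):
    """Return the action of the first entry whose fields all match."""
    for match, action in FLOW_TABLE:
        if all(packet_headers.get(k) == v for k, v in match.items()):
            return action
    return "drop"

action = classify({"ip_proto": 6, "dst_port": 443, "src_ip": "10.0.0.5"})
```

The win described in the text is that matched flows never touch the CPU at all; only table misses (the last entry here) are punted to software, which is what keeps the host cores free at 25/50/100G line rates.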
As Moore’s Law and CPU performance improvements slow down, we will compensate by driving more intelligence into the network subsystem, including the NICs and switches, and that will ultimately contribute to boosting total infrastructure efficiency.