MemVerge, a startup that’s building software to converge memory and storage, today announced it has raised $24.5 million in a Series A funding round and will launch a beta of its product in June.
The startup’s first round of funding was led by Gaorong Capital, Jerusalem Venture Partners, LDV Partners, Lightspeed Venture Partners, and Northern Light Venture Capital.
MemVerge CEO and co-founder Charles Fan, a former VMware and Dell EMC executive, said that the company will use the majority of the funding on product development and R&D. The remainder, he said, will be used to build a sales and marketing team starting in the second half of this year.
Fan co-founded the startup with Shuki Bruck, a professor at the California Institute of Technology (Caltech), and Yue Li, a postdoctoral scholar at Caltech. Bruck and Fan’s history goes back more than 20 years, to when Bruck was Fan’s Ph.D. advisor. The pair also co-founded Rainfinity, which was later acquired by Dell EMC. Bruck also advised Li during his postdoctoral work at Caltech.
The three formed the company two years ago anticipating the launch of new data center hardware that would be “transformative to the data center infrastructure service industry,” said Fan.
“Just hardware alone is not enough,” he said, which is why MemVerge’s product is software it has christened Memory Converged Infrastructure. “This is essentially a distributed software system support for both storage and memory APIs to the applications so we can take full advantage of the hardware in our layer.”
MemVerge’s infrastructure has been in alpha and the company says it will hit beta in June.
The need for this infrastructure comes from the boundaries between memory and storage.
“In the history of computing you have memory and storage — memory is for your running programs, storage is for when you need to keep things for longer, and as a consequence you deal with I/O [input/output],” said Fan. “That is why you move memory to storage and from storage to memory, and that slows things down.”
When you converge memory and storage, as MemVerge’s software does, it negates the need for I/O, thus greatly improving the speed of big data workloads. “I think it’s a revolution in the history of computing,” he said.
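The concept Fan describes can be sketched with an ordinary memory-mapped file, which is a rough software analogue of persistent memory (this is an illustrative example, not MemVerge’s actual API): once durable data is mapped into the address space, the program updates it with plain memory operations instead of shuttling bytes between memory and storage with explicit read/write I/O calls.

```python
import mmap
import os
import tempfile

# Illustrative only: a memory-mapped file stands in for persistent memory.
path = os.path.join(tempfile.mkdtemp(), "data.bin")

# Conventional path: data crosses the memory/storage boundary via I/O calls.
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

# "Converged" path: map the file into the address space and update it with
# ordinary memory stores; no read()/write() calls in the hot path.
with open(path, "r+b") as f, mmap.mmap(f.fileno(), 4096) as m:
    m[0:5] = b"hello"  # looks like a memory write
    m.flush()          # persistence point (analogous to a pmem flush)

# The update is durable: reading the file back sees the in-memory store.
with open(path, "rb") as f:
    assert f.read(5) == b"hello"
```

With real persistent memory the flush is a CPU cache-line operation rather than a page writeback, which is where the sub-microsecond latencies Fan cites come from.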
Beyond the I/O bottleneck, Fan cites two other pain points: memory size becomes a bottleneck as data-centric applications become increasingly memory-centric, and while there are open source innovations, they are not easy for enterprises to deploy and operationalize. And while persistent memory can unlock new use cases, it requires software to take full advantage of it.
An Intel-Based Optane Infrastructure
According to Fan, while there are a number of pain points that MemVerge addresses, the triggering event was the shipping of Intel’s Optane DC Persistent Memory. Intel launched a beta for this last October.
Intel has been working jointly with Micron Technology on this technology for over a decade. And in January, Navin Shenoy, EVP of Intel’s data center group, told SDxCentral this persistent memory was “truly groundbreaking.”
MemVerge claims its proprietary distributed memory objects technology provides a convergence layer with sub-microsecond response time that delivers up to 10 times the memory size and 10 times the data input/output speed compared with conventional solutions.
Fan said that while this hardware holds a lot of promise, none of the existing distributed storage software will work on this new Intel data center media. “This new media is 400 times faster than the fastest media before and if you just lay existing software on top of it, the software will immediately become the bottleneck. 95 percent of the latency will happen in the software — that removes the benefit of this hardware,” he said.
Fan also pointed to two gaps in the Intel hardware itself: while it offers a lot of memory capacity, no existing system lets companies extend that memory beyond a single node, and the hardware supports only three modes of use, namely memory, storage, or an application rewritten to use it as both. Both are problems that MemVerge’s software can solve, according to Fan.
MemVerge Memory Converged Infrastructure will be delivered two ways: it can come as an appliance with the software already deployed inside, or the software can be licensed to run on a company’s existing hardware.
The hardware appliance is Intel Cascade Lake-based — which is Intel’s latest Xeon Scalable processor. Fan claims these computing platforms have up to six terabytes of persistent memory that can be used as system memory, up to 360 terabytes of raw storage space, and can scale up to 128 of these appliances in one cluster.
“So essentially in the same cluster, you basically have an integrated system of up to 768 terabytes of memory, up to 50 petabytes of storage — all at the same time,” he said.
Use Cases and Customer Base
There are two use cases that Fan currently sees as applicable for the infrastructure: AI model training and modern data warehouses.
Currently, he said, it is working with an internet service provider to improve the performance and speed of AI training. In alpha, he claims training speed has improved sixfold, and data load speed has increased 350 times compared to the existing storage.
The second use case it is testing in alpha is a type of “data warehouse where unstructured data is running on Spark SQL (for a cloud service provider), which has a high requirement on memory and also processes data fast,” Fan said. By adding the MemVerge appliance, the provider was able to accelerate the cluster five times for storage and seven times for RDD caching, and to increase the elasticity of Spark so it fits better with a cloud model, he said.
In addition to internet and cloud service providers, Fan said he expects to generate interest from enterprise and service providers that deal with large amounts of data and have high requirements for processing data quickly.