SendGrid, an email processing company, wanted to run more of its workloads on Amazon Web Services (AWS). Rather than just doing a “lift and shift” to move those workloads, it decided to change its on-premises architecture to account for the differences in how those workloads would be handled in AWS.
To give some sense of the scale of SendGrid’s operations, the company sends more than 30 billion emails per month. In addition to delivering email marketing campaigns, it also sends transactional emails. For example, customers who take a ride with Uber will get an email receipt sent through SendGrid. The Denver-based company has over 55,000 customers.
Since its founding in 2009, the company has run its operations on premises in a private cloud and has also worked with public cloud providers. But it wanted to move more of its workloads to AWS. In September, it announced that both its email application programming interface (API) and its marketing service are now available on AWS Marketplace.
But as part of the process of working with AWS, SendGrid needed to update its infrastructure.
“If you think about email, it’s not a new technology,” said J.R. Jasperson, SendGrid’s chief architect. “The traditional way email infrastructure has been developed has not been a great fit for a cloud-native architecture. About a year-and-a-half ago, we began to re-architect to be a better fit to leverage public cloud.”
State vs. Compute
Jasperson explained that when an email is ready to be delivered it’s not necessarily an instantaneous transaction. It could be scheduled or deferred, or there could be some reason it doesn’t go through immediately. “We retry for a period of time,” he said. “That implies the system behind the scenes is stateful. The email is being stored on the machines that are doing that. But stateful compute is not a workload that is efficient for the transient nature of a public cloud. It is compute-intensive to process email.”
It’s expensive to pay for compute resources in a public cloud that are simply storing emails for as long as 120 hours. SendGrid didn’t want to pay for compute that it wasn’t actively using, so it needed to re-architect to decouple state from compute. It wanted its internal system to be modeled after the more modern Amazon S3 (Simple Storage Service) or Amazon EBS (Elastic Block Store).
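To make the idea of decoupling state from compute more concrete, here is a minimal sketch in Python. It is not SendGrid’s code: the class and function names, the in-memory store standing in for a durable external one, and the fixed 15-minute retry delay are all assumptions made for illustration. Only the 120-hour retention window comes from the article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

MAX_RETENTION_SECONDS = 120 * 3600  # "as long as 120 hours", per the article
RETRY_DELAY_SECONDS = 15 * 60       # assumed fixed retry interval (hypothetical)


@dataclass
class DeferredEmail:
    message_id: str
    recipient: str
    body: str
    first_seen: float    # when the message entered the system (seconds)
    next_attempt: float  # earliest time the next delivery attempt may run
    attempts: int = 0


class InMemoryStateStore:
    """Hypothetical stand-in for a durable store that lives outside the workers."""

    def __init__(self) -> None:
        self._items: Dict[str, DeferredEmail] = {}

    def put(self, email: DeferredEmail) -> None:
        self._items[email.message_id] = email

    def delete(self, message_id: str) -> None:
        self._items.pop(message_id, None)

    def due(self, now: float) -> List[DeferredEmail]:
        # Return a new list so callers can modify the store while iterating.
        return [e for e in self._items.values() if e.next_attempt <= now]

    def __len__(self) -> int:
        return len(self._items)


def process_due(
    store: InMemoryStateStore,
    send: Callable[[DeferredEmail], bool],
    now: float,
) -> Dict[str, int]:
    """One stateless worker pass: all state is read from and written to the store."""
    counts = {"delivered": 0, "deferred": 0, "expired": 0}
    for email in store.due(now):
        if now - email.first_seen > MAX_RETENTION_SECONDS:
            # Past the retention window: give up on the message.
            store.delete(email.message_id)
            counts["expired"] += 1
            continue
        email.attempts += 1
        if send(email):
            store.delete(email.message_id)
            counts["delivered"] += 1
        else:
            # Reschedule; the worker itself holds nothing between passes.
            email.next_attempt = now + RETRY_DELAY_SECONDS
            store.put(email)
            counts["deferred"] += 1
    return counts
```

Because the worker keeps nothing between passes, compute instances in a design like this could be started and stopped as demand changes, while the deferred messages wait in storage that is billed separately from compute.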
Jasperson said, “To re-architect we wanted to tease apart the things that were CPU intensive and the parts that maintain state. There were a number of things we needed to do. Some was amending code — the secret sauce of our deliverability. Many of these components inside SendGrid were custom software.”
The company’s re-architecture also involved moving away from some hardware components. “It’s generally a good idea anyway,” said Jasperson. “There’s a big move away from hardware load balancers and using software to re-direct traffic. The changes we’ve made on premise happen to fit in with what a cloud workload would look like. We are pulling logic that was formerly in an ASIC into software.”
SendGrid’s re-architecture “simultaneously solves some scalability and future pain points and also changes the software so it will run more efficiently in the cloud,” said Jasperson. “If we hadn’t done this re-architecture, a lot of stuff would be off the table. This gives the option to move these workloads to the cloud. We expect we’ll be doing a lot of that going forward.”
As a side note, SendGrid earlier this month filed a registration statement with the U.S. Securities and Exchange Commission (SEC) relating to the proposed initial public offering of shares of its common stock.