In the tech world, Big Data and Hadoop are all the rage. You can tell the marketing folks aren’t really sure what it all means yet, but know they want to be a part of it. It’s like the huge party you heard about on the other side of town — even if you haven’t been there yet, you know you have to get there.
So let’s pay attention. There were some interesting developments in Big Data at O’Reilly’s Strata + Hadoop World in New York City this week (doesn’t that sound like some kind of new restaurant chain?).
First, let’s cover the basics of Hadoop in case some of you don’t know what’s going on. Strata itself published a nice explanation of Hadoop by Mike Olson, CEO of Cloudera, a software company focused on Hadoop.
Hadoop is an open-source framework for storing and processing large amounts of data across clusters of machines. Strictly speaking it isn’t a database, but it is the software plumbing at the heart of the “Big Data” movement. It was created by Doug Cutting and Mike Cafarella, drawing on ideas Google had published about its own infrastructure, and much of its early development happened at Yahoo. Because it’s open-source, anybody can add to it. And of course, everybody is trying to develop their own “distribution.” In other words, everybody wants to put their own spin on it.
Clearly, the Google factor has had an impact on Hadoop’s success. But there are other reasons. Some of the main reasons that Hadoop is popular, as far as I can gather, are the following:
1) It’s architected to handle large volumes of diverse data. As we know, data is growing massively. And gleaning information from that data has become really important to all businesses.
2) It is “distributed” software, meaning that it can run on many machines at the same time. That makes it a good fit for scaling cloud-based applications.
3) It’s grown to critical mass, both from a use and a marketing perspective. This is important in an open-source environment because it leverages the community.
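To make the “distributed” point a little more concrete, here is a toy sketch of MapReduce, the processing model Hadoop is best known for, applied to a simple word count. This is plain Python running on a single machine, not Hadoop’s actual API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. The idea is that each chunk of text could be mapped on a different machine, with the results grouped and summed afterward.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(chunks):
    # In a real cluster each chunk would be mapped on a separate node;
    # here we simply loop over the chunks one after another.
    mapped = []
    for chunk in chunks:
        mapped.extend(map_phase(chunk))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two chunks standing in for data split across two machines.
chunks = ["big data big party", "hadoop big data"]
counts = word_count(chunks)
print(counts)
```

Because the map step looks at each chunk independently, adding more machines lets you chew through more chunks at once, which is roughly why this style of software scales so well.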
Okay, so much for my layman’s attempt to understand the Hadoop phenomenon. Let’s move on to the product news, because in the tech world, it’s all about marketing.
Hadoop has become “cool,” and tech pundits around the world now need to pepper their language with “Hadoop” and “Big Data” just to prove they are hip (how am I doing?).
As the companies converged on the Strata + Hadoop World conference this week, my partners at CMSWire.com jumped all over the news. Here’s a roundup of what we saw:
Big Data Myths: 3,000 people converged on NYC to examine the Big Data phenomenon. What did they find? Many truths and myths. Read Noreen Seebacher’s summary of Big Data Myths and Customer Insight.
Cloudera Broadens its Mission: As I said, the first step is to leverage the Hadoop angle in your marketing. Cloudera has done that well, by becoming one of the leading startups linking itself to Hadoop. But what’s next? In a big move, Cloudera is now broadening its mission, saying it wants to be a data-management company. CMSWire Hadoop expert Virginia Backaitis jumped on the news.
Microsoft Gets its Hadoop On: Like it or not, Microsoft (MSFT) is a big player in cloud services and data. Just think about all of the data flowing through its enterprise applications, including its Azure cloud services and SQL Server databases. Microsoft introduced Windows Azure HDInsight, its own Hadoop distribution (or “distro” if you want to sound cool) based on the Hortonworks Data Platform (HDP).
EMC Goes Hadooping with HSK 2.0: Maybe we can come up with a new term for this trend of launching new Hadoop products tied to your existing products. I kind of like the sound of “Hadooping.” Microsoft wants to integrate Hadoop into Azure, and EMC (EMC), another tech giant, wants Hadoop to be a part of its Isilon storage division. To this end it has introduced Hadoop Starter Kit (HSK) 2.0. You can follow that story here.
As you can see, this is just the start of what’s likely to be a long sequence of evolving tools and products in the Hadoop world. Everyone in the world is Hadooping these days.