“Hadoop, This is Your Data Warehouse Calling”

The phone rang, so he answered it.

“Hello? This is Hadoop.”

“Hi, my name is Dee Dubya, and I’m calling on behalf of our company’s enterprise data warehouse ecosystem team. There have been reports of an elephant in the…”

“An elephant? I’m an elephant. Are you referring to me?” Hadoop asked.

“I don’t know, Hadoop. That’s why I’m calling,” Dee Dubya continued. “There have been reports of an elephant in our data center, and in several developers’ cubicles, and even in the closets. They say they’re working on secret ‘shadow IT’ projects in there, whatever that means. Does any of this sound familiar?”

Hadoop laughed. “Sure, I’ve been in all those places. Even in the closet. Is there a problem, Dee?”

“I don’t know, Hadoop. That’s why I’m calling,” Dee Dubya said again. Hadoop was now starting to get annoyed.

“So what do you know, Dee?” Hadoop asked. “Now that we’ve established that I’m the elephant in the data center, and the cubicles, and the closets, what is the issue here? Why are you calling?”

“We just want to make sure that you’re not stepping on any toes,” Dee Dubya replied. “I mean, you’re a big dude and all. Don’t they call you the big data dude?”

“Yes, I’m one of the big data dudes,” Hadoop said. “But I work with a team of dudes and dudettes, like Pig, Mahout, Hive, Oozie, Sqoop, HBase, Chukwa, Tez, and Spark. Oh, and we can’t forget ZooKeeper, who…”

Dee Dubya interrupted, “Can I just ask – why are you and your buddies here? What are you hoping to accomplish?”

“First of all,” Hadoop began, “you need to understand that we’re here because we were invited. We’re what you call ‘open source’ so folks can access us for free and utilize us how they see fit. And I know your next question: What have we been asked to do so far? Well, let me tell you.

“For starters, we’ve provided a great storage alternative for your IT guys. They love us. I’ve been able to help them store all sorts of crazy data – from transactional data to images to web logs and more – at a fraction of the cost they’re used to paying. I run on low-cost commodity hardware, so it’s no big deal to scale out when the time comes. I’m saving these guys bundles—plus giving them a viable long-term storage solution. Archive to infinity and beyond!”

Dee Dubya was not amused. “I’m not sure why they asked you to help them with storage. They could have come to us.”

“Dee, they could have and they considered it,” Hadoop said. “But when we ran the numbers, it showed that for some of your relational systems, they could save up to 20X with my storage system. It’s hard to walk away from cost savings like that.”

Dee Dubya said nothing.

“Another way we’ve been able to help your folks out is by providing additional processing power, again at a fraction of the cost,” Hadoop continued. “One of the issues your folks were running into was not being able to keep up with all the data that needed to be processed and moved to your data warehouse. They were also taxing your warehouse with constant updates, and it was becoming too costly and time-consuming to continue using their existing tools.”

Hadoop went on, “So I showed them MapReduce, one of my batch processing engines, and Spark, a faster in-memory engine, and we’ve been able to offload a lot of the processing work they were asking your operational and warehouse systems to do. My motto is: Leave the storage and processing to me so that your relational systems can do what they do best.”

Dee Dubya was dumbfounded. It felt like a gang (a friendly one, she hoped) had just moved into her organization and taken over some of the work and services she had been providing to folks for years.

Dee Dubya had a choice to make: either fight back and protect her turf or figure out how to work together with Hadoop and his team and better serve the organization.

“How else have you been helping out?” Dee Dubya asked Hadoop, with a bit of hesitation.

“Well, now it’s starting to get fun!” Hadoop replied. “Sure, the IT guys are giddy about the storage stuff, and processing data faster is great, but the real value is all the data your business folks now have access to. We’ve been able to process new data sources – like social media, mobile and GPS data – with some of your existing customer data. It’s providing the business with new opportunities for data exploration, analysis and decision making.

“The way I like to explain it is this,” Hadoop said excitedly. “With traditional data, folks have trained themselves to define their business questions, or hypotheses, upfront, and then go to the data to find their answers. In this new world of big data, however, folks don’t have to know the business questions upfront. They may know them, which is fine, but the point is that they can now go to this new, big data to not only find their answers, but also discover the questions they never knew to ask before. It’s absolutely mind-blowing—and we’re just getting started!”

“You certainly get amped up about all this, don’t you, Hadoop?” Dee Dubya replied, still trying to process everything she was hearing.

“Indeed I do! And once I can share with you all the possibilities of bringing our two worlds together, you’ll be excited, too. I promise! I bet we’ll even be able to move mountains together!” Hadoop exclaimed.

“Maybe you could move a mountain. I’m not an elephant.” 

Hadoop laughed. “Touché.”