Democratizing Data Part 2: Data’s Hidden Monopoly

This is the second in a series of posts investigating Big Data democratization. We will look at the market forces in the Big Data industry. We will also examine the internal changes businesses must make to leverage the opportunity to the hilt. Then we will review the tech layer of this story, especially the pressure and opportunity of open source. Finally, we’ll look at the impact on the public, not just collectively but on an individual level as well, and examine where you and I might fit into the data awakening. Trust us, we’ll be fair. In all honesty, some may prefer the snooze button.

Recall the film Annie Hall, where Alvy Singer finds himself in a debate with a man standing in line over Marshall McLuhan’s work. The man in line is loudly pontificating on the writer’s work in ways that, to Alvy, are way off base. A short debate ensues, and Alvy plays his trump card at the end – Mr. McLuhan himself – who dismisses the man-in-line’s position as a great misinterpretation of his [McLuhan’s] work.

“Boy, if life were really like this,” says Singer, and we all agree.

No matter the field, there are always well-versed ringleaders, call them legacy organizations or professions, leading the way – oftentimes with subjective advice on best practices that don’t actually work for everyone. The New York Times is a great example of this, given that the newspaper’s business practices are highly regarded across industries.

In spring 2014, The New York Times hired a data scientist, a CDO (chief data officer) of sorts, to help the organization with predictive analyses that neither the CIO nor the CMO position was capable of doing at the time. And that issue, the merging of the CIO and CMO roles, is affecting every single organization that uses digital media in any way – across the board. The New York Times, along with municipalities including L.A., New York City, Chicago and more, addressed this issue with a new hire – one dedicated to the data, not to IT or inbound analytics.

So here’s the deal: today’s enterprise has this crazy conflict going on. Essentially, it’s the battle for control of data, and man, is it interesting to watch both legacy and startup brands figure out how best to bring their disparate data sources together.

Is it the CIO, CMO or a new CDO hire who will best bring disparate data sources together for the entire company? What is causing this disparate data and thus confusion to begin with? What is the best way to allocate roles and responsibilities in this new data-driven economy? Who, ultimately, has control of a company’s customer/audience/fan data?

To Which Role Does Data Belong?

To answer these questions, let’s look at the types of data companies are handling. For example, most would agree that data around finance, logistics, inventory, SKU information and all matters of security – what we generally think of as the digital operations of an enterprise – would sit in the purview of the CIO. Meanwhile, email lists, customer information (CRM), web analytics, advertising performance, etc. would rest under the CMO. This outline is how things are generally organized.

Yet, this pre-big data organization can quickly melt into wild chaos. Things begin to get muddy when someone asks a question that crosses the pre-set role lines. For instance, “What is the most popular piece of content for repeat customers that purchased SKU 2345 in the last 30 days from any one of these 10 email campaigns?”

This question is reasonable, but it has just created chaos in a highly segmented organization in which CIOs and CMOs work off different, disparate data sources, each with different goals in mind.
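To make the cross-silo nature of that question concrete, here is a minimal Python sketch of what answering it involves once both sides’ data sit in one place. Everything in it is an assumption for illustration: the record layouts, the campaign IDs, the sample rows, and the reading of “repeat customer” as anyone with more than one order on file. It is not a description of any particular product or of how a given company stores its data.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical "CIO side" data: order records from an operations system.
ORDERS = [
    {"customer": "c1", "sku": "2345", "date": date(2015, 1, 20), "campaign": "email-03"},
    {"customer": "c1", "sku": "1111", "date": date(2014, 11, 2), "campaign": None},
    {"customer": "c2", "sku": "2345", "date": date(2015, 1, 25), "campaign": "email-07"},
    {"customer": "c2", "sku": "2345", "date": date(2014, 12, 1), "campaign": None},
    {"customer": "c3", "sku": "2345", "date": date(2015, 1, 22), "campaign": "email-03"},
    {"customer": "c4", "sku": "2345", "date": date(2015, 1, 28), "campaign": "social-01"},
    {"customer": "c4", "sku": "9999", "date": date(2015, 1, 5), "campaign": None},
]

# Hypothetical "CMO side" data: (customer, content) views from web analytics.
CONTENT_VIEWS = [
    ("c1", "how-to-guide"),
    ("c1", "pricing-page"),
    ("c2", "how-to-guide"),
    ("c3", "pricing-page"),
    ("c4", "pricing-page"),
    ("c4", "pricing-page"),
]


def most_popular_content(orders, views, sku, campaigns, today, window_days=30):
    """Return (content, view_count) for the top content among qualifying buyers."""
    cutoff = today - timedelta(days=window_days)

    # Assumption: a "repeat customer" has more than one order on record.
    order_counts = Counter(o["customer"] for o in orders)
    repeat_customers = {c for c, n in order_counts.items() if n > 1}

    # Operations-side filter: right SKU, recent enough, from a listed campaign.
    buyers = {
        o["customer"]
        for o in orders
        if o["sku"] == sku
        and o["date"] >= cutoff
        and o["campaign"] in campaigns
        and o["customer"] in repeat_customers
    }

    # Marketing-side tally: content views restricted to those buyers.
    tally = Counter(content for customer, content in views if customer in buyers)
    return tally.most_common(1)[0] if tally else None


campaigns = {f"email-{i:02d}" for i in range(1, 11)}  # ten hypothetical campaigns
print(most_popular_content(ORDERS, CONTENT_VIEWS, "2345", campaigns, today=date(2015, 2, 1)))
# prints ('how-to-guide', 2)
```

Note that across all visitors in this toy data the pricing page has the most views; the answer only flips to the how-to guide once the order-side filters are applied. That join between the two silos is exactly the step that lands on someone’s desk as a manual report.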

Now, who is responsible for gathering that report? How is its priority ranked against other requests? How do we even know this is the right question? And, for heaven’s sake, what crazy human is asking all these questions and making a C-suite employee run all these spreadsheets?

Janitor Work is Monopolizing Data-Minded Employees 

The CIO and CMO aren’t the ones producing those spreadsheets, nor are they collecting the disparate data needed to fill those spreadsheets with informative or actionable information. No, engineers, data scientists and data analysts are doing that. And in the process, organizations have essentially turned these employees into a bunch of list-running crazies in the engine room while Capt. Kirk and Spock just wait it out.

Guys, there’s a fire in the engine room and we’re givin’ it all she’s got.

See, data scientists are doing far too much work creating pulls of data, making lists and running reports on what happened in the past. This is annoying and frustrating work. Frustrating because these guys and gals would rather be writing – and some lucky ones do – seriously crazy predictive analytics models that anticipate future behaviors and could actually make the business exponentially more efficient using actual data from the company’s consumers.

Right now, though, we’re drowning in these damn lists.  

According to the New York Times, data scientists spend from 50% to 80% of their time mired in this more mundane labor of collecting and preparing big data, before it can be explored for useful nuggets.

“Practically, because of the diversity of data, you spend a lot of your time being a data janitor, before you can get to the cool, sexy things that got you into the field in the first place,” said Matt Mohebbi, a data scientist and co-founder of Iodine, in an interview with the New York Times.

It’s Time for a Software Solution – Otherwise Known as Data Democratization

This data janitorial work is a software problem, not necessarily a “need-a-new-hire” problem. See, big data software that democratizes data across an organization does exactly what these data scientists are doing every day – without the high pay, and without boring these data pioneers with unnecessary labor.

Here, the issue is bridging the delta between those in the command center – your data scientists – and those working to increase performance metrics within a swift-moving and ever more rapidly changing marketplace – your CIOs and CMOs. To do this, software must unify disparate data sources with little help, if any, from the organization’s data scientists, and present that now-unified data visually enough that data literacy is easy to acquire for employees no matter their role.

In essence, with data democratization software, the business side of an enterprise can ask its own questions of the data and get its own answers without needing tech help. Better yet, it can act on those insights in real time, without having to wait for a report to be run, data to be analyzed and someone to be paid.

Let’s end the data janitorial work for our data scientists and let them do the cool, sexy parts of their jobs day in and day out. Their predictive models will amplify a company’s efficiency and the accuracy of its forecasts. For the everyday big data questions that produce ROI, brand loyalty and customer engagement, let the business side work its magic, find unknown adjacencies and increase the bottom line.

Umbel can help. Check out a demo to see how.