Why Data Diversity Is a Crucial Success Factor

March 1st, 2018

What has really made data-driven business possible is the vast increase in the amount of data we are able to generate, store and examine, thanks to the IoT and our always-online world.

Of course, businesses using information to make better decisions is nothing new. What’s different these days is the scale of available information – the volume of data generated by the world’s businesses doubles every 1.2 years – and the opportunities it provides to innovate.

Businesses that build their operations on their ability to collect and use data are disrupting industries every day. Amazon, Uber and Airbnb all grew by operationalizing big data generated through their own core activities in the markets they have individually dominated. Data ranging from website clickstreams to vehicle fuel consumption is used to tailor services to customer demands and to streamline business processes.

Now, though, innovators seeking out the new cutting edge are looking beyond the data immediately available from their primary activities and operations.

When it comes to data, as with all other aspects of business, diversity is critically important. This is true in a meta-sense, as non-representative data sets are less likely to yield workable insights than those which cover all facets of the issue under investigation. It’s also true in terms of the variety of data available.

Variety has always been one of the fundamental “V’s” of Big Data, alongside volume, velocity, and various others that have been added over the years. Today, with the sheer variety of datasets available, it’s more critical than ever, as insight can often be found in unexpected places.

To launch a truly diverse data-driven strategy, the key is to think beyond the data that an organization already has readily available, or that would be simplest for it to collect. Thanks to breakthroughs in technologies such as image analysis and natural language processing, meaning can be extracted automatically from video, handwriting, recorded speech and the text of emails and social media posts.

And in marketing, advertisers are developing methods of better understanding their customers’ lives and habits, by analyzing how, when and where their products are talked about, photographed and posted to social media.

This messy, scrambled-up, unstructured data in fact makes up over 90% of the data generated worldwide. The rest is nice, orderly structured data, often generated by machines talking to each other and writing logs. Structured data is made up of numbers that can easily be slotted into charts and tables and analyzed with simple mathematics. As well as making up the majority of our data by volume, unstructured data very probably holds the majority of the so-far undiscovered insights. Getting at them isn’t always easy, but the rewards for those with the initiative and imagination to try are potentially enormous.
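
To make the difference between unstructured and structured data a little more concrete, here is a minimal Python sketch of turning free-text posts into rows that could be slotted into a table. Everything in it is an assumption made for illustration: the sample posts, the product name "AcmePhone" and the tiny positive and negative word lists are all hypothetical, and real natural language processing is far more sophisticated than counting keywords.

```python
import re
from collections import Counter

# Hypothetical sample posts; a real project would read these from a feed or export.
POSTS = [
    "Loving my new AcmePhone, the camera is great!",
    "AcmePhone battery died again... terrible.",
    "Just had coffee. Great morning.",
]

# Hypothetical keyword lists, kept deliberately tiny for the example.
POSITIVE = {"great", "loving", "love"}
NEGATIVE = {"terrible", "died", "awful"}


def to_record(post, product="acmephone"):
    """Turn one free-text post into a structured row of simple counts."""
    # Lowercase the text and split it into words, dropping punctuation.
    words = re.findall(r"[a-z']+", post.lower())
    counts = Counter(words)
    return {
        "mentions_product": product in counts,
        "positive_words": sum(counts[w] for w in POSITIVE),
        "negative_words": sum(counts[w] for w in NEGATIVE),
        "word_count": len(words),
    }


# One structured row per unstructured post.
records = [to_record(p) for p in POSTS]

for post, record in zip(POSTS, records):
    print(post, "->", record)
```

Each resulting row holds only numbers and flags, which is exactly the kind of data that can be charted or summed with simple mathematics. The hard part in practice is the step this sketch glosses over: deciding what to extract and doing so reliably across messy, real-world text.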