Misconceptions About The BIG DATA Concept

What is BIG DATA?

Big data is a collection of huge amounts of data, structured or unstructured. An organization can put this raw data to many uses, such as drawing insights from it and making better decisions and strategic business moves in the future. But that is not all: big data is going to change the world, and everything and everyone will be affected by it. In this article I will try to clear up the doubts and misconceptions about the BIG DATA concept.

The world is busy gathering and analyzing huge amounts of data. The technology to harness this data is helping us (humans) understand more about the world and everything in it. Analyzing it can help in many areas, such as predicting human behavior and patterns, decoding human DNA in minutes, preventing terror attacks, finding cures for diseases, and driving innovations and inventions.

For example, a retail company can analyze your past buying patterns, scan your social media accounts for needs and preferences, check its stock list, track your cell phone location and look at the weather outside, all within seconds, and then send you a voucher matching your preferences when you are within a certain radius of the store. Sounds spooky, right? But yes, all this is possible with datafication. Datafication is driven by a number of things, such as the digitization of music, videos and books, and it is made possible by the rise in internet usage.
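The "within the radius of the store" step in the retail example is simple to picture in code. Here is a minimal Python sketch, assuming the retailer already has the customer's and the store's coordinates. The function names, the 0.5 km radius and the sample coordinates are made up for illustration; this is not how any particular retailer does it.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def should_send_voucher(customer, store, radius_km=0.5):
    """True when the customer is within `radius_km` of the store (assumed rule)."""
    return distance_km(*customer, *store) <= radius_km


# Hypothetical (latitude, longitude) pairs
store = (18.5204, 73.8567)
nearby_customer = (18.5210, 73.8570)
faraway_customer = (19.0760, 72.8777)

print(should_send_voucher(nearby_customer, store))
print(should_send_voucher(faraway_customer, store))
```

A real system would combine this check with the buying history and preference data described above before deciding which voucher to send.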

In the early days, we used to buy hard copies of books and novels, but nowadays devices like the Kindle let us have any book or novel in AZW or other document formats. These devices analyze each reader’s profile for preferences and needs and suggest books and novels accordingly. The same goes for music, movies, clothes etc. Have you ever wondered why, while using the internet, we receive suggestions similar to what we were searching for? It is for the same reason: the devices record our needs and preferences, fetch similar data and suggest it back to us. Now imagine all this with billions of searches performed daily, 500+ million tweets on Twitter per day and almost 75 hours of video uploaded to YouTube every minute.

According to Eric Schmidt, Google’s executive chairman, humankind generated five exabytes of data from the dawn of civilization to the end of 2003. But as per the computer giant IBM, 2.5 exabytes of data were generated every day in 2012, and the pace is accelerating.

This data comes in different formats such as text, video, web search logs, sensor data, financial transactions and credit card payments. All of it can be described by the four V’s of Big Data: volume, velocity, variety and veracity.


Volume refers to the massive amount of data generated every second. According to IBM, an estimated 2.5 quintillion bytes of data are generated each day, and 40 zettabytes of data will have been generated by the end of 2020.
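These units are hard to picture, so here is a quick sanity check in Python. It takes the figures quoted in this article at face value and assumes decimal (SI) units, where one exabyte is one quintillion bytes.

```python
# SI (decimal) byte units
EXABYTE = 10**18    # one quintillion bytes
ZETTABYTE = 10**21

daily_bytes = 25 * 10**17            # 2.5 quintillion bytes per day (IBM figure)
pre_2004_bytes = 5 * EXABYTE         # five exabytes up to the end of 2003 (Schmidt figure)
forecast_2020_bytes = 40 * ZETTABYTE # 40 zettabytes by the end of 2020

daily_exabytes = daily_bytes / EXABYTE                  # daily volume in exabytes
days_to_match_pre_2004 = pre_2004_bytes / daily_bytes   # days needed to equal all data up to 2003
forecast_in_exabytes = forecast_2020_bytes // EXABYTE   # 2020 forecast in exabytes

print(daily_exabytes, "EB per day")
print(days_to_match_pre_2004, "days to match everything generated up to 2003")
print(forecast_in_exabytes, "EB forecast by the end of 2020")
```

In other words, 2.5 quintillion bytes is the same 2.5 exabytes a day mentioned earlier, and at that rate it takes about two days to produce as much data as humankind generated up to 2003.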


Velocity refers to the speed at which new data is generated and moves around. The NYSE captures 1 TB of trade information during each trading session. A very good example is credit card fraud detection, where millions of transactions are analyzed for unusual patterns in real time.
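To give a feel for what "unusual patterns" can mean, here is a toy Python sketch that flags a transaction when its amount is far from a cardholder's usual spending. The function name, the threshold of three standard deviations and the sample amounts are assumptions for illustration. Real fraud systems look at many more signals (location, merchant, timing) and use far more sophisticated models.

```python
from statistics import mean, stdev


def is_unusual(history, amount, threshold=3.0):
    """Flag `amount` if it lies more than `threshold` standard
    deviations away from the mean of the past amounts in `history`."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # every past amount was identical
    return abs(amount - mu) / sigma > threshold


# Hypothetical past transaction amounts for one card
past_amounts = [12.5, 40.0, 23.9, 18.0, 35.2, 27.4]

print(is_unusual(past_amounts, 30.0))   # close to the usual spending
print(is_unusual(past_amounts, 950.0))  # far outside the usual spending
```

The velocity challenge is running checks like this on millions of transactions as they arrive, not hours later in a batch.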


Variety refers to the different forms of data recorded: videos, documents, social media updates, financial data, sensor data etc. Almost 80% of the data collected in the world is unstructured. Almost 30 billion pieces of content are shared on Facebook alone every month, and 500 million tweets are sent per day.


Veracity refers to the uncertainty of data. Huge volumes often lack quality or accuracy. 33% of companies don’t believe their data to be completely reliable. Poor data quality costs the US economy $1.3 trillion every year. According to a survey by IBM, almost 30% of respondents don’t know whether their data is accurate or not.

Now the question is: how is it going to change the world?

Initially, we had traditional tools which couldn’t deal with fast-moving and unstructured data. But now the scenario is completely different, as we have software like Hadoop which enables us to process and analyze huge amounts of structured and unstructured data. All this is done by an automated system which splits the work between many different computers or VMs. Thanks to this, companies can bring together data that was previously out of reach with the data they already had, and extract the actual gist from it to produce useful results.
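The idea of splitting one job across many machines is easier to see in a small example. The Python sketch below imitates the map and reduce pattern that Hadoop popularized, using a word count as the job. It runs on a single machine with worker threads standing in for separate computers, and it does not use Hadoop's actual API; the function names are made up for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_chunk(lines):
    """Map step: count the words in one chunk of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def word_count(lines, workers=3):
    """Split `lines` into chunks, count each chunk in its own worker,
    then merge the partial counts (reduce step)."""
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


lines = [
    "big data is big",
    "data moves fast",
    "big data is varied",
]
print(word_count(lines).most_common(3))
```

Each worker only ever sees its own chunk, which is what lets the same approach scale from three lines on a laptop to petabytes spread over a cluster.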

Let’s take a look at a few big data implementations today

Politicians today are using social media analytics to plan and execute their promotion campaigns, select a target audience and finalize where they should run their campaigns.

Facebook analyzes your pictures, locations and friends list and compares them with those of others to determine your potential friends. It works out your interests and preferences from your likes and the posts you share.

Investigation agencies analyze data from social media, CCTV cameras, texts, phone calls etc. to track criminal activities and also to predict terror attacks.

Video analytics and sensor data are being used to analyze games and sports and find ways to improve performance. For example, you can now buy a cricket bat with more than 200 sensors which analyze your playing pattern and suggest how to improve your game.

Have you ever wondered how Apple iTunes or YouTube offers you songs in the same genre as the ones you are listening to? They do it by analyzing individual preferences and making suggestions accordingly. All this is done with the help of datafication.

Google’s self-driving car collects and analyzes a massive amount of data in real time in order to stay on the road safely.

To predict sales volumes and brand equity, companies are applying sentiment analysis to Facebook and Twitter posts. They are also analyzing people’s buying patterns. For example, if a woman’s buying pattern changes in a way that suggests she is pregnant, they promote baby products and pregnancy-related goods to that specific customer.
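At its simplest, sentiment analysis means scoring a post by the positive and negative words it contains. The Python sketch below shows that idea and nothing more. The word lists, function names and sample posts are invented for illustration, and real tools use much larger lexicons or trained models that also cope with sarcasm and negation.

```python
from collections import Counter

# Tiny hand-made word lists (assumed, for illustration only)
POSITIVE = {"love", "great", "good", "happy", "excellent"}
NEGATIVE = {"hate", "bad", "poor", "terrible", "awful"}


def sentiment_score(post):
    """Number of positive words minus number of negative words."""
    words = [w.strip(".,!?").lower() for w in post.split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives


def brand_sentiment(posts):
    """Tally how many posts are positive, negative or neutral."""
    labels = Counter()
    for post in posts:
        score = sentiment_score(post)
        if score > 0:
            labels["positive"] += 1
        elif score < 0:
            labels["negative"] += 1
        else:
            labels["neutral"] += 1
    return labels


# Hypothetical posts mentioning a brand
posts = [
    "I love the new phone, great camera!",
    "Terrible battery, bad support.",
    "It arrived on Tuesday.",
]
print(brand_sentiment(posts))
```

Run over millions of posts, a tally like this gives a company a rough, continuously updated picture of how people feel about its brand.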

Even in the medical sector, the symptoms of a disease that a patient has not yet developed can be detected from the heartbeat and the physical symptoms recorded for previous patients, so that precautions can be taken before the situation worsens.

All this, and it is just the beginning of a new era: the era of “Big Data” or “Datafication”. All we can see now is the horizon, but the ocean’s water runs deep and we don’t know what lies in its depths.
