{"id":6546,"date":"2015-10-07T08:24:13","date_gmt":"2015-10-07T08:24:13","guid":{"rendered":"http:\/\/www.esds.co.in\/blog\/?p=6546"},"modified":"2018-05-21T08:58:18","modified_gmt":"2018-05-21T08:58:18","slug":"big-data-needs-to-be-clean-data","status":"publish","type":"post","link":"https:\/\/www.esds.co.in\/blog\/big-data-needs-to-be-clean-data\/","title":{"rendered":"Big Data Needs to Be Clean Data"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-6547\" src=\"https:\/\/www.esds.co.in\/blog\/wp-content\/uploads\/2015\/10\/BigData-Needs-Clean-Data.png\" alt=\"BigData Needs Clean Data\" width=\"674\" height=\"329\" srcset=\"https:\/\/www.esds.co.in\/blog\/wp-content\/uploads\/2015\/10\/BigData-Needs-Clean-Data.png 674w, https:\/\/www.esds.co.in\/blog\/wp-content\/uploads\/2015\/10\/BigData-Needs-Clean-Data-300x146.png 300w, https:\/\/www.esds.co.in\/blog\/wp-content\/uploads\/2015\/10\/BigData-Needs-Clean-Data-660x322.png 660w\" sizes=\"auto, (max-width: 674px) 100vw, 674px\" \/><\/p>\n\n<p style=\"text-align: justify;\">Businesses are flocking to the cloud, and it\u2019s no wonder. <strong>Cloud platforms are the perfect solution to avoiding hardware installation and maintenance costs.<\/strong><\/p>\n<p style=\"text-align: justify;\">On top of that, with <a href=\"https:\/\/www.esds.co.in\/enlight-cloud-hosting\"><strong>cloud services<\/strong><\/a> you can easily add or remove resources. You can process and store data fast. 
That\u2019s especially helpful when you need to handle \u201cBig Data\u201d.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Big_Data_and_Its_Forms\"><\/span>Big Data and Its Forms<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><strong>Big Data refers to the mammoth amount of data that businesses now receive from different sources every day.<\/strong> In fact, 2.5 quintillion bytes of data (that\u2019s 18 zeros, by the way!) are created every day.<\/p>\n<p>The sources of Big Data can be:<\/p>\n<ul>\n<li>Mobile devices<\/li>\n<li>Smart devices<\/li>\n<li>Sensors<\/li>\n<li>Social media<\/li>\n<li>Transactions, etc.<\/li>\n<\/ul>\n<p style=\"text-align: justify;\"><strong>Big Data can be unstructured, semi-structured, or structured. <\/strong>It provides huge benefits to businesses, but only if it\u2019s treated the right way.<\/p>\n<p>Treating your data begins with cleaning, followed by processing. We\u2019ll talk about the first phase here.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Source_of_Errors_in_Big_Data\"><\/span>Source of Errors in Big Data<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\">Errors have a way of creeping into even a foolproof system. Errors in Big Data, or in any data for that matter, may come from a variety of sources.<\/p>\n<p style=\"text-align: justify;\">The most basic cause of inaccuracies in data is <strong>human error<\/strong>. For instance, while filling out a survey form, a customer may misspell his or her name. 
This may lead to problems when the feedback is integrated into an existing customer profile database.<\/p>\n<p style=\"text-align: justify;\">There\u2019s always a possibility of having <strong>fake entries<\/strong>, or even <strong>multiple entries<\/strong>, which may also create problems in your data analysis.<\/p>\n<p style=\"text-align: justify;\">Finally, you can also create errors in your data by <strong>condensing<\/strong> it. This occurs more commonly when dealing with a database of product reviews.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Why_Is_Big_Data_Cleanup_Necessary\"><\/span>Why Is Big Data Cleanup Necessary?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\"><strong>US businesses lose $600 million every year because of dirty data. Having clean data can take your revenue up by <em>66<\/em>%!<\/strong><\/p>\n<p style=\"text-align: justify;\">What\u2019s more, customers will be more willing to trust you if you have a reputation for maintaining clean data records.<\/p>\n<p style=\"text-align: justify;\"><strong>Clean Big Data can save you time and money. It can build you a good reputation in the market and trust among customers.<\/strong><\/p>\n<p style=\"text-align: justify;\">The major benefit of having clean Big Data, however, is better decisions. <strong>If you\u2019re using made-up or unreliable data for your analysis, you\u2019ll get only invalid conclusions<\/strong>. As the saying goes, \u201c<em>garbage in, garbage out<\/em>\u201d.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"What_Forms_of_Errors_Appear_in_Big_Data\"><\/span>What Forms of Errors Appear in Big Data?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p style=\"text-align: justify;\">The list of errors that you will have to face while fixing up your data is endless and ever-growing. 
However, typical errors are:<\/p>\n<ul>\n<li><strong>Aliasing <\/strong>\u2013 When different entities are merged, perhaps because they share the same tag<\/li>\n<li><strong>Incorrect entries <\/strong>\u2013 Either intentional or unintentional<\/li>\n<li><strong>Missing entries <\/strong>\u2013 When data is lost in the system due to glitches, etc.<\/li>\n<li><strong>Multiple entries <\/strong>\u2013 When the same information has different tags<\/li>\n<\/ul>\n<p style=\"text-align: justify;\"><strong>Data cleaning is a messy job.<\/strong> You can always hire someone to do it for you. After all, you need a clean source of information to make better, more informed business decisions.<\/p>\n<p style=\"text-align: justify;\">But bear in mind that no one can ever know the dirt in your data like you do. You\u2019re the only one who can truly tell the clean from the dirty. That\u2019s because you know what it should look like. So, be brave and do it!<\/p>\n<p style=\"text-align: justify;\">Can you draw good enough conclusions from raw data? Or is it necessary to have clean data? Share your opinions with us!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Big data is of no use to your business if it\u2019s unclean and unreliable. 
Learn why you need to fix it up!<\/p>\n","protected":false},"author":81,"featured_media":6547,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[744],"tags":[745,1382],"class_list":["post-6546","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data","tag-big-data-2","tag-data-cleaning"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/posts\/6546","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/users\/81"}],"replies":[{"embeddable":true,"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/comments?post=6546"}],"version-history":[{"count":3,"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/posts\/6546\/revisions"}],"predecessor-version":[{"id":8999,"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/posts\/6546\/revisions\/8999"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/media\/6547"}],"wp:attachment":[{"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/media?parent=6546"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/categories?post=6546"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.esds.co.in\/blog\/wp-json\/wp\/v2\/tags?post=6546"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}