Big Data: Back to Basics

Today’s selection is…

Belgian expert Pierre Nicolas Schwab’s piece on Big Data, based on his past assignments with companies trying to launch such initiatives. When I discovered Pierre Nicolas’s piece, I must admit I felt better on learning that the issue of “small data” is ubiquitous. According to him, there are 3 common mistakes companies make when launching their big data initiatives:

  1. They don’t know where the data is (actually, I would add that sometimes they do, but they are not allowed to use it. Very often a specific department owns the data and prevents anyone else from using it. This is mostly true of customer data, and I have even witnessed it in small and medium firms; it’s not just a big-logo issue),
  2. They don’t know what to do with their data: this too may seem ludicrous, but big data requires both business/marketing and technical acumen, and it’s pretty easy to drown in data without knowing what to do with it,
  3. They’ve been told that IT is magical and that you don’t need to do anything because the systems will do it for you. Well, this is not true: massaging data requires a lot of work and dedication, trial and error, and fine-tuning. In some cases (retail, for instance), it is true that a lot of data comes straight from check-out systems and not much needs to be done… except master data mining and very complex analytics, which often only data scientists can manage.

One last point I would add is the often poor quality of the data itself, especially when it has been entered manually and rarely, if ever, updated properly. I knew someone who spent his entire career at a large IT company updating customer databases; it is a never-ending job… Even though large databases exist, if you own the data, you need to keep it clean and up to date; this is a huge task.