Big Data: Back to Basics
Today’s selection is…
Belgian expert Pierre Nicolas Schwab’s piece on Big Data, based on his past assignments with companies trying to launch such initiatives. When I discovered Pierre Nicolas’s prose, I must admit that I felt better knowing that the issue of “small data” is ubiquitous. According to him, there are three common mistakes companies make when launching their big data initiatives:
- They don’t know where the data is (actually, I would add that sometimes they do, but they are not allowed to use it. Very often a specific department owns the data and prevents anyone else from using it. This is mostly true of customer data, and I have even witnessed it in small and medium firms; it’s not just a big logo issue),
- They don’t know what to do with their data: this too may seem ludicrous, but big data requires both business/marketing and technical acumen, and it’s pretty easy to drown in data without knowing what to do with it,
- They’ve been told that IT is magical and that you don’t need to do anything because systems will do it for you. Well, this is not true: massaging data requires a lot of work and dedication, trial and error, and fine-tuning of the data. In some cases (retail, for instance), it is true that a lot of data comes straight from check-out systems and not much needs to be done… except master data mining and very complex analytics, which often only data scientists can manage.
One last point I would add is the often poor quality of the data itself, mostly when it has been entered manually and rarely, if ever, updated properly. I have known someone who spent his entire working life updating customer databases for a large IT company; this is a never-ending job… Even though large databases exist, if you own the data, you need to keep it clean and up to date; this is a huge task.