The 10 Big Data Commandments
The Big Data wave has not reached the shore yet, but we are starting to realize that some of the buzzwords have already materialized into concrete technologies and propositions. There is still a lot of uncertainty around what this wave is going to bring… And it makes people start murmuring and complaining, and eventually start building a golden Big Data calf to worship.
As Moses would do on his way down from Mount Sinai, seeing all these people singing and dancing around this golden calf, we will try to demolish the false god by sharing the…
10 commandments for big data practitioners
- You shall have no other gods before the divine V-Trinity: velocity, variety and volume
This is the real essence of big data… Without velocity, you could rely on the traditional batch jobs that seem to go on forever… If you take the variety of sources away, you are missing the point: the real benefit and the unprecedented insights can be inferred only when you bring various data sources together, because the intelligence resides in the relationships. Volume is nowadays a given. The treasure hunter of insights does not simply stay at home… there is a very long journey across the breadth and the depth of the long-tail territories ahead of you: long time series, huge amounts of raw information, long-tail analysis… All of that requires huge data volumes, but the benefits are going to be equally huge!
- You shall not make for yourself any false image other than data to drive your business
Nowadays the competitive advantage of data-driven organizations is no longer just a good ally, but a must-have and a must-do. The range of analytical capabilities emerging with big data, and the fact that modeling businesses and forecasting are becoming common practice rather than mathematical sorcery, leave no choice but to bring your data assets in to support decision making (from day-to-day business to macro-strategic decisions).
- You shall not take the name of Big Data in vain.
Or in other words, you shall move away from the bullshitting lineage and engage people with the right big data skill set. What seems to be a no-brainer is really difficult when we are just surfing the hype wave. You won’t see the bullshitter on the list of threatened species… Just make sure you staff your team properly… Include the so-called data ninjas (people able to find and connect new data sources), data scientists, product managers, visualization specialists, etc… everything but people who just talk.
- Remember the best practice to manage the data, to keep it holy.
Big Data decided to take the path of flexibility in the trade-off between versatility/stability and the consistency/standardization given by a rigid structure. This decision should be celebrated, and it is making the Big Data wave more promising than the semantic one used to be, which suffered mainly from its lack of flexibility. The downside, though, is the need for a well-defined practice and strict discipline regarding data management and schema documentation and maintenance.
- Honor your data, listen to them.
Keep in mind that the value you get out of your data assets is fully up to you. Think of your data like a shy person… Alone with her you are not going to get many insights, but when you bring this person together with her friends, she is most likely going to open up and start telling stories.
- You shall not murder your Business Intelligence and Data Warehousing past
When I read that Lance Armstrong was going to run a marathon, I instantly knew that he was going to do it well. Even if it was something new for him, he had been practicing endurance sports for ages. The same applies to a typical Business Intelligence analyst… They have spent their entire careers very close to the data… suffering the pain of the volume, the speed and the variety, so they are going to be the first ones to embrace the new Big Data technologies.
- You shall not commit adultery
Or in other words, you shall not try to solve a small data problem with big data technology… One of the most common mistakes in Big Data is to see everything as a nail when you have the big data hammer in your hand. It’s critical to know what can be solved with which technology and what can’t be… or even simpler, what’s the best tool for which problem.
- You shall not let people steal from you
If you wholesale your data, if you give it away, you are selling out your company’s soul! Don’t let anybody other than you profit from your data… Make sure you find the right partner to squeeze out all the juice, but keep the ownership and be the one supporting your own decisions with your own data.
- You shall not bear false witness against your historical data.
Or even worse, you shall not dump historical data. Think of your data like many women think of their shoes: just don’t throw them away, as they might become fashionable again in the future… Just keep them in the closet. Big Data makes the past explain the present and the present predict the future… But both pattern discovery and machine learning algorithms work best with a long history you can learn from.
- You shall not covet your neighbor’s private data; you shall not covet your neighbor’s personal information, nor his location, nor his cookies, nor anything that is your neighbor’s, unless you have permission
Exactly… hands off the personally identifiable information unless you are 100% sure you can use it. The more data sources you can combine, the more insights you are going to get… but going too far into people’s privacy realm can have drastic consequences for your business. Just avoid it and understand where the limits are.
Note about the pictures in this post: these pictures are not hosted on this site, they are just links to existing pictures in the media archive of imdb.com. As stated in the imdb how to link I’m providing the attribution here: