Everyone knows the importance of data-driven decision making in business and is becoming more popular. However, what tends to get lost in that obsession is whether the underlying data is reliable or not.
It is easy to say that inaccurate data generates commercial problems. But what is even worse is when a company executes data that appears reliable when in fact it has flaws that significantly affect critical decisions. I know that the title is somewhat dramatic, but people invest countless hours trying to make their business succeed, and nothing is more painful than realizing that they were doomed from the start. That is why we need to eradicate the dangerous tendency of data providers to collect data collection and data organization .
The collection and organization of data are often treated as two sides of the same coin, however many companies do not realize that mixing them leads to metrics with incorrect or ambiguous labels. Given that the separation of these processes is not yet a common practice, the unfortunate reality is that most companies, even those that identify themselves as data-based, put themselves at an irreversible disadvantage without realizing it, and destroy that for which they That they are working. achieve.
Separate data collection and organization ensures that your company can obtain a clear and reliable image of your data. It is the best way to mitigate risk and optimize decision making.
The mirror of the music industry
There is an analogy that I like to make that compares data analysis with the music industry. In the 1
Nowadays, nobody uses the physical cut and paste to create complex and multi-layered musical tracks. Instead, they use digital recordings that can be cut, trimmed and played without losing the integrity of the original raw sounds. However, when it comes to data analysis, the status quo of the industry is like that of the music of the 1960s.
Combining data collection and organization means that it is virtually impossible to retain the integrity of the data. the original data if something needs to be rewritten or changed And even the process of doing that rewriting requires hours of laborious coding and work.
For example, suppose your company starts selling a single product (a mattress, for example) and its analyzes are programmed to track each purchase, labeled "buy". As the company grows, add a new product, a pillow. His analysis, which previously only followed a "purchase", can not distinguish between a mattress purchase and a pillow purchase.
Your company could reconfigure your system to track two events, "mattress purchase" and "pillow purchase," but all the old data is still labeled simply "buy." Now they have to go back and manually update all the data points labeled "buy" to "buy mattress" or simply live with an inconsistency between historical and new data.
This is the life when the data collection and organization are grouped: each new addition of product requires a lot of time, work prone to errors of re-schematizing manually the historical data. But by keeping the collection process separate from the organization, you can easily change the labels without losing any of the original data.
Suddenly, it is possible to simply create the new "mattress purchase" label and then apply it retroactively to all the historical data, no matter how it has been previously labeled. This allows companies to track all information over time, as they grow from mattresses and pillows, to sheets, bedspreads, quilts and more (also applies to products not related to the bedroom).
Nondestructive Edition and The Future of Data Analysis
Essentially, by separating the process of data analysis, the company in the previous example can treat your data in the same way that modern recording artists treat music digital; they can cut and rename the data however they want, without compromising fundamental integrity. The music industry calls this "non-destructive" edition. The data analysis does not even have a name, because it is an uncommon practice.
However, non-destructive editing is eminently important for any data-driven business because metrics with ambiguous or incorrect labels cause as many problems as incorrect information. As companies grow and expand, their data becomes exponentially more complicated. By separating collection and organization, companies acquire the ability to adapt their analysis easily and retroactively. They no longer have to invest valuable resources to re-tag data by hand, risking the integrity and usefulness of knowledge.
Ultimately, the future of the analytics industry depends on the separation of data collection and organization. Opens resources that simplify customer perceptions. And in the long run, it will allow companies that dream of being truly driven by data the freedom to finally embrace their potential.