Competing With Data & Analytics
As organizations everywhere increasingly embrace analytics, it is tempting to think that additional data will provide the crucial insight, reveal the overlooked explanation, or crisply discern key solutions within a morass of muddled information. But “more data” is not the answer to every problem.
Organizations that add data indiscriminately run the risk of becoming data hoarders instead of data collectors. An analyst working in a large financial services institution offered this useful distinction: “Hoarders store everything and don’t know how to determine what is important. Collectors know exactly what is valuable and prioritize what to keep.”
As data storage costs continue to plummet, why not just save everything? Why not be a hoarder? The answer is that hoarding wastes resources and, paradoxically, reduces the usefulness of existing data.
First, costs still exist; while storage costs have decreased, they are not zero. The sheer volume of data produced by modern information technologies adds up quickly and relentlessly. Calvin Smith, principal manager of global innovation at EMC Corporation, observes that “… ‘big data’ does not describe a Holy Grail data set that some companies ‘get’ and other companies don’t … big data could really include all data … and it’s not easy or cheap to attempt to collect and store all the data out there.”
Furthermore, storage itself is only a small part of data’s expense; maintenance (such as provisioning, backing up, verifying, and recovering) can be costly and requires expensive staff involvement.
Second, hoarding data undermines the use of existing data because it diverts scarce analyst and managerial resources that might be better applied elsewhere. If actionable insights are the proverbial needle in the haystack, adding more data may just make the haystack bigger and the needle that much harder to find. The financial services analyst notes, “Even if data is free to store, high-priced data scientists will still waste time looking at it and try to find spurious patterns or incorporate the data into models to no avail. There is still an opportunity cost to looking at the wrong data and not having a strong sense for what questions are important to answer.”