The Machine Learning Race Is Really a Data Race
Organizations that hope to make AI a differentiator need to draw from alternative data sets — ones they may have to create themselves.
Machine learning — or artificial intelligence, if you prefer — is already becoming a commodity. Companies racing to simultaneously define and implement machine learning are finding, to their surprise, that implementing the algorithms used to make machines intelligent about a data set or problem is the easy part. There is a robust cohort of plug-and-play solutions to painlessly accomplish the heavy programmatic lifting, from the open-source machine learning framework of Google’s TensorFlow to Microsoft’s Azure Machine Learning and Amazon’s SageMaker.
What’s not becoming commoditized, though, is data. Instead, data is emerging as the key differentiator in the machine learning race. This is because good data is uncommon.
Useful Data: Both Valuable and Rare
Data is becoming a differentiator because many companies don’t have the data they need. Although companies have measured themselves in systematic ways using generally accepted accounting principles for decades, this measurement has long been focused on physical and financial assets — things and money. A Nobel Prize was even awarded on capital asset pricing in 2013, reinforcing these well-established priorities.
But today’s most valuable companies trade in software and networks, not just physical goods and capital assets. Over the past 40 years, the asset focus has completely flipped, from the market being dominated by 83% tangible assets in 1975 to 84% intangible assets in 2015. Instead of manufacturing coffeepots and selling washing machines, today’s corporate giants offer apps and connect people. This shift has created a drastic mismatch between what we measure and what actually drives value.
Get Updates on Leading With AI and Data
Get monthly insights on how artificial intelligence impacts your organization and what it means for your company and customers.
Please enter a valid email address
Thank you for signing up
The result is that useful data is problematically rare. There is a growing gap between market and book values. Because of this gap, companies are racing to apply machine learning to important business decisions, even replacing some of their expensive consultants, only to realize that the data they need doesn’t even exist yet. In essence, the fancy new AI systems are being asked to apply new techniques to the same old material.
Just like people, a machine learning system is not going to be smart about any topic until it has been taught. Machines need a lot more data than humans do in order to get smart — although, granted, they do read that data a lot faster.