How Big Data Is Empowering AI and Machine Learning at Scale

Big Data is powerful on its own. So is artificial intelligence. What happens when the two are merged?

Big data is moving to a new stage of maturity — one that promises even greater business impact and industry disruption over the course of the coming decade. As big data initiatives mature, organizations are now combining the agility of big data processes with the scale of artificial intelligence (AI) capabilities to accelerate the delivery of business value.

The Convergence of Big Data and AI

The convergence of big data with AI has emerged as the single most important development shaping how firms drive business value from their data and analytics capabilities. The availability of greater volumes and sources of data is, for the first time, enabling capabilities in AI and machine learning that remained dormant for decades due to lack of data availability, limited sample sizes, and an inability to analyze massive amounts of data in milliseconds. Digital capabilities have moved data from batch processing to real-time, online, always-available access.

Although many AI technologies have existed for several decades, only now are they able to take advantage of datasets of sufficient size to provide meaningful learning and results. Ready, agile access to large volumes of data is leading to a rapid evolution in the application of AI and machine learning. Whereas statisticians and early data scientists were often limited to working with “sample” sets of data, big data has enabled data scientists to access and work with massive sets of data without restriction. Rather than relying on representative data samples, data scientists can now rely on the data itself, in all of its granularity, nuance, and detail. This is why many organizations have moved from a hypothesis-based approach to a “data first” approach.

Organizations can now load all of the data and let the data itself point the direction and tell the story. Unnecessary or redundant data can be culled, and more indicative and predictive data can be analyzed using “analytical sandboxes” or big data “centers of excellence,” which take advantage of the flexibility and agility of big data management approaches. Apostles of big data have often referred to their approach as “load and go.” Big data enables an environment that encourages data discovery through iteration. As a result, businesses can move faster, experiment more, and learn quickly. To put it differently, big data enables organizations to fail fast and learn faster.

Big Data and AI at MetLife

Pete Johnson is one of the most experienced executives working in the field of big data and AI within industry today. Having worked in the field of artificial intelligence for a generation dating back to his academic career at Yale University, Johnson now leads big data and AI initiatives as a fellow at MetLife. Johnson previously held positions as senior vice president for Strategic Technology with Mellon Bank and served as the executive vice president and chief technology officer of Cognitive Systems Inc. (CSI), an early artificial intelligence company specializing in natural language processing, expert systems, case-based reasoning, and data mining. CSI was founded by several members of the Yale University faculty in 1981, when Johnson completed his MS in computer science.

Johnson, whom I’ve known for over a decade, is a regular participant in a series of executive thought-leadership breakfasts that I host for senior industry executives to share perspectives on topics in big data, AI, and machine learning among their peers. Participants in the most recent executive breakfasts have included chief data officers, chief analytics officers, chief digital officers, chief technology officers, and heads of big data for firms including AIG, American Express, BlackRock, Charles Schwab, Citigroup, General Electric (GE), MetLife, TD Ameritrade, Visa, and Wells Fargo, among others. As a long-suffering expert in the field of artificial intelligence, Johnson observes three critical ways in which big data is now empowering AI:

  1. Big data technology: Through “commodity parallelism,” we now have the ability to process huge quantities of data that previously required extremely expensive hardware and software.
  2. Availability of large data sets: ICR (intelligent character recognition), transcription, voice and image files, weather data, and logistics data are now available in ways that were never possible in the past; even old “paper sourced” data is coming online.
  3. Machine learning at scale: “Scaled up” algorithms such as recurrent neural networks and deep learning are powering the breakthrough of AI.

Johnson notes a number of ways in which MetLife is employing AI that have been enabled by big data:

  1. Speech recognition has enabled vastly superior tracking of incidents and outcomes as a result of highly scaled machine learning implementations that indicate pending failures. An example is the ability to analyze doctors’ reports that originated as written forms. This is enabling recognition of disease progression, improvement of treatment efficacy, and formulation of “return-to-work” strategies, all issues that are important to insurers.
  2. Back-office effectiveness is delivering cost savings and improved customer service through more efficient claims processing as a result of claims models that have been enriched with unstructured data (like the doctors’ reports). This enables the insurer to improve patient health from a preventive perspective, as we can recognize anomalies sooner and take action faster.
  3. The holy grail will be the ability to execute automated underwriting, a practice that is becoming fairly common in areas such as property and casualty insurance. The next steps will be applying AI and machine learning to general health and wellness.

Johnson sums up his experience, “We have now reached critical mass. When you put these things — big data, AI, machine learning — together, we are starting to see better solutions for a number of classic problems. It will take longer for products with much longer tails involving health/wellness and life. But it’s coming.”

A Decade of Disruption at Scale

AI empowered by big data is accelerating the potential for disruptive change. The ubiquitous proliferation of data, combined with the means to capture and analyze massive volumes of data with agility and speed at scale, is driving innovation that extends far beyond traditional data and analytics functions. The ability to make informed decisions based on up-to-the-moment information is rapidly becoming the mainstream norm.

The figure below is from NewVantage Partners’ annual Big Data Executive Survey, which was published in early 2017 and reflects the outlook of top executives for the coming decade. In my January MIT Sloan Management Review article, “Companies Brace for Decade of Disruption From AI,” I noted that executives reported believing that AI would be the “single most disruptive” new capability over the course of the next decade. AI ranks first among all new capabilities that these executives believe will have a disruptive impact on their firms, with an astounding 88.5% reporting that they expect AI to have an impact on their firm (See "AI Is 'Most Disruptive' New Capability Over the Next Decade").

The impact of big data goes well beyond simple data and analytics. Big data and AI in combination are providing a powerful foundation for a rapidly approaching wave of heightened innovation and business disruption. While the first wave of big data was about speed and flexibility, it appears that the next wave of big data will be all about leveraging the power of AI and machine learning to deliver business value at scale.

2 Comments On: How Big Data Is Empowering AI and Machine Learning at Scale

  • Samir Asaf | May 18, 2017

    Excellent article by Randy, highlighting the power of AI and Big Data. This convergence is necessary but not a sufficient condition for delivering business value. CXOs need relevant real-time insights focused on their Critical Success Factors (CSFs), so that analytic outputs directly support strategic and operational decision-making.

  • Tinko Stoyanov | May 18, 2017

    If possible I just would like to correct the title. The “big information” (not the big data) will really empower the AI. The data is a subset of information (and ML is just a part of AI). AI needs “big” information to start functioning (properly).
