If You Think Big Data’s Challenges Are Tough Now

…you should recognize that it’s only going to get tougher as data from sensors and smart devices becomes more prevalent.

According to the EMC/IDC Digital Universe Report, data is doubling in size every two years. From 2013 to 2020, the digital universe is projected to grow from 4.4 trillion gigabytes to 44 trillion gigabytes.
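The two figures above can be checked against each other with a few lines of arithmetic. The sketch below is only an illustration: it takes the 4.4 and 44 trillion gigabyte figures from the report and assumes smooth compound growth between 2013 and 2020, which the report itself may not assume. The variable names are illustrative.

```python
import math

# Figures quoted from the report, in trillions of gigabytes.
start_size = 4.4    # digital universe in 2013
end_size = 44.0     # projected digital universe in 2020
years = 2020 - 2013

# Assumption: growth compounds smoothly over the whole period.
growth_factor = end_size / start_size

# Doubling time implied by that growth factor.
implied_doubling_years = years * math.log(2) / math.log(growth_factor)

# Equivalent compound annual growth rate.
annual_growth = growth_factor ** (1 / years) - 1

print(f"Growth factor over {years} years: {growth_factor:.1f}x")
print(f"Implied doubling time: {implied_doubling_years:.2f} years")
print(f"Implied annual growth: {annual_growth:.1%}")
```

Under that assumption, a tenfold increase over seven years works out to a doubling time of a little over two years, which agrees with the "doubling every two years" summary.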

For executives faced with the age-old challenge of how to organize and learn from the data available to them, the problem just got a lot harder.

The Internet of Things (IoT) may be a new topic to many executives, but it lies at the heart of the phenomenon of data proliferation. Today, according to the EMC/IDC study, consumers and workers generate two-thirds of all new data. This is about to change. It is projected that within this decade, most new data will be generated not by people (consumers and workers) but by sensors and embedded, intelligent devices connected to the Internet: smartphones, traffic lights, MRI scanners, smart energy grids, and heavy industrial systems.

Although still a small share of all data generated, data from “things” is expected to grow from 2% of all data captured in 2013 to 10% in 2020, with the pace accelerating during the next decade.

This warrants the attention of any executive charged with an organization’s data strategy. As connectivity becomes pervasive, proper management of intelligent devices will provide the foundation for organizational adaptation to rapidly changing needs. Health care providers are employing sensors to monitor patients in real time, enabling physicians to more effectively diagnose disease and prescribe treatments. Financial services firms are able to monitor and manage risk and detect fraud. General Electric has invested heavily in manufacturing sensor-equipped devices that stream data across what it calls the “Industrial Internet.” All of these benefits result from the situational awareness enabled by intelligent devices, sensors, and the monitoring and capture of real-time data for analysis and action.

Find Value in Fresh Data

To date, the Big Data challenge for most executives has been about organizing and managing large volumes and greater varieties of historical data, mostly to improve their analytic capabilities: understanding past behaviors and activity as a predictor of future actions. Historical data is, however, data “at rest.” It teaches us about the past and enables us to forecast the future, but it is static.

Big Data, in the context of the IoT, is about data “in motion,” which refers to the velocity and highly interactive nature of the data captured by sensors and intelligent devices. Data is dynamic and exists on a time continuum. When understood in this context, it is easy to appreciate why data can have great impact when it is fresh — newly captured, highly interactive, based on an immediate activity and reflective of the current state of affairs or interest. As an example, knowing a customer’s preference during a purchase can be significantly more valuable than knowing a customer’s preference after a purchase. This is often referred to as “situational awareness,” which means that by having real-time sensitivity to customer preferences at the moment of purchase, a business may be able to tailor, adapt or customize an offer to best fit a customer’s needs at the moment of purchase.

While historical information about past behavior has always been the single most important predictor of future behavior, information in the moment becomes an even more powerful predictor of consumer choices. It is the interactive nature and the freshness of “data in motion” that make this information so relevant. Similarly, in the universe of the IoT, smart processes driven by sensors and other intelligent devices can be continually monitored, controlled and adjusted to automatically course-correct if needed, resulting in increased efficiencies. Complex operating environments can be monitored to detect hazards and take corrective actions, resulting in risk mitigation and cost savings.

While solutions like the Hadoop Distributed File System (HDFS) have been developed to provide organizations with a highly cost-effective Big Data approach to storing vast amounts of historical data for analysis, developing an enterprise data architecture that can take advantage of dynamic data from the IoT represents a new challenge. The requirements of fresh data imply an ability to ingest, analyze and interact with vast streams of incoming data in real time, requiring an “in-memory” data management approach. Data experts like Michael Stonebraker, adjunct professor at the MIT Computer Science and Artificial Intelligence Laboratory and a cofounder of VoltDB, believe that executives will need a new data architecture to ensure that they realize the potential benefits that are expected to result from the IoT.
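To make the “in-memory” idea concrete, here is a minimal sketch of one common pattern: keeping only a recent window of sensor readings in memory and updating a running average as each reading arrives. It is an illustration under stated assumptions, not a description of VoltDB or any particular product. The class name `SlidingWindowAverage`, its method, and the sample readings are all hypothetical.

```python
from collections import deque


class SlidingWindowAverage:
    """Hypothetical helper: running average over the last N seconds of readings."""

    def __init__(self, window_seconds):
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window_seconds = window_seconds
        self._readings = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def add(self, timestamp, value):
        """Ingest one reading and return the average over the current window.

        Assumes readings arrive in timestamp order.
        """
        self._readings.append((timestamp, value))
        self._total += value

        # Evict readings that have aged out of the window.
        cutoff = timestamp - self.window_seconds
        while self._readings[0][0] <= cutoff:
            _, old_value = self._readings.popleft()
            self._total -= old_value

        return self._total / len(self._readings)


# Made-up temperature readings: (seconds since start, degrees).
window = SlidingWindowAverage(window_seconds=10)
for ts, temp in [(0, 20.0), (5, 30.0), (12, 40.0)]:
    print(ts, window.add(ts, temp))
```

Each reading is handled as it arrives and old readings are discarded, so the answer is always current and memory use stays bounded. That is the contrast with loading a historical data set and analyzing it after the fact.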

Big Data has broadened the data landscape by introducing new forms of data — unstructured content, social media data, and now sensor data — that can be integrated with traditional forms of historical transaction data to provide a fuller business picture. Executives have been hard at work on developing and implementing processes so that data can flow more effectively and usefully through an organization, to the greater business benefit of the enterprise.

The IoT marries the power of new Big Data approaches for managing and analyzing historical data with high-speed monitoring and processing of events as they occur. Given the emergence of the IoT, and the expected growth of sensor and intelligent device–generated data, executives would do well to plan ahead for the next great frontier in the march of data and analytics.