GE’s Vince Campisi describes the company’s expedition to the data lake.
You’d have to be off the grid to have missed General Electric’s plugging of the industrial Internet — the use of sensors to collect data about things like turbines and jet engines and factory floors. For GE, the industrial Internet means being able to sell services to its customers based on detailed analysis of data streaming from its equipment and the ability to predict failures and other key events.
Doing this required the building of a “data lake” — a storage system to hold enormous amounts of raw data in its native format for future use — as well as the meshing of longstanding industrial culture with a newfangled approach to data and the hiring or development of a variety of analytics talent.
At the helm of much of the technology development is Vince Campisi, a 16-year veteran of GE who became CIO of GE Software at the launch of the GE Center of Software Excellence. Campisi spoke with MIT Sloan Management Review contributing editor Michael Fitzgerald.
What’s been your favorite example of what the data lake can really do for you as a company?
Last November, we set a goal that said, 90 days from now, we want to demonstrate the power of a data lake and stand it up in order to demonstrate what we’re trying to prove with industrial Internet and industrial big data. We set out to connect with 25 airlines, collect and manage machine data from 3.4 million flights, and ingest all that information [toward] helping improve time on wing, which means revenue generated per engine.
We got it done in 70 days. We created the data lake, ingested and connected the full flight data coming off an engine, blended it with shop visits and parts information, and got it into the hands of our data science community to look at things that were reducing time on wing for customers. For example, based on some of the analytics we solved for, we found that things like washing an engine more consistently and frequently improve its reliability and efficiency.