Competing With Data & Analytics
That the era of big data is upon us is no longer in question. What to do with all that data appears to be the next big hurdle, particularly in industries such as the life sciences, which collectively generate so much data that researchers cannot extract real insights with any real speed.
A recent New York Times article, “DNA Sequencing Caught in the Deluge of Data,” discussed China-based BGI, the world’s largest genomics research company. BGI produces so much data — the equivalent of 2,000 human genomes a day — that it can’t transmit results to clients or collaborators over the Internet. To do so would take weeks. Instead, BGI puts the data on disks and sends it out via FedEx. Really.
The crux, according to The Times, is that “the ability to determine DNA sequences is starting to outrun the ability of researchers to store, transmit and especially to analyze the data.”
(A single human genome sequence consumes about two terabytes of data, according to Simon Robinson, research director, storage, at 451 Research, an analyst group focused on the business of enterprise IT innovation. Once you do some processing and analysis of that genome, it can quickly turn into seven terabytes of data. For a single genome. “That’s why this is such a problem,” Robinson told me in a phone interview. “It’s a step change in the sheer volume of data that is created.”)
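To put Robinson's figures in perspective, here is a rough back-of-the-envelope sketch of how long one genome would take to send over a network. The 100 megabit-per-second link speed is purely an assumption for illustration (it does not come from BGI, The Times or 451 Research), and the function name `transfer_days` is hypothetical.

```python
# Back-of-the-envelope transfer time for genome data.
# Sizes (2 TB raw, 7 TB after analysis) are Robinson's estimates;
# the link speed below is an assumed value, not a figure from the article.

TB = 10**12  # bytes in one decimal terabyte


def transfer_days(terabytes: float, link_mbps: float) -> float:
    """Days needed to move `terabytes` over a link of `link_mbps` megabits per second."""
    bits = terabytes * TB * 8
    seconds = bits / (link_mbps * 10**6)
    return seconds / 86_400  # seconds in a day


ASSUMED_LINK_MBPS = 100  # hypothetical sustained speed

for label, size_tb in [("raw genome", 2), ("after analysis", 7)]:
    days = transfer_days(size_tb, ASSUMED_LINK_MBPS)
    print(f"{label}: {days:.1f} days")
```

Under that assumption, a single raw genome ties up the link for roughly two days and the processed version for about six and a half. Multiply by many genomes and it becomes easier to see why shipping disks can beat the Internet.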
So what kinds of solutions might be possible? A number of companies are starting to look to the cloud, not only for storing massive amounts of data, but for analysis and deriving insights, too. Companies like Amazon, IBM and Rackspace offer flexible cloud storage models that, in the words of Rackspace, provide “dynamic scaling at a moment’s notice.” Other companies, including Google, are developing new methods for analyzing all that data — as a service.
In a recent Sandhill blog post, “Big Data and Insight As a Service,” Evangelos Simoudis, senior managing director of Trident Capital, outlined two types of cloud-based, big data analytics services.