Big Idea: Competing With Data & Analytics

Analytical Value From Data That Cries Wolf

Imperfect data can still be put to good uses.

Sam Ransbotham September 28, 2014 Reading Time: 4 min

Topics

Competing With Data & Analytics

How does data inform business processes, offerings, and engagement with customers? This research looks at trends in the use of analytics, the evolution of analytics strategy, optimal team composition, and new opportunities for data-driven innovation.

Understand the tradeoffs between false positives and false negatives.

Ideally, data contain neither false positives nor false negatives. In reality, data sources balance the two; decreasing one usually means increasing the other. For example, consider news. A newspaper can expend considerable effort to validate information from multiple independent sources before publication. To avoid unnecessary publication cost and reputation damage, the design of the process emphasizes publishing articles that are accurate. However, this accuracy comes at the expense of false negatives. By working so hard to avoid publishing mistakes, a newspaper may miss or delay real stories. As a data source, newspapers avoid false positives, but in doing so they sometimes are left with false negatives.

In contrast, Twitter users rarely miss anything: for example, the raid on Bin Laden was live-tweeted long before mainstream news covered the story. Yet inaccurate (and often completely unfounded) rumors run rampant as well; many celebrities must have feline ancestry to bounce back from their many Twitter-based death reports. With low production cost and minimal reputational concerns, this data source offers insights and speed unavailable in the opposite approach. But the tradeoff is an increase in false positives.

Identify the sensitivity and specificity of each data source.

Organizations need to assess both the rate of true positives (sensitivity) and true negatives (specificity) for each data source to understand inherent tradeoffs.
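As a rough illustration of the assessment described above, the sketch below computes sensitivity and specificity for two data sources from counts of how their reports compared with what later proved true. The source names and every count are hypothetical, invented only to mirror the cautious-versus-fast contrast; real figures would have to come from checking each source against verified outcomes.

```python
from dataclasses import dataclass


@dataclass
class SourceCounts:
    """Outcome counts for one data source, checked against what later proved true."""
    true_pos: int   # source reported an event that really happened
    false_pos: int  # source reported an event that did not happen
    true_neg: int   # source stayed silent and nothing happened
    false_neg: int  # source stayed silent but the event did happen


def sensitivity(c: SourceCounts) -> float:
    """Share of real events the source caught: TP / (TP + FN)."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: SourceCounts) -> float:
    """Share of non-events the source correctly did not report: TN / (TN + FP)."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Hypothetical counts, invented for illustration only.
sources = {
    "cautious_source": SourceCounts(true_pos=60, false_pos=2, true_neg=98, false_neg=40),
    "fast_source": SourceCounts(true_pos=95, false_pos=30, true_neg=70, false_neg=5),
}

for name, counts in sources.items():
    print(
        f"{name}: sensitivity={sensitivity(counts):.2f}, "
        f"specificity={specificity(counts):.2f}"
    )
```

With these made-up numbers, the cautious source rarely reports something false but misses many real events, while the fast source catches nearly everything at the cost of more false alarms.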

About the Author

Sam Ransbotham is an associate professor of information systems at the Carroll School of Management at Boston College and the MIT Sloan Management Review Guest Editor for the Data and Analytics Big Idea Initiative. He can be reached at sam.ransbotham@bc.edu and on Twitter at @ransbotham.

Data is what links our abstract world to the granular things we understand and relate to. Before data becomes useful information, it must be measured, stored, manipulated, and displayed. Each of these phases combines reliable and unreliable methods. Human subjectivity is needed in many cases where technology alone can't simulate circumstantial or environmental factors. I like how this article talks about making data sources as varied as possible to cover most of what influences the data. Also, I agree completely on the benefit of knowing the rate of false positives and false negatives for each data source, which could lead to better decision making.

Determining which results are false positives and which are false negatives is itself a challenge. You don't know beforehand. Data analytics are useful, but testing for false positives and negatives requires a process that does not use the same data as a verification source. The results of analytics should be scrutinized using what humans are good at: the "smell test," or intuition. Where did the data come from? Do we trust the data and those who created the models? What are their motivations? According to research studies, most senior management decisions are made not on the basis of existing data but by networking with colleagues and acquiring new data sources. McDonald's was suffering from a decrease in same-store sales and used its data to try to determine why. Unfortunately, there was nothing in the data except confirmation of a decrease in sales. What they determined is that customers were dissatisfied with the customer experience. Slow and unpleasant interactions resulted in customers not coming back. There was no data available to determine the cause, but a visit by management would have easily exposed why sales were down. Too many people assume that data is "real," that data represents "facts." But to quote a philosopher: "all data lies." Extracting the story, fiction or nonfiction, from the data is the challenge, and using the same data to determine that isn't very helpful. Many analytics are in support of the story someone wants to tell. As a result, data selection is biased.

Good article, and I certainly agree with the idea of combining several data sources to try to paint the true picture. Establishing a link between the different data sources is perhaps another consideration that plays into how we weight one data source over another.

Comments (3)

HR Harvard

Richard Ordowich

Keith Drummond
