Deodorizing Your Data

To deal with “smelly” data, try refactoring your analytics processes.

Reading Time: 4 min 

Topics

Competing With Data & Analytics

How does data inform business processes, offerings, and engagement with customers? This research looks at trends in the use of analytics, the evolution of analytics strategy, optimal team composition, and new opportunities for data-driven innovation.

In programming, people describe their sense of underlying fragility in code by noting that it “smells bad.” A bad code smell is an amalgam of signals (such as size, complexity, duplication) that combine to indicate larger, deeper problems. Similarly, analytics in organizations can stink — and it takes more than a spritz of air freshener to solve the problem.

So how do you know if your analytics needs deodorizing? Symptoms abound. Poking around with data may uncover more quality problems than insightful answers. Results may be sketchy and change drastically under the least bit of scrutiny. Different parts of the organization may hold multiple versions of the truth. Analysis that should be routine is instead redone ad hoc each time, duplicating effort with every iteration.

Certainly bad data smells are not intended — but they can be prevented by understanding how they develop. Some of the factors contributing to smelly data include:

  • Complex realities. Analytics compiles data that are snapshots taken in a complex world — and these snapshots don’t always fit into well-structured or clean models. Furthermore, that world continues to change, even if systems don’t. For example, evolving businesses and requirements led to “14 separate health plans with inconsistent approaches to defining similar types of data” at the health care provider WellPoint, according to a recent MIT SMR case study. Each system likely made sense in isolation or at the time it was developed — but any later attempt to generate analytical results must synthesize each of these disparate sources of data.
  • Acquisitions. Organizations often grow through acquisition of other, previously independent organizations, each with idiosyncratic systems. Tom Fontanella, senior IS director at Sanofi, reports that a master data management project that Genzyme undertook before being acquired by Sanofi found that “30-day payment terms [were] expressed as Net 30, 30, 30 Day, 30days, LC30, 030NL …” due to a series of acquisitions over a number of years, often in different areas of the world. This lack of consistency meant that the data on 30-day payment was squirreled away under a range of labels — a malodorous situation indeed.
  • Urgency. Operational pressure can be intense.
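
To make the payment-terms example concrete, here is a minimal sketch of how such labels might be consolidated. It assumes Python, and every name in it (`CANONICAL`, `KNOWN_VARIANTS`, `normalize_term`, `summarize`) is hypothetical rather than anything Genzyme or Sanofi used. The canonical label "NET30" is likewise an arbitrary choice, and only the variants actually quoted above are mapped. Anything else is set aside for a person to review instead of being guessed at.

```python
from collections import Counter
from typing import Optional

# Variants quoted in the article, all reported to mean 30-day payment terms.
# The canonical label "NET30" is an assumption chosen for illustration.
CANONICAL = "NET30"
KNOWN_VARIANTS = ["Net 30", "30", "30 Day", "30days", "LC30", "030NL"]


def _key(label: str) -> str:
    """Collapse case and whitespace so trivially different spellings match."""
    return "".join(label.split()).casefold()


# Lookup table built once from the known variants.
LOOKUP = {_key(variant): CANONICAL for variant in KNOWN_VARIANTS}


def normalize_term(label: str) -> Optional[str]:
    """Return the canonical label, or None if the variant is not yet mapped."""
    return LOOKUP.get(_key(label))


def summarize(labels):
    """Count records per canonical term and collect unmapped labels for review."""
    counts = Counter()
    unmapped = []
    for label in labels:
        term = normalize_term(label)
        if term is None:
            unmapped.append(label)  # flag for review; do not guess
        else:
            counts[term] += 1
    return counts, unmapped
```

An explicit lookup table is a deliberate choice over pattern matching: a label such as "LC30" carries meaning that only someone who knows the source system can confirm, so each mapping should be a decision a person has signed off on.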
