What Leaders Should Know About Measuring AI Project Value

Most AI/machine learning projects report only on technical metrics that don’t tell leaders how much business value could be delivered. To prevent project failures, press for business metrics instead.

Neil Webb/theispot.com

“AI” can mean many things, but for organizations using artificial intelligence to improve existing, large-scale operations, the applicable technology is machine learning (ML), which is a central basis for — and what many people mean by — AI. ML has the potential to improve all kinds of business processes: It generates predictive models that improve targeted marketing, fraud mitigation, financial risk management, logistics, and much more. To differentiate from generative AI, initiatives like these are also sometimes called predictive AI or predictive analytics. You might expect that the performance of these predictive ML models — how good they are, and how much value they deliver — would be front and center. After all, generating business value is the whole point.

But you would be wrong. When it comes to evaluating a model, most ML projects report on the wrong metrics — and this often kills the project entirely.

In this article, adapted from The AI Playbook: Mastering the Rare Art of Machine Learning Deployment, I’ll explain the difference between technical and business metrics for benchmarking ML. I’ll also show how to report on performance in business terms, using credit card fraud detection as an example.

Why Business Metrics Must Come First

When evaluating ML models, data scientists focus almost entirely on technical metrics like precision, recall, and lift, a kind of predictive multiplier (in other words, how many times better than guessing does the model predict?). But these metrics are critically insufficient. They tell us the relative performance of a predictive model (in comparison to a baseline such as random guessing) but provide no direct reading on the absolute business value of a model. Even the most common, go-to metric, accuracy, falls into this category. (Also, it’s usually irrelevant and often misleading.)
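For readers who want to see what these technical metrics actually compute, here is a minimal sketch in Python. The transaction counts are hypothetical, chosen only for illustration, and the function name `technical_metrics` is an assumption of this sketch rather than anything from the book.

```python
def technical_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common technical metrics from confusion-matrix counts.

    tp: fraud correctly flagged      fp: legitimate cases wrongly flagged
    fn: fraud the model missed       tn: legitimate cases correctly passed
    """
    total = tp + fp + fn + tn
    flagged = tp + fp            # everything the model flagged
    actual_positive = tp + fn    # everything that really was fraud
    return {
        # Share of flagged cases that were truly fraud.
        "precision": tp / flagged,
        # Share of all fraud that the model caught.
        "recall": tp / actual_positive,
        # Precision divided by the base rate: how many times better
        # than random guessing the flagged group is.
        "lift": (tp * total) / (flagged * actual_positive),
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts: 100,000 transactions, 1,000 of them fraudulent.
model = technical_metrics(tp=600, fp=1_400, fn=400, tn=97_600)
# A trivial "model" that flags exactly one (legitimate) transaction.
trivial = technical_metrics(tp=0, fp=1, fn=1_000, tn=98_999)

for name, value in model.items():
    print(f"{name}: {value:.3f}")
print(f"trivial accuracy: {trivial['accuracy']:.3f}")
```

With these made-up counts, the model's flagged group is 30 times richer in fraud than a random sample, yet a model that flags almost nothing still scores higher on accuracy. None of the four numbers says anything about dollars.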

Instead, the focus should be on business metrics — such as revenue, profit, savings, and number of customers acquired. These straightforward, salient metrics gauge the fundamental notions of success. They relate directly to business objectives and reveal the true value of the imperfect predictions ML delivers. They’re core to building a much-needed bridge between business and data science teams.
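As a rough illustration of reporting in business terms, the sketch below turns a model's errors into a dollar figure for a fraud detection scenario. Everything in it is an assumption for illustration: the transaction counts are hypothetical, and the per-case costs ($100 for inconveniencing a legitimate customer, $500 for each uncaught fraud) are simple placeholder estimates that a real project would need to refine.

```python
# Assumed per-case costs in dollars (illustrative estimates only).
COST_FALSE_POSITIVE = 100   # legitimate transaction wrongly blocked
COST_FALSE_NEGATIVE = 500   # fraudulent transaction that gets through


def total_cost(fp: int, fn: int) -> int:
    """Dollar cost of a model's errors under the assumed per-case costs."""
    return fp * COST_FALSE_POSITIVE + fn * COST_FALSE_NEGATIVE


def savings_vs_no_model(tp: int, fp: int, fn: int) -> int:
    """Savings relative to flagging nothing, where all fraud goes uncaught."""
    cost_without_model = total_cost(fp=0, fn=tp + fn)
    cost_with_model = total_cost(fp=fp, fn=fn)
    return cost_without_model - cost_with_model


# Hypothetical counts: 1,000 fraud cases, of which the model catches 600
# while wrongly flagging 1,400 legitimate transactions.
print(f"Cost with model:  ${total_cost(fp=1_400, fn=400):,}")
print(f"Savings vs. none: ${savings_vs_no_model(tp=600, fp=1_400, fn=400):,}")
```

A single savings figure like this is something a stakeholder can weigh directly against project cost, which precision or lift alone does not allow.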

Unfortunately, data scientists routinely omit business metrics from reports and discussions, despite their importance. Instead, technical metrics dominate ML practice, both in technical execution and in reporting results to stakeholders.

Reprint #: 65334

Comments (3)
Kaitki Agarwal
Eric, great article, very insightful. I see data scientists, technology teams, and business teams operating and thinking in silos. They need a bridge or framework to communicate and understand each other's world to fully utilize AI's potential in the business world.
Eric Siegel
Philip,

Thanks for your comment. Indeed, a more refined calculation would take into account the specific cost of each case of uncaught fraud. OTOH, even a reasonable estimate like that is a lot better than what's normally done in the name of business value estimation: nothing! Is that what you mean? I'd love to hear what you mean more specifically -- can't find you on LinkedIn, but please feel free to reach out to me there.
Philip Le
Thanks, Eric, for the good article. It might seem straightforward to quantify the cost at $100 for customer inconvenience and $500 for a fraudster getting away with it in the fraud detection example, but in practice it is not that simple. How can we best estimate these numbers, and what approach would let us quantify them accurately?