“AI” can mean many things, but for organizations using artificial intelligence to improve existing, large-scale operations, the applicable technology is machine learning (ML), which is a central basis for — and what many people mean by — AI. ML has the potential to improve all kinds of business processes: It generates predictive models that improve targeted marketing, fraud mitigation, financial risk management, logistics, and much more. To differentiate from generative AI, initiatives like these are also sometimes called predictive AI or predictive analytics. You might expect that the performance of these predictive ML models — how good they are, and how much value they deliver — would be front and center. After all, generating business value is the whole point.
But you would be wrong. When it comes to evaluating a model, most ML projects report on the wrong metrics — and this often kills the project entirely.
In this article, adapted from The AI Playbook: Mastering the Rare Art of Machine Learning Deployment, I’ll explain the difference between technical and business metrics for benchmarking ML. I’ll also show how to report on performance in business terms, using credit card fraud detection as an example.
Why Business Metrics Must Come First
When evaluating ML models, data scientists focus almost entirely on technical metrics like precision, recall, and lift. Lift is a kind of predictive multiplier: It tells you how many times better than guessing the model predicts. But these metrics are critically insufficient. They tell us the relative performance of a predictive model, in comparison to a baseline such as random guessing, but provide no direct reading on the absolute business value of a model. Even the most common, go-to metric, accuracy, falls into this category. (It's also usually irrelevant and often misleading.)
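To make these technical metrics concrete, here is a minimal Python sketch that computes them from the four counts of a confusion matrix. The function name and the transaction counts are hypothetical, chosen only for illustration (10,000 transactions, 100 of them fraudulent); they are not figures from a real model.

```python
def technical_metrics(tp, fp, fn, tn):
    """Compute common technical metrics from confusion-matrix counts.

    Assumes at least one case was flagged and at least one case is fraud,
    so that none of the denominators is zero.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)      # share of flagged cases that are truly fraud
    recall = tp / (tp + fn)         # share of all fraud that the model flags
    base_rate = (tp + fn) / total   # how often fraud occurs overall
    lift = precision / base_rate    # times better than flagging at random
    accuracy = (tp + tn) / total    # share of all cases classified correctly
    return {
        "precision": precision,
        "recall": recall,
        "lift": lift,
        "accuracy": accuracy,
    }


# Hypothetical counts: 10,000 transactions, 100 of them fraudulent.
model = technical_metrics(tp=80, fp=120, fn=20, tn=9780)

# A "model" that never flags anything is still correct on every
# legitimate transaction, so its accuracy is 9,900 out of 10,000.
flag_nothing_accuracy = (0 + 9900) / 10000

print(model)
print("Accuracy when flagging nothing:", flag_nothing_accuracy)
```

With these made-up counts, the do-nothing approach scores higher on accuracy than the model that catches 80% of the fraud, which is one way accuracy can mislead. Note that none of the numbers printed here is expressed in dollars.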
Instead, the focus should be on business metrics — such as revenue, profit, savings, and number of customers acquired. These straightforward, salient metrics gauge the fundamental notions of success. They relate directly to business objectives and reveal the true value of the imperfect predictions ML delivers. They’re core to building a much-needed bridge between business and data science teams.
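As a rough illustration of what a business metric looks like for fraud detection, the sketch below converts error counts into dollars saved. Everything in it is an assumption made for the example: the function name, the counts, and both per-case costs are hypothetical, and the calculation is simplified (it ignores, for instance, the cost of handling correctly flagged transactions).

```python
def fraud_cost(false_positives, false_negatives,
               cost_per_false_positive, loss_per_missed_fraud):
    """Total cost of a model's errors, in dollars (simplified)."""
    return (false_positives * cost_per_false_positive
            + false_negatives * loss_per_missed_fraud)


# Hypothetical per-case costs, in dollars.
COST_PER_FALSE_POSITIVE = 100   # a legitimate transaction wrongly blocked
LOSS_PER_MISSED_FRAUD = 500     # a fraudulent transaction that goes through

# Hypothetical scenario: 10,000 transactions, 100 of them fraudulent.
# Without a model, nothing is flagged, so all 100 frauds are missed.
cost_without_model = fraud_cost(0, 100,
                                COST_PER_FALSE_POSITIVE, LOSS_PER_MISSED_FRAUD)

# With a model that wrongly flags 120 legitimate transactions
# and misses 20 of the 100 frauds.
cost_with_model = fraud_cost(120, 20,
                             COST_PER_FALSE_POSITIVE, LOSS_PER_MISSED_FRAUD)

savings = cost_without_model - cost_with_model
print(f"Cost without model: ${cost_without_model:,}")
print(f"Cost with model:    ${cost_with_model:,}")
print(f"Savings:            ${savings:,}")
```

Under these made-up figures the model saves $28,000 per 10,000 transactions. The point of the exercise is that the result is stated in the same terms as the business objective, and it changes as soon as the assumed costs change.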
Unfortunately, data scientists routinely omit business metrics from reports and discussions, despite their importance. Instead, technical metrics dominate ML practice, both in technical execution and in reporting results to stakeholders.