A New Machine Learning Approach Answers What-If Questions

Stefan Feuerriegel; Yash Raj Shrestha; Georg von Krogh

Magazine Spring 2025 Issue Research Feature

Causal ML enables managers to explore different options to improve decision-making.

Published February 26, 2025. Reading time: 15 min.

Machine learning is now widely used to guide decisions in processes where gauging the probability of a specific outcome, such as whether a customer will repay a loan, is sufficient. But because the technology, as traditionally applied, relies on correlations to make predictions, the insights it offers managers are flawed, at best, when it comes to anticipating the impact of different choices on business outcomes.1

Consider leaders at a large company who must decide how much to invest in R&D in the coming year. Using traditional ML, they can ask what will happen when they increase their spending. They might find a strong correlation between higher levels of investment and higher revenue when the economy is growing. And they might conclude that, since economic conditions are favorable, they should increase the R&D budget.

But should they really? If so, by how much? External factors, such as levels of consumer spending, technology spillover from competitors, and interest rates, also influence revenue growth. Comparing how different levels of investment might affect revenue while considering these other variables is useful for the manager who is trying to determine the R&D budget that will deliver the greatest benefit to the company.

Causal ML — an emerging area of machine learning — can help to answer such what-if questions through causal inference. Similar to how marketers use A/B tests to infer which of two ads is likely to generate more sales, causal ML can inform what might happen if managers were to take a particular action.2

This makes the technology useful in many of the same business functions that use traditional ML, including product development, manufacturing, finance, human resources, and marketing.3 Traditional ML is still the go-to approach when the only goal is to make predictions — such as whether stock prices will rise or which products customers are most likely to buy. When a company wants to predict what would happen if it were to make one decision versus another — such as whether a 10% discount or none is more likely to induce a customer to make a repeat purchase — it needs causal ML.

Our research on machine learning and AI and our experience helping companies apply causal ML point to a path for using the technology successfully. (See “The Research.”) Companies will need the right expertise, too, and should boost employees’ literacy in causal ML.

What Causal ML Can and Cannot Do

Causal ML is a powerful tool, but managers may find the name misleading. The label “counterfactual prediction” would more accurately reflect what it does: predict outcomes based on hypothetical actions. The technology is best understood as a way to make better guesses rather than as a source of definitive answers. Framing it in this way can remind managers not to overinterpret the results.

It does this using causal inference, which looks at past results to understand cause-and-effect relationships among variables. But instead of focusing on why something happened, causal ML applies these relationships to predict the effects of interventions in new, forward-looking settings.

However, the method cannot explain why a causal relationship exists between a particular factor and the outcome it affects. For instance, a causal ML model might predict that reducing an R&D budget will decrease revenue, but it will not explain why that relationship exists or whether confounders — factors affecting both the decision and the outcome — might change and invalidate that prediction. Managers should use their domain expertise to evaluate whether a given prediction makes sense. This approach helps ensure that the model’s predictions are interpreted correctly and remain relevant to real-world decisions.

Like traditional machine learning, causal ML is most effective when managers have large volumes of data, their options are clearly defined, and the desired outcome is well understood. It is generally unsuitable for one-off decisions and in scenarios requiring intuition or creativity.

Choose the Right Problem — and Data

Causal ML is best at predicting the outcomes of straightforward decisions that are supported by ample historical data from internal and external sources. Operational decisions can be good candidates for the approach because they are made frequently and companies have a lot of data to support them.4 The following are examples of causal ML’s use in that context:

Booking.com collects data from thousands of hotel reservations every hour. Marketers at the company use causal ML to determine not only whether to give discounts but also which customers should get them.
Chocolate maker Lindt has extensive data about environmental conditions, equipment, packaging, and other factors that affect the quality of its world-famous truffles. Manufacturing managers use causal ML to help them fine-tune parameters such as the temperature of machines and the configurations of truffle molds.5
Hitachi ABB Power Grids turned to causal ML to reduce failure rates in its semiconductor manufacturing process, using machine performance data. It was able to cut its yield loss by about half by identifying which combination of machines consistently produced the best-quality chips.6

At Novartis, managers who had been educated about the capabilities of different kinds of machine learning were able to identify several decision-making tasks where replacing traditional machine learning with causal ML offered significant benefits. They had asked a traditional ML model whether increasing the marketing budget would increase sales, but its predictions were not helping them decide how to allocate that budget. They decided to use causal ML to evaluate how different promotional campaigns might affect future sales. They used the predictions to distribute resources to the campaigns that were likely to be most effective.

A decision that is suitable for causal ML can be expressed as a number or a binary choice (such as an amount of revenue or buy/hold). It may also be framed as a question about which action to take: to allocate a marketing budget of $10,000 or $15,000 for the next quarter, or to offer a 10% discount or none on a product.7

Further, causal ML cannot effectively address every potential use case, even one that seems suitable on the surface. Confounders (the variables that affect both the outcome and the decision) introduce biases into predictions and must be accounted for, yet they can be challenging or impossible to test for. The coverage of the data matters as well: If, for example, data is available only for product sales during an economic upturn, predictions of product sales during a downturn would be less reliable.
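To make the bias from a confounder concrete, the following Python sketch simulates the R&D example under invented assumptions. The function names, the probabilities, and the "true" effect of +2 revenue units are all hypothetical and chosen only for illustration. Stratifying by the confounder is one simple adjustment technique among many; it is not presented here as the method used by any company in this article.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Generate hypothetical company-years as (economy_up, high_rd, revenue)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        economy_up = rng.random() < 0.5
        # Confounding (assumed): companies raise R&D more often in good times.
        high_rd = rng.random() < (0.8 if economy_up else 0.2)
        # Assumed true effect of high R&D on revenue: +2 units.
        revenue = 10 + 5 * economy_up + 2 * high_rd + rng.gauss(0, 1)
        rows.append((economy_up, high_rd, revenue))
    return rows


def naive_effect(rows):
    """Correlation-style comparison that ignores the economy."""
    treated = [r for _, t, r in rows if t]
    control = [r for _, t, r in rows if not t]
    return mean(treated) - mean(control)


def adjusted_effect(rows):
    """Compare like with like inside each economy stratum, then average
    the within-stratum differences weighted by stratum size."""
    total = 0.0
    for stratum in (True, False):
        treated = [r for e, t, r in rows if e == stratum and t]
        control = [r for e, t, r in rows if e == stratum and not t]
        share = sum(1 for e, _, _ in rows if e == stratum) / len(rows)
        total += share * (mean(treated) - mean(control))
    return total


rows = simulate()
print(f"naive estimate:    {naive_effect(rows):.2f}")
print(f"adjusted estimate: {adjusted_effect(rows):.2f}")
```

Under these assumptions, the naive comparison should land well above the simulated effect of 2 because it also absorbs the boost from a growing economy, while the stratified estimate should land close to 2. The adjustment works only because the confounder is observed; an unmeasured confounder would leave the bias in place.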

When managers have determined what they want to decide, identified how they will measure the outcome, and affirmed that they have enough data, they can begin to work with data scientists to assemble and categorize that data to build their causal ML model. Business leaders and other individuals with domain knowledge are essential partners to data scientists and machine learning experts in building causal ML models that provide reliable results.

Training the model to capture complex cause-and-effect relationships requires data from at least a few dozen — and ideally, hundreds or thousands — of historical decisions. With massive amounts of data, the model can uncover connections between variables that may be unknown to managers or difficult to quantify. Less data leads to less-accurate predictions.

Broadly, causal ML requires three categories of data that were alluded to earlier: decisions, outcomes, and confounders. Decision data encompasses what managers have done in the past, such as the staffing levels or budgets they set, discounts they offered, investments they made, or processes they changed. Outcome data may include any measurable business result, such as sales volume, revenue growth, quality metrics, or productivity.

Confounders can come from internal or external sources. They may include economic conditions, workforce composition, and competitor behavior, and they can vary with the decision being made. For a marketing decision, the type of device customers use may be a confounder because those with more expensive smartphones may tend to spend more money whether or not they respond to an incentive.

For example, Neue Zürcher Zeitung, an international media company that publishes the largest-circulation newspaper in Switzerland, implemented causal ML to improve the effectiveness of editors’ content promotion decisions. The decision variable was whether an online article was promoted on one of two front pages that were served to readers. The outcome variable was a performance score that combined website traffic, reader engagement, and subscription signups. Confounders included time factors (such as the hour of the day), content characteristics (such as the article format), past performance indicators (including clicks), and past promotion decisions (including whether the article had been promoted elsewhere).
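One way a team might record the three categories of data for a case like this is sketched below in Python. The class name, field names, and types are hypothetical and merely mirror the variables described above; the article does not describe the publisher's actual schema.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PromotionRecord:
    """One historical decision (all field names are illustrative)."""
    promoted: bool              # decision: was the article promoted?
    performance_score: float    # outcome: combined performance measure
    hour_of_day: int            # confounder: time factor
    article_format: str         # confounder: content characteristic
    past_clicks: int            # confounder: past performance indicator
    promoted_elsewhere: bool    # confounder: past promotion decision


# Explicit role for every column, so the modeling code never has to guess.
ROLES = {
    "promoted": "decision",
    "performance_score": "outcome",
    "hour_of_day": "confounder",
    "article_format": "confounder",
    "past_clicks": "confounder",
    "promoted_elsewhere": "confounder",
}


def columns_by_role(role):
    """Return the field names assigned to a given role, in declaration order."""
    return [f.name for f in fields(PromotionRecord) if ROLES[f.name] == role]


print(columns_by_role("confounder"))
```

Keeping the role assignment explicit gives domain experts and data scientists a single place to review and challenge which variables are treated as confounders.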

Identify Possible Causal Factors

A valuable lesson from our work has been the importance of sketching a causal graph on a whiteboard at the start of the model development process. The graph illustrates the expected relationships between the outcome, the decision, and the confounders. Managers’ knowledge and expertise are essential here because they have repeatedly made decisions and learned to anticipate certain results.

The causal graph tells the data scientists (who should be experts in causal inference) whether to treat a variable as a cause or an effect in the model. In this way, the team can rule out reverse causality errors. That is, they can ensure that the model does not misinterpret one variable as causing another when, in reality, the effect is the opposite.

Imagine a celebrity with millions of social media followers. If we do not know much about social media or stardom, we might conclude that fame comes from having a high follower count. The reverse is more likely to be true. As even the average teenager has observed, to get millions of strangers to follow their social media accounts, they first have to do something that gets them noticed.

In the case of our R&D spending question, the budget influences revenue, not the other way around. Meanwhile, confounders such as the economic climate, market trends, or team expertise are acknowledged as driving both the budget decision and business outcomes but are not influenced by either. The model would take all of this into account. (See “A Causal Graph for an R&D Budget Decision.”)
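The whiteboard graph for the R&D question can also be written down in code so that it can be checked mechanically. The Python sketch below is a minimal illustration: the node names are assumptions based on the paragraph above, and the confounder check looks only for direct common causes, which is a simplification of how causal inference libraries analyze a graph.

```python
from graphlib import CycleError, TopologicalSorter

# Edges point from cause to effect (node names are illustrative).
CAUSAL_GRAPH = {
    "economic_climate": ["rd_budget", "revenue"],
    "market_trends": ["rd_budget", "revenue"],
    "team_expertise": ["rd_budget", "revenue"],
    "rd_budget": ["revenue"],
    "revenue": [],
}


def is_acyclic(graph):
    """A causal graph must not loop back on itself.

    TopologicalSorter reads the mapping as node -> predecessors, but edge
    direction does not matter for detecting a cycle.
    """
    try:
        tuple(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


def confounders(graph, decision, outcome):
    """Variables that directly cause both the decision and the outcome."""
    return sorted(
        node
        for node, effects in graph.items()
        if decision in effects and outcome in effects
    )


print(is_acyclic(CAUSAL_GRAPH))
print(confounders(CAUSAL_GRAPH, "rd_budget", "revenue"))
```

Adding a reverse edge from revenue back to the budget would make the acyclicity check fail, which is one way to surface a reverse causality assumption before any model is trained.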

Choose the Output

Next, managers need to choose the type of answer the model should give in response to the question (referred to in statistics as the output, or estimand): It can predict the end result of a decision or the relative benefit of one alternative compared with another.

Each of those outputs can be useful, depending on how the manager is thinking through a decision. Focusing on end results, such as potential revenues under different budget scenarios or personalized incentives for individual customers, helps with strategic planning. However, comparing the incremental effects of different decisions is often sufficient for making one: If a manager wants to know which of two ads is likely to boost sales more effectively, they do not necessarily need to predict how much revenue each variant might generate. They only need to know the relative benefit: that one ad is likely to generate three times more revenue than the other. Moreover, focusing on relative benefits generates more reliable predictions than focusing on end results. We recommend pursuing only as much granularity as is necessary.
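The difference between the two kinds of output can be shown with a few lines of Python. The revenue figures below are made up for illustration, and the sketch assumes that customers were randomly assigned to each ad, so no confounder adjustment is needed; with observational data, the comparison would first require the kind of adjustment discussed earlier.

```python
from statistics import mean

# Hypothetical revenue per customer under two ads (invented numbers).
revenue_by_ad = {
    "ad_a": [0.0, 3.0, 0.0, 6.0, 3.0, 6.0],
    "ad_b": [0.0, 0.0, 3.0, 0.0, 3.0, 0.0],
}


def expected_outcomes(data):
    """Output type 1: the predicted end result under each alternative."""
    return {option: mean(values) for option, values in data.items()}


def relative_benefit(data, option, baseline):
    """Output type 2: how much better one alternative is than another."""
    outcomes = expected_outcomes(data)
    return {
        "difference": outcomes[option] - outcomes[baseline],
        "ratio": outcomes[option] / outcomes[baseline],
    }


print(expected_outcomes(revenue_by_ad))
print(relative_benefit(revenue_by_ad, "ad_a", "ad_b"))
```

With these invented numbers, the first ad averages 3.0 per customer and the second averages 1.0, so the relative benefit is a difference of 2.0 and a ratio of 3.0. A manager choosing between the two ads needs only the second output.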

Editors from Neue Zürcher Zeitung were interested in predicting the actual click rates for each article they promoted, but the company opted instead to predict the likely net gain in performance from promoting an item. This approach enabled causal ML to make more accurate predictions about which content, when promoted, would increase clicks and subscriptions. Editors learned that promoting articles written by the editor in chief significantly increased both outcomes.8 They had been promoting the top editor’s articles sparingly, and the findings served as a starting point to revise their promotional strategy.

Train, Test, and Validate the Model

Once managers have defined the decision they want to make and their preferred type of output, data and machine learning scientists can choose the causal ML model that is right for the job. Once the model is implemented, machine learning engineers will train it using the previously categorized data.

The final step is to test and validate the causal ML model in practice to ensure that it is reliable and that its predictions result in better business performance. Validation also offers the opportunity for decision makers, including senior leaders, to gain trust in its predictions. Starting with relatively simple and straightforward problems where clear decision alternatives can be identified and assessed makes this step easier to accomplish.

Testing and validation require care because managers can observe only the outcome of the decision that was made in the real world. They have no way to know what the outcome would have been had a different decision been made. Two strategies, “human in the loop” and the familiar A/B testing approach, have proved successful.

Neue Zürcher Zeitung chose to integrate the model’s recommendations with human decision-making processes.9 Its causal ML model recommends which content to promote, but the editors make the final decisions. The model relies on the same information that editors previously used to make their promotion decisions, so they can trust that the model is not missing key elements. The causal ML model’s recommendations typically match the editors’ own gut feelings, which gives them confidence that it is reliable.

Some decisions are tricky, and editors know that their judgment is not perfect. In cases where causal ML recommends a different decision than they would have made, the editors can test the recommendation and see the result. Over time, they should see that causal ML is able to make reliable recommendations in ambiguous situations. At that point, they will be able to follow the causal ML recommendations instead of their instincts more frequently.

Hitachi ABB used A/B testing to validate the causal ML models it built to improve manufacturing quality. In one application, managers used the model to predict which of several machines would produce the best-quality output in the etching and implantation steps of the semiconductor fabrication process and contribute to the highest-quality output overall.

To confirm that the predictions were reliable, managers did a controlled experiment in which they changed the machine used for etching and implantation and kept the machines used for other processes the same. They found that the better machine for etching and implantation was the same one that the causal ML model had predicted. Thanks to causal ML, managers were able to find and address the source of manufacturing issues more efficiently than they could have with either manual methods or traditional ML.10
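A controlled experiment of this kind is typically followed by a statistical check that the observed difference is unlikely to be due to chance. The Python sketch below uses a standard two-proportion z-test on the share of good units from two machines. The counts are invented for illustration and are not Hitachi ABB's data, and the article does not say which test the company used.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(good_a, n_a, good_b, n_b):
    """Compare the share of good units from machine A and machine B.

    Returns (difference in proportions, z statistic, two-sided p-value).
    """
    p_a, p_b = good_a / n_a, good_b / n_b
    pooled = (good_a + good_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a - p_b, z, p_value


# Hypothetical experiment: 1,000 units per machine, all other steps held fixed.
diff, z, p_value = two_proportion_z_test(good_a=940, n_a=1000, good_b=900, n_b=1000)
print(f"difference in yield: {diff:.3f}, z = {z:.2f}, p = {p_value:.4f}")
```

If the machine that the causal ML model favored also wins the experiment by a margin that is statistically distinguishable from zero, the result supports the model's prediction; a null or opposite result is a signal to revisit the causal graph and the data.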

Prepare the Organization

While causal ML has the potential to improve decisions, implementing such systems requires a high level of AI literacy in the workforce, specialized technical expertise, and patience, because these projects may take longer to develop than traditional ML applications. Managers can prepare their organizations by educating themselves and their workforces about causal ML and building the interdisciplinary teams needed to develop the applications.

Many companies today are investing heavily in educating employees about traditional ML and generative AI models (such as ChatGPT) to stay competitive and innovative. If the organization plans to use causal ML, it needs to include this technology in its AI literacy efforts. Employees who are alert to the strengths and limitations of different AI approaches will be empowered to find opportunities to use them effectively.

We found that to excel at using causal ML, teams need strong expertise in data science and machine learning, along with domain knowledge. However, building such teams can be costly, particularly when it requires companies to hire data scientists or turn to external consultants and partners.

Moreover, data scientists and machine learning engineers are typically assigned to different teams. They need to work closely when developing and implementing causal ML models and have strong engagement with the business stakeholders who have domain knowledge. (Domain knowledge is also essential in traditional machine learning but is often less rigorously applied because teams do not deeply consider the underlying relationships between variables when building those models.)

For example, at Neue Zürcher Zeitung, the insights that editors and marketers have into editorial processes, customer preferences, and the long-term objectives of the brand help data scientists define variables that measure those factors. At Hitachi ABB, engineers supply the insight to define which production variables to include in the models.

Interdisciplinary teams are often plagued by a lack of common understanding, vocabulary, and ways of working. Managers need to foster an environment where cross-functional collaboration can thrive and all relevant stakeholders are involved throughout the model development process. Regular workshops, meetings, and training sessions where data scientists, machine learning engineers, and domain experts jointly explore problems, refine models, and discuss the implications of the findings can help.

Machine learning has changed how numerous organizations make decisions; causal ML can deepen insights further by predicting the effects of different choices on business outcomes. Companies are more likely to benefit from machine learning when decision makers trust the results. Knowing what causal ML can do and how it compares with traditional ML can help them choose the right projects for each technology and increase their success rates.

When managers use causal ML prudently to explore the options for straightforward decisions, they can significantly improve their operations — and, ultimately, their financial results.

About the Authors

Stefan Feuerriegel is the director of the Institute of AI in Management in the LMU Munich School of Management. Yash Raj Shrestha is the group head at the Applied Artificial Intelligence Lab at the University of Lausanne. Georg von Krogh is a professor and chair of strategic management and innovation at ETH Zurich and an associated faculty member at the ETH AI Center.

References (10)

1. S. Feuerriegel, Y.R. Shrestha, G. von Krogh, et al., “Bringing Artificial Intelligence to Business Management,” Nature Machine Intelligence 4, no. 7 (July 2022): 611-613; and P. Hünermund, J. Kaminski, and C. Schmitt, “Causal Machine Learning and Business Decision-Making,” SSRN, updated Feb. 19, 2022, https://ssrn.com.

2. S. Feuerriegel, D. Frauen, V. Melnychuk, et al., “Causal Machine Learning for Predicting Treatment Outcomes,” Nature Medicine 30 (April 2024): 958-968; V. Chernozhukov, C. Hansen, N. Kallus, et al., “Applied Causal Inference Powered by ML and AI,” PDF file (pub. by the authors, July 28, 2024), https://causalml-book.org; and C. Fernández-Loría and F. Provost, “Causal Decision-Making and Causal Effect Estimation Are Not the Same … and Why It Matters,” Informs Journal on Data Science 1, no. 1 (April-June 2022): 4-16.

Reprint #:

66336
