Predictive Models Can Lose the Plot. Here’s How to Keep Them on Track.

Algorithmic inertia can result in guidance that leads businesses astray.


Organizations are increasingly turning to sophisticated data analytics algorithms to support real-time decision-making in dynamic environments. However, these organizational efforts often fail — sometimes with spectacular consequences.

In 2018, real estate marketplace Zillow launched Zillow Offers, an “instant buyer” arm of the business. It leveraged a proprietary algorithm called Zestimate, which calculates how much a given residential property can be expected to sell for. Based on these calculations, Zillow Offers planned to purchase, renovate, and resell properties for a profit.1 While it had some success for the first few years, the model failed to adjust to the new dynamics of a more volatile market in 2021. Zillow lost an average of $25,000 on every home it sold in the fourth quarter that year — resulting in a write-down of $881 million.2

This is an example of what we call algorithmic inertia: when organizations use algorithmic models to take environmental changes into account but fail to keep pace with those changes. In this article, we explain algorithmic inertia, identify its sources, and suggest practices organizations can implement to overcome it.

A Credit-Rating Catastrophe

To understand the phenomenon of algorithmic inertia, we conducted an in-depth study of another organization that failed to respond to changes in the environment: Moody's, a financial research firm that provides credit ratings for bonds and complex financial instruments such as residential mortgage-backed securities (RMBSs). These securities, which aggregate bundles of individual mortgages into distinct tranches with unique characteristics, proliferated during the period leading up to the global financial crisis of 2008.3

Moody’s made a concerted effort to account for environmental changes in its credit ratings by developing a proprietary algorithmic model in 2000 called M3 Prime. The model analyzed data about properties, mortgage holders, and the economy to estimate two parameters central to calculating a credit rating: expected losses for the mortgage pool and the loss coverage protection required for a security to maintain a AAA rating. An analyst would present a recommendation to Moody’s credit rating committee, which assigned a publicly posted rating for the security. Moody’s monitored these ratings and upgraded or downgraded RMBSs as the environment changed. The M3 Prime model achieved early success, so in 2006, Moody’s expanded its scope of algorithmic analysis by introducing a derivative model, M3 Subprime.

Between 2000 and 2008, Moody's provided credit ratings for thousands of RMBSs, but by 2008 it had downgraded 83% of the AAA-rated mortgage-backed securities it had rated, valued at billions of dollars. The U.S. government, along with 21 states and the District of Columbia, held Moody's responsible for the role that its inflated ratings of those and other products played in precipitating the financial crisis. In 2017, the agency agreed to pay $864 million to settle the allegations.4

This is a particularly illustrative example of algorithmic inertia with devastating societal consequences. Moody’s decisions offer an excellent context for exploring algorithmic inertia because the organization was explicitly responsible for analyzing environmental changes as part of its core service. Moreover, we were able to access detailed information about its algorithmic model from a report produced by the Financial Crisis Inquiry Commission that includes extensive interviews conducted under oath with Moody’s executives who were involved in the business at the time.5

Our analysis enabled us to identify the most significant contributing factors to algorithmic inertia — buried assumptions, superficial remodeling, simulation of the unknown future, and specialized compartmentalization — described below. (See “Four Sources of Algorithmic Inertia.”)

Buried assumptions. Failing to revisit the fundamental assumptions undergirding an algorithmic model's inputs in light of changes in the environment contributes significantly to algorithmic inertia. An original assumption of Moody's M3 Prime model was that the technology mortgage originators used to streamline the loan application process also enabled a more accurate assessment of underlying risks. But originators were increasingly underwriting mortgages based on lower credit standards and substantially less documentation than before, and the assumption was never revised to reflect this changing lending environment. The managing director of credit policy at Moody's told a federal inquiry panel that he sat on a high-level structured credit committee that would have been expected to deal with issues like declining mortgage underwriting standards, but the topic was never raised. "We talked about everything but … the elephant sitting on the table," he said.6

Moody’s model also assumed that consumers’ FICO credit scores were the primary predictive factor in loan defaults. But the quality of this data input significantly diminished over time: As the use of these credit scores became increasingly common, individuals found ways to artificially inflate them. As a result, low- and no-documentation mortgages carried latent risk that was not being taken into account in Moody’s algorithmic model.7

Superficial remodeling. This phenomenon occurs when organizations make only minor modifications to the algorithmic model in response to substantive changes in the environment. At Moody’s, some major changes to the environment included a growing number of loan originators, increasingly low-quality mortgages, and an unprecedented decline in interest rates.

Moody's response to these changes was to seek to capture more business in the rapidly growing market, so it fine-tuned the model to be "more efficient, more profitable, cheaper, and more versatile," according to its chief credit officer — not to be more accurate.8 When it modified M3 Prime to introduce the M3 Subprime model, it extrapolated loss curves for subprime loans from those for prime loans rather than developing fresh loss curves based on subprime data.

Simulation of the unknown future. Relying on an algorithmic model to produce viable scenarios for the future environment can also leave organizations vulnerable to algorithmic inertia. Moody's constructed a simulation engine featuring 1,250 macroeconomic scenarios that enabled it to estimate possible future losses based on variations in economic markers such as inflation, unemployment, and house prices. However, the simulation engine was limited by its underlying structure and assumptions: analysts did not consider the changes that were occurring or update the scenarios, so the engine failed to accurately represent the changing macroeconomic environment. Based on the belief that detailed performance histories could more precisely reveal causal links between economic stresses and loan behavior, Moody's used estimates based on historical parameters rather than expected pool loss distributions to examine behavior in stress scenarios.9

Specialized compartmentalization. This situation arises when experts in different domains are involved in an algorithm’s design and use and there is no overarching single ownership or shared understanding of the model. At Moody’s, responsibilities for the credit rating routine were divided between the domain experts (credit rating committee members) who used the quantitative model and the quantitative analysts who had developed it.

Because ownership and use of the model were distributed, and Moody’s didn’t strictly define how to use it, credit rating committee members established ad hoc rules to adjust the results of the model when its outputs didn’t conform to what their expert judgment led them to believe it ought to produce. Model outputs weren’t considered final; rather, the models were seen as tools to be used in conjunction with other approaches, and there was much divergence in how different ratings committees made their determinations.

The models were developed and modified by individuals who were distant from the domains in which they would be applied; disparate groups of domain experts then used the models in inconsistent ways without understanding their underlying logic. The managing director for rating RMBSs described the model as so technically complex that few people understood how it worked.10

This issue is at the heart of what makes algorithmic inertia hard to tackle: The models and algorithms are often so complex that domain experts can hardly grasp the details of their functioning, while data scientists are disconnected from how their models are being used in the real world.

Develop Practices That Combat Algorithmic Inertia

We have described how each of the causes of algorithmic inertia played out in Moody’s use of an algorithmic model to dynamically incorporate changes in the environment into its credit ratings. Despite recognizing flaws in the model and making active attempts to change it, the organization was unable to effectively adapt to the environment, thereby substantially contributing to the 2008 financial crisis.

To prevent similar degradation of critical algorithms’ predictive value, we suggest that organizations implement the four practices described below. (See “Keeping Algorithms Relevant.”)

Expose data and assumptions. Organizations should articulate and document the data used in their algorithmic models, including the data sources and the fundamental assumptions underlying their data selection decisions; left unexamined, those decisions can have deleterious effects.11 Models often include operationalizations of many concepts, and it is easy for organizations to lose track of these parameters, which can be buried in layers of software code. Parameters representing the environment need to be documented to ensure that they remain visible. Similarly, the fundamental assumptions undergirding the model should be articulated and periodically revisited.

Moody's used a data set on prime mortgages to train a model that was intended to be used to rate RMBSs composed of subprime mortgages. Initially, this might have been a reasonable choice due to the availability of data. But when a model's initial data set isn't refreshed, algorithmic inertia can result. As the Moody's case suggests, data is never completely accurate, objective, or flawless. Therefore, making the sources of data and assumptions about those sources transparent to algorithm users, and continually reflecting on the appropriateness of that data, are critical practices for organizations seeking to avoid algorithmic inertia.

Organizations must keep data sources clearly organized and evaluate them periodically. Different data sources have different qualities and characteristics. Ensuring that these sources are distinguished from one another before being fed into algorithmic models, and that they are processed and continually compared, enables data scientists to identify and eliminate algorithmic inertia sooner rather than later.

The assumptions underpinning the use of an algorithm should also be documented and articulated. Any attempt to model the environment involves quantification — transforming aspects of reality into numerical data. Such quantification inevitably involves making assumptions about how the environment works. However, while quantification is necessary for algorithmic models to work, details about how it is performed can get lost in the complex process of designing and using algorithmic models. Therefore, maintaining a living record of such assumptions may prevent the emergence of algorithmic inertia.
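
To make this concrete, a living record of assumptions can be a structured artifact that code can check, not just a static document. The sketch below is a minimal, hypothetical Python illustration: the ModelAssumption fields, the example entries (loosely modeled on the Moody's case), and the 90-day review cadence are all our assumptions, not practices drawn from any specific organization.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class ModelAssumption:
    """One documented assumption behind a model input."""
    name: str            # short identifier, e.g. "fico_predictive"
    statement: str       # the assumption in plain language
    data_source: str     # where the supporting data comes from
    last_reviewed: date  # when a domain expert last validated it

REVIEW_INTERVAL = timedelta(days=90)  # illustrative cadence, not a standard

# Example entries loosely modeled on the Moody's case (hypothetical).
ASSUMPTIONS = [
    ModelAssumption(
        name="fico_predictive",
        statement="Borrower FICO score is the primary predictor of default.",
        data_source="credit_bureau_feed",
        last_reviewed=date(2024, 1, 15),
    ),
    ModelAssumption(
        name="streamlined_underwriting",
        statement="Automated underwriting assesses risk at least as well as manual review.",
        data_source="originator_loan_tapes",
        last_reviewed=date(2023, 6, 1),
    ),
]

def stale_assumptions(today: date) -> list[ModelAssumption]:
    """Return assumptions overdue for expert review."""
    return [a for a in ASSUMPTIONS if today - a.last_reviewed > REVIEW_INTERVAL]

if __name__ == "__main__":
    for a in stale_assumptions(date.today()):
        print(f"REVIEW OVERDUE: {a.name} - {a.statement}")
```

Run on a schedule, such a check turns buried assumptions into visible review tasks for domain experts rather than parameters lost in layers of code.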

Periodically redesign algorithmic routines. Organizations should regularly redesign — and be willing to overhaul — their algorithmic model and reconsider how it fits into broader organizational routines. The initial design of an algorithmic model can take a lot of work, and it is natural for an organization to want to reap the benefits of that work. However, in a dynamic and quickly changing environment, it’s important to be willing not just to make incremental changes to a model but to fundamentally overhaul it if necessary.

Of course, organizations face a trade-off when it comes to overhauling an algorithmic routine: It can be very expensive to completely re-architect an algorithmic model. However, the consequences of failing to do so can be disastrous.

For example, when Moody’s had to rate an increasing number of subprime-dominated RMBSs, the organization chose to incrementally modify the M3 Prime model. However, it may have been more effective to specify the distinctions between the prime and the subprime markets and do a deeper overhaul of the original model. In addition to rethinking the algorithmic model itself, an organization can consider how it is deployed in practice: Hypothetically, Moody’s could have applied the M3 Prime model differently to different types of RMBSs — perhaps simply requiring more human intervention for tranches composed of lower-quality loans.

Redesigning and overhauling an algorithmic model is contingent upon understanding what organizational processes interrelate with the model and analyzing the implications that changes in the environment have for it. If it becomes clear that either the model or the processes that it relies on or feeds into have been rendered obsolete or ineffective, an overhaul should be seriously considered.

Assume that the model will break. It can be dangerous for an organization to think of potential future scenarios only through the prism of what algorithmic models predict: All assumptions embedded in a model limit the potential futures that can be considered. To address algorithmic inertia associated with the simulation of an unknown future, it is important to assume that the model will break. Consider scenarios beyond the scope of the algorithmic model; this requires challenging predictive assumptions as well as presuming that the model is fundamentally flawed.

An active practice of considering scenarios that are outside the model can help motivate and inspire the prior two practices — exposing data and assumptions and periodically redesigning algorithmic routines — by forcing team members to actively consider the limitations of algorithmic models.

One particularly useful approach might be to make use of qualitative predictions of the future instead of quantitative predictions that rely on available data from the past. These forms of scenario planning offer opportunities to consider radically different visions of what the future may hold. This might also entail developing hybrid algorithms that do not rely solely on past data to predict scenarios but also embed qualitative measures and expert rules introduced by domain experts, as sketched below.
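
As a minimal sketch of what such a hybrid might look like, the code below wraps a statistical loss estimate with expert-defined guardrail rules so that scenarios outside the model's assumed regime trigger human review rather than silent extrapolation. Everything here is illustrative: model_predicted_loss stands in for a real model, and the rule thresholds are invented for the example.

```python
def model_predicted_loss(pool_features: dict) -> float:
    """Stand-in for a statistical model's expected-loss estimate (hypothetical)."""
    return 0.04  # e.g., 4% expected pool loss

# Expert rules: conditions under which the model should be presumed broken.
# All bounds below are illustrative, not calibrated values.
EXPERT_RULES = [
    ("house prices falling nationally", lambda f: f["hpi_yoy_change"] < 0.0),
    ("documentation share below norm",  lambda f: f["full_doc_share"] < 0.5),
]

def rated_loss(pool_features: dict) -> tuple[float, list[str]]:
    """Combine the model estimate with expert guardrails.

    Returns the estimate plus any triggered warnings; a non-empty
    warning list means the result should go to human review instead
    of being used automatically.
    """
    warnings = [name for name, rule in EXPERT_RULES if rule(pool_features)]
    return model_predicted_loss(pool_features), warnings

# Usage: a scenario outside the model's training experience.
estimate, flags = rated_loss({"hpi_yoy_change": -0.08, "full_doc_share": 0.3})
if flags:
    print(f"Estimate {estimate:.1%} NOT auto-approved; triggered: {flags}")
```

The design point is that the expert rules do not silently adjust the number, as Moody's committees did ad hoc; instead, they escalate it, preserving a record of when and why the model was presumed to be outside its competence.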

Build bridges between data scientists and domain experts. Organizations must create processes for data scientists and domain experts to work closely together to design their algorithmic routines.12 Practically speaking, data scientists and AI specialists approach problems very differently than domain experts do. Domain experts focus on organizational routines and idiosyncratic situations, whereas data scientists focus on developing generalizable constructs based on mathematical principles. To overcome algorithmic inertia, data scientists and domain experts must work closely together to understand how characteristics of organizational routines and idiosyncratic situations map to the mathematical parameters used in an algorithmic model.

When the worlds of data scientists and domain experts are completely separate, there is also the danger that the two groups will shift responsibility to each other by superficially trusting each other's work. Such superficial trust can actually prevent crucial dialogue between the two worlds. For instance, Moody's ended up subverting the results of its credit rating model because the credit rating analysts didn't attempt to understand why the model might be generating results that didn't fit with their intuitions. Building bridges between data scientists and domain experts enables domain experts to obtain an intuitive grasp of how the algorithmic model works. Such common ground could enable organizations to create and use models that better adapt to changes in the environment.

One structural bridge-building practice that organizations can use to facilitate communication between data scientists and domain experts is establishing a position such as a product manager. This should be held by one individual with both domain and data science experience who has direct responsibility for overseeing algorithmic routines. For example, some data experts have called for the creation of a new organizational structure that includes an “innovation marshal” role — someone who is respected by both data scientists and field experts.13 Given their knowledge and expertise in both areas, these people can gain the respect of the organization by developing and maintaining “high-bandwidth, bidirectional communication channels” that help ensure that algorithmic routines are able to adapt to environmental changes.14

Another bridge-building practice is called model explainability: describing the algorithmic models in a practical and comprehensible manner. For data scientists, such explainability can facilitate access to the expert knowledge needed to counteract the sources of algorithmic inertia; for domain experts, such explainability can help them develop a deep and intuitive understanding of how the model takes environmental changes into account. Model explainability establishes common ground between two groups of professionals who have different types of expertise. Such practices enable organizations to build bridges instead of just talking about them.
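
As one concrete explainability technique (a general-purpose method, not one the Moody's model used), permutation importance measures how much a model's accuracy degrades when each input feature is randomly shuffled, giving domain experts a model-agnostic view of what the model actually relies on. The sketch below assumes scikit-learn and uses synthetic data with illustrative feature names.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: three features, binary "default" label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)
feature_names = ["credit_score", "loan_to_value", "documentation_level"]  # illustrative

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure the drop in held-out accuracy.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for name, mean_imp in zip(feature_names, result.importances_mean):
    print(f"{name}: accuracy drop when shuffled = {mean_imp:.3f}")
```

A large accuracy drop flags a feature the model leans on heavily; if domain experts believe that feature's real-world meaning has shifted (as the meaning of FICO scores did), that is a cue to revisit the model.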


Organizations seeking to reap the benefits of powerful predictive analytics are increasingly confronting the problem of algorithmic inertia. Despite leveraging dynamic algorithms to adapt to changes in the environment, organizations may find that the results are not keeping pace with new developments. By exposing data and assumptions, periodically redesigning algorithmic routines, assuming that their models will break, and building bridges, organizations can increase the likelihood that their substantial investments in algorithmic solutions will pay off with better decision-making.

References

1. C. Stokel-Walker, “Why Zillow Couldn’t Make Algorithmic House Pricing Work,” Wired, Nov. 11, 2021, www.wired.com.

2. W. Parker, “Zillow’s Shuttered Home-Flipping Business Lost $881 Million in 2021,” The Wall Street Journal, Feb. 10, 2022, www.wsj.com.

3. O. Omidvar, M. Safavi, and V.L. Glaser, “Algorithmic Routines and Dynamic Inertia: How Organizations Avoid Adapting to Changes in the Environment,” Journal of Management Studies 60, no. 2 (March 2023): 313-345.

4. "Justice Department and State Partners Secure Nearly $864 Million Settlement With Moody's Arising From Conduct in the Lead Up to the Financial Crisis," U.S. Department of Justice, Jan. 13, 2017, www.justice.gov.

5. "The Financial Crisis Inquiry Report: Final Report of the National Commission on the Causes of the Financial and Economic Crisis in the United States," PDF file (Washington, D.C.: Financial Crisis Inquiry Commission, January 2011), www.govinfo.gov.

6. Ibid.

7. A. Rona-Tas and S. Hiss, “The Role of Ratings in the Subprime Mortgage Crisis: The Art of Corporate and the Science of Consumer Rating,” in “Markets on Trial: The Economic Sociology of the U.S. Financial Crisis: Part A,” eds. M. Lounsbury and P.M. Hirsch (Bingley, U.K.: Emerald Group Publishing, 2010), 115-155.

8. “The Financial Crisis Inquiry Report.”

9. J. Siegel, “Moody’s Mortgage Metrics: A Model Analysis of Residential Mortgage Pools” (New York: Moody’s Investors Service, April 1, 2003).

10. “The Financial Crisis Inquiry Report.”

11. D. Lindebaum, V. Glaser, C. Moser, et al., “When Algorithms Rule, Values Can Wither,” MIT Sloan Management Review 64, no. 2 (winter 2023): 66-69.

12. V.L. Glaser, “Design Performances: How Organizations Inscribe Artifacts to Change Routines,” Academy of Management Journal 60, no. 6 (December 2017): 2126-2154.

13. R.W. Hoerl, D. Kuonen, and T.C. Redman, “To Succeed With Data Science, First Build the ‘Bridge,’” MIT Sloan Management Review, Oct. 22, 2020, https://sloanreview.mit.edu.

14. Ibid.
