The Hidden Side Effects of Recommendation Systems

Both consumers and businesses should be aware of potential decision-making biases introduced by online recommendations.


Topics

Frontiers

An MIT SMR initiative exploring how technology is reshaping the practice of management.

Recommendation engines influence the choices we make every day — what book to read next, which song to download, which person to date.

At their best, smart systems serve buyers and sellers alike: Consumers save the time and effort of wading through the vast possibilities of the digital marketplace, and businesses build loyalty and drive sales through differentiated experiences.

But, as with many other new technologies, digital recommendations are also a source of unintended consequences. Our research shows that recommendations do more than just reflect consumer preferences — they actually shape them. If this sounds like a subtle distinction, it is not. Recommendation systems have the potential to fuel biases and affect sales in unexpected ways. Our findings have important implications for recommendation engine design, not just in the music industry — the basis of our study — but in any setting where retailers use recommendation algorithms to improve customer experience and drive sales.

Consumer Choice in a Crowded Marketplace

E-commerce has dramatically affected consumer choice. Unconstrained by physical limitations of the brick-and-mortar model, businesses can offer virtually unlimited selections of products online, giving consumers access not only to popular items but to obscure, niche ones as well. There are both more needles and more hay. As consumers face a radically wider set of options, they must exercise greater care in evaluating potential products for purchase or consumption. Experience-based (or taste-based) goods such as music, books, and movies are particularly complex: Consumers must spend time experiencing them before they know if they like them. Even if the sticker prices for goods aren’t high, or the goods are included as part of subscription services, the time consumers must spend to evaluate each of them is valuable. Worse, the sunk cost of evaluation time is unrecoverable: Consumers can’t unlisten, unread, or unwatch goods that turn out to be a poor fit.

In this context, sophisticated algorithms capable of making effective personalized recommendations provide sizable benefits. They reduce search and evaluation time, drive sales, and introduce new items to consumers. Some 30% of Amazon’s page views result from recommendations [1], more than 80% of the content watched by Netflix subscribers comes through personalized recommendations [2], and more than 40 million Spotify listeners can now access personalized playlists generated by its “Discover Weekly” module, which leads to more than half of the monthly listens for over 8,000 artists [3].

More Than Just a Recommendation

For the consumer, the way systems arrive at personal recommendations is relatively easy to understand. Based on a customer’s past activities and stated preferences, these systems present new options: queues of potential items of interest, like Netflix’s “Other movies you may enjoy” and Amazon’s “Customers who bought this also bought.” As our research shows, however, such personalized recommendations do more than introduce consumers to new products; they also shape their future preferences and behaviors in unexpected ways. (See “Related Research.”)

We looked at how personalized recommendations influence preferences and willingness to pay for a common experience-based digital good: music. As in other industries involving experience-based goods, new models (such as Spotify and Apple Music) are disrupting the music industry. Digital distribution channels, including paid subscriptions, on-demand streaming, and digital downloads, currently account for about 80% of the U.S. music market. Regardless of the distribution channel, algorithms and recommendation engines significantly affect the digital consumption of music, because recommendations add value by identifying unknown songs that are more likely to strike a chord with the consumer. Surprisingly, recommendation systems alter how much consumers are willing to pay for a product that they have just listened to. Consumers don’t just prefer what they have experienced and know they enjoy; they prefer what the system said they would like. This is unexpected, since consumers shouldn’t need a system to tell them how much they enjoyed a song they just heard. The advent of recommendation systems may leave us questioning our own taste. We move from asking ourselves, “Do I like this?” to asking, “Should I like this?”

Researching Music Consumers

Our findings are based on three laboratory experiments with a total of 169 music consumers, all of them college students. In the first experiment, participants listened to songs and told us how much they would pay for each song. We randomly assigned recommendation ratings to selected songs and presented these ratings (between 1 and 5 stars) as a recommendation system’s predictions of the participants’ preference for each song. If they wished, participants could listen to song samples to reduce their uncertainty about how well they liked the music. Participants were unaware that we had randomly generated the recommendation ratings and believed the ratings were calculated from past data on their preferences. Recommendations significantly altered willingness to pay, with a 1-star increase in recommendation rating creating an average 12% to 17% increase in willingness to pay. This result is compelling, since the random recommendations were unrelated to the participants’ actual preferences.

The same effects exist for real recommendations that contain errors. In our second experiment, we used real song recommendations derived from a widely used, state-of-the-art algorithm. But we intentionally introduced random error in the predicted ratings, ranging from -1.5 stars to +1.5 stars. Again, participants were unaware of the recommendation manipulation and could listen to song samples. An intentional boost in the real recommendation rating by 1 star increased willingness to pay by 10% to 13%, on average.
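The manipulation described for the second experiment can be sketched in a few lines of Python. This is an illustration only, not the study's actual procedure: the function name `perturb_rating`, the uniform error distribution, the clipping to the 1-to-5 scale, and the sample predicted ratings are all assumptions. The article states only that the random error ranged from -1.5 to +1.5 stars.

```python
import random

STAR_MIN, STAR_MAX = 1.0, 5.0   # rating scale described in the article
MAX_ERROR = 1.5                 # error range stated in the article

def perturb_rating(predicted, rng):
    """Return (shown_rating, error) for one song.

    Assumptions: the error is drawn uniformly and the shown rating is
    clipped to the star scale. The article specifies neither detail.
    """
    error = rng.uniform(-MAX_ERROR, MAX_ERROR)
    shown = min(STAR_MAX, max(STAR_MIN, predicted + error))
    return shown, error

rng = random.Random(0)                 # fixed seed so runs are repeatable
predicted_ratings = [2.0, 3.4, 4.8]    # hypothetical algorithm outputs
for predicted in predicted_ratings:
    shown, error = perturb_rating(predicted, rng)
    print(f"predicted={predicted:.1f} error={error:+.2f} shown={shown:.2f}")
```

Because the error is independent of the participant's taste, any change in willingness to pay that tracks the error can be attributed to the displayed rating rather than to the song itself.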

In the first two experiments, participants could listen to 30-second song samples (since sampling is a common practice on retail music sites), but listening was not mandatory. Thus, the changes in willingness to pay could stem from participants using the ratings in lieu of listening themselves, but we don’t know to what degree. Although important and useful to know, it’s unsurprising that recommendations affect willingness to pay when consumers know less about a product; that is, after all, the explicit purpose of recommendation systems. In our third experiment, as a way to explore the more interesting and possibly problematic effect when consumers have familiarity with a product, we required participants to listen to all song samples before they indicated their willingness to pay. Again, randomly generated recommendation ratings significantly affected consumers’ willingness to pay. We saw an approximate 8% to 12% increase in willingness to pay for each 1-star increase in shown recommendation ratings. The effect of recommendations on willingness to pay remains strong even immediately after mandatory consumption of the recommended item, when much less preference uncertainty should exist for the consumer.
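To put the three reported effects side by side, the short sketch below applies each per-star range to a single song. The $1.00 baseline is hypothetical, and applying an average percentage to one price is a simplification; the percentages themselves are the ones reported for the three experiments.

```python
# Per-star increases in willingness to pay (WTP) reported in the article,
# stored as (low, high) fractions for each experiment.
REPORTED_EFFECTS = {
    "Experiment 1": (0.12, 0.17),  # random ratings, optional listening
    "Experiment 2": (0.10, 0.13),  # perturbed real ratings, optional listening
    "Experiment 3": (0.08, 0.12),  # random ratings, mandatory listening
}

def wtp_after_one_star(baseline, effect_range):
    """Return the (low, high) WTP after a 1-star rating increase."""
    low, high = effect_range
    return baseline * (1 + low), baseline * (1 + high)

BASELINE_WTP = 1.00  # hypothetical baseline price in dollars

for name, effect_range in REPORTED_EFFECTS.items():
    low, high = wtp_after_one_star(BASELINE_WTP, effect_range)
    print(f"{name}: ${low:.2f} to ${high:.2f}")
```

For the hypothetical $1.00 song, this prints ranges of $1.12 to $1.17, $1.10 to $1.13, and $1.08 to $1.12. The effect shrinks when listening is mandatory but does not disappear.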

Consequences for Consumers and Retailers

For consumers, recommendation engines have a potential dark side — they can manipulate preferences in ways consumers don’t realize. After all, the details underlying recommendation algorithms are far from transparent. Faulty recommendation engines that inaccurately estimate consumers’ true preferences stand to pull down willingness to pay for some items and increase it for others, regardless of the likelihood of actual fit. This may tempt less ethical organizations to inflate recommendations artificially. Even aside from the disreputable practice of direct manipulation, random error is a real problem for all recommendation systems. For example, the best-performing recommendation systems in the $1 million Netflix Prize competition, using the latest machine learning developments in recommendation algorithms at the time, were off in their rating predictions on average by 20% of the rating scale (that is, an error of about 0.8 on a scale of 1 to 5 stars).
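The "20% of the rating scale" figure follows from the width of the scale, not its top value: a 1-to-5 scale spans 4 stars. A minimal check, with a helper name (`error_as_share_of_scale`) chosen here purely for illustration:

```python
STAR_MIN, STAR_MAX = 1, 5
AVERAGE_ERROR_STARS = 0.8  # average prediction error cited in the article

def error_as_share_of_scale(error, low=STAR_MIN, high=STAR_MAX):
    """Express an error in stars as a fraction of the scale's width."""
    return error / (high - low)

share = error_as_share_of_scale(AVERAGE_ERROR_STARS)
print(f"{share:.0%} of the rating scale")  # 0.8 / (5 - 1) = 20%
```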

Both over- and underestimation are problems. Inflated ratings induce consumers to buy products they might not consider otherwise and could leave consumers disappointed from unmet expectations. Deflated ratings potentially turn off consumers from products they may have otherwise purchased. Mistakes hurt in both directions.

And the effects persist beyond dissatisfaction with a single purchase. They compound over time. After consumers experience a product, their feedback (like product ratings or purchases) influences future personalized predictions. Biased feedback can contaminate the system and lead to a vicious cycle of bias — the online retail equivalent of squealing audio feedback. Designers could also get an artificially inflated view of prediction accuracy, compromising their ability to improve systems. Even worse, unscrupulous agents could use such vulnerabilities to manipulate recommendation systems.
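The point about an inflated view of prediction accuracy can be illustrated with a toy simulation. Everything in it is an assumption made for illustration and is not taken from the study: the linear "anchoring" rule that pulls a reported rating toward the shown rating, the weight of 0.3, and the noise level. It also leaves ratings unclipped to keep the arithmetic transparent.

```python
import random

def simulate(num_items=1000, anchor_weight=0.3, error_sd=0.8, seed=0):
    """Compare apparent and actual prediction error under anchoring.

    Hypothetical model: the system shows true preference plus noise, and
    the consumer reports a rating pulled toward the shown rating by
    anchor_weight (0 = no influence, 1 = simply echoes the system).
    """
    rng = random.Random(seed)
    apparent_total, actual_total = 0.0, 0.0
    for _ in range(num_items):
        true_pref = rng.uniform(1.0, 5.0)
        shown = true_pref + rng.gauss(0.0, error_sd)
        reported = true_pref + anchor_weight * (shown - true_pref)
        # The designer can only compare predictions with reported ratings.
        apparent_total += abs(shown - reported)
        # The real miss is measured against the unobserved true preference.
        actual_total += abs(shown - true_pref)
    return apparent_total / num_items, actual_total / num_items

apparent, actual = simulate()
print(f"apparent error: {apparent:.2f} stars, actual error: {actual:.2f} stars")
```

Under this model the error a designer observes is always the actual error scaled by (1 - anchor_weight), so the system looks more accurate than it is, and the biased ratings then feed back into future predictions.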

Given that perfect prediction is not possible, retailers and managers must be aware of the potential discord from unintended side effects of their recommendations. Our findings highlight the importance of reducing bias in recommendation systems, for example, through innovations in algorithm and user interface design and through human oversight, as an ongoing priority for the future.

References

1. A. Sharma, J.M. Hofman, and D.J. Watts, “Estimating the Causal Impact of Recommendation Systems From Observational Data,” Proceedings of the Sixteenth ACM Conference on Economics and Computation (Portland, Oregon, June 15-19, 2015): 453-470.

2. C.A. Gomez-Uribe and N. Hunt, “The Netflix Recommender System: Algorithms, Business Value, and Innovation,” ACM Transactions on Management Information Systems 6, no. 4 (January 2016): 13.

3. E. Van Buskirk, “The Most Streamed Music From Spotify Discover Weekly,” July 7, 2016.

Reprint #: 60201
