Counterpoints: The Sports Analytics Podcast from MIT SMR

Information Overload?

Hang on to your calculators, sports fans, we’re having our second special debate episode. This week, we take on two pressing questions in the sports analytics field. First, we’ll debate the issue of information overload and whether there is such a thing as too much data. Then, we’ll turn our attention to a very different — but related — issue: biometrics. We’ll go to the mat over whether professional athletes will be willing to share their personal biometric data in real time.

Transcript

Paul Michelman: Hang onto your calculators, sports fans — we’re having our second very special debate episode. This week, Ben and I go head-to-head on two pressing questions in the sports analytics field.

Ben Shields: First, we’ll take on the issue of data overload and debate whether there is such a thing as too much data. Then, we’ll turn our attention to a very different but nevertheless related issue: biometrics. We’ll go to the mat on whether professional athletes will be willing to share their personal biometric data in real time.

Paul Michelman: All right, Ben, let’s get into it. Data. It’s flowing from everywhere. That’s a wonderful thing. We love data! Without ever-improving data, the sports analytics field can’t progress. But can there be too much data?

Ben Shields: So I’m going to tell you why there is no such thing as too much data. And there’s three main points I’m going to cover. First, I’m going to argue that we are still at a data deficit. We’re actually a long way away from having too much data. Second, I’m going to talk about the difference between having the data and what you actually can do with it. And then finally, I’ll mention a little bit of a future point: the fact that advanced analytics methods like machine learning actually thrive on more data. The more data, the better. So my first point here: I think the sports analytics field is still at a data deficit. We’re actually a long way from being confidently able to say there’s too much data at our disposal. For example, the National Hockey League doesn’t even have player tracking data yet. That’s going to be coming into the league this upcoming season. This past year is when the NFL had a full season’s worth for every team of player tracking data. Teams in both of those leagues are just scratching the surface for how they can use this data to improve their decision-making. And player tracking data is just one part of the puzzle.

There are so many open questions around behavioral components of sports analytics. For example, for our loyal listeners on this podcast we just had Simon Strachan on here talking about his attempts to measure team chemistry — and how though they’ve made great strides, they still have a lot of work to do to better understand from a behavioral standpoint how teams and players get along and play well together.

So I think many sports researchers, especially [those] who are hungry for more data to do more analysis, would agree with me that at least at this stage of the game, we’re still at a data deficit. That fundamental point aside, Paul, I want to address the prompt head-on with this second point: Even if you have what you feel like [is] too much data, there is a difference between having the data and what you do with it. So for example, in the NBA, all 30 teams have access to Second Spectrum data. One team may feel like that’s too much information, whereas another team may feel like “we need more.” The bottom line is teams are gaining a competitive advantage in how they analyze and apply data. And in a lot of ways in such a competitive environment as sports, I can guarantee you that teams are going to want to get their hands on even more data, so that they can just find those milliseconds or squeeze out whatever performance they can get in order to win what in many cases are games of inches. And they’ll do that with not only more data but also [by] the way that they analyze it. A final point that I wanted to make about the fact that there’s no such thing as too much data is: These advanced analytics methods that we are talking about, for instance on our podcasts, thrive on more data.

Machine learning, for instance, is a methodology that thrives on ever-increasing amounts of information: the more data, the more accurate the models. We had our friend Brian Burke on the podcast, who said that his model that he’s building to map Tom Brady and other quarterbacks’ brains is hungry for more information. So all of the machine learning experts out there I think would agree with me when I say: In fact, there isn’t enough data right now for these machine learning models to perform as effectively as we might want them to.

So look, I think if we look [into] the future — 10, 15 years down the line — we could maybe have the conversation about there being too much data. But when we look at the here and now, Paul, I would say: First, there is no such thing as too much data, because No. 1, we’re still at a data deficit in the sports industry. No. 2, there’s a big difference between having the data and what you do with it. And then No. 3, these advanced analytics methods, including machine learning, thrive on more data. They actually want more of it and probably don’t even say there’s too much. I rest my case for this moment in time.

Paul Michelman: So that is very well argued, Ben. And, interestingly, your second point — it’s not just the data, it’s what you do with the data — is my central argument why there is such a thing as too much data. Now, I’m going to admit my argument is a little bit theoretical. I am abstracting the question from where we are in sports analytics because I have seen in other industries how too much data can actually be a huge drag. But I’m also going to go so far as to say I think this does apply in the sports analytics field, and I’ll explain.

It is a waste of time, attention, and resources to collect data you will not use, and it is a pernicious problem in industry. We delude ourselves into believing that collecting data is the goal — it’s not, and you made this point very well. Employing the data is the goal. So collecting data for which there is no business purpose, or which the organization is incapable of making use of over a known time horizon, is nothing more than a waste of resources and a distraction from activities and attention that can produce real value right now. You argue that there is a data deficit. And in some sectors there may be, but in other cases, a lack of data is not the problem. In many cases, there is plenty of data, and it’s mostly becoming a commodity given that teams don’t collect it on their own. They’re relying on third parties to supply the data. That diminishes the value of the data and embellishes the value of making unique use of the data.

It’s not the data. It’s the techniques, the tools, and the strategies to derive insights from the data, and then to deploy those insights for strategic advantage. That’s what’s in short supply. No one has unlimited resources. And Ben, as you have written, many sports organizations are still wrestling with the basic organizational structures and cultural change required to take advantage of even the most basic analytical insights. Organizations need to focus on creating value and advantage from the data that exists and stop following the shiny penny of measuring the next frontier in forehead perspiration drip rate. When we focus on the next great piece of data that we don’t have, or fall in love with the next big thing that we may be capable of measuring, we lose focus on creating distinguishing value and advantage from the data we already have and have yet to master. Let someone else worry about what to measure next; you concentrate on creating value from what you can already measure.

Ben Shields: Paul, you have proven yourself to be a very skilled debater because you used both one of my arguments and some of my writing to your advantage. So I applaud you for that. And, of course, on some level, I agree with you. I think the main distinction that I’m going to draw is I believe that sports organizations are a little bit different from other industries in that they have a very clear goal that they’re seeking to achieve, and that is to win games. I don’t mean to trivialize it, I don’t mean to oversimplify it, but they’re trying to win games. And they also, as a result, have a set of very focused problems to solve: What should be our starting lineup? What offensive strategy should we deploy? How should we regulate a player’s training and practice schedule? And in a world where there are clear goals and focused problems to solve, that helps these teams put the data to good use. It helps these teams separate the signal from the noise. And, in fact, in a world where teams are wrestling with the same types of goals and problems, where they will gain a competitive advantage is in new types of data and in new ways that they analyze it. And in many ways (my own biases are coming [into] play here), I think that the sports industry can serve as a very helpful example for how to use data in decision-making, even if it’s a lot of information.

Paul Michelman: I mean, the truth is, Ben, I largely agree with your argument, but I think it’s important that we look at the whole picture, right? Just as organizations are rightly identifying gaps in the data sets and identifying opportunities that they really can’t exploit without more data, there really is this huge risk that they’re taking their eyes off the prize and not taking full advantage of the data that’s already in their possession. And I think it’s a big challenge for organizations to get both right.

So let’s move on to question two, shall we?

Ben Shields: Let’s do it. Before we get into this debate, let me just explain what we mean by biometrics with a very simple example. Many of you may know of wearable technologies, for example, on the consumer market, the Fitbit. Well, these wearable technologies athletes increasingly have the opportunity to wear, and they collect biometric data like the athlete’s heart rate. That data can then be used to help modify an athlete’s training or perhaps get a sense of how healthy the athlete is and how well they’re performing. And I’m going to argue that professional athletes will over time be incentivized to share this type of information in real time.

Paul Michelman: And real time with the public, I think, is kind of what we’re talking about here. And with that framing, I’m going to tell you why they will not be willing to do this, at least not in a time horizon that we can put a finger on. It’s too foreign. It’s too scary. It’s too weird. And it will be perceived by athletes as a massive privacy risk. The incentives that will be required to overcome these hurdles will just be too great. I mean, think about the risk of hacking. Imagine your biometrics were hacked by the Russians, who then share them on fake Facebook accounts. Think about the Senate hearings. But seriously, what if the data reveals something that could jeopardize an athlete’s ability to earn as much money as she possibly can? There’s so much that’ll be left to interpretation, misinterpretation, and abuse. And then there is, I think, a pretty significant competitive concern. If an athlete’s vitals are being tracked throughout a game, who’s to say the opponent couldn’t look at the data and determine that the athlete is slowing down or, in a scarier situation, take advantage of a potential injury? Furthermore, the athlete’s performance data can change public perception about that athlete. We’ve seen reports of who the fastest or slowest players in basketball are, or which players run the most and least distance in soccer. If the data can prove that a player is not giving his full effort, then his job and reputation could be on the line. And then finally, there are logistical issues: Who actually owns this data in a professional context? Who gets to monetize it: the player or the team? I think there are just too many impediments to be overcome, at least at the highest levels of professional sports.

Ben Shields: Paul, one of the main reasons why I appreciate you so much is that you have just an incredible and consistent skepticism about the topics we debate.

Paul Michelman: It comes naturally.

Ben Shields: I think by now our listeners understand and appreciate that about you as well. And the problem with your argument is that all you do is focus on the downsides here, and you don’t talk about any of the potential upsides for athletes. And I’m going to talk about those here in a minute, but first I want to make things really clear. I am 100% for professional athletes to make the decision about their data on their own. Athletes should have the agency to decide whether their biometric data is shared in real time. And that’s one really important caveat, because they do have to weigh the downsides that you cited, versus the potential upside. I also want to give a caveat that there are lots of player union considerations as part of this question as well…. Clearly, and for obvious reasons, powerful unions in professional sports are very skeptical like yourself of where biometric data being shared in real time for public consumption could go. So I’m realistic about that. That said, I do think there could be some exciting incentives for professional athletes to share their biometric data in real time. And it comes down to a couple of reasons: (1) In terms of their on-the-field performance, they can gain a competitive advantage; and (2) there are commercial or monetization potentials for professional athletes as well. So first, from a competitive advantage standpoint, let’s imagine that we are at the NFL Combine, which has already become a made-for-television event. And all of these players are fighting for draft position, and they’re running fast 40s, and they’re jumping — but now we enter into a situation where we also see a player’s biometric data, which can give a player who may not have some of the physical gifts or run the 40 as fast another metric that could distinguish them from their competition.

I could easily imagine, in five, 10 years’ time, where when we’re analyzing athletes for the NFL draft, watching them perform in combines, for instance, that their biometric makeup could be part of the discussion as to whether a player is worth drafting very high.

Paul Michelman: Wait. How does making the biometric data public give players an advantage?

Ben Shields: Well, I’ll tell you why: Because there have already been some experiments in players sharing biometric data in real time. So for instance, in the National Rugby League State of Origin Series, Johnathan Thurston of Queensland was about ready to take a kick to win the game. Think of it as a penalty kick, in soccer parlance. And he was wearing a wearable technology that was tracking his heart rate. And what happened was: You could imagine that during this extremely high-pressure situation his heart rate would be going through the roof. No, that’s not what the data showed.

In fact, it showed that his veins were about as cold as ice and that his heart rate in that moment slowed down, and he was as calm as he could possibly be. And so what does that say? Well, it says that if Johnathan Thurston is going to be in that type of situation in the future, he’s not going to let the pressure get to him. He’s going to respond calmly and put himself in a position to succeed. Now to be clear, this was an experiment that was run [and] the data wasn’t shared in real time, but you could imagine a future where during a combine or some sort of tryout an athlete’s biometric data could give us a window into how she or he might perform in game time. I realize it sounds like a crazy idea, but it’s being tested, and I wouldn’t put it outside of the realm of possibility as a way for an athlete to gain a competitive advantage over his competitors.

Paul Michelman: So I see that, and I think that’s a really helpful example. Thank you. What I was reacting to in the NFL Combine example is something that I think is implicit in this debate — that athletes are sharing their data in real time with the public. And it’s the “with the public” that I didn’t see as at all valuable for an athlete at the Combine. With GMs, with front office personnel — that I totally get. It’s with the public that didn’t connect for me.

Ben Shields: Well, I think that’s fair. The other thing I would mention though is that when it comes to draft decisions, or any player acquisition decision that you could imagine, public perception is an important consideration. Now, of course, the Pelicans are going to take Zion Williamson because he’s the best player in the draft, but it also happened to be that hundreds, if not thousands, of people were in the streets of New Orleans supporting this positively. So if a player is lauded not only for his combine and physical skills but also for his biometric makeup, then that could result in positive public perception as well about the overall story or narrative of this player and give the organization even more reason to select him.

Paul Michelman: Of course, it could totally blow up in their face and have the opposite effect, but point taken.

Ben Shields: An expected response and one that I appreciate. All right, so look, there’s a competitive advantage possibility here. Again, this is a player’s choice. This data could be used as a way to differentiate a player from her or his competition. But a second point here about how players might be incentivized in the future to share their biometric data in real time is on the commercial end. And there are at least two opportunities. You could easily imagine an esports gamer sharing their biometric data via Twitch while viewers watch them play in real time, and having that data sponsored by whatever corporate sponsor comes into play.

Esports is a natural place for these types of media integrations. And oh, by the way, given the fact that esports athletes are increasingly their own brands — their own businesses — you could certainly imagine in the future a professional esports player cutting a deal to share their biometric data in real time for sponsorship dollars via their media broadcasts. But also you could imagine possible monetization opportunities in the gambling space. Fantasy players and gamblers are hungry for any type of injury information. And as this space, especially in the United States, becomes more professionalized, there’s going to be a hunger on the part of gamblers to know more information about athletes. This is where potentially the leagues could get involved as they are over time — maybe not today — but over time going to be thinking about how they can monetize more effectively the burgeoning gambling market in this country and globally.

Paul Michelman: Could you imagine the leagues mandating or just dictating that: (1) Players are going to be wearing biometric sensing devices and (2) that their leagues or the teams are going to share the data whether the players like it or not?

Ben Shields: Well, I don’t see that in the future. I think that players are going to have choice. I also think it’s important in this conversation to differentiate between different types of biometric data. There could be more sensitive information that might not be on the table for public sharing, whereas other forms of biometric data may be more appropriate for broadly sharing with the public.

Paul Michelman: I do think your esports example’s a good one. I think that is the scenario where you can imagine this happening, because it’s an industry that is already kind of all about the data. The participants are young and accustomed to a kind of transparency, and even embracing a kind of transparency, that I think is foreign to some people who are older — or certainly less familiar and less comfortable to people who are older. And they’re not highly compensated, at least not relative to Big Four professional athletes, and so the incentives are going to be more meaningful. So I buy that. That’s the extent of where I’m going to be willing to agree with you, however. I think when you get to, say, the NFL, which would be the place where the value would be the greatest, because that is the epicenter of gambling and fantasy sports — I just can’t see professional NFL players being willing to share anything other than the most basic data. But it was very well argued.

Ben Shields: Well, thank you, Paul. And you know, if I may, I agree with you on that point. I think that [there] is going to be a big difference between the Big Four adopting more liberal biometric sharing policies and startup leagues, for instance, either in esports or in other sports, that are trying to differentiate themselves from the establishment and could be more aggressive in this area.

Paul Michelman: This has been Counterpoints, the sports analytics podcast from MIT Sloan Management Review. We will be taking a brief hiatus for the rest of August — so Ben can go fishing. Our next show drops on Thursday, Sept. 12.

Ben Shields: You can find us on iTunes, Google Play, and wherever fine podcasts are streamed. If you enjoy Counterpoints, please take a moment to rate and review the program on Apple Podcasts, and tell your friends while you’re at it.

Paul Michelman: Counterpoints is produced by Mary Dooe. Our theme music was composed by Matt Reed. Our coordinating producer is Mackenzie Wise. Our crack researcher is Jake Manashi, and our maven of marketing is Desiree Barry.