It is incredibly difficult to judge individual talent in the NFL, because so much of a player’s ability to succeed is based on context: his teammates and the system in which he operates. But the need to isolate performance is huge. Those who gain a reliable method will have a huge advantage. So where do you start? You start with the most valuable position on the field, the quarterback. Then, you isolate further on what appears to be the greatest differentiator between elite-level quarterbacks: their minds. The ultimate goal: to understand how a quarterback processes information and to track the patterns their minds tend to follow. Or, said another way: to map Tom Brady’s brain. This is the quest of ESPN’s Brian Burke, who will be presenting this research on deep learning and quarterback decision-making at the 2019 Sloan Sports Analytics Conference next week. He joined us for a preview of his work.
Paul Michelman: When you’re headquartered in Cambridge, Massachusetts, you’re going to hear a lot about the greatness of Bill Belichick and Tom Brady — it just comes with the territory. What ends up being a little more rare is hearing the coach heap praise upon the quarterback. Belichick would rather wax poetic about the left-footed punters or the brilliance of Lawrence Taylor than sing the wonders of the man with whom history will forever have him joined at the hip. But when Belichick does deign to single out his quarterback, it’s not about his arm strength, his accuracy, or his thousand-watt smile — it’s about his brain.
Ben Shields: At the low point of this past Patriots season, following back-to-back December losses for the first time in 16 years, Belichick was asked about Brady’s decision to make a contested throw in the end zone in a crucial moment against Pittsburgh. Belichick responded with a tone that would suggest he had been asked to jump in the Charles River during the polar vortex. “You can second-guess if you want to,” Coach responded, “but nobody knows better at that time with the ball in his hands where he feels like he’s got the best chance. I don’t think anyone’s going to make it better than him.” From the man they used to call “Doom,” that’s about as high a compliment as you can get. And here, as we sit, seven weeks and another Super Bowl ring later, Brady’s decision-making remains his greatest asset and perhaps his most unparalleled gift. I’m Ben Shields.
Paul Michelman: I’m Paul Michelman, and this is Counterpoints — the sports analytics podcast from MIT Sloan Management Review. In this episode: One day, in the not-too-distant future, we will be able to map a goat’s brain.
Ben Shields: Paul, I’m already hearing the legions of Patriots fans correcting you. It should be the G.O.A.T.’s brain — as in Tom Brady, the greatest of all time.
Paul Michelman: Yeah, that makes more sense.... In this episode: One day in the not-too-distant future, we will be able to map Tom Brady’s brain.
Paul Michelman: Hang onto your hats, listeners. In this episode, we’re going full geek, but we’ll do it nicely.
Ben Shields: As we discussed in a recent episode with Cade Massey, it is incredibly difficult to judge individual talent in the NFL because so much of a player’s ability to succeed is based on context — his teammates and the system in which he operates. But the need to isolate performance is huge. Those who get to a reliable method would have a huge advantage.
Paul Michelman: So where do you start? You start with the most valuable position on the field — the quarterback. And where do you focus your attention and your analysis? Well, if you assume that no one makes it to the NFL without superior athletic skills, you isolate further on what appears to be the greatest differentiator between elite-level quarterbacks — their minds, the quarterback’s ability to understand what is happening in the moment and to make literally split-second decisions. The ultimate goal [is] to understand how a quarterback processes information and the patterns their minds tend to follow; in other words, to map Tom Brady’s brain.
Ben Shields: This is the quest of ESPN’s Brian Burke, who will be presenting his research on deep learning and quarterback decision-making at the 2019 Sloan Sports Analytics Conference next week. He joins us now for a preview of his work.
Paul Michelman: Brian, thanks for joining Counterpoints today.
Brian Burke: Thanks for having me on.
Paul Michelman: So, Brian, we’re going to geek out in this episode, and we’re excited to do that, but we also want to make sure our listeners don’t get scared off. We’re going to go easy on you, folks. In fact, Brian, we’re going to ask that you go easy on Ben and me as well. We’ll think about our conversation as following a set of stairs, with that first step assuming that we are around a third-grade reading level when it comes to machine learning. And we don’t really have to imagine that. I’ll only speak for myself: that’s basically where I am. And we’ll work our way up, right, to that Ph.D. level by the end of the program. So, let’s ease in. Tell us what brought you to this point. What interested you in this research? What’s your background that got you here?
Brian Burke: Yeah, I’ve been doing football analytics for about 12 years now. I started on my own, and I’m now with ESPN. But, before that, the Navy (unfortunately) taught me my statistics. They pulled me out of the cockpit (I used to be a pilot), sent me to grad school, and for some reason thought it’d be a good idea for me to know a lot of multivariate regression. It was completely useless for me until I got out of the Navy and [used it] in football.
Ben Shields: You’ve got a really interesting paper that you are going to share at the MIT Sloan Sports Analytics Conference this year. It’s called “Deep QB: Deep Learning With Player Tracking to Quantify Quarterback Decision-Making and Performance.” When you set out to do this research — give us some perspective — what problem were you trying to solve in this paper?
Brian Burke: The problem I was trying to solve was to find out how useful the player-tracking data could be. What kind of insight can we get from it beyond just the “Tyreek Hill hit 22 mph on this play”? So that was primarily the extent of the application of this player-tracking data which had been around for a couple of years. We had access — full access to it. Even the teams themselves didn’t have as much access as the media partners like ESPN. And I was frustrated just looking at the kinds of applications that we were doing with it. And I thought we could do a whole lot more. So, I thought, one thing that neural networks do well is a kind of multi-class classification, which means: I’m not trying to predict between two things — will one team win or the other team win? But I have maybe five things that might happen and they’re mutually exclusive, so what are the probabilities among those five things. And those five things, in my mind, might be who a quarterback throws to. And so, that was how the idea got started. Literally, the file name of all the computer code I have is like “first try,” you know, “dot r.” So, it was really just an experiment — can this work? Can we make useful insights out of this player-tracking data?
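To make the multi-class idea concrete, here is a minimal sketch of how a set of raw scores for five mutually exclusive outcomes can be turned into probabilities with a softmax function. The scores and variable names are hypothetical and chosen only for illustration; nothing here is taken from the Deep QB code itself.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for five eligible receivers on one play.
receiver_scores = [2.0, 0.5, 1.0, -1.0, 0.0]

# One probability per receiver; exactly one of them will be the target.
target_probs = softmax(receiver_scores)

# Index of the receiver the sketch considers most likely to be targeted.
most_likely = max(range(len(target_probs)), key=lambda i: target_probs[i])
```

The receiver with the highest raw score always ends up with the highest probability, and the five probabilities add up to one, which is what makes them comparable across receivers.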
Ben Shields: And so, Brian, you built this neural network. And I’m wondering if you can, in layman’s terms, explain what a feed-forward artificial neural network is?
Brian Burke: Yeah, that’s a mouthful. It’s not easy to explain over a podcast or on the radio, but I’ll do my best. Most of our listeners might know regression. So, you might have several variables. And so, imagine if the network [is] where each one of those variables points or connects to one other node and that other node is your output. And there’s some transformation, mathematical transformation. So, if you were doing a logistic regression or something, you would have a logit function in that one node. And then you would get an output and that would be your prediction or your estimate [in] that model. So, a neural network is just like that, but instead of just one node, there can be very many nodes. And you can arrange them in layers so that they successively connect to each other, so every neuron in the first layer connects to every other neuron in the second layer. And each one of those neurons, or nodes, has a weight attached to it, or a coefficient — just like a regression has a coefficient. And you feed it data and you train a set of y variables or train the model to generate those coefficients or weights between each neuron, and you get something at the end of this process that’s pretty smart.
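The regression analogy can be sketched in a few lines of plain Python. The example below assumes three made-up input variables and random, untrained weights, so the outputs are meaningless as predictions; it only shows that a logistic regression is a single output node, and that a feed-forward network stacks the same kind of node in layers.

```python
import math
import random

def sigmoid(x):
    """The logistic function used in the output node."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: every input feeds every output node."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def random_layer(n_in, n_out, rng):
    """Untrained weights (one row per output node) and zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

rng = random.Random(0)
features = [0.3, -1.2, 0.8]  # hypothetical input variables

# Logistic regression: all inputs connect to a single output node.
w_lr, b_lr = random_layer(3, 1, rng)
lr_output = dense(features, w_lr, b_lr, sigmoid)

# Feed-forward network: the same building block, arranged in layers.
w1, b1 = random_layer(3, 4, rng)
w2, b2 = random_layer(4, 1, rng)
hidden = dense(features, w1, b1, math.tanh)
nn_output = dense(hidden, w2, b2, sigmoid)
```

Training would adjust the numbers inside `w1`, `b1`, `w2`, and `b2`, which play the role that coefficients play in a regression.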
Paul Michelman: And so, v.1 of this model, or “first try” as you call it, if I understand it, [is that] based on any given situation, the model can tell you the optimal choice a quarterback can make. Is that correct?
Brian Burke: Almost. Optimal is the goal. I think we’re close to optimal. I can’t prove it. But what it’s doing — it’s doing several things. The first thing it does is it looks at the field of play. It looks at: Where are the receivers? It looks at where the defenders are. It looks at what’s going on with the quarterback. Where is he? Is he moving? Is he under pressure? It looks at not only the positions but the velocities, the orientations. So, one of the great things about this data is there are two sensors on each player, one on each shoulder pad, and that gives us player orientation. And that is an important part of the equation. So, it looks at this field of play, it looks at the array of players, it looks at all the combinations of positions and velocities, and it can understand the play. It can act as the quarterback’s decision-making function and say, who should I throw to, given this array of players. Who’s open? Who’s not? Who’s further downfield? If I throw here, is it an interception risk? And so, it’s telling us who a typical quarterback would throw to, which is not necessarily optimal. But at the same time, we can ask it questions like: What would be the expected yardage gained if you throw to each of the five different eligible receivers? And who did the quarterback actually throw it to? And we can compare those things and then we can get an idea of what really would be optimal.
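The comparison described here can be illustrated with a small sketch. The numbers below are invented stand-ins for what two of the model variants might output on a single play, and the sketch ignores interception risk entirely, so it should be read as an illustration of the comparison rather than a definition of the optimal throw.

```python
# Hypothetical model outputs for one play, indexed by receiver 0..4.
target_prob = [0.45, 0.25, 0.15, 0.10, 0.05]   # who a typical QB would throw to
expected_yards = [4.2, 6.0, 11.5, 3.1, 8.0]    # expected gain for each target
actual_target = 0                               # who the QB actually threw to

receivers = range(len(target_prob))

# The "would" answer: the most likely target for a typical quarterback.
typical_target = max(receivers, key=lambda i: target_prob[i])

# One candidate for the "should" answer: the target with the highest expected gain.
best_yards_target = max(receivers, key=lambda i: expected_yards[i])

# Expected yards given up by the actual choice relative to that candidate.
yards_left_on_field = expected_yards[best_yards_target] - expected_yards[actual_target]
```

In this made-up play the typical target and the actual target are the same receiver, while a different receiver carries the highest expected gain, which is the kind of gap the comparison is meant to surface.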
Paul Michelman: And so, when we think about “should,” it’s a combination of likelihood of completion, size of the gain, and risk? Is that right?
Brian Burke: Those things are all important. There are four variants of this model and each variant is producing something different. The first is predicting who the quarterback would throw to, right? It’s not a “should” question, it’s more of a “would” question. So, of all the quarterbacks — if we just kind of averaged out their brains — who are quarterbacks in the NFL, who would they throw to? One of the interesting things we found out is that quarterbacks, in general, are just like their coaches: a little bit risk-averse. You know, they want a completion more than they necessarily want the best outcome, which would be the most yards.
Paul Michelman: Should we just have you walk through all four of the variants now? What do you think?
Brian Burke: OK. So, the first variant does the player or the pass prediction, right? Who is going to be targeted? The second variant does the expected yardage gain. And the third does an outcome. So, there are three outcomes on every pass: First is a completion, the second would be an incompletion, and the third is an interception. So, the model can do all of those things. The fourth variant is just like the first, it’s trying to make a prediction on who the quarterback would target, but it uses something called transfer learning, which is a pretty clever method using neural networks. What it’s intended to do is mimic the decision-making of an individual quarterback. So, let’s say we take Tom Brady’s brain, and we try to download it into a USB memory chip or something. That’s kind of the goal. But the problem is there’s not enough data: there aren’t enough Tom Brady passes in the data. And so, what you do is you train the model on all the data, on all the quarterbacks we have, and then you freeze the bottom layers, the first initial layers of this neural network. You freeze those weights and then you retrain [the model]. And you allow the weights of the top executive function on the network to be retrained just on Tom Brady. And what that does is it allows you to use the full data set to train and teach this model things like geometry and velocities. And it’s really understanding these things; it’s kind of learning the Pythagorean theorem. And so, the core basics, the basic functions of the brain, are learned on all the data, and then just the executive functions are kind of trained on Tom Brady himself.
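The freeze-then-retrain step can be sketched with a tiny two-layer network in plain Python. Everything below is an assumption made for illustration: the layer sizes, the random stand-in weights (which in practice would come from training on all quarterbacks), the four invented plays, and the learning rate. The only point is that the lower layer is left untouched while the top layer is updated on one quarterback's plays.

```python
import math
import random

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hidden_features(lower, x):
    """Frozen lower layer: tanh of a weighted sum of the inputs."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in lower]

def predict(lower, top, x):
    """Probability of each target, using both layers."""
    h = hidden_features(lower, x)
    return softmax([sum(w * hi for w, hi in zip(row, h)) for row in top])

def fine_tune_top(lower, top, plays, lr=0.5, epochs=200):
    """Retrain a copy of the top layer only; lower weights are never changed."""
    top = [row[:] for row in top]
    for _ in range(epochs):
        for x, target in plays:
            h = hidden_features(lower, x)
            probs = softmax([sum(w * hi for w, hi in zip(row, h)) for row in top])
            for k, row in enumerate(top):
                # Gradient of softmax cross-entropy with respect to logit k.
                err = probs[k] - (1.0 if k == target else 0.0)
                for j in range(len(row)):
                    row[j] -= lr * err * h[j]
    return top

rng = random.Random(1)
# Stand-ins for weights already trained on all quarterbacks (random here).
lower = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
general_top = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]

# Hypothetical plays for one quarterback: (features, index of receiver targeted).
qb_plays = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([0.9, 0.1], 0), ([0.2, 0.8], 1)]

lower_before = [row[:] for row in lower]
qb_top = fine_tune_top(lower, general_top, qb_plays)
qb_probs = predict(lower, qb_top, [1.0, 0.0])
```

Because only the small top layer is retrained, far fewer examples are needed than would be required to learn the whole network from one quarterback's passes.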
Ben Shields: Thank you Brian for that explanation. I want to get into potentially some of the limitations of your current work, knowing full well that this is going to be a long-term project for you. How can you reasonably isolate a quarterback’s play in such an interdependent and often chaotic game? So, I’m thinking about Jared Goff, for instance, who has a pretty strong offensive line, great coach — how are you able to disassociate Goff from the context around him in this model?
Brian Burke: The way we do that is... Goff is presented with a picture. At the time he releases the ball, the time he throws the ball, there’s a configuration of receivers and defenders. And the scheme of both his own team and the scheme of his opponent, as well as the skills, the speeds, the abilities of these opponents, are all captured within that configuration. So, we know the positions and the velocities, accelerations, orientations — everything these players are doing. And we can say, given that configuration, here’s the expected outcome of this play. You know, you should make about a nine-yard gain, it should be completed, it should not be intercepted, and so on. Then we can look at what Goff actually did and make a comparison. So, if he is overperforming what his team and what his opponents have presented him, then we can attribute that to Goff’s individual skill. So, in that way we’re isolating quarterbacks’ performance and quarterbacks’ decision-making as much as we can. Not completely, there’s still some limitations, but probably more than anyone ever before.
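The comparison between what the configuration predicts and what the quarterback actually did reduces to a simple average. The sketch below assumes a handful of invented plays, each pairing a model's expected yards with the actual yards gained; the function name and the numbers are hypothetical.

```python
from statistics import mean

# Hypothetical plays: (expected yards from the model, actual yards gained).
plays = [(9.0, 12.0), (6.5, 0.0), (4.0, 7.0), (11.0, 15.0)]

def yards_over_expected(plays):
    """Average of actual minus expected yards across a quarterback's attempts."""
    return mean(actual - expected for expected, actual in plays)

qb_yards_over_expected = yards_over_expected(plays)
```

A positive value would suggest the quarterback is getting more out of his opportunities than the configurations alone predict, and a negative value the opposite.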
Paul Michelman: So, there’s this frozen moment in time where you actually are able to decontextualize the quarterback from his surroundings, right? Or from the other things that impact his performance over the course of a game.
Brian Burke: That’s the idea. Yeah.
Paul Michelman: Let’s look at some of the specific quarterbacks. You measured every quarterback in the NFL, right? And the metric, I believe, correct me if I’m wrong, but the kind of key metric is how the quarterback fared against the expected yards per attempt, right? So, the algorithm says this is what the yards per attempt should be, and some quarterbacks exceed that, some fall below it. When you look at the quarterbacks who exceed it the most (so, we can’t visualize this for our listeners, but there’s a scatter graph that has the names of five or six quarterbacks sitting high above everybody else), and most of those quarterbacks are generally accepted to be among the best in the game: Goff, Mahomes, Brees. But then you have at least one who kind of appears out of nowhere and that’s Ryan Fitzpatrick. And similarly, if we take the quarterback who at least those of us in this part of the world believe is the greatest of all time, he actually sits below average in terms of yards per attempt against expectations. What do those things suggest to you? Why are these outliers in there?
Brian Burke: Well, first of all, one thing is this chart is looking at yards per attempt, and there’s another element to passing which is the interception risk. And so, this chart isn’t necessarily capturing that. So, Ryan Fitzpatrick, it’s magic, right? He really did have this kind of magical season where he was throwing these deep bombs. Yeah, he really did have a kind of magical year and this is kind of capturing that. But he also had about a 5% interception rate, which is more than twice the league average. So that kind of explains Fitzpatrick. Yeah, Tom Brady’s kind of in the middle of the pack on this chart. What that would tend to suggest is that his team is presenting him with excellent opportunities. Right? They’re getting open. They’re doing the right thing — running the right routes, in the right way. They’re making the right reads. For example, they run a lot of option routes in this offense. And so, they’re doing all the right things, giving him opportunities. So, he’s not necessarily overperforming those.
Ben Shields: All right, Brian, you have talked us through the data that you wanted to make sense of, the neural network that you’ve built to interpret the data, the variants involved in the neural network. I now want to talk a little bit about the application of this model. How do you see teams and quarterbacks, coaches, using a model like Deep QB to improve their performance?
Brian Burke: One good example I point to is Kirk Cousins. So, Kirk Cousins was this free agent last year, and there was a big question within the Redskins organization: What’s his value? What’s his worth? And they didn’t think he was that good because they thought most of his performance was due to... Remember, Sean McVay was the offensive coordinator for Cousins for a while and they had an excellent receiving corps. So, they thought, well, he’s really just benefiting from that — he’s not the reason that the Redskins offense did well. So, what this could do is separate that, and we can actually measure this. And here’s the yards per attempt and interception rates and so on that you would expect a quarterback to have, given what the Redskins could do, [and] here’s what Kirk Cousins actually did and he actually overperformed that. So yes, both things were true. His scheme and his receiving corps were excellent, but he actually overperformed that slightly. So, you could say, actually, he was a big part of that offense’s success for a couple of years. So that’s one application. Another application might be for media and fans. So, folks on my side of the fence, we can learn things about the sport that we could never really learn before. Part of football is really kind of cloaked behind this curtain, and we don’t really get to see behind the scenes where the schemes are made and how the quarterbacks make their decisions. And this is kind of understanding that for us. And so, it’s going to help us understand things we never really could before.
Paul Michelman: Can you imagine then kind of a real-time overlay on the screen, where an algorithm is telling us what the quarterback should do and then we can judge. Or we won’t have to judge it, then we’ll see what they have done relative to what the model says.
Brian Burke: Yeah, no, I can not only imagine that, but we’ve made some demo reels with things like that. So, there are some more technical challenges to do this in real time, but it’s definitely possible.
Paul Michelman: So that’s really fun. I mean, we definitely get through the Kirk Cousins example how that would be an incredibly useful set of data to help a team, right? Analyze the value of signing a player as a free agent, right? Or targeting a player as a free agent. Or perhaps drafting a player — presumably this information is kind of transferable from college to pros. But what about this as a defensive tool, right? The defense would have access to the same data. What might it suggest about the way the teams scheme a particular quarterback?
Brian Burke: This could be part of a bigger, broader model. Some of my former colleagues at Disney Research built a tool like this for basketball where you could sketch out a play and then ask it what would the opposition do. You could, say, if you’re a defensive coordinator, sketch out certain route combinations, what kind of defense you would run, and then look at what a particular quarterback, you know, using variant four of that model — say, “Robo Tom Brady” — who would Tom Brady throw to in this situation? So, you know, that might be the long-range, kind of end point with something like this.
Ben Shields: Brian, I know your model has already been performing well. You state in the paper that you’re at 60% cross-validated cases that are considered a positive outcome, which is great, but I’m thinking about where you’re going to take this model. What do you need to ensure that it’s going to perform even more strongly in the future? Where are you taking Deep QB going forward?
Brian Burke: Well, one of the great things about football is it keeps on producing more and more data. And this kind of neural network approach is very data-hungry. So, the more data we can get, the better and better the model will be. That’s one thing. The second thing is I started off with an ambitious project, but I was worried it just wouldn’t work at all, [that] it just wouldn’t converge on a solution. So, I made some decisions early on to make it easy for the computer to solve. I needed some limitations: I didn’t put in all the linemen, I didn’t put in all the pass rush. I had a single variable for whether or not the quarterback was under duress. So, I think a big part of the picture for a quarterback is: Where is the pressure coming from? Are certain receivers obscured from his view where he wouldn’t see them as open? There are also things like: Quarterbacks, I notice, ignore wide-open receivers, at times, if they’re deep and wide open. But that might not be due to some fault of their own but due to the way the pass progressions were kind of designed within that play. So, he was not supposed to be looking at that receiver at that time, for example.
Paul Michelman: What’s the kind of threshold for success? So, if you’re at about a 60% success rate, for lack of a better term today, when will we consider Robo Tom Brady to be reliable? Is it 80%? Do you have a number in mind?
Brian Burke: I don’t have a number in mind. I know 60% doesn’t sound that impressive, right? It’s a D-minus, I guess, right? But what’s the standard? Well, the naive approach would be: Well, if this neural network didn’t know anything or understand anything at all, it would have a one-in-five chance of being correct. So, it’s triple that kind of naive estimate. So, that’s why I think it’s actually performing pretty well. So, I don’t know what the ceiling would be, and I don’t know what a minimally acceptable number would be, [but we] keep trying to push it higher and higher.
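The arithmetic behind the comparison is short enough to write out. This sketch assumes five equally likely targets for the naive guess, as described above, and uses the 60% figure from the conversation.

```python
n_receivers = 5

# A model that understands nothing picks one of five targets at random.
naive_accuracy = 1 / n_receivers

# Accuracy figure quoted in the conversation.
model_accuracy = 0.60

# How many times better the model is than the naive guess (roughly 3).
lift = model_accuracy / naive_accuracy
```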
Ben Shields: And Brian, obviously, I know your work is focused predominantly on football, but since plenty of other businesses and organizations are interested in machine learning, what kinds of applications of a model like yours, one that uses neural networks, do you see as being relevant to businesses outside of the sports industry?
Brian Burke: These kinds of models are already being widely used throughout many different industries and in much more advanced forms — the self-driving cars and the autopilot systems. I imagine in financial markets and things — trading decisions — these are being used to help make those. My kids are going through the college admissions process right now, and I imagine you could have an admissions model for things like that. There’s absolutely no limit. And it is a bit scary because we’re approaching things that were deemed to be safe from automation. We’re now automating jobs and tasks that used to be reserved for human beings and an NFL quarterback is now one of those jobs.
Paul Michelman: So, let’s bring this back to human beings and to the NFL to close this out. NFL general managers, NFL coaches, are notoriously resistant to analytics relative at least to front office NBA, front office Major League Baseball. What’s your sense of the receptivity to this model? Have you been working with any teams? How do you think it’s going to go over, I guess is what I’m asking?
Brian Burke: It’s not going to go over. No, I don’t expect to walk into a [head coach’s] office and show him this and say, “Hey, look, this is going to change the way you think about this sport,” or anything. This is a first step in trying to get our arms around just how far we can go with this data and just how far we can push the envelope at this point. This is just a first step. I think there are going to be many more steps in between this paper and then trying to sell this to a head coach. But yeah, I think there are some insights — that if you trust this model, there are some really good insights that it could provide a team. It’s like we talked about: sort of the risk acceptance for quarterbacks, or maybe isolating quarterback performance and skill from the rest of the team can help evaluate players. And, definitely, if this kind of data makes its way to the college level, then it would, I think, be of great use in terms of evaluating draft prospects.
Ben Shields: Great. Brian, we’ve covered a lot of ground here. I think we’re now leaving a lot smarter on your research and look forward to what you produce in the future. Appreciate you joining us.
Brian Burke: Yeah, thanks for having me.
Paul Michelman: Thanks very much, Brian.
Ben Shields: This has been Counterpoints, the sports analytics podcast from MIT Sloan Management Review.
Paul Michelman: You can find us on iTunes, Google Play, Stitcher, Spotify, and wherever fine podcasts are streamed. If you enjoy Counterpoints, please take a moment to rate and review the program.
Ben Shields: Counterpoints is produced by Mary Dooe. Our theme music was composed by Matt Reed. Our coordinating producer is Mackenzie Wise. Our crack researcher is Jake Manashi, and our maven of marketing is Desiree Barry.