In a sport famous for its miracle plays and dramatic endings, it doesn’t feel like college football should be predictable year in and year out — but in the five years of the college football playoff era, a few trends have emerged in the data that suggest it’s more predictable than we may think. While winning all of your games is still the most foolproof way to get in, having a blemish on your résumé is not a death knell — there are plenty of other factors that will swing a school in or out of the top 4. Ben speaks with University of Wisconsin-Madison professor and “Badger Bracketologist” Laura Albert about her college football playoff predictive modeling and what lessons can be gleaned from the other major college playoff tournament, March Madness, in determining postseason college football rankings.
Ben Shields: It was Dec. 6, 2014, and it seemed like there was no way Ohio State was making the inaugural college football playoff. Left for dead after an early-season loss to unranked Virginia Tech, the Buckeyes had clawed their way back to sixth in the penultimate rankings. But in the Big Ten Championship, they faced the unenviable task of playing No. 11 Wisconsin, while being forced to start their third-string quarterback. Even more frightening — well, at least to those of us in the sports analytics field — there was no data or precedent to determine what OSU needed to do to make the big dance, nothing to predict just how close or how far Urban Meyer’s squad was from the teams above them. Amidst all the uncertainty, Ohio State went out and clobbered the Badgers, 59 to nothing, and shifted their narrative completely.
Paul Michelman: Armed with a shiny conference championship game trophy plus that impressive margin of victory, the Buckeyes leapt into the top 4 and into the playoff — a decision that unsurprisingly inspired a fair amount of controversy. The Big 12 was outraged that both of their contenders and conference cochampions, TCU and Baylor, had been left out. By selecting Ohio State, the selection committee implied that an early-season loss to a power conference team was no longer the end of the world, but that scheduling cupcakes could be, as those Big 12 schools learned the hard way. As frantic as the fallout of 2014 was, the case of Ohio State helped to clarify what the committee valued. And though there have certainly been snubs and surprises in the four years since, the Buckeye blueprint is still a great place to start. I’m Paul Michelman.
Ben Shields: I’m Ben Shields, and this is Counterpoints, the sports analytics podcast from MIT Sloan Management Review. In this episode, we’re sidestepping the committee and maybe even the controversy to discover just how predictable the college football playoff can be.
Paul Michelman: In a sport famous for its miracle plays and dramatic endings, it doesn’t feel like college football should be predictable year in and year out — but in the five years of the college football playoff era, a few trends have emerged in the data that suggest it’s more predictable than we might think.
Ben Shields: While winning all of your games is still the most foolproof way to get in — all apologies to a certain major state university in Orlando, Florida — having a blemish on your résumé is not a death knell. There are plenty of other factors that will swing a school in or out of the top 4.
Paul Michelman: In this week’s interview, Ben speaks with University of Wisconsin-Madison professor and “Badger Bracketologist,” Laura Albert, about her CFP predictive modeling and what lessons can be gleaned from the other major college playoff tournament, March Madness, in determining postseason college football rankings.
Ben Shields: Laura, thanks so much for joining us.
Laura Albert: Thanks for having me.
Ben Shields: So I think your research is very relevant to so many college football fans that are wondering how can their team make the college football playoff next season — what does it need to do?
Laura Albert: Well, based on some of my modeling efforts, and I will describe those in a minute, but before we jump into the details, there is a bit of a formula here. One is: Be in a major conference. That’s one thing that a team should do. The second thing is they should have a pretty challenging schedule — but maybe not too challenging, because they only want to lose one game. And that’s even slightly nuanced — they don’t want to lose the wrong game, because they need to make it to their conference championship game and then win it. And that’s pretty much the formula.
Ben Shields: So that sounds relatively simple enough. Obviously more difficult to achieve — right?
Laura Albert: Right.
Ben Shields: So help us understand how you arrived at this conclusion. What data do you use?
Laura Albert: Sure. I just do this using the math and using the data. And I think that’s a lot of fun. The data I use is just pretty simple data. I like to actually teach topics like sports analytics in the classroom. And so the data I use is just simple. It’s the games that were played and the scores (and really the score differential) and where the game was played (that is, home, away, or neutral setting). And then I do look at the teams that played that specific matchup — that’s embedded in the model, because the strength of schedule really matters, and who you beat and who beats you is really important information. And that gives me just a few data points. Each team plays about 12 games over the course of the season, and I can get a pretty good ranking, [one] that matches an expert’s ranking, after just a few games — usually about seven or eight is adequate. And I’ve got to say that I think that’s really beautiful and amazing that we can do that just with mathematical models — that we could get a ranking that looks like it was generated by an expert, with just a few data points.
Ben Shields: Laura, that to me is really quite striking, that in this era of big data you’re telling me that you can be fairly consistent with expert rankings and with the teams that actually make the playoff, just through a few data points — that’s pretty striking.
Laura Albert: It is. I guess I’m saying big data is overrated a little bit. Of course, in a lot of actual applications, big data helps us do things we’ve never done before. But I think what’s lost in discussion is just how powerful a few data points can be, if they’re the right data points and we look at them the right way. And I think that really comes through in ranking college football teams.
Ben Shields: OK, so that’s interesting that you say if you have quality data and you look at them in the right way. So I want to turn a little bit to the models that you have been developing to understand the college football playoff. Can you explain your model in a way that is understandable for our listeners?
Laura Albert: Sure. I base my models on Markov chains, and a Markov chain is a mathematical model that’s used to understand how a system evolves over time. And this is usually a system that has some randomness or something that’s unpredictable in it. So it’s often used to model the stock market and the spread of disease, and it’s also used to rank sports teams or just solve ranking problems in general. And the idea here is, in a Markov chain... it’s a math model, right? So it’s not really an equation, but there’s a series of equations, series of linear — it’s a linear system. And we move from team to team. So each team is like a state in the system. And I move to a team... So if I’m in a team, I move randomly to one of the teams that beats that team, right? And I keep doing that, and I move from team to team an infinite amount of times, right? So I kind of do this over a very long time horizon. And what happens is: I’m always going from teams that lose to teams that win. So when I move to teams that win a lot, I visit those the most frequently. And I measure how frequently I’m … with these teams, and the teams that win a lot and beat very good teams are the teams I visit the most often. And that’s my measure for how I rate a team. And then I just sort those. And the team that I visit most often is the best team in the country, which is quite often Alabama. And that helps me rank the teams. And it’s pretty amazing. And it’s a pretty simple concept, but it seems to work really well. What I do in my model is I … that probability that I move from one team to another. I only can move if they play each other. So that’s where that schedule comes into account. And I move probabilistically, so I don’t always go from losing teams to winning teams. I can go in either direction. But that probability that I move in one direction versus the other is based on how overwhelming that game outcome was. So it looks at the score differential and where the game was played. 
So if you win big at an away game, that would give you a large probability of the losing team moving to that winning team. And there’s home-field advantage, so you don’t really get quite as much credit for a home win as an away win. And that means there’s no formula for strength of schedule in a model such as this. The strength of schedule is endogenous to the model — it’s sort of determined on the fly after a bunch of games have been played. And that’s sort of the model in a nutshell. And I train the model on historic data and looking at teams that play each other twice in a season, or in back-to-back seasons, to look at the probability that they’d kind of beat that team based on a score differential in the first game — and based on where that first game was played.
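Editor's note: the random walk described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model discussed in the interview. The logistic curve in `p_winner_better`, the 3-point home edge, the scale of 10, and the three toy games are all hypothetical; the interview only says that the move probability depends on score differential and venue and is trained on historical repeat matchups.

```python
import math

def p_winner_better(margin, winner_site, home_edge=3.0, scale=10.0):
    """Assumed logistic map from a venue-adjusted margin to a probability.

    A road win earns extra credit and a home win earns less (hypothetical values).
    """
    adj = {"home": -home_edge, "away": home_edge, "neutral": 0.0}[winner_site]
    return 1.0 / (1.0 + math.exp(-(margin + adj) / scale))

def markov_ratings(games, iters=2000):
    """games: (winner, loser, margin, winner_site) tuples. Returns team -> rating."""
    teams = sorted({t for g in games for t in g[:2]})
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    P = [[0.0] * n for _ in range(n)]
    played = [0] * n
    for winner, loser, margin, site in games:
        w, l = idx[winner], idx[loser]
        p = p_winner_better(margin, site)
        # Standing at the loser: move to the winner with probability p.
        P[l][w] += p
        P[l][l] += 1.0 - p
        # Standing at the winner: move to the loser with probability 1 - p.
        P[w][l] += 1.0 - p
        P[w][w] += p
        played[w] += 1
        played[l] += 1
    # Each game a team played is chosen with equal probability.
    for i in range(n):
        P[i] = [x / played[i] for x in P[i]]
    # Power iteration approximates the long-run share of visits to each team.
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return dict(zip(teams, pi))

# Toy schedule with made-up teams and scores.
toy_games = [("A", "B", 14, "away"), ("A", "C", 21, "home"), ("B", "C", 7, "neutral")]
ratings = markov_ratings(toy_games)
print(sorted(ratings, key=ratings.get, reverse=True))
```

Strength of schedule never appears as a formula here. It emerges because the walk can only move between teams that actually played each other, which matches the point made in the answer above.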
Ben Shields: I have to ask you: Are you creating multiple different types of ranking models as part of this work?
Laura Albert: Yeah. So if you look at that basic approach, there’s some different things you can try out in that framework. And I’ve done that. While the models mostly agree, there are some differences, especially if you look at that fourth team to make it in the playoff. You know, there’s a big difference between being ranked fourth and being ranked fifth, right? You’re in the playoff or you’re out of the playoff. And I started tinkering around with some of that, and it was hard to determine which model was really right. You know, you never really know. They both say that there’s a chance that these teams are really good. And I also started implementing other models that weren’t always Markov chain based. And those would give me similar but different rankings. And I ended up with a series of rankings that I use to rank the sports teams. And what I did was I kind of wrapped a Markov chain around my Markov chains, and that helped look at the areas of agreement and also disagreement between the rankings. And in that case, generally all teams sort of usually moved towards Alabama, whoever’s the top-ranked team. But that helps me combine different ranking methods and give me sort of like an über-ranking that takes into account several different ideas in terms of ranking. And I find that to be a little bit more robust to any one single game outcome. And I really struggle with what to do with these games with the wild endings. Because as you know, in college football, one game, one loss, can mean you’re out of the playoff. And sometimes those losses can be kind of weighed pretty heavily in any single ranking. And if I do a few things, a few different rankings, and I combine them, then it’s a little bit more of a robust way to take some of those data points into account.
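Editor's note: the interview does not spell out how the rankings are combined, so the following is only one plausible way to "wrap a Markov chain around" several rankings. From the current team, the walk picks another team at random and moves there with probability equal to the share of rankings that place that team higher. The damping term, the function name, and the three toy rankings are assumptions made for illustration.

```python
def aggregate_rankings(rankings, damping=0.05, iters=2000):
    """rankings: lists of the same teams, best first. Returns a combined order."""
    teams = sorted(rankings[0])
    n = len(teams)
    pos = [{t: r.index(t) for t in teams} for r in rankings]
    P = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(teams):
        for j, b in enumerate(teams):
            if i != j:
                # Share of rankings that place b above a.
                frac = sum(p[b] < p[a] for p in pos) / len(pos)
                P[i][j] = frac / n  # b is proposed with probability 1/n
        P[i][i] = 1.0 - sum(P[i])  # otherwise stay put
    pi = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Small uniform restart keeps a unanimous No. 1 from absorbing all mass.
        pi = [(1.0 - damping) * s + damping / n for s in step]
    score = dict(zip(teams, pi))
    return sorted(teams, key=score.get, reverse=True)

# Three made-up rankings that disagree about the middle of the order.
toy = [
    ["Alabama", "Clemson", "Ohio State", "Oklahoma"],
    ["Alabama", "Ohio State", "Clemson", "Oklahoma"],
    ["Clemson", "Alabama", "Ohio State", "Oklahoma"],
]
print(aggregate_rankings(toy))
```

Because every ranking contributes only a fraction of each move, one ranking that overreacts to a single wild game has limited pull on the combined order, which is the robustness the answer above describes.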
Ben Shields: So Laura, how long have you been analyzing the college football playoff — for how many seasons?
Laura Albert: I have been doing it since the beginning. So I think that’s about five playoffs since 2014.
Ben Shields: And the results of your modeling have been fairly consistent with the four teams that make it. Is that correct?
Laura Albert: That’s true. So I think I just disagreed maybe once when Alabama made it in two years ago, when they... didn’t even make it into the SEC Championship game. We had an undefeated Central Florida team. So at the end of the season, I’m showing a lot of agreement with what the committee does. And I just, you know, I just run a little math in the background. I don’t have to get together and meet and do all this work every week. So I don’t know, I think I’m more efficiently coming to the same conclusions.
Ben Shields: The power of math.
Laura Albert: The power of math, indeed.
Ben Shields: Since you mentioned UCF. What are you saying to UCF fans? What is your message to them?
Laura Albert: My message is: I would’ve put them in the playoff that year. I actually think the committee didn’t get it right that year. But I did go back and look at that season in particular. And I will say at this point, I don’t only rank, I also forecast. So the ranking is descriptive, it’s just describing the past and who are the best teams right now, and the forecasts look into the future. And the forecasts are really, really interesting, because I take the remaining schedule into account — I simulate that thousands of times. And then based on the remaining schedule and what happens in that simulation, different teams may make it into the conference championship games, and I could simulate those. And then I can look at who might be the top four ranked teams at the end of the season, in the forecast. And if I do that many times, I really see some interesting patterns in terms of strength of schedule and the remaining schedule, and also the importance of making it into the conference championship game, which I alluded to earlier.
OK, so what does all that have to do with Central Florida? I looked at that season many times, and I looked at the forecast that I was generating well before the season was over. So UCF was undefeated at any point in the season, but in the middle of the season, there was still a good chance that they could lose a game before the season was up. And when I looked at the forecast, I found that they would never even have a small chance of making it into the playoff until really the end of the season. And that was a little surprising to me. I thought they would always have a pretty reasonable chance, maybe being in my top 10 teams to make the playoff with four or five weeks to go. But they really weren’t. And that’s because the strength of the schedule just wasn’t there for them. I had them ranked fifth at the end of the season, and I had Wisconsin ranked fourth. Wisconsin didn’t make it into the playoff, because they had an undefeated regular season but lost the Big Ten Conference Championship Game to Ohio State. And I would agree Wisconsin shouldn’t have made it into the playoff, because that was the one game you absolutely cannot lose, although they lost only one close game in the entire season, which is why they were ranked fourth. So I would’ve put UCF in as they were my fifth-ranked team. But I do recognize that it is hard to justify a team like Central Florida making it into the playoff. I never had them ranked higher than fifth that entire season, despite their undefeated record. And that’s where we come back to the factors that explain a team’s ability to make it to the playoff. In that year, I think they might have been ranked even lower had that not been an easy year to get into the playoff. I think that was the [first time] a team that didn’t make it into its conference championship game was invited to the playoff. So the committee dug kind of deep for Alabama and really only let them in because other teams like Wisconsin lost.
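Editor's note: the forecasting step described in this answer (simulate the remaining schedule thousands of times, then see who lands in the top four) can be sketched as a small Monte Carlo loop. This is a simplified illustration, not the forecast from the interview: it assumes a logistic win probability on rating differences, orders teams by losses and then rating instead of re-running a ranking model, and leaves out conference championship games. All names, ratings, and games are hypothetical.

```python
import math
import random

def win_prob(r_a, r_b, scale=1.0):
    """Assumed logistic chance that the team rated r_a beats the team rated r_b."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b) / scale))

def playoff_odds(ratings, losses, remaining, spots=4, sims=5000, seed=0):
    """Share of simulated seasons in which each team finishes in the top `spots`."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    made = {t: 0 for t in ratings}
    for _ in range(sims):
        sim_losses = dict(losses)
        for a, b in remaining:
            loser = b if rng.random() < win_prob(ratings[a], ratings[b]) else a
            sim_losses[loser] += 1
        # Simplified final order: fewest losses first, rating breaks ties.
        order = sorted(ratings, key=lambda t: (sim_losses[t], -ratings[t]))
        for t in order[:spots]:
            made[t] += 1
    return {t: made[t] / sims for t in ratings}

# Made-up ratings, loss counts, and remaining games for five hypothetical teams.
ratings = {"A": 2.0, "B": 1.5, "C": 1.0, "D": 0.5, "E": 0.0}
losses = {"A": 0, "B": 1, "C": 1, "D": 1, "E": 2}
remaining = [("B", "C"), ("D", "E"), ("C", "D")]
print(playoff_odds(ratings, losses, remaining))
```

A fuller version would re-rank the teams inside every simulated season, so that a weak remaining schedule holds down an undefeated team's odds in the way described for UCF.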
Ben Shields: Yes, I’m sure, especially among UCF fans, that even though the math might be on their side, in the end they didn’t get their wish. Nevertheless, that’s the way sports sometimes turns out. I want to ask you one final question about the college football playoff before moving onto your work with March Madness, and that is: What implications, if any, would predicting the college football playoff have if it expands to eight teams?
Laura Albert: Oh, well, I think that is really interesting if it expands to eight teams. And in that case, it might not have quite the same formula for picking the teams, which I like right now — that any four teams could make it into the playoff. And that’s why I also wish UCF had made it in — that there aren’t automatic bids, and maybe that’s something that would occur once we start expanding the playoff. I think it’s more exciting if different teams make it in from year to year, as a fan — unless it’s my team. I like to see a mix of different teams. And as a professor, I would like to say those four extra teams in the playoffs — that would take some time away from academics. I’d like to see that be part of the conversation even if we end up expanding the playoff to eight teams.
But, from looking at the rankings, it’s really hard, and it’s kind of fragile to make it [into] those top four teams every year. And so there are teams that are perennially good: Alabama, Ohio State. And they’re not in the playoffs every year. I guess Alabama’s had a good recent run. And if expanded to eight teams, I think Alabama’s always finished in the top 8 of my rankings. You know, that might make it a little less exciting from the fan perspective. I think there probably is some correlation to having different teams in the playoff and the excitement of the playoff. But almost certainly, the true best team would probably have a better chance to win the overall college football championship.
Ben Shields: Speaking of playoff systems that lead to sometimes fluky wins or unlucky losses, let’s turn now to March Madness, which I know you’ve done some work on in terms of predicting as well. So how do you approach March Madness predictions? In the same ways as the college football playoff? And maybe — what’s different about your approach, as well?
Laura Albert: Sure. So what I do for March Madness — and first of all, I’ll say, I only rank the men’s basketball teams; I haven’t looked at the women’s yet, although I know I should. There’s only so many hours in the day — but what I do is I just rank the teams. I don’t do a forecasting element. And the reason I don’t for March Madness is because of the automatic bids that I mentioned earlier — I don’t end up forecasting the rest of the season. But it’s the same basic idea in terms of ranking the teams. And I had seen different math models for ranking men’s basketball teams before I started with football. One of the differences is that there are about 30 games in a season versus about 11 or maybe 12 when I started with football. So there’s many more games. There’s more crossover games with different conferences in basketball.
And so usually what we see with data is [that] more data means you can make a better prediction. We’re always trying to get some sense of the true underlying rankings, because we never really know who the best is. We just have some evidence of the best. We always have the sample — which is really what games have been played and what were their outcomes. And that’s also why I was so surprised in football that we can get such good rankings so quickly — compared to basketball, where they have many more games. I thought it wouldn’t really be possible in football with such a short season. So what I guess I’m saying is that there are a lot of similarities between ranking football and ranking basketball teams. What’s different is the data. So the scores are different, the scoring differentials, the kind of parity that we see. In basketball, many teams in the same conference play each other twice, and so you can see how often a team will win both of those games in that two-game series over the course of the season. I thought there’d be a lot more correlation there than there actually is in the data. And when I give talks on this, I show scatter plots. And it’s always like the most shocking visual in my talk. It’s like, what? You win by 20 points at home and you only have like a 62% chance of winning your next game? And I just look at the data, and that’s pretty surprising how random it can be, even if these are the same teams and the same players just playing twice. And so I just train the model on different data, but the same underlying dynamics is really useful in ranking basketball teams.
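Editor's note: the repeat-matchup idea in this answer (how often does the team that won the first game also win the second?) can be estimated by grouping first-game margins into bins and counting rematch wins. The sketch below assumes that binned approach and uses invented data; the 62% figure quoted above comes from the speaker's own data, not from this example. Venue could be added as a second key in the same way.

```python
from collections import defaultdict

def rematch_win_rates(pairs, bin_width=10):
    """pairs: (first_game_margin, won_rematch) for the first game's winner.

    Returns {bin_start: share of rematches won} for each margin bin.
    """
    tally = defaultdict(lambda: [0, 0])  # bin start -> [rematch wins, total]
    for margin, won_rematch in pairs:
        start = (margin // bin_width) * bin_width
        tally[start][0] += int(won_rematch)
        tally[start][1] += 1
    return {b: wins / total for b, (wins, total) in sorted(tally.items())}

# Invented outcomes: (margin of the first win, did that team win the rematch?)
toy_pairs = [(3, True), (7, False), (12, True), (18, True), (15, False), (25, True)]
print(rematch_win_rates(toy_pairs))
```

Rates like these are the kind of input a ranking model needs to turn a score differential into a move probability.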
Ben Shields: And Laura, the results of your work on ranking basketball teams — I assume it’s relatively close to some of the other ranking systems, is that correct?
Laura Albert: It’s not bad. And I will say that there are about three times as many Division One basketball teams as football teams. And so with so many different teams in so many different matchups, you can kind of go up or down by 20 in the rankings, sometimes, from ranking to ranking, if you’re in the middle of the pack. If you’re at the top or the bottom, those are a little bit more stable. And so that’s also been interesting in terms of combining the different ranking methods. In the ones that I build, I notice that there’s a lot more variability with so many more teams, and frankly, many more teams even in a conference. And you have so many mid-major conferences in basketball, as well. And the rankings do a good job of trying to match up equal teams across different conferences. Ranking within a conference is much easier — because they play so many games with one another, it’s pretty straightforward to use different methods and draw similar conclusions. But getting all — I think it’s 352 or 353 teams now — getting all those in the right order is a bit challenging. I don’t build a special model for the tournament itself, which I would love to do, but it’s always a really busy time in the semester, and it’s hard to collect additional data. But I would like to collect additional data, because in the tournament, those teams are maybe not the team they’ve been all season. There have been injuries, so you can look at additional data points. You also have a lot of travel, or not a lot of travel, in the tournament, which affects the outcome. And so we can kind of home in on actual win probabilities a little bit more in the tournament. So when I fill out my bracket, I do look at my rankings, and they’re surprisingly pretty informative of who might win and which upsets are likely to be called. I can see that mid-major teams that are ranked very highly, like Loyola two years ago — they were ranked like 27th in my ranking, just really, really high for a mid-major team. 
And lo and behold, they went to the Final Four. Those are the types of things I can see from the rankings.
Ben Shields: Did you have Loyola going far in the tournament based on your rankings? I guess the other way of putting the question is: Do you follow your own advice?
Laura Albert: Well, I do follow my own advice. But I didn’t have them in the Final Four because I thought there [were] more likely teams to make it in the Final Four. And it’s always a balance to try and get the most points versus [to] call the right, obscure events. And so usually my brackets are pretty chalky, I will say. But, I tried to do a little bit better in the Final Four and really identify the best team to come from each region.
Ben Shields: That’s a great strategy. All right, so you alluded to this a little bit earlier, but I’m curious: Thinking about your work with the college football playoff and/or March Madness, what additional data and/or techniques would improve your work on either of these prediction exercises?
Laura Albert: First of all, I should say, as you mentioned before, I’m a college professor — I teach these methods in the classroom. I really like to do that. I’m always a little bit biased towards methods that I can bring into the classroom and talk about them. Markov chains are very flexible, and they are really useful, but there are a number of other data science techniques that we can use to rank sports teams. I’ve talked about Colley matrices in the classroom, Pythagorean scoring as a way to rank sports teams. I haven’t looked yet at speed of play, which is pretty important for basketball and explains some of the point differentials. And so I can fine-tune that a little bit based on... the statistics reflect speed of play and probably should be taken into account. So I’m always interested in some of those data points, but I ultimately like to talk about things in the classroom and then be able to explain the ideas to people who don’t like sports, which is like half the class quite often.
But for very specific outcomes like specific team matchups, I would look at things like injuries. Injuries are so hard to model. Anything that looks at data is ultimately a reflection of the past. And when you have a sudden injury, all of a sudden the past is no longer really predictive of the future. And that’s always a challenge in some of the models. But being able to look at the effect of injuries over a longer time is something that could be taken into account. Also things like coach experience and team experience, preseason rankings, how a team did the previous season — these are all things that can be predictive and can inform decisions if you really want to bet on the games. But I just kind of do it for fun. And then for specific sports, you can actually look at the structure of the game. I’ve been starting to look at volleyball data. [It’s a game in] which the players rotate, and they switch from the front to the back. And there’s a lot of structure in the game, much like in baseball. And taking that structure into account really helps us quantify the value of doing very specific things like putting this person in this part [of] the lineup.
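Editor's note: two of the classroom methods named earlier in this answer are short enough to sketch. The Colley method solves a linear system built only from wins, losses, and who played whom; the version below solves it by simple repeated substitution instead of a matrix library. The Pythagorean expectation estimates a winning percentage from points scored and allowed, and its exponent varies by sport, so it is left as a required argument. The teams and results are made up.

```python
def colley_ratings(teams, games, iters=500):
    """games: (winner, loser) pairs. Solves Colley's system by Jacobi iteration.

    Row for team i: (2 + n_i) * r_i - sum of opponents' ratings = 1 + (w_i - l_i) / 2
    """
    n_games = {t: 0 for t in teams}
    net_wins = {t: 0 for t in teams}
    opponents = {t: [] for t in teams}
    for winner, loser in games:
        n_games[winner] += 1
        n_games[loser] += 1
        net_wins[winner] += 1
        net_wins[loser] -= 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)
    r = {t: 0.5 for t in teams}
    for _ in range(iters):
        # The system is diagonally dominant, so this update converges.
        r = {
            t: (1 + net_wins[t] / 2 + sum(r[o] for o in opponents[t])) / (2 + n_games[t])
            for t in teams
        }
    return r

def pythagorean_expectation(points_for, points_against, exponent):
    """Expected winning share from scoring totals; the exponent is sport-specific."""
    pf, pa = points_for ** exponent, points_against ** exponent
    return pf / (pf + pa)

# Made-up round robin: A beats B and C, B beats C.
print(colley_ratings(["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")]))
print(pythagorean_expectation(400, 300, exponent=2))
```

Unlike the Markov chain approach, the Colley method ignores margin of victory entirely, which makes it a useful contrast when comparing ranking methods.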
Ben Shields: I want to end [with] — because you certainly are a college professor, you’re teaching sports analytics as a way to illuminate the value of data science to help in other contexts…. So can you talk about some of the analogs between what you’re teaching through sports analytics and maybe some other industry problems that some of these methods can help solve? Connect us back to the business world, potentially.
Laura Albert: Sure. I would be happy to do that. So I mentioned Markov chains having a lot of other applications. And this Markov chain idea, specifically used for ranking, is the basis of Google’s PageRank algorithm. So it’s like an algorithm that changed our world and made information so much easier to find, right? So you can’t really go through the whole internet and read all the internet pages and figure out where the information is. But what Google did was instead of wins and losses, they have incoming web links and outgoing web links, and they looked [at]: Where were they going? So that’s like our schedule there and our outcomes. And they used that to figure out where the information was on the internet without having to read it and without having a human in the loop. So there’s this kind of artificial intelligence there.
That idea of ranking and giving somebody top-ranked suggestions is something that a lot of people that are trying to sell you something on the internet, or some ads and whatnot…. They will potentially recommend some other products to you based on maybe what you’ve clicked or what you’ve looked at, so they can offer you some products and see where you want to go. And that becomes like a schedule that you create, and then that can lead to more products that they could recommend for you, which helps you find what you want more easily and leads to them making some more money if you’re happy. In terms of my actual research, I don’t really do a lot of predictive models in my research. I actually do a lot of optimization. And I use optimization to design and operate public sector systems — complex systems where there’s a lot of interconnected main parts, but there’s also a big-data-driven aspect of this. Like how do we do data-driven engineering? How do we design a complex system for the public good? In my case, it’s been Homeland Security, disaster response, and emergency medical services. So I’ve looked at how we can do data-driven engineering to look at past data to locate and route a fleet of different types of ambulances to patients with different priorities. And that’s actually a pretty complex problem, but one that’s pretty important and helps us use our tax dollars more efficiently and more effectively.
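Editor's note: the link-following analogy from earlier in this answer can be made concrete with a textbook-style PageRank sketch. Pages play the role of teams, and an outgoing link plays the role of a loss that sends the random walker elsewhere. This is a simplified illustration with a made-up four-page web, not Google's production system; the damping value of 0.85 is the commonly cited default, and spreading the weight of pages with no outgoing links evenly is one standard convention.

```python
def pagerank(links, damping=0.85, iters=100):
    """links: page -> list of pages it links to. Returns page -> score."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Every page gets a small baseline from random jumps.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # A page with no outgoing links spreads its weight evenly.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# Made-up web: three pages link to "c", and "c" links back to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
print(max(scores, key=scores.get))
```

The page that collects the most incoming weight ends up on top, just as the team the walk visits most often ends up ranked first.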
Ben Shields: Well those are great examples, Laura, and I really appreciate how your work spans not only sports but industries and problems outside of sports. And yet you are as an educator seeing the opportunity to talk about data methods and techniques through the lens of sports. And that brings us back to where we started this conversation, which is the college football playoff is entirely predictable. And I know both of our teams have been rivals over the years, Wisconsin and Northwestern, and I’m hoping that, perhaps one of us has a lot to cheer about in this upcoming season.
Laura Albert: I’d agree with that. That sounds good to me.
Ben Shields: Well, thank you so much for coming on the show and sharing with us your knowledge and expertise, and we’ll look forward to the upcoming season.
Laura Albert: Yes, we will. Thanks for having me. It was a pleasure.
Ben Shields: This has been Counterpoints, the sports analytics podcast from MIT Sloan Management Review.
Paul Michelman: You can find us on Apple Podcasts, Google Play, Stitcher, Spotify, and wherever fine podcasts are streamed. If you have an idea for a topic we should cover or a guest we should invite, please drop us a line at firstname.lastname@example.org.
Ben Shields: Counterpoints is produced by Mary Dooe. Our theme music was composed by Matt Reed. Our coordinating producer is Mackenzie Wise. Our crack researcher is Jake Manashi, and our maven of marketing is Desiree Barry.