College Football Morning: Ohio State vs. Notre Dame, in Numbers
How much better are the Buckeyes?
This is the first part of our two-part national championship preview. Come back on Monday to read about Ohio State vs. Notre Dame in words.
**
Ohio State is more talented than Notre Dame. Throughout this playoff, Ohio State has played better football than Notre Dame. Most weeks this season, the Ohio State Buckeyes were better than the Notre Dame Fighting Irish.
By how much, though?
It’s easy to obsess over matchups, and matchups are where football games are won. We mentioned yesterday, while talking about Ohio State, Michigan, and Oregon, how matchups can even sometimes be leveraged to give the worse overall team an outright advantage. Still, a lot of college football can be explained by simply asking who is better than their opponent. A lot of college football can be explained by lining up Teams A, B, and C, and saying that if A is three points better than B and B is six points better than C, A must be nine points better than C on a hypothetical average day.
The best way to balance these two schools of thought, when evaluating a single game, is to start with how good each team is. Make that the baseline. Then, evaluate matchups to see how matchups might pull the game in different directions from its “raw” point spread. The best way to start with Ohio State and Notre Dame is not to ask whether Al Golden will try to combat Jeremiah Smith by calling a lot of zone. It’s to ask how much better Ohio State is in the first place.
Let’s start with that.
We regularly reference Movelor, FPI, and SP+ in these blog posts. They’re far from the only three credible rating systems out there, but we understand all three rather well, especially since Movelor is the one we built ourselves. Here’s what each says about Notre Dame and Ohio State:
Movelor: Ohio State by 3.6
Movelor’s big input (its only input, really) is point differential. Movelor is an Elo-adjacent system which relies heavily on margin of victory, which is a jargon-laden way of saying: Movelor gives each team a power rating and uses that to predict a point spread before each game. After each game, it adjusts each team’s power rating based on how wrong the predicted point spread was.
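For anyone who wants to see the mechanics, here’s a minimal sketch of an Elo-style, margin-based update. To be clear, this is not Movelor’s actual code: the function names, the update factor `k`, and the example ratings are all made up for illustration, and the sketch ignores things like home-field advantage and offseason adjustment.

```python
def predicted_margin(rating_a: float, rating_b: float) -> float:
    """Predicted point spread from Team A's perspective (positive = A favored)."""
    return rating_a - rating_b


def update_ratings(rating_a: float, rating_b: float,
                   actual_margin: float, k: float = 0.1):
    """Nudge both ratings by a fraction (k) of how wrong the prediction was.

    actual_margin is A's points minus B's points. k is a hypothetical tuning
    constant, not a value taken from any real system.
    """
    error = actual_margin - predicted_margin(rating_a, rating_b)
    shift = k * error
    return rating_a + shift, rating_b - shift


# Hypothetical ratings chosen so that Team A is a 3.6-point favorite.
team_a, team_b = 30.0, 26.4
new_a, new_b = update_ratings(team_a, team_b, actual_margin=10.0)
print(round(predicted_margin(team_a, team_b), 1))  # spread before the game
print(round(predicted_margin(new_a, new_b), 1))    # spread after A wins by 10
```

With these made-up numbers, a favorite that wins by more than its predicted spread gains a little rating and its opponent loses the same amount, so the next predicted spread between them would be wider. If the game lands exactly on the predicted spread, nothing moves.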
It’s important to note here that Movelor is not calibrated for a twelve-team playoff. It’s not even calibrated for a four-team playoff, or for a BCS National Championship. Movelor is calibrated to optimize its accuracy across all Division I football games. A Week 6 matchup between two-win FCS teams matters as much to Movelor as a national title game.
This isn’t really a weakness of Movelor. Every system does something like this. I’m not sure how you’d even go about trying to build a system specifically for predicting national championship games. The same things which accurately predict those Week 6 FCS showdowns should also accurately predict a January clash between Ohio State and Notre Dame, or one between Washington and Michigan, or one between Georgia and TCU. At the end of the day, we’re rating football teams. The equations which do that effectively should stay the same no matter the game.
Still, we’ve seen a lot of blowouts in national championships, and even more in the playoffs in general. It isn’t clear why. Does something about the stage make the better team play its best and the poorer team play its worst? Do desperation on the losing side and “emptying the tank” for the winners expand the average 12- or 13-point error we see from credible systems and betting markets?
The best theory I can offer here is that if this phenomenon is real, it could have to do with timing. (It’s a small-sample phenomenon, and it’s smaller statistically than it looks: the average error for national championships isn’t that much higher than 12 or 13 points.) A lot of time passes between games in December and January. A team improving by a full touchdown over September and October wouldn’t make anybody bat an eye. In December and January, that improvement happens over three or four games, not eight or nine. We don’t see the improvements happening. We catch up with the improved product a few weeks later. Movelor’s well-tuned to properly weight recency in November. But maybe Movelor and the rest of these systems don’t catch up fast enough to improvements made in the pseudo-offseason.
What this would mean for this game, I would guess, is that Movelor could be too low on Ohio State. Or rather, Movelor could be too low on the gap between these teams. Movelor rates Notre Dame the second-best team in the country. Notre Dame has earned that, and Notre Dame has not been getting worse. But because it has less talent and more injuries, Notre Dame’s ceiling is lower than that of the Buckeyes. Let me try to explain how this could play out.
Say that three weeks of rest and playoff preparation are worth four points, but that teams can only access those four points if they were not already playing their best possible football. Notre Dame? They’ve been playing their best football since October. Ohio State? They weren’t playing their best football as recently as November 30th, the day they lost to Michigan. In this explanation, maybe Ohio State and Texas and Georgia and Penn State all got better than they would have been had each game taken place on December 15th, a week after conference championships. Maybe Notre Dame didn’t get better. Maybe Notre Dame stayed put, the teams around it got better, and since Movelor was already so high on the Irish, it wasn’t very surprised.
Looking at Movelor in particular, I can tell you that Notre Dame is only 1–2 against Movelor’s spread in playoff games. That’s highly unusual among rating systems—Movelor was onto Notre Dame before everybody else was—but that makes it interesting. There’s an angle here where Notre Dame’s what Notre Dame’s been, but it’s not enjoying the usual improvements playoff teams make. Maybe something similar happened to TCU in 2022, or to Washington last year. Maybe those teams maxed themselves out in the regular season. Come playoff time, maybe there were no more improvements left to make.
This is, to be clear, how I think such a phenomenon could be coming to pass if said phenomenon is real in the first place. Is said phenomenon real? I don’t know. But if it is, this is how it could work.
What I can tell you, and Notre Dame’s 1–2 playoff record against Movelor’s spread plays into this, is that Movelor is 8–2 against the betting market spread in the ten playoff games so far this year. It had Oregon favored by 0.6 points over Ohio State. It only had the Texas/Clemson line at 10.8. Those are the only two games where it’s been wrong. It went 3–1 in the first round, 3–1 in the quarterfinals, and 2–0 in the semifinals.
Ten is a tiny sample, but digging in a little further, I’m not sure this is all entirely insignificant. Movelor was onto Notre Dame being better than Georgia and Penn State. It was onto Notre Dame being a lot better than Indiana. It was onto Ohio State being a lot better than Tennessee. It was onto Ohio State being comfortably better than Texas. It missed Ohio State being better than Oregon, let alone better by a lot, but it’s still been more accurate in total on Ohio State than betting markets have been. It’s been more accurate on Notre Dame as well. In Notre Dame’s case in particular, it saw the team’s strength when the markets missed it. It didn’t out-think itself. Other systems and the market did.
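Since “against the spread” means something slightly different when the picker is a rating system rather than a bettor, here’s a rough sketch of how I’d score it: the system takes whichever side of the market line its own spread leans toward, and it’s right if the final margin lands on that side. The function names and the three games below are made up for illustration. They are not real lines or results.

```python
from typing import Iterable, Optional, Tuple

Game = Tuple[float, float, float]  # (model_spread, market_spread, actual_margin)


def ats_result(model_spread: float, market_spread: float,
               actual_margin: float) -> Optional[bool]:
    """Did the model lean the right way relative to the market line?

    All three numbers are from Team A's perspective (positive = A by that many).
    Returns None when the model agrees with the market or the game lands
    exactly on the line (a push).
    """
    if model_spread == market_spread or actual_margin == market_spread:
        return None
    model_leans_a = model_spread > market_spread
    a_covered = actual_margin > market_spread
    return model_leans_a == a_covered


def ats_record(games: Iterable[Game]) -> Tuple[int, int]:
    """Return (wins, losses) against the spread, skipping pushes."""
    results = [ats_result(*game) for game in games]
    return results.count(True), results.count(False)


# Entirely hypothetical games, for illustration only.
made_up_games = [
    (6.0, 3.5, 10.0),   # model higher on A than the market, A covers: win
    (-1.0, 2.5, 14.0),  # model leans B, A covers easily: loss
    (4.0, 7.5, 3.0),    # model leans B, A fails to cover: win
]
print(ats_record(made_up_games))
```

Note that a system can have the right winner and still take a loss here, which is what happens in the second made-up game if you flip the model to a narrow A lean below the market number.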
So, that’s what Movelor thinks. Based on every final scoring margin over the last twenty years, with a little bit of offseason adjustment thrown in, Movelor thinks Ohio State is 3.6 points better than Notre Dame.
FPI: Ohio State by 1.8
FPI is even more bullish on the Irish. This is surprising. FPI’s big input is yards per play. ESPN isn’t very transparent about how its systems work, but my understanding is that FPI looks at how teams perform against one another in yards per play, then produces a power rating which adjusts for who everybody has played.
What this means, looking at Notre Dame and Ohio State, is that yardage efficiency is even more friendly to Notre Dame than raw point differential. A big play-by-play measure of how teams compare likes the Irish even more than Movelor likes the Irish.
The same potential national championship-specific pitfall we just discussed at length also applies to FPI. We don’t know if FPI adjusts dramatically enough to postseason results. On the other side, it’s worth mentioning that among every prominent rating system this college football season, FPI was the strongest against the midweek spread. Yards per play tells you a lot about college football teams.
SP+: Ohio State by 5.9
Of these three, SP+ gets the closest to betting markets.
Again, ESPN isn’t very transparent about how its systems work, but my understanding is that SP+ looks at efficiency, much like FPI, and then also looks heavily at Success Rate, a statistic which measures how often offenses get 50% of necessary yards on first down, 70% on second down, and 100% on third and fourth down, plus how often defenses stop them from getting those portions of the yards to gain. Success Rate is an interesting stat, and SP+ is usually more accurate than Movelor. In these playoffs, though, SP+ is 3–7 against the spread, and 1–5 in games involving Ohio State or Notre Dame.
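To make the Success Rate definition concrete, here’s a small sketch of the offensive half of it, using the 50%, 70%, and 100% thresholds described above. The function names and the sample plays are mine and purely illustrative. This is not how SP+ itself is computed, only the stat as I’ve described it.

```python
# Share of the yards to gain (in percent) needed for a "successful" play, by down.
THRESHOLD_PCT = {1: 50, 2: 70, 3: 100, 4: 100}


def is_successful(down: int, yards_to_go: int, yards_gained: int) -> bool:
    """True if the play gained at least the required share of the yards to go.

    Integer percentages keep the comparison exact (no floating-point rounding).
    """
    return yards_gained * 100 >= THRESHOLD_PCT[down] * yards_to_go


def success_rate(plays) -> float:
    """Fraction of (down, yards_to_go, yards_gained) plays graded successful."""
    plays = list(plays)
    if not plays:
        raise ValueError("need at least one play")
    return sum(is_successful(*play) for play in plays) / len(plays)


# Made-up plays, for illustration only.
sample_plays = [
    (1, 10, 5),  # 5 of 10 on first down: success
    (2, 5, 3),   # 3 of 5 on second down: short of 70%, so a failure
    (3, 2, 2),   # converted third-and-2: success
    (1, 10, 2),  # 2 of 10 on first down: failure
]
print(success_rate(sample_plays))
```

Two things stand out in the sketch. Every play grades as a plain yes or no, so a 2-yard gain and a 40-yard gain on third-and-2 count the same. And a 3-yard run on second-and-5 grades as a failure even though it sets up third-and-2.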
That SP+ has been so low on Ohio State is noteworthy, because SP+ is usually a system which loudly praises Ryan Day’s teams. Even at the end of the 2019 season, the one LSU dominated from start to finish, SP+ said Ohio State was 2.3 points better than the Tigers. This was after LSU pulverized Clemson in the national title game, a week and a half after Clemson eliminated OSU.
I’d love to talk to Bill Connelly about this (I’d love to talk to Connelly for any reason—the guy is brilliant), but I’m curious if SP+ struggles at the fringes. I’m curious if it has a hard time separating the best two or three or four teams in the country. I would think that teams in that stratosphere have astronomically high success rates in total. But while FPI and Movelor are equipped to distinguish between different levels of dominance (15 yards per play is more than 12; a 60-point victory is more than a 40-point victory), Success Rate makes a significant part of SP+ rather binary. I don’t know if this is what’s gone on with Ohio State, or if there’s even a measurable problem there at all. But it’s a theory.
It’s worth noting, having said all of this, that SP+ is still really high on Ohio State. It still calls the Buckeyes the best team in the country. It’s still higher on the Buckeyes than FPI or Movelor is. But it’s interesting that SP+ has missed on Ohio State three games in a row. It makes me wonder if SP+ can’t get that ceiling high enough.
Conversely, SP+ is the lowest of these three on Notre Dame. FPI and Movelor agree that Notre Dame is the second-best team in the country. SP+ has the Irish fifth, trailing Ohio State, Mississippi, Oregon, and Alabama. (Putting Mississippi and Alabama in the top five isn’t ridiculous. Movelor and FPI both have those two teams in the top six. Those were very good teams. They lacked consistency.)
My theory when it comes to SP+ and Notre Dame is that Success Rate’s second-down threshold undersells the Irish offense. Most teams might need 70% of necessary yards on second down, and given what numbers are divisible by what, I understand why Success Rate’s creators landed on 70%. But 70% is somewhat arbitrary, and with Notre Dame so unusually run-heavy for a team of its caliber, and with Notre Dame so particularly built to convert third downs of short and medium distances, I don’t think Mike Denbrock views second down the same way most offensive coordinators have to view second down. Again, this is just a theory, and it only applies to the Irish offense, which SP+ already views as very good (sixth-best in the country, only 0.7 points worse per game than Ohio State’s). But that would be an explanation.
**
To sum up what we’ve done so far:
We can see why Movelor, FPI, and SP+ could all underrate Ohio State (the long layoff phenomenon vis-à-vis national championship timing). We can see why SP+ could underrate both Ohio State (the limited ceiling phenomenon) and Notre Dame (limitations with Success Rate as a stat). Now. Let’s look at the final boss.
Betting Markets: Ohio State by 8.5
It’s important to note here that betting markets are NOT measuring how good Ohio State and Notre Dame are. They’re doing the thing we did above: starting with how good each team is, then evaluating matchups to see how they might pull the game in different directions from its “raw” point spread. It’s also important to note that while betting markets are the most accurate college football predictors on a game-by-game basis, sportsbooks’ equations change in games with a lot of fan interest. Nate Silver has written about how the Super Bowl and March Madness are easier targets for bettors than average sporting events. The Super Bowl and March Madness make sportsbooks more beholden to public opinion than to expert opinion.
Finally, this isn’t important but we’ll also note that betting markets have wobbled, as they tend to do. The line was Ohio State by 9.5. Then it was Ohio State by 8. Now it’s Ohio State by 8.5. It could change again before Monday.
Why are betting markets higher on Ohio State than any of these rating systems?
There are a few possible explanations. One is that markets are considering the factors we outlined above, the long layoff phenomenon and the limited ceiling phenomenon and limitations with Success Rate, and that balancing all those out gives you Ohio State by 8.5, not Ohio State by two or four or six. One is that the public is understandably enamored with Jeremiah Smith, and books have given Ohio State a few extra points to account for public enthusiasm. One is that this is about injuries, or that it’s about matchups—which we’ll discuss on Monday—and that if you asked some magic oracle how much better Ohio State is than Notre Dame, it’d say two or four or six points. It could also be a broad combination of things, some included on this list and some missed. I am not as smart as these markets.
**
So. How much better is Ohio State than Notre Dame? Probably somewhere between 1.8 and 8.5 points. With roughly 50% of college games ending up within ten or eleven points of a reputable rating system’s spread, that means a low-level confidence interval for this game spans from something like a 9-point Irish victory to something like a 19-point Buckeye win. Notre Dame by double digits? That would be surprising. Ohio State by twenty or more? Also surprising. This is a small confidence interval—going up to even a 90% confidence interval would broaden the window much more dramatically—but this is the general outlook. A reasonable expectation for this game is for it to finish somewhere between Irish by 9 and Buckeyes by 19.
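The arithmetic behind that window is simple enough to write down. This is a back-of-the-envelope sketch, not a formal statistical interval: it takes the lowest and highest spreads quoted above and pads each side by 10.5 points, which is my assumed midpoint of “ten or eleven.” The function name and that midpoint are illustrative choices.

```python
def rough_window(spreads: dict, half_width: float):
    """Pad the lowest and highest spread by half_width points on each side.

    Spreads are from Ohio State's perspective, so a negative result
    means a Notre Dame win by that many points.
    """
    low = min(spreads.values()) - half_width
    high = max(spreads.values()) + half_width
    return low, high


# Spreads quoted in this post (Ohio State by X).
spreads = {"Movelor": 3.6, "FPI": 1.8, "SP+": 5.9, "Betting markets": 8.5}

# Assumed midpoint of "ten or eleven points."
low, high = rough_window(spreads, half_width=10.5)
print(round(low, 1), round(high, 1))
```

The low end comes out a little under nine points in Notre Dame’s favor and the high end at nineteen for Ohio State. Raising `half_width` is the knob that would widen this toward something like a 90% window.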
**
There wasn’t a lot of college football news over the last 24 hours, but one thought on something we declined to mention yesterday:
The Department of Education said yesterday that Title IX will apply to revenue sharing. Which means: When a school shares revenue with its athletes, it has to share the same amount with women that it shares with men. This could change, either through lawsuit or by the Trump administration simply flipping the ruling, making a different decision from the Biden administration. (A whole lot of effective law is delegated to unelected bureaucrats, and that’s not truer with one party than the other.)
Understandably, the ruling created a shockwave of reactions, but unless I’m getting something very wrong here, this is a very small deal. The memo communicating the ruling was vague when it came to how this applies to NIL payments from booster collectives. As long as those aren’t constrained, there’ll still be a fairly free market. Schools will just be required to increase their spending on women’s athletics by seven or eight million dollars, at most, if they want to share revenue with their football and men’s basketball teams. That’s a ton of money, sure. But it isn’t going to change things in any noticeable way, and booster collectives can adjust to steer things towards their natural state.
**
This post was also published at www.thebarkingcrow.com, where you can always find all of Joe Stunardi, Stuart McGrath, and NIT Stu’s work.