Sports Stats for Nerds

Okay . . . let's try this again.

Moderators: Shirley, Sabo, brian, rass, DaveInSeattle

Shirley
The Dude
Posts: 7723
Joined: Mon Mar 11, 2013 2:32 pm

Sports Stats for Nerds

Post by Shirley »

I figured we could use a thread for some egghead discussion about advanced stats.

I have been thinking a bit about expected results/wins in the NCAA tournament by seed. We all know that by seed, a 1 seed should make the Final Four, a 2 seed the Elite Eight, etc. We also know that realistically, the actual expected values won't match those. For example, only about 40% of 1 seeds actually make the Final Four, so the expected # of wins should probably be a bit less than 4.

So, I found some numbers of win % by round by seed and saw some interesting patterns. By typical expected value calculations, you can just add up the measured percentage that make each round to get expected wins. For example, the expected wins for a 1 seed comes out to 3.36, which seems about right.
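A rough sketch of that add-it-up calculation in Python. The per-round fractions below are illustrative placeholders, chosen so they total 3.36; only the roughly 40% Final Four figure comes from the numbers above, and the function names are made up for the example.

```python
def expected_wins(reach):
    """Expected wins = sum over rounds of P(team wins at least that many games).

    reach[k] is the fraction of teams at a seed that won at least k + 1 games.
    """
    if any(later > earlier for earlier, later in zip(reach, reach[1:])):
        raise ValueError("fractions must not increase from round to round")
    return sum(reach)


def win_distribution(reach):
    """Convert 'won at least k games' fractions into P(exactly k wins)."""
    tail = [1.0] + list(reach) + [0.0]
    return [tail[k] - tail[k + 1] for k in range(len(tail) - 1)]


# Illustrative placeholders, NOT measured data (only the 0.40 is from the post).
one_seed_reach = [0.99, 0.86, 0.70, 0.40, 0.25, 0.16]

xw = expected_wins(one_seed_reach)
dist = win_distribution(one_seed_reach)
xw_from_dist = sum(k * p for k, p in enumerate(dist))

print(round(xw, 2))            # sum of the per-round fractions
print(round(xw_from_dist, 2))  # same value via sum of k * P(exactly k wins)
```

Both routes give the same number, which is why adding up the per-round percentages works as an expected value.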

But then I hit some curious numbers. Using this method, the expected wins for 10 and 11 seeds are 0.64 and 0.60, even though they only win their first game about 38% and 37% of the time. The higher expected win totals happen because some of these teams win 2, 3, or even 4 games. But those expected win numbers being over .5 give me pause. If a 10 seed should advance only about 38% of the time, it seems counterintuitive that the expected win values imply you're more likely to get 1 win than 0.

So, what's a better way to measure expected results? Look at median wins by seed? Something else? Any thoughts?
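To make the mean-versus-median question concrete, here is a small Python sketch. The 0.62 chance of zero wins and the 0.64 mean match the 10 seed numbers above; how the remaining 38% splits across 1, 2, 3, and 4 wins is a hypothetical placeholder, and the function names are made up for the example.

```python
def mean_wins(dist):
    """Expected wins, where dist[k] = P(exactly k wins)."""
    return sum(k * p for k, p in enumerate(dist))


def median_wins(dist):
    """Smallest win total whose cumulative probability reaches 50%."""
    cumulative = 0.0
    for k, p in enumerate(dist):
        cumulative += p
        if cumulative >= 0.5:
            return k
    raise ValueError("probabilities must sum to at least 0.5")


def mode_wins(dist):
    """Single most likely win total."""
    return max(range(len(dist)), key=lambda k: dist[k])


# P(0 wins) = 0.62 follows from the 38% first-round win rate in the post.
# The split of the other 38% is HYPOTHETICAL, chosen so the mean is 0.64.
ten_seed = [0.62, 0.21, 0.10, 0.05, 0.02]

print(round(mean_wins(ten_seed), 2))
print(median_wins(ten_seed))
print(mode_wins(ten_seed))
```

Whatever the real split is, any distribution with a 62% chance of zero wins has a median and mode of 0. A mean above 0.5 only says the occasional deep run pulls the average up, not that one win is more likely than none.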
Totally Kafkaesque
Shirley
The Dude
Posts: 7723
Joined: Mon Mar 11, 2013 2:32 pm

Re: Sports Stats for Nerds

Post by Shirley »

Here, someone else calculated expected wins the same way I did. They also included the standard deviations. http://bracketodds.cs.illinois.edu/seedadv.html

It's also interesting to note that 10 and 11 seeds do better than 9 seeds. 12s are pretty close too.
Totally Kafkaesque
mister d
The Dude
Posts: 29491
Joined: Tue Mar 12, 2013 8:15 am

Re: Sports Stats for Nerds

Post by mister d »

I put something like this together a few years back to show how shitty Jay Wright is in the tourney. (Didn't age well!) The 8/9 vs 10/11/12 thing makes sense just because of the matchup with the 1, so without reseeding you probably shouldn't adjust that out. The 0.64 thing makes sense too ... 0.38 would be the 1st round xW and then an overall ~25% chance of winning from there on out.
Johnnie wrote: Sat Sep 10, 2022 8:13 pm Oh shit, you just reminded me about toilet paper.
Shirley
The Dude
Posts: 7723
Joined: Mon Mar 11, 2013 2:32 pm

Re: Sports Stats for Nerds

Post by Shirley »

mister d wrote: Tue Mar 13, 2018 8:41 am I put something like this together a few years back to show how shitty Jay Wright is in the tourney. (Didn't age well!) The 8/9 vs 10/11/12 thing makes sense just because of the matchup with the 1, so without reseeding you probably shouldn't adjust that out. The 0.64 thing makes sense too ... 0.38 would be the 1st round xW and then an overall ~25% chance of winning from there on out.
Yeah, it makes sense in terms of pure expected wins/value. But, to use your example, say Jay Wright is coaching a 10 seed and his team loses in the first round. That's not a bad result; it's the expected result over 60% of the time. But a 0.60 expected win value implies that a first-round loss (0.60 below expectation) is worse than a single win (0.40 above expectation) is good. So it doesn't seem like the right way to judge the performance of teams/coaches.
Totally Kafkaesque
mister d
The Dude
Posts: 29491
Joined: Tue Mar 12, 2013 8:15 am

Re: Sports Stats for Nerds

Post by mister d »

Yeah, I don't think you can really use it as a single season projection or measuring stick. Without factoring in sample size, you can probably use W-xW to conclude John Giannini is the greatest tournament coach of all-time. But ... there's also data that strongly suggests 10s are historically where underseeds happen ...
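A sketch of the sample-size problem with W-xW, in Python. The xW values are the ones quoted in this thread; the standard deviations, both coaches, and the function name are hypothetical placeholders, and the variance math assumes each tournament trip is independent.

```python
import math

# xW values are the ones quoted in this thread. The standard deviations are
# HYPOTHETICAL placeholders, not taken from the linked table.
XW = {1: 3.36, 10: 0.64, 11: 0.60}
SD = {1: 1.5, 10: 0.9, 11: 0.9}


def wins_above_expected(appearances):
    """appearances: list of (seed, wins) tuples, one per tournament trip.

    Returns (total W - xW, W - xW per trip, W - xW in standard deviations).
    """
    actual = sum(wins for _, wins in appearances)
    expected = sum(XW[seed] for seed, _ in appearances)
    # Assumes trips are independent, so the variances add.
    spread = math.sqrt(sum(SD[seed] ** 2 for seed, _ in appearances))
    diff = actual - expected
    return diff, diff / len(appearances), diff / spread


# Hypothetical coaches: one trip as a 10 seed vs. eight trips as a 1 seed.
coach_a = [(10, 2)]
coach_b = [(1, 4), (1, 3), (1, 5), (1, 2), (1, 4), (1, 4), (1, 3), (1, 5)]

diff_a, per_trip_a, z_a = wins_above_expected(coach_a)
diff_b, per_trip_b, z_b = wins_above_expected(coach_b)
print(round(per_trip_a, 2), round(z_a, 2))
print(round(per_trip_b, 2), round(z_b, 2))
```

Per trip, the one-appearance coach looks far better, but under these made-up spreads neither coach is even two standard deviations from expectation, which is the sample-size caveat in a nutshell.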
Johnnie wrote: Sat Sep 10, 2022 8:13 pm Oh shit, you just reminded me about toilet paper.
mister d
The Dude
Posts: 29491
Joined: Tue Mar 12, 2013 8:15 am

Re: Sports Stats for Nerds

Post by mister d »

7/10: .614

2/7: .705
2/3: .650
2/10: .609
Johnnie wrote: Sat Sep 10, 2022 8:13 pm Oh shit, you just reminded me about toilet paper.
DSafetyGuy
The Dude
Posts: 8866
Joined: Mon Mar 18, 2013 12:29 pm
Location: Behind the high school

Re: Sports Stats for Nerds

Post by DSafetyGuy »

How do you adjust for the quality of the teams who are playing as opposed to the seed number alone? There's a world of complaining about who is in/out every year with comparatively little complaining about the seedings.

For example, last year, Wichita State was a seven-seed in their regional. Kenpom had them as the sixth-best team in the country prior to the tournament. St. Mary's was also a seven-seed, but Pomeroy had them as #14 in the nation.
“The running, the jumping... a celebration of life.”
Shirley
The Dude
Posts: 7723
Joined: Mon Mar 11, 2013 2:32 pm

Re: Sports Stats for Nerds

Post by Shirley »

DSafetyGuy wrote: Tue Mar 13, 2018 9:23 am How do you adjust for the quality of the teams who are playing as opposed to the seed number alone? There's a world of complaining about who is in/out every year with comparatively little complaining about the seedings.

For example, last year, Wichita State was a seven-seed in their regional. Kenpom had them as the sixth-best team in the country prior to the tournament. St. Mary's was also a seven-seed, but Pomeroy had them as #14 in the nation.
I think there are probably two ways to evaluate tournament performance. The simpler one is to just use seeds, as I am trying. The other way, more accurate but way harder, is to use a proper rating system, like Pomeroy or Sagarin, and evaluate each game separately. I suspect that over a large enough sample size, the seed method will be nearly as accurate, because seeding mistakes happen both ways.
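The game-by-game version could look something like the Python sketch below. To be clear about the assumptions: the win probability here is a generic Elo-style logistic curve, not Pomeroy's or Sagarin's actual formula, and the ratings, the 400-point scale, and the function names are all hypothetical.

```python
def win_probability(rating_a, rating_b, scale=400.0):
    """Elo-style chance that team A beats team B.

    A generic logistic curve used as a stand-in; NOT the formula that
    Pomeroy or Sagarin actually use.
    """
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))


def wins_above_rating_expectation(games):
    """games: list of (team_rating, opponent_rating, won) tuples."""
    expected = sum(win_probability(team, opp) for team, opp, _ in games)
    actual = sum(1 for _, _, won in games if won)
    return actual - expected


# Hypothetical ratings: an upset win over a slightly better team, then a
# loss to a much better one.
tournament_run = [(1600, 1650, True), (1600, 1800, False)]
print(round(wins_above_rating_expectation(tournament_run), 2))
```

The point of doing it per game is that each result is credited against the actual opponent, so an underseeded team does not get extra credit just for beating its seed line.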
Totally Kafkaesque
Steve of phpBB
The Dude
Posts: 8664
Joined: Mon Mar 11, 2013 10:44 am
Location: Feeling gravity's pull

Re: Sports Stats for Nerds

Post by Steve of phpBB »

How legit is it to base projections like this on something subjective like seeding? It seems to me that the seeding process involves so much judgment and guesswork by the folks doing the seeding that it really isn't much of an objective measure of anything.

Maybe a better question - how reliable or variable are the expected win projections? Are the results fairly consistent or are they all over the map? (This is one of the many times that my last statistics class was in 1985.)
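One rough way to put a number on that variability is a normal-approximation interval around a win rate, sketched in Python below. The 0.38 is the 10 seed first-round win rate quoted earlier in the thread; the sample size of 132 games is an assumption (four such games a year over 33 tournaments), and the function name is made up.

```python
import math


def win_rate_interval(p, n, z=1.96):
    """Normal-approximation interval for a win rate p observed over n games."""
    se = math.sqrt(p * (1.0 - p) / n)
    return p - z * se, p + z * se, se


# 0.38 is the 10-seed first-round win rate quoted earlier in the thread.
# n = 132 is an ASSUMPTION: 4 such games a year over 33 tournaments.
low, high, se = win_rate_interval(0.38, 132)
print(f"{low:.3f} to {high:.3f}")
```

Under those assumptions the interval comes out to roughly 0.30 to 0.46, so even the first-round numbers are only pinned down to within several percentage points, and later rounds with fewer games would be looser still.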
And his one problem is he didn’t go to Russia that night because he had extracurricular activities, and they froze to death.
mister d
The Dude
Posts: 29491
Joined: Tue Mar 12, 2013 8:15 am

Re: Sports Stats for Nerds

Post by mister d »

I wouldn't be positive seeding mistakes are evenly distributed. The committee has biases at an entity level (even if members turn over) assuming they use similar tools and follow their own historical trends year to year. Like it would seem completely illogical that the higher surviving seed in a 1st round matchup would have a worse historical record than the lower seed in the next round but the 7/10 has that. #7s are .309 in the 2nd round (.295 against #2s) while #10s are .451 (.391). Sooo, circling back, a #10 losing to the #7 isn't worse than winning is good, but you can project that at least one #10 has been underseeded to the point they're a serious threat to run through the #2.
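Chaining those numbers together, as a quick Python sketch. It assumes the "7/10: .614" figure from the earlier post is the #7's first-round win rate over the #10, and that .309 and .451 are second-round win rates for teams that got there; the variable names are made up.

```python
# Figures quoted in this thread. Reading .614 as the #7's first-round win
# rate over the #10 is an ASSUMPTION about what that post meant.
p7_wins_r1 = 0.614
p10_wins_r1 = 1.0 - p7_wins_r1

p7_wins_r2_given_r1 = 0.309   # #7s in the 2nd round
p10_wins_r2_given_r1 = 0.451  # #10s in the 2nd round

# Chance of reaching the Sweet 16 = P(win round 1) * P(win round 2 | won round 1)
p7_sweet16 = p7_wins_r1 * p7_wins_r2_given_r1
p10_sweet16 = p10_wins_r1 * p10_wins_r2_given_r1

print(round(p7_sweet16, 3))
print(round(p10_sweet16, 3))
```

So on these numbers the #7 still reaches the Sweet 16 a bit more often overall; the #10s that survive the first round are just much more dangerous once they get to the second.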
Johnnie wrote: Sat Sep 10, 2022 8:13 pm Oh shit, you just reminded me about toilet paper.
Steve of phpBB
The Dude
Posts: 8664
Joined: Mon Mar 11, 2013 10:44 am
Location: Feeling gravity's pull

Re: Sports Stats for Nerds

Post by Steve of phpBB »

mister d wrote: Tue Mar 13, 2018 9:46 am I wouldn't be positive seeding mistakes are evenly distributed. The committee has biases at an entity level (even if members turn over) assuming they use similar tools and follow their own historical trends year to year. Like it would seem completely illogical that the higher surviving seed in a 1st round matchup would have a worse historical record than the lower seed in the next round but the 7/10 has that. #7s are .309 in the 2nd round (.295 against #2s) while #10s are .451 (.391). Sooo, circling back, a #10 losing to the #7 isn't worse than winning is good, but you can project that at least one #10 has been underseeded to the point they're a serious threat to run through the #2.
I think I understand what you're saying, but can you clarify what you think are the implications of seeding mistakes not being evenly distributed?
And his one problem is he didn’t go to Russia that night because he had extracurricular activities, and they froze to death.
mister d
The Dude
Posts: 29491
Joined: Tue Mar 12, 2013 8:15 am

Re: Sports Stats for Nerds

Post by mister d »

Either accidental, where the committee might view a certain type of team (late risers, teams with mediocre records but missing a star player who is now back) as "#10s", or intentional, where they either invert their 7/10 or intentionally create a high-risk 5/12 matchup knowing that particular game draws tons of attention.
Johnnie wrote: Sat Sep 10, 2022 8:13 pm Oh shit, you just reminded me about toilet paper.
Pruitt
The Dude
Posts: 18105
Joined: Tue Jun 04, 2013 10:02 am
Location: North Shore of Lake Ontario

Re: Sports Stats for Nerds

Post by Pruitt »

Larry Fitzgerald may retire

But as great as he is and has been, this stat is astounding:
Even more mind-boggling is that on 2,263 targets, the veteran wideout has only dropped 29 passes. He has more career tackles (39) than drops.
Incredible when you first read it, and it gets more incredible the more it sinks in.
"beautiful, with an exotic-yet-familiar facial structure and an arresting gaze."
Shirley
The Dude
Posts: 7723
Joined: Mon Mar 11, 2013 2:32 pm

Re: Sports Stats for Nerds

Post by Shirley »

Ha, I completely forgot about this thread. I started it just a few days before UVA's historic first-round loss. And I'm sure I was thinking at the time about expected tournament results due to Tony Bennett's reputation (unfair I thought and think) as an underachiever in the tournament. I'm pretty sure I dropped this plan after the UMBC loss!
Totally Kafkaesque
Pruitt IV
The Big Lebowski
Posts: 1623
Joined: Fri Nov 18, 2022 5:56 am

Re: Sports Stats for Nerds

Post by Pruitt IV »

In only one season where Tom Brady started more than one game did his team fail to make the playoffs. And that was 2002 when the Pats went 9-7 and he led the league in TD passes.
Canadian International