The Poor Man’s Analyst

Johan Santana, Part Deux


I’m not 100% sure that I feel comfortable writing this post, for two reasons. One, I’ve never used odds ratios before, so there could be some rule I’m violating without knowing. And two, I slept approximately zero hours last night doing a ~45 page group paper for my business management class. With that in mind, let’s see where this takes us…

First, I should introduce what an odds ratio actually is. Wikipedia defines it as “the ratio of the odds of an event occurring in one group to the odds of it occurring in another group, or to a sample-based estimate of that ratio.” In baseball terms, it means that we take the odds of an event happening for a pitcher, combine them with the odds of the same event happening for the batter, measure both against the league, and the formula spits out the expected outcome of the matchup. Rates must be converted into odds before being plugged in (strictly speaking, OBP/(1-OBP) is just the odds, and the ratio comes from dividing by the league’s odds, but I’ll call it the OR throughout)…don’t ask me why, but it seems to make sense. Here’s the formula, using on-base percentage:

  • Translate OBP (or your rate of choice) into odds ratio form: OBP / (1 - OBP), which gives the odds ratio (OR)
  • (batter OR / lg OR) * (pitcher OR / lg OR) = (expected OR / lg OR)
  • Then reverse step 1 to get the expected outcome of the matchup: expected OBP = expected OR / (1 + expected OR)
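The three steps above can be sketched in a few lines of Python. This is only my reading of the formula, not anyone's official implementation: the function names (`to_odds`, `to_rate`, `expected_rate`) are made up for illustration, and the example matchup uses invented numbers rather than real players.

```python
def to_odds(rate: float) -> float:
    """Step 1: convert a rate such as OBP into odds, rate / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    return rate / (1.0 - rate)


def to_rate(odds: float) -> float:
    """Step 3: reverse step 1, turning odds back into a rate."""
    return odds / (1.0 + odds)


def expected_rate(batter: float, pitcher: float, league: float) -> float:
    """Step 2: combine batter and pitcher odds, each relative to the league."""
    lg = to_odds(league)
    # (batter OR / lg OR) * (pitcher OR / lg OR) = (expected OR / lg OR)
    expected_odds = (to_odds(batter) / lg) * (to_odds(pitcher) / lg) * lg
    return to_rate(expected_odds)


# Hypothetical matchup (invented numbers): .350 OBP batter, pitcher who
# allows a .300 OBP, in a league with a .330 OBP.
print(round(expected_rate(0.350, 0.300, 0.330), 3))
```

One handy property for sanity checking: if the batter is exactly league average, the expected rate collapses to the pitcher's own rate, and vice versa.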

Thanks to Pizza Cutter for the explanation on that one. So here’s how this relates to Johan Santana (3 paragraphs in). Peter Bendix, of Beyond the Box Score and FanGraphs fame, penned a piece for the latter about a week ago on the subject of Johan Santana. In it, he shared some of the same concerns that I did about the Mets’ ace. Peter said this about Johan’s LOB%: “His LOB% in 2008 was the highest of his career [at 82.6%]. Over the last three years, his LOB% has been 76.3%, 77.7% and 78.3%, respectively.” Generally, a sabermetrician would say that his LOB% is bound to regress towards the mean next season, and I would agree. But I decided to check out the veracity of that claim, using odds ratios in certain situations to see where he over- or under-performed the expected outcome.


We’ll do an overall line on OBP just as a sanity check.

Johan OR: .286/(1-.286) = .4006

Opponents OR: .328/(1-.328) = .4881

League OR: .331/(1-.331) = .4948

Using the odds ratio formula above, we get this:

(.4881/.4948) x (.4006/.4948) = (expected/.4948). Solving for the expected OR gives .3952, for an expected OBP (eOBP) of .283. That’s pretty much exactly what we were looking for: Johan’s eOBP comes in slightly under the real thing (.286), because his average opponent was slightly worse than league average. Now we can move on to how Johan performed vs. expectation with runners on base.
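For anyone who wants the algebra spelled out, here is the same calculation written as one chain. The subscripted symbols are my own shorthand (opp for opponents, J for Johan, lg for league, exp for expected), not notation from the original explanation.

```latex
\[
\frac{OR_{\text{exp}}}{OR_{\text{lg}}}
  = \frac{OR_{\text{opp}}}{OR_{\text{lg}}} \cdot \frac{OR_{\text{J}}}{OR_{\text{lg}}}
\quad\Longrightarrow\quad
OR_{\text{exp}}
  = \frac{OR_{\text{opp}} \cdot OR_{\text{J}}}{OR_{\text{lg}}}
  = \frac{.4881 \times .4006}{.4948}
  \approx .3952
\]
\[
\text{eOBP} = \frac{OR_{\text{exp}}}{1 + OR_{\text{exp}}}
  = \frac{.3952}{1.3952}
  \approx .283
\]
```

One factor of the league OR cancels, so the expected OR is just the product of the two individual ORs divided by the league OR.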

Runners on Base

Batters overall hit the same with runners in scoring position (RISP) as they do in all situations, so we’ll use the same opponent and league numbers as above.

Johan RISP OR: .307/(1-.307) = .4430

Overall Opponent OR: .4881

League OR: .4948

(.4881/.4948) x (.4430/.4948) = (expected/.4948). Expected OR equals .4370, which comes out to an eOBP of .304, against an actual RISP OBP of .307.
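As a check on the arithmetic, both calculations can be rerun from the raw OBP figures quoted in this post. This is a rough sketch: the helper name `expected_obp` and the constant names are my own, and because it works from unrounded odds the intermediate values can differ from the hand-rounded ones in the fourth decimal place.

```python
def expected_obp(opponent_obp: float, pitcher_obp: float, league_obp: float) -> float:
    """Expected OBP from the odds ratio formula, using unrounded odds."""
    def odds(p: float) -> float:
        return p / (1 - p)

    # One league OR cancels: expected OR = opponent OR * pitcher OR / league OR
    exp_odds = odds(opponent_obp) * odds(pitcher_obp) / odds(league_obp)
    return exp_odds / (1 + exp_odds)


# OBP figures as quoted in the post.
OPPONENT_OBP = 0.328     # Johan's opponents, all situations
LEAGUE_OBP = 0.331       # league average
JOHAN_OVERALL = 0.286    # OBP against Johan, overall
JOHAN_RISP = 0.307       # OBP against Johan with RISP

overall = expected_obp(OPPONENT_OBP, JOHAN_OVERALL, LEAGUE_OBP)
risp = expected_obp(OPPONENT_OBP, JOHAN_RISP, LEAGUE_OBP)

print(f"overall: eOBP {overall:.3f} vs actual {JOHAN_OVERALL:.3f}")
print(f"RISP:    eOBP {risp:.3f} vs actual {JOHAN_RISP:.3f}")
```

Both expected values land within a few points of the actual figures, which matches the hand calculations.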

All of that mumbo jumbo means that Johan Santana did pretty much exactly what was expected with runners in scoring position (in terms of OBP), according to the numbers. Frankly, I’m disappointed. I was hoping I could say something like “expect even more regression than usual,” or, “his LOB% was actually unlucky!” But the numbers don’t say either of those things.

Sorry to disappoint. Johan is in for some regression, since nobody can sustain a LOB% that high for that long, just not any more or less regression than we would normally expect.


Written by dcn29

December 8, 2008 at 9:38 PM
