As you may know, I am a baseball fan. I just like the game. I kind of understand it and it suits me. Unusually, I’ve been drawn into watching the Major League Baseball playoffs. In fact, I watched the entire six-and-a-half-hour, 18-inning game between the Giants and the Nationals. Normally, I’m not that interested in “the majors,” but it is unfortunately the main kind of baseball that I have the opportunity to watch. Some time ago, we got some little station that allowed us to watch the A-advanced San Jose Giants. That was very cool.
Anyway, I watched most of the Cardinals-Giants game today, and so I was looking at the box scores. One of the great things about baseball is that it has a whole bunch of statistics. I remember reading an essay by Carl Sagan in which he talked about how his first real introduction to mathematics was reading baseball box scores with his father. That was true of me too, although I was alone because my father has never especially cared for sports. One of the more mysterious statistics is the Earned Run Average or ERA.
ERA Basics
The ERA is the number of earned runs that a pitcher allows per nine innings. (The “earned” part means that the run must not have scored because someone made an error.) But given that pitchers rarely pitch entire games, the ERA must be prorated. For example, in today’s game, John Lackey allowed four runs over the course of six innings. This is equivalent to allowing six runs over the course of nine innings, and so the box score indeed shows an ERA of 6.00. Via Wikipedia, this can be calculated very simply:
ERA = 9 × earned runs ÷ innings pitched
What’s interesting about the ERA is the details. For example, pitchers don’t necessarily pitch complete innings. Because there are three outs in an inning, the inning is broken up into thirds. So if a pitcher retires one batter in an inning, he is said to have pitched one-third of that inning. For example, in today’s game, Tim Hudson pitched the first six innings and then got one batter out in the seventh before he was removed. Thus, he pitched six and a third innings. (This is confusingly indicated by 6.1 in the box score.) Because he allowed four runs during that time, his ERA was 9 × 4/6.33, or 5.68. And indeed, that is what the box score shows.
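For what it’s worth, the proration described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names `innings_to_outs` and `era` are made up for this post, and the code counts outs (27 per nine innings) rather than decimal innings so that thirds never get rounded.

```python
def innings_to_outs(box_score_ip: str) -> int:
    """Convert box-score innings such as "6.1" (six and a third) to outs.

    Hypothetical helper: the digit after the point counts thirds of an
    inning, so only 0, 1, or 2 are valid there.
    """
    whole, _, thirds_text = box_score_ip.partition(".")
    thirds = int(thirds_text) if thirds_text else 0
    if thirds not in (0, 1, 2):
        raise ValueError(f"invalid thirds digit in {box_score_ip!r}")
    return int(whole) * 3 + thirds


def era(earned_runs: int, box_score_ip: str) -> float:
    """Earned runs prorated to nine innings (27 outs).

    Raises ZeroDivisionError when the pitcher recorded no outs.
    """
    outs = innings_to_outs(box_score_ip)
    return earned_runs * 27 / outs


# The two lines from today's box score discussed above.
print(round(era(4, "6.0"), 2))  # Lackey: four runs in six innings
print(round(era(4, "6.1"), 2))  # Hudson: four runs in six and a third
```

Working in outs is a design choice, not anything official: it keeps 6.1 from being mistaken for six and one-tenth innings.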
He Never Pitched at All!
But what about the situation when a pitcher doesn’t get anyone out? This isn’t that uncommon. A team can bring in a pitcher for one particular batter. If that batter hits a home run, the pitcher could have allowed up to four runs. If the pitcher is taken out at that point, he hasn’t pitched any innings, because he has not retired any batters. Thus, in the worst case scenario, his ERA would be 4/0.00 or ∞?! Well, unfortunately, yes.
Just such a thing happened in today’s game. It was tied at the end of nine innings, so the game went into extra innings. In the top of the 10th inning, the Cardinals did not score. That meant all the Giants had to do was get one run and the game was over. Randy Choate was brought in to pitch the bottom of that inning. He walked the first batter. The second batter got a single, leaving men on first and second base. The third batter bunted, allowing the player on second to score, and the game was over. So Choate didn’t pitch any part of an inning because he got no outs.
Normally, this would provide an ERA of ∞. But in this case, there were no earned runs, because the run was caused by an error. (Interestingly, the error was by the pitcher Randy Choate himself — he made a bad throw to first base.) So this provides an ERA of 0/0.00 which is simply undefined. Nonetheless, the MLB box score from the game gives Choate an ERA of 13.50. But it gives the same ERA to the Giants’ Sergio Romo, who pitched to a single player (the last) and got him out. That should be an ERA of 0/0.33 or 0.00 — not 13.50.
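To make the two edge cases concrete, here is a minimal sketch of how the simple formula behaves when a pitcher records no outs. The function name `era_from_outs` is hypothetical, and returning infinity or NaN is my own choice for illustration. It is not how MLB computes the number shown in the box score.

```python
import math


def era_from_outs(earned_runs: int, outs: int) -> float:
    """Simple ERA formula, with the zero-out cases made explicit.

    Assumption for illustration only: runs/0 is reported as infinity
    and 0/0 as NaN (undefined), instead of raising an error.
    """
    if outs == 0:
        return math.inf if earned_runs > 0 else math.nan
    return earned_runs * 27 / outs


print(era_from_outs(4, 0))  # worst case: four earned runs, nobody out
print(era_from_outs(0, 0))  # Choate: no earned runs, no outs
print(era_from_outs(0, 1))  # Romo: one out, no runs
```

By this arithmetic, neither Choate nor Romo comes out at 13.50 for the game.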
Am I going crazy?!
My friend and business partner Will tells me that the ERA is not really a stat that anyone pays attention to anymore. More interesting is WHIP (Walks and Hits per Inning Pitched). Clearly, the same problem applies to the WHIP as to the ERA. He is too practical to be interested in such cracks in the system. And he is right that these statistics are really meant to apply to seasons or even whole careers — not single games. But it drives me crazy.
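The same crack shows up if you sketch WHIP the same way. This is again a rough illustration with a made-up function name (`whip`), using outs instead of decimal innings, and treating the zero-out cases as infinity or NaN by my own convention.

```python
import math


def whip(walks: int, hits: int, outs: int) -> float:
    """Walks plus hits per inning pitched, where one inning is 3 outs.

    Assumption for illustration only: with zero outs the result is
    infinity if anyone reached base, and NaN (undefined) otherwise.
    """
    baserunners = walks + hits
    if outs == 0:
        return math.inf if baserunners > 0 else math.nan
    return baserunners * 3 / outs


# Choate today: one walk, one single, no outs.
print(whip(1, 1, 0))
```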
If MLB is going to post an ERA for a game, they ought to get it right. And if there is some special calculation that goes into dealing with players who don’t pitch for at least one out, it ought to be made clear. Instead, I find no explanation. What’s worse, people set up whole ERA calculators that do nothing but spit out the simple calculation above. And if you enter 0.00 innings, they say, “Please make sure the number of innings pitched > 0.” Great help!
As Louise on Bob’s Burgers would say, “Am I going crazy?!”
It’s often a mess. Baseball “starts” over with season stats once the playoffs begin. This is partly so players don’t, for the purposes of the record books, watch their season stats go down because they’re facing a select group of tough competitors, and partly so fans can easily see how someone is performing in the playoffs. They also start stats over if someone is traded mid-season between the AL and NL, which once had different umpires and never played each other until the World Series. It makes no sense to restart those stats anymore.
Your friend is right that no serious stat-heads pay much attention to ERA. But the stat-heads are kinda groupthinky. For example, no stat-head values batting average; they all prefer home runs. (Hmm, what team was last in home runs this season? Uh . . . Kansas City.) In aggregate, the stats that are popular among stat-heads do give the best picture of why players/teams were successful the last several seasons; they’re not worth much for predicting what future trends are going to be. As you know, things go up and down in baseball (eras where pitchers dominate, eras where hitters do) and so it’s almost impossible to say “this is the most important measurable attribute.”
And don’t get them started about fielding statistics, oh Lord!
I figured you might have some input on this. I’m not so interested in the statistics as a measure of reality. I’m just trying to figure out what they mean. I remember reading one of Wilt Chamberlain’s books, and he talked about how players would change their play based upon what statistics were kept. He noted that players started trying to steal balls a lot more once they started keeping steals as a stat in the early 1970s. He also said that Jerry West was probably one of the all-time greatest stealers, but no one knows that because they didn’t keep the stat.
One thing I’ve noticed watching these games is that at least the teams in the playoffs are far more focused on a form of baseball that Ty Cobb would have approved of. There aren’t a lot of home runs. I’ve really enjoyed watching the games.