Friday, April 01, 2005

Boston, New York and Pythagoras...

"Baseball is a game of inches." Conventional wisdom, handed down through the ages, that has an enormous amount of truth within. In fact, it's not just inches - it's millimeters and microseconds. There may be two feet difference between a home run and an out when a ball's 400 feet from home plate, but what was the difference when that ball contacted the bat. Is a 2 millimeter difference in the swing plane enough to change a long fly ball into a home run, or vice versa? Is a 500 us difference in the swing start time enough to do it?

Baseball is a game of skill, without a doubt. A game of reflexes and quickness and strength.

And it's a game of luck. People resent the term in this context, but it's there, and there's no other way to describe it. There's variation in everything - when the variations help, it's good luck; when they hurt, it's bad luck. There's a lot of luck that goes into a season, a game, an inning. Pitcher A gives up BB, H, K, K, HR, K, and he's given up 3 runs. Pitcher B gives up HR, K, BB, K, H, K, and he's given up 1 run. With an identical performance. It's the luck of the draw that the HR hitter came up with 2 on for pitcher A and none on for pitcher B. And it happens with hitters. How many screaming line drives have you seen caught by a fielder? If you've watched baseball at all you've certainly seen it. While a bloop falls in for a double. Excellent execution is no guarantee of a good result, nor bad execution of a bad one. Or two teams play a 4 game series, and one team wins 13-2 and 12-4, then loses 2-1 and 6-5. Over the series, the first team played better baseball, but ended up with a split. On any given day, the better team just cannot consistently "impose its will" on another.

A .500 team is one that can be expected to win half of its games. Do a simulation on a 50/50 team for 162 games, and you'll see 64-98, 98-64, and everything in between. Due to chance. Luck.
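If you want to try that simulation yourself, here is a minimal sketch. It assumes every game is an independent coin flip, which is a simplification of mine and not a claim about real schedules; the function names and the seed are made up for the illustration. Under that assumption the standard deviation of wins is sqrt(162 * 0.5 * 0.5), a bit over 6 games, so a true .500 team landing well away from 81-81 is not surprising.

```python
import random


def simulate_season(rng, games=162, win_prob=0.5):
    """Wins in one simulated season, treating each game as an independent coin flip."""
    return sum(1 for _ in range(games) if rng.random() < win_prob)


def simulate_many(seasons=10_000, seed=2005):
    """Simulate many seasons for a true .500 team (seed fixed so runs are repeatable)."""
    rng = random.Random(seed)
    return [simulate_season(rng) for _ in range(seasons)]


if __name__ == "__main__":
    wins = simulate_many()
    print("fewest wins:", min(wins))
    print("most wins:  ", max(wins))
    print("average wins:", sum(wins) / len(wins))
    # Share of seasons that finish 5 or more games away from 81-81.
    far = sum(1 for w in wins if abs(w - 81) >= 5) / len(wins)
    print("share at least 5 games off .500:", round(far, 3))
```

Every team in that run has exactly the same underlying ability. The spread in the output is nothing but chance.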

Over the course of 162 games, we expect the luck to even out, and to a large extent, it does. Most of the time, the teams that make the post-season are, objectively, good, and the teams that finish at the bottom are, objectively, bad. But there's reason to be skeptical that one team finishing 5 games ahead of another is necessarily a better team. It may be, but it may not. There's a certain amount of randomness involved. Does anyone think that the 2004 Red Sox would beat the 2004 Cardinals 162 straight times? Of course not, but over the course of 4 games, that's what the sample looks like. Is there any doubt that the 2004 Boston Red Sox were objectively better than the 2004 Baltimore Orioles? No, and yet the Orioles won 10 of their 19 meetings.

This is background for the point I want to make, which is this: the 2004 Boston Red Sox were a significantly better team than the 2004 New York Yankees.

Before I demonstrate that, I want to describe one tool and one myth.

One of the early baseball analysis tools that Bill James provided was the discovery of the mathematical relationship between runs scored, runs allowed and winning percentage. Looking at what teams had done over the history of baseball, he found that the actual winning percentage of a team was, in almost all cases, approximated by (Runs ^ 2) / ((Runs ^ 2) + (Runs Allowed ^ 2)). Because this resembled the Pythagorean relationship between the sides of a right triangle, he dubbed it the "Pythagorean" winning percentage, and it's been used by many people ever since as a metric to take some of the luck out of team records. (Note: the figures below actually use 1.83 as the exponent, which produces closer results, on average, than 2.)

It is a common tenet of baseball wisdom that "good teams win the close games." It is also, like many of the common tenets of baseball wisdom, false. When the issue has been studied, it was found that good teams win blow-outs. Good teams do win more close games than bad teams, but that's because good teams win more games of every type than bad teams. The 3 teams with the best records in baseball last year all had lower winning percentages in 1-run games than they did overall. Boston, which won over 60% of its games, was 16-18 in games decided by 1 run. The Tampa Bay Devil Rays had a better record in 1-run games than Boston did.

A study by David Smith, available at Retrosheet, determined that "it seems reasonable to conclude that, while 1-Run games may provide good theater, they are not especially good predictors of ultimate team success." Bill James, in a column available at Diamond Mind, concluded that "one-run games involve a huge amount of luck. This may be the only safe statement that can be made about them."


So let's look at the 2004 Red Sox vs. the 2004 Yankees.

The Red Sox scored more runs, largely on the strength of their ballpark being more conducive to run-scoring. Park-adjusted, the two teams were comparable offensively. But the Red Sox also allowed fewer runs, demonstrating much better pitching. The Red Sox outscored their opponents by 181 runs, more than twice the Yankee differential of 89.



Boston - New York - 2004

           W     L     %       Scored   Allowed   Pyth %   Pyth W   "Luck"
Boston     98    64    0.605   949      768       0.596    96.55    1.45
New York   101   61    0.623   897      808       0.548    88.78    12.22
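For anyone who wants to check the table, here is a small sketch of the calculation, using the 1.83 exponent mentioned earlier. The function names are mine, chosen for the illustration. The results can differ from the table in the second decimal place, because the table appears to round the Pythagorean percentage to three places before multiplying by 162.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=1.83):
    """Projected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_wins_and_luck(wins, losses, runs_scored, runs_allowed, exponent=1.83):
    """Return (projected wins, "luck"), where luck = actual wins - projected wins."""
    games = wins + losses
    projected = pythagorean_pct(runs_scored, runs_allowed, exponent) * games
    return projected, wins - projected


# 2004 regular-season figures from the table: (W, L, runs scored, runs allowed)
teams = {
    "Boston": (98, 64, 949, 768),
    "New York": (101, 61, 897, 808),
}

if __name__ == "__main__":
    for name, (w, l, rs, ra) in teams.items():
        pct = pythagorean_pct(rs, ra)
        projected, luck = projected_wins_and_luck(w, l, rs, ra)
        print(f"{name:9s} Pyth % {pct:.3f}  Pyth W {projected:.2f}  luck {luck:+.2f}")
```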


Yes, the Yankees won 3 more regular season games than the Red Sox did. The question is, is that result representative of the relative quality of the teams? I don't think so. Obviously, in terms of runs scored/runs allowed differential, the Red Sox were a significantly better team than New York. There are a couple of other indicators.



Red Sox - Yankees - 2004 Expanded

         Head-to-head                                   Final Overall Record
         Regular season   Post-season   Overall
         W     L          W     L       W     L         W     L     %
Boston   11    8          4     3       15    11        109   67    0.6193
NYY      8     11         3     4       11    15        107   66    0.6185



Head-to-head in the regular season, the Red Sox took 11 of 19. Head-to-head in the post-season, they took 4 of 7. Head-to-head overall, they took 15 of 26. Overall, all the games that counted, the Red Sox won 109 and the Yankees won 107, and the Red Sox finished with a better winning percentage than New York.

But the head-to-head stuff is all small sample size. The overall difference in winning percentage is so small that you've got to go to 4 decimal places to see it. The thing that's convincing to me is the run differential difference between the two teams. The Yankees' 12+ game advantage over their projected record is the 2nd-greatest differential in MLB history.

If we look at the entire history of Major League Baseball, there have been 2327 team seasons. On average, the teams have exceeded their projected records by about .5 win. The standard deviation on the sample is about 4. 80% of all teams have finished with a record that is within 1 standard deviation of the average. 98% have been within 2 standard deviations, about +/- 8 wins of their projected records. The following chart shows what the distribution has looked like.



The 2004 Yankees were an outlier. As already noted, they were the 2nd "luckiest" team in baseball history. A team that scored 897 runs and allowed 808, as New York did, should be expected to win 89 games, not 101.

Is there some particular characteristic of this Yankee team that makes them better than their runs scored and allowed suggest? Well, they have exceeded their projection in 8 of the last 9 years, so there may be. Certainly, in a close game with a lead, they've very rarely lost, as Rivera has been a dominant performer over that timespan. But prior to last year, they ranged from a low of -3.8 in 1997 to a high of 5.7 in 2001. Nothing remotely resembling that 2004 performance. So, can we expect them to repeat that?

No, we cannot.

There have been 32 teams in baseball history whose "luck", that is, their actual wins minus their projected wins, was more than 8 games. 2 of those were the 2004 Yankees and the 2004 Cincinnati Reds. We don't yet know what they'll do the following season.

But of the other 30, only 2 were 8+ the following year. Only 1 of those 30 teams was "luckier" the following season. The group saw their average record "luck" decrease by over 9 games. As a group, their 2nd year "luck" was .004, essentially 0.
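That pattern is what you'd expect if "luck" really is luck. Here is a rough sketch of the idea, not a reconstruction of the historical data: it assumes each team-season's luck is an independent draw from a normal distribution with the mean (.5) and standard deviation (4) quoted above. The independence and the normal shape are my assumptions for the illustration, and the function name and seed are made up. Pick out the simulated teams that were more than 8 games lucky in year one, then look at how the same teams do in year two.

```python
import random


def regression_demo(team_seasons=200_000, mean=0.5, sd=4.0, cutoff=8.0, seed=2004):
    """Simulate pairs of seasons in which luck is pure, independent noise.

    Returns (year-one luck, year-two luck) for the teams whose
    year-one luck exceeded the cutoff.
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(team_seasons):
        year_one = rng.gauss(mean, sd)
        year_two = rng.gauss(mean, sd)  # drawn independently of year one
        if year_one > cutoff:
            first.append(year_one)
            second.append(year_two)
    return first, second


if __name__ == "__main__":
    first, second = regression_demo()
    print("teams more than 8 games lucky in year one:", len(first))
    print("their average luck in year one:", round(sum(first) / len(first), 2))
    print("their average luck in year two:", round(sum(second) / len(second), 2))
    repeat = sum(1 for x in second if x > 8.0) / len(second)
    print("share still more than 8 games lucky in year two:", round(repeat, 3))
```

In the simulation the lucky group falls back to the overall average in year two, because nothing carries over from one season to the next. The real teams described above behaved in much the same way.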

There is no reason to expect the Yankees to overperform in 2005 anything like they did in 2004. Everything suggests that the Red Sox were better than New York last year. So if the Yankees did improve in the off-season (and it's likely that they did, at least a little, because of the Randy Johnson addition), well, they needed to in order to catch the Red Sox. The fact is that they could very easily have improved their fundamental ability by 5-7 games and still not match their 2004 record.

I've written about the starting pitchers. I've written about the position players. Here's the bottom line:
The 2004 Red Sox were a better team than the 2004 Yankees. The Red Sox improved. The Yankees improved. The 2005 Red Sox are a better team than the 2005 Yankees. Period.

Will the Red Sox win the East? Who knows? I don't. The teams are close enough that either of them could win the division by 3-5 games, and it would prove nothing about who's actually the better team. I say the Red Sox are better, they're just as likely to win the division as the Yankees are, these are the two best teams in the AL, and a 3rd straight ALCS match-up is, if not actually likely, not unlikely at all.
