Wednesday, October 01, 2008

MLB Playoff Predictions

I'd like to begin with a new acronym, to cut down on the number of times that I have to repeat certain comments.

Any Thing Can Happen In A Short Series
ATCHIASS

You'll see it again...

Los Angeles Angels of Anaheim vs. Boston Red Sox



I want to reiterate my fundamental position that a 5-7 game series tells you nothing about the relative virtues of the competing teams. I'd expect the significantly better team to win 3 of 5 significantly less than 100% of the time. Either team could win this series. Either team could lose this series. Either team could sweep this series. That goes for all the rest of them, too.
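To put a rough number on "significantly less than 100%", here is a minimal sketch. It assumes every game is an independent trial with the same win probability `p`, which ignores home field, pitching matchups and everything else that makes October interesting. The function name and the sample values of `p` are illustrative only. Under those assumptions, a team that would beat its opponent 60% of the time in a single game takes a best-of-five only about 68% of the time.

```python
from math import comb


def series_win_prob(p: float, wins_needed: int) -> float:
    """Chance that a team with per-game win probability p takes a series
    that ends as soon as either side reaches wins_needed wins.

    Assumption: games are independent and p never changes.
    """
    q = 1.0 - p
    # The team clinches on its final win, having lost k games
    # (k = 0 .. wins_needed - 1) somewhere among the earlier games.
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
        for k in range(wins_needed)
    )


# Illustrative per-game probabilities, not estimates for any real team.
for p in (0.50, 0.55, 0.60):
    print(
        f"p={p:.2f}  best-of-5: {series_win_prob(p, 3):.3f}  "
        f"best-of-7: {series_win_prob(p, 4):.3f}"
    )
```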

ATCHIASS

That said...

I hold these truths to be self-evident:
  1. The Boston Red Sox are a better team than the Los Angeles Angels of Anaheim.

  2. That fact and $700 billion would get them a bailout of the US financial system.

  3. It won't get them into the ALCS.


The Angels won 100 games, the Red Sox 95. Even more telling, the Angels beat the Red Sox in 8 of the 9 games played between the two teams this year. How on earth can I say that the Red Sox are a better team?

1) They scored more runs and
2) they allowed fewer runs while
3) playing a tougher schedule

Baseball Prospectus publishes adjusted standings based not only on runs scored and allowed, but also on the quality of the opponents and the components* of run scoring and allowing. They think that these standings are a better representation of the quality of the teams in baseball. Boston's 3rd order record was 102-60. The Angels' was 84-78, worse than the top four teams in the AL East.

Of course, that fact and $700 billion would get them a bailout of the US financial system.

I've written weekly about the Pythagorean projection of season results, and the Angels have overperformed their projection by a historically significant amount. In the history of Major League baseball, two (2) teams (the 1905 Detroit Tigers and the 2004 NY Yankees [anyone remember what ended up happening to them?]) have had larger gaps between their actual and expected won/loss record. The Angels won 100 games while outscoring their opponents by 68 runs. The previous worst run differential for a 100-win team was 89, by the 2004 NY Yankees.
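For anyone who hasn't seen the projection written down, here is a minimal sketch of the classic Bill James form, with the exponent left as a parameter since different versions use slightly different values. The run totals in the example are assumptions, picked only so that the differentials match the +68 and +151 quoted in this post. They are not taken from the post itself.

```python
def pythagorean_wins(runs_scored: float, runs_allowed: float,
                     games: int = 162, exponent: float = 2.0) -> float:
    """Expected wins from runs scored and allowed.

    exponent=2.0 is the classic form; other published variants use
    values a little below 2. Treat the default as an assumption.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# Assumed run totals, chosen to match the differentials in the text.
angels = pythagorean_wins(765, 697)    # differential of +68
red_sox = pythagorean_wins(845, 694)   # differential of +151

print(f"+68 team:  about {angels:.1f} expected wins")
print(f"+151 team: about {red_sox:.1f} expected wins")
```

The gap between that expected total and the actual won/loss record is the "overperformance" being discussed here.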

Another way to look at it is that the Angels averaged .42 runs per game more than their opposition. They are the 24th team in history to have a run differential of between .41 and .43 runs per game. The previous 23 averaged a schedule-adjusted 88 wins, with a high of 96 and a low of 82. If you calculate the average and standard deviation and the z-scores, you see that ~71% of the teams were within +- 1 standard deviation of the mean. ~91% were within 2. If you put together a bar chart, you get something that looks suspiciously like the normal bell curve. In other words, the statistical evidence suggests that teams outscoring their opponents by 68 runs over a 162 game season win 88 games, plus or minus. Plus or minus what? Random variation. "Luck."
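A different, cruder way to size up that "plus or minus" is to pretend every game is an independent weighted coin flip and use the binomial standard deviation. This is not the calculation described above, which used the actual records of the 23 comparable teams. It is only a back-of-the-envelope sketch, and the assumption that an 88-win team has a true talent of exactly 88/162 is mine.

```python
from math import sqrt


def win_spread(true_pct: float, games: int = 162) -> tuple[float, float]:
    """Expected wins and binomial standard deviation for a team whose
    chance of winning each game is true_pct.

    Assumption: games are independent with a constant win probability.
    """
    expected = games * true_pct
    sd = sqrt(games * true_pct * (1.0 - true_pct))
    return expected, sd


expected, sd = win_spread(88 / 162)
z = (100 - expected) / sd  # how far out a 100-win season sits

print(f"expected wins: {expected:.0f}, standard deviation: {sd:.1f}")
print(f"100 wins is about {z:.1f} standard deviations above expectation")
```

On those assumptions a single standard deviation is a bit more than six wins, so a twelve-win overshoot is unusual but not impossible.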

So one of two things is the case. Either a) the Angels are constructed and manned in such a way as to render previous conceptions of the relationship between run differential and winning percentage meaningless or b) they had a historically fluky, lucky won/loss record.

I'm inclined to the latter.

The Red Sox, of course, outscored their opponents by 151 runs, far and away the best in the AL. They were projected to win 96 games and won 95. The Angels were projected to win 88 and won 100. How did that happen? The Angels won the vast majority of the games in which luck was a bigger factor than relative team quality.

Let's think about this for a moment. Is anyone shocked when a good team loses to a bad team in a game in July? Of course not - it's baseball, and people recognize that one game is meaningless in determining team quality. But over the course of 162 games, we get a pretty good idea, from their performance (not just the won/loss record) which is the better team. There's no one who seriously doubts that the 2008 Dodgers were better than the 2008 Giants, right? But those teams split 18 games. So we don't focus on one month, or one week, or one day, recognizing that, as the samples build, we see what teams are. Over the long haul. There's too much variation, too many bounces of the ball that can go the wrong way, to decide who's better over the course of a week.

Likewise, the closer the game, the more room there is for a bounce, a bloop that falls or a rocket right at a fielder, to impact the outcome. Good teams win big when their talent has good days and a bad team's players have off days. When the good team's players have off days and the bad team's players have good days, things get close. One and two run games tell you less about a team's quality than bigger differentials, because there's more "luck" involved.

In games decided by one or two runs this year, Los Angeles was 61-28 (.685). Boston was 34-33 (.507).
In games decided by three runs or more, Los Angeles was 39-34 (.534). Boston was 61-34 (.642).

That's how on earth I can say that the Red Sox are a better team than the Angels. And they could get swept out of the playoffs over the weekend, and I'll still be saying it on Monday.

Objective rankings



In one of his abstracts, Bill James produced a playoff predictor system, based on the playoffs that had occurred up to that point. Some of it makes sense, some doesn't, and I'm sure that it's less relevant than it was. But it is kind of a fun toy, so here it is, Red Sox vs. Angels:

1. 1 pt to the lead team for each half-game in the standings (LAA - 10)
2. 3 pts to the team that scored more runs (BOS - 3)
3. 14 pts to the team with fewer doubles (LAA - 14)
4. 12 pts to the team with more triples (BOS - 12)
5. 10 pts to the team with more home runs (BOS - 10)
6. 8 pts to the team with the lower team batting average (LAA - 8)
7. 8 pts to the team that committed fewer errors (BOS - 8)
8. 7 pts to the team that turned more double plays (LAA - 7)
9. 7 pts to the team that walked more batters (BOS - 7)
10. 19 pts to the team that had more shutouts (BOS - 19)
11. 15 pts to the team whose ERA was lower (LAA - 15)
12. 12 pts to the team that has been in postseason most recently or went further (BOS - 12)
13. 12 pts to the team that won season series (LAA - 12)

BOS - 83, LAA - 66

The Bill James Playoff predictor likes the Red Sox in this series.

So do I.
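If you want to play with the toy yourself, the bookkeeping is just a tally. The sketch below encodes the thirteen Red Sox vs. Angels lines exactly as printed above. The data layout and the function name are my own, purely for illustration. A straight sum of the printed lines will not necessarily reproduce the totals quoted in the post, which may reflect scoring details that are not itemized in the list.

```python
from collections import Counter

# (points, team credited) for items 1 through 13, as printed above.
RED_SOX_VS_ANGELS = [
    (10, "LAA"),  # 1. half-games in the standings
    (3, "BOS"),   # 2. more runs scored
    (14, "LAA"),  # 3. fewer doubles
    (12, "BOS"),  # 4. more triples
    (10, "BOS"),  # 5. more home runs
    (8, "LAA"),   # 6. lower team batting average
    (8, "BOS"),   # 7. fewer errors
    (7, "LAA"),   # 8. more double plays
    (7, "BOS"),   # 9. more walks
    (19, "BOS"),  # 10. more shutouts
    (15, "LAA"),  # 11. lower ERA
    (12, "BOS"),  # 12. postseason most recently or went further
    (12, "LAA"),  # 13. won season series
]


def tally(items):
    """Sum the points credited to each team."""
    totals = Counter()
    for points, team in items:
        totals[team] += points
    return totals


totals = tally(RED_SOX_VS_ANGELS)
favorite, _ = totals.most_common(1)[0]
print(dict(totals), "favorite:", favorite)
```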

Baseball Prospectus has another curve-fitting exercise they call the "secret sauce" ranking. This looks at teams' ranks in three sabermetric categories, with the overall higher ranked team predicted to have better post-season success. As it's a sum-of-ranks rating, lower is better.

Boston (15) over LAA (19)


Tampa Bay Rays vs. Chicago White Sox



The gap on paper between these two teams is not as great as the gap on paper between Boston and Los Angeles. Tampa was better than Chicago, but not by as much as the 8 game difference in the standings would make it appear. If you care about things like playoff experience, Chicago's got a lot more of it. Of course, many people expected Tampa to fold down the stretch, and they did nothing of the sort. Chicago's got a better offense, the Rays have done a better job preventing runs from scoring. It still sort of seems as if the Tampa bullpen's done it with smoke, mirrors, baling wire and duct tape. Garza's been very good, but Kazmir's varied between excellent and "can't throw a strike."

Objective rankings



1. 1 pt to the lead team for each half-game in the standings (TB - 16)
2. 3 pts to the team that scored more runs (CHI - 3)
3. 14 pts to the team with fewer doubles (TB - 14)
4. 12 pts to the team with more triples (TB - 12)
5. 10 pts to the team with more home runs (CHI - 10)
6. 8 pts to the team with the lower team batting average (TB - 8)
7. 8 pts to the team that committed fewer errors (TB - 8)
8. 7 pts to the team that turned more double plays (CHI - 7)
9. 7 pts to the team that walked more batters (TB - 7)
10. 19 pts to the team that had more shutouts (TB - 19)
11. 15 pts to the team whose ERA was lower (TB - 15)
12. 12 pts to the team that has been in postseason most recently or went further (CHI - 12)
13. 12 pts to the team that won season series (TB - 12)

TB - 111, CHI - 44

Secret Sauce:

Tampa Bay (32) ties Chicago (32)


Chicago Cubs vs. Los Angeles Dodgers



If any series looks like a mismatch, it's this one. The Secret Sauce says it's a blowout, James PP says advantage Cubs but not a big one. ATCHIASS, but these two teams are not of similar quality. Joe Torre's drawn praise, but for what, I'm not sure. Most people were predicting the Dodgers to win 90 games and the West before the season started. I don't see any great accomplishment in winning 84 and the West. There are seven teams with better records who are golfing already. The Cubs, on the other hand, led the world in run differential, and led the NL from start to finish. They scored 155 more runs than the Dodgers, and allowed 23 more.

But...ATCHIASS

Objective rankings



1. 1 pt to the lead team for each half-game in the standings (CHC - 26)
2. 3 pts to the team that scored more runs (CHC - 3)
3. 14 pts to the team with fewer doubles (LAD - 14)
4. 12 pts to the team with more triples (LAD - 12)
5. 10 pts to the team with more home runs (CHC - 10)
6. 8 pts to the team with the lower team batting average (CHC - 8)
7. 8 pts to the team that committed fewer errors (CHC - 8)
8. 7 pts to the team that turned more double plays (LAD - 7)
9. 7 pts to the team that walked more batters (CHC - 7)
10. 19 pts to the team that had more shutouts (LAD - 19)
11. 15 pts to the team whose ERA was lower (LAD - 15)
12. 12 pts to the team that has been in postseason most recently or went further (CHC - 12)
13. 12 pts to the team that won season series (LAD - 0)

CHC - 86, LAD - 67


Secret Sauce:

Chicago (19) over Los Angeles (52)


Philadelphia Phillies vs. Milwaukee Brewers



Both of these teams closed strong to leave the Mets on the outside looking in. Sabathia didn't have much left when the Indians got to the ALCS last year, and the Brewers are going to go as far as he can take them.

Objective rankings



1. 1 pt to the lead team for each half-game in the standings (PHI - 4)
2. 3 pts to the team that scored more runs (PHI - 3)
3. 14 pts to the team with fewer doubles (PHI - 14)
4. 12 pts to the team with more triples (PHI - 12)
5. 10 pts to the team with more home runs (PHI - 10)
6. 8 pts to the team with the lower team batting average (MIL - 8)
7. 8 pts to the team that committed fewer errors (PHI - 8)
8. 7 pts to the team that turned more double plays (MIL - 7)
9. 7 pts to the team that walked more batters (PHI - 7)
10. 19 pts to the team that had more shutouts (PHI - 19)
11. 15 pts to the team whose ERA was lower (MIL - 15)
12. 12 pts to the team that has been in postseason most recently or went further (PHI - 12)
13. 12 pts to the team that won season series (PHI - 12)

PHI - 113, MIL - 30


Secret Sauce:

Milwaukee (38) over Philadelphia (40)


Picks



I want to reiterate my fundamental position that a 5-7 game series tells you nothing about the relative virtues of the competing teams. I'd expect the significantly better team to win 3 of 5 significantly less than 100% of the time. Either team could win this series. Either team could lose this series. Either team could sweep this series. That goes for all the rest of them, too.

ATCHIASS

My baseball pundit contract requires, though, that I make predictions. This enables me to gloat about things I get right. Anything I get wrong, well, there are sure to be mitigating circumstances that will enable me to claim victory on them, too. ;-) (In that sense, it's kind of like the Hitchhiker's Guide to the Galaxy - in any case where my predictions differ from actual reality, it's reality that's got it wrong...)

So here they are - the Lyford predictions for the ALDS and NLDS Series:

Boston over Los Angeles of Anaheim
Tampa Bay over Chicago
Philadelphia over Milwaukee
Chicago over Los Angeles


("Three favorites and your hometown team? Going way out on a limb, there, huh?" To which I respond, "Hey, I just calls 'em the way I sees 'em...")





* - Why do they look at "components" of run-scoring? Because there is so much "luck" involved in the game. A pitcher who strikes out three batters in an inning in which he allows three walks and a home run may give up 1, 2, 3 or 4 runs. For the offense, the HR and walks are all good outcomes of the pitcher/batter confrontation, the strikeouts are bad outcomes, but the order in which they occur is critical to their applied value. Every good outcome has an inherent value. A walk puts a runner on base, makes a pitcher throw more pitches and gives another batter a chance to hit. Whether the inherent value contributes to any actual or applied value depends on order.
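The "1, 2, 3 or 4 runs" claim is easy to check by brute force. This is a minimal sketch that runs through every ordering of three strikeouts, three walks and one home run in which the inning ends on the third strikeout (so that all seven plate appearances actually happen). It assumes nothing else occurs: no steals, wild pitches, errors or pickoffs. The function and event names are illustrative.

```python
from itertools import permutations


def runs_in_inning(events):
    """Runs scored for a sequence of 'K', 'BB' and 'HR' events.

    With only walks putting men on base, every runner is forced, so a
    simple count of runners is enough to track the bases.
    """
    runners = runs = outs = 0
    for event in events:
        if event == "K":
            outs += 1
            if outs == 3:
                break
        elif event == "BB":
            if runners == 3:
                runs += 1      # bases-loaded walk forces in a run
            else:
                runners += 1
        elif event == "HR":
            runs += runners + 1
            runners = 0
    return runs


events = ["K"] * 3 + ["BB"] * 3 + ["HR"]
# Keep only distinct orderings that end on the third strikeout.
orderings = {p for p in permutations(events) if p[-1] == "K"}
possible_runs = sorted({runs_in_inning(o) for o in orderings})
print(possible_runs)  # [1, 2, 3, 4]
```

Same seven events, four different run totals, depending only on how many of the walks come before the home run.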

The same thing applies to teams as well. A team that scores 4 runs three times and 2 runs three times in a six game stretch, while allowing 3 runs three times and 5 runs three times, might be 3-3 or 0-6. It's all about scoring big runs when your pitchers aren't pitching well, and pitching well in games when you aren't scoring much. To the extent that teams have shown that tendency or "ability," it's not consistent year-to-year, or even

Having accepted that, the way we look at players and teams hinges on a mindset.

  • A: Some players/teams are inherently "clutch," that is, they demonstrate a repeatable ability to perform better in "key" situations, and therefore produce results which exceed the value expected from the component pieces of their offensive events, or runs scored and runs allowed.

  • B: Everyone who makes it to the Major Leagues has had to perform in clutch situations many times, and has demonstrated an ability to perform in situations of mental stress. There are no players who have demonstrated a statistically significant repeatable ability to increase their performance in "key" situations. Therefore, there is no need to discard the null hypothesis, which is that team and player results are built upon a foundation of raw performance, and vary according to a certain amount of "luck."


I hold with position B.
