Back in March, I wrote a piece at Hardball Times in which I boldly (and perhaps foolishly) predicted that the Reds would win ~86 games, based on CHONE projections. I was pretty careful in my work-up to be as realistic about playing time as I could, using conservative PT estimates for starters, and divvying up reserve PA's in a way that seemed reasonable. I acknowledged that I might be missing high, but cited two other published projections that had the Reds somewhere around 0.500 (the Hardball Times Season Preview and CHONE's own NL Central projections).
The Reds got pretty close to 0.500 in 2009. Their actual record was 0.481 (3 G below 0.500). However, depending on how you calculate it, their expected (pythagorean) record given the team stats was somewhere between 0.430 (11 G below 0.500) and 0.470 (5 G below 0.500; see my four-part series recap). Here's a look at the overall team. In the table below, all numbers are given in runs above average.
Component | Projected | Actual |
Offense | +20 | -61 |
Fielding | -6 | +40 |
Pitching | +39 | -46 |
Total RAA | +53 | -67 |
Table notes: Projected Offense, Fielding, and Pitching are all calculated from 2009 CHONE projections, whereas Actual numbers come from actual 2009 stats. Offense uses the same set of linear weights in both cases (this is a different set than the season recap series used). Projected fielding comes from 2009 CHONE fielding projections, while actual fielding comes from the recap series. Pitching is based on FIPRuns for both datasets (this is a different approach than I took last spring, but the underlying data are the same). Nothing is park adjusted, for simplicity's sake.
I've read that 2009 was a bad year for projection systems as a whole, but holy crap, this is brutal. Clearly the projections missed all over the place. The 2009 Reds had much worse offense, much better fielding, and much worse pitching than expected. Overall, the cumulative effect is a shortfall of ~120 runs between reality and the projections I was using.
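As a quick sanity check on the arithmetic in the table, here is a minimal Python sketch that totals the three components and converts runs above average into an approximate win total. The 10-runs-per-win conversion is a common rule of thumb and an assumption on my part here, not necessarily the conversion behind the original 86-win estimate; the names `RUNS_PER_WIN`, `total_raa`, and `expected_wins` are made up for illustration.

```python
# Assumption: ~10 runs per win, a common rule of thumb (not taken from the tables).
RUNS_PER_WIN = 10.0
GAMES = 162

# Runs above average, copied from the team table above.
projected = {"Offense": 20, "Fielding": -6, "Pitching": 39}
actual = {"Offense": -61, "Fielding": 40, "Pitching": -46}


def total_raa(components):
    """Sum the component runs above average."""
    return sum(components.values())


def expected_wins(raa):
    """An average team wins half its games; add RAA converted to wins."""
    return GAMES / 2 + raa / RUNS_PER_WIN


gap = total_raa(projected) - total_raa(actual)
print(total_raa(projected), total_raa(actual), gap)
print(round(expected_wins(total_raa(projected)), 1))
print(round(expected_wins(total_raa(actual)), 1))
```

Under that assumption, +53 runs works out to roughly 86 wins and -67 runs to roughly 74 wins, which is in the same neighborhood as the pythagorean range quoted earlier.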
So what happened? There are two basic explanations. Either 1) players didn't perform as well as projected in terms of their rate stats (suckitude), or 2) good players didn't get enough playing time (injuries, etc). To parse out the blame between the two, I used the following rules for each group of players:
- If a group's rate stats were better than expected, but their playing time was down, I assigned all of the blame for the shortfall to a playing time tally ("o/u PT" below). This never actually happened. :(
- If a group's rate stats were worse than expected, but their playing time was up, I assigned all of the blame for the shortfall to the rate stats ("o/u Rate" below).
- If both rate stats and playing time were worse than expected, I first estimated what their shortfall would have been IF they'd gotten to the projected playing time. Essentially, I took the production they actually had, and added in additional PA's or IP's until they reached their projected playing time. The additional PA's or IP's were added while assuming players played to their projected (not actual) rate stats, as I judged that to be more predictive than their season rate stats. Anyway, this shortfall, after playing time shortfalls were eliminated, was the "o/u Rate" shortfall. The remaining shortfall was assigned to "o/u PT".
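The three rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses WAR per PA (or per IP) as a stand-in for "rate stats", the function name `split_shortfall` is hypothetical, and it simply raises an error for cases the rules don't describe (such as surpluses). It is not the actual spreadsheet logic.

```python
def split_shortfall(proj_pt, proj_war, act_pt, act_war):
    """Split a WAR shortfall into (playing-time share, rate share).

    Assumption: WAR per unit of playing time stands in for rate stats.
    """
    total = act_war - proj_war
    if total >= 0:
        raise ValueError("no shortfall; surplus cases are not covered by these rules")

    proj_rate = proj_war / proj_pt
    act_rate = act_war / act_pt
    pt_down = act_pt < proj_pt
    rate_down = act_rate < proj_rate

    if pt_down and not rate_down:
        # Rule 1: rates held up, playing time fell short -> all playing time.
        return total, 0.0
    if rate_down and not pt_down:
        # Rule 2: rates fell short, playing time did not -> all rates.
        return 0.0, total
    if pt_down and rate_down:
        # Rule 3: fill the missing playing time at the PROJECTED rate.
        filled_war = act_war + (proj_pt - act_pt) * proj_rate
        rate_share = filled_war - proj_war   # shortfall left after the fill
        return total - rate_share, rate_share
    raise ValueError("case not covered by these rules")


# Starting catchers: 467 PA / 2.2 WAR projected, 331 PA / 0.8 WAR actual.
catcher_pt, catcher_rate = split_shortfall(467, 2.2, 331, 0.8)
print(round(catcher_pt, 1), round(catcher_rate, 1))  # -0.6 -0.8
```

Because it works from the rounded table values, the sketch reproduces the published splits only approximately, though the starting catcher, starting 3B, and Volquez rows all come out the same to one decimal place.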
Position Players
Group | Proj PA | Proj OBP | Proj SLG | Proj Field | Proj WAR | Actual PA | Actual OBP | Actual SLG | Actual Field | Actual WAR | Over/Under | o/u PT | o/u Rate |
Starting C | 467 | 0.332 | 0.435 | -4 | 2.2 | 331 | 0.336 | 0.362 | 0 | 0.8 | -1.4 | -0.6 | -0.8 |
Reserve C | 256 | 0.324 | 0.359 | 0 | 0.7 | 442 | 0.334 | 0.301 | 6 | 1.4 | +0.6 | +0.6 | +0.1 |
Starting 1B | 576 | 0.365 | 0.494 | 3 | 3.2 | 544 | 0.414 | 0.567 | 1 | 5.0 | +1.8 | +0.0 | +1.8 |
Starting 2B | 594 | 0.325 | 0.455 | 6 | 3.4 | 644 | 0.329 | 0.447 | 10 | 4.0 | +0.6 | +0.3 | +0.3 |
Starting 3B | 561 | 0.360 | 0.485 | -12 | 2.6 | 327 | 0.348 | 0.387 | 1 | 1.2 | -1.4 | -1.1 | -0.3 |
Starting SS | 638 | 0.310 | 0.391 | 0 | 1.6 | 562 | 0.278 | 0.301 | 7 | -0.2 | -1.8 | -0.2 | -1.6 |
Reserve IF | 436 | 0.317 | 0.422 | -2 | 1.1 | 412 | 0.318 | 0.345 | 0 | 0.1 | -1.0 | -0.1 | -1.0 |
Starting CF | 543 | 0.328 | 0.341 | 3 | 1.2 | 437 | 0.275 | 0.285 | 1 | -1.0 | -2.2 | -0.2 | -2.0 |
Starting RF | 536 | 0.334 | 0.509 | 2 | 2.8 | 387 | 0.303 | 0.470 | 7 | 1.6 | -1.2 | -0.8 | -0.4 |
Starting LF/Reserve OF | 1177 | 0.338 | 0.430 | -1 | 3.2 | 1722 | 0.325 | 0.442 | 6 | 6.1 | +2.9 | +1.9 | +1.0 |
Pitcher Hitting | 326 | 0.178 | 0.177 | 0 | 1.6 | 379 | 0.140 | 0.212 | 0 | 2.1 | +0.6 | +0.3 | +0.3 |
Total (from total stats) | 6110 | 0.326 | 0.422 | -6 | 23.6 | 6187 | 0.315 | 0.393 | 40 | 21.0 | -2.6 | 0 | -2.6 |
Total (summing groups) | | | | | | | | | | | -2.6 | +0.1 | -2.7 |
Total (only shortfalls) | | | | | | | | | | | -9.0 | -3.0 | -6.0 |
Table notes: Projected numbers are based on CHONE projection rate stats as well as the number of PA's that I permitted each group of players to have. A player could get no more than the PA's projected by CHONE, but often got substantially fewer. "o/u PT" is the portion of the WAR shortfall or surplus that I attributed to playing time differences, whereas "o/u Rate" is the portion of the WAR shortfall or surplus attributed to differences in rate stats. The first Total line (from total stats) is an overall parsing of blame based on overall stats--since I projected almost exactly the right number of PA's, by definition any shortfall must have been rate based. The second Total line (summing groups) sums the "o/u PT" and "o/u Rate" columns across the groups and is, I think, a better overall summary of where the problem areas were. Finally, Total (only shortfalls) includes column totals for only those groups that had a net overall shortfall. A full spreadsheet with all players included can be found here.
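To make the last two Total lines concrete, here is a small Python sketch that recomputes them from the rounded group rows in the table. The names `GROUPS` and `column_totals` are hypothetical, and since the inputs are already rounded to one decimal, the results can differ from the published totals by about 0.1 (presumably the published totals were built from unrounded values).

```python
# (group, over/under, o/u PT, o/u Rate), copied from the rounded table rows.
GROUPS = [
    ("Starting C", -1.4, -0.6, -0.8),
    ("Reserve C", 0.6, 0.6, 0.1),
    ("Starting 1B", 1.8, 0.0, 1.8),
    ("Starting 2B", 0.6, 0.3, 0.3),
    ("Starting 3B", -1.4, -1.1, -0.3),
    ("Starting SS", -1.8, -0.2, -1.6),
    ("Reserve IF", -1.0, -0.1, -1.0),
    ("Starting CF", -2.2, -0.2, -2.0),
    ("Starting RF", -1.2, -0.8, -0.4),
    ("Starting LF/Reserve OF", 2.9, 1.9, 1.0),
    ("Pitcher Hitting", 0.6, 0.3, 0.3),
]


def column_totals(rows, only_shortfalls=False):
    """Sum the three comparison columns, optionally keeping only net-shortfall groups."""
    kept = [r for r in rows if not only_shortfalls or r[1] < 0]
    return tuple(round(sum(r[i] for r in kept), 1) for i in (1, 2, 3))


print(column_totals(GROUPS))                        # all groups
print(column_totals(GROUPS, only_shortfalls=True))  # net-shortfall groups only
```

From the rounded rows this gives about -2.5 / +0.1 / -2.6 for all groups and -9.0 / -3.0 / -6.1 for the shortfall-only groups, each within 0.1 of the table's Total lines.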
The Reds had some injury difficulties last season, and as a result had a lot of players fall short on projected playing time. Starters at C, 3B, SS, CF, and RF all fell significantly short of expected playing time. You might be thinking that this is because we projected too much playing time, but the truth is that the Reds had just THREE players who cleared more than 400 PA's last season...and even then, two of them missed significant time. I don't see how we could have possibly projected that. The Reds definitely had the injury bug last year.
Why, then, do we have only a total +0.1 WAR surplus attributed to playing time? Two reasons. First, reserves (especially outfielders) who stepped in to fill those PA's contributed above-replacement-level work that was close to their expected rates, and that surplus was therefore largely credited to increased playing time. This entirely negated the playing time shortfall from the starters who missed time.
Second, and more importantly, all of those starters who missed significantly more time last year than expected (Hernandez, Encarnacion/Rolen, Gonzalez/Janish, Taveras, Bruce) ALSO performed well below their expected levels last season. As a result, blame for those shortfalls was split between both playing time and rate stats--and it turned out that much more blame was levied on poor rate stats than poor playing time (-6 WAR vs. -3 WAR, respectively, from the last line in the table). We just had a whole bunch of important players who didn't produce--even when they were healthy.
What about pitchers? They had the Volquez injury there, right? End of story? Well, sort of...
Pitchers
Name | Proj IP | Proj FIP | Proj WAR | Actual IP | Actual FIP | Actual WAR | Over/Under | o/u PT | o/u Rate |
Starter 1 (Harang) | 193.0 | 4.11 | 2.7 | 162.3 | 4.22 | 2.1 | -0.6 | -0.4 | -0.2 |
Starter 2 (Volquez) | 166.0 | 3.96 | 2.6 | 49.6 | 5.10 | 0.2 | -2.4 | -1.8 | -0.6 |
Starter 3 (Arroyo) | 188.0 | 4.42 | 2.0 | 220.3 | 4.86 | 1.3 | -0.7 | 0.0 | -0.7 |
Starter 4 (Cueto) | 146.0 | 4.56 | 1.3 | 171.3 | 4.77 | 1.2 | -0.1 | 0.0 | -0.1 |
Other Starters | 245.8 | 4.71 | 1.9 | 397.4 | 5.40 | 0.3 | -1.5 | 0.0 | -1.5 |
Closer | 64.0 | 3.45 | 0.9 | 66.6 | 3.18 | 1.4 | +0.5 | +0.1 | +0.5 |
Core Relievers | 268.0 | 4.09 | 1.2 | 173.6 | 4.86 | 0.0 | -1.2 | -0.4 | -0.8 |
Other Relievers | 171.0 | 4.49 | 0.0 | 216.5 | 3.88 | 1.9 | +1.9 | +0.4 | +1.5 |
Total (from total stats) | 1441.8 | 4.29 | 12.7 | 1457.6 | 4.71 | 8.4 | -4.2 | +0.1 | -4.2 |
Total (summing groups) | | | | | | | -4.2 | -2.3 | -2.0 |
Total (only shortfalls) | | | | | | | -6.6 | -2.7 | -3.9 |
The remaining shortfall, however, was again due to poor performance. All four of the primary starters (and all five in the opening rotation) had FIP's higher than projected. Micah Owings, in particular, was a disaster, coming in at -0.3 WAR after being projected at +1 WAR (though, to be fair, Owings is a big reason why the pitchers' offensive rate stats were better than projected, so some of this cancels). And despite pitching only 23 innings, Mike Lincoln's -0.7 WAR drives a big chunk of the core reliever shortfall. Fortunately, other pitchers in the bullpen stepped up (Masset, Herrera, Fisher), making the bullpen a net positive.
Wrapping up
So, if you're keeping score at home, I attributed virtually all of the net position player shortfall to poorer than expected performance (rate stats), as the lost playing time that did occur was countered by effective performances from the reserves who filled in for them. For pitchers, it was closer to a 50/50 split between injuries and poor performances. Therefore, the reason the Reds didn't live up to my expectations was mostly due to underperforming players, rather than just injuries.
The other take-home from this is that I feel reasonably confident after going through this exercise that there isn't a fundamental bias in the methodology by which I was parsing out playing time that led to the over-prediction of 86 wins. Instead, it was really more a case of the projection system I was relying on missing high on a lot of players. Well, that, and the Volquez injury.
This is not to knock CHONE as a system. I think if I had done this with most of the other projection systems, it would have shown a fairly similar result. Very few systems, for example, would have forecast Jay Bruce to hit just 0.223/0.303/0.470, or Willy Taveras to hit 0.240/0.275/0.285 for that matter! Nevertheless, it is probably the case that the next time I try to do this, I will use averages of multiple projection systems, as that approach has been demonstrated to be more predictive than using any one system by itself.