Tango tells us: "Jackie Robinson was a 28-year old MLB rookie. In his career, he earned 63 wins. If we look at all position players in MLB history, but only from age 28 to the end of their careers, Jackie Robinson ends up in the top 25 players of all time."
Perhaps the biggest news of the week: FanGraphs substantially added to their fielding statistics, including John Dewan's Defensive Runs Saved (DRS) statistic among others. I'd give UZR a slight edge if I had to choose between it and DRS...but the good news here is that we don't have to choose just one, as UZR will be available soon as well. While UZR and DRS at FanGraphs use the same underlying data, they use different algorithms and therefore differ to a small degree. I will probably take the average of the two as the best fielding stat available to us...though it sure would be nice to have a STATS Inc-based statistic available as well beyond just Zone Rating. Here are your 2010 Reds starters by DRS in 2009:
Phillips +1 (seems to miss low; UZR had him at +7 last year, though, which also seems low)
Cabrera -33 runs (holy crap!! Easily the worst DRS in baseball relative to position)
Stubbs -1 (very surprising, small sample, nothing to worry about)
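As a minimal sketch of the "average the two" idea, here is how the blend works out for Phillips, using the only pair of numbers quoted above (+1 DRS, +7 UZR). The function name is just my own illustration, not anything FanGraphs provides, and an unweighted mean is an assumption; you could reasonably weight one system more heavily.

```python
def blended_fielding_runs(drs, uzr):
    """Unweighted mean of two fielding-run estimates (hypothetical helper)."""
    return (drs + uzr) / 2.0


# Phillips, 2009: DRS +1 and UZR +7, as quoted in the list above.
phillips = blended_fielding_runs(drs=1, uzr=7)
print(f"Phillips blended estimate: {phillips:+.1f} runs")
```

That puts Phillips at +4 runs, which still strikes me as a bit low.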
Another nice new addition at FanGraphs was the return of in-season ZiPS projections. Beyond their fantasy application, I love in-season projections as a check against my tendency to overreact to small sample sizes. For example, despite the recent uptick, Jay Bruce is off to an uninspiring .171/.237/.200 start over his first 38 PA (through Thursday's game). How much has this affected his projection? Not much. Pre-season, ZiPS had him at .251/.315/.459. Based on his start to the season, his updated projection puts him at .247/.312/.449 over the rest of the season. Pretty much unchanged! If only I had these for CHONE too...
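To put a number on "not much," here is a rough sketch that treats the updated projection as a simple linear blend of the pre-season projection and the observed line, then solves for the weight on the observed 38 PA. To be clear, the linear blend is purely my assumption for illustration; it is not a description of how ZiPS actually updates. The slash lines are the ones quoted above.

```python
# Slash lines (AVG, OBP, SLG) quoted in the paragraph above.
observed = (0.171, 0.237, 0.200)   # Bruce's first 38 PA
preseason = (0.251, 0.315, 0.459)  # pre-season ZiPS
updated = (0.247, 0.312, 0.449)    # updated rest-of-season ZiPS


def implied_weight(obs, pre, upd):
    """Solve upd = w*obs + (1 - w)*pre for w.

    ASSUMPTION: a plain linear blend, used only for illustration.
    This is not ZiPS's actual method.
    """
    return (upd - pre) / (obs - pre)


for label, o, p, u in zip(("AVG", "OBP", "SLG"), observed, preseason, updated):
    w = implied_weight(o, p, u)
    print(f"{label}: change {u - p:+.3f}, implied weight on the 38 PA: {w:.1%}")
```

Under that assumption, the early-season line carries only about 4 to 5 percent of the weight in each component, which is about what you would hope for from 38 PA.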
Interesting look at the recent accusations of racism in the free agent market. Short story: the data do not support the claim. In fact, black players were paid better than any other racial group this past offseason (ignoring Hideki Matsui's n=1 Asian group).
Jeff Sackmann uses his college splits database to evaluate whether it's more predictive of future success to focus on college performances against top-flight competition (hitters and pitchers who would later be drafted) than it is to just use the entire sample. The answer? Just use the entire sample. Apparently even if you use similar sample sizes, using elite competition matchups adds very little to your prediction. Good stuff is coming from this college project, folks. Looking forward to seeing where we are with this in 5 years.
Entertaining post by the always enjoyable Larry Granillo. Here he looks for the "best" name in baseball history, as measured by total WAR. E.g. "Thomas" includes Thomas Glavine, George Thomas Seaver, and Frank Thomas. Any suggestions?
The common assumption is that baseball players turn it up a notch when playing for a new deal. Wrong, according to this study. That flies in the face of a study in Baseball Between the Numbers by Baseball Prospectus that found the opposite. I'm surprised the two studies came to such different findings, as they used similar methodologies--just different years. Are today's players less able to amp it up than those of a decade ago?
There was a minor nerd fight this week between David Appelman and Colin Wyers. Colin posted an article at Baseball Prospectus critiquing the use of batted ball data, and especially line drive data. David posted a defense of the data used for tRA at FanGraphs (linked). What do I think? Clearly, the classification of batted balls as LD vs. FB is a subjective task that is fraught with all kinds of problems (remember Harry Pavlidis's study a few weeks back at THT that found a minor league stadium essentially stopped using line drives?). But as Appelman showed in his post, and Pizza Cutter before him, there is real, reliable information in those LD% numbers. Like many numbers in sabermetrics, they are imperfect. But when interpreted with caution, they provide some useful information that we would lose if we ignored them. Mike Fast, in the comments of David's post, correctly states that we do need to pursue the underlying biases in these data further to better understand them. Ultimately, when we finally have time-in-flight data for all batted balls, we can abandon these LD/FB designations and just calculate trajectories. That's the real solution to the problem...and BIS is now tracking this, which means that we may have that info soon for 2009+ seasons. Assuming someone will pay for it.