Before we begin, please take out a #2 pencil and answer the following question:
Your 2006 National League Most Valuable Player ballot more closely resembled which of the following?
A) 1.Freddy Sanchez, 2.Miguel Cabrera, 3.Albert Pujols, 4.Garrett Atkins, 5.Matt Holliday
B) 1.Albert Pujols, 2.Ryan Howard, 3.Lance Berkman, 4.Miguel Cabrera, 5.Carlos Beltran
For as long as everybody on this board has been watching/listening to baseball, the first statistic cited in a player's stat line is Batting Average. It's easy to calculate: Hits divided by At-Bats. Everybody knows what the numbers mean: .200 is the "Mendoza Line," .250 is mediocre, .300 is an All-Star, .350 will win you a batting title, .400 is the stuff of legend. A player's batting average gives you a pretty good idea of his hitting prowess. The top 10 players in career Batting Average include Ty Cobb, Rogers Hornsby, Shoeless Joe Jackson, Tris Speaker, Ted Williams, and Babe Ruth. Looking at simply batting average, however, leaves off some great players in baseball history: Willie Mays, Mickey Mantle and Hank Aaron don't even crack the top 100 in career BA.
Take a look at last year's NL team batting averages:
http://sports.espn.go.com/mlb/stats/aggregate?sort=avg&split=0&group=8&season=2006&seasonType=2&statType=batting&type=reg
The teams at the top, for the most part, seem to score more runs than the teams at the bottom. However, note that the team that led the NL in runs scored was 6th in BA, one spot behind the 2nd to last place team in runs scored. Based on that, I would venture to say that batting average doesn't tell us everything we need to know about getting runs across the plate.
Let's take a look at the information that we get from Batting Average:
Out < Hit
That is a correct statement. A hit is definitely better than an out. However, there's a whole lot of information that batting average leaves out. Hits come in many different varieties, with some more valuable than others. Outcomes other than hits and outs, such as walks, can greatly affect run production. This leads to multiple choice question #2:
Which is the better statement regarding the value of a Plate Appearance?
A) Out < Hit
B) Out < Walk < Single < Double < Triple < Home Run
If only there were a way to create a statistic that provides all the information that statement (B) above does. On the one hand, we have On-Base Percentage, which includes walks, but gives no extra credit for hits of the extra-base variety. In essence, an out is worth 0, and walks, singles, doubles, triples and homers are worth 1 each. On the other, we have Slugging Average, which is calculated using Total Bases, but gives nothing for walks. 0 for an out, 1 for a single, 2 for a double, 3 for a triple, 4 for a homer.
But.......
If we were to add On-Base Percentage and Slugging Average together (and call it something crazy like On-base Plus Slugging), something magical happens:
Out: 0
Walk: 1
Single: 2
Double: 3
Triple: 4
Homer: 5
That's a pretty solid valuation of probably 90% of what happens while a batter is at the plate. Also, look what happens to those NL splits I referenced above:
http://sports.espn.go.com/mlb/stats/aggregate?sort=ops&split=0&group=8&season=2006&seasonType=2&statType=batting&type=reg
There are the Phillies, 1st in both OPS and runs, and there are the Cubs, 2nd to last in both. The rest of the teams fall pretty neatly in line, too.
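For anyone who wants to check the Out/Walk/Single/Double/Triple/Homer arithmetic, here's a rough Python sketch. It's a simplification, not the official formulas: it ignores hit-by-pitch and sacrifice flies, and the function names (`ops_credit`, `batting_line`) and the two sample hitters are made up for illustration. Strictly speaking the 0-through-5 values are only approximate, because OBP is divided by plate appearances and SLG by at-bats.

```python
# Credit each event earns under each stat.
# Simplified on purpose: hit-by-pitch and sacrifice flies are ignored.
OBP_CREDIT = {"out": 0, "walk": 1, "single": 1, "double": 1, "triple": 1, "homer": 1}
SLG_CREDIT = {"out": 0, "walk": 0, "single": 1, "double": 2, "triple": 3, "homer": 4}


def ops_credit(event):
    """Combined credit for one event when OBP and SLG are added together."""
    return OBP_CREDIT[event] + SLG_CREDIT[event]


def batting_line(events):
    """Compute AVG, OBP, SLG and OPS from a list of plate-appearance outcomes."""
    plate_appearances = len(events)
    # A walk is a plate appearance but not an at-bat.
    at_bats = sum(1 for e in events if e != "walk")
    if at_bats == 0:
        raise ValueError("need at least one at-bat")
    hits = sum(1 for e in events if e not in ("out", "walk"))
    avg = hits / at_bats
    obp = sum(OBP_CREDIT[e] for e in events) / plate_appearances
    slg = sum(SLG_CREDIT[e] for e in events) / at_bats
    return {"avg": avg, "obp": obp, "slg": slg, "ops": obp + slg}


# Two hypothetical hitters with the same batting average.
singles_hitter = ["single"] * 3 + ["out"] * 7
patient_slugger = ["single", "double", "homer"] + ["out"] * 7 + ["walk"] * 2

if __name__ == "__main__":
    for name, events in [("singles hitter", singles_hitter),
                         ("patient slugger", patient_slugger)]:
        line = batting_line(events)
        print(name, {stat: round(value, 3) for stat, value in line.items()})
```

Both made-up hitters bat .300, but the first comes out to a .600 OPS and the second to roughly 1.117, which is exactly the information batting average throws away.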
OPS, quite simply, correlates better with Run Production than Batting Average, because it provides more information. For those who hadn't figured it out, in my initial question, option "A" was the top 5 NL players in Batting Average, and option "B" was the top 5 NL players in OPS. Saber-haters often try to disparage OPS because it has no intrinsic meaning in the way that batting average is simply "average number of hits per at-bat." They're right, there's no intrinsic meaning to OPS. It's just a number. However, it's a number that does a damn good job of telling us a player's contribution to run scoring. Here's a scale, so that next time one of our resident saber-dorks references OPS, you'll know if it's Pujolsian, or more NeifiPerezian:
.700: Mendoza Line
.750: Mediocre
.775: Average
.800: Above Average
.900: All-Star
1.000: MVP Contender
For any people who don't like sabermetrics, but really enjoyed statistics in high school math, here's a table of r-squared values for various traditional stats compared to runs scored. R-squared is the square of the correlation coefficient, so it runs from 0 (no linear relationship) to 1 (perfect linear relationship); it tells you how much of the variation in runs scored a stat accounts for, but not whether the relationship is positive or negative. This table comes courtesy of a previous post from JinAZ on this very site:
OPS 0.91
SLG 0.83
OBP 0.81
AVG 0.71
Hit# 0.68
HR# 0.52
BB# 0.35
K# 0.03
(SB-CS) 0.004
SB# 0.0005
Essentially, OPS accounts for about 91% of the variation in run production. Batting average does OK, at 71%, though not as well as either OBP or SLG. Interestingly, strikeouts show a slight tendency to have a positive effect on run production, although, as Red Menace says, that's a Sabermetrics 300 level class.
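If you want to run this kind of comparison yourself, here's a minimal sketch of how an r-squared value can be computed from two columns of numbers (say, team OPS and team runs scored). This is just the textbook Pearson correlation, squared; I'm assuming that's the calculation behind the table, and the function name `r_squared` is my own.

```python
from math import sqrt


def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance, up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("each list needs at least two distinct values")
    r = cov / sqrt(var_x * var_y)
    # Squaring discards the sign, so the result always lands between 0 and 1.
    return r * r
```

Feed it one list of a team stat and one list of runs scored, in the same team order. Note that a perfectly negative relationship also returns 1.0, since squaring throws away the sign.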