PETEY'S NEW STAT CORNER: REPs and BaRFs
"You never know what you're going to get out of the bullpen."
It was something my father would often nervously say when the starting pitcher was being yanked for a reliever. There was wisdom there: While John Franco, Frank Williams, and Rob Murphy could usually be counted on, most relievers were subject to this truth: They might well come into a game sucking out loud and barf up the whole game. Even the best blew it sometimes.
In 1987 I read Jim Bouton's classic Ball Four for the first time. His game-by-game results from his 1969 season with the Seattle Pilots seemed mostly random. The book included an appendix that showed how HE rated HIMSELF as a reliever, on a per-game basis. I thought he had a pretty good feel for how each outing should be categorized unto itself: EXCELLENT, GOOD, FAIR, or POOR. But these ratings were completely subjective, as he had no concrete rules for this rating system. The idea stuck with me for decades:
I WISH WE HAD A BETTER STAT FOR RELIEVERS....
Then last autumn, I wondered how I could develop a useful rating system for relief appearances based on Bouton's concept of categorizing each appearance as a separate event. Having played fantasy baseball since 1989, I knew that WHIP (walks + hits + hit batsmen, per inning pitched) equated to runners allowed, which is arguably the best measure of a reliever. The alternatives all have problems: one or two bad outings a year can make an ERA misleading, W-L record for a reliever is nearly useless, inherited runners score or are stranded perhaps mostly by luck, and the Saves stat has funky rules in combination with oddly conservative usage of one reliever by most teams. Following the findings of our Sabermetric brothers before us, we have learned that getting outs (or, for hitters, not making outs) is the most important aspect of the game. My conclusion was that looking at the runners allowed per appearance is pivotal to knowing how dependable a reliever is and what his tendencies are. It might be as useful as any other stat we have for rating relievers.
So I dove into the numbers. I made spreadsheets that detailed every one of the 8200 National League relief appearances from 2009 to make sure I had a good study sample size. Then I came up with a quantifiable definition of what made a relief appearance a quality one versus a fair one versus one that probably blew the whole game. I started drawing some hard lines in a few places, and I think I've come up with a good measuring stick:
REG (Reliever Effective Game):
A WHIP that is LOWER THAN 2.000
FAIR appearances:
A WHIP that is anywhere from 2.000 to 3.000
BaRF (Bullpen Relief Failure):
A WHIP that is HIGHER THAN 3.000
Here was my highly-subjective reasoning for these standards:
If you're a reliever, your job is to come in and effectively get outs. If you pitch 1 inning and allow only 1 baserunner, it's hard to argue that you didn't do your job. Worst case scenario: 1 run scores. If you pitch more than an inning, you may allow more than 1 runner to reach per inning, as long as overall your WHIP is below 2.000 for that appearance.
If you allow 2 or 3 baserunners per inning, you may escape most innings without allowing a run. You might not look good, but overall, you've likely done your job.
If you allow more baserunners than outs recorded, you've really blown it as a reliever. If you pitch 1 inning, that means you've allowed 4 baserunners, and have likely started a rally or allowed multiple runs to score.
These categories seem good to me with one exception: one-batter appearances. These categories work well once you've pitched to at least 2 batters, but we need to make an adjustment here, as outings where you pitch to only one batter and allow him to reach do not fit tidily into mathematical formulas: dividing by zero rips holes in the space-time continuum, you know. If you pitch to only one batter, and you walk him or allow a base hit, you haven't exactly screwed up the game, so we'll give you a FAIR rating. If you face one batter and retire him, though, then you HAVE done your job, and done it perfectly. So my standards for one-batter appearances:
Retire the batter, and you get a REG. Allowing the only batter you face to reach gets you a FAIR, not a BaRF. Face more than one batter, though, and the standard rules apply.
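For anyone who wants to try this on their own box scores, here is a rough Python sketch of the rules above. The function name and its three inputs are my own illustration, not part of the original spreadsheets, and it assumes "baserunners" means walks + hits + hit batsmen. It compares whole numbers instead of dividing, so an outing with zero outs never has to divide by zero.

```python
def rate_appearance(outs: int, baserunners: int, batters_faced: int) -> str:
    """Rate one relief appearance as 'REG', 'FAIR', or 'BaRF'.

    outs          -- outs recorded (3 per inning)
    baserunners   -- walks + hits + hit batsmen allowed
    batters_faced -- batters the reliever pitched to
    """
    # One-batter rule: retire him for a REG, otherwise FAIR (never a BaRF).
    # Assumption: "reached" means he counts as a baserunner allowed.
    if batters_faced == 1:
        return "REG" if baserunners == 0 else "FAIR"

    # WHIP = baserunners / (outs / 3). Cross-multiplying keeps everything
    # in integers and avoids dividing by zero when no outs were recorded.
    if 3 * baserunners < 2 * outs:   # WHIP below 2.000
        return "REG"
    if baserunners > outs:           # WHIP above 3.000
        return "BaRF"
    return "FAIR"                    # WHIP from 2.000 to 3.000


print(rate_appearance(outs=3, baserunners=1, batters_faced=4))  # REG
print(rate_appearance(outs=3, baserunners=3, batters_faced=6))  # FAIR
print(rate_appearance(outs=3, baserunners=4, batters_faced=7))  # BaRF
print(rate_appearance(outs=0, baserunners=1, batters_faced=1))  # FAIR
```

Note that "more baserunners than outs" and "WHIP above 3.000" are the same test, which is why the BaRF check needs no arithmetic at all.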
If you can see a player's season percentage of REG or BaRF games, there's some good information in there. So...
% of relief appearances with a REG is your REP (Reliever Effectiveness Percentage)
% of relief appearances with a BaRF is your BaRF% (Bullpen Relief Failure Percentage)
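The two percentages are just counts divided by total appearances. A small sketch, assuming each game has already been labeled "REG", "FAIR", or "BaRF"; the function name and the ten-game sample log are made up for illustration:

```python
from collections import Counter


def season_rates(ratings):
    """Return (REP, BaRF%) as percentages for a list of per-game ratings."""
    if not ratings:
        raise ValueError("need at least one relief appearance")
    counts = Counter(ratings)
    total = len(ratings)
    rep = 100 * counts["REG"] / total        # share of games rated REG
    barf_pct = 100 * counts["BaRF"] / total  # share of games rated BaRF
    return rep, barf_pct


# Made-up ten-game log: 6 REGs, 3 FAIRs, 1 BaRF.
sample = ["REG"] * 6 + ["FAIR"] * 3 + ["BaRF"]
rep, barf_pct = season_rates(sample)
print(f"REP {rep:.1f}%, BaRF {barf_pct:.1f}%")  # REP 60.0%, BaRF 10.0%
```

FAIR games count toward the total but toward neither percentage, so REP and BaRF% will usually add up to less than 100.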
National League average for a pitcher with at least 20 relief appearances (a number I pulled out of my ass) was:
REP = 62.9%
BaRF = 14.2%
I find that the most useful way to look at these numbers is by percentage or ratio. Here is a bar graph that shows the Reds 2009 bullpen.
League average is listed at the top, and you can see that the Reds pen did quite well last year. You can also see that looking at the size of a reliever's REG and BaRF shares could prove a useful tool for managers in determining when best to use which reliever. You might want to use your highest REP guy in a situation, but you also might be gambling on a lot of BaRFs if that same often-successful reliever has a tendency to be a feast-or-famine talent (not unlike Badroyo vs. Goodroyo).
Of note is that just under my 20-game qualifying threshold lies a man who pitched 19 games and led the league in bullpen suckitude. That man was the Reds' own Mike Lincoln (not a shocker), who not only had the NL's 16th-worst REP at 47.4% (>15 G), he had the league's worst BaRF: 36.8%. So if you called in Lincoln from the pen, you could expect him to spark your opponents' rally more than 1 out of 3 times. Ouch.
Here are the leaders in these categories:
As you can see, when you look at relievers with at least 20 appearances, you get a range in REP from 87.0% to 36.4%, and a BaRF range of 0.0% to 35.7%, with league averages again being REP 62.9% and BaRF 14.2%.
Of note is one of the best relievers in the NL last year, Josh Fogg. Yes, THAT Josh Fogg. He was rocked in his sole start in 2009, but as a reliever he came in 23 times to record 20 REGs and only 1 BaRF. That's solid performance.
Even more impressive was Brad Thompson's '09 campaign, in which he pitched in 24 games and recorded 19 REGs without a single BaRF outing.
Another way you may find these stats useful is as a ratio of REGs to BaRFs. Here are the league leaders in REG/BaRF ratio:
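The ratio is simple division, with one wrinkle: a reliever with zero BaRFs has no finite ratio. The sketch below is only one way to handle that, and treating a BaRF-free season as infinity (so it sorts to the top) is my assumption here, not a rule from the study. The counts used are the Fogg and Thompson numbers quoted above.

```python
import math


def reg_barf_ratio(regs: int, barfs: int) -> float:
    """REGs per BaRF; a season with no BaRFs is treated as infinite."""
    if barfs == 0:
        # Assumption: rank a BaRF-free reliever above everyone else.
        return math.inf
    return regs / barfs


print(reg_barf_ratio(20, 1))  # Josh Fogg: 20.0
print(reg_barf_ratio(19, 0))  # Brad Thompson: inf
```

A fussier version might break ties among the BaRF-free relievers by number of REGs.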
While another rating system based on appearances, using WAR, has come out recently, calculating it is bulky, whereas any single appearance is easy to figure using my system.
So whatcha think? Too sloppy? Not useful? Groundbreaking? Let's talk about it, and use it to our evil advantage in our everyday sabermetric pursuits.
What REP and BaRF research would you most like to see next? Vote below to help steer my research.
Go, Reds! They're my favorite team!
Comments
once a boobs man
always a boobs man.
Great work though, Petey. I think you’re on to something, especially something called BaRFs.
Set the gearshift to the high gear of your soul.
by Kevin Mitchell is Batman on Jul 20, 2010 3:18 PM EDT
With BaRF in the title
… I expected this to be much more immature. I’m disappointed.
Not knocking your work, but what about Holds versus Blown Saves. I find that useful for relievers. That is really what they are called on to do – not surrender the lead.
Though thinking it through now, that doesn’t tell you whether the reliever prevented a 1-run deficit from becoming a 5-run deficit — so you’re on to something!
The season doesn't start until the Cincinnati Reds take the field! Reclaim The Opener!!
Yeah, I'm not a big fan of holds/blown saves.
In a 1-0 game, Goose Gossage can come in with bases loaded and no outs and wriggle out of it unscathed, but if he strikes out the next 5 batters, allows a walk to the next guy and then the closer comes in and lets that guy score, Goose gets no hold and a blown save. INJUSTICE!!
I’m not much of a fan of saves, actually. :P
"Don't turn off the TV if we've still got bats in our hands." - Dusty Baker
by PeteyHendrix on Jul 20, 2010 4:27 PM EDT
I should say that this is definitely just stage 1 of this research.
While it may be interesting to look at, unless these stats have predictive value for (at least some) players, it’s not terribly useful. I hope to find that these stats have SOME predictive value, but that’s a very long spreadsheet from here. :)
"Don't turn off the TV if we've still got bats in our hands." - Dusty Baker
Relievers are really tough to objectively measure.
No traditional stats do a good job of it. For relievers, context is EVERYTHING, and if a stat does not consider context, it will not do a good job of rating relievers, IMO.
For example, coming in to face one batter. What if that one batter is Juan Castro? Is there any excuse for walking him? That would be a BaRF to me. But that’s also unfair, because it’s only one batter. However, if you come in with the bases loaded and walk him, unquestionably that is BaRFtastic. Who is coming up next? Is it Chase Utley? That makes it even worse if you fail to get Castro. So much context…
A long time ago, I came across an article where the author tried to devise a system for rating relief appearances. The reliever was given credit for the difficulty of the situation. A runner on 3rd was worth 4 points if stranded, for example (or 2 points with two outs). So if he got three outs while stranding that runner he accumulated 10 points. If the runner scored it is worth -1, or -2 from 2B, or -3 from 1B assuming no outs to begin with (I am making up the numbers, but the idea is clear, I think). It was the best idea I’ve heard of to evaluate relievers, but still did not take into account the quality of opposing hitters or the game situation, so it still leaves something not insignificant to be desired.
There is also WPA now, which is somewhat halfway decent for relievers.
At any rate, I’m not digging the REP right now because too much context is ignored. Theoretically a reliever could run a WHIP of 1.60 for a season and be rated well by this stat as long as he’s really consistent in only giving up 1-2 runners EVERY inning (Coco), and the stat doesn’t account for how hard a guy gets hit.
This seems like a lot of work buddy. I wish you the best of luck with future stages of research. Stage 1: BaRF. Stage 2: … Stage 3: good relief stat. We need one.
You're right in that this does not include context, which would be better.
And if it doesn’t demonstrate some degree of predictability, it ain’t worth much other than it’s kinda kewl to look at the 1st graph. :P
However, FWIW, in your example about the bases loaded and issuing a walk, you can’t really blame the relief pitcher for the run – the fault lies more on the preceding pitcher(s).
I like WPA, and the point system you discuss seems to have some merit, but both are terrifically cumbersome to calculate, for what that’s worth. We’ll see what comes of a larger study re: predictability, Thanks for taking the time to read and discuss! :)
"Don't turn off the TV if we've still got bats in our hands." - Dusty Baker
by PeteyHendrix on Jul 21, 2010 7:12 PM EDT
The cumbersome bit is why that system never caught on, I am certain
This post also reminds me of another idea I saw a long time ago. With inherited runners, blame is assigned to both the starter and the reliever depending on their portion of the damage in letting him score. I don’t remember the details of how it was done, but it didn’t catch on either, because it would involve recalculating ERAs for all the pitchers, and no one cares about ERA for relievers anyway
i think this fangraphs article is onto something similar based on WPA and i kinda liked it
http://www.fangraphs.com/blogs/index.php/shutdowns-meltdowns/
"Now onto more important things: Punching Errorlando Cabrerror in the fucking tits." -Geki
As mentioned in the post, I like this WAR-based stat...it's just terrifically cumbersome.
"Don't turn off the TV if we've still got bats in our hands." - Dusty Baker
by PeteyHendrix on Jul 23, 2010 2:35 PM EDT
Thanks for including boobs on the poll...
It's about time this blog got back to sabermetrics
Though some might argue that "best utility player" is a contradiction in terms.
BuubaFan...
Interesting stuff, Petey
and I definitely think you should dive into the AL for 2009 and see what results you find
Thanks!
"Don't turn off the TV if we've still got bats in our hands." - Dusty Baker
by PeteyHendrix on Jul 26, 2010 12:55 AM EDT
