The new modified NERP
Posted: Mon Nov 02, 2015 9:28 am
In John's most recent blog, he refers to Dean Carrano's offense vs defense article. In this article, Dean proposes a simple formula by which one could establish the value of a player as compared to another.
viewtopic.php?f=15&p=5559635&sid=208640c3ac31e037c9c8dc1bf5c89327#p5559635
As John says, Dean's formula is very useful to compare one player with another, and its simplicity is a strength.
If you like simplicity, I guess you should stop here!! The purpose of this post is to show the work I did over the last year or so to improve the formula and generate a new modified NERP, with the goal of getting the best rating system possible.
Dean's proposal is summed up by this simple formula:
(TB * .318) + ((BB+HBP-CS-GIDP) * .333) + (H * .25) + (SB * .2) - (AB * .085) minus DEFENSE (as valued by his charts at the end of his article).
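For those who prefer to see it as code, here is a minimal sketch of Dean's formula as quoted above. The function name, the argument names, and the stat line are my own and are made up for illustration; the defense term is simply whatever number you read off Dean's charts.

```python
def dean_nerp(tb, bb, hbp, cs, gidp, h, sb, ab, defense=0.0):
    """Dean's formula as quoted in the post; defense is subtracted."""
    return (tb * 0.318
            + (bb + hbp - cs - gidp) * 0.333
            + h * 0.25
            + sb * 0.2
            - ab * 0.085
            - defense)

# Hypothetical stat line, invented purely for illustration.
example = dean_nerp(tb=250, bb=60, hbp=5, cs=4, gidp=12, h=150, sb=10, ab=550)
print(round(example, 3))
```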
As I've already written elsewhere, once you round the decimals, the formula above is mathematically the same as the one below, which will look more familiar to those of you who have been playing with linear weight formulas:
=(bb+hbp)*0.33+si*0.48+do*0.80+tr*1.12+hr*1.44 - outs * 0.085 - GIDP*0.418 + (SB * .2) - cs*.333 minus defense
and it's roughly equivalent to the following one (obtained by changing GIDP to gbA), which has the great advantage of matching the information that is present in the SOM rating files:
=(bb+hbp)*0.33 + hits*0.16+ tb*0.32 - outs * 0.085 - gbA*0.08 + (SB * .2) - cs*.333 minus defense
(Just to be clear: in this formula a gbA is worth -0.165 in total, that is -0.085 - 0.08, since gbA are also counted among the outs.)
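The equivalence between the three forms can be checked event by event. This is only a sketch with names of my own choosing: it computes the net value of one single, double, triple or home run under Dean's form (where a hit is also an at-bat and so pays the .085) and under the hits/tb form, and compares both to the linear weights.

```python
AB_COST = 0.085  # every at-bat is charged this much in Dean's form

def dean_event(total_bases):
    # One hit in Dean's form: TB * .318 + H * .25 - AB * .085
    return total_bases * 0.318 + 0.25 - AB_COST

def hits_tb_event(total_bases):
    # One hit in the hits/tb form: hits * 0.16 + tb * 0.32
    # (outs are charged separately, so a hit pays nothing extra)
    return 0.16 + total_bases * 0.32

linear_weights = {"si": 0.48, "do": 0.80, "tr": 1.12, "hr": 1.44}
bases = {"si": 1, "do": 2, "tr": 3, "hr": 4}

for event, weight in linear_weights.items():
    b = bases[event]
    print(event, round(dean_event(b), 3), weight, round(hits_tb_event(b), 2))

# Double plays: a GIDP is an at-bat that costs .333 on top of the .085,
# and a gbA is an out that costs 0.08 on top of the 0.085.
gidp_dean = -(0.333 + AB_COST)
gba_total = -(0.085 + 0.08)
print(round(gidp_dean, 3), round(gba_total, 3))
```

The Dean column differs from the linear weights only in the third decimal, which is the rounding mentioned above.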
Take note that this formula has the form:
offense + running - defense
This formula is simple, but it has several limitations, many of which are acknowledged by Dean himself. Some are self-evident:
¬ it cannot easily compare players at different positions
¬ The formula does not consider player usage
Dean's formula is great if you're undecided about who should start the game, but it's less suited to deciding whom to draft, since you're mostly interested about
Taking the last two points together, the ideal formula should have the following shape:
offense + running - (defense + positional adjustment), all adjusted by playing time
¬ clutch is treated as meaningless, when it is not
Even if you think clutch has little value, you'll concede that it's not worth zero. I already wrote on this here, so I won't repeat the argument. If interested, please see:
viewtopic.php?f=17&t=639124
¬ it does not provide a defensive rating for catchers
I already wrote on this topic. Please see The catcher database
viewtopic.php?f=17&t=639093
¬ speed is restricted to stolen bases and caught stealing. The ability to take an extra base, or to be held at first, is not taken into consideration. I've also written on this, see
viewtopic.php?f=5&t=639129&start=10
¬ Except for double plays, all outs are given the same value. Ideally, you'd like to have a rating system that values gbC, flyB, and lineouts differently, since their impact is different (admittedly, these are subtle differences). On that topic, gbA is given the value of a GIDP, but they're not the same thing. A gbA can generate positive value (for example, it can advance runners in some contexts). So the value of gbA is not as bad as -0.165.
Then, there are a few more limitations that need further explanation:
¬ Dean's formula rests on the false assumption that every player will get the same number of rolls, that is 216 PA, or something close to 648 PA over the course of a full season. Intuitively, this makes perfect sense, as every player has 108 chances on his card and 108 chances on the pitcher's card/defensive charts. More formally, the value of a card, over the course of a season, will be determined by:
A) What's on the card (which is given by offensive NERP)
B) How many times you'll read the card (which Dean assumed constant for every player, other things such as injury risk being equal).
In fact, B is not constant across the set. What is constant is the number of outs a team will make over the course of a season. Every team has 162 games x 27 outs to work with, so every team will have 4350 outs or so. If team A has a better on-base than team B, then team A will generate more rolls (will throw the dice more often) before it reaches 4350 outs. And since teams are made up of players, players with higher on-base are the ones responsible for the extra value generated by rolling the dice more often.
Consider Reddick and Turner. If I'm not mistaken, with BP HR=10, Reddick and Turner have the same NERP vs rhp. But Turner will make 20 fewer outs for every 216 PA (I leave aside lefty/righty matchups). So over the course of a full season, having Turner instead of Reddick will yield 60 more rolls for the team with Turner. There will be 3-4 additional readings on Turner's card, 27 additional readings on his teammates' cards, and 30 additional readings on the pitcher's card/defensive charts. By plugging in league average values (assuming an 80M), Turner will generate almost 4 more NERP than Reddick for his team over the course of a season. Turner is underestimated if you use Dean's formula.
Final issue: the weights (the values) plugged into the NERP formula were generated by analyzing baseball stats. Strat is a very close approximation of baseball, but it also has its own logic, which can sometimes be computed with more accuracy than baseball. The best example of this is clutch. Clutch is not in the original NERP formula because in real life clutch is not captured by the statistics, but it's an important part of strat. And the difference between baseball and strat is subtle but undeniably present in other contexts. Consider the value of outs, which I touched upon above. In strat, it's much easier to distinguish the value of a gbC from the value of a gbA. To do so, I generated matrices of linear weights and probabilities of events, and I calculated for each "strat" event its weight, or value. I would need to give more details in another post, but the net result is that some events appear to be weighted differently in strat. For example, doubles seem to have a value closer to singles than in baseball.
Putting it all together, here is what I have for the offensive value:
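The roll counts in the Reddick/Turner example can be reproduced with a few lines of arithmetic. This is only a sketch of the counting used in the post: I assume that each out saved is one extra roll, that half of the rolls are read on a hitter's card and half on the pitcher's card/defensive charts (108 chances each), and that the hitter-side rolls are spread evenly over a nine-man lineup. The variable names are mine.

```python
PA_PER_BLOCK = 216
SEASON_PA = 648
blocks = SEASON_PA // PA_PER_BLOCK           # 3 blocks of 216 PA in a season

outs_saved_per_block = 20                    # Turner vs. Reddick, figure from the post
extra_rolls = outs_saved_per_block * blocks  # extra rolls for Turner's team

# Assumption: half of the rolls land on hitters' cards, half on the
# pitcher's card / defensive charts.
hitter_side = extra_rolls / 2
pitcher_side = extra_rolls / 2

# Assumption: hitter-side rolls are shared evenly by nine lineup spots.
own_card = hitter_side / 9                   # Turner's own card
teammates = hitter_side * 8 / 9              # the other eight hitters

print(extra_rolls, round(own_card, 1), round(teammates, 1), pitcher_side)

# Season outs available to a team: 162 games of 27 outs each,
# close to the "4350 or so" used in the post.
season_outs = 162 * 27
print(season_outs)
```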
offensive value=(walks/hbp)*0.33+hits*0.21+tb*0.28+hr*0.1-regular_outs*0.1-gbC/flyA*0.04-gbB/flyB*0.07+stadium adjustment+weak adjustment+clutch*1.3*0.116-gbA*0.15 + extra points generated by the extra on-base: (81-outs)*(8/9*0.15+1/9*0.15*value/24)
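To see what the new hit weights do, here is a small sketch comparing the gross value of each hit under the earlier hits*0.16 + tb*0.32 form and under the hits*0.21 + tb*0.28 + hr*0.1 terms of the formula above. It looks only at the hit terms; the outs, clutch, stadium, and extra on-base terms are left out, and the function names are mine.

```python
def old_hit_weight(bases):
    # Earlier form: hits * 0.16 + tb * 0.32
    return 0.16 + 0.32 * bases

def new_hit_weight(bases):
    # New form: hits * 0.21 + tb * 0.28, plus hr * 0.1 for home runs
    hr_bonus = 0.1 if bases == 4 else 0.0
    return 0.21 + 0.28 * bases + hr_bonus

for name, bases in [("si", 1), ("do", 2), ("tr", 3), ("hr", 4)]:
    print(name, round(old_hit_weight(bases), 2), round(new_hit_weight(bases), 2))

# Gap between a double and a single under each set of weights.
old_gap = old_hit_weight(2) - old_hit_weight(1)
new_gap = new_hit_weight(2) - new_hit_weight(1)
print(round(old_gap, 2), round(new_gap, 2))
```

The gap between a double and a single shrinks from 0.32 to 0.28, which is the "doubles closer to singles" effect mentioned above.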
I need to stop here.