Hey, thanks for the update, bbrool. I think you meant Paul Johnson, not Robert Johnson, the gifted blues guitarist.
As you probably know, Paul Johnson's Estimated Runs Produced formula (also known as NERP) has been among the most accurate in the business at assigning value to individual players. Quoth Wikipedia: "ERP was almost as accurate as RC at measuring team runs, it did not succumb to RC's infamous problems at the individual level, and its values stacked up well when compared to Pete Palmer's linear weights formula".
So, for people buying the SOM ratings who want an easy way to generate an assigned value for each player, I would recommend Dean's paper based on the NERP formula.
But it's still an "empirically driven" formula. To quote Wikipedia again: "However, like any linear formula, there is no guarantee that it will work outside of the context in which it was developed", which is basically the years 1955-1998.
In other words, the assigned run values for events (0.50 for singles, 1.44 for home runs, etc.) are good estimates for environments similar to baseball as played on average in the 1950-2000 era, but they can be off the mark in extreme environments. In environments where pitchers dominate, for example, 0.50 for singles is too high: in such environments, 6 singles will not, on average, generate 3 runs, because the singles will more likely be scattered across innings. Conversely, in a Coors-type environment, 0.50 is not high enough for singles: there, singles tend to cluster, so runs come in bunches, and a single is likely to be worth 0.55 or 0.60 runs.
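To make the limitation concrete, here is a minimal sketch of how a fixed linear-weights formula assigns value. The 0.50 (single) and 1.44 (home run) weights are the ones quoted above; the other weights are illustrative placeholders I made up, not the actual NERP coefficients:

```python
# Sketch of a fixed linear-weights run estimate. The 0.50 (single) and
# 1.44 (home run) weights are from the discussion above; the other
# weights are illustrative placeholders, NOT the actual NERP values.
WEIGHTS = {"1B": 0.50, "2B": 0.75, "HR": 1.44, "BB": 0.33, "OUT": -0.10}

def linear_runs(events):
    """Estimate runs as a fixed-weight sum of event counts."""
    return sum(WEIGHTS[e] * n for e, n in events.items())

# Six singles -> 6 * 0.50 = 3.0 estimated runs, regardless of run
# environment -- which is exactly the limitation discussed above.
print(linear_runs({"1B": 6}))  # 3.0
```

The point is that the weights never move: Coors or a Kershaw start, six singles always come out to 3.0 runs.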
That's one reason I believe a rating system based on linear weights generated from a run expectancy matrix will produce more accurate results: you can adapt your matrix to virtually any environment.
Yesterday, I posted the link to the run expectancy matrix based on 2014:
http://www.baseballprospectus.com/sorta ... id=1657937
You can see that when an average team started a typical 2014 inning, no one on, no outs, it was expected to produce 0.4552 runs. A single happens, and the expectancy is boosted to 0.8182 runs. The difference, 0.36 runs, is the contribution of that specific single, in that specific environment. With bases empty and two outs, a single is worth less: 0.11 runs, because the expectancy only increases from 0.086 to 0.195. With bases loaded and two outs, the clutch single** has produced two runs, as we all know, but since the run expectancy has slightly dropped (we went from bases loaded to men on first and third), the net value of this clutch single, in this particular context, is 1.80 runs.
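All of this bookkeeping reduces to one line: an event's value in a given situation is the change in run expectancy plus the runs that actually scored. A minimal sketch, using only the 2014 figures quoted above:

```python
def event_run_value(re_before, re_after, runs_scored=0):
    """Run value of an event = (change in run expectancy) + runs scored."""
    return (re_after - re_before) + runs_scored

# 2014 figures quoted above:
# bases empty, 0 out -> runner on first, 0 out
print(round(event_run_value(0.4552, 0.8182), 2))  # 0.36
# bases empty, 2 out -> runner on first, 2 out
print(round(event_run_value(0.086, 0.195), 2))    # 0.11
# bases loaded, 2 out, single scoring two runs: the expectancy drops by
# about 0.20 (bases loaded -> first and third), so the net value is
# roughly 2 - 0.20 = 1.80 runs (matrix values not reproduced here).
```

The same one-liner handles the clutch single**: the two runs scored are credited, then the small drop in expectancy is subtracted.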
Once you do this for all 24 base-out situations (it's fairly easy in Excel), you need to know how frequently each situation occurs (all available on the web), and you can generate the overall value of singles (in 2013, singles were worth 0.47 runs; I still need to update the values for 2014). But as my last example illustrated, you are not forced to use the "baseball events". You can instead use the "strat events". You can calculate the value of single* (0.37 runs) and of single** (0.51). (On offensive cards, it's not possible to distinguish single* from single** because of the "increased running options", but that option doesn't affect the results of defensive charts, which list si* and si** separately.) You can calculate the real value of gbA (-0.33) instead of estimating the value of double plays; with gbA, runners on second can advance to third when the ball is hit toward the right side of the infield. GbC/flyA are worth -0.20 runs, still negative, but not as negative as strikeouts (-0.27). The value of PB/WP/BK can be differentiated from the value of the T-rating (the latter never advances runners from first base). Basically, it's a much more flexible system than NERP.
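The aggregation step described above is just a frequency-weighted average of the per-state values. A sketch with only two states and made-up frequencies, purely to show the arithmetic (a real table covers all 24 base-out states, with frequencies from play-by-play data):

```python
# Overall linear weight of an event = weighted average of its run value
# in each base-out state, weighted by how often that state occurs.
# The two states, frequencies, and values below are HYPOTHETICAL,
# illustrating the arithmetic only; a real table has 24 states.
state_freq   = {"empty_0out": 0.55, "empty_2out": 0.45}  # must sum to 1
single_value = {"empty_0out": 0.363, "empty_2out": 0.109}

overall = sum(state_freq[s] * single_value[s] for s in state_freq)
print(round(overall, 3))
```

Swap in a different matrix (Coors, a dominant pitcher) and the per-state values change, so the overall weight changes with it; that is the flexibility being claimed over a fixed formula.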
I made all these estimates based on the average 2013/2014 season, but you can instead select a matrix adjusted to a Coors-type environment (again, I believe they are available on the web, although I didn't look for them). And you even have the option to adapt the run expectancy environment for individual pitchers.
When I compared the linear weights generated by this system with those generated by NERP in a typical environment, the correlation was over 0.95. But with extreme pitchers, the differences were visible to the naked eye. With Kershaw on the mound, the weight assigned to singles dropped to 0.39, instead of 0.47 or 0.50 (depending on whether you use Palmer's linear weights or NERP). I must add, though, that the values of outs change as well, so the bottom-line difference is not as wide as it looks.
One last advantage: the results generated by these linear weights are true "run estimates". So once you have calculated the runs generated by an offensive player and adjusted them for injury risk and defense, you can truly estimate the value of running or stealing. For example, if you simmed on the SOM-cd that runners rated 1-17 generate on average 12 runs, you can simply add those 12 runs into the system: they are in the same units as those obtained from the run expectancy matrix.
A few caveats: the extra PAs generated by players with better on-base skills are not taken into account, so an adjustment is needed there. And one would still have to accurately estimate the real usage of pitchers, which is a heck of a job.