by MARCPELLETIER » Mon Jul 27, 2009 11:58 pm
I just wanted to underline the effort from Dean and congratulate him for applying to Strat a model based on some of the best tools available to sabermetricians.
I truly appreciated the simplicity of the article, given that it discusses topics that can be surprisingly complex.
One limitation I see is that we base our evaluations of Strat cards on the values generated by real baseball analyses. I believe the next step is to base the evaluations of Strat cards on the values generated by Strat models.
One easy example is clutch. In real life, there are virtually no clutch effects, whereas in Strat, the effect of clutch is obvious. A perfect evaluation of Strat cards MUST include the value of clutch.
Another, less obvious example is the value of events themselves. To give one example, the value of a walk, based on the NERP formula, is 0.33. This value is estimated from real games. In real baseball, the frequency of walks is not random. There is roughly a 40% increase in the frequency of non-intentional walks with men on base and first base open, compared to other situations. In part, this strategy from pitchers makes sense, because such situations are the ones in which the value of a walk is the lowest. For example, the value of a walk with two outs and a man on third is only 0.18, mostly because such walks don't move any runner and put on base a runner who is not likely to score, considering that there are already two outs. In comparison, the value of a walk when leading off an inning is 0.41, and the value of a walk with the bases loaded is obviously exactly 1 run. The value of 0.33 included in the NERP formula is in fact the sum of the following products: (the value of a walk in a given situation X the frequency of that situation).
In Strat (the "pitch around" set aside), the frequency of walks is "roughly" random: they happen equally in all situations. You'll find fewer walks which are worth 0.18 in Strat, but you'll find more walks which are worth 0.41 or 1.00. The net result is that walks are worth more than 0.33 in Strat. I haven't calculated it, but it would be easy to do, and I wouldn't be surprised if it's closer to 0.37 than to 0.33.
But then again, maybe not, because of the "pitch around" rule, which we cannot set aside. I'm not sure of all the adjustments made by Strat's creators, but perhaps they have slightly reduced the frequency of walks in other situations (through the MAX rules) in order to increase it through the PITCH AROUND rules. That could bring the value of walks back to roughly 0.33.
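To make the reasoning of the last three paragraphs concrete, here is a minimal Python sketch of that weighted sum. The three walk values (0.18, 0.41, 1.00) are the ones quoted above. The function name and, above all, the frequencies are made-up placeholders, not measured data, so the results only show the direction of the effect. They will not reproduce 0.33, which would require every base-out situation.

```python
def average_walk_value(situations):
    """Weighted average of walk values.

    `situations` is a list of (value_of_walk, relative_frequency) pairs.
    Frequencies are normalized here, so they need not sum to 1.
    """
    total_freq = sum(freq for _, freq in situations)
    return sum(value * freq for value, freq in situations) / total_freq


# Walk values come from the post; the frequencies are HYPOTHETICAL placeholders.
# "Real life": pitchers walk more hitters in the low-value (0.18) situation.
real_life_mix = [(0.18, 0.30), (0.41, 0.60), (1.00, 0.10)]
# "Strat-like": walks spread more evenly, so fewer 0.18 walks, more 0.41 and 1.00.
even_mix = [(0.18, 0.20), (0.41, 0.68), (1.00, 0.12)]

print(average_walk_value(real_life_mix))
print(average_walk_value(even_mix))
```

With real situation frequencies from Strat (including the pitch-around effect) plugged in, the same function would give the Strat-specific value of a walk.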
The only way we could be certain is to generate DATA, for example through the use of the CD-ROM, and calculate empirically, by ourselves, the value of walks, and of singles, and of doubles. Alternatively, we can build a model that reflects what goes on in Strat, and generate NERP formulas based on such a model.
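One common way to do the empirical calculation is to credit each play with the change in run expectancy plus the runs that scored on it, then average by event type. The sketch below assumes a play log already exists in that shape. The record layout, the function name, and the numbers in the example are all hypothetical, and building the run-expectancy table from the generated games is a separate step not shown here.

```python
from collections import defaultdict


def empirical_event_values(plays):
    """Average run value per event type.

    `plays` is an iterable of (event, re_before, re_after, runs_scored),
    where re_before / re_after are the run expectancies of the base-out
    state before and after the play (assumed to be precomputed).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event, re_before, re_after, runs_scored in plays:
        # Run value of one play: change in run expectancy plus runs scored.
        totals[event] += re_after - re_before + runs_scored
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}


# HYPOTHETICAL log entries, only to show the expected input shape.
sample_plays = [
    ("BB", 0.50, 0.90, 0),  # leadoff walk
    ("BB", 2.00, 2.00, 1),  # bases-loaded walk: same state, one run in
    ("1B", 0.50, 0.90, 0),  # leadoff single
]
print(empirical_event_values(sample_plays))
```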
Of course, one may simply say that we want a rough estimate of the Strat cards, and that the values estimated by "real" analysts are sufficiently close for this estimation. I believe Dean makes that point in the article.
But think of all the possible events where we are perhaps making slight underestimations by relying on real-life values: the value of singles in Strat, considering the new running rules, which sharply increase the possibility of retiring a runner on the basepaths compared to real life. The value of doubles, particularly in ATG leagues, which are loaded with centerfielders with negative arms. The relative value of home runs in small ballparks, which is correlated with the presence of the best pitchers, both of which make the baseball environment entirely different from "real life". The value of wp, which happen in Strat as often with men on third base as with men on first base (in real life, of course, it's much easier to get to second base than to run home). The value of "super-relievers" throwing 200 strong innings. The value of clutch. The value of gbA vs gbC.
This said, Dean has already started such an analysis, with his estimation of OF arms and catcher arms based on Strat. A nice start.
*********
On the controversy raised by maligned, on whether or not Dean sufficiently penalizes players who make outs (and thereby prevent their teams from getting additional at-bats):
My understanding is that the answer depends on which formula Dean is using. The first formula, which includes a -0.25 value for outs, does include the detrimental value of NOT getting on-base. The second formula, which includes only a -0.085 value for outs, does not.
[b:99606de380]EDIT: THE PREVIOUS PARAGRAPH AND THE ARGUMENT FOLLOWING ARE ILL-FOUNDED. THE TWO DIFFERENT FORMULAS EXPRESS TWO DIFFERENT MEASURES THAT HAVE NOTHING TO DO WITH THE PROBLEM RAISED BY MALIGNED AND BBROOL[/b:99606de380]
FIRST FORMULA
BR (BATTING RUNS) = .47 * SINGLES + .78 * DOUBLES + 1.09 * TRIPLES + 1.4 * HOME RUNS + .33 * (WALKS + HBP) - .25 * OUTS
SECOND FORMULA (NERP)
NERP (New Estimated Runs Produced) = .318 * TB + .333 * (BB + HBP - (gbA * .1875)) + .25 * H - .085 * AB
which is equivalent to:
NERP = .48 * SINGLES + .80 * DOUBLES + 1.12 * TRIPLES + 1.44 * HOME RUNS + .333 * (WALKS + HBP) - .085 * OUTS - .333 * (gbA * .1875).
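As a quick check that the two ways of writing NERP agree, here is a small Python sketch. It assumes that OUTS means AB minus hits and that gbA is a count of gbA results. The function names and the stat line in the example are hypothetical. The expanded coefficients (.48, .80, 1.12, 1.44) are rounded, so the two versions differ slightly.

```python
def hit_coefficient(bases):
    """Unrounded per-hit weight implied by the compact formula:
    .318 per total base, plus .25 per hit, minus .085 for the at-bat."""
    return 0.318 * bases + 0.25 - 0.085


def nerp_compact(singles, doubles, triples, homers, bb_hbp, ab, gba):
    """NERP as written with TB, H and AB."""
    hits = singles + doubles + triples + homers
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return (0.318 * total_bases + 0.333 * (bb_hbp - gba * 0.1875)
            + 0.25 * hits - 0.085 * ab)


def nerp_expanded(singles, doubles, triples, homers, bb_hbp, ab, gba):
    """NERP as written per event, with OUTS assumed to be AB - hits."""
    outs = ab - (singles + doubles + triples + homers)
    return (0.48 * singles + 0.80 * doubles + 1.12 * triples + 1.44 * homers
            + 0.333 * bb_hbp - 0.085 * outs - 0.333 * (gba * 0.1875))


# HYPOTHETICAL stat line, only for illustration.
line = dict(singles=100, doubles=30, triples=5, homers=20,
            bb_hbp=60, ab=550, gba=10)
print(nerp_compact(**line), nerp_expanded(**line))
print([round(hit_coefficient(b), 3) for b in (1, 2, 3, 4)])
```

The small gap between the two totals comes only from rounding the hit weights to two decimals.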
Roughly, the value of any event can be broken into three parts: the value of getting on base, the value of advancing runners, and the "inning-killer" value. For an out, the three parts are respectively: 0 (at least, when double plays are considered separately); -0.10 (a negative value is expected here, because some "outs" result in the lead runner being retired, for example when the runner on third is thrown out at home plate, leaving the batter safe at first); and -0.16 (the cost of reducing the chances that other hitters get to bat and reach base).
see http://www.tangotiger.net/rc2.html
The second formula ignores the last part of the negative value, but the first formula incorporates both negative values, thus yielding a value of roughly -0.26 per out.
Last edited by
MARCPELLETIER on Wed Aug 05, 2009 12:10 am, edited 1 time in total.