saber stuff from old forums

Posted: Tue Nov 21, 2006 4:25 pm
by cummings2
I copied this from the old forums. I thought of posting it here since the old forums are gone. The text comes from my phone's limited capacities, sorry if the formatting is not right. Also either because of TSN or my phone I'm going to have to split the original in two posts.

------------Pt 1.

peteAll-Star
Posted February 08, 2003 11:55 AM
since we have a healthy chat going about probability, I was wondering if anyone ever looked into sabermetrics??

I have just started looking into it and find it fascinating but I was hoping for some input for a good reference (bill james?)
 
JaserDAll-Star
Posted February 08, 2003 12:05 PM
I'm just starting to learn more and more about sabermetrics. As a huge Red Sox fan, having Bill James in the front office has motivated me to get in learning gear.

As an example of the power of saber, here is a post I am copying from a poster (Eric Van, who deserves huge credit for his saber insights) on a Red Sox board I belong to regarding the maximization of the Sox lineup next year.

Real interesting stuff (thanks Eric):

Let's see if we can figure out from numbers and game strategy what the best batting order is. We'll look at 10 players -- Mueller, not Shea, and both Millar and Ortiz.

Who needs to be knocked in?

The stat here is (H - HR + BB + HB - CS - GDP) / PA. If you don't see why you want to deduct HR (or GDP), we'll talk.
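The stat above can be written as a small function. This is a minimal sketch: the function name and the sample season line are made up for illustration (not real player data), and the comment on why HR, CS and GDP are subtracted is only one reading of the post.

```python
def on_base_to_be_knocked_in(h, hr, bb, hb, cs, gdp, pa):
    """(H - HR + BB + HB - CS - GDP) / PA, as given in the post."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    # Presumably: a home run leaves nobody on base to be knocked in,
    # and a CS or GDP removes a runner who had reached.
    return (h - hr + bb + hb - cs - gdp) / pa


# Hypothetical season line, for illustration only.
rate = on_base_to_be_knocked_in(h=150, hr=20, bb=70, hb=5, cs=3, gdp=10, pa=640)
print(f"{rate:.3f}")
```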


             3 yrs vs. RHP   2002
Giambi                       .347
Mueller                      .313   (.357 in '99)
Ramirez      .326            .358
Damon        .329            .320
Garciaparra  .320            .290
Walker       .308            .309   (Fenway should boost it)
Millar       .302            .305
Ortiz        .308            .282
Nixon        .305            .285
Varitek      .293            .284

AL ave.: .274

I regard Mueller and Giambi's 3-year splits vs. RHP as misleading, so I've left those columns blank.

This is actually a subjective ranking. What's very clear is that there are two or three guys who will be on base to be knocked in a ton: Lil' G, Mueller (if he's on his game), and Manny (if he continues like last year). Conversely, there are three guys who you don't want to bother putting before the RBI hitters: 'Tek, Nixon, and either Millar or Ortiz. Just below the top group is Damon and a regular Manny, below them Nomar (assuming a partial comeback) and Walker (because I expect his OBP to rise from playing in Fenway).

Who can knock 'em in?

This is new stuff. My Contextual Runs metric includes an extremely accurate projection for the percentage of runners waiting on base that get knocked in. It's basically 10% of them plus a SA-like percentage, except the weights are different (1B is .48, 2B is .83 and 3B or HR is 1.06) and the denominator is outs made, not AB or PA.

So I've calculated that percentage for everybody. This time I didn't count CS among the outs but did include GDP. What this tells you is the percentage of waiting runners that would get knocked in by a team composed of nine of the players. (I thought a ton whether that would give the results I was looking for and concluded that it would; again, I'm open for discussion.)
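The knock-in percentage described above can be sketched as a function. This is only a reading of the post's description ("basically 10% ... plus a SA-like percentage"), not Eric Van's actual Contextual Runs code; the function names, the way outs are counted here (AB - H + GDP, with CS left out as the post says) and the sample numbers are assumptions for illustration.

```python
def outs_made(ab, hits, gdp):
    """Outs as assumed here: batting outs plus the extra out on a GDP."""
    return ab - hits + gdp


def knock_in_pct(singles, doubles, triples, hr, outs):
    """Approximate share of waiting runners that get knocked in:
    0.10 + (.48*1B + .83*2B + 1.06*(3B + HR)) / outs."""
    if outs <= 0:
        raise ValueError("outs must be positive")
    weighted = 0.48 * singles + 0.83 * doubles + 1.06 * (triples + hr)
    return 0.10 + weighted / outs


# Hypothetical season line, for illustration only.
singles, doubles, triples, hr = 100, 30, 2, 25
hits = singles + doubles + triples + hr
outs = outs_made(ab=550, hits=hits, gdp=12)
pct = knock_in_pct(singles, doubles, triples, hr, outs)
print(f"{pct:.3f}")
```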


             3 yrs vs. RHP   2002
Ramirez      .314            .345
Garciaparra  .306            .278
Millar       .272            .272
Damon        .256            .240
Walker       .247            .245
Nixon        .248            .226
Ortiz        .236            .243
Giambi                       .236
Varitek      .211            .206
Mueller                      .208

AL ave.: .217


This is surprising, but on reflection it makes sense. You can see that the much-maligned BA is actually meaningful; it gives you a better idea of a player's RBI capabilities than SA. Guys like Dwight Evans with relatively low BA's but high OBP's thanks to tons of walks, and lots of power, should be hitting just in front of the very best hitters, not just behind them.

Jeremy Giambi wouldn't be a bad #5 hitter, but he would not drive in a ton of runs because he'd be walking so often. He would essentially be passing the RBI buck, much of the time, to whomever was next. That wouldn't be bad, but it would be a misuse of his skills. His biggest potential value to the team is as a guy for Manny and Nomar to knock in; his power is then gravy. And batting him higher in the order gets him up more often, never a bad idea for a hitter this good.

Putting it together so far.

Most of these hitters are actually neutral in terms of where they rank in these lists. They are equally suited to being a tablesetter or an RBI guy; some are just better at both than others. Manny, Damon, Walker, Nixon or Ortiz, Varitek -- in that order (although of course any kind of breakout year for the last three could vault them past Walker).

There are four hitters who are not neutral. Bill Mueller's value is totally as a tablesetter (I didn't even bother to calculate an RBI value for his last good full year of '99). Surprisingly, Jeremy Giambi is also much better suited to this role. Nomar and, even more so, Kevin Millar, have considerably more value as RBI guys (let's do the lineup with Millar first and then re-do it with Ortiz instead).

This suggests a first draft of:

Mueller
Giambi
Manny
Nomar
Millar
Damon
Walker
Nixon
Varitek

There are several problems with this. First, Damon is used to hitting leadoff. Moving a guy to a position where he is uncomfortable, causing a drop in production, is not worth a slight gain in lineup efficiency. And Damon is 4th on the table-setting list, way above average, so he's certainly not going to be wasted in that spot. So let's move him into leadoff, drop Mueller to 9th where he is still setting the table but is totally out of the spotlight, and move the last three guys up one. That's actually a heck of a good lineup (and Walker has hit 6th a lot). But I want to look at one more factor.

Who benefits from a runner on 1B?

OPS difference, last 4 years, with a guy on 1B vs. 1B unoccupied. Some of this is going to be random chance but much of it will be the benefit of the 1B hole when a runner is held.


             last 4 yrs   last 3 yrs
Ramirez      .111         .145
Walker       .101         .172
Damon        .076
Garciaparra  .067
Nixon        .066
Mueller      .033
Millar       .030
Varitek      .024
Ortiz       -.081
Giambi      -.100



To use this data, we of course want to look at:

Who is on first.

(H - 2B - 3B - HR + BB + HBP - SB - CS) / PA.
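The "who is on first" stat above translates directly into a function. A minimal sketch; the function name and the sample season line are hypothetical.

```python
def on_first_rate(h, doubles, triples, hr, bb, hbp, sb, cs, pa):
    """(H - 2B - 3B - HR + BB + HBP - SB - CS) / PA, as given in the post:
    times the batter ends up standing on first base, per plate appearance."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    return (h - doubles - triples - hr + bb + hbp - sb - cs) / pa


# Hypothetical season line, for illustration only.
first_rate = on_first_rate(h=150, doubles=30, triples=2, hr=20,
                           bb=70, hbp=5, sb=10, cs=3, pa=640)
print(f"{first_rate:.3f}")
```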


             3 yrs vs. RHP   2002
Ramirez      .319            .326
Mueller                      .279   (.322 in '99)
Giambi                       .318
Walker       .256            .250
Varitek      .256            .245
Garciaparra  .247            .219
Millar       .236            .247
Ortiz        .243            .219
Nixon        .237            .225
Damon        .203            .219


You can see this is worth paying attention to. Bill Mueller is on 1B almost 50% more of the time than Johnny Damon, and Todd Walker gains 100 or 170 or something points of OPS when first base is occupied. You do not want him hitting second behind Damon.

How are we doing so far with this? Jeremy Giambi actually has a big negative split with a runner on 1B. That may be a fluke, but he certainly has shown no sign of benefitting from the 1B hole. Johnny Damon is on 1B less often than anyone on the team. Damon, Giambi is a perfect combination. Manny adores a runner on 1B and Lil' G lives there. Nomar likes a runner on 1B a lot and Manny lives there. Millar benefits just a bit and Nomar is there kind of middling (for this team).

Todd Walker hitting behind Nomar and Millar is not good. Millar is on 1B relatively infrequently, Nomar not much more.

Looking down the order, Varitek is on 1B a bit more than Nomar and Mueller is there all the time. Why not drop Walker down to 9th? This has the added very large advantage of separating Walker and Nixon, the two guys in the lineup who invite opposing lefty relievers (more on that below).

So now we have:

Damon
Giambi
Manny
Nomar
Millar
Nixon
Varitek
Mueller
Walker

which is of course the lineup I've been suggesting for a few weeks. Now you know why! (In case you can't tell, in this study I actually did re-address the issue from the beginning, with an open mind.)

How is this lineup for dealing with opposing lefty relievers? It's great. Nixon's the first of the lefties, which is very desirable. If they bring in a lefty, you can pinch-hit with Gant (or whomever). Varitek's platoon split is slight, Mueller likes lefties better. Burying Walker in the middle of this sequence is key. You can pinch-hit Mirabelli or Merloni, and guaranteed the opposing manager will leave the lefty in. Because if they bring in the righty now, you send Ortiz up, and then the righty faces Damon and Giambi instead of the lefty. (Yeah, I know that Damon has a reverse and Giambi a neutral platoon split, but those splits are against all lefties; against tough lefty relievers they would have at least a mild split the normal way. And nothing we can do with the lineup is going to prevent those guys from seeing some lefties.)

In contrast, if you bat Walker second you will always be burning two players if you pinch hit, because the opposing manager immediately brings in his righty to face 'Belli (or Merloni), Manny, Nomar, forcing you to hit Ortiz for 'Belli. I think there are better uses for 'Belli than to be announced as a PH.

Let's not think about it.

What's the best lineup with Ortiz instead of Millar? I don't think there's one anywhere near as perfect as this. Millar is the perfect #5 hitter for this team (at least until Nixon or 'Tek has a breakout season). I would seriously entertain hitting Walker 5th, though (although you might want to wait a month to see if he takes to Fenway as well as we hope). A radical solution would be to hit Damon 5th. But any of these changes would force other adjustments. I don't want to spend that mental energy until I know I need to.
 
peteAll-Star
Posted February 08, 2003 02:57 PM
JaserD,

Awesome post, both for the fact that it's a good discussion of the sabermetrics approach, and for the fact that it's about the Sox (which are dear to my bleeding heart as well)...lol

now I just want to digest it for awhile...

I've been trying to use some part of the sabermetric system in my analysis of the ratings disk information, but, to this point this is all I have (given the not so complete amount of data on the disk)...

I got this from
strato saber discussion

-------------------
Below you'll find batters' SOM Card Avg, Oba, Slg and OPS with a couple of other things thrown in. This is what the players would hit if all rolls came off the hitter's card ONLY. I figured BP-HR and BP-S using a neutral park (1-9).

OPS = On Base + Slugging -- A quick and pretty accurate measure of offensive performance

Rate = (.3*OPSvLH) + (.7*OPSvRH) -- Because approximately 30% of all plate appearances come vs. LH pitching, this is a pretty good estimation of each player's overall value.

Bval = Rate * Plate Appearances
--------------------
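The Rate and Bval definitions quoted above can be sketched like this. The function names and the sample OPS and plate-appearance figures are made up for illustration.

```python
def rate(ops_vs_lh, ops_vs_rh):
    """Rate = .3*OPSvLH + .7*OPSvRH (about 30% of PA come vs. LH pitching)."""
    return 0.3 * ops_vs_lh + 0.7 * ops_vs_rh


def bval(ops_vs_lh, ops_vs_rh, plate_appearances):
    """Bval = Rate * Plate Appearances."""
    return rate(ops_vs_lh, ops_vs_rh) * plate_appearances


# Hypothetical card: .800 OPS vs. lefties, .900 vs. righties, 600 PA.
card_rate = rate(0.800, 0.900)
card_bval = bval(0.800, 0.900, 600)
print(f"{card_rate:.3f} {card_bval:.1f}")
```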

it's not rocket science but it may be enough to help weed out some players, I wouldn't mind including the equation from Eric's post above, but the stats disk does not have the complete breakdown needed

[This message was edited by Pete on February 08, 2003 at 03:07 PM.]
 
Mike_JAll-Star
Posted February 08, 2003 03:27 PM
Talk to luckyman, because, as far as I've seen, he's the sabermetric king on this board. I have a grounding in the basics, and I use it with the ratings disk to do evaluations. I use a 2-step process, first calculating runs created for players based on 216 PA's (I use what's on their card for 1/2, then an estimate I can live with of the average pitcher). I then use that to get a per 27 out number (number of runs scored if that player was every batter in a game), which I think is a better measure of offensive efficiency. Both concepts, though, are straight from Bill James's mouth.

I can give (relatively) simple formulas for this stuff, if you'd like.
 
peteAll-Star
Posted February 08, 2003 03:30 PM
so to me, what I included, is measuring a player's ability to get on base and the ability to advance people on base effectively...

I would also recommend the following...

intro to sabermetrics

from this page I have also found a few formulas that I may attempt to incorporate, for example...
RUNS = (HITS + WALKS) * (TOTAL BASES) / (AT-BATS + WALKS)
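For reference, the basic runs formula, (HITS + WALKS) times TOTAL BASES over (AT-BATS + WALKS), works out like this in code. A minimal sketch; the function name and the sample totals are hypothetical.

```python
def basic_runs_created(hits, walks, total_bases, at_bats):
    """RUNS = (HITS + WALKS) * TOTAL BASES / (AT-BATS + WALKS)."""
    opportunities = at_bats + walks
    if opportunities <= 0:
        raise ValueError("at_bats + walks must be positive")
    return (hits + walks) * total_bases / opportunities


# Hypothetical season totals, for illustration only.
runs = basic_runs_created(hits=150, walks=50, total_bases=250, at_bats=500)
print(f"{runs:.1f}")
```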

my biggest problem is do I want to try to collect data from other locations to crunch, or to stick with the ratings disk...

I would prefer to go more the route of the linear comparison because it seems to be a bit more accurate than Bill James's original version
 
JaserDAll-Star
Posted February 08, 2003 03:47 PM
Pete, I haven't seen the ratings disk (just ordered it for the first time for 2002), but the problem I foresee in trying to analyze a lot of "real player data" from other places is that (it's my understanding) the Strat cards already take a lot of this into consideration but then adjust for ball parks.

The data you gather in other places most likely will not have ball park adjustments (at least the common stat sites don't) so it may skew your player analysis in taking the "real" data and then making similar assumptions for Strat cards. I could be wrong on this, but it's my perception.

This would also hold true for the example I posted from Eric Van in looking at "real" statistical data for true OBP (i.e. -HR), Contextual Runs, frequency of getting to 1B only and OPS delta with a runner on 1B only and trying to correlate this info into a Strat lineup strategy. While it's probably close, I doubt it's as accurate as the real data (again for BP adjustments on the cards)
 
Mike_JAll-Star
Posted February 08, 2003 04:17 PM
The great thing about incorporating sabermetric formulas with the ratings disk is that all the hard, solid numbers you need are right there in Excel, ready to be plugged into calculations. The problem is that certain things are going to be hard, or impossible, to measure. And linear weights are difficult because they involve manual counting- the disk tells you total number of hits, walks, HR's, TB's, OB chances, out of 108, but doesn't tell you anything about doubles, triples, etc.

So, using the info on the disk, you can run a very simple runs created (per side) by taking OB chances * TB chances / 108. It's important NOT to translate OB and TB's into percentages, because that's not what the RC formula needs- it needs amounts, because it's a cumulative measure, not an average. If you want to incorporate BP effects, it's pretty simple. Choose your own Park Averages (as a percentage- 1-10 would be .5) based on circumstance (i.e, home park average, or league average, or combo).

You get (OB chances + (Park Average * BPsingles) + (Park Average * BPHR's)) * (TB chances + (Park Average * BPsingles) + (4 * (Park Average * BPHR's))) / 108.

This gives you a runs created total for the 108 chances on a hitter's card.

To translate that to more of an efficiency measurement, convert it to a RC per 27 outs.
This formula is: (Runs created / (108 - (OB chances + (Park Average * BPsingles) + (Park Average * BPHR's)))) * 27.
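The two card formulas above (runs created over 108 chances, then runs created per 27 outs) can be sketched as follows. The function names and the sample card are hypothetical; the default park average of 0.5 corresponds to the "1-10" park mentioned in the post.

```python
CARD_CHANCES = 108  # chances on one side of a hitter's card


def adjusted_on_base(ob, bp_singles, bp_hr, park_avg=0.5):
    """OB chances plus park-weighted ballpark singles and homers."""
    return ob + park_avg * bp_singles + park_avg * bp_hr


def adjusted_total_bases(tb, bp_singles, bp_hr, park_avg=0.5):
    """TB chances plus park-weighted ballpark singles (1 base) and homers (4 bases)."""
    return tb + park_avg * bp_singles + 4 * park_avg * bp_hr


def card_runs_created(ob, tb, bp_singles, bp_hr, park_avg=0.5):
    on_base = adjusted_on_base(ob, bp_singles, bp_hr, park_avg)
    bases = adjusted_total_bases(tb, bp_singles, bp_hr, park_avg)
    return on_base * bases / CARD_CHANCES


def card_rc_per_27(ob, tb, bp_singles, bp_hr, park_avg=0.5):
    """Runs created per 27 outs: RC / (108 - adjusted on-base chances) * 27."""
    outs = CARD_CHANCES - adjusted_on_base(ob, bp_singles, bp_hr, park_avg)
    return card_runs_created(ob, tb, bp_singles, bp_hr, park_avg) / outs * 27


# Hypothetical card side: 40 OB chances, 50 TB chances, 2 BP singles, 4 BP homers.
rc = card_runs_created(40, 50, 2, 4)
rc27 = card_rc_per_27(40, 50, 2, 4)
print(f"{rc:.2f} {rc27:.2f}")
```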

You can work other factors into these formulas, but they are based more on guesswork and situational analysis. Stuff like GBA's, baserunning speed, steal ratings, ** singles, clutch, and defense, are harder to figure into these formulas, but at least these formulas give you a solid foundation. The RC formula was accepted as valid because of its predictive abilities- plug in the OB, TB, and PA numbers for entire teams, and you get very close to the actual number of runs they scored.

Linear weights theories, to me, always seem to be adjusted so that the numbers come out right- a single is worth this much, a double this much, because it makes the formula more accurate, not necessarily because it measures any "true" value.
 
luckymanAll-Star
Posted February 08, 2003 04:38 PM
Sabermetrics is a very, very interesting (and effective) way to analyse the contribution of any baseball player, and I've adopted some of its basics into my playing (thanks for the compliment, Mike_J)

In my next post, I'll tackle Pete's problem, of how to adjust sabermetrics logic with the ratings you have on your strat disk, but first let me say in this post one or two things.

Basically, the logic behind sabermetrics is, as James argued, that "a hitter should be evaluated by his ability to create runs for his team". So what we are looking for is a statistic we will call "Runs created" (RC). The site Pete proposes is a good introduction to the subject, although two clarifications must be made.

First, a lot of different formulas exist. Despite their differences, for general purposes they predict the quality of an offensive performance about equally well. For those of you interested to look at some of these:
www.baseballstuff.com/fraser/stats2000/stats.html
www.stephent.com/jays/erp.html

Second, the linear weight formula that is proposed in the intro paper has two mistakes. First, it doesn't include outs. Second, since the formula has been introduced, some people showed that the original formula was biased. To correct it, you must give a negative value to outs. In short, the logic is that when you have an out, your chance to score runs is reduced (as opposed to a foul ball, for example, an event where your chance to score a run is not reduced). So the linear formula people now use has changed to something like (notice the minus beside outs):

RC = .47*1B + .78*2b + 1.09*3b + 1.40*HR + .33*(BB+HBP) - .3*(AB-H).

If you want to adjust it for stolen bases and caught stealing, it becomes:

RC = .47*1B + .78*2b + 1.09*3b + 1.40*HR + .33*(BB+HBP) - .3*(AB-H) + .31*SB - .60*CS.
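The two linear-weights formulas above can be folded into one function, with stolen bases and caught stealing as optional terms. A minimal sketch; the function name and the sample season line are hypothetical.

```python
def linear_weights_rc(singles, doubles, triples, hr, bb_hbp, ab, sb=0, cs=0):
    """RC = .47*1B + .78*2B + 1.09*3B + 1.40*HR + .33*(BB+HBP)
            - .3*(AB-H) + .31*SB - .60*CS"""
    hits = singles + doubles + triples + hr
    outs = ab - hits  # outs get a negative weight
    return (0.47 * singles + 0.78 * doubles + 1.09 * triples + 1.40 * hr
            + 0.33 * bb_hbp - 0.3 * outs + 0.31 * sb - 0.60 * cs)


# Hypothetical season line, for illustration only.
rc_basic = linear_weights_rc(100, 30, 2, 25, bb_hbp=60, ab=550)
rc_running = linear_weights_rc(100, 30, 2, 25, bb_hbp=60, ab=550, sb=10, cs=4)
print(f"{rc_basic:.2f} {rc_running:.2f}")
```

Because outs carry a negative weight, the result reads as runs relative to a baseline rather than a raw run total, which is why the sample value comes out small.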
 
penngrayAll-Star
Posted February 08, 2003 04:54 PM
I wish the ratings disk broke out the 2b,3b etc.
 
peteAll-Star
Posted February 08, 2003 05:25 PM
Thanks luckyman and Mike_J for the input...

lucky, I had found the formula you stated previously and that is what had led me to thinking that I would have to get stats elsewhere(2b's, 3b's, etc) to complete the formula...

so for now I will baby step, I may stick with the formula I listed above, even though it is flawed, I want to start with some base of reference and work from there...

the formula being

RUNS = (HITS + WALKS) * (TOTAL BASES) / (AT-BATS + WALKS)

I was also thinking that if I ran 2002 player stats through the formulas it would help to spot the cream rising to the top, it may not necessarily pinpoint the best strat-o performers but at least give a good estimation on how they "should" perform
 
luckymanAll-Star
Posted February 08, 2003 06:01 PM
Mike_J addresses the problem quite right:
quote:
And linear weights are difficult because they involve manual counting- the disk tells you total number of hits, walks, HR's, TB's, OB chances, out of 108, but doesn't tell you anything about doubles, triples, etc.


What you have, on the strat disk, is on-base, total base and hits. By the way, notice that, since you have hits, you can calculate how many walks and HBP each card has by doing:
on-base - hits = (W+HBP)

Thus, I suggest that you create an additional column in your Excel Strat sheet with the formula above (if you don't know how, I could explain.)

As Mike_J says, you could do ONB*TB/108, but the problem is, if you go that way, you won't be able to adjust for speed, defense, and the like. To incorporate the latter stats, you really have to keep the logic of linear weight. But you don't have the stats for singles, doubles, and triples, so what can you do?

Here is the answer. First, look at the weights I came up with in the last formula of my last post. The weights are:
(W+HBP) ---> 0.33
1B --------> 0.47
2B --------> 0.78
3B --------> 1.09
HR --------> 1.40
Outs-------> minus 0.3

Since this is a linear model, I can add or multiply by a constant, and the order among the players will be preserved (in other words, if Jeter is better than Tejada in the original weighted formula, he will still be better in the transformed formula). So if I add 0.3 to each value and then multiply by 2.8, I end up with the following weights:

(W+HBP) ---> 1.76
1B --------> 2.16
2B --------> 3.02
3B --------> 3.90
HR --------> 4.77
Outs-------> 0
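The rescaling step (add 0.3, then multiply by 2.8) is easy to check. A minimal sketch; the dictionary names are made up. The computed values agree with the list above to within about 0.01 (3B and HR come out as 3.89 and 4.76 rather than 3.90 and 4.77, presumably a rounding difference in the post).

```python
ORIGINAL_WEIGHTS = {
    "W+HBP": 0.33, "1B": 0.47, "2B": 0.78, "3B": 1.09, "HR": 1.40, "Outs": -0.30,
}

# Values as listed in the post after the transformation.
POSTED_WEIGHTS = {
    "W+HBP": 1.76, "1B": 2.16, "2B": 3.02, "3B": 3.90, "HR": 4.77, "Outs": 0.0,
}

# Shift so that outs are worth 0, then scale so that a single is worth about 2.
rescaled = {event: (w + 0.3) * 2.8 for event, w in ORIGINAL_WEIGHTS.items()}

for event, w in rescaled.items():
    print(f"{event:6s} {w:5.2f}  (post: {POSTED_WEIGHTS[event]:.2f})")
```

The shift adds the same amount per chance to every card, so it keeps the ranking intact as long as the cards are compared over the same number of chances (108 per side here).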

Now, suppose we ADD for each card chances of on-base and chances of total bases, as listed on your strat disk. You know that, for chances of total bases, you have one point for a single, two points for a double, etc. So what this value (on-base chances plus total base chances) is really doing is to give the following weights:

(W+HBP) ---> 1
1B --------> 2
2B --------> 3
3B --------> 4
HR --------> 5
Outs-------> 0

If you compare the last two lists of weights, you see that, except for walks, the two lists are very similar.

Let's fix the weight for walks. In the former list of the two lists we are comparing, walks had a value of 1.76, and singles a value of 2.16. So the value for walks was the equivalent of 80% the value of singles. So if we give the value of 2 for singles, we should have a weight of 1.6 for walks.

So here is the twist I suggest: to have a value of "Runs created" that approximates very well the linear formula from the stats you find on the STRAT disk, you just have to create another column in Excel Strat sheet with the following formula:
=ONB + TB + 0.6*(W+HBP).

(0.6 instead of 1.6 because ONB incorporates 1 point for (W+HBP))
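To see that the spreadsheet column really carries the weights 1.6 / 2 / 3 / 4 / 5, here is a small check. A minimal sketch; the function names and the event counts on the sample card are hypothetical.

```python
def strat_rc(onb, tb, w_hbp):
    """The suggested column: =ONB + TB + 0.6*(W+HBP)."""
    return onb + tb + 0.6 * w_hbp


def weighted_rc(w_hbp, singles, doubles, triples, hr):
    """The same value written out with the underlying per-event weights."""
    return 1.6 * w_hbp + 2 * singles + 3 * doubles + 4 * triples + 5 * hr


# Hypothetical card side, in chances out of 108.
w_hbp, singles, doubles, triples, hr = 20, 15, 5, 1, 6
onb = w_hbp + singles + doubles + triples + hr      # on-base chances
tb = singles + 2 * doubles + 3 * triples + 4 * hr   # total-base chances

column_value = strat_rc(onb, tb, w_hbp)
print(column_value, weighted_rc(w_hbp, singles, doubles, triples, hr))
```

On the disk the walk count is not listed directly, so `w_hbp` would be computed as on-base minus hits, as described earlier in this post.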

This will give a linear weight of Runs created, with the underlying following weights:

(W+HBP) ---> 1.6
1B --------> 2
2B --------> 3
3B --------> 4
HR --------> 5
Outs-------> 0

Since we have a linear weight RC, we will be able to adjust this RC for:
- ball park adjustment
- stolen bases
- defense
- gbA
- clutch
- star (forcing holding)

I won't get in the details of each but here is the final formula I end up with. For each side (vs lhp and vs rhp), compute:

1) RC=(W+HBP)*0.6 + ONB + TB + BPadjustment + "weak" adjustment + clutch/4 + gbA/5;

where BP adjustment is adjustment for Ball Parks homerun. The formula for this is

BP adjustment = k/4 * BP

-BP is the number of Ball park homeruns on a given card
-k is the stadium homerun ratio (for Coors, it would be 19, for a neutral park, it would be 10).

AND where "weak adjustment" is :
-7 for a player being weak
-5 for a player NOT having ball park singles on his card
-12 for a player having both characteristics
0 for normal players with ball park singles

2) running adjustment= (stealing + speed/2)-6

where stealing is
+6 for a player having a star and a 19 number for stealing
+5 for a player having a star and a 18 number for stealing
+4 for a player having a star and a 17 number for stealing
+3 for a player having a star and a 16 number for stealing
+2 for a player having a star and a stealing number of 15 or less

and where speed is the speed rating (e.g., for Jeter, 17)

3) offensive RC = 0.3*(RC vs lhp) + 0.7*(RC vs rhp) + running adjustment

4) overall RC: offensive RC - defensive RC

I'll come back to defensive RC in another post.
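Steps 1 to 4 above can be strung together roughly like this. This is a sketch under assumptions, not luckyman's spreadsheet: the function names are made up, the stealing value is passed in by hand (the post only lists values for starred stealers), and the gbA term is added exactly as written in step 1, although its sign is questioned later in the thread. The sample numbers are the Edmonds vs rhp figures that appear further down in this thread.

```python
def bp_adjustment(bp_hr, k=10):
    """BP adjustment = k/4 * BP, where k is the stadium homerun ratio (10 = neutral)."""
    return k / 4 * bp_hr


def weak_adjustment(is_weak, has_bp_singles):
    """-7 if weak, -5 if the card has no ballpark singles, -12 if both, else 0."""
    adj = 0
    if is_weak:
        adj -= 7
    if not has_bp_singles:
        adj -= 5
    return adj


def side_rc(w_hbp, onb, tb, bp_hr, k=10, is_weak=False,
            has_bp_singles=True, clutch=0, gba=0):
    """Step 1, for one side of the card (vs lhp or vs rhp)."""
    return (0.6 * w_hbp + onb + tb + bp_adjustment(bp_hr, k)
            + weak_adjustment(is_weak, has_bp_singles)
            + clutch / 4
            + gba / 5)  # sign as written in the post


def running_adjustment(stealing, speed):
    """Step 2: (stealing + speed/2) - 6."""
    return stealing + speed / 2 - 6


def offensive_rc(rc_vs_lhp, rc_vs_rhp, running):
    """Step 3: 30/70 blend of the two sides plus the running adjustment."""
    return 0.3 * rc_vs_lhp + 0.7 * rc_vs_rhp + running


def overall_rc(offensive, defensive):
    """Step 4: offensive RC minus defensive RC."""
    return offensive - defensive


# Edmonds vs rhp, numbers taken from later in this thread.
edmonds_vs_rhp = side_rc(w_hbp=22, onb=52.3, tb=50.7, bp_hr=7, clutch=-12)
print(f"{edmonds_vs_rhp:.1f}")
```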

[This message was edited by luckyman on February 08, 2003 at 07:00 PM.]
 
luckymanAll-Star
Posted February 08, 2003 06:21 PM
Penngray, about
quote:
I wish the ratings disk broke out the 2b,3b etc.
.

That's very easy to do with Excel.

First, to be sure you don't lose any data, select the whole set of data, copy it and paste it in another spread sheet (the width of columns will change). Second, I suggest that you delete all the non-eligible cards. Then, you click in the first cell under CA (where you find defensive ratings), and press the "ranking" button that looks like this:

A (with an arrow beside)
Z

By doing so, you'll have all the catchers grouped together. You can copy (not cut) and paste it in another sheet called "catcher".

Then you just repeat the procedures for 1B, 2B etc.
 
penngrayAll-Star
Posted February 08, 2003 06:35 PM
lol, I know how to do the player categories.

actually I use auto-filters and then just filter my defensive ratings that "contain" ss, c, cf etc. when I need to look at one position.

I was saying, I wish the spreadsheet had the hit totals for doubles and triples
 
luckymanAll-Star
Posted February 08, 2003 06:43 PM
oops!!!
 
penngrayAll-Star
Posted February 08, 2003 06:45 PM
my original post was really vague
 
luckymanAll-Star
Posted February 08, 2003 06:47 PM
For defensive ratings:

for INFIELDERS
defensive RC = 0.7*e + ((range-1)*pos)

where e is the e rating
range is the range rating
pos is:
3.6 for 1b
10.8 for 2b
12.6 for ss
5.4 for 3b

For lf and rf, the formula is:

defensive RC = rr + e

where e is the e-rating, and rr is:

0 for range 1
6 for range 2
13 for range 3
27 for range 4
35 for range 5

For cf, it's the same, except you multiply rr by 1.5.
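The infield and outfield formulas in this post can be sketched as two lookups. The function and dictionary names are hypothetical, and the sample ratings are made up.

```python
POS = {"1b": 3.6, "2b": 10.8, "ss": 12.6, "3b": 5.4}   # infield position weights
RR = {1: 0, 2: 6, 3: 13, 4: 27, 5: 35}                 # outfield range values


def infield_defensive_rc(position, range_rating, e_rating):
    """defensive RC = 0.7*e + (range-1)*pos"""
    return 0.7 * e_rating + (range_rating - 1) * POS[position]


def outfield_defensive_rc(position, range_rating, e_rating):
    """lf/rf: rr + e; cf: the same with rr multiplied by 1.5."""
    factor = 1.5 if position == "cf" else 1.0
    return factor * RR[range_rating] + e_rating


# Hypothetical fielders: a ss-2 e10, a cf-2 e5 and a lf-3 e4.
print(infield_defensive_rc("ss", 2, 10))
print(outfield_defensive_rc("cf", 2, 5))
print(outfield_defensive_rc("lf", 3, 4))
```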


P.S.
For ss and 2b, you may also want to adjust for double-plays (gb A or gb 1). A simple way is to subtract 6 from defensive rating for ss-1, 5 for 2b-1, 3 for ss-2 and 2 for 2b-2.

A more sophisticated way is to use the following formula. Excel has a function called MAX, which returns the highest of the numbers you specify. The formula is:

adjustment for gbA: =MAX((40-defensive RC)*0,18; 0)
 
luckymanAll-Star
Posted February 08, 2003 06:52 PM
So Pete,

forgive me if you got lost in the ocean of formulas I just threw in. But the whole idea is that the first step should really be:

offensive RC = 0.6*(W+HBP) + ONB + TB.

This keeps the logic of linear weight formula. It's simple. And it enables you to do all the adjustments, especially for defense.

(so if you don't want to get in the details of all adjustments, I suggest you compute offensive RC with the simple formula above, adjust it to BP homerun in a neutral stadium, and subtract from this the defensive RC allowed by each hitter. This gives the following, in two steps:

offensive RC = 0.6*(W+HBP) + ONB + TB + 2.5*BP

overall RC = 0.3*(RC vs lhp) + 0.7*(RC vs rhp) - defensive RC

where BP is the number of BP homeruns you find on a card)
 
Mike_JAll-Star
Posted February 08, 2003 07:19 PM
That's some pretty awesome stuff lucky, but I have a few questions. Where are the clutch and GBA divisors coming from? Using the divisors as a judge, you are saying that GBA situations occur about 20%, and clutch about 25%, of the time? Or am I missing something? Interested in the source of those values.

And for the defensive ratings, I'm fairly lost with the logic. I see how the pos values are based on chance ratios for positions, but I don't understand how that formula adds up to what a player's defense will actually allow.
 
peteAll-Star
Posted February 08, 2003 08:45 PM
lucky,

thanks for the reply, I appreciate everything you put up, though it may take me a bit to absorb...

if you have any more insights it would be welcome

I am currently building an sql database that will allow me to sort by ratings, I will go back and add a column based upon your formula
 
luckymanAll-Star
Posted February 08, 2003 09:13 PM
Mike_J,

Don't wonder why you can't understand how I came up with these numbers: for reasons of simplicity, I didn't mention it!! So you're not missing something, it's rather me who's hiding my assumptions.

Clutch

Let's take Edmonds numbers vs rhp:
W+HBP= 22
ONB = 52.3
TB = 50.7
BP = 7
clutch = -12

Basic RC = 0.6*(22) + 52.3 + 50.7 + 2.5*(7)
= 133.7

Now, we know that clutch situations happen around 12% of the time (that's on average, the 5th hitter in a line-up gets more, close to 15%, but let's keep it simple and take 12%)

So this basic RC, 133.7, holds 88% of the time. In clutch situations, Edmonds' card loses 12 singles. So his basic RC in clutch situations is:

Clutch basic RC = 0.6*(22) + 40.3 + 38.7 + 2.5*(7)
=109.7

So Edmonds rating, when adjusted to clutch, is:

RC = (0.88)*133.7 + (0.12)*109.7 = 130.82

But notice that we would have approximately the same result if we had used this formula:

RC = basic RC + clutch/4 = 133.7 + (-12/4) = 130.7
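The Edmonds example can be re-run in a few lines to compare the exact 88/12 blend with the clutch/4 shortcut. A minimal sketch; the function name is made up, and the numbers are the ones given in this post.

```python
CLUTCH_SHARE = 0.12  # share of plate appearances that are clutch situations


def basic_rc(w_hbp, onb, tb, bp_hr):
    """Basic RC = 0.6*(W+HBP) + ONB + TB + 2.5*BP (neutral park)."""
    return 0.6 * w_hbp + onb + tb + 2.5 * bp_hr


clutch = -12  # Edmonds loses 12 singles in clutch situations

normal_rc = basic_rc(22, 52.3, 50.7, 7)
# Each lost single removes one on-base chance and one total-base chance.
clutch_rc = basic_rc(22, 52.3 + clutch, 50.7 + clutch, 7)

blended = (1 - CLUTCH_SHARE) * normal_rc + CLUTCH_SHARE * clutch_rc
shortcut = normal_rc + clutch / 4

print(f"{normal_rc:.1f} {clutch_rc:.1f} {blended:.2f} {shortcut:.1f}")
```

The shortcut works because each clutch single is worth 2 points and applies 12% of the time, so the exact factor is 0.24 per clutch point, close to 1/4.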


gbA
A double-play is worth about two outs. If we adjust our weights to give 0 to outs, a dp is worth about the opposite of a walk. So, in our system, a double play is worth about -1.6.

But, as you know, gbA doesn't turn out in double plays each time. It only does so when there is someone on first with less than two outs. If we knew the probability that this situation occurs, then we would know the weight for gbA. I can't remember the source of this information, but I think that on average, it's around 18% (again, it depends on where a player stands in the line-up).
So a rough approximation of the weight we should attribute to gbA is:
gbA = 18% X -1.6 = -0.29.

That being said, there will be situations where there will be gbA with guys on first/less than 2 outs, and the outcome won't be a double-play. One example is when the defense is playing in; another is when, with guys on first and third, the chart reading is gbA (3b) and the play is at home, etc.

So in strat life, gb(A) will be worth roughly between -0.20 and -0.25. So a good approximation is a weight of about:
-0.25*gb(A), or -gb(A)/4
or -0.2*gb(A), or -gb(A)/5,

but certainly not gb(A) taken as a full unit.

(the reason I chose gb(A)/5 is because in fact, a dp is worth a little bit less than the opposite of a walk, roughly 85%, but I'm sure that gb(A)/4 would work out well too).
 
Mike_JAll-Star
Posted February 08, 2003 09:40 PM
I think that's a great system, all based on sound logic and mathematical principles. One thing I noticed though, is that you have a + for GBA's- shouldn't that be a minus? And after popping the system through the ratings disk, the only thing that leaps out as "questionable" is the possible overvalue of walks- Durazo stands out. But I think that's an issue with sabermetrics in general, not your specific formulas.
 
penngrayAll-Star
Posted February 08, 2003 10:01 PM
what do you do with the BP singles? they aren't on the rating disk either.

pt.2

Posted: Tue Nov 21, 2006 4:30 pm
by cummings2
Mike_JAll-Star
Posted February 08, 2003 10:30 PM
Penngray, they are on the disk, just not in their own column. Any player with a * next to BPHR's (or W) has no BP singles. True for hitters and pitchers. If you want to separate those out, create an empty column to the right of BPHR's, then do a text to columns, fixed width, with the separating line running just inside the *. That will kick the * into the empty column, and allow you to assign a numerical value.
 
penngrayAll-Star
Posted February 08, 2003 10:38 PM
thanks, but what would the numeric value be? the same as the BHs? what would the value be for the 'w'?
 
luckymanAll-Star
Posted February 08, 2003 10:49 PM
Mathematics haters, STAY AWAY!!!

But since Mike_J asked for it:

For defense, I assumed that

1) I don't take in consideration RP readings
2) For each offensive at-bat, a player is on defense for 9 at-bats (so we must multiply by 9 the outcomes allowed by a defensive player).



For infielders, let's assume that all errors and hits are one-base, just like singles. The weight for singles is 2. It means:

defensive RC = 9*ch*2*((h+E)/20)

where:
9 = see assumption 2 above
ch= stands for chances to look at a position (7 chances to hit ss-X).
2 = weight value
h = hits allowed:
for a 1-rating, h=0
for a 2-rating, h=2
for a 3-rating, h=4
for a 4-rating, h=6

In other words,
for a n-rating, h=(n-1)*2

E=chances that an error occurs, as such E is not the e rating, but the chances of errors associated with the e-rating.

20= because all chances in defensive chart are divided by 20.

With some algebra, we see that the formula above is equivalent to:

defensive RC = 9*ch*2*(h/20 + E/20)
defensive RC = 9*ch*2*(h/20)+ 9*ch*2*(E/20)
defensive RC = 9*2*ch*h/20+ 9*2*ch*E/20
defensive RC = 0.9*ch*h + 0.9*ch*E (because 9*2/20 = 0.9)
defensive RC = 0.9*ch*(n-1)*2 + 0.9*ch*E (equality from above)
defensive RC = 1.8*ch*(n-1) + 0.9*ch*E

How do we know E, the chance that an error occurs?

One way is to look at the charts, and compute the probability of an error for every rating. For example, by looking at the defensive chart, we see that, for a ss rated e10, an error occurs if 3, 16, 17 or 18 is rolled, with 3 and 18 being two-base errors (let's forget this fact for a second). So the probability that an error occurs is 11/216 = 5.09%. Hence 5.09% X 20 = 1.02 chances of error.
Since for ss, ch=7, we have :
ch*E = 7*1.02 = 7.14,

But note that for a 3b rated e10, an error occurs if 3, 13, 17 or 18 is rolled (with 3, 17 and 18 being two-base errors), that is 26/216 = 12%. Hence 12% X 20 = 2.4 chances of error. Since for 3b, ch=3, we have:
ch*E = 3*2.4 = 7.2
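
The two error probabilities above can be checked by enumerating all 216 outcomes of three six-sided dice. This is only a sketch: it assumes the error rolls quoted in the post (3, 16, 17, 18 for an e10 shortstop and 3, 13, 17, 18 for an e10 third baseman), and the function name `error_ways` is just illustrative.

```python
from itertools import product

def error_ways(error_totals):
    """Count how many of the 216 three-dice rolls sum to one of error_totals."""
    totals = set(error_totals)
    return sum(1 for roll in product(range(1, 7), repeat=3) if sum(roll) in totals)

ss_ways = error_ways([3, 16, 17, 18])   # shortstop rated e10, rolls from the post
tb_ways = error_ways([3, 13, 17, 18])   # third baseman rated e10, rolls from the post

for name, ways, ch in [("ss", ss_ways, 7), ("3b", tb_ways, 3)]:
    prob = ways / 216
    E = prob * 20            # error chances out of 20
    print(name, ways, round(prob, 4), round(ch * E, 2))
```

The products ch*E come out slightly different from the 7.14 and 7.2 in the post because the post rounds E before multiplying.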

In short, the charts are built so that, for every infielder,

ch*E always equals approximately 0.72*e (e now being the e-rating).

So our previous formula,

defensive RC = 1.8*ch*(n-1) + 0.9*ch*E

now becomes, since ch*E = 0.72*e:

defensive RC = 1.8*ch*(n-1) + 0.9*0.72*e
defensive RC = 1.8*ch*(n-1) + 0.648*e

where ch is
7 for a ss
6 for a 2nd baseman
3 for a 3rd baseman
2 for a 1st baseman

so that, writing pos = 1.8*ch, you have
3.6 for 1b
10.8 for 2b
12.6 for ss
5.4 for 3b

Hence,

defensive RC = pos*(n-1) + 0.648*e

Remember that there were two-base errors that we counted as one-base errors? When they are taken into account, we really have:

defensive RC = pos*(n-1) + 0.7*e
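
As a rough sketch, the final formula can be wrapped in a small function. The chance counts and the 1.8 and 0.7 factors come from the post; the names `CHANCES` and `defensive_rc` are made up for illustration.

```python
# X chances per infield position, as listed in the post
CHANCES = {"1b": 2, "2b": 6, "3b": 3, "ss": 7}

def defensive_rc(position, range_rating, e_rating, error_weight=0.7):
    """Runs given up by an infielder: pos*(n-1) + 0.7*e, with pos = 1.8*ch."""
    pos = 1.8 * CHANCES[position]
    return pos * (range_rating - 1) + error_weight * e_rating

# Example: a shortstop rated 2e10
print(round(defensive_rc("ss", 2, 10), 1))
```

Under these assumptions a 2e10 shortstop costs 12.6 for range plus 7 for errors.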

pt3

PostPosted: Tue Nov 21, 2006 4:37 pm
by cummings2
 
peteAll-Star
Posted February 08, 2003 11:01 PM
as for Mike_J's comment about overvaluing the walks, what about this...

RC = 0.6*(W+HBP) + Hits + TB + 2.5*(BP) + clutch/4

??

this may just be a stab in the dark towards understanding, so please bear with me...

Let's take Edmonds' numbers vs rhp:
W+HBP= 22
ONB = 52.3
TB = 50.7
BP = 7
clutch = -12
RC = 0.6*(22) + 29.6 + 50.7 + 2.5*(7) + (-12/4) = 108
 
luckymanAll-Star
Posted February 08, 2003 11:05 PM
As for the w rating, it really depends on what you find on the pitcher's card, since the impact of having a weak (w) card is that you don't get the homeruns that are on the pitcher's card; you get a single instead. A homerun is worth 5, a single is worth 2. So the numerical value of w is (5-2) * chances of homerun on the pitcher's card.

If you look at the average hr and bp for starters, you have:

pitchers   vs left       vs right
           hr     bp     hr     bp
overall    2.24   3.05   2.05   2.93
top 40     1.41   1.95   1.47   1.98
top 80     1.54   2.19   1.49   2.10

In a neutral park (10), a weak rating is worth

vs overall: 3*3.65 = 10.95
vs top 40 : 3*2.42 = 7.25
vs top 80 : 3*2.59 = 7.76

Since we have the best pitchers in TSN-strat, and since relievers usually have better cards, I assume w = -7. But if you're playing in Pac, you might assume a value closer to -5.
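
The post does not say how the 3.65, 2.42 and 2.59 homerun chances were obtained from the table. One reading that lands within a few hundredths of the posted values is sketched below; it assumes that ballpark chances count as park/20 of a homerun (so half in a neutral 10 park) and that the vs-left and vs-right columns are averaged equally. Both are guesses, and the names are illustrative.

```python
# Average hr and bp chances on starters' cards, copied from the table in the post:
# (hr vs left, bp vs left, hr vs right, bp vs right)
PITCHER_AVERAGES = {
    "overall": (2.24, 3.05, 2.05, 2.93),
    "top 40": (1.41, 1.95, 1.47, 1.98),
    "top 80": (1.54, 2.19, 1.49, 2.10),
}

HR_WEIGHT, SINGLE_WEIGHT = 5, 2   # weights given in the post

def weak_penalty(hr_l, bp_l, hr_r, bp_r, park=10):
    """Value lost by a w hitter: (5 - 2) * effective homerun chances.

    Assumptions (not stated in the post): ballpark chances count park/20,
    and the left/right columns are averaged with equal weight.
    """
    vs_left = hr_l + bp_l * park / 20
    vs_right = hr_r + bp_r * park / 20
    return (HR_WEIGHT - SINGLE_WEIGHT) * (vs_left + vs_right) / 2

penalties = {group: weak_penalty(*row) for group, row in PITCHER_AVERAGES.items()}
for group, value in penalties.items():
    print(group, round(value, 2))
```

The small differences from the posted figures presumably come from rounding in the original calculation.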

[This message was edited by luckyman on February 08, 2003 at 11:21 PM.]
 
CubitAll-Star
Posted February 08, 2003 11:10 PM
Now we are getting somewhere in this thread. I did like JaserD's postulates above, but loved luckyman's analysis. It's sent me back to Mathematica to produce a function diagram!
 
luckymanAll-Star
Posted February 08, 2003 11:14 PM
Mike_J,

No wonder Durazo stands out as a surprise to you. The formula you are using is:

OB chances * TB chances / 108.

But the original sabermetric formula of this form is slightly different. It is:

on-base % X slugging %, or put otherwise:

on-base/108 X TB chances/(108 - (W+HBP))

In other words, your formula is slightly biased relative to the original formula, and its biggest bias is against power hitters with a high walk rate, of which Durazo is one of the best examples.

As for the signs, you're right: you must subtract gb(A)/5.

luckymanAll-Star
Posted February 08, 2003 11:19 PM
Pete,

I wouldn't suggest taking hits instead of on-base. If you really think walks are overrated, then you just have to lower the coefficient in front of walks. Instead of multiplying by 0.6, multiply by 0.5 or less.

In my opinion, though, if this linear system is making a slight error, it is in underestimating power rather than in overestimating walks.

To be quite honest, I have some doubts about my running adjustment. With the adjustment I use, guys like Goodwin get the maximum, 8.5, whereas guys like Olerud get the minimum, -2. This difference (10.5) is way too big, but I haven't been able to resolve the problem to my satisfaction.


[This message was edited by luckyman on February 08, 2003 at 11:37 PM.]
 
BbroolAll-Star
Posted February 08, 2003 11:51 PM
Some interesting formulas. I tried for years to work with linear formulas and have scrapped that for the simpler RC = OBP*TB. However, this formula is really meant for estimating the runs a team will score over a season, not for evaluating individual performances.

So to use it to evaluate an individual, I evaluate the incremental value of each player to an average team's performance.

As far as incorporating defense, you can do the same thing in reverse. On the Old Timers board I have gone into a lot more detail, which I don't want to duplicate here, on the probabilities behind fielding: the average number of times the X chance comes into play and, based on an individual's defense, how much a baseline team's runs go up as you change out a defender.

At the end of the day you have a positive influence for offense and a negative influence for defense, which nets to the worth of the player.

I would recommend that when using the disk to evaluate players you do not omit the impact of the pitcher's card. At a minimum, you should compute an average pitcher's card and combine it with each hitter to get a much clearer picture of the actual probabilities.

If someone is interested in the actual mathematics and formulas I have built into Excel I can go into much more detail, but most of the keys have been touched on here already.

Bbrool

penngrayAll-Star
Posted February 08, 2003 11:58 PM
Bbrool, I'm interested in your spreadsheet equations.
 
luckymanAll-Star
Posted February 09, 2003 12:06 AM
I'd be interested too. In particular, I'd like to know how you handle defense, clutch and gb(A) using this formula.
 
peteAll-Star
Posted February 09, 2003 12:44 AM
Bbrool,

I guess my end goal was to create a base of players, then have the ability to build my "team", and then load pitchers (and set the ballpark) to get an overall "vs" for the whole lineup.

lucky,

well I was thinking that was a bit heavy handed but I was curious about your response, thanks for all the input, it is well appreciated!!

 
BbroolAll-Star
Posted February 09, 2003 01:15 AM
The nice thing about using the runs created formula to estimate team runs is that an out is an out in the base formula RC = OBP*TB. Over the course of a season all types of outs occur, but the overall estimate is relatively unaffected.

I have used a total of 3 seasons to come up with some baseline totals as follows (remember these are real stats from real players, not the inflated softball stats from the 2001 leagues):

AB-5,566
BB-588
Outs-4111
Hits-1455
2B-232
3B-49
HR-187
TB-2348
Avg.-.261
Obp.-.332
Slg.-.422

These numbers produce an estimated Runs scored of 779 ( the actual average was 772, so a pretty good estimation).
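
For anyone who wants to check the 779, here is a minimal sketch of the basic formula applied to the baseline totals above. It assumes OBP is simply (hits + walks) / (at-bats + walks), since the post lists no HBP or sacrifice flies; the function name is illustrative.

```python
# Baseline totals copied from the post
baseline = {"AB": 5566, "BB": 588, "H": 1455, "TB": 2348}

def basic_runs_created(h, bb, ab, tb):
    """Basic runs created: on-base percentage times total bases."""
    obp = (h + bb) / (ab + bb)   # assumption: no HBP or SF in the denominator
    return obp * tb

rc = basic_runs_created(baseline["H"], baseline["BB"], baseline["AB"], baseline["TB"])
print(round(rc))
```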

All of these numbers can be broken down into strat probabilities over 216 chances (hitter + pitcher) and then weighted 8 times. The 9th player's PA's are replaced with those of the hitter we want to evaluate. When he is combined with the other 8 average hitters, new OBP and TB numbers result and a new RC is computed. The incremental increase or decrease is the player's incremental value to a team.

I chose to ignore clutch, although it could be incorporated. One would just estimate the number of clutch situations a batter faces during the season, multiply by the player's clutch chances, and adjust his PA's accordingly. Again, the nice thing about using a simpler formula is that over the course of a season the incremental value of clutch is nominal.

Example:

Assume 50 clutch opportunities for a player with 10 additional clutch hits on his card (50*10/216 = 2.3, so roughly 2 to 3 additional hits on the season).

When evaluating the offensive potential of players, the opposition defense is N/A, since it is built into your baseline assumptions for the average pitcher card (I work backwards from the Avg., OBP and Slg I want) that you combine with your hitter. Effectively this is a constant for all hitters, so pick the numbers you want, since it impacts all hitters equally.

For the defensive side, take the same baseline team and let's assume everyone is a 1e0 in those numbers, so all X's were outs in creating those totals. Again, since this is a baseline and all defenders will use this constant, we can pick what we want here. I could have reduced my offensive totals to reduce the impact of X chance singles, but there really isn't a reason to.

From the baseline I take the position I want to evaluate and put him on the field for the season. I am working with 2B's at the moment. A 2B has 6/216 X chances over 6,154 PA's, for a projected 171 X chances hit to the 2B for the season.

Take the 171 and filter it through the defender's range and e-rating probabilities to see how many of those 171 chances become hits or errors. Since errors act exactly like base hits, we can treat them as 1B, 2B and 3B. For a 1e26 that is 26.92 hits, which includes 2.38 doubles.

Now change your baseline numbers to convert what were originally outs into the appropriate hits. As a last step, the team still has to record 4111 outs for the season, so we go back to the baseline averages, replace the outs, and include the additional BB, 1B, 2B, 3B and HR that come with them.

After all this we have new team averages and a new RC can be estimated. The increase is the defender's negative value to the team. For a 1e26 the team's runs go up by 26 over the course of a season.

Runs scored has an exponential, not linear, relationship to hits. However, I found that for the X chart results this exponential factor is negligible and the results basically look linear as you put worse and worse defenders on the field.

Again, an out is an out under the RC formula, so the effects of groundballs are not differentiated. After the initial work I have done I feel comfortable that these differences are again negligible, but the formulas could be changed to incorporate them (i.e. estimate the number of DP opportunities from the 171 X chances and, for each DP completed, remove a single from your baseline team and convert it to an out to offset the singles added for the defender).

Go over and take a look at some of the defensive stuff I have put up on the All Time great board for a little more information. A lot is still a work in progress, although I have converted all of the 2B chart to find out what combinations of range and error ratings are equivalent. I haven't posted those results yet.

I am always interested in thoughts on how to fine tune my calculations.

Bbrool

BbroolAll-Star
Posted February 09, 2003 02:14 AM
As I think about my defensive evaluations, I think I should factor in the gb(A) for the X charts. The only question becomes how often a DP opportunity presents itself. I was thinking that roughly 25% of the time a batter comes up with someone on 1B, and then 2/3 of that number, since you can't have a DP with two outs already. This is approximately 17%, so I think I will round up and start with 20%.

So of the 171 X chances a 2B gets, 34 will be DP opportunities, and 80% of those will be double plays for a 1 (range rating), before error chances.
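
The arithmetic in this estimate can be laid out step by step. This is just a sketch of the numbers quoted in the posts (6/216 X chances, 6,154 PA's, the 25% and 2/3 guesses, the rounding up to 20%, and the 80% conversion); the variable names are illustrative.

```python
x_chances = 6 / 216 * 6154              # X chances hit to the 2B over a season
dp_share_estimate = 0.25 * (2 / 3)      # runner on first, fewer than two outs
dp_share_used = 0.20                    # the post rounds the estimate up to 20%
dp_opportunities = x_chances * dp_share_used
double_plays = dp_opportunities * 0.80  # range-1 defender, before error chances

print(round(x_chances), round(dp_share_estimate, 3),
      round(dp_opportunities), round(double_plays))
```

On these assumptions a range-1 second baseman would turn roughly 27 double plays from X chances over the season.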

When subtracting offensive numbers for DP's, I think I will reduce walks for DP's turned, since a double play only removes a base runner; it doesn't undo all the runner activity that occurred when the original single was hit.

Oh well, I need to think this one through a bit more, but clearly DP's from the X chances impact things a bit more than I originally considered.

Bbrool
 
luckymanAll-Star
Posted February 09, 2003 02:45 AM
Could you give us an example of how your system works with one player?
 
Mike_JAll-Star
Posted February 09, 2003 03:16 AM
Bbrool, I've done exact fielding percentage ratings for every position. For example, the 2b chart is below: the top row is the range rating and the number on the left is the e-rating. The 5's are missing because no one is silly enough to play a 5 at 2b, and the column won't fit in my cut and paste. The numbers are percentages, so 97.57 is actually .9757. This chart ignores extra base hits, but it does give you an exact percentage of X chance conversion.

e\range 1 2 3 4
4 97.57 87.57 77.57 67.57
5 96.99 86.99 76.99 66.99
6 96.41 86.41 76.41 66.41
8 95.14 85.14 75.14 65.14
10 93.98 83.98 73.98 63.98
11 93.29 83.29 73.29 63.29
12 92.71 82.71 72.71 62.71
13 92.13 82.13 72.13 62.13
14 91.55 81.55 71.55 61.55
15 90.86 80.86 70.86 60.86
16 90.28 80.28 70.28 60.28
17 89.70 79.70 69.70 59.70
18 89.12 79.12 69.12 59.12
19 88.54 78.54 68.54 58.54
20 87.85 77.85 67.85 57.85
21 87.27 77.27 67.27 57.27
22 86.69 76.69 66.69 56.69
23 86.11 76.11 66.11 56.11
24 85.42 75.42 65.42 55.42
25 84.84 74.84 64.84 54.84
26 84.26 74.26 64.26 54.26
27 83.68 73.68 63.68 53.68
28 82.99 72.99 62.99 52.99
29 82.41 72.41 62.41 52.41
30 81.83 71.83 61.83 51.83
32 80.56 70.56 60.56 50.56
34 79.40 69.40 59.40 49.40
37 77.55 67.55 57.55 47.55
39 76.39 66.39 56.39 46.39
41 75.12 65.12 55.12 45.12
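
One way to use the chart, sketched below, is to turn a conversion percentage into the number of X chances that are not converted into outs. The 171 X chances figure is Bbrool's season estimate for a 2B from earlier in the thread; only a few rows of the chart are copied in, and the names are illustrative.

```python
# A few entries of the 2b chart above: conversion % keyed by (e-rating, range)
CONVERSION_PCT = {
    (4, 1): 97.57,
    (10, 1): 93.98,
    (26, 1): 84.26,
    (26, 2): 74.26,
}

def plays_not_made(x_chances, e_rating, range_rating):
    """X chances that end up as hits or errors instead of outs."""
    pct = CONVERSION_PCT[(e_rating, range_rating)]
    return x_chances * (1 - pct / 100)

# A 1e26 second baseman over 171 X chances
print(round(plays_not_made(171, 26, 1), 1))
```

For a 1e26 this gives about 26.9, in line with the 26.92 hits Bbrool quoted for the same rating.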

And luckyman, although I do use the basic runs created as a starting point, I then take that number and plug it into an RC per 27 outs formula, which takes into account the "out efficiency" of a player. So in my ratings, Durazo ranks fairly high. But he ranks higher still with the linear weights formula, and the problem I see with him is that his total number of hits is so small that it's a legitimate question whether he really is a tremendously effective offensive player.

Durazo ranks above Guerrero in your linear weights formula, despite the fact that Guerrero has an OB only a smidge below Durazo's and over 20 more points of TB's. Durazo wins in a landslide in terms of having fewer gb(A)'s and better clutch, but the question is, is he really a better offensive player than Guerrero against RHP? Considering that Guerrero has an OB lower by only 2.6 chances, has over 20 more TB's, has over 20 more hits, is a much better base runner, and is a serious steal threat, this is hard to swallow. I'm not dogging your formula, because I think it's based on great concepts, but I also think it has flaws that are likely to show up in the ratings of guys who have huge walk and HR totals with almost nothing else, Durazo and Delgado being prime examples of this for the 2003 cards.
 
luckymanAll-Star
Posted February 09, 2003 04:36 AM
I have to say: this is a great example. And presented the way you do, I look bad.

But you have illustrated what I call the Strat-O-Matic fallacy, which is that a huge total bases difference means a huge slugging difference.

So let me ask the question from a different angle. Suppose you have the choice between player A and player B. Who will you choose?

Player A onb = .493, slugging = .828
Player B onb = .469, slugging = .822

Player A has better on-base AND better slugging, hard not to choose him, right?

Well, Player A's stats are what's on Durazo's offensive card in a neutral stadium, and Player B's stats are what's on Guerrero's offensive card.

Proof that slugging % is better for Durazo:
Durazo (45.3 + 16)/(108-34) = .828
Guerrero (65.4 + 16)/(108-9) = .822
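
The same comparison as a small sketch. The TB chances and walk/HBP chances are the ones in the two lines above; the post does not explain where the extra 16 comes from, so it is passed through unchanged as `extra_tb`. The function name is illustrative.

```python
def card_slugging(tb_chances, walk_hbp_chances, extra_tb=16, total_chances=108):
    """Slugging from card chances: total bases over the chances that count as at-bats.

    extra_tb is the 16 added in the post; its source is not stated there.
    """
    at_bat_chances = total_chances - walk_hbp_chances
    return (tb_chances + extra_tb) / at_bat_chances

durazo = card_slugging(45.3, 34)
guerrero = card_slugging(65.4, 9)
print(round(durazo, 3), round(guerrero, 3))
```

The point of the example is the denominator: Durazo's 34 walk chances leave far fewer at-bat chances, so fewer total bases still produce a higher slugging percentage.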

And Strat here only replicates what was happening in the real world, since Durazo's slugging was BETTER than Guerrero's vs rhp (.631 vs .616 for slugging, 1.061 vs 1.026 for OPS).

And don't forget that Guerrero had 40 SB and 20 CS last year, so he had virtually no positive value in terms of stealing (although he does have a star, which is a positive value).

[This message was edited by luckyman on February 09, 2003 at 05:09 AM.]
 

pt4

PostPosted: Tue Nov 21, 2006 4:44 pm
by cummings2
BbroolAll-Star
Posted February 09, 2003 05:03 AM
MikeJ,

I will go back and compare my 2b numbers to see if we are on the same page, but you will need to compute a slg pct for each fielding rating as well, since TB's factor into the RC formula.

Ok, here is an example of exactly what I do, using J. Dimaggio playing in a 10 singles, 10 HR park (my spreadsheets are set up to allow me to change the parks, but let's just go with a neutral park).

Step 1- Compute the hitter's card:

BP singles and HR's have already been converted to outs and base hits.

I use a 30/70 split to get a weighted average card, which looks as follows:

1B-27.54
2B-7.66
3B-1.33
HR-11.62
BB-11
Outs-48.87
PA's-108

Step 2- Combine the hitter's card with the average pitcher's card:

I used the following card (you are free to use other card results, since this is a constant for all hitters):

1B-15.95
2B-2
3B-.2
HR-.8
BB-8
Outs-81.05
PA's-108

So Dimaggio's projected results on the year for 216 PA's are the sum of both as follows:

1B-43.49
2B-9.66
3B-1.53
HR-12.42
BB-19
Outs-129.92
PA's-216

Step 3- Project Dimaggio's numbers over a full season

You must estimate how many PA's Dimaggio will get on the season. If you are just trying to evaluate who has the best card, then keep this number constant. In reality injuries are factored in here and you should use actual estimated PA's (I have other formulas to compute this based on injury risk, but that is another topic; let's just evaluate the card here).

I will assume 650 PA's for Dimaggio over the season so his numbers become:

1B-131
2B-29
3B-5
HR-37
BB-57
Outs-391
PA's-650

Step 4- Remove 1/9th of the baseline team averages and replace with the hitter's numbers

The baseline numbers I used are in an earlier post in this thread. Multiply all of those by 8/9 and then add in Dimaggio's 650 PA's to get the following results:

1B-1,008
2B-235
3B-49
HR-203
BB-580
Outs-4045
PA's-6,120

Step 5- Normalize outs

Outs should be a constant over a season regardless of who is in the lineup. Therefore, because Dimaggio is so good in his slot, the rest of the team will also bat more.

We have to replace 66 outs using our average players, which gives the following results:

1B-16
2B-4
3B-1
HR-3
BB-9
Outs-66
PA's-98

Once we add in these PA's we have our final team totals with Joe Dimaggio replacing average PA's. The final numbers are:

1B-1,024
2B-239
3B-49
HR-206
BB-589
Outs-4111
PA's-6,219
Avg-.270
OBP-.339
Slg-.439
TB's-2,473

Step 6- Compute runs created

RC = OBP*TB, so .339*2,473 = 838

So a team with all average players scored 779 runs. When I replace PA's with Joe Dimaggio the team's runs go up to 838, so the value of Dimaggio to the average team is 838-779 = 59 runs created.
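
Steps 3 through 6 can be sketched in a few lines. This is not Bbrool's spreadsheet, only an illustration under stated assumptions: baseline singles are taken as hits minus extra-base hits, total bases are derived from the hit types, OBP is (hits + walks) / PA, and nothing is rounded between steps. All names are made up for the example.

```python
EVENTS = ["1B", "2B", "3B", "HR", "BB", "Outs"]
TB_WEIGHT = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}

# Baseline team from the earlier post (singles = hits - 2B - 3B - HR)
BASELINE = {"1B": 987, "2B": 232, "3B": 49, "HR": 187, "BB": 588, "Outs": 4111}
# Step 2 result from the post: hitter card plus average pitcher card (216 chances)
DIMAGGIO = {"1B": 43.49, "2B": 9.66, "3B": 1.53, "HR": 12.42, "BB": 19, "Outs": 129.92}

def scale(line, factor):
    return {k: line[k] * factor for k in EVENTS}

def add(a, b):
    return {k: a[k] + b[k] for k in EVENTS}

def runs_created(line):
    """RC = OBP * TB, with OBP = (hits + walks) / plate appearances."""
    on_base = sum(line[k] for k in EVENTS if k != "Outs")
    total_bases = sum(line[k] * w for k, w in TB_WEIGHT.items())
    return on_base / sum(line.values()) * total_bases

def incremental_value(card, season_pa=650):
    player = scale(card, season_pa / sum(card.values()))      # step 3
    team = add(scale(BASELINE, 8 / 9), player)                # step 4
    missing_outs = BASELINE["Outs"] - team["Outs"]            # step 5
    team = add(team, scale(BASELINE, missing_outs / BASELINE["Outs"]))
    team_rc = runs_created(team)                              # step 6
    return team_rc, team_rc - runs_created(BASELINE)

team_rc, increment = incremental_value(DIMAGGIO)
print(round(team_rc), round(increment))
```

Because this version skips the intermediate rounding, it lands within a run or two of the 838 and 59 in the post rather than matching them exactly.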

Hope that makes some sense and helps,

Bbrool
 
Mike_JAll-Star
Posted February 09, 2003 11:35 AM
Lucky, I'm not trying to make you look bad- far from it, because I agree with almost everything you've posted. And you are absolutely right that Durazo had a better OB+SLG against righties than Guerrero (and isn't it amazing how SOM manages to reflect performance so well numerically- why the game is so fascinating).

What I am questioning are certain sabermetric concepts/values surrounding walks in relation to slugging. Doing a brief search around the internet, I find a 2002 Bill James RC formula that adds walks and HBP to TB's, but multiplies by .24. What this says to me is that, in terms of the slugging aspect of RC, a walk is only worth 1/4th of a single. At least according to Bill James's 2 millionth variation of the RC formula.

I guess the question is whether we should look at the cumulative impact of TB's on a team's offense, or at a "percentage" impact. In other words, Erubiel Durazo, over the same number of plate appearances versus righties, will have a better slugging % than someone like Guerrero or ARod. But his TB numbers will be far, far lower, meaning that the actual impact of his ability to advance runners, supposedly measured by SLG%, will in practical application be far smaller, unless a huge percentage of his walks move guys over a base.

I just think that high-power, high-walk guys are iffy when it comes to sabermetrics. Looking at last year's team hitting stats, Bonds and the Giants leap out. Bonds, clearly, inflates the team OB and slugging for the Giants. But when you look at team runs scored, the Giants, with a higher team OB and SLG, finish far below the free-swinging, anti-sabermetric-concept Angels.

I'm stepping out of this thread now, because with all the formulas here, my head is spinning.
 
BbroolAll-Star
Posted February 09, 2003 12:08 PM
A good site to check is stathead.com. There are articles on this site that say exactly what you're suggesting, MikeJ: that the RC formulas do not work well for superstar performers, which is why I use a replacement value as suggested by this site.

I may go back and complicate my RC formula a bit to account for a few things, but I think the basic formula does a pretty good job of estimating team runs, which is what I am trying to accomplish with replacement value.

Bbrool

luckymanAll-Star
Posted February 10, 2003 12:17 AM
No prob, Mike J, I was just joking!!!

You know, Mike J, you are with the majority in this forum. Everyone, from JoetheJet to qksilver, has told me at some point that I overestimate walks. And as you rightly pointed out, this difference in point of view only reflects the opposition that exists between, on one hand, the fans and journalists who really know the game, and on the other hand, sabermetricians, who have consistently given more importance to walks.

About James' formula, with all respect, I believe you misunderstood its logic when you concluded that the formula implies walks are worth 1/4th of a single. The logic of the formula (which I pasted below) is to multiply A X B, A being on-base and B representing total bases. So basically he multiplies onb X TB. But, according to James, when you do onb X TB, you UNDERESTIMATE the value of walks. That is why he adds .24*(W+HBP) to the B term, in order to compensate for this underestimation (with term A giving the same value to walks and hits).

So James is no different from the other sabermetricians in showing again that walks are underestimated in the traditional formula.

But obviously this doesn't mean at all that walks are more important than singles. The lesson is rather that on-base is the most important variable for offense. The Angels were sixth in on-base, hence they finished fifth in runs scored.


RUNS CREATED = ((A+2.4*C)*(B+3*C))/(9*C) -0.9*C

A=Hits+Walks+Hit by Pitch-Caught Stealing-GIDP
B=Total Bases+.26*(Walks+Hit by Pitch)+.53*(Sacrifice Flies+Sacrifice Hits)+.64*(Stolen Bases)-.03*(Strikeouts)
C=At Bats+Walks+Hit by Pitch+Sacrifice Flies+Sacrifice Hits
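
For readers who want to try the formula, here is a direct transcription of the A, B and C terms exactly as quoted above. The season line used in the example is hypothetical, invented only to exercise the function, and the function name is illustrative.

```python
def runs_created_james(h, bb, hbp, cs, gidp, tb, sf, sh, sb, so, ab):
    """Runs created using the A, B and C terms exactly as quoted in the post."""
    a = h + bb + hbp - cs - gidp
    b = tb + 0.26 * (bb + hbp) + 0.53 * (sf + sh) + 0.64 * sb - 0.03 * so
    c = ab + bb + hbp + sf + sh
    return ((a + 2.4 * c) * (b + 3 * c)) / (9 * c) - 0.9 * c

# Hypothetical season line, not taken from the thread
rc = runs_created_james(h=150, bb=60, hbp=5, cs=3, gidp=10, tb=250,
                        sf=5, sh=0, sb=10, so=80, ab=500)
print(round(rc, 1))
```

Note that walks appear in both A and B, which is the point luckyman makes above: the extra walk term in B adds to their value rather than capping it.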

Thanks, Bbrool, for your example. Now that I understand it, I believe your system is a very legitimate and well-designed way to measure the performance of a player.

Coming back to your example, I prefer to say that Dimaggio contributes 59 more RC than a regular player. A regular player is worth 86.56 RC (779/9). So Dimaggio is worth 145.6 RC. If you apply the linear weights formula, you get that Dimaggio is worth 145.5 RC, showing that the two systems are probably nearly equivalent.
 
BbroolAll-Star
Posted February 10, 2003 12:29 AM
Luckyman,

I am onboard with you about walks. I continually try to explain to friends that OBP is the most important factor in offense, but they keep wanting to build models that emphasize TB's.

I thought about swapping my RC formula for the more detailed version you cited, but I think on an average team basis the original OBP*TB still does a good job of estimating runs scored, yet keeps the computations simple enough for me to work with.

Using the RC formula I have also estimated ERA's for almost every pitcher in the All Time great league. I still have a lot of work to do to get the offensive side pulled together, though. I am always up for talking stats. I need to sit down with the 2002 disk and drop some formulas in so I will be ready for next season as well.

Bbrool
 
MajorDeeganProfessional
Posted February 25, 2003 11:52 PM
Luckyman,

I just have a couple of questions:

1. How do you account for BP effect HRs in your ballpark adjustment when dealing with Switch hitters?

2. Do you take into account injury risk? I know it isn't listed on the ratings disk, but it is an important consideration when gauging overall player value. After all, RC=0 when the guy isn't playing.

Just curious.

F-C
 
luckymanAll-Star
Posted February 26, 2003 12:12 AM
1. How do you account for BP effect HRs in your ballpark adjustment when dealing with Switch hitters?

Very easy. I remind you that my basic formula for offense, adjusted for ballparks, is:

basic RC = 0.6*W + ONB + TB + (k/4)*BP

where BP is the number of chances of ballpark homeruns
and k is the homerun rating for the stadium given the side of the hitter.

Example, In Miller:
k=17 for lefties
k=11 for righties

So Bernie Williams vs rhp has an adjustment of:
(17/4)*7 = 29.75 RC

And vs lhp:
(11/4)*2= 5.5 RC
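
The ballpark term is easy to sketch in code. The k values for Miller and Bernie Williams' BP chances are taken from the post; the function and variable names are illustrative.

```python
def ballpark_adjustment(k, bp_chances):
    """Ballpark homerun term of the basic formula: (k / 4) * BP."""
    return k / 4 * bp_chances

def basic_rc(walks, onb, tb, k, bp_chances):
    """basic RC = 0.6*W + ONB + TB + (k/4)*BP, as given in the post."""
    return 0.6 * walks + onb + tb + ballpark_adjustment(k, bp_chances)

# Bernie Williams in Miller, using the k values and BP chances from the post
MILLER_K = {"L": 17, "R": 11}
vs_rhp = ballpark_adjustment(MILLER_K["L"], 7)   # a switch hitter bats left vs rhp
vs_lhp = ballpark_adjustment(MILLER_K["R"], 2)   # and right vs lhp
print(vs_rhp, vs_lhp)
```

For a switch hitter the only extra step is picking the park's k for the side he bats from against each kind of pitcher.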


2. Do you take into account injury risk? I know it isn't listed on the ratings disk, but it is an important consideration when gauging overall player value. After all, RC=0 when the guy isn't playing.

Yeah, but it's kind of approximate. By playing, I realized that guys like Griffey and Drew (400 AB with injury on 5 or 9) would play around 120 games and miss 40, so basically they miss 10 games X chances of injury.

Hence for Griffey, I assume that his value is:

75% X his value + 25% value of a replacement player.

Guys with over 600 PA, I assume no injury.
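
The injury adjustment is a weighted average of the player and his replacement. A minimal sketch follows; the 120-of-160 games share comes from the post, while the two RC values are hypothetical placeholders rather than numbers from the thread.

```python
def injury_adjusted_value(player_rc, replacement_rc, share_played):
    """Blend a player's value with his replacement's by the share of games played."""
    return share_played * player_rc + (1 - share_played) * replacement_rc

share = 120 / 160   # about 120 games played out of 160, as in the post
# The two RC values below are hypothetical, not taken from the thread
print(injury_adjusted_value(100, 60, share))
```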
 
luckymanAll-Star
Posted February 26, 2003 12:28 AM
By doing so:

Edmonds 106 RC (No wonder why he's an under
Cameron 101 RC (the surprise one: he's thought of as an underachiever only because people keep looking at his batting average)
Williams 96 RC (but don't forget I didn't factor in arm rating)
Beltran 90 RC
Nixon 83 RC (if he played vs lhp, but this inflates to 94 when he is paired with Collier)
Hunter 82 RC
AJones 80 RC
Griffey 79 RC (injury hurts, but he still passes Hunter and Jones in the right ballpark)
Wilson 72 RC (easily in my top 10, but never gets drafted nor selected in a 12-team league; go figure)
Kotsay 69 RC
 
MajorDeeganProfessional
Posted February 26, 2003 09:30 PM
Thanks Lucky...I got a brain cramp reading this thread, which is why I dumbed out with the switch hitter question. I like the way you approximate injury risk.

My final question deals with 2 things: catcher ratings and throwing arms in the outfield. I'm pretty much a spaz when it comes to math, so I have no clue how to incorporate either. When I'm selecting positions such as C or CF, I try to grab a cannon for an arm and a 1 or 2 range. My question really is: are catcher ratings and throwing arm ratings statistically significant when determining player value?
 
PJ AxelssonAll-Star
Posted February 26, 2003 10:53 PM
Please don't take offense to this, I am interested in the math and admittedly stopped reading this thread part of the way through, but have the world series champions for the past so many years panned out with respect to all this sabermetric analysis?

Anaheim? No way were they favored mathematically to win. How about the Yankees?

I apologize for jumping in on your thread, feel free to ignore me, but although my instinct is to ignore the new math stuff, I am very curious to hear whether it is an indication of past performance or an accurate predictor of future performance.

PostPosted: Tue Nov 21, 2006 5:08 pm
by cummings2
well. that's all I got. hope it works for some of you.

PostPosted: Tue Nov 21, 2006 7:50 pm
by wahlerpc
thanks for the post Cummings. I've been an avid fantasy guy for a while, but this is my first shot at the strat-o-matic game...I'm sure this'll help.