Archive of SportingNews.com's old SOM Baseball forum

by **dkoulomzin** » Wed Apr 05, 2006 2:22 pm

All,

I've been playing 80s for about a year, and for me the big appeal is guessing the mystery card. A while ago I wrote a little computer program to help me predict what year a guy was in based on statistical analysis. (For those of you so inclined, it uses Bayes's rule to predict the relative likelihood of each year based on the probability that that year yields the results you've gotten for one stat.) But I'm lazy, so I've just been using the guys' actual stats (not the dice charts) to figure out what performance is expected.

Has anyone managed to do anything more sophisticated along these lines? To what end? Do you want to share?

by **Semper Gumby** » Wed Apr 05, 2006 8:45 pm

You may wish to check with Yount and Sean about this area.

I have a crude database that relies on TSN actual results versus actual splits.

However, I only have one season done. Ain't worth anything until I can add a few more seasons (another 6 weeks out).

I recall Yount or Sean had a post in early 2006 about downloading their report.

I suspect they are running the same system as my own.

Otherwise, a few sites have something you can use and it might be on the Baseball Factory site.

by **YountFan** » Wed Apr 05, 2006 9:17 pm

I track the stats in 27 game deltas. You don't need a fancy program to figure out the card. You can get close enough with 27 game deltas. The problem is the dice don't roll in a smooth way and that can skew the stats. Also you need a large enough sample size. BJS has theories on this but the rule of thumb is 100 ab and 50 ip, although that is really not enough to account for hot and cold streaks. 27 games is about 100 ab, so after two deltas you know at least how to get the most from your team, if not the exact card. Knowing the card is not all that important. It is more important to know how to use the stats you are getting. If they are bad for a long time, look for better options, but sometimes good is good enough.
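A minimal sketch of the 27-game delta bookkeeping, assuming you record cumulative AB and H totals at each checkpoint. The function name `delta_lines` and all the numbers are hypothetical, chosen only to illustrate the subtraction.

```python
def delta_lines(checkpoints):
    """Turn cumulative (AB, H) totals at each checkpoint into per-stretch lines.

    Returns a list of (AB, H, average) tuples, one per stretch.
    """
    deltas = []
    prev_ab, prev_h = 0, 0
    for ab, h in checkpoints:
        d_ab, d_h = ab - prev_ab, h - prev_h
        avg = d_h / d_ab if d_ab else 0.0  # guard against an empty stretch
        deltas.append((d_ab, d_h, round(avg, 3)))
        prev_ab, prev_h = ab, h
    return deltas

# Hypothetical cumulative totals after games 27, 54 and 81 (not real results).
cumulative = [(102, 22), (205, 55), (310, 88)]
for d_ab, d_h, avg in delta_lines(cumulative):
    print(d_ab, d_h, f"{avg:.3f}")
```

Looking at each stretch separately makes a cold start followed by a hot stretch visible, where the season-to-date line would blur the two together.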

by **dkoulomzin** » Wed Apr 05, 2006 11:27 pm

Guys, do you just enter the stats every week, or do you have software that gets it for you?

My program takes into account the number of samples you have... e.g., with 1 AB, the program will give say 20.01% odds to one season, and 19.9% to another, etc. With 500 ABs, it gives a much higher confidence guess. Unfortunately, it's guessing based on the actual stats from each of the five candidate years... it's not actually going off of the probability that a certain card produces a certain result in a particular stadium vs the league average pitcher. So it's not perfect... I just use it as a sanity check... when Dale Murphy starts the season batting .225 after 50 ABs, the odds are still ~50% that he's worth keeping.
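For anyone who wants to try the idea, here is a minimal sketch of this kind of Bayes calculation. It is not Dan's actual program. It assumes a uniform prior over the candidate seasons, treats every AB as an independent hit/no-hit trial at that season's real-life batting average (strictly between 0 and 1), and uses made-up averages. The name `year_posteriors` and the numbers are illustrative only.

```python
from math import exp, log

def year_posteriors(hits, at_bats, year_averages):
    """Posterior probability of each candidate year, given hits in at_bats.

    Assumes a uniform prior and a binomial likelihood. The binomial
    coefficient is the same for every year, so it cancels and is omitted.
    """
    log_like = {
        year: hits * log(avg) + (at_bats - hits) * log(1 - avg)
        for year, avg in year_averages.items()
    }
    # Subtract the largest log-likelihood before exponentiating to avoid underflow.
    top = max(log_like.values())
    weights = {year: exp(value - top) for year, value in log_like.items()}
    total = sum(weights.values())
    return {year: weight / total for year, weight in weights.items()}

# Hypothetical batting averages for five candidate seasons (not real card data).
candidates = {"A": 0.281, "B": 0.247, "C": 0.281, "D": 0.302, "E": 0.290}

early = year_posteriors(1, 1, candidates)     # one AB: close to 20% each
late = year_posteriors(150, 500, candidates)  # 500 ABs: the seasons separate
for year in candidates:
    print(year, f"{early[year]:.3f}", f"{late[year]:.3f}")
```

With a single stat, seasons with similar averages stay hard to tell apart even after 500 ABs, which is one reason to combine several stats or to work from the card's dice chart instead.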

But if someone has software to suck down strat results and put them in a spread sheet or something, that would definitely be really useful.

-Dan

by **YountFan** » Thu Apr 06, 2006 5:38 am

No automation here. I like typing them in because it gives you time to think and analyze.

by **yak1407** » Thu Apr 06, 2006 12:51 pm

While a reliable predictor may work most of the time, it's not infallible.
Here's some recent performance for me:
Simulated
Whitaker, L. '85
637 98 23 14 75 69 0 13 .243 .377 .318 .695
Real
609 120 29 21 73 80 6 4 .279 .456 .363 .819
Simulated
Jackson, R. '82
484 85 19 33 75 68 0 1 .234 .486 .330 .816
Real
530 146 17 39 101 85 4 5 .275 .532 .376 .908

There is a margin of +/-40 for BA, but obviously the variation is much greater for SLG and OBP.
However, based on these numbers, I would have guessed it was the '84 Jackson, his worst season.
Whitaker is even more pronounced: looking at his stats, it could have been any of three other seasons than the one it was.
However, there may not be a real reason to know which card you have. Using Whitaker as an example, with the exception of his 1983 .300+ BA, all of his stats fall within the same range.
More and more I think that production, actual production, for your team should be the key consideration.
Here's the question, if an injury revealed to you that you had a player's worst year but his production in a key area was 40 per cent higher than you would expect, would you get rid of him?
If his card is revealed and it's a bad card and he's having a bad year, of course you'll get rid of him.
But otherwise??
Now what should I do with Dave Parker, whose batting and slugging after almost 40 ABs?

by **MARKWEAVER** » Thu Apr 06, 2006 1:26 pm

My problem with doing things like this (and with using any kind of "probability" predictions in *real* baseball) is that at bats are not independent and identically distributed (IID). Well, since this is a dice game, I guess ABs are independent (but they're not in the real game if you believe in streaks), but they're definitely not identically distributed (e.g., facing Clemens is a whole lot different than facing Gale). So, saying something like batting average estimates have a margin of error of +/- 40 is not really accurate since it assumes that ABs are IID.
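For reference, the IID-style calculation being criticized here can be sketched as below. This is one way a figure like +/- 40 points could arise, not necessarily how it was computed in the post above. It assumes every AB is an independent trial with the same hit probability, which is exactly the assumption in question. The name `ba_margin` and the .270 average are illustrative.

```python
from math import sqrt

def ba_margin(avg, at_bats, z=1.96):
    """Approximate 95% margin of error for a batting average.

    Uses the normal approximation to the binomial, so it is only
    meaningful if ABs really are independent and identically distributed.
    """
    return z * sqrt(avg * (1 - avg) / at_bats)

for ab in (50, 100, 500):
    print(ab, f"{ba_margin(0.270, ab):.3f}")
```

Under these assumptions the margin only gets down to about 40 points near 500 ABs, and it is several times wider at 50 ABs.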

That being said, I tend to just use Euclidean distance between the observed stats and the expected stats for each year. It's crude, but it gets it exactly right about 40% of the time and it gets me "close" about 75% of the time (e.g., if a guy's in his second best year, I might guess his best or 3rd best years). I would guess that it probably works about as well as Bayes'. I have only completed one year, but I have had enough injuries on another team that I know almost every batter's year (sigh), and my system is doing much better this time around. Obviously it does better for batters than for pitchers given the preponderance of hitting!
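A minimal sketch of the nearest-year idea, assuming the stats compared are BA, OBP and SLG with no scaling between them. Which stats to include, and whether to weight them, is not stated in the post. The function `closest_year` and the expected lines are hypothetical.

```python
from math import dist

def closest_year(observed, expected_by_year):
    """Return the year whose expected line is nearest to the observed line.

    Both lines are tuples in the same stat order, e.g. (BA, OBP, SLG).
    """
    distances = {
        year: dist(observed, expected)
        for year, expected in expected_by_year.items()
    }
    best = min(distances, key=distances.get)
    return best, distances

# Hypothetical expected (BA, OBP, SLG) lines for three candidate years.
expected = {
    "Year A": (0.289, 0.357, 0.407),
    "Year B": (0.279, 0.362, 0.456),
    "Year C": (0.320, 0.380, 0.457),
}
best, distances = closest_year((0.243, 0.318, 0.377), expected)
print(best)
for year, d in sorted(distances.items(), key=lambda item: item[1]):
    print(year, f"{d:.3f}")
```

Sorting by distance also gives the "close" guesses: the second and third nearest years are the natural fallbacks when the top pick is wrong.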

by **yak1407** » Thu Apr 06, 2006 1:51 pm

Not being a mathematician, I'm not sure about your point about IID.
However, it does not take much to skew statistics.
Take Dave Parker, his aggregate real life average is .300, he gets 3 hits for every 10 ABs, not plate appearance.
Right now, he has four hits in 40 ABs.
In order to get up to .300, he needs 11 hits in his next 10 ABs, a statistical impossibility.
So let's give him 20 ABs to do it. Now he needs 14 hits in 20 ABs.
A more reasonable expectation might be 26 hits in his next 60 ABs, but that means he needs to hit .433 over that period.
If he hits .300 over his next 60 ABs, his aggregate average, he gets 18 hits and has a batting average of .220 after the magical 100 ABs mark and I'm looking at cutting him even though for over half of those ABs he performed right at his median performance.
The only thing that would change my expectations for Parker at that point is if I learned that I had say his '79 card. Then, even though he's only hitting .220, I'm expecting him to bat .310 and will probably stay with him.
But my final decision after 100 ABs will tend to go more with his actual performance.
But what I should really be looking at is his last 10 games and his potential because if the law of averages works out, he'll have some hot streaks to get up to his normal level.
But, as my Lou Whitaker and Reggie Jackson examples show, that doesn't necessarily happen. If I had assessed them on their actual performance, I may have gone looking for added performance from other players.
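The catch-up arithmetic for a slow start like Parker's can be checked with a short sketch. It starts from the 4-for-40 line given in the post and uses a .300 target. The function `hits_needed` is a hypothetical helper, and the target is expressed in thousandths to keep the arithmetic in whole numbers.

```python
def hits_needed(hits, at_bats, more_abs, target_points=300):
    """Hits required over the next more_abs ABs to reach the target average.

    target_points is the average in thousandths (300 means .300).
    """
    total_abs = at_bats + more_abs
    # Ceiling division: smallest hit total that reaches the target.
    required_total = -(-target_points * total_abs // 1000)
    return max(required_total - hits, 0)

for more in (10, 20, 60):
    need = hits_needed(4, 40, more)
    print(f"next {more} ABs: {need} hits, a {need / more:.3f} pace")
```

A required pace above 1.000 means the target is out of reach in that window, however the dice fall.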

by **MARKWEAVER** » Thu Apr 06, 2006 2:11 pm

> **yak1407** wrote:
> Not being a mathematician, I'm not sure about your point about IID.
> However, it does not take much to skew statistics.
> Take Dave Parker, his aggregate real life average is .300, he gets 3 hits for every 10 ABs, not plate appearance.

This is actually a good example for non-IID... you don't really expect Dave Parker to get 3 hits in any particular 10 ABs. For example, you would expect something different if those 10 ABs were against Roger Clemens or Nolan Ryan than you would if they were against Rich Gale or Tim Lollar. A "batting average" is not really a typical straight-up average or percentage (like flipping a fair coin, where you expect heads 50% of the time), it's actually a weighted average, weighted over all of the pitchers that Parker happened to have ABs against that year.
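The weighted-average point can be illustrated with a small sketch. The split between tough and weak pitchers and the per-group averages are invented for the example, and `weighted_average` is a hypothetical helper.

```python
def weighted_average(matchups):
    """Overall batting average from (at_bats, expected_average) pairs.

    Each pair is one group of pitchers; groups with more ABs count for more.
    """
    total_abs = sum(ab for ab, _ in matchups)
    return sum(ab * avg for ab, avg in matchups) / total_abs

# Hypothetical season: 150 ABs against aces, 450 against everyone else.
season = [(150, 0.225), (450, 0.325)]
print(f"{weighted_average(season):.3f}")

# Same hitter, but a stretch where most ABs come against the aces.
tough_stretch = [(30, 0.225), (10, 0.325)]
print(f"{weighted_average(tough_stretch):.3f}")
```

The same hitter has a very different expected average over a stretch when the mix of opposing pitchers changes, so a short slump may say more about the schedule than about the card.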

So, to use your example, if I knew I had Parker's '79 card and he happened to be under-performing through his first 100 ABs, I'd take a look at how many of those ABs came against tough pitchers, especially tough lefties. Bottom line, there is no way in Hell that I'd ever drop Parker's '79 (or '78 or '85) card since future ABs are completely independent of past ABs in this fantasy game.

by **yak1407** » Thu Apr 06, 2006 2:44 pm

Which really comes back to the point about judging how a player is performing for you, not which card you have.
IID then means each at bat is a separate adventure, and a batter has, in Parker's case, a 30 per cent chance of getting a hit off his own card, depending on which year it is, and a whatever chance off the pitcher's card, depending on who the pitcher is.
Since each AB is independent, a batter could hypothetically, over 600 ABs, fail to get a hit or get a hit every time, regardless of who the batter is.
The roll of the dice should balance things out.
But die rolls are random, so if a batter consistently has poor rolls off his card and off the pitcher's card his performance will suffer and vice versa.
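How likely such runs of bad rolls are can be estimated with a short sketch. It assumes every AB is an independent trial with a flat 30 per cent hit chance, ignoring the pitcher's card entirely. That is a simplification, and `prob_at_most` is a hypothetical helper.

```python
from math import comb

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): at most k hits in n independent ABs."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

print(prob_at_most(0, 600, 0.300))  # chance of going hitless over 600 ABs
print(prob_at_most(4, 40, 0.300))   # chance of 4 or fewer hits in 40 ABs
```

A hitless season is possible in principle but vanishingly unlikely, while a cold 40-AB stretch is the sort of thing that turns up now and then across a whole league of players.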

Stats stats stats... any geeks want to share tricks?
