Graphing is Phun; 08/28/09 Predictive Nature of WAR
The baseball Pythagorean Theorem (located below) is one of the first formulas or statistical measures that any bright-eyed new SABR’ist is bound to find their way to. As almost anyone reading this blog would know, the formula predicts team winning percentage based on the runs scored and runs allowed by a given team. By studying the things the team as a whole can control, runs scored and runs allowed, almost any analyst with a calculator can quickly decide which teams have been lucky and unlucky, in a simplistic sense, by comparing their predicted winning percentage with their actual winning percentage.
(Runs Scored)^2 / [(Runs Scored)^2 + (Runs Allowed)^2] = Predicted Win %
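For anyone who would rather not reach for the calculator, here is a minimal Python sketch of the formula above. The function names (`pythagorean_win_pct`, `luck`) and the `exponent` parameter are my own labels for illustration, and the team in the usage lines is hypothetical, not a real club.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Predicted Win % = RS^exp / (RS^exp + RA^exp), with exp = 2 as in the post."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def luck(actual_wins, games_played, runs_scored, runs_allowed):
    """Actual Win % minus predicted Win %; positive means 'lucky' in the simplistic sense."""
    actual_pct = actual_wins / games_played
    return actual_pct - pythagorean_win_pct(runs_scored, runs_allowed)


# Hypothetical team: 800 runs scored, 700 allowed, 95 wins in 162 games.
predicted = pythagorean_win_pct(800, 700)
print(f"Predicted Win %: {predicted:.3f}")
print(f"Luck: {luck(95, 162, 800, 700):+.3f}")
```

A team that scores exactly as many runs as it allows comes out at .500, which is a quick sanity check on the formula.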
The graph above compares those two things (Win % vs. Predicted Win %) in a basic scatter chart of every team over the last 7 full seasons (2002-2008). As expected, the square of the correlation coefficient, .8778, gives us assurance that the Pythagorean formula of baseball is in fact a useful and well-formed predictive tool for team success.
At the present time WAR is considered the most telling statistic of individual performance available to the general public. However, it seems that as SABR-guided baseball fans we typically only use WAR to discuss individual performance: compare one player to another, debate postseason awards, argue who was greater, Clemente or Robinson, etc. Yet WAR in its truest sense is a measure of wins above a replacement player, and wins, as we know, are a team accomplishment. So it would seem plausible that by studying the total WAR of every team over the course of a season we should be able to accurately predict the Win/Loss record and the Run Differential of said team. The following graphs are an attempt at doing just that. As was the case above, the final winning percentage of all 30 teams over the course of 7 seasons (2002-2008) will be plotted against the corresponding total team WAR for that team in that season. We will also look at total team WAR against the predicted winning percentage from the baseball Pythagorean formula.
As you will notice when studying the first chart, the square of the correlation coefficient does not come in at the same level as on the Winning Percentage v. Predicted Winning Percentage plot. However, at .77 it still gives satisfactory assurance of the predictive nature of WAR in terms of team winning percentage. The second chart produces an R^2 of .8325, an even better indicator that team WAR can predict the run differential of a team over the course of a season.
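For readers who want to run this kind of check themselves, here is a rough sketch of how the square of the correlation coefficient can be computed for two paired lists, such as total team WAR and winning percentage. The function name `r_squared` is mine, and the numbers in the usage lines are made up purely to show the call; they are not the 2002-2008 data behind the charts.

```python
def r_squared(xs, ys):
    """Square of the Pearson correlation coefficient between two paired samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample with no variation has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Made-up illustrative values, NOT real team data.
team_war = [25.0, 33.5, 41.0, 48.5, 52.0]
win_pct = [0.420, 0.475, 0.510, 0.560, 0.590]
print(f"R^2 = {r_squared(team_war, win_pct):.4f}")
```

For a simple one-variable linear fit like the trend lines on these charts, this value is the same R^2 a spreadsheet reports.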
What exactly does comparing the sum of all individual achievements to overall team success actually tell us? To be honest, I do not know exactly. However, it seems to reassure those baseball analysts who hold the view that baseball is an individual sport parading itself as a team sport. As the statistical noise (especially in the defensive metrics) is refined and hopefully someday removed, this type of analysis should continue to grow in strength.
Like I said in here, the results aren’t great, but with the RD way of looking at a team’s success being a pretty widely accepted way of doing things, and the WAR v. winning % having a pretty close R^2 over 7 years’ worth of data, I think it’s safe to say that teams with the guys doing the most individually are going to succeed more often than not over those teams that play as a “team”. So to sum it up in a few words: screw camaraderie, give me talent and production in baseball. In the end this might seem like a very basic concept, that good individual performance ends in good team results. But is it really? Further work needs to be done on this subject. Weighting pitching and position player WAR, and figuring out which plays a larger role in overall team success, could be incredibly useful in deducing which teams are bound to flourish or fail going forward.