Pythagorean Expectation
Pythagorean expectation is a formula invented by Bill James to estimate how many games a baseball team "should" have won based on the number of runs
it scored and allowed. Comparing a team's actual and Pythagorean
winning percentages can be used to evaluate how "lucky" or "clutch" that
team was. The name comes from the formula's resemblance to the Pythagorean theorem.
The basic formula is:

$$\text{Win\%} = \frac{(\text{runs scored})^2}{(\text{runs scored})^2 + (\text{runs allowed})^2}$$
where Win% is the winning percentage
generated by the formula. The expected number of wins would be the
expected winning percentage multiplied by the number of games played.
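The calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the squared-runs form of the formula. The function names and the example team (800 runs scored, 700 allowed, 90 actual wins in 162 games) are hypothetical and chosen only for demonstration.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    scored_pow = runs_scored ** exponent
    allowed_pow = runs_allowed ** exponent
    return scored_pow / (scored_pow + allowed_pow)


def expected_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins: expected winning percentage times games played."""
    return pythagorean_win_pct(runs_scored, runs_allowed, exponent) * games


# Hypothetical season, not real data.
pct = pythagorean_win_pct(800, 700)
wins = expected_wins(800, 700, 162)
# Positive means the team won more games than its run totals suggest.
wins_above_expectation = 90 - wins

print(f"expected win%: {pct:.3f}")
print(f"expected wins: {wins:.1f}")
print(f"actual minus expected wins: {wins_above_expectation:+.1f}")
```

The `exponent` parameter defaults to 2 but is left adjustable, since other exponents can be substituted without changing the rest of the calculation.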
Win Shares is a separate metric that James also invented. It considers statistics for players, in the context of their team, and assigns a single number to each player for his contributions for the year. All pitching, hitting and defensive contributions by the player are taken into account. Statistics are adjusted for park, league and era.
Empirical origin
Empirically, this formula correlates fairly well with how baseball
teams actually perform, although an exponent of 1.81 is slightly more
accurate. This correlation is one justification for using runs
as a unit of measurement for player performance. Efforts have been made
to find the ideal exponent for the formula, the most widely known being
the Pythagenport formula,[1] developed by Clay Davenport of Baseball Prospectus, which sets the exponent to 1.5 log((r + ra)/g) + 0.45, and the less well known but equally (if not more) effective Pythagenpat formula, developed by David Smyth,[2] which sets it to ((r + ra)/g)^0.287. Here r and ra are runs scored and runs allowed, and g is games played. Davenport expressed his support for the latter of the two, saying:
After further review, I (Clay) have come to the conclusion that the
so-called Smyth/Patriot method, aka Pythagenpat, is a better fit. In
that, X=((rs+ra)/g)^.285, although there is some wiggle room for
disagreement in the exponent. Anyway, that equation is simpler, more
elegant, and gets the better answer over a wider range of runs scored
than Pythagenport, including the mandatory value of 1 at 1 rpg.[3]
These formulas are only necessary in extreme
situations, in which the average number of runs scored per game is
either very high or very low. For most situations, simply squaring each
variable yields accurate results.
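The two exponent formulas can be compared with a short sketch. It assumes that "log" in Pythagenport means the base-10 logarithm (the text above does not state the base) and uses the 0.287 power quoted for Pythagenpat. The function names and run totals are hypothetical.

```python
import math


def pythagenport_exponent(runs_scored, runs_allowed, games):
    """Pythagenport exponent: 1.5 * log10(runs per game) + 0.45.

    Base-10 log is an assumption of this sketch.
    """
    runs_per_game = (runs_scored + runs_allowed) / games
    return 1.5 * math.log10(runs_per_game) + 0.45


def pythagenpat_exponent(runs_scored, runs_allowed, games, power=0.287):
    """Pythagenpat exponent: (runs per game) raised to a small power."""
    runs_per_game = (runs_scored + runs_allowed) / games
    return runs_per_game ** power


def win_pct(runs_scored, runs_allowed, exponent):
    """Pythagorean winning percentage with an arbitrary exponent."""
    scored_pow = runs_scored ** exponent
    allowed_pow = runs_allowed ** exponent
    return scored_pow / (scored_pow + allowed_pow)


# Hypothetical team in a typical run environment (about 9.3 runs per game).
port = pythagenport_exponent(800, 700, 162)
pat = pythagenpat_exponent(800, 700, 162)
print(f"Pythagenport exponent: {port:.3f}")
print(f"Pythagenpat exponent:  {pat:.3f}")
print(f"win% with Pythagenpat: {win_pct(800, 700, pat):.3f}")

# Extreme environment: exactly 1 run per game in total.
print(pythagenpat_exponent(81, 81, 162))   # exponent is 1 at 1 run per game
print(pythagenport_exponent(81, 81, 162))  # log10(1) = 0, so only the 0.45 remains
```

In an ordinary run environment both exponents land a little below 2, which is why plain squaring works well there. The two methods separate at the extremes: only Pythagenpat returns the value of 1 at 1 run per game that Davenport's remark calls mandatory.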
There are some systematic statistical deviations between actual
winning percentage and expected winning percentage, with bullpen quality and luck among the contributing factors. In addition, the formula's estimates tend to regress toward the mean:
teams that win a lot of games tend to be underrepresented by the
formula (meaning they "should" have won fewer games), and teams that
lose a lot of games tend to be overrepresented (they "should" have won
more).
"Second-order" and "third-order" wins
In their Adjusted Standings Report, Baseball Prospectus
refers to different "orders" of wins for a team. The basic order of
wins is simply the number of games they have won. However, because a
team's record may not reflect its true talent due to luck, different
measures of a team's talent were developed.
First-order wins, based on pure run differential, are the number of
expected wins generated by the Pythagenport formula (see above). In
addition, to further filter out the distortions of luck, sabermetricians can also calculate a team's expected runs scored and allowed via a runs created-type equation (the most accurate at the team level being Base Runs).
These formulas yield the team's expected number of runs given its
total singles, doubles, walks, etc., which helps to eliminate the luck
factor of the order in which the team's hits and walks came within an
inning.
By plugging these expected runs scored and allowed into the
Pythagorean formula, one can generate second-order wins, the number of
wins a team deserves based on the number of runs it should have
scored and allowed given its component offensive and defensive
statistics. Third-order wins are second-order wins that have been
adjusted for strength of schedule (the quality of the opponents'
pitching and hitting). Second- and third-order winning percentages have
been shown to predict future actual team winning percentage better than
both actual winning percentage and first-order winning percentage.
Theoretical explanation
Initially the correlation between the formula and actual winning
percentage was simply an experimental observation; however, Professor
Steven J. Miller provided a statistical derivation of the formula under some assumptions about baseball games: if runs for each team follow a Weibull distribution and the runs scored and allowed per game are statistically independent, then the formula gives the probability of winning.[4]
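The derivation itself is not reproduced here, but a simplified version of the claim can be checked numerically. The sketch below assumes each team's runs are independent Weibull draws sharing one shape parameter, treated as continuous values with no shift or rounding. These are simplifications made for this example and are not claimed to match the details of Miller's model. Under them, the probability that one draw exceeds the other has the Pythagorean form, with the scale parameters in place of run totals and the shape parameter as the exponent. All names and parameter values are hypothetical.

```python
import random


def simulated_win_prob(scale_for, scale_against, shape, trials=200_000, seed=0):
    """Fraction of simulated games in which 'runs scored' exceeds 'runs allowed'."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
        scored = rng.weibullvariate(scale_for, shape)
        allowed = rng.weibullvariate(scale_against, shape)
        if scored > allowed:
            wins += 1
    return wins / trials


def pythagorean_form(scale_for, scale_against, shape):
    """Closed form for P(scored > allowed) under the assumptions above."""
    for_pow = scale_for ** shape
    against_pow = scale_against ** shape
    return for_pow / (for_pow + against_pow)


# Hypothetical parameters: scales of 5.0 and 4.5, shared shape of 1.8.
simulated = simulated_win_prob(5.0, 4.5, 1.8)
closed = pythagorean_form(5.0, 4.5, 1.8)
print(f"simulated: {simulated:.4f}")
print(f"formula:   {closed:.4f}")
```

Because the mean of a Weibull distribution is proportional to its scale when the shape is fixed, the same ratio can be written in terms of average runs scored and allowed, which is how the formula is normally applied.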
Use in basketball
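When noted basketball analyst Dean Oliver applied James' Pythagorean theory to his own sport, the result was similar, except for the exponents:

$$\text{Win\%} = \frac{(\text{points for})^{14}}{(\text{points for})^{14} + (\text{points against})^{14}}$$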
Another noted basketball statistician, John Hollinger, uses a similar Pythagorean formula except with 16.5 as the exponent.
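To see what a much larger exponent does, the sketch below applies the same formula to points with Hollinger's exponent of 16.5 and compares it with the baseball-style exponent of 2. The function name and the scoring averages (110 points for, 105 against per game) are hypothetical.

```python
def basketball_win_pct(points_for, points_against, exponent=16.5):
    """Pythagorean winning percentage for points, using the ratio form.

    Written as 1 / (1 + (against / for) ** exponent), which is algebraically
    the same as for**e / (for**e + against**e) but avoids very large powers.
    """
    if points_for <= 0 or points_against < 0:
        raise ValueError("points_for must be positive, points_against non-negative")
    return 1.0 / (1.0 + (points_against / points_for) ** exponent)


# Hypothetical team scoring 110 and allowing 105 points per game.
with_basketball_exponent = basketball_win_pct(110, 105)
with_exponent_two = basketball_win_pct(110, 105, exponent=2.0)
print(f"exponent 16.5: {with_basketball_exponent:.3f}")
print(f"exponent 2:    {with_exponent_two:.3f}")
```

The larger exponent turns a small scoring margin into a much larger edge in expected winning percentage. That suits a sport with many scoring events per game, where a modest per-game margin is less likely to be reversed by chance.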
This article is licensed under the GNU Free Documentation License. It uses material from Wikipedia Encyclopedia article "Pythagorean Expectation"