RANK The team's ranking out of all teams considered.
TEAM The team's label.
DIVIA An indicator of whether the team is Division I-A.
RFRANK The team's ranking from Component 1 (see below).
RSCORE1 The team's ranking from Component 2 (see below).
RSCORE The team's ranking from Component 3 (see below).
RATING The composite rating obtained from Components 1 through 3.
I have retooled my ranking scheme somewhat from past versions.
It now consists of three different components that together
measure what I think of as a ranking of the quality of the
season each team has had (a "retrodictive" approach).
The three components are:
Component 1:
The first part of my ranking scheme depends on a formula that is based
on the scheme used to rate chess players. Here is an outline of how I
implement it.
I) Get an initial score for each team (say the winning percentage).
II) Update each team's rating as follows:
For each game played,
a) add (1+delta)*(opponent's rating) for a win, and
b) add (1-delta)*(opponent's rating) for a loss,
where delta is a number between 0 and 1.
III) Iterate step II until the ratings for all of the teams stabilize.
If teams compete in a round robin setting, this method typically produces
rankings that agree with the winning percentages (which is appropriate in
this case as each team has an equivalent schedule).
A benefit of this method is that it may be used with extremely unbalanced
schedules and with any number of teams having at least minimal connections
among them.
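
The iteration in steps I through III can be sketched in Python as follows.
This is only an illustration under stated assumptions, not the actual
implementation: the function name iterate_ratings, the default delta of 0.5,
the stopping tolerance, and the rescaling of the ratings to an average of 1
after every pass (added so the numbers stay bounded) are all hypothetical
choices that the outline above does not specify.

```python
def iterate_ratings(games, delta=0.5, tol=1e-10, max_iter=10_000):
    """Rate teams from a list of (winner, loser) pairs."""
    teams = sorted({team for game in games for team in game})
    wins = dict.fromkeys(teams, 0)
    played = dict.fromkeys(teams, 0)
    for winner, loser in games:
        wins[winner] += 1
        played[winner] += 1
        played[loser] += 1

    def rescale(scores):
        # Assumption: keep the average rating at 1 so values stay bounded.
        mean = sum(scores.values()) / len(scores)
        return {team: s / mean for team, s in scores.items()}

    # Step I: start from each team's winning percentage.
    rating = rescale({team: wins[team] / played[team] for team in teams})

    for _ in range(max_iter):
        # Step II: rebuild every rating from the opponents' current ratings.
        new = dict.fromkeys(teams, 0.0)
        for winner, loser in games:
            new[winner] += (1 + delta) * rating[loser]
            new[loser] += (1 - delta) * rating[winner]
        new = rescale(new)
        # Step III: stop once no rating moves by more than tol.
        if max(abs(new[t] - rating[t]) for t in teams) < tol:
            return new
        rating = new
    return rating


# A three-team round robin: A beat B and C, and B beat C.
round_robin = [("A", "B"), ("A", "C"), ("B", "C")]
ratings = iterate_ratings(round_robin)
ranking = sorted(ratings, key=ratings.get, reverse=True)
```

For this small round robin the resulting order is A, B, C, which matches
the order of the winning percentages.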
Component 2:
This part of the ranking scheme is simply the least squares solution that
rates teams so that the resulting rank order minimizes one measure of
"ranking violation" - having a team be ranked higher than a team it lost
to, for example. This is achieved by using linear regression methods to
compute a rating for each team. This can be done in any statistical package
that will accommodate the number of variables that are needed (equal to the
number of teams).
The model essentially predicts the ordering of the two teams who played
in each game, and averages across all games in such a way that the number
of times teams are mis-ordered relative to the game result is minimized
(in a least squares sense).
Component 3:
This component is similar to the second one, except that a transformed
version of the point differential from each game (rather than only
win/loss information) is used to assign a score to each team that
minimizes the sum of squared error between the observed difference and
the predicted difference.
This component attempts to mimic what several others have done. The main
idea is that the scores of the games themselves give information about the
relative "quality" of the teams involved. The way that the ratings for each
team are obtained is based on the following game-specific result:
score difference = 1 * winner's rating
- 1 * loser's rating
+ 0 * (sum of ratings of teams who didn't play in the game)
Since many games are played among the teams being rated, this results in
a system of linear equations. The individual team's ratings can be
estimated via least squares linear regression.
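
As a rough sketch of how such a fit could be computed without a statistical
package, the Python below solves the least squares problem by repeatedly
sweeping over the normal equations (a Gauss-Seidel iteration). That solver
is a swapped-in technique chosen to keep the example dependency-free, not
the regression routine actually used. The function name
least_squares_ratings, the sweep limit, the tolerance, and the centering of
the ratings at zero are all assumptions, and the schedule is assumed to
connect every team to every other team through some chain of games.

```python
def least_squares_ratings(games, sweeps=5000, tol=1e-12):
    """Fit ratings from (winner, loser, score_difference) triples.

    Minimizes the sum over games of
    (score_difference - (rating[winner] - rating[loser])) ** 2.
    """
    # For each team, list (opponent, signed difference) for every game.
    schedule = {}
    for winner, loser, diff in games:
        schedule.setdefault(winner, []).append((loser, diff))
        schedule.setdefault(loser, []).append((winner, -diff))

    rating = {team: 0.0 for team in schedule}
    for _ in range(sweeps):
        biggest_change = 0.0
        for team, played in schedule.items():
            # Normal equation for this team: its rating is the average of
            # (opponent rating + signed difference) over its games.
            new = sum(rating[opp] + d for opp, d in played) / len(played)
            biggest_change = max(biggest_change, abs(new - rating[team]))
            rating[team] = new
        if biggest_change < tol:
            break

    # Ratings are determined only up to an additive constant; center at 0.
    mean = sum(rating.values()) / len(rating)
    return {team: r - mean for team, r in rating.items()}


# Made-up score differences, purely for illustration.
example = [("A", "B", 2.0), ("B", "C", 1.0), ("A", "C", 3.0)]
fitted = least_squares_ratings(example)
```

In this made-up example the three score differences are mutually
consistent, so the fitted ratings reproduce them exactly: A minus B is 2,
B minus C is 1, and A minus C is 3.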
Before getting the ratings from this method, I transform the point
differences to make them better behaved for linear regression purposes
(the "score difference" I use is really abs(difference/7)^0.795). As
with component 2, I fit the ratings via least squares. The only difference
between the two is that in component 2 the "score difference" is equal to
1 for every game, while in component 3 the "score difference" is equal to
a transformed version of the point difference from the game.
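
The transformation itself is a one-liner. The sketch below writes it out in
Python; the function name transformed_margin is hypothetical, while the
divisor 7 and the exponent 0.795 come from the formula quoted above.

```python
def transformed_margin(point_difference):
    """Return abs(point_difference / 7) ** 0.795."""
    return abs(point_difference / 7) ** 0.795


# A 7-point margin maps to exactly 1, the value component 2 uses for
# every game; larger margins grow more slowly than the raw difference.
one_score = transformed_margin(7)
three_scores = transformed_margin(21)
```

Because the exponent is below 1, a 21-point win counts for less than three
times a 7-point win, which damps the influence of lopsided scores.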
Because the least squares solutions for the second and third components
require that there be at least as many games played as there are teams
(and usually a fair number more), these ratings cannot be obtained in
the early parts of the season.
V. Shane Pankratz / Pankratz.Vernon@mayo.edu