From: "Steve Wrathell"
Subject: Essay: WHY COMPUTER RANKINGS ARE BETTER THAN POLLS
Date: Wed, 21 Aug 2002 02:54:23 +0000

CPA Rankings - NCAA Football & NFL
(c) 2002 Steven Wrathell, CPA, PC

Go to my system's description page for links to my other pages and for other links at http://www.cae.wisc.edu/~dwilson/rsfc/rate/wrathell.html

WHY COMPUTER RANKINGS ARE BETTER THAN POLLS

Note: This essay is not yet complete, but there's enough here to still make some interesting reading. I'll finish it when time permits.

----------

This essay also addresses Bob Kirlin’s discussions on faking computer football rankings, arguments that contemporaneous polls are superior to retro-rankings, and criticisms of retro-rankings by Kenneth Massey and James Howell.

POLL FAILURE IN 1973

Although I first created a computer-rating system for football in 1997, I actually had designed a very primitive mathematical ranking system in 1974, at the age of 14. I had a little book with all of the 1973 college football scores, no computer, no calculator, and a lot of pro-Michigan bias. All of my math was done by hand and my calculations were based on comparing scores against common opponents.

Back in 1973, there were few bowls and the Rose Bowl contract required that the Big Ten and the Pac-8 could not allow any of their teams to be in any bowl, other than to have their champions in the Rose Bowl. The regular season climaxed with Ohio State meeting Michigan in Ann Arbor. Both teams were undefeated and untied. The game finished as a 10-10 tie.

The Big Ten’s tie breaking rules called for a vote by the teams to decide who would go to the Rose Bowl. A 5-5 vote would send Michigan to Pasadena, since OSU had been there more recently. In a stunning move, Michigan State University voted in favor of OSU.
As both U-M and MSU are, in reality, owned by the taxpayers of the State of Michigan, it was the equivalent of the government asking the automotive divisions whether Ford or Chevy should get a billion dollar contract and having Pontiac vote in favor of Ford, over its GM brother.

Before the game, OSU was ranked #1 in the polls and U-M was #4. Afterwards, OSU became #4 and U-M dropped to #6. I was outraged. How could this be? If Michigan was expected to lose, yet tied the best team, shouldn’t its ranking improve?

The new #1, Alabama, met #3, Notre Dame, in the Sugar Bowl. The sports media announced that this would be the National Championship game and that a trophy would be given to the winner, no matter what else happened. As it turned out, #3 Notre Dame squeaked by #1 Alabama, 24-23, and was crowned champion. The performance of the #2 team was irrelevant. Also irrelevant was the performance of the previous #1, Ohio State, which clobbered USC 42-21. Yet, as far as the championship was concerned, the AP poll cared about only one game: the Sugar Bowl. The UPI hadn’t yet begun to take its final poll after the bowls.

The "logic" of the AP was that #3 could become #1, but #2 never could do so. The truth about how good the teams were would not be permitted to get in the way of a strange set of rules that the voters had devised.

I no longer have a copy of my hand-calculated rankings, which covered only the top teams. All I remember of my results was that Ohio State was #1 and that Michigan was #7 (and that I was upset that U-M was that low). Another contender, Oklahoma, was 10-0-1, having beaten highly-ranked Nebraska, but having tied USC.

Now one can go to the web for computer retro-rankings to see how today’s systems would rank the teams.
For that year, 1973, the systems rank the selected teams as follows:

              Dun  How  Mas  Rth  Wil
Oklahoma        1    1    1    4    4
Ohio St         4    5    3    1    2
Notre Dame      2    2    2    3    1
Alabama         3    3    6    5    6
Michigan        6    7    7    2    3

The systems are referred to by the abbreviations used in the Massey College Football Comparison. This brief summary was derived from the retro-rankings pages (accessible thru David Wilson’s site: click on the Parent Directory of my home/description page), which are copyrighted, respectively, by the Dunkel, Howell, Massey, Rothman, and Wilson systems.

Note that only 1 of the 5 systems agreed with the AP selection of Notre Dame as #1, and the other 4 systems picked either Oklahoma or Ohio State (both of which had a tie in their record).

THE THEORIES OF THE POLL VOTERS; B.Y.U., 1984; & NEBRASKA-MICHIGAN, 1997

In fact, the polls seem to have a "system" by which teams are generally listed in order of having the fewest losses (or ties, back in the pre-OT years). The polls almost always lower a team’s ranking when it loses (or, before OT, tied), even if the loss occurred against a higher-rated team. On the other hand, computer systems will often reward a team for a close loss against a superior team.

While, today, the BCS, thru its formula, has been trying to encourage teams to play stronger schedules, the poll tradition is to penalize teams that play too tough a schedule. Theoretically, if a team played the 11 highest rated teams (other than itself), and lost to the top 6, but beat #7 thru 11, it would not only not be ranked by the polls, it would be bowl-ineligible, at 5-6. Yet, most computer systems would probably rank that team as the seventh best team in the country (where, in this example, it should be).

The polls’ system of ranking undefeated teams ahead of one-loss teams often occurs without giving sufficient consideration to schedule strength. For example, in 1984, Brigham Young University was voted to be the #1 team by the polls.
BYU was the only undefeated team that year, going into the bowls, although there were good teams with one loss which played much tougher schedules. BYU had won a few blowouts and several reasonably close games, yet it never played a top team.

BYU could have silenced its critics by playing a top team in its bowl. In fact, in 1981, Clemson did just that. Clemson was deemed to be a questionable #1, but played highly-rated, and favored, Nebraska in its bowl and won, earning its ranking. But, no, BYU decided to play Michigan in its bowl. It was a down year for U-M. Although the Wolverines were one of the best 6-5 teams, they still did have 5 losses going into the bowl, and were hardly the proper challenge for a team that wanted to claim a national title. BYU beat U-M 24-17, and the polls rewarded that great 7-point victory over a then-6-6 team with the #1 ranking.

Yet, did BYU actually deserve it? Retro-rankings for BYU in 1984 tend to dispute BYU’s #1 status. Only Howell rated them #1. Their respective rankings per Dunkel, Massey, Rothman, and Wilson were #’s 4, 6, 3, and 2.

Even more examples of bad poll results are listed in essays by Mark Hopkins ("Defense of Howell and Massey Retro-Rankings") and by Eric Sandage ("Response to the Critique of Bob Kirlin").

The polls GENERALLY have adopted a "rule" that a #1 team keeps its ranking, no matter how poorly it has played, or how well #2 or #3 has played, as long as #1 wins its games. Is this right? In 1997, Michigan fans would say, "yes," while Nebraska fans would say, "no." Michigan entered its bowl as #1 and #2 Nebraska was the only other major undefeated team. Michigan beat Washington State, but Nebraska blew out Tennessee. The polls split: AP crowned Michigan and USA-T/ESPN crowned Nebraska. Yet most computer systems, including mine, ranked Nebraska as number one.
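
The hypothetical 5-6 team from earlier in this section (it plays the 11 highest rated other teams, loses to the top 6, and beats #7 thru 11) can be worked through in a few lines of Python. This is only a rough sketch under stated assumptions: the team names T1..T11 are placeholders, the function names are made up for illustration, and the "fewest contradicted results" placement rule is a simple stand-in, not the method of my system or of any other system named in this essay.

```python
# Toy illustration only; not the method of any rating system named in the essay.
# T1..T11 are hypothetical placeholder teams, listed best to worst.

def record(results):
    """Return (wins, losses) from a dict of opponent -> True if our team won."""
    wins = sum(1 for won in results.values() if won)
    return wins, len(results) - wins

def best_slot(rated_order, results):
    """Find the 1-based rank that contradicts the fewest game results.

    A result is contradicted if our team beat a team placed above it,
    or lost to a team placed below it.
    """
    best_rank, fewest = 1, None
    for rank in range(1, len(rated_order) + 2):
        above = rated_order[:rank - 1]   # teams that would sit ahead of us
        below = rated_order[rank - 1:]   # teams that would sit behind us
        contradictions = (sum(1 for t in above if results[t])
                          + sum(1 for t in below if not results[t]))
        if fewest is None or contradictions < fewest:
            best_rank, fewest = rank, contradictions
    return best_rank, fewest

rated_order = [f"T{i}" for i in range(1, 12)]
# Lost to T1..T6 (list positions 0-5), beat T7..T11 (positions 6-10).
results = {team: i >= 6 for i, team in enumerate(rated_order)}

wins, losses = record(results)
rank, contradictions = best_slot(rated_order, results)
print(f"Record: {wins}-{losses}; best-fitting rank: #{rank} "
      f"({contradictions} contradicted results)")
```

A loss-counting approach sees only the 5-6 record, while a placement that looks at who was beaten and who did the beating puts the team right behind the six teams it lost to.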
THE DIFFICULTY IN RANKING COLLEGE FOOTBALL TEAMS

The problem in ranking college football teams is that each team plays only 11 or 12 regular season games, against a small fraction of the 116 (or so) Division 1A teams. Their schedules are quite uneven in quality. Some will play teams from outside of Division 1A, too. Thus, the win-loss records of teams are not comparable and can be quite deceiving.

In most of the major pro sports leagues (NHL, NBA, MLB, etc.), every team will play every other team in its league at least a few times. In those leagues, and in the NFL, teams will play divisional rivals more often, but the difference in divisions’ strength is minor, compared to college football (unless you think the SEC and Sun Belt Conferences are about even). In the NFL, with a 16-game schedule in a 31-team league, the schedule-strengths are far more consistent than in college football. NCAA Division 1A has almost 4 times as many teams as the aforementioned pro leagues.

An astonishing 50 (of 116) Division 1A teams will play in bowl games in the 2001-2002 season. The BCS or poll rankings will have an effect on which teams will be in which bowls, which vary greatly in the monetary payout to the teams. Yet, as there were 641 games between Division 1A teams in 2000 (this excludes games against 1AA teams), can the coaches and sports writers adequately analyze each game?

Some suggest that the pollsters get greater information about the teams than computer rankers do, because they watch the games on TV. How many of the pollsters watched all 641 games? There are sometimes more than 50 1A games in a week! If a top 25 team plays in a game in Hawaii that ends at 3 AM, Eastern time, has the pollster adequately considered how that affects those teams and every prior opponent those teams had -- all in time for the Sunday morning vote?

Yet, the computer systems take EVERY game between their rated teams into account.
Some systems even rate the teams in Division 1AA, and others even include Divisions 2 and 3 and the NAIA.

In 2000’s Miami-FL vs. FSU controversy, most of the press argued that the only game to consider in the entire NCAA season was Miami’s 3-point victory (in Miami) over Florida State. But using the EXACT same logic, Washington fans argued that the only game to consider was their victory (in Washington) over Miami. Of course, Washington had lost to Oregon, but Husky fans thought that game should be ignored. The only thing the media and fans (except in Tallahassee, FL) could agree on was that the 641 games could be set aside and only 1 or 2 games of the season should even count. The villains were the computer systems that considered ALL 641 (or more) games. Why? The computer results were based on TRUE logic, considered ALL games, had NO personal biases, and did not consider the emotional state of Miami fans.

Although the Miami fans aren’t very good at punching holes in ballot cards (or at limiting their votes to just one Presidential candidate), they were able to let the world hear of their frustration about the fact that the computers put FSU into the National Championship game. Wasn’t Miami’s 47-point win over 1AA McNeese State more impressive than FSU’s 47-point win over highly-rated Clemson? Somehow, the computers thought beating Clemson was more impressive than beating a 1AA team. Where’s the political correctness in that? Those heartless computers’ "poor" programming did not include factors to weigh Miami fans’ emotions above lesser factors, such as TRUTH. Even the BCS-devised computer system (yes, it is a system), which is the sum of losses and the SOS factor, put FSU ahead of Miami.

------------

The rest of the essay will have to wait until I have the time to type it up, etc.

--------------

Note, for some old scores used in this essay, I obtained the data from James Howell’s site.

- Steve Wrathell, CPA Rankings
swrathell@yahoo.com