Tuesday, May 31, 2016

Calculating ELO for Wrestling

1. Collect Data

Let's use New Japan 1998 for our data-set.

In this example, I'm going to use CageMatch.net as my source. The process of calculating ELO scores for wrestlers is the same regardless of the data source, once you've distilled events into individual match and per-person results.

Here is a copy of the raw NJPW 1998 data set.

2. Organize Data

Format data so that you have fed (New Japan Pro Wrestling), event ((23.12.1998) NJPW Battle X-Mas 1998 @ Korakuen Hall in Tokyo, Japan) and results (Yuji Nagata defeats Shinya Makabe (6:42)... Kazuyuki Fujita defeats Akitoshi Saito (6:32)...) in separate columns on the same row.

Duplicate the results column and replace the match division characters (...) with a single-character delimiter (such as "@"). Note that a "...." should be translated as ".@", not "@." (This happens when the final person before the match delimiter has a period at the end of their name, e.g. Lizmark Jr.)
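
If you'd rather script this step than do it in a spreadsheet, here is a minimal Python sketch of the same replacement. The function name split_matches is just an illustration, and it assumes "@" never appears inside the results text itself.

```python
def split_matches(results: str, delimiter: str = "@") -> list[str]:
    """Split a results cell into individual matches."""
    # Handle the four-dot case first so a name like "Lizmark Jr." keeps its period.
    marked = results.replace("....", "." + delimiter).replace("...", delimiter)
    # Splitting plus strip() stands in for "Text to Columns" followed by TRIM.
    return [match.strip() for match in marked.split(delimiter) if match.strip()]


results = ("Yuji Nagata defeats Shinya Makabe (6:42)... "
           "Kazuyuki Fujita defeats Akitoshi Saito (6:32)...")
for match in split_matches(results):
    print(match)
```

The order of the two replace calls matters: doing "..." first would turn "...." into "@." and leave the period on the wrong side of the delimiter.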

Perform a "Text to Columns" transformation on the duplicated results column (with the @ delimiter) and label each column of the individual matches. You may want to apply the TRIM function to the Matches to eliminate any extraneous spaces.

3. Individual Match Rows

Rearrange and append columns so each row has only one match. Label the match # and total number of matches on the show and extract the date from the event column. Create an order column which puts all the matches in order from the beginning of the year to the end of the year by match number.

4. Split into Teams

Find the keyword that separates the teams (e.g. "defeats" or "vs.") and separate into the "winning team" (team AA) and the "losing team(s)" (team BB). Be sure to break off any time marks from team BB and any stipulation descriptions from team AA. Eliminate special marks like (c).
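
As a rough sketch of this step in Python: the regular expressions below are assumptions based on the sample rows in this post (a trailing time mark like "(6:42)" and the keywords "defeats", "defeat" or "vs."). It only splits on the first keyword, so a match with several losing teams would need another pass, and it does not try to strip stipulation descriptions, since their format varies.

```python
import re

# Assumed formats, taken from the sample rows in this post.
TIME_MARK = re.compile(r"\s*\((\d+:\d{2})\)\s*$")
SEPARATOR = re.compile(r"\s+(defeats|defeat|vs\.)\s+")


def split_teams(match: str):
    """Return (team_aa, keyword, team_bb, time_mark) for one match string."""
    found = TIME_MARK.search(match)
    time_mark = found.group(1) if found else None
    body = TIME_MARK.sub("", match)      # break the time mark off team BB
    body = body.replace("(c)", "")       # eliminate the champion mark
    parts = SEPARATOR.split(body, maxsplit=1)
    if len(parts) != 3:
        raise ValueError("no team separator found in: " + match)
    team_aa, keyword, team_bb = parts
    # Collapse any doubled spaces left behind by the removals.
    return " ".join(team_aa.split()), keyword, " ".join(team_bb.split()), time_mark


print(split_teams("Yuji Nagata defeats Shinya Makabe (6:42)"))
```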

5. Clean up Teams

Create a new row for each team and note whether they were originally team AA or team BB. Eliminate any match descriptors like "by DQ" or "[Round One]".  Replace team names (nWo Japan) with the team members (Keiji Muto & nWo Sting).

Change teams into people similar to how you changed results into individual matches. Convert "&" and "," into "@" and use a text-to-columns function.
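
The same idea in Python, as a small sketch. The TEAM_MEMBERS lookup is a hypothetical table you would build by hand; the one entry shown is the example from this post.

```python
import re

# Hypothetical lookup of team names to members, filled in by hand.
TEAM_MEMBERS = {"nWo Japan": "Keiji Muto & nWo Sting"}


def split_people(team: str) -> list[str]:
    """Expand known team names, then split on "&" and "," into people."""
    for team_name, members in TEAM_MEMBERS.items():
        team = team.replace(team_name, members)
    return [name.strip() for name in re.split(r"[&,]", team) if name.strip()]


print(split_people("Junji Hirata, Shinya Hashimoto & Tadao Yasuda"))
print(split_people("nWo Japan"))
```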

6. Individual People

Move each person to their own row. Add a result column for "win/loss/draw" and if needed go through the process of combining multiple names into a final name column (Great Muta->Keiji Muto).

IMPORTANT: Make sure there aren't any matches that have the same person on both sides. This often happens when you have Unknown in the list, but it can also happen during doppelganger matches (Sin Cara vs Sin Cara) or through straight-out mistakes such as:

(14.04.1998) NJPW Battle Line Kyushu 1998 - Tag 2 @ in Kagoshima, Japan
Kensuke Sasaki, Tadao Yasuda & Tatsumi Fujinami defeat Junji Hirata, Shinya Hashimoto & Tadao Yasuda (12:17)

These will result in a circular reference.
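
A quick way to catch these before they reach the ELO sheets is to compare the two sides of every match. This is a minimal sketch; the function name is made up for illustration and it assumes names have already been cleaned into their final form.

```python
def people_on_both_sides(team_aa: list[str], team_bb: list[str]) -> list[str]:
    """Return the names that appear on both teams of a match."""
    return sorted(set(team_aa) & set(team_bb))


team_aa = ["Kensuke Sasaki", "Tadao Yasuda", "Tatsumi Fujinami"]
team_bb = ["Junji Hirata", "Shinya Hashimoto", "Tadao Yasuda"]
print(people_on_both_sides(team_aa, team_bb))
```

Any match where this returns a non-empty list needs to be fixed by hand before calculating ratings.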

ELO PROCESS

The key to ELO is having two sheets with the same information in different order and being able to reference back & forth quickly.

One sheet is arranged by person in chronological order.  (PEOPLE)
The other sheet is arranged by match in chronological order. (MATCHES)
It's important that the row references for both sheets are available on both sheets.

On (PEOPLE), the ELO for each wrestler starts at a given number. Let's say 1600. The final value for each wrestler is based on the row of the match they are in.

On (MATCHES), add up the ELO of everyone on each team (let's assume A vs B matches to start simple) and divide by the number of people on that team, so you have an adjusted number you can compare even when it's a handicap match. Once you have the adjusted value for each team (looking up each wrestler's starting ELO from the PEOPLE tab based on the row #), you calculate the predicted outcome:

A wins = 10^(ELO_A/400) / (10^(ELO_A/400) + 10^(ELO_B/400) )
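
Here is the team averaging and the prediction formula as a short Python sketch. The function names are illustrative, and the ratings in the example are made-up numbers for a one-on-two handicap match.

```python
def team_rating(ratings: list[float]) -> float:
    """Average the ELO of everyone on a team so teams of any size compare."""
    return sum(ratings) / len(ratings)


def predicted_win(elo_a: float, elo_b: float) -> float:
    """Predicted chance that team A wins, using the formula above."""
    q_a = 10 ** (elo_a / 400)
    q_b = 10 ** (elo_b / 400)
    return q_a / (q_a + q_b)


# Made-up handicap match: one wrestler against a team of two.
elo_a = team_rating([1700])
elo_b = team_rating([1600, 1500])
print(predicted_win(elo_a, elo_b))
```

The two predictions always add up to 1, so B's chance is simply 1 minus A's.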

The k-value is the maximum number of points that can change hands in one match, and it should change by the importance of the match. If you want Tokyo Dome shows (or WrestleMania) to count for more, you should increase the value. Likewise, you could increase the value for title matches or title changes. Often, I will set House Shows to something like 4 and TV shows at 16 and PPVs at 32. You could also have stipulations or even match length change the k-value.

The actual value is a number between 0 and 1. 1 = team A won. 0.5 = draw. 0 = team A lost. If you want, you can assign quarter points for outcomes like "win via DQ" (0.75).

Points change = (actual value - predicted value) x k-value.

Assume you have two wrestlers: A vs B at the Tokyo Dome.
A = ELO 2000
B = ELO 1600

Predicted A wins = 10^(2000/400) / (10^(2000/400) + 10^(1600/400) ) = 10^5 / (10^5 + 10^4) = 0.90909090 (91% chance A wins; 9% chance B wins).

A wins. A point change = (1-.91) x 32 = +2.9 points
B point change = (0-.09) x 32 = -2.9 points

New A ELO = 2000 + 2.9 = 2002.9 points
New B ELO = 1600 - 2.9 = 1597.1 points
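
The worked example above can be reproduced with a few lines of Python. This is only a sketch for a one-on-one match: the show-type labels in K_BY_SHOW_TYPE are hypothetical, the k-values are the ones suggested earlier, and the Tokyo Dome show is treated as a PPV (k = 32) as in the example.

```python
# Hypothetical show-type labels; the k-values are the ones suggested above.
K_BY_SHOW_TYPE = {"house_show": 4, "tv": 16, "ppv": 32}


def predicted_win(elo_a: float, elo_b: float) -> float:
    """Predicted chance that A wins."""
    q_a = 10 ** (elo_a / 400)
    q_b = 10 ** (elo_b / 400)
    return q_a / (q_a + q_b)


def elo_update(elo_a: float, elo_b: float, actual_a: float, k: float):
    """Return the new (A, B) ratings. actual_a: 1 = A won, 0.5 = draw, 0 = A lost."""
    change_a = (actual_a - predicted_win(elo_a, elo_b)) * k
    # B's actual and predicted values are 1 minus A's, so B's change mirrors A's.
    return elo_a + change_a, elo_b - change_a


new_a, new_b = elo_update(2000, 1600, actual_a=1, k=K_BY_SHOW_TYPE["ppv"])
print(round(new_a, 1), round(new_b, 1))  # 2002.9 1597.1
```

Because B loses exactly what A gains, the total number of points in the system stays the same after every match.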

See also: https://sites.google.com/site/chrisharrington/mookieghana-prowrestlingstatistics/wwf_elo_rankings

You may want to have different ELO ratings for tag & singles matches.

See EXCEL file at: https://sites.google.com/site/eloexample/elo_excel_example
