So I need help with a statistical question. It starts off relatively easy, and then I complicate it with two aspects that leave me with no idea how to handle it at all. Let’s start with the easy part. Assume there are two ranked lists; in the first instance I’ll use just five items per list:
| Rank | List One | List Two |
|------|----------|----------|
| 1    | A        | D        |
| 2    | B        | E        |
| 3    | C        | C        |
| 4    | D        | A        |
| 5    | E        | B        |
What I want to know is how much the rankings in list one differ from list two. An easy way to do that (Solution A) is to compare the differences:
- A: #1 in List One → #4 in List Two, three spots lower, i.e. -3
- B: #2 → #5, three spots lower, i.e. -3
- C: #3 → #3, same spot, i.e. 0 change
- D: #4 → #1, three spots higher, i.e. +3
- E: #5 → #2, three spots higher, i.e. +3

Net result is exactly 0, as it should be… for every displacement from List One to List Two, there is a corresponding displacement of another item, so the signed changes always net out to zero.
So the proper statistical technique (Solution B) would be to use nominal values (taking the absolute value of each change, ignoring the +/-), ending up with 4 changes of 3 spots and 1 change of 0, for a total of 12 spots of difference over 5 items, or an average difference of 2.4. So I could argue that the rankings in List One and List Two differ by about 2.4 spots on average. I’m okay up to that point. Not completely sure what that tells me, but it’s a number. I almost think I’m looking at two separate samples from a pool and calculating their degree of deviation from each other, but not quite, since each list covers the full population (there are only five items in that example), not a “sample”, so I can’t use sampling methodology to see how different it is from some generic population.
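For what it’s worth, Solutions A and B are simple enough to sketch in a few lines of Python (just a sketch, using the five-item toy example above; the Solution B total is what statisticians call the Spearman footrule distance):

```python
# Sketch of Solutions A and B: signed vs. absolute rank differences.
# Ranks are 1-based positions; item names come from the example above.
list_one = ["A", "B", "C", "D", "E"]
list_two = ["D", "E", "C", "A", "B"]

rank_one = {item: i + 1 for i, item in enumerate(list_one)}
rank_two = {item: i + 1 for i, item in enumerate(list_two)}

# Solution A: signed displacement (negative = dropped, positive = rose).
signed = {item: rank_one[item] - rank_two[item] for item in list_one}
# Solution B: absolute displacement, ignoring direction.
absolute = {item: abs(d) for item, d in signed.items()}

print(sum(signed.values()))                    # 0  (always nets out)
print(sum(absolute.values()))                  # 12
print(sum(absolute.values()) / len(list_one))  # 2.4 spots on average
```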
So we come to the two complications… The first complication (call it C1) is scale: my lists aren’t five items long, they are 100 items long. I don’t think that complicates things too much; it’s more a matter of “scope”.
The second complication (C2) is much more insidious…the first list is fully ordered, #1-100. The second list, however, is grouped into five unequally sized tiers. I’ll use a smaller example than 100, just 10 to make it plain, and I’ll reverse them just so it is obvious the lists are different…I’ll also tuck in a third list that is for all intents and purposes identical to List One, just grouped differently:
| #  | List One | List Two   | List Three |
|----|----------|------------|------------|
| 1  | A        | I (Tier 1) | A (Tier 1) |
| 2  | B        | J (Tier 1) | B (Tier 1) |
| 3  | C        | F (Tier 2) | C (Tier 2) |
| 4  | D        | G (Tier 2) | D (Tier 2) |
| 5  | E        | H (Tier 2) | E (Tier 2) |
| 6  | F        | D (Tier 3) | F (Tier 3) |
| 7  | G        | E (Tier 3) | G (Tier 3) |
| 8  | H        | C (Tier 4) | H (Tier 4) |
| 9  | I        | A (Tier 5) | I (Tier 5) |
| 10 | J        | B (Tier 5) | J (Tier 5) |
The obvious choice would be to convert List One or List Two to “match” each other. I could, for example, rank I vs. J within List Two to fill the #1 and #2 slots, then F vs. G vs. H for #3, #4, #5 (Solution C). However, that would require a lot of subjective judgment on my part, which isn’t very functional: in my List Two example, I and J are basically “tied”, with no way to differentiate them further.
I could however decide that, like in a sports competition:
- I & J share rank “1”;
- F,G,H share rank “3”;
- D,E share rank “6”;
- C would have rank “8”; and,
- A & B would have rank “10”.
Seems like a good solution (Solution D), right? It’s the way tournaments do it. The problem is that if I apply this technique to List Three, which is virtually identical to List One, just grouped into 5 tiers instead of 10 ranks, the numbers don’t show that. List Three’s tiers (1: A,B; 2: C,D,E; 3: F,G; 4: H; 5: I,J) become tournament ranks A,B = 1; C,D,E = 3; F,G = 6; H = 8; I,J = 9. Comparing those against List One gives differences of A=0, B=1, C=0, D=1, E=2, F=0, G=1, H=0, I=0, J=1, for a total of 6 over 10 items, or a 0.6 average difference, even though the lists are basically identical.
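A minimal sketch of Solution D in Python, using the List Three tiers from the example (standard competition ranking: everyone in a tier shares the tier’s top slot):

```python
# Sketch of Solution D: standard competition ("tournament") ranking for tiers.
tiers = [["A", "B"], ["C", "D", "E"], ["F", "G"], ["H"], ["I", "J"]]

comp_rank = {}
position = 1
for tier in tiers:
    for item in tier:
        comp_rank[item] = position  # tied items all take the top slot
    position += len(tier)           # next tier starts after the whole group

# List One is simply A..J ranked #1..#10.
list_one_rank = {item: i + 1 for i, item in enumerate("ABCDEFGHIJ")}

diffs = {x: abs(list_one_rank[x] - comp_rank[x]) for x in comp_rank}
print(sum(diffs.values()) / len(diffs))  # 0.6, despite near-identical lists
```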
A second alternative (Solution E) to converting List Two/Three to List One format is to use “average”, uneven rankings: in List Three, A and B wouldn’t both sit at position “1”, they would sit between 1 and 2, so I would give them both the average rank of 1.5; C, D, E would average out at #4 (i.e. spots 3, 4, and 5, averaging to 4); and so on. This works in the net sense, since the signed differences still cancel to zero, but not in the nominal sense, and I would still be left measuring a difference that comes not from the rankings themselves but from the methodology of ranking.
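Solution E can be sketched the same way (this is the same tie-handling that `scipy.stats.rankdata` calls `method="average"`; the sketch below just hand-rolls it on the toy example):

```python
# Sketch of Solution E: average ("fractional") ranks for tied tiers.
tiers = [["A", "B"], ["C", "D", "E"], ["F", "G"], ["H"], ["I", "J"]]

avg_rank = {}
position = 1
for tier in tiers:
    spots = range(position, position + len(tier))  # the spots the tier occupies
    midrank = sum(spots) / len(tier)               # average of those spots
    for item in tier:
        avg_rank[item] = midrank
    position += len(tier)

list_one_rank = {item: i + 1 for i, item in enumerate("ABCDEFGHIJ")}

total = sum(abs(list_one_rank[x] - avg_rank[x]) for x in avg_rank)
print(total / len(avg_rank))  # 0.5 average nominal difference
```

The signed differences do net to zero here, but the nominal total (0.5 per item) is still nonzero for two essentially identical lists, which is the residual methodology effect described above.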
Soooo, I think I need to find a way to convert List One into List Two/Three format. Since List Three shows me whether or not my methodology “works”, I’m going to compare List One and List Three for the next part. One way to convert L1 to L3 format is to just divide L1 into equal chunks (Solution F):
- A,B
- C,D
- E,F
- G,H
- I,J
This maintains the list format, divides it into equal chunks (so it doesn’t import any bias from List Three’s methodology), and preserves the ranking order. But if I then compare this “new” List One with List Three, I get A=0, B=0, C=0, D=0, E=1, F=0, G=1, H=0, I=0, J=0, for a net difference of 2 spots out of 10 items. It would show the list as “slightly” different, but not radically so, and in this “pure” example that difference is essentially just the difference in methodology. Even if I bump it up to 100 items, those differences should stay relatively minor, but they would still primarily reflect methodological differences rather than ranking differences.
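The equal-chunks approach is another few lines (a sketch assuming, as in the toy example, that the list divides evenly into the number of tiers):

```python
# Sketch of Solution F: cut the fully-ranked list into equal-size tiers,
# then compare tier numbers against List Three's tiers.
list_one = list("ABCDEFGHIJ")
n_tiers = 5
size = len(list_one) // n_tiers  # assumes an even split (2 per tier here)

# Tier number for each item under the equal-chunk scheme.
tier_f = {item: i // size + 1 for i, item in enumerate(list_one)}

tiers_three = [["A", "B"], ["C", "D", "E"], ["F", "G"], ["H"], ["I", "J"]]
tier_three = {item: t + 1 for t, tier in enumerate(tiers_three) for item in tier}

diffs = {x: abs(tier_f[x] - tier_three[x]) for x in list_one}
print(sum(diffs.values()))  # 2 spots of difference out of 10 items
```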
Lastly, I have Solution G: convert List One into five levels, as for List Three, but make them unequal sizes matching List Three’s groups. Doing this, List One looks identical to List Three, and comparing them gives “net change = 0” and “nominal change = 0”. Which sounds good, but it means I am “weighting” the results of List One to match the second list’s ranking approach. For example, perhaps the original weighting would have been 9 items in Level 1 and 1 item in Level 5, but I wouldn’t know that; instead, I’m imposing the rankings/weightings of List Two/Three’s methodology onto the pre-established order of List One.
Summary
- Solution A (Net changes, matching lists) — doesn’t work, as it nets out to zero and the lists aren’t matched in my applied example;
- Solution B (Nominal changes, matching lists) — doesn’t work, as the lists aren’t matched in my applied example;
- Solution C (Re-rank List 2) — doesn’t work as no way to differentiate List 2;
- Solution D (Sports tournament) — doesn’t work on similar lists, adds a methodological problem to a ranking approach;
- Solution E (Average rankings) — doesn’t work, as it eliminates the second methodological problem but still leaves me measuring the difference between ranking approaches;
- Solution F (Equal chunks) — semi-works but it would still measure difference in methodology and ranking approach; and,
- Solution G (Weighted chunks) — semi-works as it reflects nominal change of 0 in matching lists, but adds bias of second ranking approach.
The only other thought I had was to combine the results of Solutions D, E, F, and G and take an average of the four approaches. Not sure if that helps or if I’m just compounding my methodological and ranking problems.
Would love some thoughts if anyone has any to share…FYI, this is for personal use, not a work issue, so it doesn’t have to be entirely statistically pure, but I would like a little more comfort with an approach than I have for Solution G currently.
