
April 29, 2006

Vishy's Useless Factoid of the Day: Ranking schemes

Could an overachieving third-grader be more devastated? I looked on in horror at my report card at the end of third grade. I had been ranked first in my class at the end of the first and second grades and frankly, I was rather getting used to it. Now, I stared at the number 3 filling up the Rank line item. To add insult to injury, two students had scored one point more than I had, and they were both ranked first. I went up to my teacher, puffy-eyed, and asked her why I wasn't at least ranked second, since my score came right after that of my two classmates. In an explanation that felt utterly inadequate, she said, "Vishwanath, when there are two first-ranks, the next rank is third." A number of years later, I realized that my expectation of being ranked second rather than third was not entirely fanciful. I was merely expecting to be ranked under a different ranking scheme.

Consider the following: if Alice got 97 points in an exam, Bob 97, Charlie 94 and Dave 89, should Charlie be ranked 2nd or 3rd? The answer is 'it depends'. It depends on what you're trying to get out of the ranking: the top N candidates in a cohort or the top N scores. Typically, one ranks a group to confer some advantage on the top few ranks. Suppose the objective is to find the top 3 candidates. Among the test takers above, these are clearly Alice, Bob and Charlie, and the last person in the top 3 should definitely be ranked 3rd. Thus, Charlie is ranked 3rd and rank 2 is skipped entirely.

However, what if the ranker's objective is to get the top 3 scores in the test? From the above example, the top 3 scores would correspond to test takers Alice, Bob, Charlie and Dave. In this case, it's fair to have Alice and Bob be ranked first because they got the top score, but Charlie ranked second because he got the second best score. The top N scores may not correspond to N test takers, but there are no skipped ranks.
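The two schemes can be sketched in a few lines of Python. This is only an illustration using the four scores above; the function names `competition_rank` and `dense_rank` are my own labels for the "top N candidates" and "top N scores" schemes, not anything taken from a library.

```python
def competition_rank(scores):
    """Ties share a rank and the ranks after a tie are skipped (1, 1, 3, 4)."""
    ordered = sorted(scores.values(), reverse=True)
    # The first position of a score in the sorted list equals the number
    # of strictly higher scores, so adding 1 gives the rank.
    return {name: ordered.index(score) + 1 for name, score in scores.items()}


def dense_rank(scores):
    """Ties share a rank and no rank is ever skipped (1, 1, 2, 3)."""
    distinct = sorted(set(scores.values()), reverse=True)
    # Rank is the position of the score among the distinct scores.
    return {name: distinct.index(score) + 1 for name, score in scores.items()}


scores = {"Alice": 97, "Bob": 97, "Charlie": 94, "Dave": 89}
print(competition_rank(scores))  # {'Alice': 1, 'Bob': 1, 'Charlie': 3, 'Dave': 4}
print(dense_rank(scores))        # {'Alice': 1, 'Bob': 1, 'Charlie': 2, 'Dave': 3}
```

Under the first scheme, asking for ranks 1 through 3 yields exactly three people; under the second, it yields all four.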

I was pleasantly surprised when I found that this distinction is codified in the Oracle relational database system. When returning a ranked set of rows, a query can use either the RANK or the DENSE_RANK functions. In case of the former, Charlie above would get ranked 3rd, but he would be ranked 2nd if the latter function were used. I expected to be ranked in third grade via DENSE_RANK (solely on my mastery of the material) when the function being used was RANK (relative to my peers). I was never able to make up for this abysmal performance. Starting with fourth grade, I moved to a different school, which refused to rank students strictly by their total point score on a set of exams and used a GPA system instead.

Posted by Vishy at 06:21 PM | Comments (0)

April 12, 2006

Vishy's Useless Factoid of the Day: T9

I've recently become a moderately heavy user of text messaging. I don't send enough text messages to justify paying for my cellular provider's unlimited text messaging plan, but all the same, I now consider sending one utterly ordinary.

For the longest time, I used the ABC mode on my cellular phone's keypad without knowing that a considerably more keypress-efficient alternative was but a menu click away--T9, intelligent textual input using just the nine keys of your cellular phone's keypad. T9 looks at each successive key you press to guess the word you intend to type, so each letter costs one keypress instead of up to four. When I discovered T9, I was loath to let go of the absolute feeling of control that ABC mode gave me. With ABC mode, I knew exactly why every character had its place on the screen. T9, on the other hand, sometimes spouted wild guesses that were disturbingly far away from my intention, but algorithmically correct. Over time, I got used to letting go of my iron grip over each character and letting T9 lead me with its inexorable guesswork. I also found that my usage of text messages shot up considerably after I switched to T9, something I don't believe is entirely coincidental.
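To see why those wild guesses are "algorithmically correct", here is a toy sketch in Python of the general idea. It is not the actual T9 implementation, which is proprietary and surely smarter about ordering its guesses; the word list and the helper names `to_keys`, `build_index` and `guesses` are made up for illustration. It assumes the standard phone keypad layout.

```python
# Standard phone keypad: each digit stands for several letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def to_keys(word):
    """Return the digit sequence typed for a word, one keypress per letter."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def build_index(words):
    """Group dictionary words by the digit sequence that produces them."""
    index = {}
    for word in words:
        index.setdefault(to_keys(word), []).append(word)
    return index


def guesses(index, keys):
    """Every dictionary word that matches the keys pressed so far."""
    return index.get(keys, [])


index = build_index(["good", "home", "gone", "hood", "text"])
print(to_keys("good"))         # 4663
print(guesses(index, "4663"))  # ['good', 'home', 'gone', 'hood']
```

Four perfectly ordinary words share the keys 4-6-6-3, so whichever one the phone shows first is a legitimate guess, even when it is nowhere near what you meant.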

I watched spellbound as T9 got better with its guesses. My relationship with T9 was maturing; it seemed to know with increasing accuracy exactly what was on my mind. As I wrote to various friends, it would even guess their names correctly! I speculated that my contact list was automagically added to the T9 database of potential guesses. Today T9 guessed 'Google' for me and I stopped to think. Google was definitely not an entry on my contact list; was my cellular phone provider getting with the program and updating my phone's T9 database with names of products and services used by yuppies? A Google (who else?) search revealed to me that I could update my T9 database myself; indeed, that was how it had been learning my style so well. Every time T9 gets a word wrong, you can change to ABC mode, correct it and change back to T9 mode. T9 will incorporate that correction into its database from that point onwards. It seems like most cellular phones that offer T9 work this way.
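The learning behavior can be imagined along these lines. This is a guess at the general mechanism, written as a small Python sketch, and not a description of how any particular phone stores its words; the class `LearningDictionary` and its method names are invented for the example.

```python
# Standard phone keypad, keyed by letter group for brevity.
KEYS = {
    "abc": "2", "def": "3", "ghi": "4", "jkl": "5",
    "mno": "6", "pqrs": "7", "tuv": "8", "wxyz": "9",
}
DIGIT_FOR = {ch: digit for letters, digit in KEYS.items() for ch in letters}


class LearningDictionary:
    """A word list, indexed by key sequence, that can absorb new words."""

    def __init__(self, words):
        self.by_keys = {}
        for word in words:
            self.learn(word)

    def learn(self, word):
        """Add a word, e.g. one the user just spelled out in ABC mode."""
        keys = "".join(DIGIT_FOR[ch] for ch in word.lower())
        bucket = self.by_keys.setdefault(keys, [])
        if word not in bucket:
            bucket.append(word)

    def guesses(self, keys):
        """Known words matching the digit sequence."""
        return list(self.by_keys.get(keys, []))


phone = LearningDictionary(["good", "home"])
print(phone.guesses("466453"))  # []
phone.learn("Google")           # typed once by hand in ABC mode
print(phone.guesses("466453"))  # ['Google']
```

Once a word has been spelled out by hand a single time, the same six keypresses are enough to bring it back.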

So, happy text messaging! Here's to less sore fingers!

Posted by Vishy at 09:13 PM | Comments (0)