
December 24, 2006

My free knowledge management solution

It was my uncle who made me a news junkie. He used to live in an apartment just down the street from mine when I was growing up. Every time I visited, he would open a folder with a crafty gleam in his eyes. In it, he would have religiously noted down all the interesting tidbits he saw on TV, in a newspaper or a magazine. And from it came countless trivia questions that I could never answer correctly.

I wasn't just determined to pass his test; I had to fight back as well! I got into the habit of reading the newspaper regularly and noting down interesting bits so I too could ask him questions. I never could beat him conclusively; to my exasperation, he would answer my questions easily while still coming up with obscure bits of his own. Nevertheless, to this day I remain a news junkie, curating my own mental library of facts.

My brain can only hold so much though, and rather than risk forgetting facts (or worse, muddling them together), I have come up with my own el cheapo (or rather, el freeo) system for managing what I know. My "notebook of wisdom" has recently been reincarnated in the twenty-first century as three free programs, each for a different purpose:

I don't know if many readers of this blog actually care to organize their knowledge in some way (come on—it's a pretty dorky thing to do, isn't it?), but I'd be curious to know what other people have tried and what has worked well for them.

Posted by Vishy at 11:19 AM | Comments (4)

December 20, 2006

Google Book Search, meet Amazon's Mechanical Turk

According to this Washington Post story about Google's ambitious book-scanning initiative, Google is scanning 3,000 books a day, which works out to about a million books annually. This rate is no doubt phenomenal but brings its own set of problems along.

Scanning a book is easy, but optical character recognition is notoriously hard to get right. Google is said to be currently scanning only those books whose copyright term has expired, which would mean books published before the 1920s. Any books published more than 100 years before that decade have peculiar artifacts like the integral-sign S or f-like S, illustrations signed with artistic scrawls and manuscripts featuring cursive handwriting. No matter how breathlessly the media waxes about Google's 5,000 Ph.D.s, these are all hard problems in OCR today.

Consider the scale of Google's operation: let's say each book scanned this year had an average of 200 pages. That would make 200 million pages. Even an OCR program with five-nines (99.999%) per-page accuracy would spit out 2,000 pages with mistakes; at a more modest 99.9%, that figure climbs to 200,000. One way to achieve dramatically higher accuracy is to ask the millions of avid readers on the Web to double-check the accuracy of Google's OCR program, kind of like the distributed outsourcing in Amazon's Mechanical Turk project. It's really not as bad as it sounds; it's merely about offering the right incentives. Thankfully, Google doesn't have to look far for ideas.
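
As a rough check on that arithmetic, here is a back-of-the-envelope sketch in Python. It assumes accuracy is measured per page (the Washington Post story doesn't say how) and uses the round figures of a million books and 200 pages each; the function name is mine, purely for illustration.

```python
from fractions import Fraction

BOOKS_PER_YEAR = 1_000_000   # 3,000 books a day, rounded to "about a million"
PAGES_PER_BOOK = 200         # assumed average, as in the post


def pages_with_mistakes(total_pages, page_accuracy):
    """Expected number of flawed pages, given a per-page accuracy such as "0.99999".

    Exact fractions are used so the result is not blurred by floating-point rounding.
    """
    error_rate = 1 - Fraction(page_accuracy)
    return int(total_pages * error_rate)


total_pages = BOOKS_PER_YEAR * PAGES_PER_BOOK
print(total_pages)                                   # pages scanned in a year
print(pages_with_mistakes(total_pages, "0.99999"))   # five nines
print(pages_with_mistakes(total_pages, "0.999"))     # three nines
```

If accuracy were measured per character instead, the number of flawed pages would be far higher, since a single page holds a couple of thousand characters.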

Google already does quite a bit of revenue sharing with its AdSense partners, in exchange for the chance to display ads based off content on third-party websites. Google's plan to monetize Google Book Search is to show ads beside the content of the books anyway. So how about taking the money gained from ads shown beside, say, Alice in Wonderland, and sharing some of it with whoever helped check the accuracy of Google's OCR program on that book? The revenue sharing infrastructure is already in place and it'll really be just a question of building the right user interface to divvy up the books among people.

I don't have access to Google's vast set of query strings (and neither does the general public), but it doesn't seem like the revenue distribution would be that inequitable across books if structured the right way. There could be a cap of, say, $100 per book to control for books that mention "sex" a lot. Other than that, let human judgement take its course and watch the dollars roll in!
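
To make the idea a bit more concrete, here is a minimal sketch of how one book's payout might be divvied up. Everything in it is an assumption of mine rather than anything Google does: the function name, the 50% share of ad revenue, and the rule of paying checkers in proportion to the pages they verified. Only the $100 per-book cap comes from the suggestion above.

```python
def split_book_revenue(ad_revenue, pages_checked, share=0.5, cap=100.0):
    """Divide a capped slice of one book's ad revenue among its proofreaders.

    ad_revenue    -- dollars earned from ads shown beside the book
    pages_checked -- dict mapping each checker to the number of pages verified
    share         -- fraction of ad revenue set aside for checkers (assumed)
    cap           -- maximum total payout per book, in dollars
    """
    total_pages = sum(pages_checked.values())
    if total_pages == 0:
        return {}
    pool = min(ad_revenue * share, cap)
    # Each checker's payout is proportional to the pages he or she verified.
    return {who: pool * pages / total_pages
            for who, pages in pages_checked.items()}


# A popular book hits the cap; an obscure one pays out its full share.
print(split_book_revenue(1000.0, {"alice": 150, "bob": 50}))
print(split_book_revenue(40.0, {"alice": 150, "bob": 50}))
```

The cap is applied to the whole pool rather than to each checker, so a wildly popular book cannot pay out more than $100 in total.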

Posted by Vishy at 10:51 PM | Comments (0)

December 12, 2006

10 treasures you can get to with Start > Run...

Although I spend nearly all of my day in pointy-clicky Windows and OS X, I am a CLI geek at heart. I launch a bunch of programs merely by typing things into Windows' pathetic excuse for a command line, the Start > Run... box. Here are some of the niftiest things you can access from the Start > Run... box in Windows XP:

A way to start a screensaver instantly
Start > Run... > scrnsave.scr and the screen blanks instantly. Useful for when you need to step away from the computer and don't want prying eyes to see what you had on your screen.
A comprehensive system information tool
Start > Run... > winmsd, for when you get hardware conflicts, driver problems or other low-level issues. This tool gives it all to you in one place, free of sugarcoating.
The trusty Control Panel at your fingertips
Start > Run... > control brings up the Control Panel without having to point and click your way through the Start Menu.
A way to clean your hard disk of junk
Start > Run... > cleanmgr and a disk cleanup assistant appears. Although storage is hardly an issue anymore, it's really surprising how much space you can reclaim by deleting your browsing history, cookies, downloaded program files and other crap you didn't even know you had.
An on-screen keyboard
Start > Run... > osk, for when you need to shut down the computer after suddenly spilling java into your keyboard and messing it up beyond repair. You can even hover over the keys to press them!
A way to transfer files to your BlackBerry
Start > Run... > fsquirt enables you to "squirt" (eww) files from your Bluetooth-enabled computer to another Bluetooth-enabled device, like your BlackBerry.
A one-stop startup configuration utility
Start > Run... > msconfig, for when you are sick and tired of your startup being delayed by the 18 million teeny-weeny programs that crowd the system notification area (not system tray).
A definitive answer to "What version of Windows am I running?"
Start > Run... > winver, et voila!, a dialog box with your operating system version (including any service packs you applied) appears.
A way to design your own Klingon font
Start > Run... > eudcedit gives you a drawing surface to fill up your own Unicode Private Use Areas. This way, even fewer of your Klingon-speaking friends will know when you write that their mother has a smooth forehead.
An el cheapo chat program
Start > Run... > winchat brings up a really basic chat program with which you can talk to users of other computers in the same Windows workgroup or domain. It was clearly written by a Microsoft developer who sorely misses talk from Unix.
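
For fellow CLI geeks, the ten commands above can also be fired off from a script. This is only a sketch, assuming Python is installed on the Windows XP machine; the `launch` helper and the one-line descriptions are mine. It leans on `os.startfile`, which exists only on Windows and hands the name to the Windows shell, so it should behave much like the Run box, though I haven't verified that for every entry.

```python
import os

# The ten Start > Run... commands from the list above (Windows XP).
RUN_COMMANDS = {
    "scrnsave.scr": "blank-screen screensaver",
    "winmsd": "system information tool",
    "control": "Control Panel",
    "cleanmgr": "disk cleanup assistant",
    "osk": "on-screen keyboard",
    "fsquirt": "Bluetooth file transfer",
    "msconfig": "startup configuration utility",
    "winver": "Windows version dialog",
    "eudcedit": "private character editor",
    "winchat": "workgroup chat program",
}


def launch(command):
    """Hypothetical helper: pass one of the listed commands to the Windows shell."""
    if command not in RUN_COMMANDS:
        raise ValueError("not one of the ten treasures: %r" % command)
    if not hasattr(os, "startfile"):
        # os.startfile is missing on OS X, Linux and friends.
        raise OSError("os.startfile is only available on Windows")
    os.startfile(command)
```

On a Windows box, `launch("winver")` would be the scripted equivalent of Start > Run... > winver; anywhere else it refuses politely with an error.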

You knew there was a bonus! Start > Run... > %WINDIR%\media opens up a directory with the Windows sound schemes. The MIDI (*.mid) files in there are the half-decent lost works of an unknown composer (Brian Eno?). They're somewhat dated, which makes them sound tacky today. The file called "flourish.mid" vaguely reminds me of the Seinfeld theme for some reason. Start > Run... > clock.avi pops up a movie of a cheesy-looking clock that counts to 12.

I'd like to honorably mention Start > Run... > osuninst. I thought it was going to end my love-hate relationship with Windows, but all it did was pop up a cryptic error message instead.

Posted by Vishy at 09:28 PM | Comments (0)

December 11, 2006

HOWTO: Read New York Times Editorial Columnists For Free

Technorati has one of the best blog searches around, which also means they are solid purveyors of the shitstream. Their pages sometimes feature a banner ad though, which goes "55 million blogs... some of them have to be good." Turns out they are right—some of these blogs feature content for which the gray old lady charges a premium.

I was really bummed the day the New York Times editorial columnists, some of the United States' most influential thinkers, were unceremoniously dumped behind the TimesSelect paywall. The New York Times has content policies that may seem strangely anachronistic in today's new media world. Fine, charge me a premium for all your fancy verticals and lifestyle content, but at least leave these national debate-driving thinkers freely accessible!

Fortunately, thanks to Technorati, I can get my weekly fix of Maureen Dowd's articulate Bush-bashing and Paul Krugman's economic theories (I have sort of tired of Tom Friedman's persistent world-is-flat shtick). Some kind blogger always posts the full text of the column I want to read. Take today's column by Paul Krugman, called "Outsourcer in Chief." I hobbled over to Technorati and typed in "Outsourcer in Chief Paul Krugman" and found several full-text renditions in the first 10 results.

This has worked for me pretty well for the last month or so. I doubt there's a lot of other TimesSelect content on Technorati, so don't take this as a guaranteed way of drilling through the paywall. I wouldn't want to pay to access most TimesSelect content, but what content I might pay for, I can get for free. In this case, it looks like Robin Hoods have 'appropriated' the cathedral's jewels for the bazaar's good.

Posted by Vishy at 08:18 PM | Comments (0)