
Human Assisted Search

February 25, 2007

Posted by Andre Vellino in Search, Statistical Semantics.

I tried the human-powered “search with guide” feature on the ChaCha search engine the other day. I can’t see human-guided search becoming a business success story in the mass market – for most purposes searching is becoming sufficiently easy that we don’t need help any more.

But the idea of having a human guide to help with sophisticated searches (which has been floating around in on-line libraries for a while) may work well in a scientific digital library where the help of a trained librarian or subject specialist could really be welcome. I’m optimistic that this kind of service will be offered in on-line science libraries both because my experience at ChaCha was quite good and because I believe there are situations where there aren’t likely to be automated alternatives.

When I used ChaCha’s “Search with Guide” feature, my browser entered me into a chat session with a human search-assistant. Our chat conversation helped her (she called herself “Kimberly”) narrow down the general “Global Warming” query that I originally gave her. What I really wanted to know was “what can we do about it”? She gave me about 4 answers that were displayed to me one at a time at the rate of about 1 per 30 seconds, all very relevant.

Now, I could have come up with the first answer myself with a query for “Global Warming Solutions” on Google or MS Live Search, so Kimberly wasn’t especially useful to me with this particular query. But you can imagine situations where only a knowledgeable human being can come up with synonyms or semantically cognate phrases.

Consider, for example, the problem of searching for the recent paper that solves the Poincare Conjecture. If you don’t happen to know that Fields Medal nominee Grigori Perelman (I say nominee because he famously declined to accept it) solved this Millennium Problem, then you may have some trouble finding his original paper just by searching for “Poincare Conjecture” – especially in a science archive like arXiv.org, which has no references to “journalistic” articles like Wikipedia. The reason is that Perelman’s paper makes no mention of the Poincare Conjecture – this result merely follows from his solution to Thurston’s more general Geometrization Conjecture using extensions to Richard Hamilton’s theory of Ricci Flow (all of which, incidentally, are completely beyond my comprehension).

I think this kind of knowledge still requires a human brain, because there just aren’t enough occurrences of the relevant word-frequency patterns, in a large enough corpus, for statistical semantics to induce this information. Furthermore, there is, I think, a historical component to this kind of knowledge (first X happened, then Y, etc.) which I don’t think statistical frequency patterns can reflect easily.
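To make the Poincare example concrete, here is a minimal Python sketch of the gap between plain keyword matching and a search that a knowledgeable guide has expanded with related phrases. Everything in it is an assumption for illustration: the three "papers" are invented toy snippets, and `human_expansions` is a hypothetical table standing in for what a subject specialist might suggest. It is not how ChaCha or arXiv.org actually work.

```python
# Toy corpus: invented snippets, not real abstracts.
corpus = {
    "paper_a": "entropy formula for ricci flow with geometric applications",
    "paper_b": "a survey of open problems related to the poincare conjecture",
    "paper_c": "ricci flow with surgery and the geometrization of three-manifolds",
}


def keyword_search(query, docs):
    """Return the ids of documents that contain every word of the query."""
    terms = query.lower().split()
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if all(term in text.split() for term in terms)
    )


# Hypothetical phrases a human guide might add for this query.
human_expansions = {
    "poincare conjecture": ["ricci flow", "geometrization"],
}


def guided_search(query, docs, expansions):
    """Keyword search plus searches on the guide's related phrases."""
    hits = set(keyword_search(query, docs))
    for alternative in expansions.get(query.lower(), []):
        hits.update(keyword_search(alternative, docs))
    return sorted(hits)


plain = keyword_search("Poincare Conjecture", corpus)
guided = guided_search("Poincare Conjecture", corpus, human_expansions)
print(plain)   # only the survey mentions the conjecture by name
print(guided)  # the Ricci flow papers are found as well
```

The literal query only finds the survey that names the conjecture, while the guide's phrases also surface the two Ricci flow papers. The hard part, of course, is knowing which phrases to add, and that is the step I think still needs a person.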


1. Daniel Lemire - March 1, 2007

You know too much about Poincare’s conjecture!

The problem with this model is that it is likely to be very expensive. Google had such a service and they let it go.

If you want to offer a profitable service, you have to go one step beyond, and actually produce a short report on a topic, or otherwise aggregate the data in a useful way. Just listing hyperlinks is not good enough, I think.

2. Andre Vellino - March 1, 2007

Yes, that’s a very good point. When you want assistance, you typically want a sort of mini “market report” (like the one I’m doing on digital library personalization features) which provides some kind of value-added analysis.

Still, in the context of a digital library, I think that an interactive chat session for finding even just article references in a given collection might be helpful – at least more helpful than in the general web-space.

One reason I think this is true is that digital library (DL) “advanced” search engines tend to be more complicated (in the sense of having more fields to search on) but (paradoxically) have lower precision than Web search engines. Hence a human assistant / librarian / subject specialist might be of real value.
