PowerSet December 5, 2007

Posted by Andre Vellino in Information retrieval, Search.

I like what I’m seeing in PowerSet. I obtained a login ID the other day and did a few informal experiments on the PowerLabs site. I know PowerSet is getting a lot of pre-launch buzz, and I am as allergic to hype as the next person, but I’m optimistic about it.

Both Hakia – whose marketing assertions I expressed some doubt about in a previous post – and PowerSet are mentioned in the recent roundup of “Semantic Apps to Watch” on Read/Write web.

But consider the differences – even just at the level of marketing blurbs from their respective web sites. Hakia claims that:

OntoSem offers an advanced methodology and technology for natural language processing, the only one of its kind, so far, to access the full meaning of the text it handles.

PowerSet, on the other hand, makes a somewhat more modest claim:

Our unique innovations in search are rooted in breakthrough technologies that take advantage of the structure and nuances of natural language. Using these advanced techniques, Powerset is building a large-scale search engine that breaks the confines of keyword search.

Hakia claims to “access the full meaning of the text” and PowerSet merely to “break the confines of keyword search”. One could argue that Google does that too, of course, with PageRank, if nothing else.

From my unscientific survey of sample queries, I’d say PowerSet will live up to their claims when they go live. The demos that I tried are partitioned into structured queries such as “‘What did’ X ‘say about’ Y?” where X and Y are your favourite noun phrases. Other demos on Sports and Art and Business are structured in the same way. However, the index is limited to the text content of Wikipedia. As Daniel Lemire pointed out the other day, you might as well restrict your Google queries to Wikipedia anyway, since ~27% of top search results come from there.

Consider, for example, the question “What did someone say about the Canadian Dollar?”. Powerset’s top result is:

He has been the Premier of Manitoba since 1999, leading a New Democratic Party government…. Doer encouraged the Bank of Canada to lower its rates in late 2003, saying that the rising strength of the Canadian dollar in relation to the American dollar was causing increased unemployment.

Compare that with the following query on “The other guys” web site:

site:http://en.wikipedia.org/ what did someone say about the Canadian Dollar

Google’s 1st result is:

I can’t work out where the black box comes from – did someone change CSS? ….. the shortest version, which according to the Canadian dollar page is “C$”.

These demo query templates are rigged against Google, naturally. Even some surface NLP on the query, which Google doesn’t seem to do, will give you better results. But the PowerSet index does some (maybe quite a bit of) NLP on the content as well: named entity extraction, for instance, and possibly some anaphora resolution.

I’m giving an encouraging review of PowerSet not just because I worry about a search-engine monoculture. (It’s true that I worry about Google’s dominance, but it’s for the same reason that I worry about the monopolies of Microsoft, Intel and Chiquita Bananas – species diversity is good for the eco-system.) I also find that I often want answers to questions about things, which requires the ability to differentiate between “sense” and “reference”. For instance, I often want to read reviews of {books, digital equipment, etc.} rather than get references to the items themselves, and I have to twist myself into pretzels to formulate a Google query with quotes (for the item) and synonyms for “review”, “opinion”, etc., which are likely to occur in the “about” items I’m looking for.
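
To make the pretzel-twisting concrete, here is a minimal sketch of the kind of keyword query I end up assembling by hand: the item in quotes, followed by a handful of OR-ed “about” words. The function name `build_about_query`, its parameters and the sample book title are all made up for illustration; the only things borrowed from Google are the quoted-phrase, `OR` and `site:` operators.

```python
def build_about_query(item, about_terms, site=None):
    """Build a keyword query: the quoted item plus OR-ed 'about' terms.

    Hypothetical helper for illustration only; it just assembles a string
    and does not call any search engine.
    """
    item = item.strip()
    if not item:
        raise ValueError("item must be a non-empty string")

    parts = []
    if site:
        # Restrict results to one site, e.g. en.wikipedia.org
        parts.append(f"site:{site}")

    # Quote the item so it is matched as an exact phrase
    parts.append(f'"{item}"')

    # Quote multi-word synonyms; leave single words bare
    terms = [f'"{t}"' if " " in t else t for t in about_terms]
    if terms:
        parts.append("(" + " OR ".join(terms) + ")")

    return " ".join(parts)


# Illustrative item and synonyms, not taken from any real query log
print(build_about_query("Philosophical Investigations", ["review", "opinion"]))
```

Even then, the synonym list is guesswork on my part, which is exactly the burden a question-answering engine would take off the user.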

I think PowerSet might find its niche with users who want a particular kind of question-answering engine, and I don’t think the relative business failure of Ask.com should deter them from seeking that niche.

Comments»

1. lemire - December 5, 2007

There is no question that Powerset can do better than Google by using NLP. But Google might be anticipating Powerset already. 😉 Recall that Google does very well in natural language translation…

2. Andre Vellino - December 6, 2007

Yes, that’s true, Daniel. The list of Google features (“definitions”, “product search”, “Q&A” etc.) http://www.google.com/help/features.html is impressive and growing. I’m pretty sure that “product search” uses the WordNet extensions that Applied Semantics had developed when Google bought them. Maybe Google will just buy PowerSet – why not? (Could be PowerSet’s business strategy for all we know.)

3. oioiwp - December 7, 2007

I have had a look at hakia, which is also my home page. It’s quite good. Here’s the first query I posed: ‘When was wittgenstein born?’ It promptly returned all documents containing the date of birth. Next I posed a slightly more complex one, ‘will wittgenstein be reborn?’ (note the modal “will”). It nicely returned something relevant.
I asked powerset for a login ID and am still waiting. From the feel of things I guess they will use LFG-based deep parsing. Your query returned
IMF chief says Canadian dollar, euro, bearing brunt of currency crisis
on hakia. Looks sensible to me.
You can’t compare google and powerset, as they are a different kettle of fish.
Another one to watch out for is Lexxe.com

