Learning from Watson
February 19, 2011. Posted by Andre Vellino in Artificial Intelligence, Information retrieval, Search, Semantics, Statistical Semantics.
Now that Watson has convincingly demonstrated that machines can perform some natural language tasks more effectively than humans can (see a rerun of part of Day 1 of the Jeopardy contest), what is the proper conclusion to be drawn from it?
Or do we conclude that machines are now (or soon will be) sentient and deserve to be spoken to with respect for their moral standing (see Peter Singer’s article “Rights for Robots”)? Or should we, like NSERC Gold Medal Award winner Geoffrey Hinton, be scared about the long-term social consequences of intelligent robots designed to replace soldiers (listen to his interview on the future of AI machines on CBC’s Quirks & Quarks)?
Before coming to any definite conclusion about how “like” us machines can be, I think we should consider how these machines do what they do. The survey paper in AI Magazine about the design of “DeepQA” by the Watson team gives some indications of the general approach:
DeepQA is a massively parallel, probabilistic evidence-based architecture. For the Jeopardy Challenge, we use more than 100 different techniques for analyzing natural language, identifying sources, finding and generating hypotheses, finding and scoring evidence, and merging and ranking hypotheses….
The overarching principles in DeepQA are massive parallelism, many experts, pervasive confidence estimation, and integration of shallow and deep knowledge.
Is this the right model for creating artificial cognition? Probably not. As Maarten van Emden and I argue in a recent paper on the Chinese Room argument and the “Human Window”, the question of whether a computer is simulating cognition cannot be decided by how effectively it solves a chess puzzle (for instance) but rather by the mechanism it uses to achieve that end.
In this instance DeepQA uses and combines a number of different techniques from NLP, machine learning, distributed processing and decision theory – which is not likely to be an accurate representation of what humans actually do but it is undeniably successful at that task (see this talk on YouTube about how IBM addressed the Jeopardy problem).
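To make “scoring evidence, and merging and ranking hypotheses” a little more concrete, here is a toy sketch in Python. It is not IBM's code: the scorer names, the weights, the candidate answers and the choice of a logistic function to turn evidence into a confidence are all assumptions made for illustration.

```python
import math

# Hypothetical weights, one per evidence scorer. In a real system these
# would be learned from training questions; here they are invented.
WEIGHTS = {"passage_match": 2.0, "type_match": 1.5, "popularity": 0.5}


def confidence(scores, weights, bias=0.0):
    """Merge per-scorer evidence into a single confidence in (0, 1)."""
    z = bias + sum(weights[name] * value for name, value in scores.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_hypotheses(candidates, weights):
    """Return (answer, confidence) pairs, most confident first."""
    ranked = [(answer, confidence(scores, weights))
              for answer, scores in candidates.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Invented evidence scores for three candidate answers.
CANDIDATES = {
    "candidate_a": {"passage_match": 0.2, "type_match": 0.0, "popularity": 0.9},
    "candidate_b": {"passage_match": 0.8, "type_match": 1.0, "popularity": 0.7},
    "candidate_c": {"passage_match": 0.5, "type_match": 1.0, "popularity": 0.1},
}

ranking = rank_hypotheses(CANDIDATES, WEIGHTS)
for answer, conf in ranking:
    print(f"{answer}: {conf:.2f}")
```

Even in this toy version, nothing in the ranking step resembles a chain of reasoning a person could follow: the answer simply falls out of a weighted sum.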
Geoff Hinton (in the radio interview mentioned above) speculates that Watson is a feat of special-purpose engineering but that the general-purpose solution – a large neural network that simulates the learning abilities of the brain – is what the project of AI is really about.
What we suggest in our Human Window paper is that one criterion we can use to determine whether machines are performing adequate simulations of what humans do is whether or not humans are able to follow the steps that the machine is undertaking. On that criterion, I think it’s safe to say that Watson – although very impressive – isn’t quite there yet.
P.S. If you have the patience, I recommend watching a BBC debate from 1973 between Sir James Lighthill, John McCarthy and Donald Michie about whether AI is possible. The context of this video is the “Lighthill Affair” in 1972, recently chronicled on van Emden’s blog (note that the audio on this thumbnail video is rather out of synch!).
It’s amazing how spectacularly wrong an amateur in artificial intelligence (Prof. Lighthill was an applied mathematician specializing in fluid dynamics) can be about the possibility of machines simulating intelligent behaviour. It is a real tragedy that Sir James Lighthill’s ideological biases had such disastrous consequences for AI research funding in the UK. His attitude reminds me of Samuel Wilberforce’s objections to Darwin’s theory of evolution. I find it astonishing that this BBC debate was so civilized in its demeanour.

Are User-Based Recommenders Biased by Search Engine Ranking?
September 28, 2010. Posted by Andre Vellino in Collaborative filtering, Recommender, Recommender service, Search, Semantics.
I have a hypothesis (first proposed here) that I would like to test with data from query logs: user-based recommenders – such as the ‘bX’ recommender for journal articles – are biased by search-engine language models and ranking algorithms.
Let’s say you are looking for “multiple sclerosis” and you enter those terms as a search query. Some of the articles in the search results will likely be relevant, and you download a few of them during your session. This may be followed by another, semantically related query that yields more article downloads. As a consequence, the usage log (e.g. the SFX log used by ‘bX’) is going to register these articles as having been “co-downloaded”, which is natural enough.
But if this happens a lot, then a collaborative filtering recommender is going to generate recommendations that are biased by the ranking algorithm and language model that produced the search-result ranking: even by PageRank, if you’re using Google.
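The mechanism can be sketched in a few lines of Python. This is only a minimal illustration of co-download counting, not how ‘bX’ actually works: the session data and article identifiers are invented, and a real usage log would need sessionization and much more filtering.

```python
from collections import Counter
from itertools import combinations


def co_download_counts(sessions):
    """Count how often each pair of articles is downloaded in one session."""
    counts = Counter()
    for downloads in sessions:
        # Sort so that each unordered pair is always stored the same way round.
        for a, b in combinations(sorted(set(downloads)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(article, counts, k=3):
    """Return up to k articles most often co-downloaded with `article`."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == article:
            related[b] += n
        elif b == article:
            related[a] += n
    return [item for item, _ in related.most_common(k)]


# Invented sessions: each list holds the articles downloaded after one
# or more searches such as "multiple sclerosis".
SESSIONS = [
    ["ms_review", "ms_trial", "ms_imaging"],
    ["ms_review", "ms_trial"],
    ["ms_review", "vitamin_d_study"],
]

counts = co_download_counts(SESSIONS)
top = recommend("ms_review", counts, k=2)
print(top)
```

Any article that the search engine consistently ranks on the first page for a popular query will end up in many such sessions, so its pair counts grow whether or not readers found it the most useful.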
In contrast, a citation-based (i.e. author-centric) recommender (such as Sarkanto) will likely yield more semantically diverse recommendations because co-citations will have (we hope!) originated from deeper semantic relations (i.e. non-obvious but meaningful connections between the items cited in the bibliography).
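For comparison, here is an equally small sketch of a co-citation measure. I am assuming a Jaccard overlap between the sets of papers that cite two articles; this is one common way to score co-citation and is not a description of the algorithm Sarkanto uses. The bibliographies are invented.

```python
from collections import defaultdict


def citing_sets(bibliographies):
    """Map each cited article to the set of papers that cite it."""
    cited_by = defaultdict(set)
    for paper, references in bibliographies.items():
        for ref in references:
            cited_by[ref].add(paper)
    return cited_by


def co_citation_similarity(a, b, cited_by):
    """Jaccard overlap between the papers citing a and those citing b."""
    citers_a = cited_by.get(a, set())
    citers_b = cited_by.get(b, set())
    union = citers_a | citers_b
    return len(citers_a & citers_b) / len(union) if union else 0.0


# Invented reference lists: citing paper -> articles in its bibliography.
BIBLIOGRAPHIES = {
    "paper_1": ["ms_review", "immunology_text"],
    "paper_2": ["ms_review", "immunology_text", "stats_methods"],
    "paper_3": ["ms_review", "stats_methods"],
}

cited_by = citing_sets(BIBLIOGRAPHIES)
sim = co_citation_similarity("immunology_text", "stats_methods", cited_by)
print(round(sim, 3))
```

The input here is what authors chose to put in their bibliographies, so no search-result ranking enters the computation directly.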

Sarkanto Scientific Search
September 13, 2010. Posted by Andre Vellino in Collaborative filtering, Digital library, Information retrieval, Recommender, Recommender service, Search.
A few weeks ago I finished deploying a version of a collaborative recommender system that uses only article citations as a basis for recommending journal articles. This tool allows you to search ~7 million STM (Scientific, Technical and Medical) articles up to Dec. 2009 and to compare citation-based recommendations (using the Synthese recommender) with recommendations generated by ‘bX’ (a user-based collaborative recommender from Ex Libris). You can try the Sarkanto demo and read more about how ‘bX’ and Sarkanto compare.
Note that I’m also using this implementation to experiment with the Google Translate API and the Microsoft Translator, both to expand queries into the other Canadian official language and to translate various bibliographic fields when returning search results.

Feedback Effects in Google Instant
September 8, 2010. Posted by Andre Vellino in Search.
I haven’t experimented with Google Instant long enough to tell if I will like it over the long run, but it certainly is an extraordinary feat of engineering! This new feature – which uses the “Google Suggest” auto-completion feature and Ajax to give you “instant” search results based on just the first few characters of your search query – imposes a dramatic load increase on Google servers. Yet clever engineering feats in caching and efficient query optimization have produced the desired scalability results and it is impressive to use.
(BTW – If you want to try “Google Instant” and you are in a country that doesn’t have it yet, try “/ncr” (no country redirect), i.e. “http://www.google.com/ncr”)
One effect that is sure to manifest over time is the feedback that “Google Instant” will have on “Google Suggest”. Just as people currently (mostly) click on one of the top-10 search results, so I expect most users will increasingly search for what Google suggests rather than for their own terms and expressions, thus narrowing the range of options that “Suggest” can offer users over time.
One (interesting) issue is going to be: does “Instant” degrade the quality of “Suggest”? That is, the more people use “Instant”, the more the “top-N” suggested terms are reinforced, thus thinning out the “long tail” of queries. Is “Instant” going to increasingly cater to the lowest common denominator?
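A toy simulation shows what this reinforcement would look like if it happened. Everything in it is an assumption: the starting query shares are invented, and I suppose that in each round a fixed fraction of query volume (`adoption`) is redirected to one of the top-N suggestions in proportion to their current popularity, after which “Suggest” is rebuilt from the resulting log. Real user behaviour is certainly messier.

```python
def simulate_feedback(shares, top_n, adoption, rounds):
    """Return the long-tail share of query volume after each round.

    shares   -- dict mapping query -> fraction of total query volume
    top_n    -- number of queries offered as suggestions
    adoption -- fraction of volume that switches to a suggestion per round
    """
    shares = dict(shares)
    tail_history = []
    for _ in range(rounds):
        top = sorted(shares, key=shares.get, reverse=True)[:top_n]
        top_total = sum(shares[q] for q in top)
        updated = {}
        for query, share in shares.items():
            kept = (1 - adoption) * share
            # Redirected volume is split among the suggestions in
            # proportion to their current share.
            boost = adoption * share / top_total if query in top else 0.0
            updated[query] = kept + boost
        shares = updated
        tail_history.append(sum(s for q, s in shares.items() if q not in top))
    return tail_history


# Invented starting distribution for queries beginning with "w".
START = {
    "weather": 0.40,
    "walmart": 0.30,
    "wikipedia": 0.20,
    "wombat habitat": 0.07,
    "welsh grammar": 0.03,
}

history = simulate_feedback(START, top_n=3, adoption=0.2, rounds=3)
print([round(t, 4) for t in history])
```

Under these assumptions the tail shrinks by the adoption rate in every round, which is exactly the thinning-out effect I am worried about. Whether real query logs behave this way is an empirical question.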
The demos given at the Google Instant launch by Google executives showed off how just typing “w” results in an instant and prescient result for “The Weather Network” (which, surprise, is what that demo scenario has you wanting!). I thought it might be interesting to find out what Google Instant produces with each of the 26 letters of the alphabet. Here are the results:
A: Amazon.com: Online Shopping for Electronics …
B: Best Buy: TVs, Digital Cameras …
C: craigslist: los angeles classifieds for jobs …
D: Dictionary.com | Find the Meanings …
E: eBay – New & used electronics, cars, …
F: Welcome to Facebook
G: Gmail: Email from Google
H: Windows Live Hotmail
I: Welcome to IKEA.com
J: JetBlue | Airline Tickets, Flights, and Airfare
L: Lowe’s Home Improvement: Appliances, Tools…
M: MapQuest Maps – Driving Directions – Map
N: Netflix – TV & movies instantly streamed online …
O: Orbitz Travel: Airline Tickets, Cheap Hotels …
P: Pandora Radio – Listen to Free Internet Radio …
Q: Famous Quotes and Quotations at BrainyQuote
R: REI – Outdoor Gear, Equipment …
S: Sears: Appliances, Tools, Electronics …
T: Target.com – Furniture, Baby, Toys …
U: USPS – The United States Postal Service …
V: Verizon | Broadband (DSL) Internet Service …
W: Current Weather – The Weather Network
X: Xbox.com | Home
Z: Zillow – Real Estate, Homes for Sale ….
“Suggest” results are clearly dominated by big on-line businesses: Sears, Verizon, Microsoft, Facebook, Amazon, eBay…. Is that really what most Google users search for most of the time? If so, I despair for the democratic internet.

Siri (imGenie Reborn)
May 9, 2010. Posted by Andre Vellino in Collaborative filtering, Information retrieval, Search.
It’s too bad we didn’t patent a few of the ideas we had at imGenie – 9 years ago. We might have made a killing 10 years later – assuming the recent iPhone app Siri is a hit. I expect it might not be, though, for some of the same reasons that imGenie didn’t succeed.
imGenie was a small Ottawa startup born from the demise of Nortel. You used to be able to find some references to it on Google as recently as 2 years ago and even on the Wayback Machine – but it appears to have entered a digital black hole now. The idea was: speak your commands to a voice-activated information retrieval server and get your answer back on your cell phone.
Remember, this was before 3G services and way before the iPhone. Our prototype (that we built in ~ 3 months) used SMS as the channel for getting answers back to the client phone and a Jabber Instant Messaging interface from the server to generate the short messages. We used a Bevocal (now acquired by Nuance) service for the speech-to-text part and we rolled our own IR service. A collaborative filtering component was in there too as a method for making recommendations.
When our then CEO pitched the idea on a Report on Business (ROB) “meet the VCs” type of show (a precursor to Dragons’ Den), the VCs really liked the idea and the team we had put together (100% of the development team had a Ph.D. in something or other!). But they nixed the pitch with the comment “cell phone companies can’t even provide phone service on the Don Valley Parkway – what makes you think they are ready for such advanced services?”
And the VCs were right, of course. imGenie was 5-10 years ahead of its time, as were a lot of ideas that were born of ex-Nortel engineers. It’s too bad, though. I think there is something to the idea of computer-aided collaborative decision making. But I doubt it’s the killer app for teens who need to decide which restaurant and movie to go to.

Google Books on Charlie Rose
March 8, 2010. Posted by Andre Vellino in CISTI, Digital library, General, Open Access, Search.
I found this conversation about the “Google Books” library very interesting. It took place last night between Robert Darnton (professor of American cultural history at Harvard and Director of the Harvard University Library), David Drummond (Chief Legal Officer at Google), bestselling author James Gleick and Charlie Rose (from PBS).
I was especially pleased to see Prof. Darnton insist on the need to guarantee “the public interest”. Only he seemed to have the long view, though.