
Learning from Watson February 19, 2011

Posted by Andre Vellino in Artificial Intelligence, Information retrieval, Search, Semantics, Statistical Semantics.

Now that Watson has convincingly demonstrated that machines can perform some natural language tasks more effectively than humans can (see a rerun of part of Day 1 of the Jeopardy contest), what is the proper conclusion to be drawn from it?

Should we join hands with “confederates” like Brian Christian and rally against the invasion of smart machines (see his recent piece in The Atlantic and listen to his recent radio interview on CBC)?

Or do we conclude that machines are now (or soon will be) sentient and deserve to be spoken to with respect for their moral standing (see Peter Singer’s article “Rights for Robots“)? Or should we, like NSERC Gold Medal Award winner Geoffrey Hinton, be scared about the long-term social consequences of intelligent robots designed to replace soldiers (listen to his interview on the future of AI machines on CBC’s Quirks & Quarks)?

Before coming to any definite conclusion about how “like” us machines can be, I think we should consider how these machines do what they do. The survey paper in AI Magazine about the design of “DeepQA” by the Watson team gives some indication of the general approach:

DeepQA is a massively parallel, probabilistic evidence-based architecture. For the Jeopardy Challenge, we use more than 100 different techniques for analyzing natural language, identifying sources, finding and generating hypotheses, finding and scoring evidence, and merging and ranking hypotheses….

The overarching principles in DeepQA are massive parallelism, many experts, pervasive confidence estimation, and integration of shallow and deep knowledge.
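To make the shape of that pipeline concrete, here is a minimal sketch in Python (my own illustration, not IBM’s code) of the “many experts” pattern: several independent scorers evaluate each candidate answer, and their evidence scores are merged into a single confidence used to rank hypotheses.

```python
# A minimal sketch of the "many experts" pattern (my illustration, not
# IBM's code). Independent scorers each produce an evidence score for a
# candidate answer; a weighted merge yields one confidence per hypothesis.

from typing import Callable

Scorer = Callable[[str, str], float]   # (clue, hypothesis) -> evidence score

def generate_hypotheses(clue: str) -> list[str]:
    # DeepQA runs many search and NLP techniques at this step; stubbed here.
    return ["candidate A", "candidate B", "candidate C"]

def rank_answers(clue: str, scorers: list[Scorer],
                 weights: list[float]) -> list[tuple[str, float]]:
    ranked = []
    for hypothesis in generate_hypotheses(clue):
        evidence = [score(clue, hypothesis) for score in scorers]
        confidence = sum(w * e for w, e in zip(weights, evidence))
        ranked.append((hypothesis, confidence))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Two stub "experts"; DeepQA learned its merging weights with machine
# learning over hundreds of evidence features.
scorers: list[Scorer] = [lambda c, h: float(len(h)),
                         lambda c, h: 1.0 if "A" in h else 0.0]
print(rank_answers("A stub Jeopardy clue", scorers, weights=[0.1, 0.9]))
```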

Is this the right model for creating artificial cognition? Probably not. As Maarten van Emden and I argue in a recent paper on the Chinese Room argument and the “Human Window”, the question of whether a computer is simulating cognition cannot be decided by how effectively it solves a chess puzzle (for instance) but rather by the mechanism it uses to achieve that end.

In this instance DeepQA combines a number of different techniques from NLP, machine learning, distributed processing and decision theory. This combination is not likely to be an accurate representation of what humans actually do, but it is undeniably successful at the task (see this talk on YouTube about how IBM addressed the Jeopardy problem).

Geoff Hinton (in the radio interview mentioned above) speculates that Watson is a feat of special-purpose engineering but that the general-purpose solution – a large neural network that simulates the learning abilities of the brain – is what the project of AI is really about.

What we suggest in our Human Window paper is that one criterion we can use to determine whether machines are performing adequate simulations of what humans do is whether or not humans are able to follow the steps that the machine is undertaking. On that criterion, I think it’s safe to say that Watson – although very impressive – isn’t quite there yet.

P.S. If you have the patience, I recommend watching a BBC debate from 1973 between Sir James Lighthill, John McCarthy and Donald Michie about whether AI is possible. The context of this video is the “Lighthill Affair” in 1972, recently chronicled on van Emden’s blog (note that the audio on this thumbnail video is rather out of synch!).

It’s amazing how spectacularly wrong an amateur in artificial intelligence (Prof. Lighthill was an applied mathematician specializing in fluid dynamics) can be about the possibility of machines simulating intelligent behaviour. It is a real tragedy that Lighthill’s ideological biases had such disastrous consequences for AI research funding in the UK. His attitude reminds me of Samuel Wilberforce‘s objections to Darwin’s theory of evolution. I find it astonishing that this BBC debate was so civilized in its demeanour.

Are User-Based Recommenders Biased by Search Engine Ranking? September 28, 2010

Posted by Andre Vellino in Collaborative filtering, Recommender, Recommender service, Search, Semantics.

I have a hypothesis (first floated here) that I would like to test with data from query logs: user-based recommenders – such as the ‘bX’ recommender for journal articles – are biased by search-engine language models and ranking algorithms.

Let’s say you are looking for “multiple sclerosis” and you enter those terms as a search query. Some of the articles presented in the search results will likely be relevant, and you download a few of them during your session. This may be followed by another, semantically germane query that yields more article downloads. As a consequence, the usage log (e.g. the SFX log used by ‘bX’) is going to register these articles as having been “co-downloaded”, which is natural enough.

But if this happens a lot, then a collaborative filtering recommender is going to generate recommendations that are biased by the ranking algorithm and language model that produced the search-result ranking: even by PageRank, if you’re using Google.
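To see why, here is a small simulation (a sketch of my hypothesis, not of bX’s actual implementation): if downloads follow rank position, the co-download counts that a usage-log recommender learns from end up dominated by whatever the search engine ranked first.

```python
# Sketch of the bias mechanism (hypothetical model, not bX's code):
# downloads decay with rank position, so the co-download matrix a
# collaborative filtering recommender trains on inherits the ranking.

import itertools
import random
from collections import Counter

random.seed(1)
co_downloads = Counter()

def run_session(ranked_results: list[str]) -> None:
    # Position bias: the chance of downloading an article decays with rank.
    downloaded = [doc for pos, doc in enumerate(ranked_results)
                  if random.random() < 0.8 / (pos + 1)]
    # This is what an SFX-style usage log records as "co-downloaded".
    for pair in itertools.combinations(downloaded, 2):
        co_downloads[pair] += 1

ranking = [f"article_{i}" for i in range(10)]   # the engine's fixed ranking
for _ in range(1000):
    run_session(ranking)

# The most frequent co-download pairs are exactly the top-ranked articles:
print(co_downloads.most_common(3))
```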

In contrast, a citation-based (i.e. author-centric) recommender (such as Sarkanto) will likely yield more semantically diverse recommendations because co-citations will have (we hope!) originated from deeper semantic relations (i.e. non-obvious but meaningful connections between the items cited in the bibliography).
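For contrast, here is the flavour of signal a citation-based recommender can use: co-citation counts computed from bibliographies. This is an illustrative sketch with made-up data, not Sarkanto’s actual code.

```python
# Co-citation counting over hypothetical bibliographies (illustrative only,
# not Sarkanto's code). Two references are "co-cited" when they appear
# together in the same paper's bibliography.

from collections import Counter
from itertools import combinations

bibliographies = {            # hypothetical: paper -> references it cites
    "paper1": ["A", "B", "C"],
    "paper2": ["A", "C", "D"],
    "paper3": ["B", "C"],
}

co_citations = Counter()
for refs in bibliographies.values():
    for pair in combinations(sorted(refs), 2):
        co_citations[pair] += 1

# The strongest pairs come from authors' citation choices, not from any
# search-engine ranking, so the recommendations can be more diverse.
print(co_citations.most_common(2))
```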

Wolfram’s new Search Engine May 16, 2009

Posted by Andre Vellino in CISTI, Search, Semantics.

[Wolfram|Alpha screenshot]

The new “search engine” Wolfram Alpha by Stephen Wolfram is interesting. It’s neither a typical query-based search engine nor a question answering system. But it also isn’t (yet) the “computational knowledge engine” the web site would have us believe. It’s something in between, perhaps.

There’s no question that Wolfram Alpha’s goals for the future are lofty:

Wolfram|Alpha’s long-term goal is to make all systematic knowledge immediately computable and accessible to everyone. We aim to collect and curate all objective data; implement every known model, method, and algorithm; and make it possible to compute whatever can be computed about anything. Our goal is to build on the achievements of science and other systematizations of knowledge to provide a single source that can be relied on by everyone for definitive answers to factual queries.

But we’re not quite there yet – not in May 2009 anyway.

I was interested by the results for “Multiple Sclerosis”, even though what I wanted to know was its known causes:

[Wolfram|Alpha screenshot: results for “Multiple Sclerosis”]

But if you try “collaborative filtering” or “statistical semantics” or “demyelinating disease”, Wolfram Alpha is stumped and you are given subject areas (that it knows about) to browse.

Within a subject area that it does know something about (e.g. “Quantum Physics”) you are presented with template question-types for which Wolfram Alpha will produce answers:

[Wolfram|Alpha screenshot: template question-types for “Quantum Physics”]

Which is quite educational, as far as it goes.

All of this uses what they call “curated data” – which presumably means that lots of formulas and equations have been entered into a web-based version of Mathematica and annotated with subject-area metadata. Is this enough, though? And can we trust the “objectivity” of the knowledge (e.g. what Wolfram Alpha knows about cellular automata)?
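As a toy illustration of what query-answering over curated, subject-annotated data might look like – my guess at the mechanism, inferred from the behaviour above, not Wolfram’s implementation:

```python
# A toy guess at "curated data" plus question templates (an assumption
# inferred from Wolfram|Alpha's observed behaviour, not its actual code).

curated = {
    # (subject, template) -> curated answer
    ("quantum physics", "harmonic oscillator ground state energy"): "E0 = h*f/2",
    ("medicine", "multiple sclerosis"): "a demyelinating disease of the CNS",
}

def answer(subject: str, query: str) -> str:
    key = (subject.lower(), query.lower())
    if key in curated:
        return curated[key]
    # No template matched: fall back to listing known subject areas to browse.
    subjects = sorted({s for s, _ in curated})
    return "Stumped. Browse subjects: " + ", ".join(subjects)

print(answer("Quantum Physics", "harmonic oscillator ground state energy"))
print(answer("Medicine", "collaborative filtering"))   # stumped, as in the post
```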

To be a really useful tool, it sounds like a lot of people are going to have to contribute a lot of information. And even then that information will only be retrievable in a very particular way.

This effort seems more likely than Cyc to succeed at codifying all human knowledge, but it still looks like an impossible task.

The Identity of Objects March 14, 2008

Posted by Andre Vellino in Digital Identity, Epistemology, Semantics.

I was listening to my colleague Richard Ackerman give a preview of his upcoming keynote address at the National Information Standards Organization (NISO) forum when Brian Cantwell Smith’s book On the Origin of Objects popped into mind. (I wrote a short review of it many moons ago and I’m still a big fan.) Brian is now Dean of the Faculty of Information Studies at the University of Toronto, and those of us who have enjoyed The Origin have been patiently waiting for the publication of “The Age of Significance“, a 7-volume series that fleshes out some of the details.

Brian’s book came to mind because of the point Richard makes in his presentation that computers love unique identifiers for objects – books, articles, authors – and that we don’t really have good standards for identifying things. Even if you take into account efforts like Digital Object Identifiers (DOIs), the task of providing unique references to persistent digital objects presents significant hurdles, such as dealing with versions.
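Here is a small sketch of the versioning hurdle (the identifiers and registry below are hypothetical): a content-derived identifier changes with every revision, so on its own it can’t name “the same” evolving object, which is precisely what a persistent identifier is supposed to do.

```python
# Sketch of the versioning hurdle (identifiers and registry are hypothetical).
# A content-derived identifier changes with every revision, so it cannot by
# itself name "the same" evolving digital object.

import hashlib

def content_id(document: bytes) -> str:
    return hashlib.sha256(document).hexdigest()[:12]

v1 = b"The Identity of Objects, draft 1"
v2 = b"The Identity of Objects, draft 2"   # a one-character revision

print(content_id(v1))   # one identifier...
print(content_id(v2))   # ...and a different one: which names the article?

# A DOI-style scheme instead maps one opaque, stable name to a curated
# record that tracks versions; the curation is the hard (and human) part.
registry = {"10.9999/example.123": {"versions": [content_id(v1), content_id(v2)]}}
print(registry)
```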

(more…)

CYC Game November 17, 2007

Posted by Andre Vellino in Knowledge Representation, Logic, Semantics.

In the “neat vs. scruffy” debate in AI, my dedication to the “neat” camp is wavering. Granted, logic is interesting and useful, but is it really the right formalism for knowledge representation?

Take the CYC project, for example. It is tempting to believe, following Wittgenstein’s Tractatus, that “the world is the totality of facts” and “the facts in logical space are the world”. And the CYC project has been driven by this temptation: OpenCYC now contains about 300,000 “concepts”, 3,000,000 “assertions”, 26,000 relations between them, and an inference engine with which to draw conclusions.

If you believe in helping CYC to learn, you can play this collaborative game to teach CYC more facts about the world. The game composes (seemingly random) questions about relations that might be meaningful, and you get to tell it whether the generated propositions are true or false.
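Here’s a minimal sketch of how such a game can work (my illustration, not Cyc’s code): compose candidate assertions from known concepts and relations, then ask a human to confirm or reject each one.

```python
# A minimal sketch of the game loop (my illustration, not Cyc's code):
# compose candidate assertions from known concepts and relations, then
# ask a human to confirm or reject each generated proposition.

import random

concepts = ["dog", "water", "violin"]          # hypothetical sample concepts
relations = ["isA", "canDrink", "canPlay"]     # hypothetical sample relations
knowledge_base = set()

def propose() -> tuple[str, str, str]:
    # Seemingly random composition, as described above; many proposals
    # will be meaningless, which is exactly what the human filters out.
    return (random.choice(concepts), random.choice(relations),
            random.choice(concepts))

for _ in range(3):
    subject, relation, obj = propose()
    reply = input(f"Is it true that {subject} {relation} {obj}? (y/n) ")
    if reply.strip().lower() == "y":
        knowledge_base.add((subject, relation, obj))   # a new "assertion"

print(f"Learned {len(knowledge_base)} new assertions.")
```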

(more…)

Powerset October 30, 2007

Posted by Andre Vellino in Search, Semantics.

I’m looking forward to hearing back from Powerset Labs so I can try their new “semantic” search engine, built on technology from Xerox PARC. I registered, but there appears to be a backlog due to the buzz in the blogosphere. I find it interesting to see what kinds of positions they are advertising: this is the first time I’ve seen a posting for a “relevance ranking engineer”.