
CYC Game November 17, 2007

Posted by Andre Vellino in Knowledge Representation, Logic, Semantics.

In the “neat vs. scruffy” debate in AI, my dedication to the “neat” camp is wavering. Granted, logic is interesting and useful, but is it really the right formalism for knowledge representation?

Take the CYC project, for example. It is tempting to believe, following Wittgenstein’s Tractatus, that “the world is the totality of facts” and “the facts in logical space are the world”. The CYC project has been driven by this temptation: OpenCYC now contains about 300,000 “concepts”, 3,000,000 “assertions”, and 26,000 relations between them, along with an inference engine with which to draw conclusions.

If you believe in helping CYC to learn, you can play this collaborative game to teach it more facts about the world. The game composes (seemingly random) propositions about relations, which may or may not be meaningful, and you get to tell it whether each generated proposition is true or false.

Example:

Tools are typically found in jewelry store facility (sic – absence of “a” article between “in” and “jewelry”)

True or false? Well, that’s a good question, isn’t it? What does CYC mean by “tools”? Garden tools? Woodworking tools? Or watch repair tools? Apparently 36% of game-players think this proposition is false. But it’s true, under at least one sensible interpretation of “tools”, right?

What about:

Every internal combustion-powered motor vehicle has exactly one gas cap.

Only 41% of respondents thought that was true. Well, of course, this proposition isn’t necessarily true: some vehicles may have two gas tanks. But generally it is true. Do we really need to encode universally quantified assertions of any kind? I suppose the reason would be to save space and to use universal instantiation to deduce new facts from general rules. But do we in fact reason in Socratic syllogisms?
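For concreteness, here is a toy sketch of what universal instantiation buys you: one general rule stands in for a separate fact about every individual. The names and data below are hypothetical, and this is only an illustration, not how CYC's inference engine represents or applies rules.

```python
# Toy sketch of universal instantiation: one general rule, applied to each
# known individual. All names are hypothetical; this is not CYC's API.

def instantiate(rule, individuals):
    """Apply a universally quantified rule to every individual of the rule's kind."""
    kind, conclusion = rule
    return [
        (name, conclusion)
        for name, kinds in individuals.items()
        if kind in kinds
    ]

# "Every internal combustion-powered motor vehicle has exactly one gas cap."
rule = ("InternalCombustionVehicle", "has exactly one gas cap")

individuals = {
    "my_sedan": {"InternalCombustionVehicle"},
    "dual_tank_pickup": {"InternalCombustionVehicle"},  # an exception in reality
    "my_bicycle": {"Bicycle"},
}

derived = instantiate(rule, individuals)
for name, conclusion in derived:
    print(f"{name} {conclusion}")
```

Note that the rule happily derives a single gas cap for the dual-tank pickup too: the space saving comes at the price of being wrong about every exception.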

Consider this assertion from the CYC game:

Most BTR70 armored personnel carriers are wider than most BDRM-2s.

I have no idea, of course. But how many possible facts of that kind are there? Suppose there are even only 4 relations being considered between objects that take up volume in space: “wider”, “taller”, “heavier”, “more fragile”. And suppose there are 100,000 objects worth considering under those relations. With roughly 5 billion pairs of objects, that’s about 20 billion facts right there.
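A quick way to check the count, assuming each relation needs to be settled once per unordered pair of distinct objects (the function name is just for illustration):

```python
from math import comb

def comparison_facts(n_objects: int, n_relations: int) -> int:
    """Count facts if each relation is decided once per unordered pair of distinct objects."""
    return n_relations * comb(n_objects, 2)

total = comparison_facts(100_000, 4)
print(f"{total:,}")  # 19,999,800,000
```

Counting ordered pairs instead (so that “A is wider than B” and “B is wider than A” are separate candidate assertions) would double the figure.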

Another assertion was:

A feeling of courage is unlikely to be accompanied by a feeling of initiative.

100% of respondents (except me) thought this was false! Really? What about the (presumably) courageous World War One soldiers who blindly followed orders to their certain deaths? Were they showing initiative?

Or how about:

The act of Irish step dancing expresses enjoyment.

which CYC believes to be true. Well, I’m sure many Irish step dancers enjoy what they do and I doubt there’s much Irish dancing at funerals, but is this statement really true?

These completely nonsensical ones were pretty funny:

Pages are typically located in homes.

Alwayses are typically located in school building k through 12.

People typically perform or are involved in retirement more frequently than they perform or are involved in confusing an opponent.

One has to wonder: does a machine really “know” anything about the component terms in these assertions if it needs to ask about them?

What about transient “facts”, assertions whose truth depends on conditions in time? For instance:

Islam is a major religion in the united kingdom.

Most government ministers are taller than most spokespersons.

which CYC also believes to be true. Well, I don’t know, but if these statements aren’t true now, they may well be one of these days, and even if they are true now, they may not always be. So in what sense are these assertions “facts”?

I don’t know whether I am just despairing about the futility of CYC or whether the entire project of Logicism, including the semantic web, is impugned by it.

Comments»

1. Yes, the Semantic Web is Flawed - November 19, 2007

[…] Andre is coming to the dark side by showing how hard it is to ontologize the world around us in a community-driven […]

2. Daniel Lemire - November 19, 2007

Come to the dark side, Andre. Feel the power, Andre. Can you feel it?

You have no idea how powerful you become once you stop wondering whether the paint is in the room or not.

I have updated my blog post on the flaws of the semantic web with a link to this post:

http://www.daniel-lemire.com/blog/archives/2007/11/02/yes-the-semantic-web-is-flawed/

I am going to start documenting all of the traitors to the Semantic Web cause. Once there are enough of you, we are going to take over… (evil laugh)

