Why Sentences Have Multiple Parse Trees

While working with the OpenNLP tools, I noticed that the parser will often return more than one parse tree. This usually happens because there is more than one place to attach a phrase within the sentence. A common example is, "The thief cut the painting with the knife." Did the thief use a knife to cut the painting, or did the thief cut a painting that had a knife in it? Each reading yields its own tree, so the sentence has two possible parses.
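To make the two trees concrete, here is a minimal sketch in Python. It is not how OpenNLP works internally; the toy grammar, the lexicon, and the `parses` function are all made up for illustration. It simply enumerates every tree the toy grammar allows, and the phrase "with the knife" can attach either to the verb phrase or to the noun phrase.

```python
from functools import lru_cache

# Toy grammar, invented for this example (not OpenNLP's model).
# Each key is a pair of child labels; the value lists possible parents.
BINARY_RULES = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],   # "with the knife" modifies the cutting
    ("Det", "N"): ["NP"],
    ("NP", "PP"): ["NP"],   # "with the knife" modifies the painting
    ("P", "NP"): ["PP"],
}

# Toy lexicon: one part-of-speech tag per word.
LEXICON = {
    "the": "Det",
    "thief": "N",
    "painting": "N",
    "knife": "N",
    "cut": "V",
    "admired": "V",
    "with": "P",
}


def parses(words):
    """Return every full-sentence tree as a bracketed string."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def span(i, j):
        # All (label, tree) pairs that cover words[i:j].
        if j - i == 1:
            tag = LEXICON[words[i]]
            return ((tag, f"({tag} {words[i]})"),)
        found = []
        for k in range(i + 1, j):  # try every split point
            for left_label, left_tree in span(i, k):
                for right_label, right_tree in span(k, j):
                    for parent in BINARY_RULES.get((left_label, right_label), ()):
                        found.append((parent, f"({parent} {left_tree} {right_tree})"))
        return tuple(found)

    return [tree for label, tree in span(0, len(words)) if label == "S"]


if __name__ == "__main__":
    for tree in parses("the thief cut the painting with the knife".split()):
        print(tree)
```

Swapping "cut" for "admired" gives the same number of trees, because nothing in a grammar like this knows what knives are for.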

A less confusing sentence might be, "The thief admired the painting with the knife." Now we know that there must have been a knife in the painting because one does not use a knife to admire a painting. Easy for us to say.

Computers aren't so lucky. They don't have the vast library of common sense we do, so they have no idea that a knife isn't generally useful when admiring a painting. We gain common sense by making observations about the world around us. Computers get their stand-in for common sense by being fed a brick-load of sentences and building huge tables of probabilities. Using models like these, the parser can make a good guess at which structure is correct, but it won't always be right.
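The "tables of probabilities" idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the counts are invented, the names `ATTACHMENT_COUNTS`, `attachment_probabilities`, and `best_attachment` are hypothetical, and real statistical parsers use far richer models than a single verb/preposition lookup.

```python
# Invented counts, for illustration only: how often a "with" phrase
# attached to the verb versus the noun in some imagined training corpus.
ATTACHMENT_COUNTS = {
    ("cut", "with"): {"verb": 80, "noun": 20},
    ("admired", "with"): {"verb": 5, "noun": 95},
}


def attachment_probabilities(verb, preposition):
    """Turn raw counts into relative frequencies for each attachment site."""
    counts = ATTACHMENT_COUNTS[(verb, preposition)]
    total = sum(counts.values())
    return {site: count / total for site, count in counts.items()}


def best_attachment(verb, preposition):
    """Pick the attachment site with the highest estimated probability."""
    probabilities = attachment_probabilities(verb, preposition)
    return max(probabilities, key=probabilities.get)


if __name__ == "__main__":
    for verb in ("cut", "admired"):
        print(verb, attachment_probabilities(verb, "with"), best_attachment(verb, "with"))
```

With these made-up numbers the guess for "cut" is the verb and the guess for "admired" is the noun. It is still only a guess: a thief who really did cut a painting of a knife would be parsed wrong.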

Comments

Anya
Hey, Daniel! I don't know why, but I can't access your sketchblog, and the link for your site on my blogroll is wrong. I'm just going to link to danielmclaren.net, ok? I like a lot of your art! Very cool!
