Why Sentences Have Multiple Parse Trees
While working with the OpenNLP tools, I noticed that the parser will often return more than one parse tree. This usually happens because a phrase in the sentence can attach to more than one thing. A common example is, "The thief cut the painting with the knife." Did the thief use a knife to cut the painting, or did the thief cut the painting that had a knife in it? Each reading has its own structure, so this yields two possible trees.
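To make the two readings concrete, here is a minimal sketch that enumerates every tree for the example sentence. The grammar and lexicon are toy assumptions I made up for this one sentence; they are not the model OpenNLP uses, and the `parses` function is hypothetical.

```python
from functools import lru_cache

# Toy grammar in Chomsky normal form: (parent, left child, right child).
# An illustrative assumption, not OpenNLP's actual model.
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # "with the knife" modifies the cutting
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # "with the knife" modifies the painting
    ("PP", "P", "NP"),
]
LEXICON = {
    "the": "Det", "thief": "N", "painting": "N", "knife": "N",
    "cut": "V", "with": "P",
}

def parses(words):
    """Return every bracketed parse of words that is rooted at S."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def build(label, start, end):
        # Single word: it either has this part of speech or it doesn't.
        if end - start == 1:
            word = words[start]
            return (f"({label} {word})",) if LEXICON.get(word) == label else ()
        trees = []
        for parent, left, right in BINARY:
            if parent != label:
                continue
            # Try every way of splitting the span between the two children.
            for split in range(start + 1, end):
                for l in build(left, start, split):
                    for r in build(right, split, end):
                        trees.append(f"({label} {l} {r})")
        return tuple(trees)

    return list(build("S", 0, len(words)))

for tree in parses("the thief cut the painting with the knife".split()):
    print(tree)
```

With this grammar the sentence gets exactly two trees, and they differ only in where the "with the knife" phrase hangs: under the noun phrase "the painting" or under the verb phrase "cut the painting".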
A less confusing sentence might be, "The thief admired the painting with the knife." Now we know that there must have been a knife in the painting because one does not use a knife to admire a painting. Easy for us to say.
Computers aren't so lucky. They don't have the vast library of common sense we do, so they have no idea that a knife isn't generally useful when admiring a painting. We gain common sense by making observations about the world around us. Computers get their version of it by being fed a brick-load of sentences and building huge tables of probabilities. Using models like these, the parser can make a good guess about which structure is correct, but it won't always be right.
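The "tables of probabilities" idea can be sketched in a few lines. Everything here is an assumption for illustration: the six labelled examples are invented, real parsers learn from very large treebanks, and `attachment_probs` and `guess` are hypothetical names rather than anything from OpenNLP.

```python
from collections import Counter

# Invented, hand-labelled examples: (verb, object of "with", attachment site).
# "verb" means the phrase describes the action; "noun" means it describes the object.
TRAINING = [
    ("cut", "knife", "verb"),
    ("cut", "scissors", "verb"),
    ("cut", "frame", "noun"),
    ("admired", "frame", "noun"),
    ("admired", "knife", "noun"),
    ("admired", "binoculars", "verb"),
]

def attachment_probs(verb, smoothing=1.0):
    """Estimate P(attachment site | verb) with add-one style smoothing."""
    counts = Counter(site for v, _, site in TRAINING if v == verb)
    total = sum(counts.values()) + 2 * smoothing
    return {site: (counts[site] + smoothing) / total for site in ("verb", "noun")}

def guess(verb):
    """Pick the more probable attachment site for a 'with' phrase."""
    probs = attachment_probs(verb)
    return max(probs, key=probs.get)

print(guess("cut"), attachment_probs("cut"))
print(guess("admired"), attachment_probs("admired"))
```

On this toy data the guess for "cut" is verb attachment and the guess for "admired" is noun attachment, which matches our intuition. It is still only a guess: the "admired ... with binoculars" example shows that the most probable structure is not always the right one.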