RuCCS Colloquia

Computing linguistically-based textual inferences

Dr. Lauri Karttunen

Tuesday, January 29, 2008, 01:00pm - 02:00pm

Palo Alto Research Center, Stanford University


A long-standing goal of computational linguistics is to build a system for answering natural language questions. A successful QA system has to recognize semantic relations between sentences. If the user would like to know the answer to a question such as "Did Shackleton reach the South Pole?", the system should recognize that the sentence "Shackleton failed to reach the South Pole" contains the answer. None of the current search engines is capable of delivering a simple NO answer in such cases. The system I will describe in this talk does make the correct inference. It is the Bridge system (a bridge from language to logic) developed at the Palo Alto Research Center.

The particular task the talk will focus on is entailment and contradiction detection (ECD), a more refined variant of the PASCAL RTE (Recognizing Textual Entailment) challenge. Given a passage of text and a query, does the query sentence follow from the text in the passage, is it contradicted by it, or neither? Here are examples of all three cases:

Passage: Oswald assassinated Kennedy.
Query: Did Kennedy die?

Passage: Bill forgot to shave this morning.
Query: Did Bill shave this morning?

Passage: There is a cat in the yard.
Query: Is there a black cat in the yard?

The ECD algorithm operates at the level of Abstract Knowledge Representation (AKR), without the need for disambiguation. An AKR, derived from the syntactic and semantic analyses of a sentence, is a flat set of facts involving concepts, roles, and contexts. Texts are parsed to produce packed syntactic representations, and these are rewritten and canonicalized, without unpacking, into AKR. Canonicalization is determined both by the structure of the representations and by the lexical items involved. The system includes knowledge about words and the relations between them, as encoded in resources such as WordNet and VerbNet. It also includes knowledge about lexically or constructionally triggered presuppositions and entailments.
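To make the idea of a "flat set of facts" concrete, here is a minimal sketch in Python. It is not the Bridge system's actual notation: the `Fact` type, the fact kinds, the role names, and the identifiers such as `shave_1` and `ctx_shave` are all hypothetical, hand-written for the passage "Bill forgot to shave this morning."

```python
from typing import NamedTuple

class Fact(NamedTuple):
    kind: str    # "concept", "role", or "context" (assumed labels)
    args: tuple  # arguments of the fact

# Hand-written, illustrative facts; every name here is an assumption.
facts = {
    Fact("context", ("top",)),
    Fact("context", ("ctx_shave",)),
    Fact("concept", ("forget_1", "forget")),
    Fact("concept", ("shave_1", "shave")),
    Fact("concept", ("Bill_1", "Bill")),
    Fact("role", ("agent", "forget_1", "Bill_1")),
    Fact("role", ("agent", "shave_1", "Bill_1")),
    Fact("role", ("complement", "forget_1", "ctx_shave")),
}

def facts_of_kind(kind):
    """Return all facts of one kind; the set is flat, so this is a simple filter."""
    return {f for f in facts if f.kind == kind}
```

Because the representation is a flat set rather than a nested tree, operations such as filtering or removing individual facts need no traversal.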

The ECD process first aligns context and concept terms, and then computes specificity relations between the aligned concept terms. Some special case reasoners support identification of named objects, comparison of specificity of WordNet synsets, and compatibility of cardinality restrictions. All the query facts that are entailed by the corresponding passage facts get removed. If all the query facts get eliminated, the system will respond YES. If a conflict is detected, that is, if one of the aligned terms is instantiable and the other is uninstantiable, the system will respond NO. If some query facts remain at the end, the response is UNKNOWN.
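The three-way decision described above can be sketched roughly as follows. This is a toy illustration under strong assumptions, not the Bridge implementation: the `Fact` type, the `HYPERNYMS` table (standing in for a WordNet-style specificity check), and the `ecd` function are all hypothetical, and the sketch ignores contexts, cardinality, and the way negation reverses specificity.

```python
from dataclasses import dataclass

# Hypothetical specificity table: each term maps to a more general term.
HYPERNYMS = {"black cat": "cat", "cat": "animal"}

def at_least_as_specific(a, b):
    """True if term a equals b or lies below b in the toy hierarchy."""
    while a is not None:
        if a == b:
            return True
        a = HYPERNYMS.get(a)
    return False

@dataclass(frozen=True)
class Fact:
    term: str           # concept term, e.g. "shave"
    instantiable: bool  # False if the text implies the thing does not exist

def ecd(passage, query):
    """Return "YES", "NO", or "UNKNOWN" for lists of passage and query facts."""
    remaining = []
    for q in query:
        # Align q with a passage fact whose term is at least as specific.
        match = next(
            (p for p in passage if at_least_as_specific(p.term, q.term)), None
        )
        if match is None:
            remaining.append(q)   # not entailed, so it stays
        elif match.instantiable != q.instantiable:
            return "NO"           # instantiable vs. uninstantiable conflict
        # otherwise q is entailed and is dropped
    return "UNKNOWN" if remaining else "YES"
```

With hand-built facts, `ecd([Fact("shave", False)], [Fact("shave", True)])` models the "Bill forgot to shave" example, and `ecd([Fact("cat", True)], [Fact("black cat", True)])` models the cat example, where the query is more specific than the passage.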

The linguistic phenomena illustrated in this presentation include lexical entailments (kill => die), relations between lexical predicates or phrasal constructions and their embedded complements (forget that S => S, forget to S => not S, take the trouble to S => S, waste an opportunity to S => not S), and inferring temporal relations from temporal modifiers.
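The complement patterns listed above can be pictured as a lookup table from a construction to the truth of its embedded clause. The sketch below is only an illustration: the table name `COMPLEMENT_HOLDS` and the function `complement_holds` are hypothetical, the entries cover only the four affirmative patterns given in the abstract, and nothing is assumed about how these constructions behave under negation.

```python
# Maps (predicate, complement type) to whether the complement S holds
# when the construction is asserted affirmatively.
COMPLEMENT_HOLDS = {
    ("forget", "that"): True,               # forget that S => S
    ("forget", "to"): False,                # forget to S => not S
    ("take the trouble", "to"): True,       # take the trouble to S => S
    ("waste an opportunity", "to"): False,  # waste an opportunity to S => not S
}

def complement_holds(predicate, complement_type):
    """Return True/False if the pattern is listed, or None if it is unknown."""
    return COMPLEMENT_HOLDS.get((predicate, complement_type))
```

For "Bill forgot to shave this morning", `complement_holds("forget", "to")` yields `False`, which is what licenses the NO answer to "Did Bill shave this morning?".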


The RuCCS Colloquia Series is organized by Dr. Julien Musolino and Dr. Sara Pixley. The talks are held on Tuesdays in the Psychology Building, Room 101 on the Busch Campus from 1:00-2:30pm.

Note: If you would like to receive email announcements about the colloquium series, please contact the Business Office to have your name added to our announce lists at