Skip to main content

IBM Israel Research Seminars

 

Many Natural Language Processing (NLP) applications need to recognize when the meaning of one word can be expressed by, or inferred from, another word. For example, in Question Answering, the word company in a question can be substituted in the text by firm, automaker or division.

A widely used approach for learning automatically semantically similar words is based on the Distributional Similarity Hypothesis, which suggests that words that occur within similar contexts are semantically similar (Harris, 1968). Concrete similarity measures compare a pair of weighted context feature vectors that characterize two words in texts. However, it has been observed that this captures a somewhat loose notion of semantic similarity, e.g. proposing the word government as similar to company. Hence, it does not ensure that the meaning of one word is preserved when replacing it with the other one in some context. Consequently, there is no direct evaluation criterion for distributional similarity output.

In this talk I will present a new direct criterion for word similarity, termed lexical entailment, which defines a tighter semantic relationship that might hold between words, as needed for the above mentioned applications. The rest of the talk will focus on the current research that studies the correspondence between the distributional characterization of two words and the lexical entailment relation. Error analysis of the state-of-the-art shows that to better approximate lexical entailment relation, first the quality of feature weighting function has to be improved. A new qualitative criterion for feature vectors was determined, which consequently provided a motivation for the better weighting strategy, relative feature focus. Then, I will discuss the two distributional inclusion hypotheses, that were proposed as a refinement to the classic Distributional Similarity Hypothesis. The degree of the hypotheses assessment was further checked by an automated web-based feature inclusion testing algorithm, which was also used to filter the baseline similarity lists, leading to a more accurate acquisition of the lexical entailment relation.