Research Projects

Semantic Search



In this work, we investigate how search performance can be improved by leveraging semantics in both corpus analysis and query analysis. We focus on the XML search capabilities provided by IBM's JuruXML search engine and query language to demonstrate significant improvement in search precision.

Representative publication:



Question Answering



In this work, we developed an open-domain factoid question answering system (PIQUANT) which employs multiple question answering strategies and utilizes multiple structured and unstructured information sources. The primary distinguishing features of our approach are: 1) the use of semantic indexing and semantic search to enable more precise search by constraining the semantic types of search terms and/or specifying semantic types of candidate answers ("predictive annotation") and 2) the ability to employ multiple agents adopting different question answering strategies, whose answers are combined and resolved through an answer resolution component. PIQUANT and its predecessors have participated in the TREC QA evaluation since 1999 and have consistently been among the top-scoring systems.
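To make the idea of answer resolution across multiple agents concrete, here is a minimal Python sketch. It is an illustration only and not PIQUANT's actual component: the agent names, the confidence values, the string-normalization rule, and the choice to sum confidences are all assumptions made for this example.

```python
from collections import defaultdict


def normalize(answer):
    """Crude equivalence key (an assumption): lowercase, drop punctuation."""
    kept = "".join(c for c in answer.lower() if c.isalnum() or c.isspace())
    return " ".join(kept.split())


def resolve(agent_outputs):
    """Merge (answer, confidence) lists produced by several agents.

    Candidates that normalize to the same key pool their confidence, so an
    answer proposed by more than one agent is reinforced.
    """
    pooled = defaultdict(float)
    surface = {}  # first surface form seen for each normalized key
    for candidates in agent_outputs.values():
        for answer, confidence in candidates:
            key = normalize(answer)
            pooled[key] += confidence
            surface.setdefault(key, answer)
    ranked = sorted(pooled.items(), key=lambda kv: kv[1], reverse=True)
    return [(surface[key], score) for key, score in ranked]


# Hypothetical outputs from two agents for "What is the tallest mountain?"
outputs = {
    "statistical_agent": [("Mount Everest", 0.5), ("K2", 0.4)],
    "knowledge_agent": [("mount everest.", 0.3), ("Kangchenjunga", 0.45)],
}
for answer, score in resolve(outputs):
    print(f"{answer}: {score:.2f}")
```

In this toy run, the answer that both agents propose ends up ranked first even though one agent scored a different candidate higher on its own, which is the kind of reinforcement a resolution step is meant to provide.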

Representative publications:

  • “Statistical answer-type identification in open domain question answering”, John Prager, Jennifer Chu-Carroll, and Krzysztof Czuba. In Proceedings of the Human Language Technology Conference (HLT), 2002.
  • “In question answering, two heads are better than one”, Jennifer Chu-Carroll, Krzysztof Czuba, John Prager, and Abraham Ittycheriah. In Proceedings of the Human Language Technology Conference (HLT/NAACL), 2003.
  • “A Multi-Agent Approach to using Redundancy and Reinforcement in Question Answering”, John Prager, Jennifer Chu-Carroll, and Krzysztof Czuba. In New Directions in Question Answering, M. Maybury, editor, AAAI Press, 2004.
  • “Question answering using predictive annotation”, John Prager, Jennifer Chu-Carroll, Eric Brown, and Krzysztof Czuba. In Advances in Open-Domain Question Answering, T. Strzalkowski & S. Harabagiu, editors, Kluwer Academic Publishers, to appear.


We have also tailored some of our question answering technologies to limited domains, developing a natural language interface to web search on IBM ThinkPad purchase and support. In this work, we also evaluated the impact of a domain ontology and its quality on system performance.

Representative publications:

  • “A hybrid approach to natural language web search”, Jennifer Chu-Carroll, John Prager, Yael Ravin, and Christian Cesar. In Proceedings of the Conference on Empirical Methods for Natural Language Processing (EMNLP), 2002.
  • “Evaluating Ontology Cleaning”, Christopher Welty, Ruchi Mahindru, and Jennifer Chu-Carroll. In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI), 2004.


Mixed-Initiative Dialogue Management



In this work, we developed a model for tracking initiative in collaborative dialogue interactions, and employed this model in an end-to-end adaptive mixed-initiative spoken dialogue system (MIMIC) that provides movie showtime information. MIMIC improves upon existing dialogue systems by adopting a novel hybrid dialogue management architecture that enables automatic dialogue strategy adaptation. This hybrid architecture decouples a data-driven, domain-dependent initiative module from its knowledge-driven, domain-independent goal selection process, while allowing the outcomes of both processes to jointly determine the strategies selected for response generation. This design allows MIMIC to automatically adapt dialogue strategies based on information dynamically extracted from the dialogue interaction, and further enables modification of MIMIC's adaptation behavior with very minor parameter adjustments, resulting in more general and portable dialogue systems.

Representative publications:



Natural Language Call Routing



In this work, we developed a natural language dialogue system prototype that directs calls to appropriate destinations in a large call center. The basic routing technique uses a standard vector-based information retrieval mechanism to determine which destinations are similar to a given caller request. Our novel extension to this vector-based mechanism enables the system to dynamically generate disambiguation questions when the user query does not uniquely identify a destination. The call router is statistically trained from a set of sample calls. The training process is fully automatic, which allows the system to be ported to new domains with very little effort.
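As a rough illustration of vector-based routing, the following Python sketch builds one term vector per destination from sample calls and compares a caller request to each by cosine similarity. It is not the prototype's implementation: the destinations, sample calls, thresholds, and function names are hypothetical, plain term counts stand in for whatever term weighting a real router would use, and the sketch only detects when disambiguation is needed rather than generating a question.

```python
import math
from collections import Counter

# Hypothetical training data: destination -> sample caller requests.
TRAINING_CALLS = {
    "loans": ["i want to apply for a car loan", "question about my loan payment"],
    "accounts": ["check my account balance", "open a new checking account"],
    "cards": ["i lost my credit card", "report a stolen card"],
}


def vectorize(text):
    """Bag-of-words term-count vector for a request."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def train(calls):
    """Build one vector per destination by summing its sample calls."""
    model = {}
    for destination, samples in calls.items():
        vector = Counter()
        for sample in samples:
            vector.update(vectorize(sample))
        model[destination] = vector
    return model


def route(model, request, threshold=0.2, margin=0.1):
    """Return ("route", dest), ("disambiguate", [dests]) or ("reject", [])."""
    query = vectorize(request)
    scored = sorted(((cosine(query, v), d) for d, v in model.items()), reverse=True)
    best = scored[0][0]
    # Keep destinations that are similar enough and close to the best score.
    close = [d for s, d in scored if s >= threshold and best - s <= margin]
    if not close:
        return ("reject", [])
    if len(close) == 1:
        return ("route", close[0])
    return ("disambiguate", close)


model = train(TRAINING_CALLS)
print(route(model, "i lost my credit card"))
print(route(model, "my account loan"))
```

A request that scores well against exactly one destination is routed directly, while one that scores similarly against several destinations is flagged so that a follow-up question can be asked. The `threshold` and `margin` values here are arbitrary choices for the toy data.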

Representative publications:



Collaborative Response Generation in Dialogue Interaction



For my PhD dissertation research, I developed and implemented a plan-based model for natural language response generation in collaborative dialogue interactions. This work centered on the analyses of proposals inferred from user utterances and on the process of selecting appropriate content for the system's responses to such utterances. I focused on generating two particular types of subdialogues: 1) collaborative negotiation subdialogues, initiated when the system detects a conflict between the system and the user during the dialogue and attempts to convince the user to change his/her mind, and 2) information-sharing subdialogues, initiated when the system does not have sufficient information to determine whether to accept or reject a user proposal, and attempts to solicit further information from the user in order to arrive at an informed decision. In both cases, the system performs content planning to generate concise, coherent, and effective responses by taking into account its own domain knowledge, its beliefs about the user's knowledge and preferences, as well as its beliefs about how the user's beliefs may change based on information presented to the user.

Representative publications: