Estimating the difficulty of queries submitted to search engines

Innovation Matters


How often have you entered keywords in a search engine, only to be rewarded with a huge list of items that are way off the mark? Our research attempts to predict the chances of a search engine providing the correct documents for a given query. Such prediction is useful as feedback to the user and the search engine administrator. It is also used for uniting data from different sources, as well as to predict when relevant documents cannot be retrieved simply because they do not exist in the repository.

Our work attempts to predict the quality of the answers retrieved by search engines using machine learning methods. This work opens a new range of opportunities for improving search engines and their interaction with users.

Today, most search engines return results that contain the query terms. It’s a well-known fact that just because certain keywords can be found in a document, it doesn’t mean the document answers your query. However, that document will be listed in your query results. Take, for example, the query “IBM”. Any search engine will list huge reams of results, but what are the chances of them being relevant to the search? And what exactly are you searching for? Information on IBM products? IBM stocks? Or maybe the location of a nearby IBM dealer? The ambiguity of that query makes it difficult for any search engine to answer.

The reasons for low quality search results are numerous, but two main explanations stand out. First, engines are faced with linguistic challenges, where words like ‘fire’ or ‘present’ can have multiple meanings. Second, the data sources being searched may not have any information on the topics being searched for. The missing content may simply not exist in the indexed collection.

We started our work by trying to find indicative features for easy and difficult queries. We examined many queries and the documents retrieved in response to these queries. This was done for both queries that were answered suitably (easy queries) and for queries that were not (difficult queries). We observed that when queries were easy, searching for each query term separately will yield similar results to a search for the entire phrase. This closely resembles a situation where a patient consults with four different physicians. If all physicians give the same diagnosis, the patient will naturally have greater confidence in their judgment.

We used this agreement as input to a prediction algorithm. This algorithm uses a set of training queries to learn how to map these agreements to the difficulty of the query. When a new query is entered, the prediction algorithm can forecast its difficulty. When presented with 49 previously unseen queries from the NIST TREC conference, the predicted difficulties estimated by the algorithm were in a correlation of 0.57 with the actual difficulties. This result is significantly higher than any other previously-suggested method, although we are currently working on improvements to the algorithms which we hope will increase this correlation.

Juru Demo
Figure 1. Prediction of query difficulty presented together with search engine results. This prediction informs users if the retrieved documents are likely to be useful. If not, they can decide, for example, to reformulate their queries.

Once the difficulty of a query can be predicted, it can be used in a variety of ways. The most common use is to give feedback to the user (see image). We have also used this information to combine information from different sources by weighting the results from different repositories according to their predicted quality. This approach improved retrieval by up to 13% on the collections we tested. Finally, we have shown that it is possible to use the technology developed for query prediction to detect when the quality of results is poor simply because the document collection does not contain pertinent results. This information is of great importance to the search engine administrator because it signifies information that users are seeking but is not currently being answered. Thus, the administrator can use this information to decide which information should be added to the repository. In our experiments we demonstrated that over 70% of queries with no content could be identified at the cost of incorrectly labeling less than 10% of queries with content.

Finally, we have shown that it is possible to use the technology developed for query prediction to detect when the quality of results is poor simply because the document collection does not contain pertinent results. This is done using a two-staged process. First, a query predictor detects those queries that are likely to be difficult. Then, a second predictor is trained to distinguish between queries which are difficult because there is no relevant content in the repository and those which are difficult for other reasons (for example, due to linguistic ambiguity). The second predictor is trained in a manner identical to the first one except that the labels for its training queries are different (they can be either that the query has content or it does not).

This information is of great importance to the search engine administrator because it signifies information that users are seeking but is not currently being answered. Thus, the administrator can use this information to decide which information should be added to the repository.

Our work on query prediction won the Best Paper Award at the Association of Computing Machinery’s (ACM) 2005 conference on information retrieval (SIGIR 2005).

Related Publications  

Elad Yom-Tov, Shai Fine, David Carmel and Adam Darlow. Learning to Estimate Query Difficulty with Applications to Missing Content Detection and Distributed Information Retrieval . 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005), Recipient of the Best Paper Award . ACM, August 2005.

E. Yom-Tov, S. Fine, D. Carmel, A. Darlow and E. Amitay. Juru at TREC 2004: Experiments with prediction of query difficulty. Text Retrieval Conference (TREC). 2004.

E. Yom-Tov, D. Carmel, A. Darlow, D. Pelleg, S. Errera-Yaakov and S. Fine. Juru at TREC 2005: Query Prediction in the Terabyte and the Robust tracks. Text Retrieval Conference (TREC). 2005.


Rate this article

Innovator's corner  

Elad Yom-TovElad Yom-Tov Researcher

What is the most exciting potential future use for the work you're doing?
We’re excited about the ability of this technology to assist search engine administrators and content providers with an understanding of their users’ needs and where their current offerings fall short.

What is the most interesting part of your research?
There are two aspects which I find interesting. The first is that we are slowly gaining a fundamental understanding of what makes some information needs more difficult than others. On another level, this project is exciting because it is a meeting of researchers from the two different fields of machine learning and information retrieval. I find that bringing together knowledge from different disciplines results in innovative solutions which would not have been possible otherwise.

What inspired you to go into this field?
I started working in machine learning while working on my Ph.D thesis because this was a necessary tool for my work on brain-computer interfaces. I quickly discovered how interesting and diverse this field is and rapidly changed the focus of my work to machine learning.

What is your favorite invention of all time?
If I have to name just one invention, I think it would be the printing press. I am extremely fond of books, and so the ability to print books and share knowledge, experience, and imagination seems to me to be one of mankind’s greatest inventions.

Research team  

David Carmel

David Carmel

Adam Darlow

Adam Darlow

Shai Fine

Shai Fine

Related Research