TY - GEN
T1 - Predicting query reformulation during Web searching
AU - Jansen, Bernard J.
AU - Booth, Danielle
AU - Spink, Amanda
PY - 2009
Y1 - 2009
N2 - This paper reports results from a study in which we automatically classified the query reformulation patterns for 964,780 Web searching sessions (composed of 1,523,072 queries) in order to predict what the next query reformulation would be. We employed an n-gram modeling approach to describe the probability of searchers transitioning from one query reformulation state to another and predict their next state. We developed first, second, third, and fourth order models and evaluated each model for accuracy of prediction. Findings show that Reformulation and Assistance account for approximately 45 percent of all query reformulations. Searchers seem to seek system searching assistance early in the session or after a content change. The results of our evaluations show that the first and second order models provided the best predictability, between 28 and 40 percent overall, and higher than 70 percent for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance in real time.
AB - This paper reports results from a study in which we automatically classified the query reformulation patterns for 964,780 Web searching sessions (composed of 1,523,072 queries) in order to predict what the next query reformulation would be. We employed an n-gram modeling approach to describe the probability of searchers transitioning from one query reformulation state to another and predict their next state. We developed first, second, third, and fourth order models and evaluated each model for accuracy of prediction. Findings show that Reformulation and Assistance account for approximately 45 percent of all query reformulations. Searchers seem to seek system searching assistance early in the session or after a content change. The results of our evaluations show that the first and second order models provided the best predictability, between 28 and 40 percent overall, and higher than 70 percent for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance in real time.
UR - http://www.scopus.com/inward/record.url?scp=70349190160&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349190160&partnerID=8YFLogxK
U2 - 10.1145/1520340.1520592
DO - 10.1145/1520340.1520592
M3 - Conference contribution
AN - SCOPUS:70349190160
SN - 9781605582474
T3 - Conference on Human Factors in Computing Systems - Proceedings
SP - 3907
EP - 3912
BT - Proceedings of the 27th International Conference Extended Abstracts on Human Factors in Computing Systems, CHI 2009
T2 - 27th International Conference Extended Abstracts on Human Factors in Computing Systems, CHI 2009
Y2 - 4 April 2009 through 9 April 2009
ER -