TY - JOUR
T1 - Determining the informational, navigational, and transactional intent of Web queries
AU - Jansen, Bernard J.
AU - Booth, Danielle L.
AU - Spink, Amanda
N1 - Funding Information:
We would like to thank Excite, AlltheWeb.com, AltaVista, and especially Infospace.com for providing the data for this analysis, without which we could not have conducted this research. We encourage other search engine companies to engage members of the academic community in Web searching research. The Air Force Office of Scientific Research (AFOSR) and the National Science Foundation (NSF) funded portions of this research.
PY - 2008/5
Y1 - 2008/5
AB - In this paper, we define and present a comprehensive classification of user intent for Web searching. The classification consists of three hierarchical levels of informational, navigational, and transactional intent. After deriving attributes of each, we then developed a software application that automatically classified queries using a Web search engine log of over a million and a half queries submitted by several hundred thousand users. Our findings show that more than 80% of Web queries are informational in nature, with about 10% each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the results from this manual classification to the results determined by the automated method. This comparison showed that the automatic classification has an accuracy of 74%. Of the remaining 25% of the queries, the user intent is vague or multi-faceted, pointing to the need for probabilistic classification. We discuss how search engines can use knowledge of user intent to provide more targeted and relevant results in Web searching.
UR - http://www.scopus.com/inward/record.url?scp=40649101172&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=40649101172&partnerID=8YFLogxK
U2 - 10.1016/j.ipm.2007.07.015
DO - 10.1016/j.ipm.2007.07.015
M3 - Article
AN - SCOPUS:40649101172
VL - 44
SP - 1251
EP - 1266
JO - Information Processing and Management
JF - Information Processing and Management
SN - 0306-4573
IS - 3
ER -