TY - GEN
T1 - Classifying web queries by topic and user intent
AU - Jansen, Bernard J.
AU - Booth, Danielle
PY - 2010
Y1 - 2010
N2 - In this research, we investigate a methodology to automatically classify Web queries by topic and user intent. Taking a 20,000-plus Web query data set sectioned by topic, we manually classified each query using a three-level hierarchy of user intent. We note significant differences in user intent across topics. Results show that user intent (informational, navigational, and transactional) varies by topic (15 to 24 percent depending on the category). We then use this manually classified data set to classify searches in a Web search engine query stream automatically, using an exact-match approach followed by an n-gram approach. These approaches have the advantage of being implementable in real time for query classification of Web searches. The implications are that a search engine can improve retrieval performance by more effectively identifying the intent underlying user queries.
AB - In this research, we investigate a methodology to automatically classify Web queries by topic and user intent. Taking a 20,000-plus Web query data set sectioned by topic, we manually classified each query using a three-level hierarchy of user intent. We note significant differences in user intent across topics. Results show that user intent (informational, navigational, and transactional) varies by topic (15 to 24 percent depending on the category). We then use this manually classified data set to classify searches in a Web search engine query stream automatically, using an exact-match approach followed by an n-gram approach. These approaches have the advantage of being implementable in real time for query classification of Web searches. The implications are that a search engine can improve retrieval performance by more effectively identifying the intent underlying user queries.
UR - http://www.scopus.com/inward/record.url?scp=77953103930&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77953103930&partnerID=8YFLogxK
U2 - 10.1145/1753846.1754140
DO - 10.1145/1753846.1754140
M3 - Conference contribution
AN - SCOPUS:77953103930
SN - 9781605589312
T3 - Conference on Human Factors in Computing Systems - Proceedings
SP - 4285
EP - 4290
BT - CHI 2010 - The 28th Annual CHI Conference on Human Factors in Computing Systems, Conference Proceedings and Extended Abstracts
T2 - 28th Annual CHI Conference on Human Factors in Computing Systems, CHI 2010
Y2 - 10 April 2010 through 15 April 2010
ER -