Determining the user intent of web search engine queries

Bernard J. Jansen, Danielle L. Booth, Amanda Spink

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

122 Citations (Scopus)

Abstract

Determining the user intent of Web searches is a difficult problem due to the sparse data available concerning the searcher. In this paper, we examine a method to determine the user intent underlying Web search engine queries. We qualitatively analyze samples of queries from seven transaction logs from three different Web search engines containing more than five million queries. From this analysis, we identified characteristics of user queries based on three broad classifications of user intent. The classifications of informational, navigational, and transactional represent the type of content destination the searcher desired as expressed by their query. We implemented our classification algorithm and automatically classified a separate Web search engine transaction log of over a million queries submitted by several hundred thousand users. Our findings show that more than 80% of Web queries are informational in nature, with about 10% each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the classification to the results from our algorithm. This comparison showed that our automatic classification has an accuracy of 74%. Of the remaining 25% of the queries, the user intent is generally vague or multi-faceted, pointing to the need for probabilistic classification. We illustrate how knowledge of searcher intent might be used to enhance future Web search engines.
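
The abstract does not list the query characteristics the authors used, so the following is only a minimal sketch of what a rule-based three-way intent classifier and its validation against manually coded queries might look like. The cue lists, the function names `classify_intent` and `accuracy`, and the sample queries are all hypothetical and are not taken from the paper.

```python
# Hypothetical surface cues for illustration only; they are NOT the
# characteristics identified by the authors.
NAVIGATIONAL_CUES = ("www.", "http", ".com", ".org", ".net", ".edu")
TRANSACTIONAL_TERMS = {"download", "buy", "lyrics", "software", "games", "images"}


def classify_intent(query: str) -> str:
    """Assign one of three intent labels to a query using simple cues."""
    q = query.lower().strip()
    # A URL fragment suggests the searcher wants a specific site.
    if any(cue in q for cue in NAVIGATIONAL_CUES):
        return "navigational"
    # A term tied to obtaining or interacting with something suggests
    # a transactional goal.
    if any(term in TRANSACTIONAL_TERMS for term in q.split()):
        return "transactional"
    # Everything else falls back to the most common class.
    return "informational"


def accuracy(predicted: list[str], manual: list[str]) -> float:
    """Fraction of queries whose automatic label matches the manual code."""
    if len(predicted) != len(manual):
        raise ValueError("label lists must have the same length")
    if not manual:
        raise ValueError("at least one query is required")
    matches = sum(p == m for p, m in zip(predicted, manual))
    return matches / len(manual)


# Made-up sample standing in for a manually coded validation set.
queries = ["www.ebay.com", "download free games",
           "how do volcanoes form", "cheap flights"]
manual_codes = ["navigational", "transactional",
                "informational", "transactional"]

predicted = [classify_intent(q) for q in queries]
print(predicted)
print(accuracy(predicted, manual_codes))
```

In this sketch the last query has no matching cue and falls through to "informational", which illustrates the kind of vague or multi-faceted query for which the abstract suggests probabilistic classification.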

Original language: English (US)
Title of host publication: 16th International World Wide Web Conference, WWW2007
Pages: 1149-1150
Number of pages: 2
DOIs: https://doi.org/10.1145/1242572.1242739
State: Published - Oct 22 2007
Event: 16th International World Wide Web Conference, WWW2007 - Banff, AB, Canada
Duration: May 8 2007 to May 12 2007

Publication series

Name: 16th International World Wide Web Conference, WWW2007

Other

Other: 16th International World Wide Web Conference, WWW2007
Country: Canada
City: Banff, AB
Period: 5/8/07 to 5/12/07

Fingerprint

Search engines
World Wide Web

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software

Cite this

Jansen, B. J., Booth, D. L., & Spink, A. (2007). Determining the user intent of web search engine queries. In 16th International World Wide Web Conference, WWW2007 (pp. 1149-1150). (16th International World Wide Web Conference, WWW2007). https://doi.org/10.1145/1242572.1242739
@inproceedings{37175e0a47d1442aa1c98d529e96999c,
title = "Determining the user intent of web search engine queries",
abstract = "Determining the user intent of Web searches is a difficult problem due to the sparse data available concerning the searcher. In this paper, we examine a method to determine the user intent underlying Web search engine queries. We qualitatively analyze samples of queries from seven transaction logs from three different Web search engines containing more than five million queries. From this analysis, we identified characteristics of user queries based on three broad classifications of user intent. The classifications of informational, navigational, and transactional represent the type of content destination the searcher desired as expressed by their query. We implemented our classification algorithm and automatically classified a separate Web search engine transaction log of over a million queries submitted by several hundred thousand users. Our findings show that more than 80{\%} of Web queries are informational in nature, with about 10{\%} each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the classification to the results from our algorithm. This comparison showed that our automatic classification has an accuracy of 74{\%}. Of the remaining 25{\%} of the queries, the user intent is generally vague or multi-faceted, pointing to the need for probabilistic classification. We illustrate how knowledge of searcher intent might be used to enhance future Web search engines.",
author = "Jansen, {Bernard J.} and Booth, {Danielle L.} and Spink, Amanda",
year = "2007",
month = "10",
day = "22",
doi = "10.1145/1242572.1242739",
language = "English (US)",
isbn = "1595936548",
series = "16th International World Wide Web Conference, WWW2007",
pages = "1149--1150",
booktitle = "16th International World Wide Web Conference, WWW2007",

}

TY - GEN

T1 - Determining the user intent of web search engine queries

AU - Jansen, Bernard J.

AU - Booth, Danielle L.

AU - Spink, Amanda

PY - 2007/10/22

Y1 - 2007/10/22

N2 - Determining the user intent of Web searches is a difficult problem due to the sparse data available concerning the searcher. In this paper, we examine a method to determine the user intent underlying Web search engine queries. We qualitatively analyze samples of queries from seven transaction logs from three different Web search engines containing more than five million queries. From this analysis, we identified characteristics of user queries based on three broad classifications of user intent. The classifications of informational, navigational, and transactional represent the type of content destination the searcher desired as expressed by their query. We implemented our classification algorithm and automatically classified a separate Web search engine transaction log of over a million queries submitted by several hundred thousand users. Our findings show that more than 80% of Web queries are informational in nature, with about 10% each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the classification to the results from our algorithm. This comparison showed that our automatic classification has an accuracy of 74%. Of the remaining 25% of the queries, the user intent is generally vague or multi-faceted, pointing to the need for probabilistic classification. We illustrate how knowledge of searcher intent might be used to enhance future Web search engines.

UR - http://www.scopus.com/inward/record.url?scp=35348844063&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=35348844063&partnerID=8YFLogxK

U2 - 10.1145/1242572.1242739

DO - 10.1145/1242572.1242739

M3 - Conference contribution

AN - SCOPUS:35348844063

SN - 1595936548

SN - 9781595936547

T3 - 16th International World Wide Web Conference, WWW2007

SP - 1149

EP - 1150

BT - 16th International World Wide Web Conference, WWW2007

ER -
