Linguistic aspects of web queries

Bernard J. Jansen, Amanda Spink, Major Anthony Pfaff

Research output: Contribution to journal (Article)

14 Citations (Scopus)

Abstract

Terms are the basic building blocks of queries for information retrieval systems, and queries are the primary means of translating users' information needs into a form that information retrieval systems can understand. As such, terms and how they are used in queries reflect the essential components of users' problem-solving and decision-making interaction with any information retrieval system. If the terms, their semantics, and the query syntax can be modeled, one could tailor the information retrieval system to conform to this model, which may provide assistance to the user in finding relevant information. In pursuit of this goal, we analyzed a transaction log containing over a million queries posed by over 200,000 users of Excite, a major Internet search service. We examined individual queries to isolate the basic syntactic patterns of query structure. Based on this analysis, we developed a linguistic model, classifying queries into five general categories. Web queries are overwhelmingly noun phrases, usually in the form of a modifying noun followed by the modified noun. We conclude with the implications of this user model for the design of IR systems.

Original language: English (US)
Pages (from-to): 169-176
Number of pages: 8
Journal: Proceedings of the ASIS Annual Meeting
Volume: 37
State: Published - Dec 1 2000

Fingerprint

Information retrieval systems
Linguistics
Information retrieval
Syntactics
Decision making
Semantics
Systems analysis
Internet
Syntax
Transaction
Assistance
Interaction

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Library and Information Sciences

Cite this

Jansen, Bernard J. ; Spink, Amanda ; Pfaff, Major Anthony. / Linguistic aspects of web queries. In: Proceedings of the ASIS Annual Meeting. 2000 ; Vol. 37. pp. 169-176.
@article{5f7a8af54a0142fc93a5dbd866b8d104,
title = "Linguistic aspects of web queries",
abstract = "Terms are the basic building blocks of queries for information retrieval systems, and queries are the primary means of translating users' information needs into a form that information retrieval systems can understand. As such, terms and how they are used in queries reflect the essential components of users' problem-solving and decision-making interaction with any information retrieval system. If the terms, their semantics, and the query syntax can be modeled, one could tailor the information retrieval system to conform to this model, which may provide assistance to the user in finding relevant information. In pursuit of this goal, we analyzed a transaction log containing over a million queries posed by over 200,000 users of Excite, a major Internet search service. We examined individual queries to isolate the basic syntactic patterns of query structure. Based on this analysis, we developed a linguistic model, classifying queries into five general categories. Web queries are overwhelmingly noun phrases, usually in the form of a modifying noun followed by the modified noun. We conclude with the implications of this user model for the design of IR systems.",
author = "Jansen, {Bernard J.} and Amanda Spink and Pfaff, {Major Anthony}",
year = "2000",
month = "12",
day = "1",
language = "English (US)",
volume = "37",
pages = "169--176",
journal = "Proceedings of the ASIST Annual Meeting",
issn = "1550-8390",
publisher = "Learned Information",

}

Jansen, BJ, Spink, A & Pfaff, MA 2000, 'Linguistic aspects of web queries', Proceedings of the ASIS Annual Meeting, vol. 37, pp. 169-176.

Linguistic aspects of web queries. / Jansen, Bernard J.; Spink, Amanda; Pfaff, Major Anthony.

In: Proceedings of the ASIS Annual Meeting, Vol. 37, 01.12.2000, p. 169-176.


TY - JOUR

T1 - Linguistic aspects of web queries

AU - Jansen, Bernard J.

AU - Spink, Amanda

AU - Pfaff, Major Anthony

PY - 2000/12/1

Y1 - 2000/12/1

N2 - Terms are the basic building blocks of queries for information retrieval systems, and queries are the primary means of translating users' information needs into a form that information retrieval systems can understand. As such, terms and how they are used in queries reflect the essential components of users' problem-solving and decision-making interaction with any information retrieval system. If the terms, their semantics, and the query syntax can be modeled, one could tailor the information retrieval system to conform to this model, which may provide assistance to the user in finding relevant information. In pursuit of this goal, we analyzed a transaction log containing over a million queries posed by over 200,000 users of Excite, a major Internet search service. We examined individual queries to isolate the basic syntactic patterns of query structure. Based on this analysis, we developed a linguistic model, classifying queries into five general categories. Web queries are overwhelmingly noun phrases, usually in the form of a modifying noun followed by the modified noun. We conclude with the implications of this user model for the design of IR systems.

UR - http://www.scopus.com/inward/record.url?scp=0005692350&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0005692350&partnerID=8YFLogxK

M3 - Article

VL - 37

SP - 169

EP - 176

JO - Proceedings of the ASIST Annual Meeting

JF - Proceedings of the ASIST Annual Meeting

SN - 1550-8390

ER -