Phrase pair classification for identifying subtopics

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

3 Citations (Scopus)

Abstract

Automatic identification of subtopics for a given topic is desirable because it eliminates the need for manual construction of domain-specific topic hierarchies. In this paper, we design features based on corpus statistics and use them to build a classifier that identifies (subtopic, topic) links between phrase pairs. We combine these features with commonly used syntactic patterns to classify phrase pairs from datasets in Computer Science and WordNet. In addition, we show a novel application of our is-a-subtopic-of classifier to query expansion in Expert Search and compare it with pseudo-relevance feedback.
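The abstract does not list the actual features, so the following is only a minimal sketch of what corpus-statistics features combined with one syntactic pattern could look like for a candidate (subtopic, topic) pair. The function name, the feature names, the substring-based document matching, and the single "such as" pattern are all assumptions made for illustration and are not taken from the paper.

```python
import re
from typing import Dict, List


def phrase_pair_features(subtopic: str, topic: str, docs: List[str]) -> Dict[str, float]:
    """Hypothetical features for a candidate (subtopic, topic) phrase pair.

    Illustrative only: these are not the features used in the paper.
    """
    sub, top = subtopic.lower(), topic.lower()
    lowered = [d.lower() for d in docs]

    # Document frequencies, using plain substring matching (an assumption).
    df_sub = sum(sub in d for d in lowered)
    df_top = sum(top in d for d in lowered)
    df_both = sum(sub in d and top in d for d in lowered)

    # One Hearst-style pattern: "<topic> such as <subtopic>".
    pattern = re.compile(rf"{re.escape(top)}s?,? such as {re.escape(sub)}")
    has_pattern = any(pattern.search(d) for d in lowered)

    return {
        # A subtopic tends to appear in documents that also mention its topic.
        "p_topic_given_sub": df_both / df_sub if df_sub else 0.0,
        "p_sub_given_topic": df_both / df_top if df_top else 0.0,
        # Broader phrases tend to occur in more documents than narrower ones.
        "df_ratio": df_sub / df_top if df_top else 0.0,
        "such_as_pattern": 1.0 if has_pattern else 0.0,
    }


# Toy corpus, invented for this example.
docs = [
    "Topics in information retrieval such as query expansion are widely studied.",
    "Query expansion is a classic technique in information retrieval.",
    "Information retrieval systems index documents.",
    "Machine learning needs data.",
]
features = phrase_pair_features("query expansion", "information retrieval", docs)
```

A feature dictionary like this could be fed to any standard binary classifier; the asymmetry between the two conditional probabilities is what lets such a model tell (subtopic, topic) apart from (topic, subtopic).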

Original language: English (US)
Title of host publication: Advances in Information Retrieval - 34th European Conference on IR Research, ECIR 2012, Proceedings
Pages: 489-493
Number of pages: 5
DOIs: https://doi.org/10.1007/978-3-642-28997-2_48
State: Published - Apr 27 2012
Event: 34th European Conference on Information Retrieval, ECIR 2012 - Barcelona, Spain
Duration: Apr 1 2012 to Apr 5 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7224 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 34th European Conference on Information Retrieval, ECIR 2012
Country: Spain
City: Barcelona
Period: 4/1/12 to 4/5/12

Fingerprint

Classifiers
Feature-based Design
Classifier
Pseudo-relevance Feedback
Query Expansion
WordNet
Syntactics
Computer science
Eliminate
Classify
Statistics
Feedback
Corpus
Hierarchy
Syntax
Design

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Das, S., Mitra, P., & Lee Giles, C. (2012). Phrase pair classification for identifying subtopics. In Advances in Information Retrieval - 34th European Conference on IR Research, ECIR 2012, Proceedings (pp. 489-493). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7224 LNCS). https://doi.org/10.1007/978-3-642-28997-2_48
@inproceedings{a19c1626b27f4baaaa70703eec9be05c,
title = "Phrase pair classification for identifying subtopics",
abstract = "Automatic identification of subtopics for a given topic is desirable because it eliminates the need for manual construction of domain-specific topic hierarchies. In this paper, we design features based on corpus statistics and use them to build a classifier that identifies (subtopic, topic) links between phrase pairs. We combine these features with commonly used syntactic patterns to classify phrase pairs from datasets in Computer Science and WordNet. In addition, we show a novel application of our is-a-subtopic-of classifier to query expansion in Expert Search and compare it with pseudo-relevance feedback.",
author = "Sujatha Das and Prasenjit Mitra and {Lee Giles}, C.",
year = "2012",
month = "4",
day = "27",
doi = "10.1007/978-3-642-28997-2_48",
language = "English (US)",
isbn = "9783642289965",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "489--493",
booktitle = "Advances in Information Retrieval - 34th European Conference on IR Research, ECIR 2012, Proceedings",

}


