SEERLAB: A system for extracting key phrases from scholarly documents

Pucktada Treeratpituk, Pradeep Teregowda, Jian Huang, Clyde Lee Giles

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

13 Citations (Scopus)

Abstract

We describe the SEERLAB system that participated in the SemEval 2010's Keyphrase Extraction Task. SEERLAB utilizes the DBLP corpus for generating a set of candidate keyphrases from a document. Random Forest, a supervised ensemble classifier, is then used to select the top keyphrases from the candidate set. SEERLAB achieved a 0.24 F-score in generating the top 15 keyphrases, which places it sixth among 19 participating systems. Additionally, SEERLAB performed particularly well in generating the top 5 keyphrases with an F-score that ranked third.
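The F-scores quoted above are computed over a ranked list cut off at the top 5 or top 15 keyphrases. As a rough illustration of that metric only, the following minimal Python sketch scores a ranked list against a gold set. It assumes exact matching after lowercasing, which may differ from the official task scorer; the function name `f_score_at_k` and the sample phrases are hypothetical and do not come from the paper.

```python
def f_score_at_k(predicted, gold, k):
    """Return (precision, recall, F-score) for the top-k predicted keyphrases.

    Assumption: a prediction counts as correct only if it exactly matches a
    gold keyphrase after lowercasing and trimming whitespace.
    """
    top_k = [p.strip().lower() for p in predicted[:k]]
    gold_set = {g.strip().lower() for g in gold}
    if not top_k or not gold_set:
        return 0.0, 0.0, 0.0

    # Count distinct predictions that appear in the gold set.
    correct = len(set(top_k) & gold_set)
    precision = correct / len(top_k)
    recall = correct / len(gold_set)
    if precision + recall == 0:
        return 0.0, 0.0, 0.0

    # F-score is the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical ranked output and gold standard for one document.
predicted = [
    "random forest",
    "keyphrase extraction",
    "dblp",
    "neural network",
    "scholarly documents",
]
gold = [
    "keyphrase extraction",
    "random forest",
    "citation analysis",
    "scholarly documents",
]

precision, recall, f_score = f_score_at_k(predicted, gold, k=5)
print(f"P={precision:.2f} R={recall:.2f} F={f_score:.2f}")
```

In this made-up example three of the five predictions match the four gold keyphrases, giving a precision of 0.60, a recall of 0.75, and an F-score of about 0.67.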

Original language: English (US)
Title of host publication: ACL 2010 - SemEval 2010 - 5th International Workshop on Semantic Evaluation, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 182-185
Number of pages: 4
ISBN (Electronic): 1932432701, 9781932432701
State: Published - Jan 1 2010
Event: 5th International Workshop on Semantic Evaluation, SemEval 2010 - Uppsala, Sweden
Duration: Jul 15 2010 to Jul 16 2010

Other

Other: 5th International Workshop on Semantic Evaluation, SemEval 2010
Country: Sweden
City: Uppsala
Period: 7/15/10 to 7/16/10

Fingerprint

Classifiers
Ensemble Classifier
Random Forest
Corpus

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Theoretical Computer Science

Cite this

Treeratpituk, P., Teregowda, P., Huang, J., & Giles, C. L. (2010). SEERLAB: A system for extracting key phrases from scholarly documents. In ACL 2010 - SemEval 2010 - 5th International Workshop on Semantic Evaluation, Proceedings (pp. 182-185). Association for Computational Linguistics (ACL).
@inproceedings{6a7d2b94631f43469d63f2f442a60163,
title = "SEERLAB: A system for extracting key phrases from scholarly documents",
abstract = "We describe the SEERLAB system that participated in the SemEval 2010's Keyphrase Extraction Task. SEERLAB utilizes the DBLP corpus for generating a set of candidate keyphrases from a document. Random Forest, a supervised ensemble classifier, is then used to select the top keyphrases from the candidate set. SEERLAB achieved a 0.24 F-score in generating the top 15 keyphrases, which places it sixth among 19 participating systems. Additionally, SEERLAB performed particularly well in generating the top 5 keyphrases with an F-score that ranked third.",
author = "Pucktada Treeratpituk and Pradeep Teregowda and Jian Huang and Giles, {Clyde Lee}",
year = "2010",
month = "1",
day = "1",
language = "English (US)",
pages = "182--185",
booktitle = "ACL 2010 - SemEval 2010 - 5th International Workshop on Semantic Evaluation, Proceedings",
publisher = "Association for Computational Linguistics (ACL)",
}


TY  - GEN
T1  - SEERLAB
T2  - A system for extracting key phrases from scholarly documents
AU  - Treeratpituk, Pucktada
AU  - Teregowda, Pradeep
AU  - Huang, Jian
AU  - Giles, Clyde Lee
PY  - 2010/1/1
Y1  - 2010/1/1
AB  - We describe the SEERLAB system that participated in the SemEval 2010's Keyphrase Extraction Task. SEERLAB utilizes the DBLP corpus for generating a set of candidate keyphrases from a document. Random Forest, a supervised ensemble classifier, is then used to select the top keyphrases from the candidate set. SEERLAB achieved a 0.24 F-score in generating the top 15 keyphrases, which places it sixth among 19 participating systems. Additionally, SEERLAB performed particularly well in generating the top 5 keyphrases with an F-score that ranked third.
UR  - http://www.scopus.com/inward/record.url?scp=84962583571&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84962583571&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:84962583571
SP  - 182
EP  - 185
BT  - ACL 2010 - SemEval 2010 - 5th International Workshop on Semantic Evaluation, Proceedings
PB  - Association for Computational Linguistics (ACL)
ER  -
